Distributed, Fault-Tolerant In-Place Consensus Sequence on Innovative Hardware as a Building Block for Data Management
Quorum consensus algorithms such as Paxos are widely used as basic building blocks for fault tolerance in distributed systems. Unfortunately, distributed quorum consensus incurs considerable overhead for negotiating and safely storing each decision. We therefore plan to optimize Paxos-based fault tolerance for sequences of consensus decisions in three ways:
- exploit the multicast and reduce operations of modern interconnects to lower latency and the number of messages,
- use remote direct memory access (RDMA) in combination with NVRAM to manage a distributed shared state,
- modify Paxos to support a sequence of consensus decisions in-place and avoid separate memory resources for each Paxos instance.
We then build efficient custom data types on top of consensus sequences that support partial updates, multiple-reader-single-writer locks, or compare-and-swap semantics.
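To make the idea of in-place decisions with compare-and-swap semantics more concrete, the following is a minimal, single-process sketch. It is an illustration under simplifying assumptions, not the protocol developed in this project: the names `Acceptor`, `read_modify_write`, and `compare_and_swap` are hypothetical, message passing is replaced by direct method calls, and failures are modeled by a simple `up` flag. Each acceptor keeps a single slot that is overwritten by every decision instead of allocating a separate Paxos instance per decision.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Acceptor:
    """Keeps one slot that is overwritten in place (no per-decision log)."""
    promised: int = 0        # highest round this acceptor has promised to honor
    accepted_round: int = 0  # round in which `value` was last written
    value: Any = None        # current state, replaced by each decision
    up: bool = True          # toy failure switch (assumption for this demo)

    def prepare(self, rnd: int) -> Optional[Tuple[int, Any]]:
        # Phase 1: promise to ignore older rounds and report the current slot.
        if not self.up or rnd <= self.promised:
            return None
        self.promised = rnd
        return self.accepted_round, self.value

    def accept(self, rnd: int, value: Any) -> bool:
        # Phase 2: overwrite the slot unless a newer round was promised.
        if not self.up or rnd < self.promised:
            return False
        self.promised = rnd
        self.accepted_round = rnd
        self.value = value
        return True


def read_modify_write(acceptors: List[Acceptor], rnd: int,
                      update: Callable[[Any], Any]) -> Tuple[bool, Any]:
    """Apply `update` to the latest value seen by a quorum; return (ok, value)."""
    quorum = len(acceptors) // 2 + 1
    promises = [p for p in (a.prepare(rnd) for a in acceptors) if p is not None]
    if len(promises) < quorum:
        return False, None
    # The value written in the highest round is the most recent decision.
    _, current = max(promises, key=lambda p: p[0])
    new_value = update(current)
    acks = sum(a.accept(rnd, new_value) for a in acceptors)
    return (True, new_value) if acks >= quorum else (False, None)


def compare_and_swap(expected: Any, new: Any) -> Callable[[Any], Any]:
    """Build an update function that swaps only if the value matches `expected`."""
    return lambda current: new if current == expected else current


if __name__ == "__main__":
    replicas = [Acceptor() for _ in range(3)]
    print(read_modify_write(replicas, 1, compare_and_swap(None, "A")))
    replicas[2].up = False   # one replica misses the next decision
    print(read_modify_write(replicas, 2, compare_and_swap("A", "B")))
    replicas[2].up, replicas[0].up = True, False
    # The stale replica is back, but the quorum still reveals the latest value.
    print(read_modify_write(replicas, 3, compare_and_swap("A", "C")))
```

In this sketch the third call does not swap, because the quorum contains a replica that holds the newer value "B". A real deployment would additionally need unique round numbers per proposer, retries, and durable storage of the slot, all of which are omitted here.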
The resulting distributed, fault-tolerant consensus will provide low-latency, high-throughput decisions. This will make recoverable distributed consensus applicable in new scenarios where it was previously avoided because of its high latency. The optimized consensus can serve as a building block in current and future distributed data management and database systems (including those developed in SPP 2037), which often rely on a sequence of decisions to process locks and transactions, to make atomic changes such as compare-and-swap, to support replicated state machines, or to elect the next master.
This project is part of the DFG SPP 2037 on scalable data management for future hardware.
Publications
2021
Lasse Thostrup, Jan Skrzypczak, Matthias Jasny, Tobias Ziegler, Carsten Binnig. DFI: The Data Flow Interface for High-Speed Networks. SIGMOD '21: International Conference on Management of Data, pp. 1825-1837, 2021.
2020
Jan Skrzypczak, Florian Schintke, Thorsten Schütt. RMWPaxos: Fault-Tolerant In-Place Consensus Sequences. IEEE Transactions on Parallel and Distributed Systems, 31(10), pp. 2392-2405, 2020.
Jan Skrzypczak, Florian Schintke. Towards Log-Less, Fine-Granular State Machine Replication. Datenbank Spektrum, 20(3), pp. 231-241, 2020.
Thorsten Schütt, Florian Schintke, Jan Skrzypczak. Transactions on Red-black and AVL trees in NVRAM. arXiv, 2020.
2019
Gustavo Alonso, Carsten Binnig, Ippokratis Pandis, Kenneth Salem, Jan Skrzypczak, Ryan Stutsman, Lasse Thostrup, Tianzheng Wang, Zeke Wang, Tobias Ziegler. DPI: The Data Processing Interface for Modern Networks. 9th Biennial Conference on Innovative Data Systems Research, 2019.
Jan Skrzypczak, Florian Schintke, Thorsten Schütt. Linearizable State Machine Replication of State-Based CRDTs without Logs. Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, PODC 2019, pp. 455-457, 2019.
Jan Skrzypczak, Florian Schintke, Thorsten Schütt. Linearizable State Machine Replication of State-Based CRDTs without Logs. arXiv, 2019.
2018
Lennart Steininger. Atomarer Datenzugriff auf verteilten persistenten Arbeitsspeicher [Atomic Data Access to Distributed Persistent Main Memory]. Master's thesis, Humboldt-Universität zu Berlin, Alexander Reinefeld, Jens-Peter Redlich (Advisors), 2018.
2017
Jan Skrzypczak. Weakening Paxos Consensus Sequences for Commutative Commands. Master's thesis, Humboldt-Universität zu Berlin, Alexander Reinefeld, Björn Scheuermann (Advisors), 2017 (preprint available as ZIB-Report 17-64).