Quorum consensus algorithms like the Paxos algorithm are widely used as basic building blocks for fault-tolerance in distributed systems. Unfortunately, distributed quorum consensus causes much overhead to negotiate and safely store the consensus. We therefore plan to optimize Paxos-based fault-tolerance for sequences of consensus in three ways:
- exploit multicast and reduce operations of modern interconnects to reduce the latency and number of messages,
- use remote direct memory access (RDMA) in combination with NVRAM to manage a distributed shared state,
- modify Paxos to support a sequence of consensus decisions in-place and avoid separate memory resources for each Paxos instance.
- We then build efficient custom datatypes on top of consensus sequences, that support partial updates, multiple-reader-single-writer locks, or compare-and-swap semantics.
The resulting distributed fault-tolerant consensus will provide low latency and high-throughput decisions. It will allow to apply recoverable distributed consensus in new scenarios where it was avoided before due to its high latency. The optimized consensus can be used as a building block in current and future distributed data management and database systems – including those developed in SPP 2037 – that often rely on a sequence of decisions to process locks, transactions, to make atomic changes like compare and swap, to support replicated state machines, or to elect the next master etc.
This project is part of the DFG SPP 2037 on scalable data management for future hardware.