This is part#2 extension to #1 raft-protocol, although both can be read independently as well
Paxos is used for replication, master election, locking, replicating of metadata & configuration.
From #1 Data replication
Hardware failures are inevitable to escape for any distributed system. Some of the expected one’s are machine crashes, disk failures, network partitions and clock skews. To stay unaffected by these failures, replication is used to make data available & durable.
Replication in short means copying data to multiple physically isolated hardware and sync data continuously. While there are many pro’s due to replication while scaling, Implementation of replication is far from simple.Consensus based replication is one way to do it.
Consensus algorithms are used to reach consensus to commit a transaction in all the replicas.
Paxos is also consensus based replication protocol and it is more complex at the implementation layer compared to raft, it is still used in world class products such as Google Spanner, Megastore.
🏎️ It is used in F1 as well, not the race, but another distributed SQL database https://stephenholiday.com/notes/f1/
2PC: Two phase commit
Leader election is a different process, it can be static same leader, dynamic leader election using leader lease or any other way.
2 steps: 1. Prepare 2. Commit
Leader initially issues the ‘Prepare’ message to all the Acceptors and the Acceptors responds with a ‘Prepared’ message if the acceptor doesn’t have any other transaction in progress on the same file. On receiving ‘Prepared’ message from all the acceptors involved in the transaction, the leader begins the second phase of the 2PC. The leader issues a ‘Commit’ message to all the acceptors. Then the acceptors commit the value to the persistence storage that is agreed upon in the previous phase. Acceptors respond optionally with ‘Committed’ or ‘Aborted’ message depending on the implementation. If there is an in-progress transaction the acceptor responds with ‘Abort’ message in the first phase. If the Leader receives an Abort message from any one of the acceptors, then the leader issues, ‘Abort’ message to all the other acceptors to cancel the transaction that the leader initiated.
Problems;
Leader fails after prepare or during leader lease, system might take forever to comeback as acceptors have to drain waiting till timeout or health check
Any acceptor fails, entire transaction should be aborted
These leads to bottlenecks of latency and transaction failures.
Paxos tries to address these concerns suing Proposers, Acceptors and Learners.
The significant optimization in the Paxos compared to 2PC is that there is no need for all the acceptors to accept the proposal from the Proposer. Instead, it is considered that the consensus is reached when the majority of the acceptors accepts the proposal.
There can be more than one proposers at a time too.The acceptors that do not participate are learner, who can be synced asynchronously and can be moved out of transaction.
Happy case flow
Each paxos run will have only one consensus reached and it cannot be changed until another paxos run comes into place, paxos nodes must be persistent, they cannot forget what they accepted .
Contention can be caused in paxos, if several proposals can run int the same paxos run, exponential backoff is used to not stall the system in such cases.
Plata O Plomo
In direct to the general complications, you can see it needs to have other ways for state machine replication, log replication, these cons make RAFT more easier to implement
Let's summarise the key components and concepts of Paxos:
Proposers, Acceptors, and Learners:
Proposers suggest values to be agreed upon.
Acceptors receive proposals and decide whether to accept them.
Learners learn the chosen value after consensus is reached.
Phases of Paxos:
Phase 1 (Prepare/Promise): A proposer sends a prepare request with a proposal number to a quorum of acceptors.
Phase 2 (Propose/Accept): If an acceptor has not promised a higher proposal number, it responds with a promise and optionally the highest proposal it has accepted so far. The proposer then sends an accept request with the proposal number and value to the acceptors.
Quorum: A majority of acceptors must agree on a proposal for it to be chosen.
Implementations:
https://www.freecodecamp.org/news/how-to-implement-paxos-algorithm-in-pure-functions/
https://github.com/dibyendumajumdar/paxos
https://github.com/cocagne/paxos
References:
https://lamport.azurewebsites.net/pubs/paxos-simple.pdf
https://substack.com/home/post/p-145189829
https://medium.com/@logeshrajendran/paxos-a9d76ebf04f3
video: