What is Raft Protocol?
Achieve high read performance with single-key linearizability and leader leases.
Data replication
Hardware failures are inevitable to escape for any distributed system. Some of the expected one’s are machine crashes, disk failures, network partitions and clock skews. To stay unaffected by these failures, replication is used to make data available & durable.
Replication in short means copying data to multiple physically isolated hardware and sync data continuously. While there are many pro’s due to replication while scaling, Implementation of replication is far from simple.
Consensus based replication is one way to do it.
Consensus means multiple servers agreeing on a value, If in a cluster there are 5 servers, a quorum consensus is used to agree on reading and writing values from/to DB, to form a strong consistent system of 5 servers, at least 3 servers should be up and running to agree on a decision. Under consensus, there are Raft and Paxos most commonly used protocols where leaders manage the replication.
Paxos is three phase commit(3PC) protocol, which we will discuss in a later post. Keep subscribed
Raft was proposed as an alternative to Paxos, owing to solve complexities in paxos implementation by separating leader election process. “The separation of logic stems from the fact that Raft makes leader election a separate and mandatory step before the leader can get the data replicated, whereas a Paxos implementation would make the leader election process an implicit outcome of reaching agreement through data replication.” quoted from reference. This allows adding & removing servers to be easier, more like an abstraction. Raft imposes the restriction that only the servers with most up-to-date data can become leaders.
When there are no unplanned failures or planned changes, the leader election step can be skipped. The leader election step is automatic whenever such changes happen even when no new writes are coming into the system. These optimizations radically simplify edge cases in which a succession of leadership changes can result in data discrepancies, but the tradeoff is that leader election in Raft can be more complicated than its counterpart in Paxos.
A distributed SQL database uses the Raft consensus algorithm for both leader election and data replication. Instead of having a single Raft group for the entire dataset in the cluster, a distributed SQL database applies Raft replication at an individual shard level where each shard has a Raft group of its own.
Raft is the defacto standard today for achieving strong consistency
Going back to the definition: Raft protocol is used to achieve high read performance with single-key linearizability and leader leases.
Linearizability
A consistency model that ensures each operation on a shared object appears instantaneously at some point between its invocation and its response. It provides the illusion that operations are executed in a sequential order, which is consistent with the real-time order of operations.
Without Linearizability
With Linearizability
Linearizability means that modifications happen instantaneously, and once a registry value is written, any subsequent read operation will find the very same value as long as the registry will not undergo any modification.
Cons:Inter-Key Operations: Single-key linearizability does not provide guarantees about the order or atomicity of operations that span multiple keys. For applications that require strong consistency across multiple keys, additional mechanisms (like transactions) are needed.
Linearizability is one of the strongest single-key consistency models. It implies that every operation appears to take place atomically and in some total linear order. This means it’s consistent with the real-time ordering of those operations. In other words, the following should be true of operations on a single key:
Operations can execute concurrently, but the state of the database at any point in time must appear to be the result of some ordered, sequential execution of operations.
If operation A completes before operation B begins, then B should logically take effect after A.
The Raft consensus algorithm solves this problem.
Leader leases are a mechanism used in distributed systems to ensure a consistent and reliable leader election process.They are particularly useful in systems that require a single leader to coordinate actions or make decisions to maintain consistency.
References:
https://stackoverflow.com/questions/9762101/what-is-linearizability
https://www.yugabyte.com/tech/raft-consensus-algorithm/
https://www.yugabyte.com/blog/how-does-consensus-based-replication-work-in-distributed-databases
https://raft.github.io/
All of the contents are aggregated, copied, quoted from internet, I’m making notes and sharing them while learning
Good read, but I see you have written summary, it will be great if you extend on writing about internals of Raft implementation, how consensus is achieved.