The technique we use to achieve availability is replication.
Replication
Replication is the main technique used in distributed systems to increase availability. It consists of storing the same piece of data on multiple nodes (called replicas) so that if one of them crashes, the data is not lost and requests can be served from the other nodes in the meantime.
Copies of the same data
However, the benefit of increased availability from replication comes with a set of new complications.
Replication implies that the system now has multiple copies of every piece of data. These copies must be maintained and kept in sync with each other on every update.
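As a minimal sketch of this idea, the following Python snippet keeps the same data on several in-memory replicas, applies each update to every live copy, and serves reads from whichever replica is still up. The `Replica` and `ReplicatedStore` classes are hypothetical names invented for illustration; a real system would also have to handle networking, partial failures, and recovery of crashed replicas.

```python
class Replica:
    """One node holding a full copy of the data (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    """Illustrative store that writes to all live replicas and reads from any one."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Apply the update to every live replica so the copies stay in sync.
        for replica in self.replicas:
            if replica.alive:
                replica.data[key] = value

    def read(self, key):
        # Serve the read from the first replica that is still up.
        for replica in self.replicas:
            if replica.alive:
                return replica.data[key]
        raise RuntimeError("no replica available")


replicas = [Replica("a"), Replica("b"), Replica("c")]
store = ReplicatedStore(replicas)
store.write("user:1", "Alice")

# Simulate a crash of the first node; the data is still readable elsewhere.
replicas[0].alive = False
value = store.read("user:1")
print(value)
```

Note that the sketch pays for availability on every write: one logical update turns into one update per replica.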
Ideally, replication should be transparent to the end user or engineer, creating the illusion that there is only one copy of every piece of data. This makes a distributed system look like a simple, centralized, single-node system, which is much easier to reason about and develop software around.
Of course, this is not always possible. Achieving this ideal may require significant hardware resources or force us to give up other desirable properties. For instance, engineers sometimes willingly accept a system that provides much higher performance but occasionally gives an inconsistent view of the data. However, they only do this under specific conditions, and in a specific way, that they can account for when they design the application.
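To make the trade-off concrete, here is a small sketch of a store that acknowledges a write after updating a single replica and copies it to the others later. The `LazyReplicatedStore` class and its `propagate` method are assumptions made up for this example, not a description of any particular system. A read that hits a replica before propagation sees stale data, which is the kind of inconsistent view an application would have to account for.

```python
from collections import deque


class LazyReplicatedStore:
    """Hypothetical store that propagates writes to other replicas lazily."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # updates not yet copied to the other replicas

    def write(self, key, value):
        # Fast path: update only the first replica and return immediately.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica_index):
        # A read may land on any replica, including one that is behind.
        return self.replicas[replica_index].get(key)

    def propagate(self):
        # Copy the queued updates to the remaining replicas, in write order.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas[1:]:
                replica[key] = value


store = LazyReplicatedStore(replica_count=2)
store.write("balance", 100)

stale = store.read("balance", replica_index=1)  # replica 1 has not caught up
store.propagate()
fresh = store.read("balance", replica_index=1)  # replica 1 is now in sync
print(stale, fresh)
```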
This trade-off gives rise to two main strategies for replication:
Pessimistic replication
Optimistic replication