Distributed database design using CAP theorem
CAP is a theorem that can be applied when designing distributed systems such as distributed databases. A distributed database is a set of databases (nodes) spread across a network.
The nodes in a distributed database can be arranged in either a master/slave model or a sharded model.
Consistency - Every read should return the latest write made to the DB.
Availability - Every request should get a response in a reasonable amount of time, but the data returned might not be the latest.
Partition tolerance - The database should keep working even when a network failure cuts some nodes off from the others.
A distributed database can be designed as CP, AP, or CA, but it cannot guarantee all 3 capabilities at once.
AP - Available and Partition tolerant - Sharded model (data is distributed, no master)
Availability - Because the data is distributed, any of the nodes can return it, so the database is always available.
Partition tolerance - If any node goes down, the other nodes can step up and return the data, since the data is distributed.
Read-heavy databases with few writes, which can tolerate slightly stale reads, can use this model.
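The AP behaviour can be sketched with a toy in-memory store. This is only an illustration under stated assumptions: the `APReplica` class, its version numbers, and the last-write-wins merge are hypothetical and not taken from any real database. It shows a replica that keeps answering during a partition, returns stale data, and catches up once the partition heals.

```python
class APReplica:
    """Toy replica (hypothetical) that always answers, even when cut off."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        # Accept the write locally without waiting for any other node.
        self.data[key] = (version, value)

    def read(self, key):
        # Always return whatever this replica holds; it may be stale.
        return self.data.get(key, (0, None))[1]

    def merge_from(self, other):
        # Assumed reconciliation rule: last write wins, by version number.
        for key, (version, value) in other.data.items():
            if version > self.data.get(key, (0, None))[0]:
                self.data[key] = (version, value)


a, b = APReplica("a"), APReplica("b")
a.write("x", "old", version=1)
b.merge_from(a)  # replicas are in sync before the partition

# Network partition: the new write reaches only replica a.
a.write("x", "new", version=2)
stale = b.read("x")  # b still answers, but with the older value

# Partition heals: b pulls a's state and converges.
b.merge_from(a)
fresh = b.read("x")
```

Here `stale` holds the old value and `fresh` holds the new one, which is the availability-over-consistency tradeoff in miniature.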
CP - Consistent and Partition tolerant - (Sharded/distributed model, but one node is the master)
Consistency - Since there is a master, writes always go through it, so reads always return the latest data.
Partition tolerance - If any node goes down, the other nodes can step up and return the data, since the data is distributed.
Write-heavy databases with fewer reads can use this model.
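A CP design gives up availability when the network splits. The sketch below is a simplified model, assuming the master only serves requests while it can reach a majority of nodes; the `CPCluster` class, the `Unavailable` exception, and the `reachable` counter are hypothetical names used for illustration.

```python
class Unavailable(Exception):
    """Raised when the store refuses a request to protect consistency."""


class CPCluster:
    """Toy cluster (hypothetical) whose master needs a majority to operate."""

    def __init__(self, node_count):
        self.node_count = node_count
        # Nodes the master can currently reach, itself included.
        self.reachable = node_count
        self.value = None

    def _require_majority(self):
        # Refuse to serve rather than risk returning or storing stale data.
        if self.reachable <= self.node_count // 2:
            raise Unavailable("no majority reachable")

    def write(self, value):
        self._require_majority()
        self.value = value

    def read(self):
        self._require_majority()
        return self.value


cluster = CPCluster(node_count=3)
cluster.write("v1")

# Network partition: the master can only reach itself.
cluster.reachable = 1
try:
    cluster.write("v2")
    rejected = False
except Unavailable:
    rejected = True

# Partition heals: the last accepted write is still the one returned.
cluster.reachable = 3
after_heal = cluster.read()
```

The write during the partition is rejected instead of being accepted on one side only, so no reader ever sees conflicting values.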
CA - Consistent and Available (master and slaves, where the slaves sync data from the master; not sharded)
Consistency - Since there is a master, reads always return the latest data.
Availability - As long as the network is available, the DB is also available.
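A rough sketch of the CA setup, assuming the master copies each write to every slave synchronously before acknowledging it. The `Node` and `MasterSlaveDB` classes and the `connected` flag are hypothetical. The sketch works only while every slave is reachable, which is exactly the assumption a CA design makes.

```python
class Node:
    """Toy database node (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.connected = True


class MasterSlaveDB:
    """Toy master/slave database with synchronous replication (assumed)."""

    def __init__(self, slave_count):
        self.master = Node()
        self.slaves = [Node() for _ in range(slave_count)]

    def write(self, key, value):
        # Every slave must be reachable, otherwise the write is refused.
        if not all(slave.connected for slave in self.slaves):
            raise ConnectionError("slave unreachable; CA assumes no partition")
        self.master.data[key] = value
        for slave in self.slaves:
            slave.data[key] = value

    def read(self, key, slave_index=None):
        # Read from the master by default, or from a chosen slave.
        node = self.master if slave_index is None else self.slaves[slave_index]
        return node.data.get(key)


db = MasterSlaveDB(slave_count=2)
db.write("x", 1)
values = [db.read("x"), db.read("x", 0), db.read("x", 1)]
```

With the network intact, the master and both slaves return the same value. Setting `connected = False` on any slave makes `write` raise, which is where this model stops being available.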
So, every model has a tradeoff. To meet 2 aspects, you will have to relax the third.