KV cache is like a giant hash map, used to reduce the latency of data access, typically by:
- putting data from slow, cheap media onto fast, expensive ones
- indexing from tree-based data structures of O(log n) to hash-based ones of O(1) for reads and writes
There are various cache policies like read-through/write-through (or write-back), and cache-aside. By and large, Internet services have a read-to-write ratio of 100:1 to 1000:1, so we usually optimize for reads.
In distributed systems, we choose those policies according to the business requirements and contexts, under the guidance of the CAP theorem.
Regular Patterns
Read
Read-through: clients read data from the database via the cache layer. The cache returns immediately when the read hits the cache; otherwise, it fetches the data from the database, caches it, and then returns the value.
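A minimal read-through sketch in Python, assuming an in-memory dict as the cache store and a small `DictDB` stub as the backing database; both names are illustrative, not a real library API:

```python
class DictDB:
    """Stand-in backing database; anything with get/put works here."""
    def __init__(self):
        self.rows = {}

    def get(self, key):
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


class ReadThroughCache:
    def __init__(self, db):
        self.db = db      # backing database (assumed interface above)
        self.data = {}    # in-process cache storage

    def get(self, key):
        if key in self.data:          # cache hit: return directly
            return self.data[key]
        value = self.db.get(key)      # cache miss: fetch from the database
        self.data[key] = value        # populate the cache
        return value                  # then return the value
```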
Write
Write-through: clients write to the cache, and the cache updates the database. The cache returns only when it finishes the database write.
Write-behind / write-back: clients write to the cache, and the cache returns immediately; the cache then asynchronously writes to the database.
Write-around: clients write to the database directly, bypassing the cache. All three write patterns are sketched below.
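A sketch of the three write patterns, assuming the same `DictDB` interface as above; the bare thread in `write_behind` is only the simplest stand-in for a real asynchronous flush queue with batching and retries:

```python
import threading

class WriteCache:
    def __init__(self, db):
        self.db = db      # backing database (assumed DictDB-like interface)
        self.data = {}    # in-process cache storage

    def write_through(self, key, value):
        # Update the cache, then the database; return only after both finish.
        self.data[key] = value
        self.db.put(key, value)

    def write_behind(self, key, value):
        # Update the cache and return immediately; flush to the
        # database asynchronously.
        self.data[key] = value
        threading.Thread(target=self.db.put, args=(key, value)).start()

    def write_around(self, key, value):
        # Skip the cache entirely; a later read-through repopulates it.
        self.db.put(key, value)
```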
Cache-aside pattern
When a cache does not support native read-through and write-through operations, and the resource demand is unpredictable, we use the cache-aside pattern.
Read: try to hit the cache. If it misses, read from the database and then update the cache.
Write: write to the database first and then ==delete the cache entry==. A common pitfall is to update the cache with the new value instead; double writes in a high-concurrency environment will make the cache dirty. Both operations are sketched below.
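A minimal cache-aside sketch under the same assumptions; note that here the application code, not the cache, talks to both stores:

```python
class CacheAside:
    def __init__(self, db):
        self.db = db      # backing database (assumed DictDB-like interface)
        self.data = {}    # in-process cache storage

    def get(self, key):
        # Read: try the cache; on a miss, read the database and
        # populate the cache before returning.
        if key in self.data:
            return self.data[key]
        value = self.db.get(key)
        self.data[key] = value
        return value

    def put(self, key, value):
        # Write: update the database first, then delete (never
        # update) the cache entry, so concurrent writers cannot
        # leave a stale value behind.
        self.db.put(key, value)
        self.data.pop(key, None)
```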
==There are still chances for a dirty cache in this pattern.== It happens when these two flows interleave in a race condition (see the simulation below):
- read the database and update the cache
- update the database and delete the cache entry
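A sequential simulation of that race, using plain dicts so it runs as-is (the thread labels are illustrative): reader A misses the cache and reads the old value from the database; writer B updates the database and deletes the cache entry; A then populates the cache with the stale value it read.

```python
cache, db = {}, {"k": "v1"}

# Thread A: cache miss, so it reads the (soon-to-be-stale) value
# from the database.
value_a = db["k"]                 # A reads "v1"

# Thread B: updates the database, then deletes the cache entry.
db["k"] = "v2"
cache.pop("k", None)

# Thread A resumes and caches the stale value it read earlier.
cache["k"] = value_a              # cache holds "v1"; database holds "v2"

assert cache["k"] != db["k"]      # dirty cache until the entry expires or is evicted
```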
Where to put the cache?
- client-side
- a distinct layer
- server-side
https://aerospike.com/learn/using-key-value-store-as-a-cache/