Fast key-value stores

Atul Adya, Robert Grandl, Daniel Myers, and Henry Qin. 2019. Fast key-value stores: An idea whose time has come and gone. In Proceedings of the Workshop on Hot Topics in Operating Systems (HotOS '19). ACM, New York, NY, USA, 113-119. DOI:

The central thesis of this paper is that the overhead of serialisation makes remote in-memory key-value (the paper uses the abbreviation RInK) stores isn't worth the cost of building out a custom in-memory store with a domain specific API, specifically due to recent advances in auto-sharding.


Generally, RInK stores solve two problems:

  1. Caching objects over a storage system
  2. Store short-lived data, such as session information

Solving these two problems allows us to build applications that are stateless, which means that requests can be handled by any server. There are some scaling benefits, as more servers can be brought online to handle load spikes, and traffic can be shunted away from failed nodes.

RInK stores do this by providing a domain-indepedent API that operates over a sequence of bytes, e.g. the put, get, and delete functions. However, this pushes complexity back into the application server; the authors note that this

force[s] applications to repeatedly convert their internal data structures between native language representations and strings

Another problem the authors point out is when applications aren't using all of the data stored under a particular key, because there is an inefficiency of work here ("bytes are needlessly transferred only to be discarded"). Finally, there's the network latency cost.

The proposed solution is to build stateful application servers that "couples application code and long-lived memory state in the same process." If that's not possible, then building a custom in-memory store is the right answer - it's a network distance away but exposes a custom, domain-specific API. As a teaser as to the benefits of this, there's a note about a Google application that employed this idea of this "linked, in-memory key-value (LInK) store" that saw a 40% reduction of the 99.9% latency, with the potential for even further performance improvements.

There are three contributions that are going to be made in the paper:

  1. Arguments in favour of the idea that coupling application code and cached in-memory state has a lot of unappreciated performance benefits.
  2. The LInK abstraction, which is a "key-to-rich-object map linked into application servers or custom in-memory stores" to replace the RInK.
  3. Improving aplpication performance by switching to this system.

Notably, persisting data is considered outside the scope of this paper, and is ignored.


Internet services need to be highly available and reliable; one way to improve reliability is to make things as simple as possible. Since the 1990s, we've been converging on the idea of building stateless application servers (e.g. the Twelve factor app) because state adds complications. A stateless app is easier to debug, easier to maintain, and easier to operate.


My initial thought is that this is something that applies to Google, maybe, but not necessarily to everyone else. My intuition is that the performance hit from using a remote in-memory kv store just is probably not a high priority problem in most places, especially not balanced against the cost of building and maintaining a custom store. I never looked much into foundationDB, but I think (weakly) that it's a distributed SQLite, which might basically serve the same function. I think the biggest gain from the paper would be local caching of data, and I'm not sure to what extent FoundationDB does that. I'm not sure what a typical store size would be, but from past experience, it's under a gigabyte, which is tenable to store an entire copy locally - and I assume (quite possibly wrongly) that a distributed SQLite would have a way to share deltas [1], so there would an initial sync period for a new worker where the data wouldn't be local, and then it would be relatively in sync. In the past, I've used the remote KV store for both global things (e.g. a CSRF key) and for per-session stuff. As far as the per-session stuff, the work using the request data would have local access, assuming it created the data. It might be that the cost of reliably syncing this data isn't worth the trouble of the network roundtrip, and operations teams would probably have less experience with it.

It'd probably be better to invest the time in a custom library that does the translation, and then maybe keeps a local cache. There's still the sync question, though.