Historically, we have used Redis in two ways at GitHub:
We used it as an LRU cache to conveniently store the results of expensive computations over data originally persisted in Git repositories or MySQL. We call this transient Redis.
We also enabled persistence, which gave us durability guarantees over data that was not stored anywhere else. We used it to store a wide range of values: from sparse data with high read/write ratios, like configuration settings, counters, or quality metrics, to very dynamic information powering core features like spam analysis. We call this persistent Redis.
Recently we made the decision to disable persistence in Redis and stop using it as a source of truth for our data. The main motivations behind this choice were to:
- Reduce the operational cost of our persistence infrastructure by removing some of its complexity.
- Take advantage of our expertise operating MySQL.
- Gain some extra performance, by eliminating the I/O latency during the process of writing big changes on the server state to disk.
Transitioning all that information transparently involved planning and coordination. For each problem domain using persistent Redis, we considered the volume of operations, the structure of the data, and the different access patterns to predict the impact on our current MySQL capacity, and the need for provisioning new hardware.
For the majority of callsites, we replaced persistent Redis with
GitHub::KV, a MySQL key/value store of our own built atop InnoDB, with features like key expiration. We were able to use
GitHub::KV almost identically as we used Redis: from trending repositories and users for the explore page, to rate limiting to spammy user detection.