Performance matters. One of the first tools we as developers reach for when looking to get more performance out of a system is caching. As Reddit has grown in users and response times have improved, the amount of caching has grown to be quite large as well.
In this post we’ll talk about some of the nuts-and-bolts numbers of Reddit’s caching infrastructure—the number of instances, size of instances, and overall throughput. We hope that sharing this information may help others gauge what type of performance and sizing they can expect when building similar clusters. At the very least, we hope you’ll find it interesting to see a bit more about how Reddit works under the hood.
We’ll also go over the Reddit-specific type of work our caches do, how we use mcrouter to manage our caches more effectively, and the custom monitoring (MemcachedSlabCollector and mcsauna) we’ve written to help us understand what’s going on behind the scenes. We’ll also talk about some of the more subtle issues that we’ve run into when deploying changes to our caches.