True Zero Downtime HAProxy Reloads

HAProxy: Cornerstone of Reliable Websites

“One primary goal of the infrastructure teams here at Yelp is to get as close to zero downtime as possible. This means that when users make requests, we want to ensure that they get a response, and that they get a response as fast as possible. One way we do that at Yelp is by using the excellent HAProxy load balancer. We use it everywhere: for our external load balancing, internal load balancing, and with our move to a Service Oriented Architecture, we find ourselves running HAProxy on every machine at Yelp as part of SmartStack.

We love the flexibility that SmartStack gives us in developing our SOA, but that flexibility comes at a cost. When services or service backends are added or permanently removed, HAProxy has to reload across our entire infrastructure. These reloads can cause reliability problems because while HAProxy is top notch at not dropping traffic while it is running, it can (and does) drop traffic during reloads…”


Building a high performance SSD SAN

“Over the coming month I will be architecting, building and testing a modular, high performance SSD-only storage solution.

I’ll be documenting my progress / findings along the way and open sourcing all the information as a public guide.

With recent price drops and durability improvements in solid state storage, now is a better time than any to ditch those old magnets.

Modular server manufacturers such as SuperMicro have spent large on R&D thanks to the ever growing requirements from cloud vendors that utilise their hardware…”

Luajit IO framework

“Pure Lua IO framework, with C efficiency due to its simple but powerful design, and, of course, thanks to luajit, the perfect JIT engine. The HTTP server works out of the box and simulates the functionality and performance of nginx and ngx_lua. Moreover, it could be used to develop generic TCP/UDP servers…”

Building a Distributed Fault-Tolerant Key-Value Store

“First of all, let’s discuss briefly what a Key-Value store is, and how it compares to a relational database.

Key-Value Stores offer a simple abstraction over your data, working as a dictionary data structure. Such a database provides a mechanism for storage and retrieval of data that is modeled and manipulated by means of basic CRUD operations (create, read, update, delete). The API of these databases is usually kept simple, and even if they provide an SQL-like language like Cassandra’s Query Language, it’s intentionally kept much simpler than full-blown SQL.


This simpler functionality means that Key-Value Stores, and NoSQL databases in general, often give more responsibility to the user, who now needs to manually do a lot of the work that the system takes care of automatically in a relational database. They sacrifice the expressivity brought by an expressive language like SQL, and the integrity checks brought by these schema-based models. This in turn means that NoSQL systems are free to choose other trade-offs that will result in higher availability, performance, scalability or other specific qualities.

One important thing regarding RDBMS and NoSQL is their respective theoretical models, which establish the guarantees that such a system provides to the end user. They are known as the ACID and BASE models, in one of those fancy metaphors made out of acronyms.

So, why are NoSQL systems popular nowadays? Well, not without some controversy, but the main selling points could be summarized as:

  • Speed
  • Single Point of Failure (SPOF) avoidance
  • Better support for large amounts of unstructured data
  • Lower TCO (total cost of ownership, e.g. sysadmins)
  • Incremental scalability

These are the sort of qualities that define Google, Facebook and the other big players and their business models, so it makes perfect sense to them. Whether it makes sense for your particular situation (probably not), well, it’s the core of that controversy, and it’s not really the intention of this post to dig into that…”
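
To make the "dictionary with CRUD operations" idea from the excerpt above concrete, here is a minimal single-process sketch in Python. The class name KeyValueStore and its method names are hypothetical and chosen for illustration only; they are not taken from the linked article, and nothing here is distributed or fault-tolerant.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store exposing only basic CRUD operations.

    Hypothetical illustration: a real store would add persistence,
    replication and failure handling on top of this interface.
    """

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def create(self, key: str, value: str) -> None:
        # Refuse to overwrite an existing key.
        if key in self._data:
            raise KeyError(f"key already exists: {key}")
        self._data[key] = value

    def read(self, key: str) -> Optional[str]:
        # Returns None when the key is absent.
        return self._data.get(key)

    def update(self, key: str, value: str) -> None:
        # Only existing keys can be updated.
        if key not in self._data:
            raise KeyError(f"no such key: {key}")
        self._data[key] = value

    def delete(self, key: str) -> None:
        # Deleting a missing key is a no-op.
        self._data.pop(key, None)


store = KeyValueStore()
store.create("user:1", "alice")
store.update("user:1", "alice@example.com")
print(store.read("user:1"))
store.delete("user:1")
print(store.read("user:1"))
```

Note that the store knows nothing about what a value means: any validation, joins or cross-key consistency would have to live in application code, which is the extra responsibility the excerpt says falls on the user.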

RethinkDB 2.0 is now production ready

“RethinkDB 2.0 inverts the traditional database model. Instead of polling for changes, the developer can tell RethinkDB to continuously push live query updates to the application. Propagating updates from the persistence layer simplifies realtime application architecture by eliminating the need for external message queues and other extraneous plumbing. You can also take advantage of RethinkDB’s distributed architecture to effortlessly scale your realtime app…”