Unwinding Uber’s Most Efficient Service

Uber posted an article detailing how they built their “highest query per second service using Go”. The article is fairly short and is required reading to understand the motivation for this post. I have been doing some geospatial work in Golang lately and I was hoping that Uber would present some insightful approaches to working with geo data in Go. What I found fell short of my expectations to say the least…


Dockerizing MySQL at Uber Engineering

Uber Engineering’s Schemaless storage system powers some of the biggest services at Uber, such as Mezzanine. Schemaless is a scalable and highly available datastore on top ofMySQL¹ clusters. Managing these clusters was fairly easy when we had 16 clusters. These days, we have more than 1,000 clusters containing more than 4,000 database servers, and that requires a different class of tooling.

Initially, all our clusters were managed by Puppet, a lot of ad hoc scripts, and manual operations that couldn’t scale at Uber’s pace. When we began looking for a better way to manage our increasing number of MySQL clusters, we had a few basic requirements:

  • Run multiple database processes on each host
  • Automate everything
  • Have a single entry point to manage and monitor all clusters across all data center

The solution we came up with is a design called Schemadock. We run MySQL in Dockercontainers, which are managed by goal states that define cluster topologies in configuration files. Cluster topologies specify how MySQL clusters should look; for example, that there should be a Cluster A with 3 databases, and which one should be the master. Agents then apply those topologies on the individual databases. A centralized service maintains and monitors the goal state for each instance and reacts to any deviations.

Schemadock has many components, and Docker is a small but significant one. Switching to a more scalable solution has been a momentous effort, and this article explains how Docker helped us get here.


How Uber Is Disrupting Business With Machine Learning

With the company valued at over $62 billion and running in over 523 cities around the world with plans to launch a fleet of autonomous cars, Uber is constantly looking to disrupt and innovate. And they are leading the pack when it comes to the use of machine learning in their strategy.

At the Kaizen Data Conference hosted at Galvanizes’ San Francisco campus, Uber’s Head of Machine Learning, Danny Lange, discussed past, present, and future advancements in his area of expertise. This compelling talk covers artificial intelligence’s central role in business disruption and innovation, from Uber’s self-driving cars to emerging new technologies. Take a look:


Why we lost Uber as a user…

On 07/26/2016 01:53 PM, Josh Berkus wrote:
> The write amplification issue, and its correllary in VACUUM, certainly
> continues to plague some users, and doesn’t have any easy solutions.

To explain this in concrete terms, which the blog post does not:

1. Create a small table, but one with enough rows that indexes make
sense (say 50,000 rows).

2. Make this table used in JOINs all over your database.

3. To support these JOINs, index most of the columns in the small table.

4. Now, update that small table 500 times per second.

That’s a recipe for runaway table bloat; VACUUM can’t do much because
there’s always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn’t really help, we’re talking about minutes here), and
because of all of the indexes HOT isn’t effective. Removing the indexes
is equally painful because it means less efficient JOINs.

The Uber guy is right that InnoDB handles this better as long as you
don’t touch the primary key (primary key updates in InnoDB are really bad).

This is a common problem case we don’t have an answer for yet.

Josh Berkus
Red Hat OSAS
(any opinions are my own)



The early architecture of Uber consisted of a monolithic backend application written in Python that used Postgres for data persistence. Since that time, the architecture of Uber has changed significantly, to a model of microservices and new data platforms. Specifically, in many of the cases where we previously used Postgres, we now use Schemaless, a novel database sharding layer built on top of MySQL. In this article, we’ll explore some of the drawbacks we found with Postgres and explain the decision to build Schemaless and other backend services on top of MySQL.

The Architecture of Postgres

We encountered many Postgres limitations:

  • Inefficient architecture for writes
  • Inefficient data replication
  • Issues with table corruption
  • Poor replica MVCC support
  • Difficulty upgrading to newer releases

We’ll look at all of these limitations through an analysis of Postgres’s representation of table and index data on disk, especially when compared to the way MySQL represents the same data with its InnoDB storage engine. Note that the analysis that we present here is primarily based on our experience with the somewhat old Postgres 9.2 release series. To our knowledge, the internal architecture that we discuss in this article has not changed significantly in newer Postgres releases, and the basic design of the on-disk representation in 9.2 hasn’t changed significantly since at least the Postgres 8.3 release (now nearly 10 years old).

Why Uber Engineering Switched from Postgres to MySQL



At the lower levels, Uber’s engineers primarily write in Python, Node.js, Go, and Java. We started with two main languages: Node.js for the Marketplace team, and Python for everyone else. These first languages still power most services running at Uber today.

We adopted Go and Java for high performance reasons. We provide first-class support for these languages. Java takes advantage of the open source ecosystem and integrates with external technologies, like Hadoop and other analytics tools. Go gives us efficiency, simplicity, and runtime speed.

We rip out and replace older Python code as we break up the original code base into microservices. An asynchronous programming model gives us better throughput. We use Tornado with Python, but Go’s native support for concurrency is ideal for most new performance-critical services.

We write tools in C and C++ when it’s necessary (like for high-efficiency, high-speed code at the system level). We use software that’s written in those languages—HAProxy, for example—but for the most part, we don’t actually work in them.

And, of course, those working at the top of the stack write in languages beyond Java, Go, Python, and Node.

The Uber Engineering Tech Stack, Part I: The Foundation

Designing Schemaless, Uber Engineering’s Scalable Datastore Using MySQL