The early architecture of Uber consisted of a monolithic backend application written in Python that used Postgres for data persistence. Since that time, the architecture of Uber has changed significantly, to a model of microservices and new data platforms. Specifically, in many of the cases where we previously used Postgres, we now use Schemaless, a novel database sharding layer built on top of MySQL. In this article, we’ll explore some of the drawbacks we found with Postgres and explain the decision to build Schemaless and other backend services on top of MySQL.
The Architecture of Postgres
We encountered many Postgres limitations:
- Inefficient architecture for writes
- Inefficient data replication
- Issues with table corruption
- Poor replica MVCC support
- Difficulty upgrading to newer releases
Location and navigation using global positioning systems (GPS) is deeply embedded in our daily lives, and is particularly crucial to Uber’s services. To orchestrate quick, efficient pickups, our GPS technologies need to know the locations of matched riders and drivers, as well as provide navigation guidance from a driver’s current location to where the rider needs to be picked up, and then, to the rider’s chosen destination. For this process to work seamlessly, the location estimates for riders and drivers need to be as precise as possible.
Since the (literal!) launch of GPS in 1973, we have advanced our understanding of the world, experienced exponential growth in the computational power available to us, and developed powerful algorithms to model uncertainty from fields like robotics. While our lives have become increasingly dependent on GPS, the fundamentals of how GPS works have not changed that much, which leads to significant performance limitations. In our opinion, it is time to rethink some of the starting assumptions that were true in 1973 regarding where and how we use GPS, as well as the computational power and additional information we can bring to bear to improve it.
While GPS works well under clear skies, its location estimates can be wildly inaccurate (with a margin of error of 50 meters or more) when we need it the most: in densely populated and highly built-up urban areas, where many of our users are located. To overcome this challenge, we developed a software upgrade to GPS for Android which substantially improves location accuracy in urban environments via a client-server architecture that utilizes 3D maps and performs sophisticated probabilistic computations on GPS data available through Android’s GNSS APIs.
In this article, we discuss why GPS can perform poorly in urban environments and outline how we fix it using advanced signal processing algorithms deployed at scale on our server infrastructure.
Matt Ranney talks about the limits that some companies have encountered in their large microservices deployments and some non-microservices approaches to those same problems. He also talks about the non-microservices systems that Uber is building to maintain developer productivity with a large and growing engineering team.
Uber posted an article detailing how they built their “highest query per second service using Go”. The article is fairly short and is required reading to understand the motivation for this post. I have been doing some geospatial work in Golang lately and I was hoping that Uber would present some insightful approaches to working with geo data in Go. What I found fell short of my expectations to say the least…