Searching 20 GB/sec: Systems Engineering Before Algorithms

TL;DR: Four years ago, I left Google with the idea for a new kind of server monitoring tool. The idea was to combine traditionally separate functions such as log exploration, log aggregation and analysis, metrics gathering, alerting, and dashboard generation into a single service. One tenet was that the service should be fast, giving ops teams a lightweight, interactive, “fun” experience. This would require analyzing multi-gigabyte data sets at subsecond speeds, and doing it on a budget. Existing log management tools were often slow and clunky, so we were facing a challenge, but the good kind — an opportunity to deliver a new user experience through solid engineering.

This article describes how we met that challenge using an “old school”, brute-force approach, by eliminating layers and avoiding complex data structures. There are lessons here that you can apply to your own engineering challenges.

Sorting Algorithm Animations

These pages show 8 different sorting algorithms on 4 different initial conditions. These visualizations are intended to:

  • Show how each algorithm operates.
  • Show that there is no best sorting algorithm.
  • Show the advantages and disadvantages of each algorithm.
  • Show that worse-case asymptotic behavior is not always the deciding factor in choosing an algorithm.
  • Show that the initial condition (input order and key distribution) affects performance as much as the algorithm choice.

The ideal sorting algorithm would have the following properties:

  • Stable: Equal keys aren’t reordered.
  • Operates in place, requiring O(1) extra space.
  • Worst-case O(n·lg(n)) key comparisons.
  • Worst-case O(n) swaps.
  • Adaptive: Speeds up to O(n) when data is nearly sorted or when there are few unique keys.

There is no algorithm that has all of these properties, and so the choice of sorting algorithm depends on the application.

Make any algorithm lock-free with this one crazy trick

Lock-free algorithms often operate by having several versions of a data structure in use at one time. The general pattern is that you can prepare an update to a data structure, and then use a machine primitive to atomically install the update by changing a pointer. This means that all subsequent readers will follow the pointer to its new location – for example, to a new node in a linked-list – but this pattern can’t do anything about readers that have already followed the old pointer value, and are traversing the previous version of the data structure.

Those readers will see a correct, linearizable version of the data structure, so this pattern doesn’t present a correctness concern. Instead, the problem is garbage collection: who retires the old version of the data structure, to free the memory it’s taking now that it’s unreachable? To put it another way: how do you tell when all possible readers have finished reading an old version?

Of course, there are many techniques for solving this reclamation problem. See this paper for a survey, and this paper for a recent improvement over epoch-based reclamation. RCU, which is an API for enabling single-writer, multi-reader concurrency in the Linux kernel, has an elegant way of solving the problem.

Implementing a Stepping Debugger in JavaScript

In my previous post I introduced Unwinder, a project which implements continuations in JavaScript. What does that have to do with stepping debuggers? Unwinder uses continuations to implement a debugger, since it can reuse the same machinery to pause code at any time.

This post could be titled “Implementing Continuations in JavaScript,” but a lot more people know what a stepping debugger is. Besides, implementing a stepping debugger is a pretty cool use case for this stuff.

If you haven’t read the previous article (which explains in-depth what continuations are and heavily uses stepping debuggers to help explain them), here is a live example of a stepping debugger. You can edit the code, and click on any line to set a breakpoint…

How VectorDrawable works

We’ve talked about two of the biggest players in Android image format world (JPG & PNG) but it’s worth noting, that even with these advanced algorithms, there’s still a level of compression that these impressive formats can’t get to. To get there, we have to kick our concept of image creation to the curb, for something… a little more algorithmic.

Webinar Recording: Design Patterns and Modern C++

The recording of our May 24th webinar, Design Patterns and Modern C++, is now available on JetBrainsTV YouTube channel.

In this webinar, Dmitri Nesteruk shows how the classical Design Patterns can be applied to Modern C++. He is covering both their canonical implementations as well as possible improvements.

Demo project is available on GitHub. And if you have any suggestions or improvements, your pull requests are welcome!

TrailDB – An Efficient Library for Storing and Processing Event Data

tl;dr Today, we are open-sourcing TrailDB, a core library powering AdRoll. TrailDB makes it fast and fun to handle event data. Find it at

TrailDB was created at AdRoll to power processing of time-series of events. You can use it, for instance, to

  • Compute metrics, such as the bounce rate
  • Analyze usage patterns
  • Detect anomalies
  • Cluster and predict user behavior

Since 2014, AdRoll has used TrailDB to store and query over 20 trillion events originating from the web.