Four Things Working at Facebook Has Taught Me About Design Critique

“…at Facebook, critiques have played out a bit differently. The meetings are much more centralized around authentic critique and less about providing criticism or pushing an agenda.

Many of the methods we’ve incorporated for critiques come primarily fromJared M. Spool’s Moving from Critical Review to Critique.” What Spool writes about critique has made a tremendous impact on my understanding of what makes a critique worthwhile, particularly at Facebook. As a result, I’ve come to embrace the notion that dedicating a few hours every week for a meeting can undoubtedly prove itself to be valuable for everyone who attends…”

Writing high-performance servers in modern C++

“I mentioned in my previous post that I was able to build a prototype database engine within one day using Facebook’s Wangle so this post explains how I managed that. By the end of this post, you will be able to write a high-performance C++ server using Wangle. This post also serves as a tutorial which will be merged into Wangle’s…”

Immutable collections for JavaScript

Immutable data cannot be changed once created, leading to much simpler application development, no defensive copying, and enabling advanced memoization and change detection techniques with simple logic. Persistent data presents a mutative API which does not update the data in-place, but instead always yields new updated data.

Immutable provides Persistent Immutable List, Stack, Map, OrderedMap, Set, OrderedSet and Record. They are highly efficient on modern JavaScript VMs by using structural sharing via hash maps tries and vector tries as popularized by Clojure and Scala, minimizing the need to copy or cache data.

Immutable also provides a lazy Seq, allowing efficient chaining of collection methods like map and filter without creating intermediate representations. Create some Seq with Range and Repeat…”

Gorilla: A Fast, Scalable, In-Memory Time Series Database

“Large-scale internet services aim to remain highly available and responsive in the presence of unexpected failures. Providing this service often requires monitoring and analyzing tens of millions of measurements per second across a large number of systems, and one particularly eective solution is to store and query such measurements in a time series database (TSDB). A key challenge in the design of TSDBs is how to strike the right balance between eciency, scalability, and reliability. In this paper we introduce Gorilla, Facebook’s in-memory TSDB. Our insight is that users of monitoring systems do not place much emphasis on individual data points but rather on aggregate analysis, and recent data points are of much higher value than older points to quickly detect and diagnose the root cause of an ongoing problem. Gorilla optimizes for remaining highly available for writes and reads, even in the face of failures, at the expense of possibly dropping small amounts of data on the write path. To improve query eciency, we aggressively leverage compression techniques such as delta-of-delta timestamps and XOR’d oating point values to reduce Gorilla’s storage footprint by 10x. This allows us to store Gorilla’s data in memory, reducing query latency by 73x and improving query throughput by 14x when compared to a traditional database (HBase)-backed time series data. This performance improvement has unlocked new monitoring and debugging tools, such as time series correlation search and more dense visualization tools. Gorilla also gracefully handles failures from a single-node toentire regions with little to no operational overhead…”
paper: p1816-teller

One Trillion Edges: Graph Processing at Facebook-Scale

“Analyzing large graphs provides valuable insights for social networking and web companies in content ranking and recommendations. While numerous graph processing systems have been developed and evaluated on available benchmark graphs of up to 6.6B edges, they often face significant dif- ficulties in scaling to much larger graphs. Industry graphs can be two orders of magnitude larger – hundreds of billions or up to one trillion edges. In addition to scalability challenges, real world applications often require much more complex graph processing workflows than previously evaluated. In this paper, we describe the usability, performance, and scalability improvements we made to Apache Giraph, an open-source graph processing system, in order to use it on Facebook-scale graphs of up to one trillion edges. We also describe several key extensions to the original Pregel model that make it possible to develop a broader range of production graph applications and workflows as well as improve code reuse. Finally, we report on real-world operations as well as performance characteristics of several large-scale production applications…”

file: p1804-ching

Improving Facebook’s performance on Android with FlatBuffers

“On Facebook, people can keep up with their family and friends through reading status updates and viewing photos. In our backend, we store all the data that makes up the social graph of these connections. On mobile clients, we can’t download the entire graph, so we download a node and some of its connections as a local tree structure.

The image below illustrates how this works for a story with picture attachments. In this example, John creates the story, and then his friends like it and comment on it. On the left-hand side of the image is the social graph, to describe relations in the Facebook backend. When the Android app queries for the story, we get a tree structure starting with the story, including information about actor, feedback, and attachments (shown on the right-hand side of the image)…”

Wikipedia on HHVM

“If you’ve been watching our GitHub wiki, following us on Twitter, or reading the wikitech-l mailing list, you’ve probably known for a while that Wikipedia has been transitioning to HHVM. This has been a long process involving lots of work from many different people, and as of a few weeks ago, all non-cached API and web traffic is being served by HHVM. This blog post from the Wikimedia Foundation contains some details about the switch, as does their page about HHVM.

I spent four weeks in July and August of 2014 working at the Wikimedia Foundation office in San Francisco to help them out with some final migration issues. While the primary goal was to assist in their switch to HHVM, it was also a great opportunity to experience HHVM as our open source contributors see it. I tried to do most of my work on WMF servers, using HHVM from GitHub rather than our internal repository. In addition to the work I did on HHVM itself, I also gave a talk about what the switch to HHVM means for Wikimedia developers…”