thoughts…

rants and bookmarks about programming stuff…


Tumblr Architecture – 15 Billion Page Views a Month and Harder to Scale than Twitter

“With over 15 billion page views a month Tumblr has become an insanely popular blogging platform. Users may like Tumblr for its simplicity, its beauty, its strong focus on user experience, or its friendly and engaged community, but like it they do.

Growing at over 30% a month has not been without challenges. Some reliability problems among them. It helps to realize that Tumblr operates at surprisingly huge scales: 500 million page views a day, a peak rate of ~40k requests per second, ~3TB of new data to store a day, all running on 1000+ servers…”

http://highscalability.com/blog/2012/2/13/tumblr-architecture-15-billion-page-views-a-month-and-harder.html
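
A quick back-of-the-envelope check of the quoted figures, as a rough sketch only. It assumes a 30-day month and evenly spread traffic, neither of which comes from the article. Page views and requests are also different units, so the average below is not directly comparable to the quoted peak of ~40k requests per second.

```python
# Back-of-the-envelope check of the quoted Tumblr figures.
# Assumptions (not from the article): a 30-day month and evenly spread traffic.

PAGE_VIEWS_PER_DAY = 500_000_000  # quoted: 500 million page views a day
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30               # assumption

monthly_page_views = PAGE_VIEWS_PER_DAY * DAYS_PER_MONTH
average_views_per_second = PAGE_VIEWS_PER_DAY / SECONDS_PER_DAY

print(f"{monthly_page_views:,} page views per month")
print(f"{average_views_per_second:,.0f} page views per second on average")
```

Under those assumptions the daily figure lines up with the 15 billion a month in the title.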



Concurrent Programming for Scalable Web Architectures

“Now let us take a step back and summarize a few important developments:

  1. The web is a dominant and ubiquitous computing technology. It will replace many traditional desktop application environments and provide ubiquitous information access.
  2. Web applications have to cope with increasing demand and scale to larger user bases. Applications that incorporate features such as collaboration and web real-time interaction are facing new challenges as compared to traditional web applications.
  3. Multi-core and multiprocessor architectures will dominate the processor landscape. At least for the next decades, performance gains of processors will mostly be attributed to increasing numbers of cores or processors and not to increased clock cycle frequencies.
  4. Large, scalable systems can only be designed when taking into account the essence of distributed and concurrent systems. Appropriate programming languages, frameworks and libraries are necessary to implement such systems.

In this thesis, we will bring together these individual developments to have a comprehensive analysis of a particular challenge: How can we tackle concurrency when programming scalable web architectures?…”

http://berb.github.com/diploma-thesis/community/



Big List of 20 Common Bottlenecks

“In Zen And The Art Of Scaling – A Koan And Epigram Approach, Russell Sullivan offered an interesting conjecture: there are 20 classic bottlenecks. This sounds suspiciously like the idea that there are only 20 basic story plots. And depending on how you chunkify things, it may be true, but in practice we all know bottlenecks come in infinite flavors, all tasting of sour and ash.

One day Aurelien Broszniowski from Terracotta emailed me his list of bottlenecks, we cc’ed Russell in on the conversation, he gave me his list, I have a list, and here’s the resulting stone soup.

Russell said this is his “I wish I knew when I was younger” list and I think that’s an enriching way to look at it. The more experience you have, the more different types of projects you tackle, the more lessons you’ll be able to add to a list like this. So when you read this list, and when you make your own, you are stepping through years of accumulated experience and more than a little frustration, but in each there is a story worth grokking…”

http://highscalability.com/blog/2012/5/16/big-list-of-20-common-bottlenecks.html



Big Data Architecture at LinkedIn

“In this interview at QCon London, LinkedIn’s Sid Anand discusses the problems they face when serving high-traffic, high-volume data. Sid explains how they’re moving some use cases from Oracle to gain headroom, and lifts the hood on their open source search and data replication projects, including Kafka, Voldemort, Espresso and Databus…”

http://www.infoq.com/interviews/12-mar-sid-anand



Scale Something: How Draw Something rode its rocket ship of growth

“We’ve learned to keep things simple. The original backend for Draw Something was designed as a simple key/value store with versioning. The service was built into our existing ruby API (using the merb framework and thin web server). Our initial idea was why not use our existing API for all the stuff we’ve done before, like users, signup/login, virtual currency, inventory; and write some new key/value stuff for Draw Something? Since we design for scale, we initially chose Amazon S3 as our data store for all this key/value data. The idea behind this was why not sacrifice some latency but gain unlimited scalability and storage.
The rest of our stack is pretty standard. Anyone who wants to build scalable systems will attempt to make every layer of the system scale independently from the rest. As the web frontend we use NGINX web server, which points to HAProxy software load balancer, which then hits our ruby API running on a thin web server. The main datastore behind this is MySQL – sharded when absolutely necessary. We use memcached heavily and redis for our asynchronous queueing, using the awesome ruby library called resque…”

http://www.gamasutra.com/view/news/168799/Scale_Something_How_Draw_Something_rode_its_rocket_ship_of_growth.php
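
To make "a simple key/value store with versioning" a bit more concrete, here is a minimal in-memory sketch. It is not the Draw Something backend: the article does not say what "versioning" meant there, so this assumes a per-key version counter with a compare-before-write check. The names `VersionedStore`, `VersionConflict` and the `game:42` key are all hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of a key."""


class VersionedStore:
    """Hypothetical in-memory key/value store with a version per key."""

    def __init__(self) -> None:
        # key -> (version, value); a key that was never written has version 0
        self._data: Dict[str, Tuple[int, Any]] = {}

    def get(self, key: str) -> Optional[Tuple[int, Any]]:
        """Return (version, value) for the key, or None if it is missing."""
        return self._data.get(key)

    def put(self, key: str, value: Any, expected_version: int = 0) -> int:
        """Write value only if the caller saw the latest version."""
        current_version = self._data[key][0] if key in self._data else 0
        if expected_version != current_version:
            raise VersionConflict(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        new_version = current_version + 1
        self._data[key] = (new_version, value)
        return new_version


store = VersionedStore()
v1 = store.put("game:42", {"turn": 1})
v2 = store.put("game:42", {"turn": 2}, expected_version=v1)

try:
    # A second client still holding v1 tries to write and is rejected.
    store.put("game:42", {"turn": 99}, expected_version=v1)
    stale_write_rejected = False
except VersionConflict:
    stale_write_rejected = True
```

The version check is one way to stop two clients from silently overwriting each other's turn; a real store would need the same check done atomically on the server side.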



Just how big are porn sites?

“While it’s a difficult domain to penetrate — hard numbers are few and far between — we know for a fact that porn sites are some of the most trafficked parts of the internet. According to Google’s DoubleClick Ad Planner, which tracks users across the web with a cookie, dozens of adult destinations populate the top 500 websites. Xvideos, the largest porn site on the web with 4.4 billion page views per month, is three times the size of CNN or ESPN, and twice the size of Reddit. LiveJasmin isn’t much smaller. YouPorn, Tube8, and Pornhub — they’re all vast, vast sites that dwarf almost everything except the Googles and Facebooks of the internet…”

http://www.extremetech.com/computing/123929-just-how-big-are-porn-sites



Sharding Postgres with Instagram

“On Tuesday last week we had a terrific SFPUG meeting at which Mike Krieger of Instagram explained how they grew and eventually sharded their 2TB of Postgres data to support 27 million users. It’s a great presentation which explains the growth process of a successful web/mobile startup, as well as horizontally scaling PostgreSQL. Yes, you too can use PostgreSQL to make One Billion Dollars!
Video is on UStream. Sorry you can’t see the slides on the video; we had technical issues with the camera. Slides are here, you can click along with Mike talking on your own…”

http://www.databasesoup.com/2012/04/sharding-postgres-with-instagram.html
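
For readers new to the idea, horizontal sharding boils down to a routing function that decides which database holds a given row. The sketch below is a generic illustration, not Instagram's scheme: the excerpt gives no details, so the shard count, the database names and the modulo mapping are all assumptions.

```python
# Generic shard-routing sketch. Every constant and name here is hypothetical.

LOGICAL_SHARDS = 16                    # assumption: fixed number of logical shards
PHYSICAL_DATABASES = ["pg-a", "pg-b"]  # assumption: two PostgreSQL servers


def logical_shard(user_id: int) -> int:
    """Map a user to a logical shard; the mapping never changes for a user."""
    return user_id % LOGICAL_SHARDS


def database_for(user_id: int) -> str:
    """Map the user's logical shard onto one of the physical databases."""
    shard = logical_shard(user_id)
    return PHYSICAL_DATABASES[shard % len(PHYSICAL_DATABASES)]


for uid in (0, 17, 34):
    print(uid, logical_shard(uid), database_for(uid))
```

Keeping more logical shards than physical servers is a common design choice: when a server fills up, whole logical shards can be moved to new hardware without changing which shard a user belongs to.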



What Powers Instagram: Hundreds of Instances, Dozens of Technologies

“One of the questions we always get asked at meet-ups and conversations with other engineers is, “what’s your stack?” We thought it would be fun to give a sense of all the systems that power Instagram, at a high-level; you can look forward to more in-depth descriptions of some of these systems in the future. This is how our system has evolved in the just-over-1-year that we’ve been live, and while there are parts we’re always re-working, this is a glimpse of how a startup with a small engineering team can scale to our 14 million+ users in a little over a year. Our core principles when choosing a system are…”

http://instagram-engineering.tumblr.com/post/13649370142/what-powers-instagram-hundreds-of-instances-dozens-of
