Launching nginScript and Looking Ahead

I’ve been wanting to add more scripting capabilities to NGINX for a long time. Scripting lets people do more in NGINX without having to write C modules, for example. Lua is a good tool in this area, but it’s not as widely known as some other languages.

JavaScript was the most obvious language to add next. It’s the most popular language – #1 on GitHub for the past three years. JavaScript is also a good fit for the way we configure NGINX.

I recently announced a working prototype of a JavaScript virtual machine (VM) that would be embedded within NGINX. Today we announced the launch of the first preview of this software, nginScript, at nginx.conf 2015.

This is another milestone in the development of NGINX open source software and NGINX Plus. I want to take the opportunity to explain what nginScript is, describe why it’s needed, share some examples, and talk about the future.

Agera – Reactive Programming for Android

Agera (Swedish for “to act”) is a super lightweight Android library that helps prepare data for consumption by the Android application components (such as Activities), or objects therein (such as Views), that have life-cycles in one form or another. It introduces a flavor of functional reactive programming, facilitates clear separation of the when, where and what factors of a data processing flow, and enables describing such a complex and asynchronous flow with a single expression, in near natural language.

Making 1 million requests with python-aiohttp

In this post I’d like to test the limits of Python’s aiohttp and check its performance in terms of requests per minute. Everyone knows that asynchronous code performs better when applied to network operations, but it’s still interesting to check this assumption and understand exactly how and why it is better. I’m going to check it by trying to make 1 million requests with the aiohttp client. How many requests per minute will aiohttp make? What kinds of exceptions and crashes can you expect when you try to make such a volume of requests with very primitive scripts? What are the main gotchas that you need to think about when making such a volume of requests?
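As a rough illustration of one gotcha such an experiment tends to run into (too many requests in flight at once), here is a minimal sketch that caps concurrency with a semaphore. It is not the code from the post: the helper names `bounded_gather` and `fetch_all`, the `limit` parameter, and the choice to collect exceptions instead of raising them are all assumptions made for this example.

```python
import asyncio


async def bounded_gather(make_request, total, limit):
    """Run make_request(0..total-1) with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(index):
        # Wait for a free slot before starting the request.
        async with semaphore:
            return await make_request(index)

    # return_exceptions=True keeps one failed request from aborting the rest;
    # failures show up as exception objects in the result list.
    return await asyncio.gather(
        *(guarded(i) for i in range(total)), return_exceptions=True
    )


async def fetch_all(url, total, limit):
    """Hypothetical helper: issue `total` GET requests to `url` with aiohttp."""
    import aiohttp  # third-party dependency, imported lazily

    async with aiohttp.ClientSession() as session:

        async def make_request(index):
            async with session.get(url) as response:
                await response.read()  # drain the body so the connection is reusable
                return response.status

        return await bounded_gather(make_request, total, limit)
```

Something like `asyncio.run(fetch_all("http://localhost:8080/", 1000, 100))` would then return a list of status codes and exception objects. The URL and both numbers are placeholders. Note that this sketch still creates all coroutines up front, so memory use grows with `total`.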

Lessons from Building a Node App in Docker

Here are some tips and tricks that I learned the hard way when developing and deploying web applications written for node.js using Docker.

In this tutorial article, we’ll set up the chat example in docker, from scratch to production-ready, so hopefully you can learn these lessons the easy way. In particular, we’ll see how to:

  • Actually get started bootstrapping a node application with docker.
  • Not run everything as root (bad!).
  • Use binds to keep your test-edit-reload cycle short in development.
  • Manage node_modules in a container for fast rebuilds (there’s a trick to this).
  • Ensure repeatable builds with npm shrinkwrap.
  • Share a Dockerfile between development and production.

This tutorial assumes you already have some familiarity with Docker and node.js. If you’d like a gentle intro to docker first, you can try my slides about docker (discussion on hacker news) or try one of the many, many other docker intros out there.

Load Balancing DNS Traffic with NGINX and NGINX Plus

NGINX Plus R9 introduces the ability to reverse proxy and load balance UDP traffic, a significant enhancement to NGINX Plus’ Layer 4 load-balancing capabilities.

This blog post looks at the challenges of running a DNS server in a modern application infrastructure to illustrate how the open source NGINX software and NGINX Plus can effectively and efficiently load balance both UDP and TCP traffic (for brevity, we’ll refer to NGINX Plus for the rest of the post).

6 Lesser Known Python Data Analysis Libraries

Python offers a great environment and a rich set of libraries for developers working with data. There are tons of useful libraries out there to help novice or experienced developers and analysts process or visualize datasets. Some of these libraries are really popular and used by millions of developers, for example Pandas, NumPy, Scikit-learn, NLTK, etc. Others are not so well known but have turned out to be handy in my experience. This article introduces 6 such Python libraries for working with data. Readers might already be familiar with some of them, but I hope this article still proves useful.

Cassandra anti-patterns: Queues and queue-like datasets

Deletes in Cassandra

Cassandra uses a log-structured storage engine. Because of this, deletes do not remove the rows and columns immediately and in-place. Instead, Cassandra writes a special marker, called a tombstone, indicating that a row, column, or range of columns was deleted. These tombstones are kept for at least the period of time defined by the gc_grace_seconds per-table setting. Only then can a tombstone be permanently discarded by compaction.

This scheme allows for very fast deletes (and writes in general), but it’s not free: aside from the obvious RAM/disk overhead of tombstones, you might have to pay a certain price when reading data back if you haven’t modelled your data well.

Specifically, tombstones will bite you if you do lots of deletes (especially column-level deletes) and later perform slice queries on rows with a lot of tombstones.
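To make the read cost concrete, here is a toy model in plain Python. It is only a sketch of the idea and not how Cassandra's storage engine works: the class `ToyPartition` and its methods are invented for this illustration, and it overwrites cells in place where Cassandra would append a separate tombstone record. What it does capture is that a read has to step over every tombstone that sorts before the first live cell.

```python
TOMBSTONE = object()  # stand-in for Cassandra's deletion marker


class ToyPartition:
    """One partition whose cells are kept in clustering (insertion) order."""

    def __init__(self):
        self.cells = []  # list of [clustering_key, value]

    def enqueue(self, key, payload):
        self.cells.append([key, payload])

    def dequeue(self):
        """Delete the oldest live cell by marking it, and return its payload."""
        for cell in self.cells:
            if cell[1] is not TOMBSTONE:
                payload = cell[1]
                cell[1] = TOMBSTONE  # the cell stays in place as a marker
                return payload
        return None

    def select_first(self):
        """Emulate a LIMIT 1 read: return (payload, tombstones_scanned)."""
        scanned = 0
        for _key, value in self.cells:
            if value is TOMBSTONE:
                scanned += 1  # deleted data still has to be read and skipped
                continue
            return value, scanned
        return None, scanned
```

Enqueueing N messages, dequeueing N - 1 of them and then calling `select_first()` returns the last payload together with a scan count of N - 1. In this toy model, reading a single live message costs time proportional to the number of deletes that came before it.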

Symptoms of a wrong data model

To illustrate this scenario, let’s consider the most extreme case – using Cassandra as a durable queue, a known anti-pattern, e.g.

CREATE TABLE queues ( name text, enqueued_at timeuuid, payload blob, PRIMARY KEY (name, enqueued_at) );

Having enqueued 10000 10-byte messages and then dequeued 9999 of them, one by one, let’s peek at the last remaining message using cqlsh with TRACING ON:

SELECT enqueued_at, payload FROM queues WHERE name = 'queue-1' LIMIT 1;