Getting started with OTP: creating psycho families

“OTP, or Open Telecom Platform, is what makes Erlang “Erlang” (there are a lot of other things too, but I just wanted to sound dramatic!). It is such an integral part of Erlang that whenever people talk or write about Erlang they usually say Erlang/OTP. So you might be wondering what this OTP thing is. OTP is nothing but a framework for creating servers and process hierarchies. Just like you have web frameworks like Django and Ruby on Rails for creating websites, Erlang has the OTP framework for creating servers and process trees. This post will describe what process trees are, why this OTP thing is so cool, and also why OTP creates psycho families…”

http://bit.ly/1knHCzH

Introducing the Geo API in Redis

“A few days ago, the Geo API was introduced in Redis. At the time of writing, the work is fairly complete but still considered a work in progress: everything you’ll read here is only available in the unstable development branch and has not yet been released for production (the plan is to ship it with the upcoming 3.2 version).

The Geo API consists of a set of new commands that add support for storing and querying pairs of longitude/latitude coordinates into Redis keys. GeoSet is the name of the data structure holding a set of (x,y) coordinates. Actually, there isn’t any new data structure under the hood: a GeoSet is simply a Redis SortedSet…”

http://cristian.regolo.cc/2015/07/07/introducing-the-geo-api-in-redis.html
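The “a GeoSet is simply a Redis SortedSet” point is worth a sketch: quantize latitude and longitude, then interleave their bits into a single score, so nearby points tend to land near each other in the sorted set. The helper names and bit widths below are illustrative, not Redis’s actual code:

```python
# Sketch of how a GeoSet can piggyback on a sorted set: quantize
# latitude and longitude, then interleave their bits (Morton/Z-order,
# the idea behind geohashes) into one integer score. Constants and
# helper names here are illustrative, not Redis's exact implementation.

def quantize(value, min_v, max_v, bits=26):
    """Map value in [min_v, max_v] to an integer in [0, 2**bits - 1]."""
    return int((value - min_v) / (max_v - min_v) * ((1 << bits) - 1))

def interleave(x, y):
    """Interleave the bits of x and y into a single integer."""
    score = 0
    for i in range(26):
        score |= ((x >> i) & 1) << (2 * i)
        score |= ((y >> i) & 1) << (2 * i + 1)
    return score

def geo_score(longitude, latitude):
    return interleave(
        quantize(longitude, -180.0, 180.0),
        quantize(latitude, -85.05112878, 85.05112878),  # Mercator limits
    )

score = geo_score(12.496366, 41.902783)  # Rome
assert 0 <= score < 1 << 52              # fits in a 52-bit sorted-set score
```

Because the combined score groups points by shared bit prefixes, range queries over the sorted set can be used to narrow down radius searches.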

MariaDB 10.1 can do 1 million queries per second

The benchmark is sysbench-mariadb (sysbench trunk with a fix for a more scalable random number generator) OLTP simplified to do 1000 point selects per transaction. The data set is 1 million rows in 20 tables. Fewer tables can be used, but below 4 tables the performance drops somewhat due to a hot spot in the table definition cache.

This is the my.cnf used for this test:

[mysqld]
max_connections = 400
table_open_cache = 800
query_cache_type = 0
innodb_buffer_pool_size = 512M
innodb_buffer_pool_instances = 10
innodb_adaptive_hash_index_partitions = 20

And this is the sysbench command line:

sysbench-mariadb --test=lua/oltp.lua --oltp-tables-count=20 \
--oltp-table-size=50000 --num-threads=... --oltp-read-only=on \
--oltp-point-selects=1000 --oltp-distinct-ranges=0 \
--oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 \
--max-time=100 --max-requests=0 run

https://blog.mariadb.org/10-1-mio-qps/

IncludeOS – Run your C++ code directly on virtual hardware

IncludeOS aims to be the thinnest, lightest possible layer between your C++ code and virtual hardware. We provide a bootloader, standard libraries, lots (we hope) of modules, and the build and deployment system. You provide the service.

IncludeOS is designed for KVM/Linux, but previous versions have also been tested successfully on VirtualBox (which in turn runs on OS X, Windows, and Linux) and Bochs…

http://www.includeos.org/

Siberite is a simple, lightweight, leveldb backed message queue written in Go

“Siberite is a simple leveldb backed message queue server
(twitter/kestrel, wavii/darner rewritten in Go).

Siberite is a very simple message queue server. Unlike in-memory servers such as Redis, Siberite is designed to handle queues much larger than what can be held in RAM. And unlike enterprise queue servers such as RabbitMQ, Siberite keeps all messages out of process, using goleveldb as persistent storage.

The result is a durable queue server that uses a small amount of resident memory, regardless of queue size…”

https://github.com/bogdanovich/siberite
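The design point in the blurb — messages live on disk so resident memory stays small no matter how deep the queue gets — can be shown with a toy sketch. Siberite itself uses goleveldb; this file-based version is only an illustration of the idea:

```python
# Toy durable FIFO in the spirit the README describes: length-prefixed
# records are appended to a log file on disk, and the process keeps only
# a read offset in memory. Not Siberite's actual implementation, which
# stores messages in goleveldb.
import os
import struct
import tempfile

class DiskQueue:
    def __init__(self, path):
        self.path = path
        self.read_offset = 0        # the only per-queue state held in RAM
        open(path, "ab").close()    # ensure the log file exists

    def put(self, message: bytes):
        # Append a 4-byte big-endian length prefix followed by the payload.
        with open(self.path, "ab") as f:
            f.write(struct.pack(">I", len(message)) + message)

    def get(self):
        with open(self.path, "rb") as f:
            f.seek(self.read_offset)
            header = f.read(4)
            if len(header) < 4:
                return None         # queue drained
            (size,) = struct.unpack(">I", header)
            message = f.read(size)
            self.read_offset += 4 + size  # advance past the record
            return message

q = DiskQueue(os.path.join(tempfile.mkdtemp(), "toy_queue.log"))
q.put(b"hello")
q.put(b"world")
assert q.get() == b"hello"
assert q.get() == b"world"
assert q.get() is None
```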

What is a core dump and how do you parse one? (backtrace.io)

“A core represents the state of a process at a point in time. It contains all the information that an engineer needs in order to inspect the process and its state even after the process has exited. This information includes thread information, mapped memory, register state and more. By using a debugger with the core file, engineers can interact with and inspect the state of the process as if they had attached a debugger to the process at the time when the core file was generated.

Ever wonder what exactly is contained in a core dump and how debuggers interact with them? This post is for you. We will explore how cores are generated and how software interacts with them…”

http://backtrace.io/blog/blog/2015/10/03/whats-a-coredump/
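One detail worth previewing: a core dump is an ordinary ELF file whose `e_type` field is `ET_CORE`. The sketch below builds a minimal little-endian ELF64 header and parses it the way a debugger starts out; a real parser would continue into the program headers and the PT_NOTE segments that hold thread and register state. The field offsets follow the ELF specification; the helper names are invented:

```python
# Core files are ELF files with e_type == ET_CORE (4). Build a minimal
# 64-bit little-endian ELF header, then parse it: check the magic,
# then read e_type at offset 16 -- the first step any core parser takes.
import struct

ET_CORE = 4      # e_type value marking a core dump
EM_X86_64 = 62   # e_machine value for x86-64

def make_elf64_header(e_type, e_machine):
    # e_ident: magic, 64-bit class (2), little-endian (1), version 1, padding.
    ident = b"\x7fELF" + bytes([2, 1, 1]) + b"\x00" * 9
    # e_type, e_machine, e_version; remaining header fields zeroed for brevity.
    return ident + struct.pack("<HHI", e_type, e_machine, 1) + b"\x00" * 40

def classify(data):
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    (e_type,) = struct.unpack_from("<H", data, 16)  # e_type lives at offset 16
    return "core dump" if e_type == ET_CORE else "other ELF"

core = make_elf64_header(ET_CORE, EM_X86_64)
assert classify(core) == "core dump"
```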

Higher Order Macros in C++

“C++ is a pretty powerful language for defining abstractions which let you get rid of redundancy. Functions and methods address duplicate chunks of imperative code. Base classes let you reuse data definitions. Templates let you do… well… almost anything.

Even so, there are often still hunks of repetition that you can’t seem to eliminate. For example, let’s say we’re working with a language’s syntax. Typically, the parser generates an AST which then gets passed to the compiler. The compiler walks the AST using Ye Olde Visitor Patterne and generates some lower-level representation for it.

Depending on how rich your language is, you’ll have quite a few different AST classes to represent the different syntactic elements: literals, unary operators, infix expressions, statements, flow control, definitions, etc. V8, for example, has 40 classes to cover everything you can express in JavaScript…”

http://journal.stuffwithstuff.com/2012/01/24/higher-order-macros-in-c/
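The article’s trick is a C++ “X-macro”: define the list of AST node kinds once, then expand it several ways (class declarations, visitor methods, and so on). The same “one list, many expansions” shape can be sketched in Python, with invented node names standing in for the article’s C++ macro expansions:

```python
# "One list, many expansions": the node list below is defined once and
# drives both the class definitions and the visitor stubs, analogous to
# what the article's higher-order macro does at C++ preprocessing time.
# Node names are invented examples.
AST_NODES = ["Literal", "UnaryOp", "BinaryOp", "IfStatement"]

# Expansion 1: generate one class per node kind.
NODE_CLASSES = {name: type(name, (object,), {"kind": name}) for name in AST_NODES}

# Expansion 2: generate a visitor base with one visit_* stub per kind.
def make_visitor_base():
    stubs = {f"visit_{name}": (lambda self, node: None) for name in AST_NODES}
    return type("Visitor", (object,), stubs)

Visitor = make_visitor_base()

node = NODE_CLASSES["BinaryOp"]()
assert node.kind == "BinaryOp"
assert hasattr(Visitor(), "visit_IfStatement")
```

Adding a new node kind means touching exactly one line — the list — which is the redundancy-killing property the article is after.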

R vs Python: head to head data analysis

“There have been dozens of articles written comparing Python and R from a subjective standpoint. We’ll add our own views at some point, but this article aims to look at the languages more objectively. We’ll analyze a dataset side by side in Python and R, and show what code is needed in both languages to achieve the same result. This will let us understand the strengths and weaknesses of each language without the conjecture. At Dataquest, we teach both languages, and think both have a place in a data science toolkit.

We’ll be analyzing a dataset of NBA players and their performance in the 2013-2014 season. You can download the file here. For each step in the analysis, we’ll show the Python and R code, along with some explanation and discussion of the different approaches. Without further ado, let’s get this head to head matchup started!…”

R vs Python for Data Analysis — An Objective Comparison
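For a taste of what the Python side of such a side-by-side looks like using only the standard library, here is a tiny aggregation in the same spirit. The rows and column names are made up for illustration; the article works through a real NBA dataset:

```python
# Load a small CSV sample and compute summary statistics, the kind of
# step-by-step task the article performs in both Python and R.
# The players and column names below are invented for illustration.
import csv
import io
import statistics

SAMPLE = """player,pts,ast
Player A,27.1,6.3
Player B,25.9,7.8
Player C,15.4,3.1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
points = [float(r["pts"]) for r in rows]

mean_pts = statistics.mean(points)
top_scorer = max(rows, key=lambda r: float(r["pts"]))["player"]

assert top_scorer == "Player A"
assert round(mean_pts, 2) == 22.8
```

In practice the article leans on pandas for this kind of work, much as the R side leans on data frames; the stdlib version above just keeps the sketch dependency-free.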

Secondary indexing with Redis

“While Redis is not exactly a key-value store, since values can be complex data structures, it has an external key-value shell: at the API level, data is addressed by key name. It is fair to say that, natively, Redis only offers primary key access. However, since Redis is a data structures server, certain data structures can be used for indexing, in order to create secondary indexes of different kinds, including composite (multi-column) indexes.

This document explains how it is possible to create indexes in Redis using the following data structures:

Sorted sets to create secondary indexes by ID or other numerical fields.
Sorted sets with lexicographical ranges for creating more advanced secondary indexes and composite indexes.
Sets for creating random indexes.
Lists for creating simple iterable indexes.

Implementing and maintaining indexes with Redis is an advanced topic, so most users who need to perform complex queries on their data should consider whether they are better served by a relational store. However, often, especially in caching scenarios, there is an explicit need to store indexed data in Redis in order to speed up common queries that require indexes…”
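The first bullet above — a sorted set as a numeric secondary index — can be sketched with the standard library. Entries are (score, member) pairs kept in sorted order, so a range query walks a contiguous slice, which is what ZADD plus ZRANGEBYSCORE give you in Redis. The field names are illustrative:

```python
# A sorted list of (score, member) pairs standing in for a Redis sorted
# set used as a secondary index: index users by age, then answer range
# queries over ages, mirroring ZADD / ZRANGEBYSCORE.
import bisect

index = []  # kept sorted by (age, user_id)

def zadd(score, member):
    bisect.insort(index, (score, member))

def zrangebyscore(lo, hi):
    # Find the contiguous slice whose scores fall in [lo, hi].
    left = bisect.bisect_left(index, (lo, ""))
    right = bisect.bisect_right(index, (hi, "\uffff"))
    return [member for _, member in index[left:right]]

zadd(35, "user:1")
zadd(28, "user:2")
zadd(41, "user:3")

assert zrangebyscore(30, 45) == ["user:1", "user:3"]
```

The document’s other patterns (lexicographic composite indexes, set-based random indexes) build on the same idea: pick a Redis structure whose native ordering or membership operation matches the query you need.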