600k concurrent HTTP connections, with Clojure & http-kit

“http-kit is built for performance. Tens of thousands of requests per second on a moderate PC…”



Machine Learning Cheat Sheet (for scikit-learn)

“As you hopefully have heard, we at scikit-learn are doing a user survey (which is still open by the way).
One of the requests there was to provide some sort of flow chart on how to do machine learning. As this is clearly impossible, I went to work straight away…”


MongoDB: How to limit results and how to page through results

“In this post we are going to take a look at how to limit results in MongoDB as well as how to page through results. MongoDB uses limit to cap the number of results returned and skip to skip a number of records from the result set. Using limit in conjunction with skip enables you to do paging in MongoDB…”
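
The skip/limit arithmetic behind paging can be sketched in a few lines of Python. This is only an illustration on an in-memory list, not code from the linked post: `page_window` and `paginate` are hypothetical helper names, and 1-based page numbers are an assumption.

```python
from itertools import islice


def page_window(page, page_size):
    """Return the (skip, limit) pair for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return (page - 1) * page_size, page_size


def paginate(records, page, page_size):
    """Emulate skip + limit on any iterable of records."""
    skip, limit = page_window(page, page_size)
    return list(islice(records, skip, skip + limit))


# Stand-in data; a real application would query a collection instead.
records = [{"_id": i} for i in range(1, 26)]
page_3 = paginate(records, page=3, page_size=10)

# With PyMongo the same window would be applied on a cursor, e.g.
#   collection.find().skip(skip).limit(limit)
# where `collection` is a hypothetical collection handle.
```

Note that skip still walks past the skipped records, so very deep pages tend to get slower as the offset grows.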


Machine Learning, Big Data, Deep Learning, Data Mining… FAQ

“What’s the difference between machine learning, deep learning, big data, statistics, decision & risk analysis, probability, fuzzy logic, and all the rest?…”

“In mathematics there are many “logic” theories that have more than one truth value, and not just one universal “logic.” What’s up with that?…”

“What’s the difference between probability and decision analysis?…”




Py2.6+ and Py3.0+ backport of Python 3.3’s LRU Cache (Python recipe)

“Full-featured O(1) LRU cache backported from Python3.3. The full Py3.3 API is supported (thread safety, maxsize, keyword args, type checking, __wrapped__, and cache_info). Includes Py3.3 optimizations for better memory utilization, fewer dependencies, and fewer dict lookups…”
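
For a feel of the API the recipe backports, here is a minimal sketch using the standard library's `functools.lru_cache` (Python 3.3 and later) rather than the recipe itself. The `fib` function is only an illustrative workload.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def fib(n):
    """Naive recursive Fibonacci; the cache makes it linear in n."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = fib(20)

# cache_info() reports hits, misses, maxsize and the current size.
info = fib.cache_info()

# __wrapped__ exposes the undecorated function, e.g. for introspection.
original = fib.__wrapped__
```

Passing `typed=True` to the decorator would cache arguments of different types (such as `3` and `3.0`) separately.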



Optimising NginX, Node.JS and networking for heavy workloads

“Used in conjunction, NginX and Node.JS are the perfect partnership for high-throughput web applications. They’re both built using event-driven design principles and are able to scale to levels far beyond the classic C10K limitations afflicting more archaic web servers such as Apache. Out-of-the-box configuration will get you pretty far, but when you need to start serving upwards of thousands of requests per second on commodity hardware, there’s some extra tweaking you must perform to squeeze every ounce of performance out of your servers.

This article assumes you’re using NginX’s HttpProxyModule to proxy your traffic to one or more upstream node.js servers. We’ll cover tuning sysctl settings in Ubuntu 10.04 and above, as well as node.js application and NginX tuning. You may be able to achieve similar results if you’re using a Debian Linux distribution, but YMMV if you’re using something else…”


Fn.py: enjoy FP in Python

“Although Python is not a purely functional programming language, it is a multi-paradigm language that gives you enough freedom to borrow from the functional programming approach. There are theoretical and practical advantages to the functional style:

  • Formal provability
  • Modularity
  • Composability
  • Ease of debugging and testing

The Fn.py library provides you with the missing “batteries” to get the most out of the functional approach, even in a mostly imperative program…”
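
To give a rough idea of what composability looks like in practice, here is a small sketch that uses only the standard library (`functools` and `operator`), not Fn.py's own API. The `compose` helper is a hypothetical name introduced for illustration.

```python
from functools import partial, reduce
from operator import add, mul


def compose(*funcs):
    """Compose functions right to left: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs, lambda x: x)


# Small building blocks made by partial application.
increment = partial(add, 1)
double = partial(mul, 2)

# Combine the blocks into a new function without writing a loop.
increment_then_double = compose(double, increment)
value = increment_then_double(3)

# A pipeline of pure steps: keep even numbers, square them, sum the result.
sum_of_even_squares = sum(
    map(lambda n: n * n, filter(lambda n: n % 2 == 0, range(10)))
)
```

Each piece here is a pure function with no shared state, which is what makes the parts easy to test in isolation and recombine.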