A Quick Guide to Redis 3.2’s Geo Support

The Geo API has been around for a while, appearing in the Redis unstable branch about ten months ago and that was, in turn, based on work from 2014. There’s a bit of history in that development process, which being practical folk we’ll skip past and go straight to the stuff that makes your development day better.

At its simplest, the GEO API for Redis reduces longitude/latitude down into a geohash. Geohash is a technique developed in 2008 to represent locations with short string codes. The Geohash of a particular location, say Big Ben in London, would come out as “gcpuvpmm3f0” which is easier to pass around than “latitude 51.500 longitude -0.12455”. The longer the string, the more precise the geohash code.

That encoding into a string is good for humans and URLs but it isn’t particularly space efficient. The good news is geohashes can be encoded as binary and using 52 bits, a geohash gets down to 0.6 meter accuracy which is good enough for most uses. A 52-bit value which just happens to be able to be a small-enough integer to live in a Redis floating-point double safely and that’s what the Geo API works with behind the scenes.


Top 20 Python Machine Learning Open Source Projects

Continuing analysis from last year: Top 20 Python Machine Learning Open Source Projects, this year KDnuggets bring you latest top 20 Python Machine Learning Open Source Projects on Github. Strangely, some of the most active projects of last year have become stagnant and also some lost their position from top 20 (considering contributions and commits), whereas new 13 projects have entered into top 20.


Statistics for Hackers

Statistics has the reputation of being difficult to understand, but using some simple Python skills it can be made much more intuitive. This talk will cover several sampling-based approaches to solving statistical problems, and show you that if you can write a for-loop, you can do statistics.

Machine Learning Crash Course

Introduction, Regression/Classification, Cost Functions, and Gradient Descent

Machine learning (ML) has received a lot of attention recently, and not without good reason. It has already revolutionized fields from image recognition to healthcare to transportation. Yet a typical explanation for machine learning sounds like this:

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.”

Not very clear, is it? This post, the first in a series of ML tutorials, aims to make machine learning accessible to anyone willing to learn. We’ve designed it to give you a solid understanding of how ML algorithms work as well as provide you the knowledge to harness it in your projects.


Perceptrons, Logistic Regression, and SVMs

In this post we’ll talk about one of the most fundamental machine learning algorithms: the perceptron algorithm. This algorithm forms the basis for many modern day ML algorithms, most notably neural networks. In addition, we’ll discuss the perceptron algorithm’s cousin, logistic regression. And then we’ll conclude with an introduction to SVMs, or support vector machines, which are perhaps one of the most flexible algorithms used today.


A Guide to Deep Learning by YN2

Deep learning is a fast-changing field at the intersection of computer science and mathematics. It is a relatively new branch of a wider field called machine learning. The goal of machine learning is to teach computers to perform various tasks based on the given data. This guide is for those who know some math, know some programming language and now want to dive deep into deep learning.
This is not a guide to:
• general machine learning
• big data processing
• data science
deep reinforcement learning

Naive Bayes Classification explained with Python

Machine Learning is a vast area of Computer Science that is concerned with designing algorithms which form good models of the world around us (the data coming from the world around us).

Within Machine Learning many tasks are – or can be reformulated as – classification tasks.

In classification tasks we are trying to produce a model which can give the correlation between the input data $X$ and the class $C$ each input belongs to. This model is formed with the feature-values of the input-data. For example, the dataset contains datapoints belonging to the classes ApplesPears and Oranges and based on the features of the datapoints (weight, color, size etc) we are trying to predict the class.

We need some amount of training data to train the Classifier, i.e. form a correct model of the data. We can then use the trained Classifier to classify new data. If the training dataset chosen correctly, the Classifier should predict the class probabilities of the new data with a similar accuracy (as it does for the training examples).


After construction, such a Classifier could for example tell us that document containing the words “Bose-Einstein condensate” should be categorized as a Physics article, while documents containing the words “Arbitrage” and “Hedging” should be categorized as a Finance article.

Another Classifier (whose dataset is illustrated below) could tell whether or not a person makes more than 50K, based on features such as Age, Education, Marital Status, Occupation etc.


As we can see, there is a input dataset $ X $ which corresponds to a ‘output’ $Y$. The dataset $X$ contains $m$ input examples$x^{(1)}, x^{(2)}, .. , x^{(m)}$, and each input example has $n$ feature values $x_1, x_2, ..., x_n$ (here $n\ =\ 7$).

There are three popular Classifiers within Machine Learning, which use three different mathematical approaches to classify data;

  • Naive Bayes, which uses a statistical (Bayesian) approach,
  • Logistic Regression, which uses a functional approach and
  • Support Vector Machines, which uses a geometrical approach.

Previously we have already looked at Logistic Regression. Here we will see the theory behind the Naive Bayes Classifier together with its implementation in Python.

For the rest of the post, click here.

(source: http://www.datasciencecentral.com/profiles/blogs/naive-bayes-classification-explained-with-python-code)

How to Steal an AI (reverse engineer machine-learning)

In a paper they released earlier this month titled “Stealing Machine Learning Models via Prediction APIs,” a team of computer scientists at Cornell Tech, the Swiss institute EPFL in Lausanne, and the University of North Carolina detail how they were able to reverse engineer machine learning-trained AIs based only on sending them queries and analyzing the responses. By training their own AI with the target AI’s output, they found they could produce software that was able to predict with near-100% accuracy the responses of the AI they’d cloned, sometimes after a few thousand or even just hundreds of queries.

“You’re taking this black box and through this very narrow interface, you can reconstruct its internals, reverse engineering the box,” says Ari Juels, a Cornell Tech professor who worked on the project. “In some cases, you can actually do a perfect reconstruction.”


An Introduction to Stock Market Data Analysis with Python


Advanced mathematics and statistics has been present in finance for some time. Prior to the 1980s, banking and finance were well known for being “boring”; investment banking was distinct from commercial banking and the primary role of the industry was handling “simple” (at least in comparison to today) financial instruments, such as loans. Deregulation under the Reagan administration, coupled with an influx of mathematical talent, transformed the industry from the “boring” business of banking to what it is today, and since then, finance has joined the other sciences as a motivation for mathematical research and advancement. For example one of the biggest recent achievements of mathematics was the derivation of the Black-Scholes formula, which facilitated the pricing of stock options (a contract giving the holder the right to purchase or sell a stock at a particular price to the issuer of the option). That said, bad statistical models, including the Black-Scholes formula, hold part of the blame for the 2008 financial crisis.

In recent years, computer science has joined advanced mathematics in revolutionizing finance and trading, the practice of buying and selling of financial assets for the purpose of making a profit. In recent years, trading has become dominated by computers; algorithms are responsible for making rapid split-second trading decisions faster than humans could make (so rapidly, the speed at which light travels is a limitation when designing systems). Additionally, machine learning and data mining techniques are growing in popularity in the financial sector, and likely will continue to do so. In fact, a large part of algorithmic trading is high-frequency trading (HFT). While algorithms may outperform humans, the technology is still new and playing in a famously turbulent, high-stakes arena. HFT was responsible for phenomena such as the 2010 flash crash and a 2013 flash crash prompted by a hacked Associated Press tweet about an attack on the White House.

This lecture, however, will not be about how to crash the stock market with bad mathematical models or trading algorithms. Instead, I intend to provide you with basic tools for handling and analyzing stock market data with Python. I will also discuss moving averages, how to construct trading strategies using moving averages, how to formulate exit strategies upon entering a position, and how to evaluate a strategy with backtesting.

DISCLAIMER: THIS IS NOT FINANCIAL ADVICE!!! Furthermore, I have ZERO experience as a trader (a lot of this knowledge comes from a one-semester course on stock trading I took at Salt Lake Community College)! This is purely introductory knowledge, not enough to make a living trading stocks. People can and do lose money trading stocks, and you do so at your own risk!


Trading Strategy

Call an open position a trade that will be terminated in the future when a condition is met. A long position is one in which a profit is made if the financial instrument traded increases in value, and a short position is on in which a profit is made if the financial asset being traded decreases in value. When trading stocks directly, all long positions are bullish and all short position are bearish. That said, a bullish attitude need not be accompanied by a long position, and a bearish attitude need not be accompanied by a short position (this is particularly true when trading stock options).

Here is an example. Let’s say you buy a stock with the expectation that the stock will increase in value, with a plan to sell the stock at a higher price. This is a long position: you are holding a financial asset for which you will profit if the asset increases in value. Your potential profit is unlimited, and your potential losses are limited by the price of the stock since stock prices never go below zero. On the other hand, if you expect a stock to decrease in value, you may borrow the stock from a brokerage firm and sell it, with the expectation of buying the stock back later at a lower price, thus earning you a profit. This is called shorting a stock, and is a short position, since you will earn a profit if the stock drops in value. The potential profit from shorting a stock is limited by the price of the stock (the best you can do is have the stock become worth nothing; you buy it back for free), while the losses are unlimited, since you could potentially spend an arbitrarily large amount of money to buy the stock back. Thus, a broker will expect an investor to be in a very good financial position before allowing the investor to short a stock.

Any trader must have a set of rules that determine how much of her money she is willing to bet on any single trade. For example, a trader may decide that under no circumstances will she risk more than 10% of her portfolio on a trade. Additionally, in any trade, a trader must have an exit strategy, a set of conditions determining when she will exit the position, for either profit or loss. A trader may set a target, which is the minimum profit that will induce the trader to leave the position. Likewise, a trader must have a maximum loss she is willing to tolerate; if potential losses go beyond this amount, the trader will exit the position in order to prevent any further loss (this is usually done by setting a stop-loss order, an order that is triggered to prevent further losses)…