A visual introduction to machine learning

In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions.


Model Tuning and
the Bias-Variance Tradeoff

The goal of modeling is to approximate real-life situations by identifying and encoding patterns in data. Models make mistakes if those patterns are overly simple or overly complex.



Face recognition with OpenCV, Python, and deep learning

In today’s blog post you are going to learn how to perform face recognition in both images and video streams using:

  • OpenCV
  • Python
  • Deep learning

As we’ll see, the deep learning-based facial embeddings we’ll be using here today are both (1) highly accurate and (2) capable of being executed in real-time.

To learn more about face recognition with OpenCV, Python, and deep learning, just keep reading!


AWS DeepLens: first impressions + tutorial

Getting up-and-running with Amazon’s new machine learning-enabled camera

tl;dr It’s awesome. Get one.

At the end of 2017, Amazon announced DeepLens, a camera with specialized hardware that allows developers to deploy machine learning and computer vision models to “the edge,” and integrate the data it collects with other AWS services.

On a whim, I put in a one-click order on Prime (devices started shipping just last week); it arrived a couple days later and just hours from unboxing — with one or two minor hiccups — I got it up-and-running and integrated with other AWS services. I’ve been pleasantly surprised, to say the least.


Machine Learning on AWS

Why machine learning on AWS?

Machine Learning for everyone

Whether you are a data scientist, ML researcher, or developer, AWS offers machine learning services and tools tailored to meet your needs and level of expertise.

API-driven ML services

Developers can easily add intelligence to any application with a diverse selection of pre-trained services that provide computer vision, speech, language analysis, and chatbot functionality.

Broad framework support

AWS supports all the major machine learning frameworks, including TensorFlow, Caffe2, and Apache MXNet, so that you can bring or develop any model you choose.

Breadth of compute options

AWS offers a broad array of compute options for training and inference with powerful GPU-based instances, compute and memory optimized instances, and even FPGAs.

Deep platform integrations

ML services are deeply integrated with the rest of the platform including the data lake and database tools you need to run ML workloads. A data lake on AWS gives you access to the most complete platform for big data.

Comprehensive analytics

Choose from a comprehensive set of services for data analysis including data warehousing, business intelligence, batch processing, stream processing, data workflow orchestration.


Control access to resources with granular permission policies. Storage and database services offer strong encryption to keep your data secure. Flexible key management options allow you to choose whether you or AWS will manage the encryption keys.


Consume services as you need them and only for the period you use them. AWS pricing has no upfront fees, termination penalties, or long term contracts. The AWS Free Tier helps you get started with AWS.

A Concrete Introduction to Probability (using Python)

This notebook covers the basics of probability theory, with Python 3 implementations. (You should have some background in probability and Python.)

In 1814, Pierre-Simon Laplace wrote:

Probability … is thus simply a fraction whose numerator is the number of favorable cases and whose denominator is the number of all the cases possible … when nothing leads us to expect that any one of these cases should occur more than any other.


Pierre-Simon Laplace

Laplace really nailed it, way back then! If you want to untangle a probability problem, all you have to do is be methodical about defining exactly what the cases are, and then careful in counting the number of favorable and total cases. We’ll start being methodical by defining some vocabulary:

  • Experiment: An occurrence with an uncertain outcome that we can observe.
    For example, rolling a die.
  • Outcome: The result of an experiment; one particular state of the world. What Laplace calls a “case.”
    For example: 4.
  • Sample Space: The set of all possible outcomes for the experiment.
    For example, {1, 2, 3, 4, 5, 6}.
  • Event: A subset of possible outcomes that together have some property we are interested in.
    For example, the event “even die roll” is the set of outcomes {2, 4, 6}.
  • Probability: As Laplace said, the probability of an event with respect to a sample space is the number of favorable cases (outcomes from the sample space that are in the event) divided by the total number of cases in the sample space. (This assumes that all outcomes in the sample space are equally likely.) Since it is a ratio, probability will always be a number between 0 (representing an impossible event) and 1 (representing a certain event).
    For example, the probability of an even die roll is 3/6 = 1/2.

This notebook will develop all these concepts; I also have a second part that covers paradoxes in Probability Theory.


A Neural Network in 11 lines of Python

Summary: I learn best with toy code that I can play with. This tutorial teaches backpropagation via a very simple toy example, a short python implementation.

Edit: Some folks have asked about a followup article, and I’m planning to write one. I’ll tweet it out when it’s complete at @iamtrask. Feel free to follow if you’d be interested in reading it and thanks for all the feedback!


Text Data Preprocessing: A Walkthrough in Python

In a pair of previous posts, we first discussed a framework for approaching textual data science tasks, and followed that up with a discussion on a general approach to preprocessing text data. This post will serve as a practical walkthrough of a text data preprocessing task using some common Python tools.

Data preprocessing
Preprocessing, in the context of the textual data science framework.Our goal is to go from what we will describe as a chunk of text (not to be confused with text chunking), a lengthy, unprocessed single string, and end up with a list (or several lists) of cleaned tokens that would be useful for further text mining and/or natural language processing tasks.