A Mathematical Explanation of Naive Bayes in 5 Minutes

Naive Bayes. What may seem like a very confusing algorithm is actually one of the simplest algorithms once understood. Part of what makes it so simple to understand and implement is the assumptions it inherently makes. Despite those strong assumptions, it is by no means a poor algorithm; in fact, Naive Bayes is widely used in the data science world and has plenty of real-life applications.

In this article, we’ll look at what Naive Bayes is, how it works with an example to make it easy to understand, the different types of Naive Bayes, the pros and cons, and some real-life applications of it…
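
As a quick, hands-on taste before the math: below is a minimal sketch of Naive Bayes for text classification with scikit-learn. The tiny spam/ham dataset is invented purely for illustration and is not from the article.

```python
# Minimal Naive Bayes sketch with scikit-learn; the toy dataset is made up.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["free money now", "meeting at noon", "win a free prize", "project status update"]
labels = ["spam", "ham", "spam", "ham"]

# Turn each document into word-count features (the "bag of words" view).
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Naive Bayes learns P(word | class) and P(class), then applies Bayes' theorem,
# assuming the words are conditionally independent given the class.
model = MultinomialNB()
model.fit(X, labels)

print(model.predict(vectorizer.transform(["free prize meeting"])))
```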

https://towardsdatascience.com/a-mathematical-explanation-of-naive-bayes-in-5-minutes-44adebcdb5f8

Using artificial intelligence to detect product defects with AWS Step Functions

Factories that produce a high volume of inventory must ensure that defective products are not shipped. This is often accomplished with human workers on the assembly line or through computer vision.

You can build an application that uses a custom image classification model to detect and report back any defects in a product, then takes appropriate action. This method provides a powerful, scalable, and simple solution for quality control. It uses Amazon S3, Amazon SQS, AWS Lambda, AWS Step Functions, and Amazon SageMaker.

To simulate a production scenario, the model is trained using an example dataset containing images of an open-source printed circuit board, both with and without defects. An accompanying AWS Serverless Application Repository application deploys the Step Functions workflow for handling image classification and notifications.
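
As a rough sketch of what the classification step inside such a workflow might look like, a Lambda function can invoke a SageMaker endpoint and return a flag for Step Functions to branch on. The endpoint name, event shape, and response format below are assumptions for illustration, not taken from the post.

```python
# Hypothetical Lambda handler for the classification step of the workflow.
# Endpoint name, event fields, and model output format are assumed for illustration.
import json
import boto3

sagemaker = boto3.client("sagemaker-runtime")
s3 = boto3.client("s3")

def handler(event, context):
    bucket = event["bucket"]
    key = event["key"]

    # Fetch the product image that was uploaded to S3.
    image = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    # Ask the custom image classification model whether the part is defective.
    response = sagemaker.invoke_endpoint(
        EndpointName="defect-classifier",   # assumed endpoint name
        ContentType="application/x-image",
        Body=image,
    )
    scores = json.loads(response["Body"].read())  # assumed: list of class probabilities

    # Step Functions can branch on this flag to notify or discard.
    return {"key": key, "defective": scores[0] > 0.5}
```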

https://aws.amazon.com/blogs/compute/using-artificial-intelligence-to-detect-product-defects-with-aws-step-functions/

Over 150 of the Best Machine Learning, NLP, and Python Tutorials

While machine learning has a rich history dating back to 1959, the field is evolving at an unprecedented rate. In a recent article, I discussed why the broader artificial intelligence field is booming and likely will for some time to come. Those interested in learning ML may find it daunting to get started.

https://medium.com/machine-learning-in-practice/over-150-of-the-best-machine-learning-nlp-and-python-tutorials-ive-found-ffce2939bd78#hn

Deploy machine learning models in production

Key features

  • Multi framework: Cortex supports TensorFlow, PyTorch, scikit-learn, XGBoost, and more.
  • Autoscaling: Cortex automatically scales APIs to handle production workloads.
  • CPU / GPU support: Cortex can run inference on CPU or GPU infrastructure.
  • Spot instances: Cortex supports EC2 spot instances.
  • Rolling updates: Cortex updates deployed APIs without any downtime.
  • Log streaming: Cortex streams logs from deployed models to your CLI.
  • Prediction monitoring: Cortex monitors network metrics and tracks predictions.
  • Minimal configuration: Cortex deployments are defined in a single cortex.yaml file.
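
To give a sense of what "minimal configuration" means, here is a rough sketch of a cortex.yaml deployment spec. The exact field names vary between Cortex versions, so treat the keys below as illustrative assumptions rather than the authoritative format; the project's README has the current schema.

```yaml
# Illustrative deployment spec; field names may differ across Cortex versions.
- name: sentiment-classifier
  predictor:
    type: python
    path: predictor.py      # module implementing the prediction interface
  compute:
    cpu: 1
    gpu: 0
```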

https://github.com/cortexlabs/cortex

ID Card Digitization and Information Extraction using Deep Learning – A Review

In this article, we will discuss how any organisation can use deep learning to automate ID card information extraction, data entry and reviewing procedures to achieve greater efficiency and cut costs. We will review different deep learning approaches that have been used in the past for this problem, compare the results and look into the latest in the field. We will discuss graph neural networks and how they are being used for digitization.

While we will be looking at the specific use case of ID cards, anyone who deals with documents of any kind, such as invoices and receipts, and is interested in building a technical understanding of how deep learning and OCR can solve the problem will find the information useful.
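
For a sense of the OCR baseline that the deep learning approaches build on, here is a minimal sketch using the open-source Tesseract engine via pytesseract. The file name and the "FIELD: value" parsing rule are placeholders; real pipelines add detection, alignment, and learned field extraction (for example, graph neural networks) on top of the raw text.

```python
# Baseline OCR pass over an ID card image using Tesseract via pytesseract.
# "id_card.jpg" is a placeholder path.
from PIL import Image
import pytesseract

image = Image.open("id_card.jpg")
raw_text = pytesseract.image_to_string(image)

# Naive field extraction: keep lines that resemble "FIELD: value".
fields = {}
for line in raw_text.splitlines():
    if ":" in line:
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()

print(fields)
```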

https://nanonets.com/blog/id-card-digitization-deep-learning/

A Guide to S3 Batch on AWS

AWS just announced the release of S3 Batch Operations. This is a hotly anticipated release that was originally announced at re:Invent 2018. With S3 Batch, you can run tasks on existing S3 objects. This will make it much easier to run previously difficult tasks like retagging S3 objects, copying objects to another bucket, or processing large numbers of objects in bulk.

In this post, we’ll do a deep dive into S3 Batch. You will learn when, why, and how to use S3 Batch. First, we’ll do an overview of the key elements involved in an S3 Batch job. Then, we’ll walk through an example by doing sentiment analysis on a group of existing objects with AWS Lambda and Amazon Comprehend.
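
To give a flavour of that example, here is a minimal sketch of a Lambda function an S3 Batch job could invoke for each object: it fetches the object's text and scores it with Amazon Comprehend. The result shape follows the S3 Batch invocation schema as I understand it, and the bucket-name parsing and text truncation are simplifications.

```python
# Sketch of a Lambda handler invoked by S3 Batch Operations once per object.
import boto3

s3 = boto3.client("s3")
comprehend = boto3.client("comprehend")

def handler(event, context):
    task = event["tasks"][0]
    bucket = task["s3BucketArn"].split(":")[-1]   # simplified ARN parsing
    key = task["s3Key"]

    text = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    # Comprehend limits input size, so truncate the document for this sketch.
    sentiment = comprehend.detect_sentiment(Text=text[:4500], LanguageCode="en")

    # S3 Batch expects one result per task in this shape.
    return {
        "invocationSchemaVersion": event["invocationSchemaVersion"],
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": [{
            "taskId": task["taskId"],
            "resultCode": "Succeeded",
            "resultString": sentiment["Sentiment"],
        }],
    }
```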

https://www.alexdebrie.com/posts/s3-batch/

Python for NLP: Introduction to the TextBlob Library

This is the seventh article in my series of articles on Python for NLP. In my previous article, I explained how to perform topic modeling using Latent Dirichlet Allocation and Non-Negative Matrix Factorization. We used the Scikit-Learn library to perform topic modeling.

In this article, we will explore TextBlob, which is another extremely powerful NLP library for Python. TextBlob is built upon NLTK and provides an easy-to-use interface to the NLTK library. We will see how TextBlob can be used to perform a variety of NLP tasks, ranging from part-of-speech tagging and sentiment analysis to language translation and text classification.
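
As a quick taste of that interface, here is a minimal sketch on a toy sentence (the sentence is mine, not from the article; some features such as translation require a network call and may be deprecated in newer TextBlob releases):

```python
# Quick tour of TextBlob on a toy sentence.
from textblob import TextBlob

blob = TextBlob("TextBlob makes natural language processing surprisingly pleasant.")

print(blob.tags)           # part-of-speech tags, e.g. [('TextBlob', 'NNP'), ...]
print(blob.noun_phrases)   # extracted noun phrases
print(blob.sentiment)      # Sentiment(polarity=..., subjectivity=...)
```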

https://stackabuse.com/python-for-nlp-introduction-to-the-textblob-library/

The Illustrated Word2vec

I find the concept of embeddings to be one of the most fascinating ideas in machine learning. If you’ve ever used Siri, Google Assistant, Alexa, Google Translate, or even a smartphone keyboard with next-word prediction, then chances are you’ve benefitted from this idea that has become central to Natural Language Processing models. There has been quite a lot of development over the last couple of decades in using embeddings for neural models (recent developments include contextualized word embeddings, leading to cutting-edge models like BERT and GPT-2).

Word2vec is a method to efficiently create word embeddings and has been around since 2013. But in addition to its utility as a word-embedding method, some of its concepts have been shown to be effective in creating recommendation engines and making sense of sequential data even in commercial, non-language tasks. Companies like Airbnb, Alibaba, Spotify, and Anghami have all benefitted from carving out this brilliant piece of machinery from the world of NLP and using it in production to empower a new breed of recommendation engines.

In this post, we’ll go over the concept of embedding, and the mechanics of generating embeddings with word2vec. But let’s start with an example to get familiar with using vectors to represent things. Did you know that a list of five numbers (a vector) can represent so much about your personality?
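
If you want to poke at word vectors yourself while reading, here is a minimal sketch using gensim. The toy corpus is made up, and the embedding-size parameter is called vector_size in gensim 4.x (it was size in older releases).

```python
# Train a tiny word2vec model on a made-up corpus with gensim.
from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

# vector_size is the embedding dimension (called `size` in gensim < 4.0).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# Each word is now a dense vector; words used in similar contexts end up close together.
print(model.wv["cat"][:5])            # first five dimensions of the "cat" embedding
print(model.wv.most_similar("cat"))   # nearest neighbours in the toy corpus
```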

https://jalammar.github.io/illustrated-word2vec/