Today’s blog post is a continuation of our recent series on Optical Character Recognition (OCR) and computer vision.
In a previous blog post, we learned how to install the Tesseract binary and use it for OCR. We then learned how to clean up images using basic image processing techniques to improve the output of Tesseract OCR.
However, as I’ve mentioned multiple times in these previous posts, Tesseract should not be considered a general-purpose, off-the-shelf Optical Character Recognition solution capable of high accuracy.
In some cases, it will work great — and in others, it will fail miserably.
A great example of such a use case is credit card recognition: given an input image, we wish to:
- Detect the location of the credit card in the image.
- Localize the four groupings of four digits, corresponding to the sixteen digits on the credit card.
- Apply OCR to recognize the sixteen digits on the credit card.
- Recognize the type of credit card (i.e., Visa, MasterCard, American Express, etc.).
In these cases, the Tesseract library is unable to correctly identify the digits (likely because Tesseract was not trained on the fonts used on credit cards). Therefore, we need to devise our own custom solution to OCR credit cards.
In today’s blog post, I’ll demonstrate how we can use template matching as a form of OCR, helping us create a solution that automatically recognizes credit cards and extracts the associated credit card digits from images.
To learn more about using template matching for OCR with OpenCV and Python, just keep reading.