Take your Traditional OCR up a notch

Caroline Lea
September 13, 2019
05:03 pm

By: Greg Moler | Director of Imaging Solutions

While the baseline OCR landscape has not changed much, AWS aims to correct that. Traditional OCR engines are quite limited in the detail they can provide. Detecting the characters is only half the battle; getting meaningful data out of them is the real challenge. Traditional OCR follows the ‘what you see is what you get’ mantra: once you run your document through, a blob of seemingly unnavigable text is all you are left with. What if we could enhance this output with other meaningful data elements that improve extraction confidence? What if we could make the traditional OCR block of text easier to navigate?

Enter Textract from AWS, a public web service that aims to improve your traditional OCR experience in an easily scalable, integrable, and low-cost package. Textract is built upon an OCR extraction engine that is optimized by AWS’ advanced machine learning. It has been taught how to extract data from thousands of different types of forms so you don’t have to worry about it. The ‘template’ days are over. It also provides a number of useful advanced features that traditional engines typically do not offer: confidence ratings, word block identification, word and line object identification, table extraction, and key-value output. Let’s take a quick look at each of these:

Confidence Ratings: Each result comes with a confidence rating, so you can intelligently choose to accept it or require human intervention based on your own thresholds. Building this into your workflow or product can greatly improve data accuracy.
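The threshold idea above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it assumes you already have the list of Blocks from a Textract response (for example, from the boto3 detect_document_text call), where each block carries a BlockType, a Text, and a Confidence score from 0 to 100. The function name split_by_confidence, the 90.0 cutoff, and the sample data are made up for the example.

```python
def split_by_confidence(blocks, threshold=90.0):
    """Sort WORD blocks into auto-accepted text and text flagged for review.

    `blocks` is assumed to follow the shape of the "Blocks" list in a
    Textract response. The threshold is an arbitrary example value.
    """
    accepted, needs_review = [], []
    for block in blocks:
        if block.get("BlockType") != "WORD":
            continue  # only look at individual words in this sketch
        bucket = accepted if block.get("Confidence", 0.0) >= threshold else needs_review
        bucket.append(block["Text"])
    return accepted, needs_review


# Hand-written sample data, not real Textract output.
sample_blocks = [
    {"BlockType": "WORD", "Text": "Invoice", "Confidence": 99.2},
    {"BlockType": "WORD", "Text": "1O45", "Confidence": 61.7},
    {"BlockType": "LINE", "Text": "Invoice 1O45", "Confidence": 80.4},
]

accepted, needs_review = split_by_confidence(sample_blocks)
print("accepted:", accepted)
print("needs review:", needs_review)
```

Anything that lands in the review list can be routed to a person, while the rest flows straight through.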

Word Blocks: Textract identifies blocks of words and lets you navigate through them, which helps you find things like address blocks or other known blocks of text in your documents. Working with grouped wording rather than sifting through a massive blob of OCR output helps you find the information you are looking for faster.

Word and Line Objects: Rather than a single block of text from a traditional OCR engine, you get code-navigable word and line objects for parsing your documents, which will greatly improve your efficiency and accuracy. Each object is paired with location data, so you can use the returned coordinates to pinpoint where on the page the text was extracted from. This becomes useful when you know your data is found in specific areas or ranges of a given document, letting you further improve accuracy and filter out false positives.
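As a rough sketch of the location filtering described above, the Python below keeps only the lines that fall inside a chosen area of the page. It assumes each LINE block carries a Geometry.BoundingBox with Left, Top, Width, and Height expressed as ratios of the page size, which is how Textract reports them. The function name lines_in_region, the region chosen, and the sample blocks are illustrative assumptions.

```python
def lines_in_region(blocks, left, top, right, bottom):
    """Return the text of LINE blocks lying fully inside a page region.

    All coordinates are ratios of the page size (0.0 to 1.0), measured
    from the top-left corner of the page.
    """
    hits = []
    for block in blocks:
        if block.get("BlockType") != "LINE":
            continue
        box = block["Geometry"]["BoundingBox"]
        inside = (
            box["Left"] >= left
            and box["Top"] >= top
            and box["Left"] + box["Width"] <= right
            and box["Top"] + box["Height"] <= bottom
        )
        if inside:
            hits.append(block["Text"])
    return hits


# Hand-written sample data, not real Textract output.
sample_blocks = [
    {"BlockType": "LINE", "Text": "Invoice #1045",
     "Geometry": {"BoundingBox": {"Left": 0.70, "Top": 0.05, "Width": 0.25, "Height": 0.03}}},
    {"BlockType": "LINE", "Text": "Thank you for your business",
     "Geometry": {"BoundingBox": {"Left": 0.10, "Top": 0.50, "Width": 0.40, "Height": 0.03}}},
]

# Only accept text found in the top-right quarter of the page.
header_lines = lines_in_region(sample_blocks, left=0.5, top=0.0, right=1.0, bottom=0.25)
print(header_lines)
```

Restricting the search to the area where a field is known to live is one simple way to drop false positives from elsewhere on the page.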

Table Extraction: Using AWS AI-backed extraction technology, table extraction will intelligently identify and extract tabular data to pipe into whatever your use case may need, allowing you to quickly navigate and run calculations on these table data elements.
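To give a feel for what working with extracted tables looks like, here is a hedged Python sketch that rebuilds rows and columns from table blocks. It assumes the Blocks list came from a Textract analysis requested with the TABLES feature, where a TABLE block points to its CELL blocks through CHILD relationships, each CELL has a 1-based RowIndex and ColumnIndex, and each CELL points to its WORD blocks the same way. The function name table_to_rows and the sample data are invented for the example.

```python
def child_ids(block):
    """Collect the Ids listed under a block's CHILD relationships."""
    ids = []
    for rel in block.get("Relationships") or []:
        if rel["Type"] == "CHILD":
            ids.extend(rel["Ids"])
    return ids


def table_to_rows(blocks):
    """Rebuild every TABLE block into a list of rows of cell text."""
    by_id = {b["Id"]: b for b in blocks}
    tables = []
    for table in (b for b in blocks if b.get("BlockType") == "TABLE"):
        cells = {}
        for cell_id in child_ids(table):
            cell = by_id[cell_id]
            if cell.get("BlockType") != "CELL":
                continue
            words = [by_id[w]["Text"] for w in child_ids(cell)
                     if by_id[w].get("BlockType") == "WORD"]
            cells[(cell["RowIndex"], cell["ColumnIndex"])] = " ".join(words)
        if not cells:
            tables.append([])
            continue
        n_rows = max(r for r, _ in cells)
        n_cols = max(c for _, c in cells)
        # Cells missing from the response are filled with an empty string.
        tables.append([[cells.get((r, c), "") for c in range(1, n_cols + 1)]
                       for r in range(1, n_rows + 1)])
    return tables


# Hand-written sample data, not real Textract output.
sample_blocks = [
    {"Id": "t1", "BlockType": "TABLE",
     "Relationships": [{"Type": "CHILD", "Ids": ["c1", "c2", "c3", "c4"]}]},
    {"Id": "c1", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["w1"]}]},
    {"Id": "c2", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 2,
     "Relationships": [{"Type": "CHILD", "Ids": ["w2"]}]},
    {"Id": "c3", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["w3", "w4"]}]},
    {"Id": "c4", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 2,
     "Relationships": [{"Type": "CHILD", "Ids": ["w5"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Item"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Qty"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Blue"},
    {"Id": "w4", "BlockType": "WORD", "Text": "Widget"},
    {"Id": "w5", "BlockType": "WORD", "Text": "3"},
]

tables = table_to_rows(sample_blocks)
for row in tables[0]:
    print(row)
```

Once the table is a plain list of rows, it can be written to CSV, loaded into a spreadsheet, or summed and filtered like any other tabular data.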

Key-value Output: AWS, again using AI-backed extraction technology, will intelligently identify key-value pairs found on the document without you having to write custom engines to parse the data programmatically. Optionally, send these key-value pairs to your favorite search and analytics engine, like Splunk or Elasticsearch (Elastic Stack), to make your document’s data easy to search, trigger on, and analyze.
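The key-value output can be flattened into an ordinary dictionary before it is shipped to a search engine. The Python below is a minimal sketch under stated assumptions: the Blocks list came from a Textract analysis requested with the FORMS feature, keys and values are both KEY_VALUE_SET blocks distinguished by EntityTypes, a key links to its value through a VALUE relationship, and both link to their WORD blocks through CHILD relationships. The function name key_value_pairs and the sample data are hypothetical.

```python
def related_ids(block, rel_type):
    """Collect the Ids a block references under one relationship type."""
    ids = []
    for rel in block.get("Relationships") or []:
        if rel["Type"] == rel_type:
            ids.extend(rel["Ids"])
    return ids


def key_value_pairs(blocks):
    """Flatten KEY_VALUE_SET blocks into a {key text: value text} dict."""
    by_id = {b["Id"]: b for b in blocks}

    def text_of(block):
        # Join the WORD children; other child types are skipped in this sketch.
        return " ".join(by_id[i]["Text"] for i in related_ids(block, "CHILD")
                        if by_id[i].get("BlockType") == "WORD")

    pairs = {}
    for block in blocks:
        if block.get("BlockType") != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        values = [text_of(by_id[v]) for v in related_ids(block, "VALUE")]
        pairs[text_of(block)] = " ".join(values)
    return pairs


# Hand-written sample data, not real Textract output.
sample_blocks = [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w2", "w3"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Name:"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Jane"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Doe"},
]

pairs = key_value_pairs(sample_blocks)
print(pairs)
```

A dictionary like this serializes directly to JSON, which is the form most indexing tools expect.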

Contact us today to find out how Textract from AWS can help streamline your OCR-based solutions to improve your data’s accuracy!