Published by Rasa Kundrotaitė on March 1, 2021. Last edited on August 18, 2021.


On February 8th, 2021 we released a new version of our platform that introduced a pre-trained optical character recognition model, otherwise known as text recognition. To accompany the update, we have put together this short guide on what text recognition is, its history and use cases, how it works, and how to make the best of it on the SentiSight.ai platform.

Optical character recognition in simple words

Optical character recognition (OCR) is a computer vision task that converts handwritten, typed or printed text into machine-encoded text.

Usually, words and characters are extracted from scanned documents, pictures or subtitle text. However, following a recent increase in demand for more convenient remote teaching and learning tools, innovations such as real-time handwriting recognition, which uses a pen or stylus and a tablet instead of more traditional input devices such as a keyboard or a mouse, have become increasingly popular.

How long has it been around us?

The first reading device was developed in 1914 by Emanuel Goldberg. Created for blind and visually impaired people, this machine was able to read characters and convert them into standard telegraph code.

At around the same time, Edmund Fournier d’Albe developed the Optophone, a handheld scanner that produced a specific sound for each character or letter when moved across the page. Technology that started out merely as an aid for visually impaired people gradually evolved into search systems that allowed users to find answers to their queries in archived databases, and it was later implemented in various sectors.

Wide variety of use cases

We encounter optical character recognition on a daily basis, often without noticing it. Nowadays OCR systems assist many industries in everyday tasks:

Banking: OCR is used to verify a customer’s identity by matching the handwriting on a cheque to a signature stored in a database. This speeds up the clearance process and lets the task be completed with little or no human involvement.

Healthcare industry: scanned reports, treatments and hospital records are stored digitally in a database accessible to healthcare workers. This allows for quicker diagnostics and improved logistics, since the required quantities of equipment and medicine can be derived from an analysis of the digital records.

Legal profession: by converting text into a digital form that can be processed by a computer, OCR reduces the need for excessive paperwork and makes the work more sustainable. It has drastically changed how the legal industry operates. Once all the affidavits, statements and wills are digitized and stored in a database, finding the right document is just one click away. Quick access to past documents significantly reduces the time spent on a case and lowers its cost.

Airport customs: OCR extracts information from visitors’ passports and accelerates border checks.

Autonomous vehicles: OCR lets them recognize traffic signs and the number plates of the cars in front of them. Additional information about automated license plate recognition can be found on the website of a separate Neurotechnology product that specializes in this task.

Optical character recognition can be implemented in the form of license plate recognition

Literature research: OCR has contributed to many articles and essays written across both digital and print media. Scanned books and research reports, converted to searchable PDFs, make it possible to find relevant information in the blink of an eye. With OCR built into translation tools, it is easier than ever to interpret the content of that foreign-language booklet you always wondered about!

How does OCR work?

Text recognition is complicated by the large number of different fonts in use, so scanned images must be pre-processed before an OCR algorithm is applied. Pre-processing usually involves de-skewing (properly aligning), despeckling and converting the image to black-and-white in order to separate the text from its background.
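To make two of these pre-processing steps more concrete, here is a minimal sketch in plain Python. It is only an illustration under simplifying assumptions: the image is a small grid of grayscale values, the fixed threshold of 128 is an arbitrary choice (real systems typically pick it adaptively), and de-skewing is left out. The function names are ours and are not part of any SentiSight.ai interface.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale values, 0 = black, 255 = white


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Convert to black-and-white: 1 for ink (darker than threshold), 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def despeckle(binary: Image) -> Image:
    """Clear ink pixels that have no ink among their 8 neighbours."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0:
                continue
            # Count ink pixels around (y, x), staying inside the image borders.
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


# Toy "scan": a small dark stroke on the left and one isolated dark speck on the right.
page = [
    [250, 250, 250, 250, 250],
    [250,  30,  40, 250, 250],
    [250,  35, 250, 250,  20],
    [250, 250, 250, 250, 250],
]
clean = despeckle(binarize(page))
for row in clean:
    print("".join("#" if px else "." for px in row))
```

The stroke survives because its pixels touch each other, while the isolated speck is treated as noise and removed.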

After the image has been normalized, text can be extracted either by matrix matching, which compares the pattern pixel by pixel and works best with already known fonts, or by feature extraction, which decomposes characters into features such as lines and loops and compares them with a dataset. To further increase OCR accuracy, post-processing measures should be taken. These include restricting the output to a specific lexicon, preserving the original layout of the text and using knowledge of the grammar of the language.
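The sketch below illustrates matrix matching and lexicon-based post-processing on toy data. It is not how SentiSight.ai or any particular OCR engine is implemented: the three 3x5 templates, the function names and the similarity cutoff are all assumptions made for the example. The lexicon step uses Python's standard difflib module.

```python
import difflib
from typing import Dict, List

Glyph = List[str]  # rows of '#' (ink) and '.' (background)

# Hypothetical known font with three 3x5 characters.
TEMPLATES: Dict[str, Glyph] = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def pixel_agreement(a: Glyph, b: Glyph) -> int:
    """Count positions where two equally sized glyphs have the same pixel."""
    return sum(pa == pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def match_glyph(glyph: Glyph) -> str:
    """Matrix matching: return the template character that agrees on the most pixels."""
    return max(TEMPLATES, key=lambda ch: pixel_agreement(glyph, TEMPLATES[ch]))


def snap_to_lexicon(word: str, lexicon: List[str], cutoff: float = 0.6) -> str:
    """Post-processing: replace a word with its closest lexicon entry, if one is close enough."""
    close = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    return close[0] if close else word


noisy_t = ["###", ".#.", ".#.", "...", ".#."]  # a 'T' with one pixel missing
lexicon = ["recognition", "reception", "region"]

print(match_glyph(noisy_t))                     # T
print(snap_to_lexicon("rec0gnition", lexicon))  # recognition
```

Matrix matching tolerates the missing pixel because the damaged glyph still agrees with the 'T' template on more pixels than with any other template. The lexicon step then repairs a typical OCR confusion, a zero read in place of the letter "o", by choosing the nearest known word.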

Optical character recognition on the SentiSight.ai platform

SentiSight.ai offers a helpful optical character recognition tool that converts your files to searchable text with ease.

To start this process, go to Pre-trained models and select Text recognition from the drop-down list. Here you have to choose a lexicon from among 75 languages to improve recognition accuracy in post-processing. If you wish to recognize a language that uses a non-Latin alphabet, you can select the option to include Latin characters too. Upload your images and voilà – the results are presented on your screen.

SentiSight.ai’s optical character recognition tool displays each segmented word and line with a bounding box. The blue label in the top left corner shows the predicted text, while the black one in the opposite top corner shows the prediction score as a percentage. Either one can be hidden by selecting the appropriate checkbox above the document. The results are also listed at the bottom of the page and can be downloaded in JSON format.
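Once the results are downloaded as JSON, they can be processed with a few lines of code, for example to keep only the words predicted with a high score. The sketch below is hypothetical: the exact structure of the exported file is not described in this guide, so the field names "label" and "score" and the sample entries are assumptions that you would need to adapt to the real file.

```python
import json
from typing import List

# Hypothetical export; the real SentiSight.ai JSON schema may differ.
sample_export = """
[
  {"label": "Invoice", "score": 98.2},
  {"label": "N0.",     "score": 41.7},
  {"label": "2021",    "score": 93.5}
]
"""


def confident_words(raw_json: str, min_score: float = 80.0) -> List[str]:
    """Return the predicted words whose score (in percent) is at least min_score."""
    entries = json.loads(raw_json)
    return [entry["label"] for entry in entries if entry["score"] >= min_score]


words = confident_words(sample_export)
print(words)  # ['Invoice', '2021']
```

Filtering by score is a simple way to separate predictions you can trust from those worth checking by hand.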

Our text recognition tool is accessible either via the SentiSight.ai web interface or via the REST API. More information on the latter can be found in the user guides.

Key benefits of OCR-based system

Digitizing images and extracting their text is advantageous in several respects. Since the file is converted into a machine-searchable document, editing a specific part of the text is a piece of cake. Moreover, files stored in a digital database instead of physical archives can be accessed quickly and backed up anytime at a low cost. Additionally, once extracted, digital text can easily be translated into other languages, which reduces the time and cost of translation.

You can start enjoying a more sustainable lifestyle and convenient access to your files with the new version of SentiSight.ai now!


Optical Character Recognition Using SentiSight.ai
