On February 8th, 2021, we released a new version of our platform that introduced a pre-trained optical character recognition model, also known as text recognition. To accompany the update, we created this short guide on what text recognition is, its history and usage scenarios, how it works, and how to make the best of it on the SentiSight.ai platform.
Optical character recognition (OCR) is a computer vision task that converts handwritten, typed or printed text into machine-encoded text.
Usually, words and characters are extracted from scanned documents, pictures or subtitle text. However, following a recent increase in demand for more convenient remote teaching and learning tools, innovations such as real-time handwriting recognition, which uses a pen or stylus and a tablet instead of more traditional input devices such as a keyboard or a mouse, have become increasingly popular.
The first reading device was developed in 1914 by Emanuel Goldberg. Created for blind and visually impaired people, this machine was able to read characters and convert them into standard telegraph code.
At the same time, Edmund Fournier d’Albe developed the Optophone, a handheld scanner that produced a specific sound for each character or letter when moved across the page. A technology that started out merely as assistance for visually impaired people gradually evolved into search systems that allowed users to find answers to their queries in archived databases, and it was later implemented in various sectors.
As members of the general public, we encounter optical character recognition on a daily basis, although we do not necessarily notice it. Nowadays, OCR systems assist many industries in everyday tasks.
Text recognition is complicated by the numerous different fonts in use, so scanned images must be pre-processed before an OCR algorithm is applied. Pre-processing usually involves de-skewing (properly aligning the image), despeckling (removing stray spots) and converting the image to black-and-white in order to separate the text from its background.
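Two of these pre-processing steps can be sketched in a few lines of Python. This is a minimal illustration, not the SentiSight.ai implementation: the fixed threshold of 128, the rule that a speck is an ink pixel with no ink neighbours, and the names `binarize` and `despeckle` are all assumptions made for the example. De-skewing is left out for brevity.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, 0 (black) to 255 (white)


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (ink) if darker than the threshold, else 0 (background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def despeckle(binary: Image) -> Image:
    """Remove ink pixels that have no ink among their 8 neighbours."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                out[y][x] = 0  # isolated dot: treat as noise
    return out


# A 5x5 grayscale patch: a dark vertical stroke plus one isolated dark speck.
page = [
    [250, 30, 240, 245, 250],
    [245, 25, 250, 240, 245],
    [250, 20, 245, 250, 40],
    [240, 35, 250, 245, 250],
    [250, 30, 240, 250, 245],
]
clean = despeckle(binarize(page))
for row in clean:
    print("".join("#" if px else "." for px in row))
```

In this toy patch the vertical stroke survives because every stroke pixel has an ink neighbour above or below it, while the lone dark pixel on the right edge is dropped. Real pipelines usually pick the threshold adaptively instead of hard-coding it.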
After the image is normalized, text can be extracted either by matrix matching, which compares the image with stored character patterns pixel by pixel and works best with already known fonts, or by feature extraction, which decomposes characters into features such as lines and loops and compares them with a dataset of known characters. To further increase OCR accuracy, post-processing measures should be taken. These involve restricting the output to a specific lexicon, retaining the initial textual representation and applying knowledge of the grammar of the language.
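To make matrix matching and lexicon-based post-processing more concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the algorithm used by any particular OCR engine: the 3x5 glyph templates are made up for the example, the helper names `match_glyph` and `correct_with_lexicon` are hypothetical, and the lexicon step uses the standard library's `difflib.get_close_matches` as a stand-in for a real dictionary lookup. Feature extraction is not shown.

```python
from difflib import get_close_matches

# Hypothetical 3x5 glyph templates ("#" = ink), standing in for a known font.
TEMPLATES = {
    "C": ["###", "#..", "#..", "#..", "###"],
    "A": [".#.", "#.#", "###", "#.#", "#.#"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
}


def match_glyph(glyph):
    """Matrix matching: return the template character with the most agreeing pixels."""
    def agreement(template):
        return sum(
            g == t
            for g_row, t_row in zip(glyph, template)
            for g, t in zip(g_row, t_row)
        )
    return max(TEMPLATES, key=lambda ch: agreement(TEMPLATES[ch]))


def correct_with_lexicon(word, lexicon):
    """Post-processing: snap a raw OCR word to the closest lexicon entry, if any."""
    matches = get_close_matches(word, lexicon, n=1, cutoff=0.6)
    return matches[0] if matches else word


# A "C" with one noisy pixel in the fourth row still matches the "C" template best.
noisy_c = ["###", "#..", "#..", "#.#", "###"]
word = "".join(match_glyph(g) for g in (noisy_c, TEMPLATES["A"], TEMPLATES["T"]))
print(word)

# A misread digit is pulled back to a word from the lexicon.
lexicon = ["CAT", "COAT", "TACO"]
print(correct_with_lexicon("C4T", lexicon))
```

Pixel-by-pixel comparison is why matrix matching works best with known fonts: a glyph drawn in an unseen typeface may agree with the wrong template. The lexicon step shows why choosing the right language matters, since a word is only corrected if a close enough entry exists.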
SentiSight.ai offers a helpful optical character recognition tool that converts your files to searchable text with ease.
To start, go to Pre-trained models and select Text recognition from the drop-down list. Here you choose a lexicon from 75 available languages to improve recognition accuracy in post-processing. If you wish to recognize a language that uses a non-Latin alphabet, you can select the option to include Latin characters too. Upload your images and voilà – the results are presented on your screen.
SentiSight.ai’s optical character recognition tool displays segmented words and lines with bounding boxes. The blue label in the top left corner shows the predicted word, while the black label in the top right corner shows the prediction score as a percentage. Either label can be hidden by selecting the appropriate checkbox above the document. The results are listed at the bottom of the page and can also be downloaded in JSON format.
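Once the results are downloaded, they can be processed with any JSON library. The Python sketch below keeps only the words whose score passes a cut-off. The layout of the sample data, including the field names `label` and `score`, is a hypothetical stand-in chosen for this example and is not the documented SentiSight.ai format, so check a real download and adjust the keys accordingly. The function name `confident_words` and the 80% cut-off are also assumptions.

```python
import json

# Hypothetical sample; the real downloaded file may use different field names.
SAMPLE = """
[
  {"label": "Invoice", "score": 98.7},
  {"label": "t0tal", "score": 41.2},
  {"label": "2021", "score": 93.5}
]
"""


def confident_words(raw_json, min_score=80.0):
    """Return predicted words whose score (in percent) is at least min_score."""
    items = json.loads(raw_json)
    return [item["label"] for item in items if item["score"] >= min_score]


print(confident_words(SAMPLE))
```

Low-scoring words are often the ones worth reviewing by hand, so the same filter can be inverted to build a list of predictions to double-check.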
Our text recognition tool is accessible either via the SentiSight.ai web interface or via the REST API. More information on the latter can be found in the user guides.
Digitizing images and extracting their text is advantageous in numerous ways. Since the file is converted into a machine-searchable document, finding and editing a specific part of the text is a piece of cake. Moreover, files stored in a digital database instead of physical archives can be accessed quickly and backed up anytime at a low cost. Additionally, once extracted, digital text can be easily translated into other languages, reducing the time and cost of the translation process.
Start enjoying a more sustainable lifestyle and convenient access to your files with the new version of SentiSight.ai now!