Welcome back to Clearview blog! Here you’ll find regular articles about the latest in machine vision, including the latest breakthroughs in cutting-edge technology, technical theories, and insightful discussions on all things related to machine vision.
This is our third blog in a series on Optical Character Recognition (OCR) for Machine Vision. So far in this series, we have introduced Optical Character Recognition (OCR) and explored its history and uses, and taken a look at the String Reader tool from Matrox Imaging.
In this blog post, we’ll introduce and explore the SureDot OCR Tool from Matrox Imaging, and explore the unique advantages that can be gained with it. First, let’s look at CIJ printed text.
CIJ Printing
CIJ, or Continuous Inkjet, is a non-contact method of industrial printing that produces a continuous flow of ink droplets from a printhead nozzle.
These droplets are fired at the surface of package and labels in sequence to print text. Using electrostatic deflection, as many as 120,000 droplets can be printed per second.
While this is an extraordinarily efficient method of printing on large quantities, problems such as incorrect line speed, dirty printheads, and non-optimal distances between printhead and printing surface can lead to issues in legibility with CIJ printing. This creates potential issues for label verification, as some printed characters may be legible to human eyes but challenging for vision systems. Vice versa, it’s also possible that a vision system will read something that human eyes wouldn’t.
A good OCR/V system will need to recognise the ‘4’ in both instances, despite their differences.
The Traditional Machine Vision Approach to Reading CIJ
Historically, there have been two main challenges standing between machine vison systems and successful OCR for CIJ text.
1. Converting ‘dot’ characters into single-stroke characters
OCR has typically in the past only thrived in settings with single-stroke characters, such as in automatic number plate recognition (ANPR).
Dot characters and single-stroke characters
Rudimentary machine vision algorithms struggled to interpret the internal boundaries of characters with broken lines, and as CIJ consists of individual dots, pre-processing steps had to be taken to convert these characters from dots to single-stroke in order for vision systems to read these them. This is known as morphology.
Morphology of Alphanumerical Characters
The process of morphology involves dilating each of the dots by a radial pixel value. In the below example, we can see how this process involved some trial and error.
Furthermore, in the next example we can see how attempts to universally expand every dot by the same amount wasn’t always successful – some areas still contain gaps (which break the linearity), whilst other areas can become overcrowded and too swollen to read.
So far, we can see some challenges arising with this older method. These pre-processing steps are inherently time-consuming in their own right, and it only gets more challenging when printing errors – which happen frequently with CIJ printing – are factored in:
2. Dealing with transformed text