Page 130 - DCAP303_MULTIMEDIA_SYSTEMS
Multimedia Systems
trend to meet larger volumes and larger groups of users after 30 years of development of desktop
OCR. Internet and broadband technologies have made WebOCR and OnlineOCR practically
available to both individual users and enterprise customers. Since 2000, some major OCR vendors
have offered WebOCR and online software, and a number of new entrants have seized the
opportunity to develop innovative Web-based OCR services, some of which are free of charge.
Application-Oriented OCR
As OCR technology has been applied more and more widely in paper-intensive industries,
it faces more complex image conditions in the real world: for example, complicated
backgrounds, degraded images, heavy noise, paper skew, picture distortion, low resolution,
interference from grids and lines, and text images containing special fonts, symbols, and glossary
words. All of these factors affect the stability of an OCR product’s recognition accuracy.
In recent years, the major OCR technology providers have begun to develop dedicated OCR systems,
each for a special type of image. They combine various optimization methods related to that
type of image, such as business rules, standard expressions, glossary dictionaries, and the rich
information contained in colour images, to improve recognition accuracy.
This strategy of customizing OCR technology is called “Application-Oriented OCR” or “Customized
OCR”. It is widely used in fields such as Business-card OCR, Invoice OCR, Screenshot OCR, ID card
OCR, Driver-license OCR or Auto plant OCR, and so on.
7.2.2 OCR Software in Present Scenario
One study based on recognition of 19th and early 20th century newspaper pages concluded
that character-by-character OCR accuracy for commercial OCR software varied from 71% to
98%; total accuracy can only be achieved by human review. Other areas—including recognition
of hand printing, cursive handwriting, and printed text in other scripts (especially those East
Asian language characters which have many strokes for a single character)—are still the subject
of active research.
Accuracy rates can be measured in several ways, and how they are measured can greatly affect
the reported accuracy rate. For example, if word context (basically a lexicon of words) is not used
to correct output in which the software produces non-existent words, a character error rate of 1%
(99% accuracy) may result in a word error rate of 5% (95% accuracy) or worse if the measurement is based on whether
each whole word was recognized with no incorrect letters.
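
The relation between character-level and word-level accuracy can be illustrated with a small calculation. The sketch below is not taken from the study cited above; it assumes, for illustration only, that character errors are independent and that a word counts as correct only if every one of its letters is correct. The function name `word_accuracy` and the chosen word lengths are hypothetical.

```python
def word_accuracy(char_accuracy: float, word_length: int) -> float:
    """Probability that all letters of a word are recognized correctly.

    Assumption (for illustration): each character is recognized
    independently with the same probability `char_accuracy`.
    """
    return char_accuracy ** word_length


if __name__ == "__main__":
    char_accuracy = 0.99  # 1% character error rate, as in the text
    for length in (3, 5, 8):  # hypothetical word lengths
        acc = word_accuracy(char_accuracy, length)
        print(f"{length}-letter words: "
              f"word accuracy {acc:.1%}, word error rate {1 - acc:.1%}")
```

Under these assumptions, a word of five letters is recognized without error about 95% of the time, which matches the 5% figure quoted above; longer words fare worse.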
On-line character recognition is sometimes confused with Optical Character Recognition
(see Handwriting recognition). OCR is an instance of off-line character recognition, where the
system recognizes the fixed static shape of the character, while on-line character recognition instead
recognizes the dynamic motion during handwriting. For example, on-line recognition, such as that
used for gestures in the PenPoint OS or on the Tablet PC, can tell whether a horizontal mark was
drawn right-to-left or left-to-right. On-line character recognition is also referred to by other terms
such as dynamic character recognition, real-time character recognition, and Intelligent Character
Recognition or ICR.
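
The difference between the two kinds of input can be sketched in a few lines of code. The example below is only an illustration under stated assumptions, not the method of any particular product: pen input is assumed to arrive as hypothetical `(t, x, y)` samples, and the helper names `horizontal_direction` and `rasterize` are invented for this sketch. Two strokes drawn in opposite directions leave the same static image, yet the time-ordered samples reveal the direction.

```python
from typing import List, Set, Tuple

# Hypothetical pen sample: (time, x, y)
Sample = Tuple[float, float, float]


def horizontal_direction(samples: List[Sample]) -> str:
    """On-line view: use the time order of the samples to find direction."""
    if not samples:
        return "undetermined"
    ordered = sorted(samples, key=lambda s: s[0])  # sort by time
    dx = ordered[-1][1] - ordered[0][1]            # last x minus first x
    if dx > 0:
        return "left-to-right"
    if dx < 0:
        return "right-to-left"
    return "undetermined"


def rasterize(samples: List[Sample]) -> Set[Tuple[int, int]]:
    """Off-line view: only the set of inked pixels survives, not the timing."""
    return {(round(x), round(y)) for _, x, y in samples}


if __name__ == "__main__":
    stroke_a = [(0.0, 0, 5), (0.1, 1, 5), (0.2, 2, 5), (0.3, 3, 5)]
    stroke_b = [(0.0, 3, 5), (0.1, 2, 5), (0.2, 1, 5), (0.3, 0, 5)]
    print(horizontal_direction(stroke_a))
    print(horizontal_direction(stroke_b))
    print(rasterize(stroke_a) == rasterize(stroke_b))
```

The two strokes produce identical pixel sets, so an off-line system that sees only the scanned image cannot distinguish them, whereas the on-line samples can.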
On-line systems for recognizing hand-printed text on the fly have become well known as
commercial products in recent years. Among these are the input devices for personal digital
assistants such as those running Palm OS. The Apple Newton pioneered this type of product. The
algorithms used in these devices take advantage of the fact that the order, speed, and direction of
individual line segments at input are known. Also, the user can be retrained to use only specific
letter shapes. These methods cannot be used in software that scans paper documents, so accurate
recognition of hand-printed documents is still largely an open problem. Accuracy rates of 80% to
90% on neat, clean hand-printed characters can be achieved, but that accuracy rate still translates
to dozens of errors per page, making the technology useful only in very limited applications.
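
To see why such accuracy rates lead to dozens of errors per page, a rough estimate helps. The sketch below assumes, purely for illustration, a hand-printed page of 300 characters; this page size and the function name `expected_errors` are assumptions and do not come from the text.

```python
def expected_errors(chars_per_page: int, accuracy: float) -> float:
    """Expected number of misrecognized characters on one page."""
    return chars_per_page * (1.0 - accuracy)


if __name__ == "__main__":
    chars_per_page = 300  # assumed size of a hand-printed page
    for accuracy in (0.90, 0.80):
        errors = expected_errors(chars_per_page, accuracy)
        print(f"accuracy {accuracy:.0%}: about {errors:.0f} errors per page")
```

With this assumed page size, the estimate is roughly 30 errors at 90% accuracy and 60 at 80%; a denser page would contain proportionally more.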
Lovely Professional University