Languages; Training - ScanSoft OMNIPAGE PRO 14 Manual

Languages

The program can read over 110 languages in three alphabets: Latin,
Greek and Cyrillic. See the list in the OCR panel of the Options dialog
box. It shows which languages have dictionary support. A listing is also
provided on the ScanSoft web site.

In addition to user dictionaries, specialized dictionaries are available for
certain professions (currently medical, legal and financial) for some
languages. See the list and make selections in the OCR panel of the
Options dialog box.

The program identifies the language of recognized texts and displays it in the status
bar. This language marking is exported with the document. Use Set Language... in
the Tools menu to change the language marking for selected text. This does not
change the recognition language(s).

Training

Training is the process of changing the OCR solutions assigned to
character shapes in the image. It is useful for uniformly degraded
documents or when an unusual typeface is used throughout a document.
Training will be less useful for texts with random distortions. Here is an
example, based on the letter "g", which can be printed in different ways:

[Figure: four printed examples of the letter "g"]

The first two examples do not need training, because both shapes are
normal for the letter "g" and the program can handle them. The third
example could benefit from training because the shape of "g" is unusual,
and all instances of "g" in the text are likely to look like this. The fourth
example is not good for training, because the first "g" is poorly printed,
and this shape is unlikely to appear again in the document.