Pro OCR User's Guide
file:///C|/VisioneerDoc/Main.html [1/20/2003 4:21:09 PM]


Summary of Contents for Visioneer PROOCR100

  • Page 2 Guide Chapter 1: Introducing Visioneer Pro OCR 100 Chapter 1 Chapter 2: Introducing Visioneer Pro Learning Pro OCR Basics OCR 100 Chapter 3: This chapter introduces you to the Pro OCR application and to the concept of optical character Getting recognition (OCR).
  • Page 3 Pro OCR User’s Guide Recognized character recognition) application, such as Pro OCR. Document Every day you may spend a lot of time retyping printed text or numbers from hard copy documents. By using Pro OCR and a scanner as an input device, Chapter 6: you can eliminate much of this retyping.
  • Page 4 Pro OCR works with imperfect input pages that may have skewed lines of text, touching or broken characters, and fuzzy characters. © Copyright 1998 Visioneer, Inc. Reach us at www.visioneer.com.
  • Page 5 Introducing Visioneer Pro OCR 100 Pro OCR User’s Guide Chapter 1 Introducing Visioneer Pro OCR 100 This chapter introduces you to the Pro OCR application and to the concept of optical character recognition (OCR). Why Pro OCR Pro OCR is an Optical Character Recognition (OCR) application. An OCR application converts images of text, such as those obtained from scanning a document or receiving a fax through your fax-modem, into editable text.
  • Page 6 Introducing Visioneer Pro OCR 100 Most basic OCR applications inspect the scanned page image, attempt to recognize the dots on the page as characters, and transform the image into a plain text file. Pro OCR does all of these basic tasks, but it can also get the entire page into your word processor or spreadsheet as is—retaining the shape, form, type, and spacing, as well...
  • Page 7 Copyright Information Pro OCR User’s Guide for Windows. Copyright ©1998 Visioneer, Inc. All rights reserved. Reproduction, adaptation, or translation without prior written permission is prohibited, except as allowed under the copyright laws. AnyPort, AutoFix, AutoLaunch, FormTyper, MicroChrome, PaperEnable, PaperLaunch, PaperPort, PaperPort Deluxe, PaperPort ix, PaperPort Links,...
  • Page 8 If you find physical defects in the materials or the workmanship used in making the product described in this document, Visioneer will repair, or at its option, replace, the product at no charge to you, provided you return it (postage prepaid, with proof of your purchase from the original reseller) during the 12-month period after the date of your original purchase of the product.
  • Page 9 Increase the separation between the equipment and receiver. Connect the equipment into an outlet on a circuit different from that to which the receiver is connected. Consult the dealer or an experienced radio/TV technician for help. This equipment has been certified to comply with the limits for a class B computing device, pursuant to FCC Rules.
  • Page 10 Table of Contents Contents Chapter 1: Introducing Visioneer Pro OCR 100 Chapter 2: Learning Pro OCR Basics Chapter 3: Getting Documents Chapter 4: Locating Text and Graphics Chapter 5: Setting Recognize Options and Proofing a Recognized Document Chapter 6: Saving and Printing Documents...
  • Page 11 Table of Contents Contents Chapter 1: Introducing Visioneer Pro OCR 100 Why Pro OCR Features and Highlights of Pro OCR Glossary
  • Page 12 Glossary Glossary A4 Letter page size accelerator key alphanumeric word ASCII As Single Column locating method Auto OCR Auto brightness automatic document feeder (ADF) automatic processing background noise backup backwards compatible bit image bitmap bitmapped character bold text brightness broken character
  • Page 13 Glossary built-in dictionary CCITT character character format character identification error character image character recognition character style clipboard column information compression confidence consistent document copyrighted document deferred job deferred processing degraded image dialog box desktop document area dots per inch (dpi)
  • Page 14 Glossary draft quality text driver exporting export format file extension file formats file type fine resolution flatbed scanner font font family font mapping format retention Gallery Get Page grayscale image hard page breaks heavy character I-beam pointer
  • Page 15 Glossary icon illegible character illegible character symbol image view input file formats insertion point italic text justification kerning landscape orientation layout layout analysis error Legal page size Lenient suspect threshold letter quality text line break Locate locate region locating locating method menu
  • Page 16 Glossary menu bar multi-column text monospaced font monospaced font mapping newspaper style columns Normal locating method Normal suspect threshold numeric region On-Screen Verifier™ Optical Character Recognition (OCR) order of text regions orientation output file formats page controls page format page image page number box page orientation page size...
  • Page 17 Glossary page source picture element picture region pixel pixel-for-pixel plain text portrait orientation printer font Pro OCR Deferred format Pro OCR format Pro OCR process Pro OCR window Proof proportionally spaced font recognition accuracy Recognize recognized text recognizing region style resolution
  • Page 18 Glossary Rich Text Format (RTF) sans serif sans serif font mapping scanner scanner driver scanning screen font scroll bars serif serif font mapping settings file sheetfed scanner side-by-side columns single-bit image single-step processing skewed text spell checking standard resolution status bar
  • Page 19 Glossary status display area Stringent suspect threshold stroke weight Style ribbon stylized font subscript text superscript text supplementary dictionaries suspect character suspect threshold Tag Image File Format template template matching Template locating method text quality text region text style text view throughput TIFF touching characters...
  • Page 20 Glossary typeface type quality type size type style underline text User Defined page size user dictionary view selector window Windows word wrap zoom controls
  • Page 21 Glossary A4 Letter page size An A4 size page measures 8.33" x 11.66". accelerator key In Windows applications, a keyboard shortcut to a menu command. automatic document feeder (ADF). alphanumeric word A word made up of the alphabetic and numeric characters (A–Z, a–z, 0–9) in a character set.
  • Page 22 automatic processing A method for using Pro OCR with minimal intervention. Automatic processing involves setting appropriate Gallery settings, before using Auto Start to read in one or more image files or scan in one or more pages. Once page images have been acquired, automatic processing Locates and Recognizes each page image in succession.
  • Page 23 bitmapped character A character image made up of a pattern of dots that exists in a computer file or in memory as a bitmap. Bitmapped characters cannot be interpreted by a computer. In order for a computer to use bitmapped characters in a word processor or spreadsheet, the characters must first be interpreted by an OCR application and translated into ASCII text.
  • Page 24 character format Font and style information applied to characters. Character format information includes the font name and type size, as well as attributes such as underline, bold, italic, or some combination of these properties. Compare with page format. character identification error An incorrectly recognized bitmapped character.
  • Page 25 consistent document A set of pages or image files where the same Gallery settings apply to each page in the document. Pro OCR’s Auto Start feature can be used to best effect when a document is consistent. copyrighted document Most published or printed materials and documents are copyrighted.
  • Page 26 dots per inch (dpi) A measure of the visual resolution of a display or output device. Monitor screens typically have resolutions in the range of 70 to 75 dpi. Most common laser printers have a resolution of 300 dpi. The lower the resolution of a page in dots per inch, the lower the visual quality of characters on that page.
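As a rough illustration of the dots-per-inch definition above, the sketch below computes how many dots a page spans at a given resolution. The function name `page_pixels` and the 8.5 x 11 inch page size are illustrative assumptions, not taken from the guide.

```python
def page_pixels(width_in, height_in, dpi):
    """Return (width, height) in dots for a page rendered or scanned at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)

# Assumed page size: 8.5 x 11 inches (hypothetical example values).
at_laser = page_pixels(8.5, 11, 300)   # 300 dpi, the laser-printer figure in the glossary
at_screen = page_pixels(8.5, 11, 72)   # 72 dpi, inside the 70 to 75 dpi monitor range

print(at_laser, at_screen)
```

The same page carries far fewer dots at screen resolution than at 300 dpi, which is why characters on a low-resolution page look coarser.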
  • Page 27 fine resolution A term associated with FAX modems, referring to the highest resolution of the image files typically produced by these devices. Fine resolution is approximately 200 x 200 dpi, which is adequate for reliable recognition. flatbed scanner Scanner with a glass plate on which pages are placed face down.
  • Page 28 grayscale image An image format where individual pixels can be expressed with more than a single bit, allowing the image to contain true shades of gray. Pro OCR will not open grayscale images. Compare with single-bit image. hard page breaks Special formatting that you put in manually in a text or word processor document.
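To make the difference between a grayscale image and a single-bit image concrete, here is a minimal sketch that reduces grayscale pixel values (assumed to run from 0 for black to 255 for white) to one bit per pixel with a fixed cutoff. The function name and the cutoff of 128 are assumptions for illustration; the guide does not describe how any particular tool performs this conversion.

```python
def to_single_bit(gray_rows, cutoff=128):
    """Map each grayscale pixel to 1 (black dot) or 0 (white) using a fixed cutoff."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray_rows]

# A tiny 2 x 2 grayscale image (hypothetical values).
gray = [[0, 200],
        [130, 90]]
bits = to_single_bit(gray)
print(bits)
```

Every pixel ends up either on or off, so the shades of gray in the input are lost.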
  • Page 29 input file formats Pro OCR can read documents saved by other applications in TIFF, PCX and DCX formats, as well as those documents saved in its own proprietary TIFF format. See also TIFF. insertion point The place in a text file where text is inserted or deleted. Indicated by a blinking vertical bar.
  • Page 30 Lenient suspect threshold Tells Pro OCR to only highlight suspect characters it is very uncertain of. Very few characters are marked as suspect, compared to when the suspect threshold is set to normal or stringent. Use it when you’re dealing with documents containing fonts that you know from experience have been recognized accurately or when you’re less concerned with double-checking.
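The relationship between the Lenient, Normal, and Stringent suspect thresholds can be sketched as a confidence cutoff. The numeric cutoffs, the 0 to 1 confidence scale, and the function name below are hypothetical; the guide does not state how Pro OCR scores characters internally.

```python
# Hypothetical cutoffs: a character is flagged as suspect when its
# confidence falls below the cutoff. The numbers are illustrative only.
THRESHOLDS = {"lenient": 0.50, "normal": 0.75, "stringent": 0.90}

def suspects(chars, threshold):
    """Return the characters whose confidence is below the chosen cutoff."""
    cutoff = THRESHOLDS[threshold]
    return [ch for ch, conf in chars if conf < cutoff]

# (character, confidence) pairs for a made-up recognition result.
recognized = [("P", 0.99), ("r", 0.80), ("0", 0.60), ("l", 0.40)]

for name in ("lenient", "normal", "stringent"):
    print(name, suspects(recognized, name))
```

With these assumed values, the lenient setting flags the fewest characters and the stringent setting the most, matching the glossary's description.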
  • Page 31 menu A list of choices from which the user can choose. Menus appear when you point to and click a menu title in the menu bar, or a pop-up menu title in a window or dialog box. menu bar The horizontal strip at the top of a window that contains menu titles.
  • Page 32 numeric region Defines a numeric area on the page image in Image View and Text View. Numeric regions may be defined using Pro OCR’s manual region drawing feature, or may be recalled using the Template locating method. Compare with text region picture...
  • Page 33 page image The bitmapped image of a scanned page, displayed in the image view in Pro OCR. page number box Shows which page is being viewed and how many pages are in the document. Double-click it to go to a specific page. See also page controls.
  • Page 34 portrait orientation When you hold a page of text to read it, it is in portrait orientation when the page is taller than it is wide. Compare with landscape orientation. printer font The representation of a font or typeface used for printing by a printer.
  • Page 35 proportionally spaced font Also known as a variable pitch font. Typeface in which each character takes up an amount of horizontal space consistent with its relative physical width, i.e. an “i” needs less space than a “w.” Times Roman and Helvetica are two common proportionally spaced typefaces.
  • Page 36 Rich Text Format (RTF). sans serif Designation for font families in which the characters do not have serifs, which are the small strokes at the ends of characters. Common sans serif font families include Helvetica, Avant Garde, and Univers. Compare with serif. sans serif font mapping The font chosen for displaying sans serif text characters in text views.
  • Page 37 settings file A file, saved by choosing Save Settings from the File menu, that saves the current gallery, processing preferences, display preferences, proofing preferences, and selected scanner information in a named settings file. To use a settings file, retrieve it by choosing Retrieve Settings from the File menu. sheetfed scanner Scanner with an integral sheetfeeder, but no flatbed, on which pages are placed and fed through the scanner.
  • Page 38 spell checking Pro OCR automatically checks spelling during the Recognize step using its built-in dictionary and the current user dictionary. After Pro OCR finishes recognizing, you can check the spelling in a document using the user-configured Proof command. standard resolution A term associated with FAX modems, referring to the default resolution of the image files produced by these devices.
  • Page 39 subscript text Text with the subscript attribute is below the baseline like this. superscript text Text with the superscript attribute is above the baseline like this. supplementary dictionaries Optional dictionaries that can be used during spell checking in Pro OCR. There are four supplementary dictionaries included with Pro OCR: geographical, legal, medical, and an expanded dictionary.
  • Page 40 text region Defines a text area on the page image in the image view and the text view. Only text within defined text regions is recognized. Text regions may be defined manually or by using Pro OCR’s automatic locating settings. text style A piece of text’s attributes or styling, such as bold, italic, or underline.
  • Page 41 type size The vertical height measurement of type, commonly expressed in points (72 points=1 inch). Pro OCR recognizes and preserves type ranging in size from 5 points to 64 points. type style The variations in characters, including font characteristics such as bold and italic, and styling characteristics such as underlining.
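The type size entry gives the conversion 72 points = 1 inch and a recognized range of 5 to 64 points. The sketch below works that arithmetic through; the function names and the sample resolution are assumptions for illustration only.

```python
POINTS_PER_INCH = 72      # conversion stated in the glossary
MIN_PT, MAX_PT = 5, 64    # type size range the guide says Pro OCR recognizes

def points_to_pixels(points, dpi):
    """Height in dots of type of the given point size at a given resolution."""
    return points * dpi / POINTS_PER_INCH

def in_supported_range(points):
    """True when the point size falls inside the range quoted in the guide."""
    return MIN_PT <= points <= MAX_PT

print(points_to_pixels(12, 300))   # 12-point type scanned at 300 dpi
print(in_supported_range(12), in_supported_range(72))
```

Under these figures, 12-point type is one sixth of an inch tall, so it spans 50 dots at 300 dpi.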
  • Page 42 word wrap The automatic continuation of text from the end of one line to the beginning of the next. Word wrap lets you avoid pressing the Return key at the end of each line as you type. For example, when you input text in most word processors, lines of type are automatically “wrapped”...
  • Page 43: Table Of Contents

    Table of Contents Contents Chapter 2: Learning Pro OCR Basics The Basic Steps Starting Pro OCR Selecting a TWAIN-Compliant Scanner Learning About the Gallery Toolbar Tutorial Examples Example 1: Using Auto OCR to Scan a One-Page Simple Document and Save It in Pro OCR Format Example 2: Opening a File and Saving It in a Word Processor Format Example 3: Scanning a Document of Multi-Column Text Example 4: Scanning a Document With Tables and Saving in a Spreadsheet Format...
  • Page 44: Chapter 2: Learning Pro Ocr Basics

    TIP: If you use PaperPort software or scanners, see the Working with PaperPort document that came with Pro OCR. It provides tips and other information about using Pro OCR with these Visioneer products. The Basic Steps When you use Pro OCR, you convert an image of text and save it in an editable format.
  • Page 45: Starting Pro Ocr

    The following procedure helps you to get acquainted with Pro OCR and make sure that everything is set up correctly. TIP: In addition to the following procedure, Visioneer provides two other ways to start and use Pro OCR: 1) From the Windows Start menu, choose Programs, and then choose Visioneer OCR Wizard.
  • Page 46 Learning Pro OCR Basics Feature Does this... Pull-down menus Contains commands and options that you use to set process options and initiate actions. Many of the commands in the pull-down menus are also available by using the Gallery buttons and Gallery buttons drop-down lists. Gallery toolbar Lets you change common settings, start Auto OCR, or individually perform any of the basic steps required to...
  • Page 47: Selecting A Twain-Compliant Scanner

    TWAIN-compliant devices. You can select the TWAIN device in the Pro OCR software. NOTE: If you are using Pro OCR with Visioneer’s PaperPort software or scanners, see the Working with PaperPort document that came with Pro OCR, instead of the following procedure.
  • Page 48: Learning About The Gallery Toolbar

    Learning Pro OCR Basics Figure 2-1: Select Source Dialog Box NOTE: If the scanner driver you want is not shown, make sure that the scanner is properly connected to the computer and that both the scanner and the computer are plugged in, turned on, and operating correctly. 2.
  • Page 49 Learning Pro OCR Basics NOTE: Often you will use Auto OCR to complete processing. However, sometimes it is better to perform each step individually. (This is also referred to as manual or single-step operation.) For example, you use the single-step procedures when you want to manually define locate regions, create a template, redo a step, recognize different type quality settings, or scan pages that have mixed orientations (portrait and landscape). Button...
  • Page 50: Tutorial Examples

    Learning Pro OCR Basics Save As Saves the converted document in a variety of formats, such as text, Rich Text Format (RTF), or HTML. You can select options with the Gallery buttons by using the drop-down list next to each button. To select an option from a Gallery drop-down list: 1.
  • Page 51 Learning Pro OCR Basics Example 1: Using Auto OCR to Scan a One-Page Simple Document and Save It in Pro OCR Format This example shows how to convert (recognize) the text in a one-page document. You can find a ready-to-use sample in the back of the Getting Started Guide. Selecting Gallery Options Pro OCR processes a document using the options that are set in each drop-down list associated with a button of the Gallery toolbar.
  • Page 52 Learning Pro OCR Basics 6. Click End. Pro OCR continues with the second task to locate text regions on the page. A progress bar moves down the page. When Pro OCR finishes locating, it displays text boxes indicating located text regions, with arrows connecting each text region to the next.
  • Page 53 Learning Pro OCR Basics In the next step, Pro OCR recognizes the located text. While Pro OCR is recognizing, again a progress bar moves down the page. When Pro OCR finishes recognizing the text, the Recognition Completed dialog box appears. 7.
  • Page 54 Learning Pro OCR Basics Usually at this point you proof the document. For now, just save it. Saving a Document You can save the processed document to disk in different formats. For example, if you want to open the document again in Pro OCR, you select the Pro OCR format. To save the document:
  • Page 55 Learning Pro OCR Basics 1. Choose Save from the File menu, or click the Save As button on the Gallery toolbar. The Save As dialog box appears. 2. Choose Pro OCR from the Save As drop-down list. By saving the document in this format, you can edit the pages later within Pro
  • Page 56: Example 2: Opening A File And Saving It In A Word Processor Format

    Learning Pro OCR Basics OCR. If you save in another file format, you must open it in an application that supports that format. 3. Type in a name for the file in the File Name box. 4. Click Save. The text and format information of the document is saved in the format you’ve selected.
  • Page 57 Learning Pro OCR Basics 3. In the Pro OCR directory, select the file SAMPLEB.TIF. 4. Click Get. The sample file is read in and the progress bar moves down the page. Locating the Regions in a Document For Pro OCR to properly convert areas of a document, you must locate the regions of the page that will be recognized.
  • Page 58 Learning Pro OCR Basics Recognizing the Document The third step is to actually convert or recognize the text in a document. Pro OCR reads the text and displays the actual characters. Before recognizing the document, you should specify the quality of the image text. You can do this by using the Recognize drop-down list.
  • Page 59 Learning Pro OCR Basics To recognize the document: 1. Select Degraded or Fax Quality from the Recognize button drop-down list. 2. Click the Recognize button in the Gallery toolbar. Pro OCR displays a bar that moves through the document as Pro OCR recognizes the text.
  • Page 60 Learning Pro OCR Basics Proofing the Document After a document is recognized it appears in the text view. In this view, you can proof the document for errors and make changes to the document when you find problems. When you proof, you can: Inspect recognized text and edit it if necessary.
  • Page 61: Example 3: Scanning A Document Of Multi-Column Text

    Learning Pro OCR Basics Pro OCR displays the next suspect entry. 4. Repeat the previous steps until you have checked the entire document. 5. If you want to change the font style, select the text, and click the Style option. Saving the Document Saving the document places a permanent copy of it on disk.
  • Page 62 Learning Pro OCR Basics Locate Text Only prevents Pro OCR from locating any picture element in the document to be scanned. 3. Select Use Scanner from the Get Page drop-down list in the Gallery toolbar. 4. Click Auto OCR in the Gallery toolbar. Your scanner software dialog box appears.
  • Page 63 Learning Pro OCR Basics While Pro OCR recognizes the page, notice the boxes indicating located text regions around each column, and the arrows connecting each text region to the next. Note that by using Locate Text Only, the graphic element in the sample was not located and so a box does not appear around it.
  • Page 64 Learning Pro OCR Basics When Pro OCR finishes recognizing, the Recognition Completed dialog box appears. 7. Click OK. The document appears in the text view. To save the document 1. Choose Save As from the File menu, or click the Save As button in the Gallery toolbar.
  • Page 65: Example 4: Scanning A Document With Tables And Saving In A Spreadsheet Format

    Learning Pro OCR Basics Documents.” Example 4: Scanning a Document With Tables and Saving in a Spreadsheet Format This example introduces you to processing of multi-column text in tables, where you want the text to be recognized as all one text block and not broken into columns. You can use this procedure whenever you want to recognize tables and other documents that you don’t want broken into columns.
  • Page 66 Learning Pro OCR Basics Notice that the text regions are not drawn separately around each column. By using the Single Column locating method, you force Pro OCR to ignore columns and tell it to read the page from left to right, top to bottom. When Pro OCR is finished recognizing the page, the Recognition Completed dialog box appears.
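The "left to right, top to bottom" reading order described for the Single Column locating method can be pictured as a sort on position. This is only a sketch of the idea, not Pro OCR's actual algorithm; the `(left, top, text)` tuples and the function name are hypothetical.

```python
def reading_order(fragments):
    """Order (left, top, text) fragments top to bottom, then left to right."""
    return [text for left, top, text in sorted(fragments, key=lambda f: (f[1], f[0]))]

# Cells of a small table, listed in arbitrary order (made-up coordinates).
cells = [
    (300, 10, "Q2"), (10, 10, "Item"), (150, 10, "Q1"),
    (10, 40, "Widgets"), (300, 40, "7"), (150, 40, "5"),
]
print(reading_order(cells))
```

Reading across each row keeps a table row together, which is the behavior wanted when the text is going into a spreadsheet; a column-by-column order would instead read all of one column before starting the next.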
  • Page 67 Learning Pro OCR Basics Pro OCR displays the document in the text view. To save the document: 1. Choose Save As from the File menu, or click the Save As button in the Gallery toolbar. The Save As dialog box appears.
  • Page 68: Example 5: Scanning And Saving A Document With Pictures

    Learning Pro OCR Basics 2. Choose Microsoft Excel from the Save as Type drop-down list. Notice that the following options are already selected. TIP: To change these options, click the Options button. 3. Type in a name for the file in the File Name box. 4.
  • Page 69 Learning Pro OCR Basics After the sample document is scanned, it appears in the Pro OCR window. Pro OCR begins getting the page from the scanner. When the scanning is done, a dialog box appears asking if you want to scan additional pages. For this example, you won’t be scanning any additional pages.
  • Page 70 Learning Pro OCR Basics The Recognition Completed dialog box appears. 6. Click OK. The document appears in the text view. Notice that the graphic image appears and has a picture region drawn around it. To save the document: 1. Choose Save As from the File menu, or click the Save As button in the Gallery toolbar.
  • Page 71: Example 6: Locating A Document Using A Template

    Learning Pro OCR Basics 6. Click Save. The picture from the scanned page is now saved within the RTF file along with the recognized text. If you open this file in a word processor that supports pictures in RTF files, you see the recognized text and the pictures. 7.
  • Page 72 Learning Pro OCR Basics 3. In the Temp folder, find and select the file TEMPB.TPL. 4. Click Open. Pro OCR displays the name of the template you selected next to Template in the Locate drop-down list. 5. Select Open File from the Get Page drop-down list. 6.
  • Page 73: Example 7: Scanning A Document With Mixed Tables And Manually Locating Regions

    Learning Pro OCR Basics copyright in the footer were not recognized. If you save this page in an application or text format, only the displayed text is saved. 10. Save and close the document. Use the same procedures described in the earlier examples. Example 7: Scanning a Document with Mixed Tables and Manually Locating Regions This example shows you how to scan and manually locate a document with a table...
  • Page 74 Learning Pro OCR Basics 3. Press and hold the mouse button; then drag down and to the right until the box following the pointer encloses all of the column headers. 4. Release the mouse button. You have just manually located a text region. 5.
  • Page 75 Learning Pro OCR Basics 9. Using the same steps you used to create the text regions, drag the mouse until the box following it encloses all three columns of numbers and release the mouse button. Make sure the entire image of the number columns is enclosed by the new region you have defined.
  • Page 76 You have completed this example. A message appears asking if you want to save the document. 4. Choose Close from the File menu. Close the document without saving it.
  • Page 77 Getting Documents Chapter 3 Getting Documents This chapter tells you how to get (acquire) documents with Pro OCR. It is assumed that you completed the procedures in “Starting Pro OCR” and “Selecting a TWAIN-Compliant Scanner” in Chapter 2. In this chapter you learn: The basic steps for getting a page How to get a page using a scanner How to get a page from a file...
  • Page 78 Getting Documents Getting Pages From a Scanner You can use a scanner to get one page at a time by using the Get Page button, or use a scanner with Auto OCR to get multiple pages automatically. This section tells you how to: Set scanning options Get one page using Get Page...
  • Page 79 Getting Documents To set Get Page Processing options: 1. Choose Options from the Tools menu. The Options dialog box appears with the Processing tab selected. 2. Select the options that you want to use. 3. Click OK. Selecting a Scanner as the Source When you get pages from a scanner by using Auto OCR, a deferred job, or Get Page, one or more page images are read in from the scanner.
  • Page 80 Getting Documents NOTE: If you did not previously select a scanner, the Select Scanner dialog box appears, letting you select one now. (You can also select a scanner by choosing Select Scanner from the Tools menu.) Getting a Page Using a Scanner During the single-step Get Page operation, you scan only one side of one page at a time.
  • Page 81 Getting Documents Pro OCR scans the page on the flatbed or the first page in the ADF, using the current brightness, page size, orientation, and scanning resolution settings. After the single page is read in, it appears using the previous magnification. NOTE: To find the most appropriate brightness setting for a page, use Get Page to scan the same page as many times as necessary.
  • Page 82 Getting Documents Page drop-down list. 2. Check the Locate and Recognize options to make sure they are set the way you want them. 3. Place the first page on the flatbed. Make sure the page is oriented correctly for your scanner and the page orientation you have selected in the Gallery.
  • Page 83 Getting Documents If the Enable Auto OCR Dialogs processing option is not selected, scanning is completed. Pro OCR begins locating and then recognizing. If the Enable Auto OCR Dialogs processing option is selected, Pro OCR asks for additional pages to scan after it finishes reading in the current page:
  • Page 84 Complete the following procedure to use Auto OCR with scanners that have an ADF. NOTE: To use an ADF scanner with Pro OCR, you need the Pro OCR ISIS upgrade. For more information, visit Visioneer’s Web site at www.Visioneer.com. To automatically process one or more pages with a scanner that has an ADF: 1.
  • Page 85 Getting Documents 5. Click Auto OCR. Pro OCR begins getting pages. If the Enable Auto OCR Dialogs processing option is not selected, scanning is completed. Pro OCR begins locating and then recognizing. If the Enable Auto OCR Dialogs processing option is selected, Pro OCR asks for additional pages to scan.
  • Page 86 Getting Documents Scanning is completed. Pro OCR finishes getting pages and displays the first page of the scanned stack in the image view. The scanned double-sided text is correctly sequenced, in correct page order. Getting Pages from an Image File Typically, Pro OCR obtains the image of a page by working directly with your scanner.
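The guide states that scanned double-sided text ends up in correct page order. One way such sequencing can work is sketched below, under the assumption (not stated in the guide) that the stack is flipped for the second pass, so the back sides arrive in reverse order. The function name and page labels are hypothetical.

```python
def sequence_duplex(fronts, backs_as_scanned):
    """Interleave front sides with back sides that were scanned in reverse order."""
    # Assumption: flipping the stack means the last sheet's back is scanned first.
    backs = list(reversed(backs_as_scanned))
    ordered = []
    for front, back in zip(fronts, backs):
        ordered.extend([front, back])
    return ordered

# First pass reads pages 1, 3, 5; second pass reads 6, 4, 2.
pages = sequence_duplex(["p1", "p3", "p5"], ["p6", "p4", "p2"])
print(pages)
```

If the backs were instead scanned in the same order as the fronts, the `reversed` call would be dropped.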
  • Page 87 Getting Documents 1. Select Open File from the Get Page drop-down list. A checkmark appears next to it when selected. 2. Click the Get Page button in the Gallery toolbar. The Get Page dialog box appears. 3. Select the file and click Get. The file is read in and the progress bar moves down the page.
  • Page 88 Getting Documents 4. Click Auto OCR. 5. Find and select the file(s) that you want to process. 6. Click Add and then click Get. Pro OCR automatically processes the image file(s) according to the controls in the Locate and Recognize rows of the Gallery. 1.
  • Page 89 Getting Documents You can specify one or more image files for the Get Page step, and then have Pro OCR automatically locate and recognize them. If you’ve selected the Enable Auto OCR Dialogs processing option, you can also select one or more additional files after reading the initial files and before locating and recognizing begin.
  • Page 90 Getting Documents can add available files from as many directories and disks as necessary. Files are displayed in the Selected Files list in the order in which you add them. NOTE: To remove a file from the Selected list, select the file name and click the Remove button.
  • Page 91 Getting Documents More About Enabling Auto OCR Dialogs By default, after you’ve used Auto OCR to scan pages or to read in one or more files, Pro OCR displays a dialog box that prompts you to continue in one of several ways: Scan another page or stack of pages Scan the second side of a page or stack Open additional files...
  • Page 92 Getting Documents 2. To enable the dialogs, select Enable Auto OCR Dialogs. To disable the dialog boxes, deselect the option.
  • Page 93 Saving and Printing Documents Chapter 6 Saving and Printing Documents This chapter describes the input file formats and output file formats that Pro OCR supports and tells you how to save documents in a variety of these formats. Saving Documents and Other Pro OCR Items You can save the following documents and items: Documents (in various file formats) Templates (text, numeric, picture, and table region definitions and ordering...
  • Page 94 Saving and Printing Documents The Save As dialog box appears: If the document has been saved previously, the name of the document is displayed and selected in the File Name box. If the document has not already been saved, the File Name box is selected and contains the default file name: UNTITLED.XXX.
  • Page 95 Saving and Printing Documents Standard text file formats Word processor and spreadsheet file formats For more information about the different file formats, see “Supported Output File Formats” later in this chapter. 4. If you want to save any pictures in the document, select the Save Pictures option and choose a picture format from the Picture Format drop-down list.
  • Page 96 Saving and Printing Documents currently chosen in the Save as Type drop-down list, click the Options button to open the Save As Options dialog box. Most formats have additional options. If there are no options available for the format you’ve selected, the Options button is dimmed. The Save As Options dialog box has the following sets of options: If page breaks should be inserted between each page If formatting should be preserved or completely discarded, or if only...
  • Page 97 Saving and Printing Documents Point size Justification Number of columns Line spacing Paragraph indentation Page size Margin sizes Choose one of the Split Document options to either keep all pages in one file or split the document into multiple files: All Pages in One File: Choose this option to save all the pages in the document in one file.
  • Page 98 Saving and Printing Documents one image page per file. If you save in PCX format, Pro OCR automatically selects this option, because a PCX file can only have one page. When you use this option, Pro OCR automatically creates one file for each page.
  • Page 99 Saving and Printing Documents 2. Process the pages as you would normally. 3. When you save the document, choose Save As from the File menu. The Save As dialog box appears. 4. Click the Options button. The Options dialog box appears. 5.
  • Page 100 Saving and Printing Documents 5. Click OK. Saving Multiple Page Images as Separate Image Files In addition to Pro OCR format, you can save a document in a number of image output formats. Usually, you’ll save a copy of your document in one of these graphic formats when the document you’re processing has illustrations that you want to save and use in other applications.
  • Page 101 Saving and Printing Documents To save a template: 1. Choose Save Template As from the File menu. 2. Enter a file name. 3. Choose the format from the Save as Type drop-down list, and then click Save. You can open a saved template by double-clicking the Template button or by choosing Select Template from the File menu.
  • Page 102 Saving and Printing Documents formats. Pro OCR can save to a variety of output file formats at various stages of processing. Table 6-1: Proprietary Pro OCR Formats Pro OCR Pro OCR Text Only Pro OCR Deferred Table 6-2: Standard Image File Output Formats TIFF Uncompressed TIFF Group 3 TIFF PackBits...
  • Page 103 Saving and Printing Documents NOTE: If you don’t have any of the applications listed here, note that many word processor and spreadsheet applications can handle formats from other word processors and spreadsheets. Most Windows word processors can import RTF files, although some have only limited support for RTF.
  • Page 104 Saving and Printing Documents The current state of each page in the document is saved, including any locate regions or recognized text. Saving to Pro OCR Deferred Format The Pro OCR Deferred file format is a special case of the Pro OCR file format. Use it to save work in progress so that you can open the document later for further single-step processing (using the Open command in the File menu) or to complete processing (using the Process Deferred Jobs command in the File menu).
  • Page 105 Saving and Printing Documents suspect and illegible characters if there are any left, check spelling, and search for numbers, punctuation, symbols, and alphanumeric words. However, because you haven’t saved the page image, you can’t use the On-Screen Verifier. Text files take up a lot less space than image files. Image files are large, even when compressed.
  • Page 106 Saving and Printing Documents preserved. When you output a recognized document in Plain Text format, the text is sequentially output in the order in which the text blocks were located. Margins and columns are not preserved. Text with Line Breaks. Preserves text, tabs, and a carriage return at the end of each line.
  • Page 107 Saving and Printing Documents font, character spacing, and line length information. Hyper Text Markup Language (HTML). Inserts HTML tags to format the document for viewing in an HTML browser. Saving to Application Formats When you save to a specific application format, by default Pro OCR saves as much of this format, character, and font information as possible.
  • Page 108 Saving and Printing Documents Doesn’t Support If you have a word processor that Pro OCR does not support directly, try saving your document in one of the other Pro OCR word processor export formats. In addition, most word processors can import RTF files, although some have only limited support for RTF.
  • Page 109 Saving and Printing Documents Saving Pictures During the Locate and Recognize steps, if Locate Text and Pictures has been selected, Pro OCR processes any pictures, or other nontext information on the input page, as embedded graphic images. When you save to a graphic output file format or to a word processor format that supports embedded pictures, and you select the Save Pictures option in Save As, Pro OCR saves these embedded graphic images.
  • Page 110 Saving and Printing Documents 5. Click OK.
  • Page 111 Locating Text and Graphics Chapter 4 Locating Text and Graphics A locate region identifies an area of a page image to be recognized. You define locate regions in the image view using Pro OCR’s locating procedures. This chapter tells you how to: Identify the different kinds of locate regions Select the appropriate locating method Locate regions automatically and manually...
  • Page 112 Locating Text and Graphics Numeric Regions A numeric region is a locate region that Pro OCR recognizes as numbers (0–9) or one of the symbols shown in the following table. Table 4-1: Numeric Symbols ¥ = * “ % $ # £...
  • Page 113 Locating Text and Graphics region are recognized as numbers and not mistaken for letters. You can define numeric regions manually or with a template. Pro OCR does not define numeric regions automatically. You can also redefine a selected numeric region as any other kind of locate region using the Style menu or the Style ribbon. Picture Regions A picture region is a locate region that contains any kind of graphic, illustration, photograph, drawing, or picture.
  • Page 114 Locating Text and Graphics Typically, you locate a table by putting a single text or numeric region around all of the columns of the table. However, if you have a table where some columns are text and some columns are numeric, you may want to use the Make Table command. Make Table allows you to select different types of regions and then combine them into one object so that the text is exported into a tabular format, rather than columns.
  • Page 115 Locating Text and Graphics Deciding When to Use Multiple Columns or Single Column Only Depending on the content of the page, you can organize the actual flow of the text in different ways. In particular, does the text flow like a newspaper article (top to bottom first, and then left to right), or does it flow like a form (left to right first, and then top to bottom): When you look at a page like this, if you understand its contents, you know which...
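The two text flows described above can be illustrated with a small sketch. This is not Pro OCR's locating algorithm; the `Region` type, the fixed column width, and the fixed row height are assumptions made only to show how the same four regions come out in a different order under each flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A hypothetical locate region, identified by its top-left corner."""
    name: str
    left: int
    top: int


def newspaper_order(regions, column_width):
    # Newspaper flow: finish each column top to bottom, then move right.
    return sorted(regions, key=lambda r: (r.left // column_width, r.top))


def form_order(regions, row_height):
    # Form flow: finish each row left to right, then move down.
    return sorted(regions, key=lambda r: (r.top // row_height, r.left))


# Four regions laid out as a 2 x 2 grid on the page.
regions = [
    Region("A", 0, 0), Region("B", 300, 0),
    Region("C", 0, 200), Region("D", 300, 200),
]

newspaper = [r.name for r in newspaper_order(regions, column_width=300)]
form = [r.name for r in form_order(regions, row_height=200)]
print(newspaper)  # ['A', 'C', 'B', 'D']
print(form)       # ['A', 'B', 'C', 'D']
```

The only difference between the two functions is which coordinate is compared first, which is the same choice you make when deciding how the text on a page should flow.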
  • Page 116 Locating Text and Graphics How to Locate Text and Picture Regions Locating is typically done after getting a page and before recognizing. You select a locating method to tell Pro OCR how to define and order locate regions on a page. Pro OCR uses the selected locating method with automatic processing and when you click the Locate button.
  • Page 117 Locating Text and Graphics TIP: To select White Out Text, choose Options from the Tools menu, and then select White Out Text in Pictures in the Processing Options. Locating with a Template If the locating method you selected, Multiple Columns or Single Columns Only, doesn’t work exactly as you want, you can manually create the appropriate locate regions on a page.
  • Page 118 Locating Text and Graphics 3. Click the Locate button in the Gallery toolbar. Pro OCR locates the document. 4. Manually adjust the locate regions. For example, to adjust the region size, such as to exclude text, click the border of the text region and drag to include or exclude text. To delete a text region, select the border of the region and press the Delete key.
  • Page 119 Locating Text and Graphics The Select Template dialog box appears. 3. Find and select the template that you want to use. 4. Click Open. Pro OCR displays the name of the template you selected next to Template in the Locate drop-down list. 5.
  • Page 120 Locating Text and Graphics In the image view, the order of locate regions is shown by arrows from the center of one locate region to the top-center of the next locate region. This sequence tells Pro OCR in what order it should process the regions: You can manually change the order of locate regions that either you or Pro OCR have defined.
  • Page 121 Locating Text and Graphics The order of the paragraphs (and the flow of the text) on the page might be as shown in Example 5-1 or as shown in Example 5-2: When the order of text regions is defined as in Example 5-1, the text is output to the word processor as in Example 5-3.
  • Page 122 Locating Text and Graphics Processing Resumes For Pro OCR to process resumes and legal documents properly, select Single Columns Only from the Locate drop-down list in the Gallery toolbar. Resumes often contain formatting elements that can be difficult for an OCR program to interpret, such as numerous indentations, bulleted items, and a wide mixture of both justified and centered text.
  • Page 123 Locating Text and Graphics About Columns, Locate Regions, and Output File Formats Pro OCR preserves virtually all page layout and text flow information in the documents it processes. However, when you save to a specific word processor format, Pro OCR preserves only as much of this layout information as the particular application format is designed to use.
  • Page 124 Locating Text and Graphics Create tables. You can use manually located regions to create a template, just as you can create templates from automatically located text regions. As with all other locating procedures, you can only manually define locate regions in the image view by selecting Image from the View menu.
  • Page 125 Locating Text and Graphics Numeric region Hold down the Ctrl key as you drag the cross hair pointer across the page image. Picture region Hold down the Ctrl + Shift keys, as you drag the cross hair pointer across the page image. As you drag, a box is drawn from the corner where you started to the cross hair of the pointer.
  • Page 126 Locating Text and Graphics Overlapping Locate Regions and Skewed Text When you manually create text or numeric regions, you should be aware of the following constraints. If a text or numeric region cuts through a character, only the part of the character that is within the region is located.
  • Page 127 Locating Text and Graphics These constraints are especially important when pages are skewed (read in crooked). Because locate regions are defined by rectangles that are square to the screen, when you have skewed text in a document, you may have to overlap text or numeric regions in order to not cut off any lines and get all the text into the appropriate region.
  • Page 128 Locating Text and Graphics For more information about Straighten Skewed Images, see “Setting Scanning Options,” in Chapter 3. Selecting and Deselecting Locate Regions You can only select locate regions in the image view. You select a locate region to change its kind, delete it, or resize it. When any locate region is selected, sizing handles appear: To select a single locate region: 1.
  • Page 129 Locating Text and Graphics To deselect one or more locate regions while keeping the rest selected: Move the pointer over a selected locate region and shift-click. The locate region you clicked in is deselected, but all other selected locate regions stay selected. Repeat this step for each locate region you want to deselect.
  • Page 130 Locating Text and Graphics want to define a different locate region that includes the image in that region. NOTE: Only the defined locate region is deleted, not the underlying image. The underlying image never gets deleted. To delete a locate region: 1.
  • Page 131 Locating Text and Graphics The same conditions on the size, overlap, containment, and extent of locate regions apply to a resized locate region as to a newly created locate region. Reordering Locate Regions You can only reorder locate regions in the image view. Whenever you create a locate region, Pro OCR automatically links all locate regions on the page in sequence.
  • Page 132 2. Drag the pointer into locate region #2, then release the mouse button. The arrow originally leading into locate region #2 disappears, and a new arrow connects locate region #1 to locate region #2:
  • Page 133 Setting Recognize Options and Proofing a Recognized Document Chapter 5 Setting Recognize Options and Proofing a Recognized Document When you recognize a document you convert an image into editable text. You can then proof and edit the text. This chapter tells you how to: Select the type quality option for recognizing.
  • Page 134 Setting Recognize Options and Proofing a Recognized Document To select the type quality for recognizing: Select a type quality from the Recognize drop-down list in the Gallery toolbar. Selecting Display Options Use the Options dialog box to select options that tell Pro OCR how to recognize a document and display the results.
  • Page 135 Setting Recognize Options and Proofing a Recognized Document Pro OCR recognizes and identifies over 2,000 typefaces. It can make correct judgments about character identity even when a character isn’t absolutely clear. However, sometimes Pro OCR cannot identify with certainty what a particular character is, and other times Pro OCR cannot identify a character at all.
  • Page 136 Setting Recognize Options and Proofing a Recognized Document Option Does this... Stringent Suspect Threshold Identifies ALL suspect characters. Use the stringent setting when it is important that you know about all possible mistaken identifications, or when using dictionaries will not aid in identification. For example, use Stringent when you recognize tables of numbers, documents with a lot of proper names, and whenever you need to check the...
  • Page 137 Setting Recognize Options and Proofing a Recognized Document The choices for the illegible character symbol include: The preset symbol is “~”. Every illegible character is represented by the same selected illegible character symbol. Choose a symbol that you otherwise don’t expect to have in your document, so that when you search for it you will only find the illegible characters.
  • Page 138 Setting Recognize Options and Proofing a Recognized Document you’ll always have the same fonts installed in your system as the fonts identified in the input document. To maintain as much similarity to the input document as possible, Pro OCR maps any identified fonts to three user-selectable fonts installed in your system: one monospaced font, one serif font, and one sans serif font.
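As a rough illustration of the three-font mapping described above, the sketch below maps each recognized run of text to one display font per category. The font names, the category labels, and the `runs` data are assumptions for the example, not values taken from Pro OCR.

```python
# Hypothetical user-selected display fonts, one per category.
DISPLAY_FONTS = {
    "monospaced": "Courier New",
    "serif": "Times New Roman",
    "sans serif": "Arial",
}


def map_to_display_font(identified_category, display_fonts=DISPLAY_FONTS):
    """Return the installed display font chosen for a font category."""
    return display_fonts[identified_category]


# Hypothetical recognition result: (text run, category of the identified font).
runs = [
    ("Invoice", "sans serif"),
    ("Total: 42.00", "monospaced"),
    ("Dear Sir,", "serif"),
]

mapped = [(text, map_to_display_font(category)) for text, category in runs]
for text, font in mapped:
    print(f"{text!r} is displayed in {font}")
```

Changing an entry in `DISPLAY_FONTS` changes how every run in that category is shown, without touching the recognized text itself.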
  • Page 139 Setting Recognize Options and Proofing a Recognized Document TIP: You can display and proof a document using one display font or set of display fonts, and then change the settings to save the document with other fonts. This is one reason for saving a separate settings file.
  • Page 140 Setting Recognize Options and Proofing a Recognized Document You can locate regions automatically using Locate with any of the locate settings, or you can locate them manually. For more information about locating, see Chapter 4, “Locating Text and Graphics.” 2. Select either Letter Quality, Dot Matrix Quality, or Degraded or Fax Quality from the Recognize drop-down list in the Gallery toolbar.
  • Page 141 Setting Recognize Options and Proofing a Recognized Document Setting the Zoom Levels The zoom controls are active in both the image view and the text view. Use them to change between zoom levels. You cannot zoom in closer than the pixel-for-pixel level (in the image view), or 400% (in the text view), or zoom out farther away than 25% in either view.
  • Page 142 Setting Recognize Options and Proofing a Recognized Document minimum zoom level, the zoom out control is dimmed. To zoom in or out: Click the Zoom In or Zoom Out icon on the Status bar. Selecting a Page to Display The page controls are available in both the image view and the text view. The page number box in between the page controls tells you what page of the open document is being displayed and how many pages there are in the document.
  • Page 143 Setting Recognize Options and Proofing a Recognized Document The requested page appears. The page number box changes to the new page number. Selecting Text or Image View The View controls are available in both the image view and the text view. Use them to change between the image view and the text view.
  • Page 144 Setting Recognize Options and Proofing a Recognized Document Edit a document Selecting Proofing Options Set Pro OCR Proof options to indicate if you want to proof whole lines and what combinations of words and punctuation you want to proof. To select Proofing options: 1.
  • Page 145 Setting Recognize Options and Proofing a Recognized Document 3. (Optional) If you select Combination Of, select any of the following options: Proof Suspect and Illegibles. Pro OCR selects each suspect or illegible character as it is encountered. Note that Pro OCR uses the selected suspect threshold display option to decide which characters are suspect.
  • Page 146 Setting Recognize Options and Proofing a Recognized Document character, the suspect or illegible character is visited and selected first, and the next time you use Proof, the same word is selected again. To proof a document: 1. In the text view, click the Proof button, or choose Proof from the Recognize menu.
  • Page 147 Setting Recognize Options and Proofing a Recognized Document Replace to make changes. TIP: If the selected text is misspelled, and you expect to find further instances of the word in this document, don’t edit it. Instead, use Find & Replace. The selected word is displayed as the Find text.
  • Page 148 Setting Recognize Options and Proofing a Recognized Document Click the text view button in the Status bar, or choose Text from the View menu. To edit text within a line: 1. Move the pointer over the text line. The pointer indicates the text selection. 2.
  • Page 149 Setting Recognize Options and Proofing a Recognized Document Cut, copy, paste, clear: Use keyboard equivalents or click the right mouse button. To select or deselect characters one at a time: Hold down the Shift key while using the arrow keys. Lines don’t wrap when more characters are added to a line.
  • Page 150 Setting Recognize Options and Proofing a Recognized Document Each time you select another line, a box is drawn around it. The previously selected lines stay selected. The lines don’t have to be next to one another to be selected. 4. Repeat steps 2 and 3 for each additional text line you want to select. 1.
  • Page 151 Setting Recognize Options and Proofing a Recognized Document selected. To deselect all text lines: 1. Move the pointer outside all text lines. When the pointer is outside all text lines, it is the standard arrow pointer. 2. Click the mouse button. All selected text lines are deselected.
  • Page 152 Setting Recognize Options and Proofing a Recognized Document You can only apply a text style in the text view. You may apply a text style to any selected text. Text can be styled with any combination of Bold, Italic, and/or Underline. All text from the selected lines is changed to the selected style.
  • Page 153 Setting Recognize Options and Proofing a Recognized Document When you use Proof with the Misspelled Words proofing option selected, Pro OCR searches for words that are not in the General dictionary, the current user dictionary, or any supplemental Pro OCR dictionary in the dictionaries directory. Pro OCR selects the first candidate word it finds after the insertion point or the start of the current page.
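The lookup described above can be sketched as a search for the first word that appears in none of the active dictionaries, starting at the insertion point. This is only an illustration: the word pattern, the sample dictionaries, and the function name are assumptions, and Pro OCR's own matching rules are not documented here.

```python
import re

# Assumed definition of a "word": letters and apostrophes only.
WORD = re.compile(r"[A-Za-z']+")


def first_unknown_word(text, dictionaries, start=0):
    """Return (word, offset) for the first word at or after `start` that is
    in none of the dictionaries, or None if every word is known."""
    known = set()
    for dictionary in dictionaries:
        known.update(word.lower() for word in dictionary)
    for match in WORD.finditer(text, start):
        if match.group().lower() not in known:
            return match.group(), match.start()
    return None


general = {"the", "page", "is", "scanned"}   # stand-in for the General dictionary
user = {"visioneer"}                          # stand-in for a user dictionary

result = first_unknown_word("The Visioneer page is scaned", [general, user])
print(result)  # ('scaned', 22)
```

Because "Visioneer" is in the user dictionary, it is skipped, which mirrors the reason for adding proper names and product terms to a user dictionary before proofing.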
  • Page 154 Setting Recognize Options and Proofing a Recognized Document To create a user dictionary: 1. Choose Select User Dictionary from the Tools menu. The following dialog box appears: 2. Type in the name of the new dictionary. 3. Click OK. The new dictionary is created and automatically selected. To select a user dictionary: 1.
  • Page 155 Setting Recognize Options and Proofing a Recognized Document 3. Click OK. The current user dictionary (and any changes you make to it) is used until you choose a different one. To add to the User Dictionary while editing in the text view: 1.
  • Page 156 Setting Recognize Options and Proofing a Recognized Document
  • Page 157 Table of Contents Contents Chapter 3: Getting Documents Getting a Page—The Basic Steps Getting Pages From a Scanner Setting Scanning Options Selecting a Scanner as the Source Getting a Page Using a Scanner Using Auto OCR with Scanners Getting Pages from an Image File Selecting a File as the Source and Getting Pages Getting Files From Other Scanner Applications Getting Fax-modem Files...
  • Page 158 Table of Contents Contents Chapter 4: Locating Text and Graphics Kinds of Locate Regions Text Regions Numeric Regions Picture Regions Tables Pro OCR’s Locating Methods Locating Text and Pictures Locating with a Template Order of Locate Regions Examples of Locating Documents Processing Resumes Processing Legal Documents Processing Faxed Documents...
  • Page 159 Table of Contents Selecting and Deselecting Locate Regions Changing the Kind of a Locate Region Deleting a Locate Region Resizing a Locate Region Reordering Locate Regions Glossary
  • Page 160 Table of Contents Contents Chapter 5: Setting Recognize Options and Proofing a Recognized Document Selecting Type Quality Options Selecting Display Options Setting the Suspect Character Threshold Setting the Illegibles Character Symbol Selecting a Display Font Indicating Whether Pictures Appear During Text View Recognizing a Single Page Working with Recognized Pages in Text view Setting the Zoom Levels...
  • Page 161 Table of Contents Checking Spelling in a Document Adding Words to a User Dictionary Displaying a Summary of Recognized Errors Glossary
  • Page 162 Table of Contents Contents Chapter 6: Saving and Printing Documents Saving Documents and Other Pro OCR Items Saving a Document Saving Templates Saving Settings Supported Output File Formats Saving to Proprietary Pro OCR Formats Saving to Standard Image File Formats Saving to Generic Text File Formats Saving to Application Formats Format Suppression and Customizing...
  • Page 163 Table of Contents Contents Chapter 7: Creating and Processing Deferred and Batch Jobs The Advantages of Finish and Deferred Processing Guidelines for Using Finish Processing and Deferred Processing How it Works Setting Up and Processing Deferred Jobs Processing Deferred Jobs Batch Processing Glossary
  • Page 164: Chapter 7: Creating And Processing Deferred And Batch Jobs

    Creating and Processing Deferred and Batch Jobs Chapter 7 Creating and Processing Deferred and Batch Jobs This chapter tells you how to process Deferred, Finish, and Batch jobs. Finish Processing lets you combine the efficiency of multi-step automatic operation with the power and flexibility of single-step interactive operation. You can process pages in your document according to their specific characteristics, while still having automatic processing available for the rest of the pages in the document.
  • Page 165: Guidelines For Using Finish Processing And Deferred Processing

    Creating and Processing Deferred and Batch Jobs Locate, or Recognize setting for the different pages in a document. You’ll also use these processes with a mixture of settings or when more than one person works on the documents or more than one workstation is used. Guidelines for Using Finish Processing and Deferred Processing You can combine automatic processing, single-step processing, Finish Processing, and deferred processing in a variety of ways:...
  • Page 166 Creating and Processing Deferred and Batch Jobs Use Create Deferred Job to get pages and save them in the Pro OCR Deferred format for processing later on. After you create a deferred job, you can use any combination of locating and recognizing on some or all pages and then save the document.
  • Page 167 Creating and Processing Deferred and Batch Jobs from any directory or disk. 4. Type in a new file name. 5. Click Save. If Open File is selected in Get Page as the source to get pages from, the Auto Get Page dialog box appears. If Use Scanner is selected as the source, Pro OCR immediately starts to scan.
  • Page 168: Processing Deferred Jobs

    Creating and Processing Deferred and Batch Jobs click the Get button. The Get Page process is the same as when you’re using Auto OCR with either a scanner or a file. 7. When you’re finished getting pages, click Finished. The pages are read in the same way that they are when you use Auto OCR. When all pages are read in, a dialog box tells you the process is completed.
  • Page 169 Creating and Processing Deferred and Batch Jobs Deferred jobs are saved in the Pro OCR Deferred format with image, locate regions (if any), and recognized text (if any). 3. Select the file you want to process and click Get. To select multiple files, click the Advanced button, choose a file, and click Add.
  • Page 170: Batch Processing

    Creating and Processing Deferred and Batch Jobs NOTE: The Process Deferred Jobs command does not process non-Pro OCR image files. If some of your files could not be processed, read them in again (using Get Page, Auto OCR or Create Deferred Job) and process them as you normally would.
  • Page 171 Creating and Processing Deferred and Batch Jobs Batch Process lets you specify the source directory that contains the image files, the image file type, the destination directory where the recognized results are saved, and the export format. Pro OCR automatically performs OCR on each image file under the source directory and exports the results to the destination directory.
  • Page 172 The progress of Batch Process is shown in the title bar of the Pro OCR window. Each processed file appears in the destination directory that you specified.
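The four Batch Process settings (source directory, image file type, destination directory, export format) can be pictured as the loop below. It is a minimal sketch, not Pro OCR's implementation: `batch_process` and the `recognize` callback are hypothetical names, the recognition step is a stand-in, and the sketch assumes only files directly inside the source directory are processed.

```python
import tempfile
from pathlib import Path


def batch_process(source_dir, image_ext, dest_dir, export_ext, recognize):
    """Run `recognize` on every matching image file and write one result
    file per image into the destination directory."""
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for image in sorted(source.glob(f"*.{image_ext}")):
        # Keep the image's base name and swap in the export extension.
        out = dest / f"{image.stem}.{export_ext}"
        out.write_text(recognize(image), encoding="utf-8")
        written.append(out)
    return written


# Demonstration with empty placeholder files and a fake recognizer.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "in"
    src.mkdir()
    (src / "page1.tif").write_bytes(b"")
    (src / "page2.tif").write_bytes(b"")
    (src / "notes.txt").write_text("not an image", encoding="utf-8")

    outputs = batch_process(
        src, "tif", Path(tmp) / "out", "txt",
        recognize=lambda image: f"text of {image.name}",
    )
    output_names = [p.name for p in outputs]
    first_text = outputs[0].read_text(encoding="utf-8")

print(output_names)  # ['page1.txt', 'page2.txt']
```

Files that do not match the chosen image type (here `notes.txt`) are left alone, and each processed image produces exactly one file in the destination directory.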
  • Page 173: Glossary

    Table of Contents Contents Chapter 8: Tips for Getting the Best Results Fixing Broken and Touching Characters Adjusting Brightness for Consistent Documents Handling Documents That Are Not Consistent Processing Documents with Different Page Sizes or Orientations Processing Documents with Different Character Quality Converting Parts of a Page in a Multipage Document Changing the Gallery Options Using Get Page Again...
  • Page 174: Chapter 8: Tips For Getting The Best Results

    Tips for Getting the Best Results Chapter 8 Tips for Getting the Best Results This chapter provides tips for getting the best results from Pro OCR by: Fixing broken and touching characters Adjusting the brightness to obtain consistent documents Processing inconsistent documents Changing a setting after completing autoprocessing Getting the best recognition Making sure page images are not skewed...
  • Page 175: Adjusting Brightness For Consistent Documents

    Tips for Getting the Best Results When characters are light or broken, use a lower (darker) setting. However, when there are both broken and touching characters on the same page or in the same document, trying to fix one problem may make the other problem worse.
  • Page 176 Tips for Getting the Best Results You may have to experiment with different settings. If your scanner supports Auto brightness, you may want to try it first before setting brightness manually. 2. Click the Get Page button in the Gallery, or choose Get Page from the Process menu.
  • Page 177: Handling Documents That Are Not Consistent

    Tips for Getting the Best Results that you begin to make the characters too light and/or broken. Handling Documents That Are Not Consistent Sometimes the pages in your document are not consistent; for example, they may not all have the same page size. To handle this, you must change the Gallery options for each page.
  • Page 178: Processing Documents With Different Character Quality

    Tips for Getting the Best Results to be processed. 2. Click Get Page to get the page, or choose Get Page from the Process menu. 3. Repeat Steps 1 and 2 for each page in the document. 4. Choose Finish Processing from the Process menu. Make sure you select the Locate and Recognize options in the Gallery toolbar.
  • Page 179: Converting Parts Of A Page In A Multipage Document

    Tips for Getting the Best Results 5. To scan the page again with a different brightness setting, delete the page. 6. Increase or decrease the brightness. If the page image contained dark or touching characters, increase brightness. If the page image contained light or broken characters, decrease brightness. If the page image has a “noisy”...
  • Page 180 Tips for Getting the Best Results 1. Choose Create Deferred Job from the Process menu. The Create Deferred Job dialog box appears. 2. Select the files you want to process. Create Deferred Job lets you scan a stack of pages or read in a set of image files.
  • Page 181: Changing The Gallery Options

    Tips for Getting the Best Results 2. Determine which Locate and Recognize options in the Gallery apply to the majority of pages. For example, the locating options might be Locate Text Only and Single Columns Only. You’ll use these settings in step 4. 3.
  • Page 182: Using Get Page Again

    Tips for Getting the Best Results Recognize over again. Using Get Page Again You may want to use Get Page again if you scan pages in with an incorrect Page Size or Orientation setting, or if you didn’t use an appropriate brightness or scanning resolution setting.
  • Page 183: Using Recognize Again

    Tips for Getting the Best Results Using Recognize Again This may be necessary if the text on a page was not recognized accurately because of an incorrect type quality setting. You can recognize again at any step in the Pro OCR process.
  • Page 184: Making Sure Page Images Are Not Skewed

    Tips for Getting the Best Results 2. Click the Proofing tab, and select the following options: Suspects (Normal), Illegibles, and Misspelled Words. 3. Click OK. 4. Choose Proof from the Process menu. When Proof selects the character to replace, select the word that contains the suspect character or illegible character you want to replace.
  • Page 185: Using Numeric Regions When You're Recognizing Numeric Text

    Tips for Getting the Best Results Even with good quality characters on good quality paper, Pro OCR will have trouble locating and recognizing accurately if the type in the page image is skewed (crooked). This can happen either because the text is crooked on the page or because the page is scanned at an angle.
  • Page 186: Avoiding Markings On Pages

    If you don’t want to mark up your original document, make a photocopy and use whiteout on it.
  • Page 187 Index accuracy of recognition ADF and Auto OCR All Pages in One File (Split Document options) application formats for saving Auto Get Page dialog box (1) Auto Get Page dialog box (2) Auto OCR from a file from a scanner with a flatbed with an ADF scanner auto orientation Batch Process dialog box...
  • Page 188 Index deferred processing advantages continuing job creating job explanation guidelines setting up Degraded or Fax Quality command deleting locate regions dictionary adding words creating General user See also user dictionary directories deferred jobs dictionaries discarding format when saving Display Options command (1) Display Options command (2) Display Options command (3) Display Pictures option...
  • Page 189 Index exporting to unsupported word processor fax example faxed document processing fax-modem files features file formats input fax-modem files files from other scanner applications TIFF output Pro OCR Pro OCR Text Only spreadsheet standard text word processor File menu Process Deferred Job command (1) Process Deferred Job command (2) Save As command File Properties dialog box...
  • Page 190 Index retrieving saving source controls (1) source controls (2) type quality controls Get Info get page basic steps files from unsupported scanners from file from scanner getting fax-modem files getting multiple files (1) getting multiple files (2) one scanned page scanning additional pages setting options (1) setting options (2)
  • Page 191 Index legal document processing locate regions changing the kind of defining manually defining the order deleting kinds of legal document example locating manually method to use numeric order of overlapping regions and skewed text overlapping text and pictures picture redefining reordering resizing resume example...
  • Page 192 Index numbers and alphanumeric words Numeric Region icon numeric regions One Page Per File option (Split Document options) On-Screen Verifier example of use (1) example of use (2) showing in Text View turning on or off opening a file Optical Character Recognition (OCR) defined uses for Options dialog box...
  • Page 193 Index PCX file format Picture Region icon picture regions defined white out text pictures, saving preserving format when saving printing Pro OCR file format (1) Pro OCR file format (2) Pro OCR Text Only file format Pro OCR window Process Deferred Job command (File menu) (1) Process Deferred Job command (File menu) (2) Process Deferred Jobs Complete dialog box Processed Deferred dialog box...
  • Page 194 Index resume processing retrieve settings rotate Save As command (File menu) Save As dialog box Save As Options Save As Options dialog box saving a template as HTML as plain text as RTF as spreadsheet as text for database for spreadsheet for word processor Gallery settings multiple documents as separate files...
  • Page 195 Index scanning additional pages one page second side selecting a scanner setting options with Auto OCR and ADF with Auto OCR and scanner with flatbed Select Source dialog box Select Template dialog box Select User Dictionary dialog box selecting a scanner single-step operation get page locate...
  • Page 196 Index tables defined scanning mixed single column template creating saving selecting using (1) using (2) using (3) text applying styles copying deleting deselecting regions selecting all lines selecting more than one line selecting single line Text Region icon Text Region icon text view editing operations editing text...
  • Page 197 (1) view controls (2) Visioneer format White Out Text option (1) White Out Text option (2) Wizard word processor exporting to unsupported saving to Zoom controls (Status bar)

This manual is also suitable for:

Pro OCR 100
