Extracting Active Text From An Image - Adobe ACROBAT 9 HOW-TOS Manual

#61: Extracting Active Text from an Image
A page scanned in older versions of Acrobat, or one created from a photo
or drawing, is only an image of a page, so you can't extract its images
or modify its text. However, Acrobat can convert the image of the
document into actual text, or add a text layer to the document, using
optical character recognition (OCR). When the OCR process is complete,
evaluate the captured document to make sure Acrobat interpreted the
content correctly. It's easy for OCR to confuse the bitmap of the letter I
with the number 1, for example.
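The kind of check described above can also be roughed out in code once the recognized text has been copied out of the document. The following is a minimal sketch, not anything Acrobat provides: the `CONFUSABLE` table and the `flag_suspect_tokens` helper are hypothetical names, and the rule (flag any token that mixes digits with digit-lookalike letters) is an assumption chosen for illustration.

```python
import re

# Hypothetical table of letters that OCR can mistake for digits.
# This list is an assumption for illustration, not taken from Acrobat.
CONFUSABLE = {"I": "1", "l": "1", "O": "0", "S": "5", "B": "8"}


def flag_suspect_tokens(text):
    """Return tokens that mix digits with digit-lookalike letters."""
    suspects = []
    for token in re.findall(r"[A-Za-z0-9]+", text):
        has_digit = any(ch.isdigit() for ch in token)
        has_lookalike = any(ch in CONFUSABLE for ch in token)
        if has_digit and has_lookalike:
            suspects.append(token)
    return suspects


sample = "Invoice 2O19 lists I2 items at 15 dollars"
print(flag_suspect_tokens(sample))  # prints ['2O19', 'I2']
```

A heuristic like this only narrows down where to look; it will miss confusions inside ordinary words and will flag legitimate codes such as part numbers, so the captured document still needs a visual check.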
To capture the content of an image document, do the following:
1. Choose Document > OCR Text Recognition > Recognize Text Using
OCR. The Recognize Text dialog opens. Specify whether you want to
capture the current page, an entire document, or specified pages in a
multipage document.
2. Click the Edit button to open the Recognize Text - Settings dialog.
Choose one of three options in the PDF Output Style pop-up menu:
   • Searchable Image compresses the foreground and places the
     searchable text behind the image. Compressing affects the image
     quality.
   • Searchable Image (Exact) keeps the foreground of the page intact
     and places the searchable text behind the image.
   • ClearScan rebuilds the page, converting the content into text,
     fonts, and graphics.
If you select either the Searchable Image or the ClearScan option, you
can choose one of four options from the Downsample Images pop-up
menu, ranging from 600 dpi down to 72 dpi. Downsampling reduces file
size, but it can also result in unusable images.
Click OK to return to the Recognize Text dialog.
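To get a feel for the downsampling tradeoff, it helps to work out how many pixels a page holds at each resolution. The sketch below is only arithmetic under stated assumptions: a US Letter page (8.5 x 11 inches) and resolution steps of 600, 300, 150, and 72 dpi. The text above gives only the 600 and 72 dpi endpoints, so the two middle values are assumed, and the function names are hypothetical.

```python
# Assumed values for illustration: a US Letter page and four resolution
# steps. Only the 600 and 72 dpi endpoints come from the text.
PAGE_INCHES = (8.5, 11.0)
DPI_OPTIONS = [600, 300, 150, 72]


def pixel_dimensions(dpi, page=PAGE_INCHES):
    """Return (width, height) in pixels for a page at the given dpi."""
    width_in, height_in = page
    return round(width_in * dpi), round(height_in * dpi)


def pixel_count(dpi, page=PAGE_INCHES):
    """Return the total number of pixels on the page at the given dpi."""
    width, height = pixel_dimensions(dpi, page)
    return width * height


full = pixel_count(DPI_OPTIONS[0])
for dpi in DPI_OPTIONS:
    width, height = pixel_dimensions(dpi)
    share = pixel_count(dpi) / full
    print(f"{dpi:>3} dpi: {width} x {height} pixels ({share:.1%} of the 600 dpi page)")
```

Because pixel count scales with the square of the resolution, halving the dpi keeps only a quarter of the pixels. That is why the lowest setting shrinks files so sharply and why small type can become unreadable at that setting.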

Rounding Up the Suspects

Converting a bitmap of letters and numbers into actual letters and
numbers may result in items that can't be definitively identified, known
as suspects. Here's how to fix them.

Select Document > OCR Text Recognition > Find First OCR Suspect to open
the dialog where Acrobat identifies suspect characters for you to
confirm. Work through the suspects using several options:
   • Select the text in the Suspect field and type the correct letters.
   • Click Not Text when the suspect isn't a word at all.
   • Click Find Next to go to the next suspect.
   • Click Accept and Find to confirm the interpretation, and go to the
     next suspect.
   • Click Close to end the process.
Depending on the characteristics of the document's text, you may have to
modify some conversion results, such as the font or character spacing,
using the TouchUp Text tool.
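
The suspect review in the sidebar amounts to walking a list of uncertain items and recording one decision for each. The sketch below models that workflow in plain Python so the choices are easy to compare. It does not call Acrobat in any way: the `Suspect` class, the `review` function, and the decision names (`accept`, `correct`, `not_text`) are all hypothetical, and treating Not Text as "leave the item out of the text" is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Suspect:
    position: int  # index of the item in the list of recognized words
    guess: str     # what OCR proposed for that item


def review(words, suspects, decisions):
    """Apply reviewer decisions to a list of recognized words.

    decisions maps a suspect position to one of:
      ("accept",)          keep the OCR guess
      ("correct", "text")  replace the guess with the typed text
      ("not_text",)        leave the item out of the text
    Suspects with no decision are left unchanged, like skipping ahead.
    """
    result = list(words)
    dropped = set()
    for suspect in suspects:
        decision = decisions.get(suspect.position)
        if decision is None or decision[0] == "accept":
            continue
        if decision[0] == "correct":
            result[suspect.position] = decision[1]
        elif decision[0] == "not_text":
            dropped.add(suspect.position)
    return [word for i, word in enumerate(result) if i not in dropped]


words = ["Tota1", "due:", "~~", "I5", "dollars"]
suspects = [Suspect(0, "Tota1"), Suspect(2, "~~"), Suspect(3, "I5")]
decisions = {0: ("correct", "Total"), 2: ("not_text",), 3: ("correct", "15")}
print(review(words, suspects, decisions))
# prints ['Total', 'due:', '15', 'dollars']
```

Keeping the decisions separate from the recognized words means the original OCR output is never modified in place, so a review pass can be repeated or adjusted without running recognition again.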