Extracting Active Text From An Image - Adobe ACROBAT 9 HOW-TOS Manual

A page scanned in older versions of Acrobat, or one created from a photo or drawing, is only an image of a page, and you can't manipulate its content by extracting images or modifying the text. However, Acrobat can convert the image of the document into actual text or add a text layer to the document using optical character recognition (OCR). Be sure to evaluate the captured document when the OCR process is complete to make sure Acrobat interpreted the content correctly. It's easy to confuse a bitmap that may be the letter I with the number 1, for example.
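As a rough illustration of why the recognized text deserves a second look, the following Python sketch flags words in OCR output that mix letters and digits and contain easily confused characters such as I and 1. It is not how Acrobat identifies doubtful characters; the function name `flag_for_review` and the look-alike character sets are assumptions made for this example.

```python
import re

# Characters that are easy to confuse in a bitmap (illustrative, not exhaustive).
LOOKALIKE_LETTERS = set("IlOSB")
LOOKALIKE_DIGITS = set("1058")


def flag_for_review(text):
    """Return words that mix letters and digits and contain a look-alike character.

    This is a simple heuristic for proofreading OCR output by hand; it is
    hypothetical and unrelated to Acrobat's own recognition engine.
    """
    flagged = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        has_alpha = any(c.isalpha() for c in word)
        has_digit = any(c.isdigit() for c in word)
        if not (has_alpha and has_digit):
            continue  # purely alphabetic or purely numeric words are skipped
        if any(c in LOOKALIKE_LETTERS or c in LOOKALIKE_DIGITS for c in word):
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    sample = "Invoice 1O4 was paid in fu11 on May 12"
    print(flag_for_review(sample))
```

A heuristic like this only narrows down where to look; a word made entirely of misread characters (for example, "I2" read as "12") passes unnoticed, so a visual check of the converted page is still needed.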

To capture the content of an image document, do the following:

1. Choose Document > OCR Text Recognition > Recognize Text Using OCR. The Recognize Text dialog opens. Specify whether you want to capture the current page, an entire document, or specified pages in a multipage document.

2. Click the Edit button to open the Recognize Text - Settings dialog. Choose one of three options in the PDF Output Style pop-up menu:

• Searchable Image compresses the foreground and places the searchable text behind the image. Compressing affects the image quality.

• Searchable Image (Exact) keeps the foreground of the page intact and places the searchable text behind the image.

• ClearScan rebuilds the page, converting the content into text, fonts, and graphics.

If you select either the Searchable Image or the ClearScan output style, you can choose one of four options from the Downsample Images pop-up menu, anywhere from 600 down to 72 dpi. Downsampling reduces file size, but can also result in unusable images.

Click OK to return to the Recognize Text dialog.
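To get a feel for how much image data downsampling discards, the arithmetic can be sketched in Python. The sketch assumes a US Letter page (8.5 x 11 inches) scanned at 600 dpi, and it assumes the two intermediate menu values are 300 and 150 dpi; the text above names only the 600 and 72 dpi endpoints. Pixel count is just a proxy for file size, because the saved size also depends on compression.

```python
def pixel_count(width_in, height_in, dpi):
    """Number of pixels in a page image of the given size (inches) at a given dpi."""
    return round(width_in * dpi) * round(height_in * dpi)


def relative_pixels(dpi, source_dpi=600):
    """Fraction of the source pixels that remain after downsampling to dpi."""
    return (dpi / source_dpi) ** 2


if __name__ == "__main__":
    # Assumed page size: US Letter. Assumed menu values: 600, 300, 150, 72 dpi.
    for dpi in (600, 300, 150, 72):
        pixels = pixel_count(8.5, 11, dpi)
        share = relative_pixels(dpi) * 100
        print(f"{dpi:>3} dpi: {pixels:>10,} pixels ({share:.2f}% of the 600 dpi image)")
```

Because the pixel count falls with the square of the resolution, halving the dpi keeps only a quarter of the pixels, and going from 600 to 72 dpi keeps about 1.44 percent of them. That is why the smallest setting can leave small type or fine detail in the page image unreadable.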

#61: Extracting Active Text from an Image

Rounding Up the Suspects

Converting a bitmap of letters and numbers into actual letters and numbers may result in items that can't be definitively identified, known as suspects. Here's how to fix them.

Select Document > OCR Text Recognition > Find First OCR Suspect to open the dialog where Acrobat identifies suspect characters for you to confirm.

Work through the suspects using several options:

• Select the text in the Suspect field and type the correct letters.

• Click Not Text when the suspect isn't a word at all.

• Click Find Next to go to the next suspect.

• Click Accept and Find to confirm the interpretation, and go to the next suspect.

• Click Close to end the process.

Depending on the characteristics of the document's text, you may have to modify some conversion results, such as the font or character spacing, using the TouchUp Text tool.
