[PLUG] pdf, postscript and text
Rich Shepard
rshepard at appl-ecosys.com
Mon Sep 29 15:16:02 UTC 2003
On Mon, 29 Sep 2003, Aaron Burt wrote:
> Totally different PS generators, thus different PS code. Differences in
> graphics rendering and compression become significant here.
Aha! That makes sense; didn't think of it.
> From the sound of it (beeeg PS files, no text) these are PDFs containing
> bitmaps of printed or scanned output. In other words, you don't have
> text, just pictures of text. Does the text select tool in acroread work?
No! And that threw me for a loop. I thought that I could at least cut and
paste the text if I couldn't disassemble the files.
> Sounds like you're stuck with extracting bitmaps (GS can print to TIFF and
> other formats) and feeding 'em through an OCR program.
Oy, vey! Anyone have gocr or jocr running well? What I'll need to do is
convert _all_ these multi-hundred page documents into something I can use.
Groan!
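
For the batch job, the route Aaron describes (render each page to a bitmap
with Ghostscript, then run each bitmap through OCR) might be scripted roughly
as below. This is only a sketch under assumptions: it uses Ghostscript's
pbmraw device rather than TIFF because gocr reads PNM files directly, and the
300 dpi resolution, the page-%04d.pbm naming, and the helper names
(gs_command, gocr_command, ocr_pdf) are all made up for illustration. How
well gocr copes with these particular scans is untested.

```python
import subprocess
from pathlib import Path


def gs_command(pdf, out_dir, dpi=300):
    """Build a Ghostscript command that renders every page to its own PBM file."""
    # %04d is expanded by Ghostscript to the page number (0001, 0002, ...).
    pattern = Path(out_dir) / "page-%04d.pbm"
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=pbmraw",           # 1-bit bitmap; assumed good enough for text
        f"-r{dpi}",                  # resolution is a guess, tune for the scans
        f"-sOutputFile={pattern}",
        str(pdf),
    ]


def gocr_command(image, text_file):
    """Build a gocr command that reads one bitmap and writes recognized text."""
    return ["gocr", "-i", str(image), "-o", str(text_file)]


def ocr_pdf(pdf, out_dir):
    """Render a PDF to bitmaps, OCR each page, and return the combined text."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(gs_command(pdf, out_dir), check=True)

    pages = sorted(out_dir.glob("page-*.pbm"))
    for page in pages:
        subprocess.run(gocr_command(page, page.with_suffix(".txt")), check=True)

    # Join the per-page text files in page order.
    return "\n".join(
        page.with_suffix(".txt").read_text(errors="replace") for page in pages
    )
```

Calling ocr_pdf("report.pdf", "pages") (both names hypothetical) would leave
one .pbm and one .txt per page in the pages directory, so a bad page can be
re-rendered at a different resolution without redoing the whole document.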
Thanks, Aaron,
Rich
Dr. Richard B. Shepard, President
Applied Ecosystem Services, Inc. (TM)
2404 SW 22nd Street | Troutdale, OR 97060-1247 | U.S.A.
+ 1 503-667-4517 (voice) | + 1 503-667-8863 (fax) | rshepard@appl-ecosys.com
http://www.appl-ecosys.com/