[PLUG] pdf, postscript and text

Steve Bonds 1s7k8uhcd001 at sneakemail.com
Tue Sep 30 09:16:02 UTC 2003

On Mon, 29 Sep 2003, Aaron Burt aaron-at-speakeasy.org |PDX Linux| wrote:

> On Mon, 29 Sep 2003, Rich Shepard wrote:

> >   Second, with this same pdf file, I cannot get pstotext, ps2ascii or
> > prescript to successfully disassemble the postscript file to a text file.
> > All this used to work, of course. The postscript file was created on a
> > Microsoft system at a federal agency.

> From the sound of it (beeeg PS files, no text) these are PDFs containing
> bitmaps of printed or scanned output.  In other words, you don't have
> text, just pictures of text.  Does the text select tool in acroread work?

If this is the case, how did it use to work?  Are these PS files new or
different from the ones before?
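
One rough way to test Aaron's theory on the PS file itself is to count the
operators that paint text against the ones that paint bitmaps.  The sketch
below is only a heuristic, and the function names (count_paint_ops,
looks_like_scanned) are made up for illustration.  It assumes the file calls
the standard PostScript operators by name; many generators wrap them in their
own short procedures, in which case the counts will be off.  It can also be
fooled by operator names that appear inside strings or encoded data.

```python
import re

# Standard PostScript operators that paint text or raster images.
# Assumption: the file uses these names directly rather than aliases.
TEXT_OPS = {"show", "ashow", "widthshow", "awidthshow",
            "xshow", "yshow", "xyshow", "kshow"}
IMAGE_OPS = {"image", "imagemask", "colorimage"}

# Runs of letters; "showpage" is one token, so it does not match "show".
TOKEN = re.compile(rb"[A-Za-z]+")


def count_paint_ops(ps_bytes):
    """Return (text_op_count, image_op_count) for raw PostScript bytes."""
    text = images = 0
    for line in ps_bytes.splitlines():
        # Crude comment stripping: ignores '%' that sits inside a string.
        code = line.split(b"%", 1)[0]
        for tok in TOKEN.findall(code):
            name = tok.decode("ascii")
            if name in TEXT_OPS:
                text += 1
            elif name in IMAGE_OPS:
                images += 1
    return text, images


def looks_like_scanned(ps_bytes):
    """Guess that a file is pictures of text: bitmaps but no text operators."""
    text, images = count_paint_ops(ps_bytes)
    return images > 0 and text == 0
```

To try it on a real file, read it in binary mode and pass the bytes, e.g.
looks_like_scanned(open("report.ps", "rb").read()), where report.ps stands in
for whatever the file is called.  A True result only suggests there is no
text to extract; it does not prove it.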

How are the PS files actually created?  Do you have any control over that
process?  I.e. could you get them to stop pasting images of text into Word
before they print and run it through their own OCR first like they used
to?  ;-)

  -- Steve
