[PLUG] OCR software?

John Jason Jordan johnxj at comcast.net
Wed Apr 5 15:29:58 UTC 2006


Have had Xsane on my Ubuntu-64 laptop for some time, and it works
beautifully with my Canon LiDE-30 scanner. (I bought the scanner
specifically to use with Linux, so I researched first -- and it all
works perfectly!)

However, now, for the first time, I am trying to OCR some pages of
text. This is linguistics stuff, so I expected that the OCR utility
would choke a bit on IPA characters. But what I am getting is mostly
garbage. It would be faster to type it in de novo. Surely I am doing
something wrong here. 

The first thing that happened was that Xsane gave me an error message
"child process not launched," or something like that. Taking a wild
guess I assumed that this message meant that Xsane could not find an
OCR utility. In other words, installing Xsane included a menu item for
OCR, but not the utility. So I popped up Synaptic and searched on
"optical" and on "OCR." This turned up gocr and ocrad. I installed
gocr, the gocr-doc package and gocr-gtk. Afterwards Xsane was happy to
OCR the image, but the results were lousy. 

The problem is I can't figure out how to tweak gocr. From my experience
with OCR programs on Windows there should be a "learning" tool and
other goodies to enhance the results. For example, I expected a window
to pop up for editing the results, and corrections would be saved to a
learning configuration file somewhere. It should also use whatever
spell checker you have installed. But all I get is a prompt for
the output file name.

I assumed gocr-doc was the Help file, but I can't figure out how to
launch it or read it. There are no entries on the gnome panel to launch
gocr or gocr-doc. From the command line I tried gocr, gocr-doc and
gocr-gtk, but all returned just "command not found." There has to be
more to this utility; I just can't figure out how to get into the
options.
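
One way to narrow down a "command not found" is to check which of
these names actually resolve to an executable on the PATH. The sketch
below is only an illustration: the helper name find_tools and the
CANDIDATES list are made up for this example, and gocr-doc is left out
on the assumption that it is a documentation package rather than a
command.

```python
import shutil

# Command names tried in the message above; adjust the list as needed.
# (Hypothetical list for illustration, not taken from any package docs.)
CANDIDATES = ["gocr", "gocr-gtk", "ocrad"]


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(CANDIDATES).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A name that comes back as "not found" is either not installed as an
executable or is installed somewhere outside the directories the shell
searches.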

Then I installed ocrad, but not only does Xsane not see it, I can't
find any way to launch it. If I type "ocrad" from the command line the
cursor moves down without a prompt in front of it, but nothing appears
on screen and there are no error messages. It looks like something
might be running, but who knows?
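
The silent cursor is most likely ocrad waiting for image data on
standard input, which is what it does when no file name is given. A
minimal sketch of calling it on a file instead, assuming the scan was
saved in a PNM format (PBM/PGM/PPM) that ocrad can read; the function
name run_ocr and the file name page.pnm are hypothetical:

```python
import shutil
import subprocess


def run_ocr(image_path, tool="ocrad"):
    """Run an OCR command on image_path and return what it prints to stdout.

    Returns None if the tool cannot be found on PATH.
    """
    if shutil.which(tool) is None:
        return None
    # The recognized text arrives on standard output. Bytes that are not
    # valid UTF-8 are replaced rather than raising a decode error.
    result = subprocess.run(
        [tool, image_path],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=False,
    )
    return result.stdout


# Example call (hypothetical file name):
#   text = run_ocr("page.pnm")
```

This only shows how the tool is invoked; it says nothing about how well
it would cope with IPA characters.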

As awesome as Xsane is, surely there is something better than this for
OCR. Any suggestions?


