| Author |
Message |
Ajay
Guest
|
Posted:
Fri Nov 04, 2005 6:08 am Post subject:
Converting a JPEG file to text |
|
|
Hello,
I am working on a project for which I need software that takes
in a .jpg file (basically a scanned page of written
material) and extracts only the words from that document.
Do you know of any commercial or open source software that
does this?
I am sure there is stuff around. I do not want to reinvent the wheel.
Thanks
Ajay |
|
|
 |
Charles
Guest
|
Posted:
Fri Nov 04, 2005 6:15 am Post subject:
Re: Converting a JPEG file to text |
|
|
On 3 Nov 2005 16:08:38 -0800, "Ajay" <nagarkarajay@gmail.com> wrote:
| Quote: | Hello,
I am working on a project for which I need software that takes
in a .jpg file (basically a scanned page of written
material) and extracts only the words from that document.
Do you know of any commercial or open source software that
does this?
I am sure there is stuff around. I do not want to reinvent the wheel.
Thanks
Ajay
|
Search for OCR - Optical Character Recognition software. It does
that. |
|
|
 |
Lorenzo J. Lucchini
Guest
|
Posted:
Fri Nov 04, 2005 8:33 pm Post subject:
Re: Converting a JPEG file to text |
|
|
Charles wrote:
| Quote: | Search for OCR - Optical Character Recognition software. It does
that.
|
On the open source front, let me suggest GOcr (or JOcr, which is actually
the same thing, it merely has two names... don't ask me).
It's not too good, but as far as I know, it's about the best you can get
open source. OCR is really a field where open source is currently a bit
lacking.
by LjL
ljlbox@tiscali.it |
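For anyone who wants to try GOcr from a script, here is a minimal sketch. It assumes the gocr command-line program is installed and on the PATH, and that the scan has already been converted to a PNM file, which GOcr reads directly. The function names are made up for illustration and are not part of GOcr.

```python
import subprocess


def gocr_command(image_path, text_path):
    """Build the gocr call: -i names the input image, -o the output text file."""
    return ["gocr", "-i", str(image_path), "-o", str(text_path)]


def ocr_with_gocr(image_path, text_path):
    """Run gocr on one image; raises CalledProcessError if gocr reports failure."""
    subprocess.run(gocr_command(image_path, text_path), check=True)
```

Calling ocr_with_gocr("page.pnm", "page.txt") would write the recognized text to page.txt, provided gocr is available.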
|
|
 |
CSM1
Guest
|
Posted:
Fri Nov 04, 2005 8:56 pm Post subject:
Re: Converting a JPEG file to text |
|
|
"Lorenzo J. Lucchini" <ljlbox@tiscali.it> wrote in message
news:MiKaf.18268$hC1.3542@tornado.fastwebnet.it...
| Quote: | On the open source front, let me suggest GOcr (or JOcr, which is actually
the same thing, it merely has two names... don't ask me).
It's not too good, but as far as I know, it's about the best you can get
open source. OCR is really a field where open source is currently a bit
lacking.
|
SimpleOCR is a royalty-free application. Source code is available for a
fee.
http://www.simpleocr.com/
--
CSM1
http://www.carlmcmillan.com
-- |
|
|
 |
Charlie Hoffpauir
Guest
|
Posted:
Fri Nov 04, 2005 9:15 pm Post subject:
Re: Converting a JPEG file to text |
|
|
On 3 Nov 2005 16:08:38 -0800, "Ajay" <nagarkarajay@gmail.com> wrote:
| Quote: | Hello,
I am working on a project for which I need software that takes
in a .jpg file (basically a scanned page of written
material) and extracts only the words from that document.
Do you know of any commercial or open source software that
does this?
I am sure there is stuff around. I do not want to reinvent the wheel.
Thanks
Ajay
|
OCR (Optical Character Recognition) programs have been around for
years. They all work better on TIFF or other file formats that do not
introduce artifacts into the scanned image. To work well with JPEG, you
have to ensure that compression is kept low, to reduce the
artifacts.
Charlie Hoffpauir
http://freepages.genealogy.rootsweb.com/~charlieh/ |
|
|
 |
Bucky
Guest
|
Posted:
Sat Nov 05, 2005 4:20 am Post subject:
Re: Converting a JPEG file to text |
|
|
Ajay wrote:
| Quote: | I am working on a project for which I need software that takes
in a .jpg file (basically a scanned page of written
material) and extracts only the words from that document.
Do you know of any commercial or open source software that
does this?
|
If you have MS Office, there is a program under MS Office Tools called
Document Imaging. I think you usually need 300 dpi to get decent
results with OCR. And you don't want to use JPEG if possible, because it is
lossy and introduces artifacts. Use TIFF, PNG, or GIF instead. |
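The 300 dpi rule of thumb can be checked from the pixel size of a scan. The sketch below assumes the image covers a full US Letter page (8.5 x 11 inches) unless told otherwise. The function names and the default page size are illustrative choices, not something any OCR package requires.

```python
def scan_dpi(pixel_width, pixel_height, paper_width_in=8.5, paper_height_in=11.0):
    """Return the (horizontal, vertical) resolution of a full-page scan in DPI."""
    if paper_width_in <= 0 or paper_height_in <= 0:
        raise ValueError("paper dimensions must be positive")
    return pixel_width / paper_width_in, pixel_height / paper_height_in


def good_enough_for_ocr(pixel_width, pixel_height, min_dpi=300, **paper):
    """True if both axes reach min_dpi (300 is only the rule of thumb above)."""
    horizontal, vertical = scan_dpi(pixel_width, pixel_height, **paper)
    return min(horizontal, vertical) >= min_dpi
```

A 2550 x 3300 pixel scan of a Letter page works out to 300 dpi on both axes, so it passes. A 1275 x 1650 scan of the same page is only 150 dpi and would probably need rescanning.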
|
|
 |
Peter D
Guest
|
Posted:
Sat Nov 05, 2005 11:11 pm Post subject:
Re: Converting a JPEG file to text |
|
|
Use any of the suggested OCR packages. Did one come with your scanner? Try
that.
Regarding JPEG and artifacts, test a few files by converting them to TIFF or BMP and then
OCR-ing them. If that works, batch convert the JPEGs and then OCR them.
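The batch conversion step could be scripted roughly as follows. This is a sketch that assumes ImageMagick is installed and its convert command is on the PATH; any other converter would do just as well. The function names are hypothetical. Planning the commands separately from running them makes it easy to inspect what will happen first.

```python
from pathlib import Path
import subprocess


def plan_conversions(folder, target_ext=".tif"):
    """Build one ImageMagick command per .jpg/.jpeg file found in folder."""
    commands = []
    for source in sorted(Path(folder).iterdir()):
        if source.suffix.lower() in {".jpg", ".jpeg"}:
            target = source.with_suffix(target_ext)
            commands.append(["convert", str(source), str(target)])
    return commands


def run_conversions(commands):
    """Run each planned command, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Converting does not undo compression damage already present in a JPEG. It only helps if the OCR program copes better with the new format, which is why testing a few files first is worthwhile.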
"Ajay" <nagarkarajay@gmail.com> wrote in message
news:1131062918.210937.170840@o13g2000cwo.googlegroups.com...
| Quote: | Hello,
I am working on a project for which I need software that takes
in a .jpg file (basically a scanned page of written
material) and extracts only the words from that document.
Do you know of any commercial or open source software that
does this?
I am sure there is stuff around. I do not want to reinvent the wheel.
Thanks
Ajay
|
|
|