Poster: aronsson Date: Nov 2, 2010 11:04am
Forum: texts Subject: Dual language OCR for dictionaries

For translation dictionaries, e.g. http://www.archive.org/details/Tysk-svensk_ordbok_Hoppe_1920 , OCR needs both languages enabled to give good results. I tried setting the language to "German; Swedish", but this does not appear to have worked. Is there a way to do this?

Also, this dictionary uses a thin vertical bar to separate parts of words. I know that ABBYY FineReader can be set to allow a special character to appear inside words, which gives good results because the vertical bar is then ignored. But this option doesn't seem to be enabled at the Internet Archive.
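
Until such an option is available, one possible workaround is to clean the OCR text after the fact. The following is only a minimal sketch in Python, not the ABBYY FineReader setting described above. The function name `strip_inner_bars`, the set of bar-like characters in `BAR_CHARS`, and the sample words are all assumptions made for illustration.

```python
import re

# Characters the OCR might emit for the thin separator bar.
# This set is an assumption; adjust it to what the OCR output really contains.
BAR_CHARS = "|\u00a6\u2502"

# A bar counts as "inside a word" only when a letter touches it on both sides.
# [^\W\d_] matches any Unicode letter, including ä, ö, å and ß.
_INNER_BAR = re.compile(
    r"(?<=[^\W\d_])[" + re.escape(BAR_CHARS) + r"](?=[^\W\d_])"
)


def strip_inner_bars(text: str) -> str:
    """Remove bar characters that sit directly between two letters."""
    return _INNER_BAR.sub("", text)


if __name__ == "__main__":
    # Invented sample strings, not taken from the dictionary scan.
    print(strip_inner_bars("Ab|fahrt"))
    print(strip_inner_bars("a | b"))
```

Bars surrounded by spaces are left alone, so a bar used as a column separator would survive. This only helps where the OCR actually produced a bar character; a bar misread as a letter such as "l" or "I" would need a different approach.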

Reply

Poster: Jeff Kaplan Date: Nov 2, 2010 11:51am
Forum: texts Subject: Re: Dual language OCR for dictionaries

Hi. Only one language can be chosen for OCR. I'd suggest uploading the file(s) again as a separate item for the second language. It might be helpful for search to choose a title that reflects the language of the OCR.

I would need to check on your other point about ignoring the vertical lines.