Official Japanese History of World War 2
Re: Official Japanese History of World War 2
Eugen,
What OCR software did you use to create the translation? Does it handle the charts and tables? Also, if possible, could you please post a copy of the scan you used? I would like to: 1. compare the translation to the original text; and 2. get a feel for the quality of the scan required.
Yes, learning Japanese is a viable option. If one studies at the university level and takes two semesters a year (a reasonable pace, given the effort required) he or she should be able to read a newspaper after five semesters (2 1/2 years).
Wellgunde
γνώθι σαυτόν
Re: Official Japanese History of World War 2
Eugen Pinak wrote on Tue 23 Apr 2013 00:23:
islandee wrote: As an example for those interested, and for my own future reference, I tried to step out the process of how I produce a 'machine translation', . . . But there it is, if someone wants to have a go at it.
Thank you very much. . . .
Thank you for taking the time to plod through my link, and to understand.
Eugen Pinak wrote: Note that despite the bad quality of the original image and the total lack of any editing of either the OCRed or the translated text, I already have an idea about the contents of this page. Of course, if I want more from this page, I have to apply more effort, but even then there is software that helps you translate individual words/kanji/kana.
The differences between our two translations show the difference between proofing and not proofing the OCRed text. Yes, not proofing saves a lot of time. But not proofing on this page was lucky: for some pages it can result in utter gibberish. And some clarifications. You write:
Eugen Pinak wrote: . . . Here is my take on this page. It took me roughly 30 seconds to OCR and translate it. . . .
The 30 seconds has me puzzled. It takes me about 30 seconds just to scan the original. The OCRing following the scan is practically instantaneous. But additional time is needed to bring up each of Google Translate, Bing, etc. Am I possibly a victim of a poor Internet connection out here in backwoods Thailand?
Also, which OCR program are you using?
An observation: you can see where a literate native Japanese should be able to race through proofing an OCR product in a very short period of time: he wouldn't need to compare original and product character-by-character, as I have to do, for he can read Japanese.
-
- Financial supporter
- Posts: 5671
- Joined: 16 May 2010, 15:12
- Location: United States of America
Re: Official Japanese History of World War 2
Where can I get a few pages to test my OCR programs, please?
Re: Official Japanese History of World War 2
I've just posted five pages from Senshi Sosho vol 32 at:
SS32-324 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-324.jpg)
SS32-325 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-325.jpg)
SS32-326 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-326.jpg)
SS32-327 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-327.jpg)
SS32-328 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-328.jpg)
-
- Financial supporter
- Posts: 5671
- Joined: 16 May 2010, 15:12
- Location: United States of America
Re: Official Japanese History of World War 2
islandee wrote: I've just posted five pages from Senshi Sosho vol 32 at:
SS32-324 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-324.jpg)
SS32-325 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-325.jpg)
SS32-326 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-326.jpg)
SS32-327 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-327.jpg)
SS32-328 (http://lanna-ww2.com/pages/z01000/y0100 ... 32-328.jpg)
Thanks! I'll see what the programs I have can do.
-
- Financial supporter
- Posts: 5671
- Joined: 16 May 2010, 15:12
- Location: United States of America
Re: Official Japanese History of World War 2
How did this do? http://login.ibiblio.org/hyperwar/Page324.rtf
Higher resolution would be more accurate, at least 300 dpi if you can get it.
-
- Financial supporter
- Posts: 5671
- Joined: 16 May 2010, 15:12
- Location: United States of America
Re: Official Japanese History of World War 2
Word retains the format better. http://ibiblio.org/hyperwar/Page324.docx
Re: Official Japanese History of World War 2
I couldn't get the rtf file to download or to display.
No problem bringing in the docx file.
I checked the first page. Minor errors. I'm not sure the forum wants to clog the board with a blow-by-blow breakdown, so I'll send it to you privately. I can say that the majority of errors are repetitive, which makes them easier to fix. What program did you use?
-
- Financial supporter
- Posts: 5671
- Joined: 16 May 2010, 15:12
- Location: United States of America
Re: Official Japanese History of World War 2
FineReader.
On the road, more later.
-
- Financial supporter
- Posts: 5671
- Joined: 16 May 2010, 15:12
- Location: United States of America
Re: Official Japanese History of World War 2
FineReader has a "train character" feature. I could see the possible patterns, but I wasn't certain, so I didn't try to fix them. Five pages took less than a minute to process on my machine, a quad-core gaming platform.
-
- Financial supporter
- Posts: 5671
- Joined: 16 May 2010, 15:12
- Location: United States of America
Re: Official Japanese History of World War 2
Okay, I've been through five pages of the History with the OCR program and I see that this is quite doable. What is needed is a person with at least average skill at reading Japanese to help the program learn the characters not currently in its library. After that it's pure mechanics to digitize the pages. I'm using FineReader 11, which markets for about $170 US. This would be a great project for someone or some group, and approaching the copyright holders would be the first step in this.
Re: Official Japanese History of World War 2
One approach might be to establish a non-profit corporation. You could call it something like "The Historical Document Foundation." The avowed purpose would be to make significant foreign-language historical writings of the 20th century available to the English-speaking public. A secondary goal might be to provide training opportunities for professional translators. Even though the initial aim is specifically to translate Senshi Sosho, the broader the stated goal, the easier it would be to garner support. Besides Senshi Sosho, I would like to see English translations of the various Israeli and Soviet official histories, the German OKW KTB, and others. Eugen suggested that a Ukrainian translation at $5000 per volume might be possible. I think wheedling that amount of money out of corporate donors might be doable. Once you have something publishable, you have a source of more funds for more translations. Of course, this all presupposes the cooperation of the copyright holders.
γνώθι σαυτόν
-
- Financial supporter
- Posts: 5671
- Joined: 16 May 2010, 15:12
- Location: United States of America
Re: Official Japanese History of World War 2
That would be a great idea. I think you could provide work relevant to the fields of study of college students with such a foundation.
BTW, if anyone wants to see what FineReader did with the samples: http://ibiblio.org/hyperwar/PTO/Iwo/SenshiShoshi-5.docx
-
- Member
- Posts: 1235
- Joined: 16 Jun 2004, 17:09
- Location: Kyiv, Ukraine
- Contact:
Re: Official Japanese History of World War 2
Wellgunde wrote: What OCR software did you use to create the translation? Does it handle the charts and tables? Also, if possible, could you please post a copy of the scan you used? I would like to: 1. compare the translation to the original text; and 2. get a feel for the quality of the scan required.
I used FineReader 10 (and online translators, of course). It handles tables well; as for charts, it depends on the quality of the charts.
I used the scan provided by Islandee; see the link in his post.
islandee wrote: The differences between our two translations show the difference between proofing and not-proofing the OCRed text. Yes, not-proofing saves a lot of time. But not-proofing on this page was lucky: for some pages it can result in utter jibberish.
It depends greatly on the quality of the scans and on whether unusual kanji are used. I've OCRed quite a lot of Japanese books and rarely had to complain about quality.
islandee wrote: The 30 seconds has me puzzled. It takes me about 30 seconds just to scan the original. The OCRing following the scan is practically instantaneous.
OCR took me c. 5 seconds; the rest was copy-paste and use of online translators.
Scanning a single page at 300 dpi takes me some 5-10 seconds.
OpanaPointer wrote: Okay, I've been through five pages of the History with the OCR program and I see that this is quite doable.
Indeed. Online translation of your OCR was quite OK. BTW, I was puzzled by the narration: it looks more like a memoir than an official history to me.
-
- Financial supporter
- Posts: 5671
- Joined: 16 May 2010, 15:12
- Location: United States of America
Re: Official Japanese History of World War 2
So, I take it the computer-related technical issues aren't that great?