Pamyat Naroda indexes. 3,400,000+ records processed
Hi
I have finished processing the catalogues of TsAMO archival operational records stored on the
https://pamyat-naroda.ru/ site.
3,430,807 records were processed.
They are in Excel 2007 format, divided into 4 parts:
http://rusfolder.com/45145567
http://rusfolder.com/45145566
http://rusfolder.com/45145568
http://rusfolder.com/45145569
All texts are in Russian. Field names are given as is.
There are no direct links, but any document can easily be found with the site's search engine.
I hope you have no problems with the download.
Later I can mirror those files on my site.
P.S. Don't download any download managers from their site!
The download routine is as follows:
1. Check here
2. Press the left button "скачать" ("download")
3. Click on any ad link
4. Wait for ~30 seconds
5. Enter the captcha
6. Download
Regards
Alex
- G. Trifkovic
- Forum Staff
- Posts: 2293
- Joined: 06 Nov 2004, 20:26
- Location: The South-East
Re: Pamyat Naroda indexes. 3,400,000+ records processed
Hi AMVAS,
and thanks for the index.
Best,
G.
- Jeff Leach
- Host - Archive section
- Posts: 1439
- Joined: 19 Jan 2010, 10:08
- Location: Stockholm, Sweden
Re: Pamyat Naroda indexes. 3,400,000+ records processed
Yes, thanks for the indexes. Once you figure out how to filter the results they are quite useful.
Re: Pamyat Naroda indexes. 3,400,000+ records processed
Jeff Leach wrote: Yes, thanks for the indexes. Once you figure out how to filter the results they are quite useful.
Moreover, Jeff, there is quite an easy way to download entire dossiers using direct links and a download manager.
But right now I'm not ready to share it :roll:
-
- Member
- Posts: 1235
- Joined: 16 Jun 2004, 17:09
- Location: Kyiv, Ukraine
- Contact:
Re: Pamyat Naroda indexes. 3,400,000+ records processed
AMVAS - thank you very much for your work.
AMVAS wrote: Moreover, Jeff, there is quite an easy way to download entire dossiers using direct links and a download manager. But right now I'm not ready to share it :roll:
Ha! Too late for me anyway, as I've downloaded all I want one by one, even using that weird "hacking" to get pages above the 10th. But maybe you've also found a way to mass-download files from germandocs...
Re: Pamyat Naroda indexes. 3,400,000+ records processed
Eugen Pinak wrote: Ha! Too late for me anyway, as I've downloaded all I want one by one, even using that weird "hacking" to get pages above the 10th. But maybe you've also found a way to mass-download files from germandocs...
I doubt you downloaded 7-10 TB of their content.
They have finally fixed that bug with 10+ pages. Around May 9th they introduced a new search page. It is a bit better than the old one, but still not as powerful as we would like.
Look at rutracker.org for germandocs documents. They have plenty of them.
It's not my subject, so I haven't studied the possibility of getting copies en masse from that site.
As far as I can see, they use a much simpler engine than pamyatnaroda, with direct links like
http://wwii.germandocsinrussia.org/pages/112204/zooms/8 for full-scale pages.
So I don't think downloading documents from them would be much of a problem.
Regards
Alex
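A minimal sketch of how such direct links could be generated in bulk, assuming the pattern of the single URL quoted above holds for other pages. The meaning of the two numbers (a page id and a zoom level) is a guess from that one example, and the function names are made up for illustration.

```python
# Assumption: every full-scale page follows the pattern of the one quoted link,
# .../pages/<page id>/zooms/<zoom level>. Only that single URL is known for sure.
BASE = "http://wwii.germandocsinrussia.org"


def page_url(page_id, zoom=8):
    """Build a direct link for one page (hypothetical helper)."""
    return f"{BASE}/pages/{page_id}/zooms/{zoom}"


def page_urls(page_ids, zoom=8):
    """Build direct links for a list of page ids, keeping their order."""
    return [page_url(page_id, zoom) for page_id in page_ids]


if __name__ == "__main__":
    # The resulting list can be saved to a text file and fed to a download manager.
    for url in page_urls([112204]):
        print(url)
```

The sketch only builds the list of links; it does not download anything itself.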
-
- Member
- Posts: 1235
- Joined: 16 Jun 2004, 17:09
- Location: Kyiv, Ukraine
- Contact:
Re: Pamyat Naroda indexes. 3,400,000+ records processed
AMVAS wrote: I doubt you downloaded 7-10 TB of their content.
Certainly not, but I decided for myself that I would download only data relevant to my interests. Therefore I've limited myself to downloading various OOB and TOE data, without venturing any further.
AMVAS wrote: Look at rutracker.org for germandocs documents. They have plenty of them.
"Semion Semionovich..." (c)
AMVAS wrote: As far as I can see, they use a much simpler engine than pamyatnaroda, with direct links like http://wwii.germandocsinrussia.org/pages/112204/zooms/8 for full-scale pages. So I don't think downloading documents from them would be much of a problem.
Indeed, getting a direct link to a file is relatively easy, but the filenames are random, so renumbering them in the proper order would take more time than downloading them one by one. Of course, maybe somebody has already solved this, but I have no idea how to do it.
Re: Pamyat Naroda indexes. 3,400,000+ records processed
Eugen Pinak wrote: Certainly not, but I decided for myself that I would download only data relevant to my interests. Therefore I've limited myself to downloading various OOB and TOE data, without venturing any further. "Semion Semionovich..." (c)
Aha, I see...
Eugen Pinak wrote: Indeed, getting a direct link to a file is relatively easy, but the filenames are random, so renumbering them in the proper order would take more time than downloading them one by one. Of course, maybe somebody has already solved this, but I have no idea how to do it.
One needs to collect the links to the pages of every dossier and only then gather those pages into dossier folders. Not a job for manual downloading.
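The collect-then-group step could look roughly like the sketch below. It assumes the page links have already been gathered together with a dossier label and each page's position within that dossier; how to obtain those from the site is not covered here. The function name, the tuple layout and the ".jpg" extension are all assumptions made for illustration.

```python
from collections import defaultdict


def plan_downloads(pages):
    """Group page links by dossier and give them sequential filenames.

    pages: iterable of (dossier, position, url) tuples (assumed input format).
    Returns {dossier: [(filename, url), ...]} with pages in position order,
    so the random server-side filenames no longer matter.
    """
    by_dossier = defaultdict(list)
    for dossier, position, url in pages:
        by_dossier[dossier].append((position, url))

    plan = {}
    for dossier, items in by_dossier.items():
        items.sort(key=lambda item: item[0])  # restore the proper page order
        plan[dossier] = [
            (f"{number:04d}.jpg", url)  # ".jpg" is an assumption
            for number, (_, url) in enumerate(items, start=1)
        ]
    return plan
```

Each key of the result would become a folder, and each (filename, url) pair a file to fetch into it.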
Re: Pamyat Naroda indexes. 3,400,000+ records processed
P.S. Maybe some offline browser will work for this site.
-
- Member
- Posts: 1235
- Joined: 16 Jun 2004, 17:09
- Location: Kyiv, Ukraine
- Contact:
Re: Pamyat Naroda indexes. 3,400,000+ records processed
AMVAS wrote: P.S. Maybe some offline browser will work for this site.
Maybe. But anyway, navsource from Rutracker has already uploaded all the folders from germandocsinrussia.org - bless him!
Re: Pamyat Naroda indexes. 3,400,000+ records processed
Eugen Pinak wrote: But maybe you've also found a way to mass-download files from germandocs...
That I've got, and everything is now replicated locally. (Happy to share, as usual.)
When I checked rutracker some months ago, there were some very convenient torrents of PDF compilations (which John Calvin also copied to his FTP). But I am not sure whether the updates published since then were processed too. Anyway, it was not too difficult to find an automated way to download the lot from the site itself.
Re: Pamyat Naroda indexes. 3,400,000+ records processed
Mori wrote: That I've got, and everything is now replicated locally. (Happy to share, as usual.) When I checked rutracker some months ago, there were some very convenient torrents of PDF compilations (which John Calvin also copied to his FTP). But I am not sure whether the updates published since then were processed too. Anyway, it was not too difficult to find an automated way to download the lot from the site itself.
I had been getting around to downloading full dossiers for about a year; I had no time to do it earlier.
The major problem is not even the downloading, but building a logical structure for those records.
Their main unit is the document. But they don't assign those documents to dossiers in the attributes available to users.
The Rutracker files are good enough, but they have the same disadvantage: poor structure. And without structure it's a useless load.
Right now I have everything I need: software for indexing, analysis and downloading what I need.
I got some surprises. For example, there are maps which are not indexed by the official search engine! One can access them only through direct links (which an ordinary user doesn't have, of course!).
So those maps are invisible!
- Jeff Leach
- Host - Archive section
- Posts: 1439
- Joined: 19 Jan 2010, 10:08
- Location: Stockholm, Sweden
Re: Pamyat Naroda indexes. 3,400,000+ records processed
I've downloaded about 2000 pages, one page at a time. It wasn't a complete waste of time because I was forced to look at each document, allowing some initial evaluation of them.
After some trial and error, I settled on naming folders
fXXX opXXX dXXX.
It is the documents from the 'd = delo' that are collected in each folder. The documents in each folder had to be given new numbers to make sure the pages of each document were kept together.
The biggest problem with the search engine is that it will only display 100 documents at a time. Mass downloading isn't an issue for me. I have about 220,000 pages of wartime German documents and after five years, I have only looked at about 10% and read maybe 2-3%.
Sorry, I've been rambling. What I wanted to say is that I hope AMVAS will share some of the hidden files. Anything dealing with the South or Southwest Front, 22 June 1941 - 30 October 1941, would be really appreciated.
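For anyone who wants to script the folder scheme described above, here is a rough sketch. The "fXXX opXXX dXXX" pattern comes from the post; the file naming (document number first, then page number, both zero-padded) is only one assumed way of keeping the pages of each document together, and the helper names are invented.

```python
def dossier_folder(fond, opis, delo):
    """Folder name in the 'fXXX opXXX dXXX' style (fond, opis, delo)."""
    return f"f{fond} op{opis} d{delo}"


def page_filename(doc_no, page_no, ext="jpg"):
    """File name that sorts by document first, then by page.

    Zero-padding makes plain alphabetical sorting match numeric order;
    the 'doc'/'p' prefixes and the extension are assumptions.
    """
    return f"doc{doc_no:03d}_p{page_no:03d}.{ext}"
```

With names like these, an ordinary sort by filename inside a delo folder lists every document's pages in sequence.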
Re: Pamyat Naroda indexes. 3,400,000+ records processed
AMVAS wrote: The major problem is not even the downloading, but building a logical structure for those records. Their main unit is the document. But they don't assign those documents to dossiers in the attributes available to users.
I took the easy way out: I store files under a folder named by the Number and the Title of the document. It takes an extra copy/paste of said title, as well as one manual sequence of 4 clicks per document (not per page). In the end, that's not 100% automatic, as there are ca. 8 clicks per document, but it has the valuable benefit of ease of identification & navigation.
Also, my method does not create any problem sequencing the pages, even within a folder.
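If the copy/paste of the title were ever scripted, the folder name would need cleaning first, since document titles can contain characters that are not allowed in file names. The sketch below is not the method described above, just one possible way to do it; the function name, the replacement character and the length limit are assumptions.

```python
import re


def document_folder(number, title, max_len=120):
    """Build a folder name from a document's Number and Title.

    Characters that Windows forbids in names are replaced with "_",
    runs of whitespace are collapsed, and the result is truncated.
    """
    raw = f"{number} {title}"
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", raw)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Trailing spaces and dots are also problematic on Windows.
    return cleaned[:max_len].rstrip(" .")
```

Keeping the number at the front means the folders still sort in document order, while the title keeps them easy to identify.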