Image Capture

1.  Creating items (Scanning procedures)

Scanning will be done on the scanning machine in Room 437A.

1. Turn the scanner on (Epson Expression 10000XL)

2. Turn computer on

3. At the Windows login, enter your username and password

4. Open “Adobe Photoshop CS” found on desktop

5. Place the object face down in the top left-hand corner of the scanner (a small white arrow marks the correct corner)

6. Go to ‘File’>‘Import’>‘Epson Expression 10000XL’

a. This will bring up the ‘Epson Scan’ interface.

7. Change the mode to ‘Professional Mode’ 

8. Make sure the ‘Resolution’ and ‘Image Type’ are set to the requirements for each document scanned (for example, 24-bit color or 8-bit grayscale; 600 dpi or 2400 dpi)

If the object is scanned in grayscale, the ‘Image Type’ must be ‘8-bit grayscale’.

9. Click ‘Preview’

a. Select the scanning area for the object.

b. Make sure that the image is 5000 pixels in the long dimension

10. Click ‘Scan’

a. After scanning is finished, you will get a raw image of the object in Photoshop.

11. Close ‘Epson Scan’ interface.

12. Once in Photoshop, make sure the image looks okay.

a. If the image is sideways or upside down, go to ‘Image’>‘Rotate Canvas’ and then select the proper rotation.

b. If it needs to be cropped, select the ‘Crop’ tool on the side menu, and crop to the area you want to keep.

13. Once the image has been carefully checked, save it as a TIFF file without compression (otherwise known as a “master file”) by going to ‘File’>‘Save As’ and selecting ‘TIFF’ as the format.

a. Name the file following the file naming convention (e.g., UAP4427) and select the proper folder (C:\Documents and Settings\Administrator\My Documents\SpecialCollection\UAP).

b. Click ‘Save’, then in the TIFF Options dialog:

i. Set Image Compression to ‘NONE’

ii. Set Byte Order to ‘IBM PC’

iii. Uncheck ‘Save Image Pyramid’

c. Click ‘OK’ 

d. NOTE: This is your master image, which should be the highest quality you can afford.  (See pp. 24-25, 28 in Western States Digital Imaging Best Practices.)
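As a rough aid for step 9b, the resolution needed to reach 5000 pixels in the long dimension can be estimated from the physical size of the object. The sketch below is an illustration only and is not part of the official procedure: the function names and the use of inches are assumptions, and the result should still be checked against the resolution required for the document type in step 8.

```python
import math

# Target from step 9b: 5000 pixels in the long dimension.
TARGET_LONG_EDGE_PX = 5000


def required_dpi(width_in, height_in, target_px=TARGET_LONG_EDGE_PX):
    """Return the smallest whole-number dpi that yields at least
    target_px pixels along the longer side of the object."""
    long_side = max(width_in, height_in)
    if long_side <= 0:
        raise ValueError("dimensions must be positive")
    return math.ceil(target_px / long_side)


def scanned_pixels(width_in, height_in, dpi):
    """Return the approximate (width, height) in pixels of a scan
    made at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


if __name__ == "__main__":
    # Hypothetical example: an 8 x 10 inch photograph.
    dpi = required_dpi(8, 10)
    print(dpi, scanned_pixels(8, 10, dpi))
```

For an 8 x 10 inch photograph this gives 500 dpi (4000 x 5000 pixels); a smaller 4 x 6 inch print needs about 834 dpi to reach the same long dimension.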

2. Transcribing Documents

1. Open a new MS Word document

2. Go to File>Save As and save the file according to the naming convention that has been set for the project.  In the case of Nuremberg, you will save the page with the same name as the image that you are transcribing.

3. Because the ink on the documents has begun to fade, OCR (Optical Character Recognition) is not the most efficient way to transcribe these documents.  As a result, they need to be transcribed manually.

Begin transcribing the information according to the following standards:

It is essential when transcribing archival documents to maintain the spelling, grammar, and punctuation of the original.  Remember, we are not correcting, just transferring the information from one medium to another (scholarly work can be done later in a different forum).

In the case of this project, it is not necessary to create an exact facsimile representation of the page.  In other words, while you will maintain spelling, paragraph breaks, and headers for the Nuremberg Trial Transcripts, it is not necessary to maintain line breaks within paragraphs, except where they mark an extended block quotation.  (Guidelines for transcriptions are always up to the editor of the project and may vary.  Obviously, if working with poetry, line breaks are important.)

If there are words that are unclear or illegible, you should either provide your best interpretation of the word/letter and enclose it in brackets (e.g., [stop]) or note in brackets that it is illegible (i.e., [illegible]).

Remember to include the page number as it appears on the page.

4. Once you have transcribed the document, in order to ensure accuracy, at least one other person should verify your transcription.  If there is disagreement about a portion of the transcription, you should ask the project editor/manager what she thinks it is and discuss the matter.  The project editor/manager will make the final decision as to how to handle the situation. 
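To help the second reader in step 4 find the passages most likely to need discussion, the bracketed readings described above can be listed automatically. This is a minimal sketch, assuming the transcription has been saved or pasted as plain text; the function name `flagged_readings` and the sample text are hypothetical and not part of the project's tooling.

```python
import re

# Matches a bracketed reading such as [stop] or [illegible].
# Nested brackets are not handled.
BRACKETED = re.compile(r"\[([^\[\]]+)\]")


def flagged_readings(text):
    """Return a list of (line_number, reading) pairs, one for every
    bracketed reading in the transcription text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in BRACKETED.finditer(line):
            hits.append((number, match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical sample transcription.
    sample = (
        "The witness did not [stop] at the gate.\n"
        "He then went to [illegible] and returned.\n"
        "- 42 -\n"
    )
    for line_number, reading in flagged_readings(sample):
        print(line_number, reading)
```

The verifier can then compare each listed reading against the original page before the transcription is accepted.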

3. Converting Documents from MS Word to XHTML

Once you have a complete MS Word transcription of the archival materials, you will need to convert it to a basic XHTML document.  This is relatively easy; essentially, your computer can do the work for you.

1.  Open your Word document in MS Word.  Go to File>Save As and then save your file as Rich Text Format (.rtf).

2. While you can use an open-source program such as HTML Tidy to do this next step, the easiest option is to use TextEdit.app on a Mac.

3. To convert your .rtf file to .html, open the .rtf file in TextEdit.app (under Applications in the Finder window).

4. Go to TextEdit>Preferences and click on the ‘Open and Save’ tab.

5. Under HTML saving options, under ‘Document Type’ select ‘XHTML 1.0 Transitional’; under ‘Styling’ select ‘No CSS’; under ‘Encoding’ select ‘Unicode UTF-8’; also click ‘Preserve White Space’.  Close the dialog box.

6. Next, go to File>Save As, select ‘HTML’ from the ‘File Format’ pull-down menu, and save the file in the appropriate directory.

7. Close the file.
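After saving, it can be worth confirming that the settings from step 5 took effect. The following is a minimal sketch, not part of the original procedure: the function names are hypothetical, and it checks only that the file decodes as UTF-8 and declares the XHTML 1.0 Transitional document type near the top. It does not validate the markup itself.

```python
# Public identifier used in the XHTML 1.0 Transitional DOCTYPE.
XHTML_TRANSITIONAL_ID = "-//W3C//DTD XHTML 1.0 Transitional//EN"


def check_xhtml_bytes(data):
    """Return a list of problems found in the raw bytes of a saved
    file. An empty list means both checks passed."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]
    problems = []
    # The DOCTYPE should appear near the start of the document.
    if XHTML_TRANSITIONAL_ID not in text[:1024]:
        problems.append("XHTML 1.0 Transitional doctype not found")
    return problems


def check_xhtml_file(path):
    """Read the file at path as bytes and run the same checks."""
    with open(path, "rb") as handle:
        return check_xhtml_bytes(handle.read())


if __name__ == "__main__":
    # Hypothetical sample of what a converted file might begin with.
    sample = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
        "<html><head><title></title></head><body></body></html>\n"
    ).encode("utf-8")
    print(check_xhtml_bytes(sample))
```

If either problem is reported, reopen the TextEdit preferences, confirm the ‘Document Type’ and ‘Encoding’ selections, and save the file again.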