Learning patterns/Digitizing archival records
What problem does this solve?
Archives hold historical pictures and documents that may be of interest to the Wikimedia projects. Digitizing these records preserves them and allows them to be used on Wikimedia projects, where they are available to a global audience. For instance, a collection of over 120,000 records from the U.S. National Archives and Records Administration was uploaded to Wikimedia Commons, where the files received over 1 billion views in 2013.
What is the solution?
You need to be careful when working with archival records—errors can result in mislabeling, mixing records together, or damaging them. Be sure to abide by the archives' policies on records use.
- Ensure acceptable copyright status. This can get very tricky, very fast. If you are working with a federal agency in the United States, the records are predominantly in the public domain, so that is not a problem. In the absence of a blanket declaration like that, however, the copyright status of a given document can be very unclear. Often, archival documents will not have been published and will not carry clearly discernible copyright information. If the archiving institution is reasonably certain that a given document (or set of documents) lacks copyright protection, then you can consider it to have "no known restrictions." See Commons:Template:WSM no known restrictions as an example.
- Retrieve the boxes from the archives. Know ahead of time which records you want to work with, since they may need to be retrieved from storage.
- Only one at a time. If you are working with multiple boxes, use only one box at a time. If you are working with multiple folders, only have one folder out at a time. If you are working with multiple documents, scan only one document at a time. Put the document back in its folder (or box) when you're done. This helps avoid mix-ups.
- Scan the document. Capture the entire sheet, including borders and edges; don't crop anything out! Scan at 600 dpi or better.
- Document-level metadata. Make sure each individual document that has been digitized has proper metadata. You may need to work with professional archivists on this to make sure that each document is properly represented. If you only have metadata at the collection level, use a generic description.
- Upload to Commons. If you only have a few documents to upload, the Upload Wizard is sufficient. For larger-scale uploads, consider a tool like Commonist or the GLAMwiki Toolset.
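The scanning and metadata steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a tool used by any archive: the `ScanRecord` fields and the helper names `effective_dpi` and `describe` are hypothetical, and the generic fallback wording is an assumption. The output uses the `{{Information}}` template from Wikimedia Commons with only a few of its fields filled in.

```python
from dataclasses import dataclass
from typing import Optional

MIN_DPI = 600  # resolution floor recommended in the scanning step


def effective_dpi(width_px: int, height_px: int,
                  width_in: float, height_in: float) -> float:
    """Return the lower of the horizontal and vertical pixels per inch."""
    return min(width_px / width_in, height_px / height_in)


@dataclass
class ScanRecord:
    # Hypothetical field names, chosen for illustration only.
    filename: str
    collection: str
    title: Optional[str] = None  # document-level title, if one was supplied
    date: Optional[str] = None   # document date, if known


def describe(record: ScanRecord) -> str:
    """Build {{Information}} wikitext for one scanned document.

    Falls back to a generic, collection-level description when no
    document-level title is available.
    """
    if record.title:
        description = record.title
    else:
        description = f"Document from the collection: {record.collection}"
    return (
        "{{Information\n"
        f"|description={description}\n"
        f"|date={record.date or ''}\n"
        f"|source={record.collection}\n"
        "}}"
    )
```

As a worked check of the resolution rule, a letter-size sheet (8.5 × 11 inches) scanned at 600 dpi should be at least 5100 × 6600 pixels, so `effective_dpi(5100, 6600, 8.5, 11) >= MIN_DPI` holds, while a 2550 × 3300 pixel scan of the same sheet works out to 300 dpi and falls short.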
General considerations
When to use
This pattern is useful if you are hosting a scan-a-thon or are working with a cultural institution to digitize its collection. It is based on the best practices that emerged at the U.S. National Archives and Records Administration, where "citizen archivists" digitize individual documents that are subsequently given document-level descriptions and uploaded to Commons.
See also
Related patterns
- Official Speeches of Salvador Allende project - Learning patterns - Explanation of the project and how these steps were also applied, with success.