Creating Derivatives

For generating access files for use



A common processing task is creating derivatives of digital materials. This might mean creating compressed JPGs or PDFs for access from larger PNG or uncompressed TIFF preservation files. For born-digital records, we might instead create derivatives that are better suited to preservation. If we receive WordPerfect files, for example, we might also create versions of the files in a more modern format.

The separate masters/ and derivatives/ subfolders in the working packages are for managing this. After ingest, derivatives/ is empty, but archivists can optionally make derivative versions of files and place them there so that they'll be included in the AIP. Often the directory structure is repeated in both, so we can semantically associate original files and derivatives.
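As an illustration of the mirrored structure, the following Python sketch maps a file under masters/ to the matching location under derivatives/. It is not part of the Processing app: the function name derivative_path_for and the example package path are hypothetical, and the sketch assumes forward-slash paths that contain a "masters" folder.

```python
from pathlib import PurePosixPath


def derivative_path_for(master_file: str, new_extension: str) -> str:
    """Return the mirrored derivatives/ path for a file under masters/.

    Hypothetical helper for illustration only; assumes /-separated paths
    that contain a "masters" directory component.
    """
    path = PurePosixPath(master_file)
    parts = list(path.parts)
    if "masters" not in parts:
        raise ValueError(f"{master_file!r} is not inside a masters/ folder")
    # Swap only the first "masters" component so the rest of the
    # directory structure is repeated unchanged under derivatives/.
    parts[parts.index("masters")] = "derivatives"
    mirrored = PurePosixPath(*parts)
    # Replace the file extension with the derivative format's extension.
    return str(mirrored.with_suffix("." + new_extension.lstrip(".")))


# Example with a made-up package path.
example = derivative_path_for("ua809_example/masters/Box 03/1986/page_001.tif", "jpg")
print(example)  # ua809_example/derivatives/Box 03/1986/page_001.jpg
```

Keeping the relative path identical in both folders means an original and its derivative can be matched by path alone.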

Manual Derivatives Creation

You can create derivatives manually with any tool by accessing the package through the filesystem.

Automated Derivatives Creation

The Processing app can automate the creation of common derivative files.

  1. Enter the Package ID of the package whose masters/ directory contains the files you want to convert to derivatives

  2. Select the input format, such as png, tif, jpg, or pdf (default is png)

  3. Select the output format you want to convert to, such as jpg, pdf, or png (default is jpg)

  4. Optionally check the box and enter a subpath if you want to limit the conversion to a certain folder or file path.

    1. Use Unix-style (/) path separators or a double backslash (\\) for Windows-style path separators

    2. Subpaths are relative to the masters folder, so if you want to convert only the files in this folder:

      1. \\Lincoln\Library\SPE_Processing\backlog\ua809\ua809_PNTNy7MWUCgTDDPfLztoEU\masters\UniversityofAlbany SUNY 202213310\Box 03\1986\1986-02-04_v73i03

      2. Then enter:

      3. UniversityofAlbany SUNY 202213310/Box 03/1986/1986-02-04_v73i03

  5. Optionally enter maximum pixel dimensions for Resize

    1. Example: 1500x1500

    2. In this example all images will be reduced to 1500 pixels on the longest edge. The aspect ratio will remain the same.

    3. This is useful for large yearbooks and other 300+ page items scanned at 400 dpi, which would otherwise produce PDF derivatives that are too large for users to download.

  6. Optionally enter a Density number to convert to

    1. This is a DPI, so 72 will convert images to 72 dpi and 300 will convert images to 300 dpi.

  7. Optionally select "Monochrome" to convert images to black and white (should be used rarely).

  8. Click "Submit"!

    1. The Resize, Density, and Monochrome inputs are passed directly to ImageMagick.
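The form inputs above can be sketched in Python as follows. This is a minimal illustration, not the Processing app's actual code. The function names, the file names, the magick executable name (used by ImageMagick 7; older installs use convert), and the order of the options are all assumptions. The -resize, -density, and -monochrome options themselves are standard ImageMagick options. The sketch only builds the argument list and does not run ImageMagick.

```python
from pathlib import PureWindowsPath
from typing import List, Optional, Tuple


def normalize_subpath(subpath: str) -> str:
    """Return a /-separated relative subpath from either separator style."""
    # PureWindowsPath accepts both backslashes and forward slashes.
    return PureWindowsPath(subpath).as_posix().strip("/")


def fit_within(width: int, height: int, max_w: int, max_h: int) -> Tuple[int, int]:
    """Scale (width, height) to fit inside max_w x max_h, keeping the aspect ratio."""
    scale = min(max_w / width, max_h / height)
    return round(width * scale), round(height * scale)


def build_convert_args(
    source: str,
    target: str,
    resize: Optional[str] = None,
    density: Optional[int] = None,
    monochrome: bool = False,
) -> List[str]:
    """Build a hypothetical ImageMagick argument list from the form inputs."""
    args = ["magick", source]
    if resize:
        args += ["-resize", resize]  # e.g. "1500x1500"
    if density:
        args += ["-density", str(density)]  # e.g. 72 or 300
    if monochrome:
        args.append("-monochrome")
    args.append(target)
    return args


# Subpath from the example above, entered with Windows-style separators.
subpath = normalize_subpath(r"UniversityofAlbany SUNY 202213310\Box 03\1986\1986-02-04_v73i03")
print(subpath)

# An 8.5 x 11 inch page scanned at 400 dpi is 3400 x 4400 pixels.
print(fit_within(3400, 4400, 1500, 1500))  # (1159, 1500)

# Illustrative file names only.
args = build_convert_args(
    "masters/page_001.png", "derivatives/page_001.jpg", resize="1500x1500", density=72
)
print(args)
```

The fit_within arithmetic shows why a 1500x1500 Resize value shrinks a 400 dpi letter-size scan to roughly a third of its original pixel dimensions; ImageMagick's own rounding may differ by a pixel.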

Notes for consideration:

  • Derivatives can also be used if redactions are needed within the documents

    • Redactions are used when there is personal/private information in an archival object, but the rest of the object is meaningful/valuable to the collection. 

      • For example, suppose participants at a convention listed their personal cell phone numbers as part of their contact information, alongside their work addresses and phone numbers. We would want to keep the participant list because it shows who and how many people attended the convention, but we do not want researchers or others to have access to the personal cell phone numbers.

      • As the arranger, I would make the judgment call to include this list, but I would manually edit the document to strike the personal numbers.

        • Other things you may want to strike include:

          • Home addresses

          • Social Security numbers

          • Banking details 

        • Basically, anything that is not easily found online through a quick Google or social media search.

          • If you wouldn't want it listed online for you, chances are someone else would feel the same! 

  • You may need to create derivatives as a way of organizing digital packages.

    • If the arrangement (or disarrangement) of the package is not compliant with DACS standards, you may use the derivatives folder to clean up the master files.

    • You would NOT want to select the "update the master files" option when you are on the "Package AIP" step.