...

In addition to adding files to Hyrax, this requires creating digital object records in ArchivesSpace that point to the objects in Hyrax. There is a batch process for this that uses spreadsheets, but it's always going to require CLI input on the railsprod server, so this is going to be limited to Greg for the foreseeable future. This workflow is described in Processing Ingested Digital Files and Batch Upload to Hyrax.

Individual files, however, can be uploaded directly into Hyrax, which happens more often than getting a big batch back from a vendor.

Add DAO to ASpace

After individual files are uploaded to Hyrax, the archivist has to add the URL of the object in Hyrax to ArchivesSpace as a Digital Object record. This is a manual step.
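
If this manual step were ever scripted, the records involved might look roughly like the sketch below. It only builds the JSON bodies and sends nothing to ArchivesSpace. The function names, the sample URL, and the choice to reuse the Hyrax URL as the digital object identifier are assumptions for illustration, not part of any existing script.

```python
import json


def build_digital_object(title, hyrax_url, digital_object_id=None):
    """Build a JSON-ready dict for a new ArchivesSpace digital object.

    Using the Hyrax URL as the identifier is an assumption of this sketch.
    """
    return {
        "jsonmodel_type": "digital_object",
        "title": title,
        "digital_object_id": digital_object_id or hyrax_url,
        "publish": True,
        "file_versions": [
            {
                "jsonmodel_type": "file_version",
                "file_uri": hyrax_url,
                "publish": True,
            }
        ],
    }


def build_instance(digital_object_uri):
    """Build the instance that links an archival object to a digital object."""
    return {
        "jsonmodel_type": "instance",
        "instance_type": "digital_object",
        "digital_object": {"ref": digital_object_uri},
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    dao = build_digital_object(
        "Sample photograph",
        "https://hyrax.example.org/concern/images/abc123",
    )
    print(json.dumps(dao, indent=2))
```

The first dict would be posted to the repository's digital objects endpoint, and the second appended to the archival object's instances once the new record's URI is known.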

Adding ASpace IDs to package

This script doesn't exist yet, as this setup was included in the batch Hyrax upload described in Processing Ingested Digital Files.

However, when files are uploaded to Hyrax individually, the Ref ID of the component in ArchivesSpace will have to be added to the package in the metadata/ directory.

→ When we currently create lower-quality access scans on the photocopier and upload them to Hyrax, there is a wonky script, processNewUploads.py, that creates the AIP from Hyrax. We can do it this way since there's only a single copy of the file and no TIFF/JPEG derivative copies. However, since Hyrax doesn't have an AIP, this script is hacky, and I'm kind of shocked it has been running without issue for 4+ years.

The resource type and license/rights statement fields are also manually added in Hyrax, so it would be good to add these to the package as well.

So the form here might have fields for:

  • package ID
  • ASpace ref ID
  • resource type dropdown
  • license/rights statement dropdown

Or it might be best to just have two fields and query Hyrax for the other info so we know it's consistent.

  • package ID
  • ASpace ref ID

For consistency, we might want to recreate the CSV file that the batch upload process uses. There is code for this in processNewUploads.py we can use.
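
As a rough idea of what the form's backend could do with these fields, the sketch below writes a one-row CSV into the package's metadata/ directory. The column names, the file name, and the function itself are hypothetical. The real layout should follow whatever the batch upload CSV and processNewUploads.py already use.

```python
import csv
from pathlib import Path

# Hypothetical column names; the real batch-upload CSV may use different ones.
FIELDS = ["package_id", "aspace_ref_id", "resource_type", "license"]


def write_package_metadata(package_dir, package_id, ref_id,
                           resource_type="", license_uri=""):
    """Write a one-row CSV describing the package into its metadata/ directory."""
    metadata_dir = Path(package_dir) / "metadata"
    metadata_dir.mkdir(parents=True, exist_ok=True)
    csv_path = metadata_dir / f"{package_id}.csv"
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerow({
            "package_id": package_id,
            "aspace_ref_id": ref_id,
            "resource_type": resource_type,
            "license": license_uri,
        })
    return csv_path
```

If the two-field version of the form is used, the last two columns would be filled from a Hyrax query rather than from user input.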

...

the URL in Hyrax.

Uploading Single Digital Objects to Hyrax

Bulk Hyrax Upload

→ After a bulk upload, there is a second process to add Hyrax URLs to ArchivesSpace.

Finalizing a Package into an AIP for Preservation Storage

The final processing step requires packaging the SIP together with the Processing Package into an AIP for long-term preservation.

packageAIP.py copies the preservation files from the SIP and the derivatives/ and metadata/ directories from the Working Directory into the AIP. It runs some safety checks, such as comparing the hashes of the files in the SIP with those in the AIP after they are copied to make sure all the files are there, before finally deleting the SIP and the Working Directory package.

    packageAIP.py ger071_DZXPx2c6aKaV5zmdfasjJm
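
The hash comparison mentioned above is not shown on this page. As a minimal sketch of that kind of check, assuming SHA-256 and a plain directory-to-directory comparison (packageAIP.py may use a different algorithm or a stored manifest), it could look like this:

```python
import hashlib
from pathlib import Path


def file_hash(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large preservation files are not read into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file's path (relative to root) to its hash."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_hash(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(source_dir, copy_dir):
    """Return relative paths that are missing from the copy or differ from the source."""
    source = hash_tree(source_dir)
    copy = hash_tree(copy_dir)
    return sorted(name for name, digest in source.items() if copy.get(name) != digest)
```

A caller would delete the source only when `find_mismatches` returns an empty list.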

There are two additional options:

  • -u, --update : Uses the preservation files from the Working Directory package instead of the files originally ingested in the SIP. This is for when we are not keeping all the files that we originally ingested.
  • -n, --noderivatives : Will not include derivatives in the AIP. This is for cases where the preservation copies (like PDFs) are the same as the derivatives.
  • There is an option here to overwrite the original ingested files with the files in the \masters directory in the Processing Package
    • This is useful for when some materials are not retained during processing
  • Soon, this will also copy the AIP into additional redundant storage automatically
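
For reference, the package argument and the two flags listed above could be declared with argparse along these lines. This is only a sketch of the command-line interface as described on this page. The positional argument name `package` and the help strings are assumptions, and the real packageAIP.py may be structured differently.

```python
import argparse


def build_parser():
    """Declare a CLI with one package ID and the -u and -n flags."""
    parser = argparse.ArgumentParser(
        prog="packageAIP.py",
        description="Package a SIP and its Working Directory package into an AIP.",
    )
    # "package" is an assumed name for the positional package ID argument.
    parser.add_argument("package", help="package ID, e.g. ger071_DZXPx2c6aKaV5zmdfasjJm")
    parser.add_argument(
        "-u", "--update", action="store_true",
        help="use the preservation files from the Working Directory package "
             "instead of the files originally ingested in the SIP",
    )
    parser.add_argument(
        "-n", "--noderivatives", action="store_true",
        help="do not include derivatives in the AIP",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.package, args.update, args.noderivatives)
```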


...