...

Ingesting scans of paper materials, digitized audio or video content, or born-digital materials places them in consistent storage with a backup. This prevents files from getting lost or accidentally deleted and ensures that they will be managed over time.

...

During ingest, a second, read-only copy is created in \\Lincoln\Masters\Archives\SIP as a BagIt bag. You shouldn't have to worry about this copy, but it means we always have a second copy of the ingested files in case of errors or accidental deletion.

If you are working with new born-digital materials that haven't been accessioned in ASpace yet, use Accessioning New Born-Digital Records instead.

How to ingest digital materials

  1. Place files you want to ingest in the ingest folder in a folder named with the collection ID.
        • The ingest folder is: \\Lincoln\Library\SPE_Processing\ingest
        • Digitized files will be logged in the DigitizationExtentTracker.xlsx at \\Lincoln\Library\SPE_Automated\DigitizationExtentTracker so we can track the size and quantity of what we're digitizing.
        • Files here can have subfolders and any structure that is useful for preserving any meaningful order.
      ingest/
          ├─ apap101/
          │  ├─ minutes.docx
          │  ├─ report.pdf
          ├─ ua950.012/
          │  ├─ Issue1/
          │  │  ├─ page1.tif
          │  │  ├─ page2.tif
          │  │  │  ...   
      
        • Derivatives and metadata files can be added pre-ingest by placing them in subfolders named "derivatives" and "metadata" within the collection ID folder. Note: this means that original files cannot have root directories named "derivatives" or "metadata".
      ingest/
          ├─ ua746/
          │  ├─ image1.png
          │  ├─ image2.png
          │  ├─ ...
          │  ├─ derivatives/
          │  │  ├─ image1.jpg
          │  │  ├─ image2.jpg
          │  │  ├─ ...
          │  ├─ metadata/
          │  │  ├─ image_list.csv
          ├─ ...
  2. Enter the collection ID in the Ingest tab of the processing app, and click "Submit".
  3. Check the log to see whether the ingest was successful or had any errors.
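
The folder layout in step 1 can be checked before submitting. The following is a minimal sketch, not part of the processing app: the function name split_ingest_folder is hypothetical, and it assumes the reserved folder names are matched exactly as written ("derivatives" and "metadata", lowercase). It groups the top-level entries of a collection ID folder into originals, derivatives, and metadata.

```python
from pathlib import Path

# Reserved top-level folder names, taken from step 1 above.
RESERVED = ("derivatives", "metadata")


def split_ingest_folder(collection_dir):
    """Group the contents of a collection ID folder (hypothetical helper).

    Returns a dict with three sorted lists: top-level original entries,
    and the files found under "derivatives" and "metadata".
    """
    root = Path(collection_dir)
    groups = {"originals": [], "derivatives": [], "metadata": []}
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and entry.name in RESERVED:
            # List files under a reserved folder relative to that folder.
            groups[entry.name] = sorted(
                p.relative_to(entry).as_posix()
                for p in entry.rglob("*")
                if p.is_file()
            )
        else:
            # Everything else at the top level is treated as original material.
            groups["originals"].append(entry.name)
    return groups
```

For example, split_ingest_folder(r"\\Lincoln\Library\SPE_Processing\ingest\ua746") would list image1.png and image2.png as originals for the layout shown above.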

Simple Ingest

  1. Create a folder named for the collection ID in \\Lincoln\Library\SPE_Processing\ingest
    1. Use Find-It to find the correct collection ID
    2. Examples:
      1. \\Lincoln\Library\SPE_Processing\ingest\apap101
      2. \\Lincoln\Library\SPE_Processing\ingest\ua809
      3. \\Lincoln\Library\SPE_Processing\ingest\apap138
  2. Log on to the railsdev Processing server
    1. Open a Command Line shell
    2. Run: ssh railsdev
  3. Run: ingest <collection ID>
  4. You can now type "exit" to log off the server and the command will run in the background
  5. Check if an ingest is running: check ingest
  6. Results will log to \\Lincoln\Library\SPE_Processing\ingest\log\<collection ID> as: <timestamp>-ingest-<collection ID>.txt
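
To find the newest log for a collection without browsing the folder, something like the sketch below could be used. It is an illustration only: latest_ingest_log is a hypothetical name, and because the timestamp format is not documented here, the sketch picks the newest file by modification time rather than by parsing the file name.

```python
from pathlib import Path


def latest_ingest_log(log_root, collection_id):
    """Return the most recently modified ingest log for a collection, or None.

    log_root is the ingest log folder; logs are assumed to sit in a
    subfolder named for the collection ID, as described in step 6.
    """
    log_dir = Path(log_root) / collection_id
    if not log_dir.is_dir():
        return None
    logs = list(log_dir.glob(f"*-ingest-{collection_id}.txt"))
    if not logs:
        return None
    # Pick by modification time, since the timestamp format is not assumed.
    return max(logs, key=lambda p: p.stat().st_mtime)
```

A call might look like latest_ingest_log(r"\\Lincoln\Library\SPE_Processing\ingest\log", "apap101").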

Results of Ingest

  1. Files will be packaged into a SIP bag here: \\Lincoln\Masters\Archives\SIP\<collection ID>\<package ID>
    1. SIP and AIP packages are here: https://github.com/UAlbanyArchives/packages

  2. A processing folder for the package is created in \\Lincoln\Library\SPE_Processing\backlog\<collection ID>\<package ID>

  3. Master files are placed in the \masters subfolder

  4. Example Processing package:
    • ua809_JxkK2VWVFu7F8VWaTe72BG
      • derivatives
      • masters
      • metadata
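
A quick way to confirm that a processing package has the expected shape is sketched below. Both function names are hypothetical, and the sketch assumes a package ID has the form "<collection ID>_<unique suffix>" with no underscore in the suffix, which matches the example above but is not confirmed for every package.

```python
from pathlib import Path

# Subfolders shown in the example processing package above.
EXPECTED_SUBFOLDERS = ("derivatives", "masters", "metadata")


def collection_id_from_package(package_id):
    """Recover the collection ID from a package ID.

    Assumption: the package ID is "<collection ID>_<suffix>" and the
    suffix contains no underscore.
    """
    return package_id.rsplit("_", 1)[0]


def missing_subfolders(package_dir):
    """Return the expected subfolders that are absent from a package folder."""
    root = Path(package_dir)
    return [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]
```

An empty list from missing_subfolders means all three subfolders are present.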

Examining Running Ingest

...

[1] 16994
  • To list running ingest processes run: check ingest
$ check ingest
gw234478 16994 12.9  0.2 206436 21184 pts/0    D    10:13   0:30 python3 /opt/lib/ingest-processing-workflow/ingest.py apap301
  • If you need to stop the process (not recommended), use this command:
sudo kill -9 <PID>

What is happening with "check"

  • "check" is a function defined in /etc/profile.d/processingFunctions.sh that runs:  ps aux | grep [i]ngest, etc.
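
The bracket in [i]ngest keeps grep from matching its own process, since the grep command line contains "[i]ngest" rather than "ingest". The output line can also be read programmatically. The sketch below is illustrative only: parse_ingest_process is a hypothetical helper, and it assumes the standard 11-column ps aux layout and that the collection ID is the argument directly after ingest.py, as in the example output above.

```python
def parse_ingest_process(ps_line):
    """Extract user, PID, and collection ID from one `ps aux` line.

    Returns None when the line does not describe an ingest.py process.
    """
    # ps aux prints 10 fixed columns followed by the full command line.
    fields = ps_line.split(None, 10)
    if len(fields) < 11:
        return None
    tokens = fields[10].split()
    for i, token in enumerate(tokens):
        if token.endswith("ingest.py"):
            # Assumption: the collection ID directly follows the script path.
            collection_id = tokens[i + 1] if i + 1 < len(tokens) else None
            return {
                "user": fields[0],
                "pid": int(fields[1]),
                "collection_id": collection_id,
            }
    return None
```

The "pid" value is the number that sudo kill -9 would take if the process really had to be stopped.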

If the process has completed but the ingest folder was not deleted

  • Run the Python script \\LINCOLN\Library\SPE_Processing\checkIngest.py
    • It compares Ingest and Backlog to see whether all of the files were moved successfully.
  • Check logs: \\LINCOLN\Library\SPE_Processing\ingest\log
  • An example of a log report where the ingest folder couldn't be deleted: [image not available]
  • If there is an error, sort the logs by date modified. A successful log should resemble this: [image not available]
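
The kind of comparison described above can be sketched as follows. This is not the contents of checkIngest.py, which are not shown here; it is a minimal illustration with hypothetical function names that compares two folder trees by relative file path only, without checking file sizes or checksums.

```python
from pathlib import Path


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def files_not_moved(ingest_dir, masters_dir):
    """List files present under ingest_dir with no counterpart in masters_dir."""
    return sorted(relative_files(ingest_dir) - relative_files(masters_dir))
```

An empty result suggests every file in the ingest folder has a matching path in the target folder.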

Advanced Use

Ingesting from a directory other than \\Lincoln\Library\SPE_Processing\ingest

  1. Must use path accessible to the railsdev server
  2. Must convert to Linux path:
    • \\Romeo\SPE\folder1\folder2 is /media/SPE/folder1/folder2
    • \\Lincoln\Masters\Special Collections\Electronic_Records_Library is /media/Masters/Special Collections/Electronic_Records_Library
  3. Run ingest using -p flag:

ingest apap101 -p "/media/Masters/Special Collections/Electronic_Records_Library/apap101"

  • Will still log to \\Lincoln\Library\SPE_Processing\ingest
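
The path conversion in step 2 can be sketched in code. The mapping table below contains only the two shares given as examples above; any other share would need to be added once its mount point on railsdev is known. The names SHARE_MOUNTS and to_linux_path are hypothetical.

```python
# Windows share -> Linux mount point, limited to the two examples above.
# Server and share names are lowercased because Windows treats them
# case-insensitively.
SHARE_MOUNTS = {
    ("romeo", "spe"): "/media/SPE",
    ("lincoln", "masters"): "/media/Masters",
}


def to_linux_path(unc_path):
    """Convert a \\\\server\\share\\... path to its Linux equivalent."""
    parts = unc_path.strip("\\").split("\\")
    if len(parts) < 2:
        raise ValueError(f"Not a UNC path: {unc_path!r}")
    server, share, rest = parts[0], parts[1], parts[2:]
    mount = SHARE_MOUNTS.get((server.lower(), share.lower()))
    if mount is None:
        raise ValueError(f"No known mount point for \\\\{server}\\{share}")
    return "/".join([mount, *rest])
```

Remember to quote the result on the command line when it contains spaces, as in the ingest example above.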

What is happening:

...

usage: ingest.py [-h] [-p PATH] [-a ACCESSION] ID

positional arguments:
  ID                    Collection ID for the files you are packaging.

optional arguments:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  Path of files to ingest. Folder will be removed
                        afterwords.
  -a ACCESSION, --accession ACCESSION
                        Optional ArchivesSpace Accession ID for new
                        acquisitions.
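
For readers unfamiliar with how such a usage message comes about, the sketch below shows an argparse definition that would accept the same arguments. It is a reconstruction from the help text above, not the actual source of ingest.py, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser matching the usage text above (a reconstruction)."""
    parser = argparse.ArgumentParser(prog="ingest.py")
    parser.add_argument(
        "ID", help="Collection ID for the files you are packaging."
    )
    parser.add_argument(
        "-p", "--path",
        help="Path of files to ingest. Folder will be removed afterwards.",
    )
    parser.add_argument(
        "-a", "--accession",
        help="Optional ArchivesSpace Accession ID for new acquisitions.",
    )
    return parser
```

Parsing ["apap101", "-p", "/media/SPE/folder1"] with this parser gives ID "apap101", the given path, and an accession of None.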

...

nohup python3 /opt/lib/ingest-processing-workflow/ingest.py apap301 >> /media/SPE/ingest/apap301-ingest.log 2>&1
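
In the command above, nohup lets the script keep running after you log off, >> appends output to the log file, and 2>&1 sends error messages to the same file. The sketch below shows how such a command line could be assembled for any collection ID. It is an assumption-laden illustration: build_ingest_command is a hypothetical helper, and the script and log locations are copied from the single example above.

```python
import shlex

# Both paths are taken from the example command above and may differ.
INGEST_SCRIPT = "/opt/lib/ingest-processing-workflow/ingest.py"
LOG_DIR = "/media/SPE/ingest"


def build_ingest_command(collection_id):
    """Return a shell command string in the shape of the example above."""
    log_file = f"{LOG_DIR}/{collection_id}-ingest.log"
    # shlex.quote guards against spaces or shell characters in the ID.
    return (
        f"nohup python3 {INGEST_SCRIPT} {shlex.quote(collection_id)} "
        f">> {shlex.quote(log_file)} 2>&1"
    )
```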

...
