Born Digital Accession Workflow

Case Study: Born Digital Accession Workflow:
The Louie B. Nunn Center for Oral History,
University of Kentucky Libraries

by Doug Boyd and Sara Abdmishani Price

The accessioning process for born digital material has grown much more complex than the analog process. No longer are we accessioning a homogeneous format such as the audio cassette. Born digital material often comes into the archive in parts. For example, it is quite common for an interview to span multiple files that need to be stitched together in sequence. We are now creating multiple instantiations of an interview that need to be stored in multiple locations. Additionally, the data file sizes of many formats (especially high definition video) are creating storage challenges. A very good example is our interview with Nathan Noble from our From Combat to Kentucky Oral History Project. This interview was captured on high definition video, stored as an Apple ProRes 422 formatted QuickTime file (.mov), and spanned 5 different files that totaled 157 gigabytes in size. The proprietary format and the data file size created numerous challenges for the Nunn Center.

Numerous variables require policy in order to accession and preserve born digital material effectively. This interview and project were producing data files larger than our storage capacity, which demonstrated the strong need for consistent policies about which surrogate versions we create in varying circumstances and where we store everything. As a result, we began the evaluation process to create the following policies. These policies were created by trial, error, and evaluation and are subject to ongoing change. At the moment, they represent our effort to document the accession process: the creation of a record, the creation of surrogates, and the placement of these files in a large, complex, and changing preservation system. Included below are the Nunn Center’s “Born Digital Version/Instantiation Policies,” which articulate what gets created from the Master (original) versions in a variety of circumstances; an example of our “Born Digital New Accession Worksheet,” which tracks surrogate creation and placement; and our “Preservation Roadmap,” which documents the locations of each version.

 BORN DIGITAL VERSION/INSTANTIATION POLICIES

 DEFINITIONS AND INSTANTIATIONS

 MASTER (Original)

  • Preserve original files in original form, including interviews that span multiple data files.
  • No transcoding
  • Retain ALL original settings
  • If master is video and is on digital tape, see sub-master guidelines.

SUB-MASTER VERSION

A sub-master retains all original settings and does not transcode or introduce additional compression.  Sub-masters are created in the following circumstances:

AUDIO

  • If audio interview master spans multiple digital files, stitch together and save as Sub-Master.
  • Sub-master File naming: [Accession#_Lastname_submst]
  • If interview master (audio) does not span multiple digital files, no sub-master is needed.
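
As a rough illustration of the stitching rule above, the following sketch concatenates audio parts in sequence with Python's standard `wave` module, copying the audio frames untouched so that nothing is transcoded. It assumes the master files are uncompressed PCM .wav files with identical settings; the function name `stitch_wav` is hypothetical and is not part of the Nunn Center's tooling.

```python
import wave


def stitch_wav(part_paths, submaster_path):
    """Concatenate WAV parts in the order given, without re-encoding."""
    if not part_paths:
        raise ValueError("at least one master file is required")

    with wave.open(part_paths[0], "rb") as first:
        # Channels, sample width, and sample rate must match across all parts.
        settings = (first.getnchannels(), first.getsampwidth(), first.getframerate())

    with wave.open(submaster_path, "wb") as out:
        out.setnchannels(settings[0])
        out.setsampwidth(settings[1])
        out.setframerate(settings[2])
        for path in part_paths:
            with wave.open(path, "rb") as part:
                current = (part.getnchannels(), part.getsampwidth(), part.getframerate())
                if current != settings:
                    raise ValueError(f"{path} does not match the original settings")
                # Copy raw frames in one-second chunks; samples are not altered.
                while True:
                    frames = part.readframes(settings[2])
                    if not frames:
                        break
                    out.writeframes(frames)
```

The parts must be passed in interview order, since the function simply appends them one after another. Refusing mismatched settings, rather than converting, keeps the sketch in line with the rule that a sub-master retains all original settings.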

VIDEO

  • If total file size for born digital master interview is over 60 GB, this interview does not qualify for the creation of a sub-master version.  See the guidelines for the creation of the mezzanine version.
  • If total file size for born digital master interview is under 60 GB and is contained in a single file, this interview does not qualify for the creation of a sub-master version.
  • If total file size for born digital master interview is under 60 GB and spans multiple files, this interview does qualify for the creation of a sub-master version.

Create video sub-master according to the following guidelines:

  • stitch multiple files together in sequence
  • retain original Height/Width/Aspect Ratio/Frame Rate
  • no transcoding

If master video data file type is deemed to be an immediate obsolescence risk,

  • retain master versions
  • Create sub-master version (appropriate open format, minimize loss)
  • Create mezzanine version.  See the guidelines below.

If master interview is on digital tape, create surrogate data files. Create sub-master according to the following guidelines:

  • Do NOT stitch files in sequence
  • retain original Height/Width/Aspect Ratio/Frame Rate
  • no transcoding (for example, if an interview is recorded on HDV tape, utilize the HDV codec during conversion to data file)

 Sub-Master file naming: Accession_Number_Lastname_p1(if necessary)_submst

MEZZANINE VERSION

 AUDIO

  • Mezzanine version should be derived from the sub-master and can be created at the same time as the access versions.  The mezzanine version can use compression, but is intended to still be of high enough quality that it can be used as a source to create future access versions if necessary.
  • Current recommendation: MP3 (320 kbps)

VIDEO

  • For video, the mezzanine version may or may not be derived from the sub-master depending on the circumstance.  The mezzanine version can use compression, but is intended to still be a high-enough quality version that it can be used as a source to create future access versions if necessary.
  • If total file size for born digital master or sub-master version is under 60 GB, no mezzanine version is necessary.
  • If total file size for born digital master or sub-master interview is over 60 GB, create video mezzanine version instead of a video sub-master.
  • If original video interview data file type is deemed to be an immediate obsolescence risk, create a sub-master and a mezzanine version in accordance with the guidelines above.
  • If master interview is on digital tape, sub-master data files have been created.  You should also create a mezzanine version.
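
The video sub-master and mezzanine rules above can be condensed into a small decision function. This is only a sketch of one reading of the policy: the class, field, and function names are hypothetical; it assumes the digital tape and obsolescence cases take precedence over the size rules; and it treats an interview of exactly 60 GB as "under," since the policy does not say which side that case falls on.

```python
from dataclasses import dataclass

SIZE_THRESHOLD_GB = 60  # threshold stated in the policy


@dataclass
class VideoInterview:
    total_gb: float                  # combined size of all master files
    file_count: int                  # number of data files the interview spans
    obsolescence_risk: bool = False  # master format judged an immediate risk
    on_digital_tape: bool = False    # master is a tape, not a data file


def video_versions_to_create(interview):
    """Return the set of surrogate versions this reading of the policy calls for."""
    versions = set()
    if interview.on_digital_tape or interview.obsolescence_risk:
        # Both cases call for a sub-master and a mezzanine version.
        versions.update({"submaster", "mezzanine"})
    elif interview.total_gb > SIZE_THRESHOLD_GB:
        # Too large for a sub-master; a mezzanine is created instead.
        versions.add("mezzanine")
    elif interview.file_count > 1:
        # Under the threshold and spanning several files: stitch a sub-master.
        versions.add("submaster")
    return versions
```

For the interview described in the introduction (157 GB across 5 files), `video_versions_to_create(VideoInterview(157, 5))` returns only a mezzanine version, provided the format is not flagged as an obsolescence risk.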

 Create mezzanine according to the following guidelines:

  • stitch files in sequence
  • retain original Height/Width/Aspect Ratio/Frame Rate
  • Utilize the following settings:
  • File Extension: mp4
  • Estimated size: 13.65 GB/hour of source
  • Video Encoder
    • Width: (100% of source)
    • Height: (100% of source)
    • Codec Type: H.264
  • Multi-pass: On, frame reorder: Off
  • Average data rate: 30 Mbps
  • Video: H.264 main profile

 Mezzanine file naming: Accession_Number_Lastname_p1(if necessary)_mezz
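
The sub-master and mezzanine naming patterns differ only in their final suffix, so one helper can build both. The sketch below is illustrative: the function name is hypothetical, and the accession number and last name in the usage note are made-up placeholders rather than real Nunn Center identifiers.

```python
def surrogate_name(accession_number, last_name, suffix, part=None):
    """Build a name such as Accession_Number_Lastname_p1_submst."""
    if suffix not in ("submst", "mezz"):
        raise ValueError("suffix must be 'submst' or 'mezz'")
    pieces = [accession_number, last_name]
    if part is not None:
        pieces.append(f"p{part}")  # part marker is included only if necessary
    pieces.append(suffix)
    return "_".join(pieces)
```

For example, `surrogate_name("ACC0001", "Smith", "mezz")` gives `ACC0001_Smith_mezz`, and `surrogate_name("ACC0001", "Smith", "submst", part=1)` gives `ACC0001_Smith_p1_submst`.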

ACCESS VERSIONS

Access standards will change.  We assume that the access versions being created will need to be replaced and refreshed on a regular basis.  Access versions should be created utilizing automated, batch creation tools such as Telestream’s Episode or Apple’s Compressor.

AUDIO

Audio access versions are to be created from the master, the sub-master, or the mezzanine version. The source for the access version must be a single data file that spans the entirety of the interview in sequence. The Nunn Center previously generated only an mp3 access version.  We have now adopted a policy to create a “high quality” or mezzanine audio access version, a “web optimized” access version in .mp3 format, and a “web optimized” access version in .ogg format.  The .ogg version was added because of Firefox’s decision not to support .mp3 in HTML5.  The audio mezzanine version serves as a primary access version of high enough quality to create future derivatives if the .wav master is not accessible at the time.

Create Audio Access .mp3 according to the following guidelines.

  • Codec: Lame MP3
  • Encoding Type: Bitrate Based
  • Bitrate: 64 kbit/s
  • Mode: Constant Bit Rate
  • Container: .mp3
  • Estimated File Size (1 hour): 28.2 MB / 0.028 GB

Create Audio Access .ogg according to the following guidelines.

  • Codec: Vorbis
  • Encoding Type: Bitrate Based
  • Bitrate: 64 kbit/s
  • Mode: Constant Bit Rate
  • Container: .ogg
  • Estimated File Size (1 hour): 33 MB / 0.033 GB

Create Audio Mezzanine .mp3 according to the following guidelines.

  • Codec: Lame MP3
  • Encoding Type: Bitrate Based
  • Bitrate: 320 kbit/s
  • Mode: Constant Bit Rate
  • Container: .mp3
  • Estimated File Size (1 hour): 144 MB / 0.144 GB
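
The estimated file sizes in the audio presets above follow from simple arithmetic on a constant bit rate: bits per second times seconds, divided by eight to get bytes. The sketch below shows that calculation; the function name is hypothetical, and it assumes megabytes are counted as 10^6 bytes and ignores container overhead and metadata.

```python
def cbr_size_mb(bitrate_kbps, hours=1.0):
    """Approximate size in megabytes (10**6 bytes) of a constant-bit-rate stream."""
    seconds = hours * 3600
    bits = bitrate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000


for label, kbps in [("access (64 kbit/s)", 64), ("mezzanine (320 kbit/s)", 320)]:
    print(f"{label}: {cbr_size_mb(kbps):.1f} MB per hour")
```

This yields 144.0 MB per hour at 320 kbit/s, matching the mezzanine estimate, and 28.8 MB per hour at 64 kbit/s, close to the 28.2 MB listed for the access .mp3. Real files can be expected to differ slightly from the arithmetic.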

VIDEO

Video access versions are to be created from the master, the sub-master, or the mezzanine version. The source for the access version must be a single data file that spans the entirety of the interview in sequence. Our current standard for delivering digital video online is H.264 compression.  Current video server restrictions limit individual files to under 2 GB.  Therefore, for video access versions, settings depend on the length of the interview.  Use the preset that corresponds to the length of the interview.

Create Video Access version according to the following guidelines:

  • Codec: H.264
  • Resolution: Same as Source
  • Frame Rate: Same as Source
  • Bitrate: 800 kbits/second
  • Mode: Constant Bit Rate
  • Display Aspect Ratio: Same as Source
  • Main Profile
  • Multipass
  • Targeted File Size (Under 2 GB)
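
Because the server limit is fixed while interview length varies, it can help to check whether a given bit rate will keep a file under 2 GB. The following sketch works that out under stated assumptions: it reads "2 GB" as 2 × 10^9 bytes, assumes a 64 kbit/s audio track alongside the 800 kbit/s video (the audio rate for video access versions is not specified here), and ignores container overhead. The function names are hypothetical.

```python
SERVER_LIMIT_BYTES = 2 * 10**9  # "under 2 GB", read here as decimal gigabytes


def max_total_kbps(duration_hours, limit_bytes=SERVER_LIMIT_BYTES):
    """Highest combined (video + audio) bit rate that stays under the limit."""
    seconds = duration_hours * 3600
    return limit_bytes * 8 / seconds / 1000


def fits_under_limit(duration_hours, video_kbps=800, audio_kbps=64):
    """True if a CBR encode at these rates is expected to stay under the limit."""
    return video_kbps + audio_kbps < max_total_kbps(duration_hours)
```

Under these assumptions a two-hour interview has a ceiling of roughly 2,222 kbit/s and fits comfortably at 800 kbit/s, while a six-hour interview would need a lower bit rate, which is why a preset matched to interview length is needed.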

We also use the automated batch creation process to create the following derivative versions for video:

  • If needed, create mezzanine file (according to guidelines articulated above)
  • Create preservation audio (16-bit/44.1 kHz .wav) (stripped from video file)
  • Create audio access version (.mp3 see guidelines above)
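
The preservation audio derivative can be illustrated with a short script. Note that this sketch substitutes the open-source ffmpeg command-line tool for the batch tools named above (Episode and Compressor), so it is an analogous approach and not the Nunn Center's actual workflow. It assumes ffmpeg is installed and on the PATH; the function names are hypothetical.

```python
import subprocess


def wav_extract_command(video_path, wav_path):
    """Build an ffmpeg command that strips audio to 16-bit/44.1 kHz WAV."""
    return [
        "ffmpeg",
        "-i", video_path,
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        "-ar", "44100",          # 44.1 kHz sample rate
        wav_path,
    ]


def extract_preservation_audio(video_path, wav_path):
    """Run the command; raises CalledProcessError if ffmpeg reports failure."""
    subprocess.run(wav_extract_command(video_path, wav_path), check=True)
```

Keeping the command construction separate from its execution makes it easy to inspect or log the exact settings before a batch run.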

 STORAGE (LOCATIONS—See Maps on Following Pages)

DCC (Server)

  • MASTER (if total data for interview is < 60 GB)
  • SUBMASTER (only if MASTER was not a data file, i.e. HDV video tape)
  • MEZZANINE
  • ACCESS

 HSM (Campus Data Backup)

  • MASTER
  • SUBMASTER

PRESERVATION REPOSITORY (OAIS model)

  • MASTER
  • MEZZANINE
  • ACCESS

XHDD (External Hard Drives, RAID 0)

  • MASTER
  • SUBMASTER
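
The storage lists above amount to a lookup table with two conditions on the DCC server. A minimal sketch of that mapping follows, assuming the version labels and function name shown (both hypothetical) and treating exactly 60 GB as too large for the DCC master copy.

```python
STORAGE_MAP = {
    "DCC": {"master", "submaster", "mezzanine", "access"},
    "HSM": {"master", "submaster"},
    "PRESERVATION REPOSITORY": {"master", "mezzanine", "access"},
    "XHDD": {"master", "submaster"},
}


def storage_plan(versions, total_gb, master_is_data_file=True):
    """Map each storage location to the versions of one interview it should hold."""
    plan = {}
    for location, accepted in STORAGE_MAP.items():
        held = set(versions) & accepted
        if location == "DCC":
            if total_gb >= 60:
                held.discard("master")     # DCC takes masters only under 60 GB
            if master_is_data_file:
                held.discard("submaster")  # DCC takes sub-masters only for tape masters
        plan[location] = held
    return plan
```

For a 157 GB interview with master, mezzanine, and access versions, this plan places only the mezzanine and access versions on DCC, while the master goes to HSM, the preservation repository, and the external drives.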

Citation for Article

APA

Boyd, D. A., & Price, S. A. (2012). Case study: Born digital accession workflow: The Louie B. Nunn Center for Oral History, University of Kentucky Libraries. In D. Boyd, S. Cohen, B. Rakerd, & D. Rehberger (Eds.), Oral history in the digital age. Institute of Museum and Library Services. Retrieved from https://ohda.matrix.msu.edu/2012/06/borndigital/

Chicago

Boyd, Douglas A. and Sara Abdmishani Price. “Case Study: Born Digital Accession Workflow: The Louie B. Nunn Center for Oral History, University of Kentucky Libraries,” in Oral History in the Digital Age, edited by Doug Boyd, Steve Cohen, Brad Rakerd, and Dean Rehberger. Washington, D.C.: Institute of Museum and Library Services, 2012, https://ohda.matrix.msu.edu/2012/06/borndigital/

 

This is a production of the Oral History in the Digital Age Project (https://ohda.matrix.msu.edu) sponsored by the Institute of Museum and Library Services (IMLS).  Please consult https://ohda.matrix.msu.edu/about/rights/ for information on rights, licensing, and citation.

Permanent link to this article: https://ohda.matrix.msu.edu/2012/06/borndigital/
