Meaningful access to audio and video passages:
A two-tiered approach for annotation, navigation,
and cross-referencing within and across oral history interviews
by Doug Lambert and Michael Frisch
Abstract
Despite the use of digital technology for recordings and the opportunity for online retrieval, meaningful access to recorded oral history collections still requires an approach to, and an architecture for, fluidly “getting inside” the interviews. Cataloging tapes or files does not bring users close enough to the material, while creation of a fine-grained index takes considerable time. In our practice of digital indexing over the past several years, we have worked in systems that put us close to recorded oral history interview collections via audio/video databases linked directly to a digital recording, which allow for subject access via custom thesauri. Within this model, where full-text word-for-word transcription is optional or simply applied as needed, we arrived at a two-tiered approach for creating the sub-interview units for annotation and metadata application. The Unit/Story method provides two levels of navigability for robust intra-interview access to oral histories. The unit level contains comprehensive, sequential passages approximately 10 minutes in length, annotated in real time, to which broad controlled vocabulary terms can be applied. The story level allows for shorter, more detailed passages to be created immediately or over time. Additional stories can be refined and indexed by any number of project stewards, staff, interns, volunteers, or targeted users, either locally or online. Both the systematic units and the flexible, customizable story passages set the stage for development and application of a faceted thesaurus. This approach also facilitates balanced cataloging and indexing within and across collections that have both long and short interviews, unifying the scale of the navigable unit. Refining these content management approaches over dozens of projects has led us to de-emphasize the search for perfect software or rigidly defined practices, and to focus on getting various tools and teams of talented people working together efficiently.
Introduction
What is the best tool to use to build a bridge? Even someone who knows nothing about civil engineering would immediately recognize such a question to be flawed in concept. Many “tools” are used in building bridges, from the designers’ suites of mathematical and intellectual tools to the construction workers’ pieces of large equipment and small hand tools. In our work building digital oral history audio and video databases, we often hear versions of this question: What is the best tool to use for my oral history collection? In developing meaningful audio/video indexing and content mapping within and across interviews, we now know that a range of tools and skills are needed to respond to various projects’ goals and real-world constraints. And, just like building a bridge, to create something valuable takes time and effort supported by adequate resources.
Working on applied oral history indexing projects out of our consulting office at the University at Buffalo, NY (U.S.A.) Technology Incubator, we are gaining experience with an array of digital tools and skills for shaping access to audio and video in a multi-media and multi-disciplinary field. In the process, we stumbled upon one of the most important “tools” in mapping and managing oral histories: an approach to the building blocks themselves. These are the intra-interview units which define passages in a recording when working within a digital environment. These units map the content in lieu of linear pages of transcript text. Working under a new model that is inherently database-driven rather than text/transcript-driven, we evolved a method for defining two kinds of passages within and across interviews called, respectively, “units” and “stories.” Creation of these two distinct but related layers of intra-interview units sets the stage for different levels and qualities of indexing.
In systems that allow us to index audio and video directly, without transcription, the core units are ranges of time defined within the larger digital files–analogous to highlighted passages in a book. In this model, the structure for organizing, annotating, and cross-referencing metadata can then be linked directly to the primary audio or video source. This supports meaning-focused indexing, leaving full-text transcription as an option, not as a requirement for access.
Experience on earlier projects showed us the challenges and limits of applying traditional library modes to indexing audio and video at the sub-interview level. For the long, unedited files in oral history collections, a catalog of subject headings only provides access to the tape/file level, and even a short segment of an interview may cover multiple subjects or topics. In contrast, a book indexing approach for electronic material brings one directly to a specific point in the content stream in full context. But this highly specific approach is time consuming to create. In addition, a catalog or an index calls for consistent and legible levels of navigability. These opposing forces between large and small, broad and narrow, efficient and detailed, led us to use a two-tiered approach to creating sub-interview units for recorded oral history collections. This method, which we call Unit/Story, allows for inter- and intra-interview access via custom thesauri. It also provides a sustainable workflow model for the real-world constraints of both large and small projects.
Challenges in Oral History Indexing
Transcription as a mode of mapping and navigating oral history recordings has an advantage in that the collection is then poised for access via full-text searches. On the other hand, transcriptions are time consuming to create, contain extraneous material irrelevant to the users, and often bottleneck or slow workflow on projects. Full-text searches, though powerful, are also limited by the fact that they are literal: they match only the words actually spoken, so the terms a user searches for may not appear in the text even when the topic is discussed. Furthermore, while full-text modes are useful when searching a particular term or phrase (i.e., when you know what you want), they are less helpful when you do not know what you want, but rather wish to browse, explore, and listen to passages of audio and video at the source. For this more nuanced, multi-dimensional approach to indexing, new and multiple modes and technologies are needed.
As audio or video becomes readily “accessible” because it is digitized or online, meaningful ways to navigate within and across interviews become increasingly desirable. Paradoxically, however, accessibility gained by digital modes is generally offset by the size of collections and the size of the individual files. Cataloging file-length units, like shelved cassettes or videotapes, does not bring users close enough to the material, while creation of a fine-grained index on paper, or digitally, takes considerable time. So the first step is to subdivide the audio/video files of recorded interviews into appropriately sized units. But how large should these units be? They can be large, following the general topical flow of the interview, in which case they generally end up being between 5 and 20 minutes in length. Or they can be shorter, defining anything from short quotes or stories up to longer anecdotes of five minutes or more. Thus, many options are available and several decisions need to be made as to the level of granularity and specificity of cataloging and indexing oral history interviews.
Direct access to audio and video of recorded oral histories complicates traditional library cataloging and indexing because the content is not concise, edited, published material. A cataloged object in a library is already much more specifically defined as to its subject just by the fact that it is a bound, coherent, and usually physical object. But even so, a catalog only brings you to the object, not the content within it. In contrast, detailed indexing is typically bounded and focused on a particular set of users and, usually, on a very particular book. Any book has a relatively definable domain of users for whom the indexer can make targeted decisions about formatting conventions or the vocabulary terms chosen. As we increasingly index for Web use, the breadth of audience for our index is literally growing to a global scale.
Another challenge in defining sub-interview units and annotating passages of audio and video recordings is that styles of interviews vary greatly across collections. Because the length of the stories, the question/answer cycles, and the themes covered vary enormously within and across interviews, an argument can be made for making the intra-interview divisions completely arbitrary. In other words, why not chop them into one-, five-, or ten-minute segments, and leave it at that? There are viable arguments for this timed-segment approach, as the decision making associated with defining interview units can be time consuming and demanding.
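To make the timed-segment alternative concrete, here is a minimal sketch in Python. It is only an illustration of arbitrary fixed-interval division, not a tool used on the projects described here; the function name and the use of seconds as the time unit are assumptions made for the example.

```python
def timed_segments(duration_s, segment_s):
    """Split a recording into consecutive fixed-length time ranges.

    Both arguments are in seconds (an assumption for this sketch).
    The final segment is shorter when the duration is not an exact multiple.
    """
    if duration_s <= 0 or segment_s <= 0:
        raise ValueError("duration and segment length must be positive")
    segments = []
    start = 0.0
    while start < duration_s:
        end = min(start + segment_s, duration_s)
        segments.append((start, end))
        start = end
    return segments


# A 25-minute interview cut into ten-minute segments.
segments = timed_segments(25 * 60, 10 * 60)
```

The cut points here ignore content entirely, which is exactly what makes the approach cheap and also what makes the resulting entry points abrupt.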
One additional challenge in oral history indexing is that it requires human input to do the work. Though some technology may be promising to reduce this factor, automated technologies have yet to prove highly useful at this level of annotation and metadata. In addition, there is enormous value added when a collection is cataloged and indexed by the people who created it for a group of people they have in mind to use it. At the same time, annotation and indexing work can be tedious and repetitive. To create a digital oral history collection that is accessible at the intra-interview level requires teams of people, including archival-minded librarians along with staff, interns, and volunteers who contribute to the volumes of annotation work for larger collections.
The Unit/Story Approach
The Unit/Story method grew out of several oral history indexing projects we carried out from 2002 to 2007. We worked with Interclipper™, an off-the-shelf database software that links the records in a database to defined passages of audio and video. (Interclipper was originally designed for, and is still used in, commercial market research. We adapted it for use in oral history because of its capacity to directly link playable passages of audio/video with database annotations and an index.) In this work, we favor summaries of content rather than word-for-word transcription as the basis for browsing material. These summary annotations are less dense than full transcriptions, so for the same length of audio or video there is less text to scan. This level of detail is better suited to helping individuals decide whether they want to go deeper and actually listen to the passage from the primary source while having access in context to the entire interview.
The Unit/Story approach to annotating intra-interview passages came naturally due to a particular aspect of working in Interclipper. Because the marked passages in Interclipper are not physical units but simply time-code pointers to media files, different passages marking the same digital file are in no way mutually exclusive: they can be large, small, cross over each other, and be nested within each other. Or, a passage covering an identical time range can be annotated in completely different ways by different people for completely different purposes. Working under this architecture made us comfortable with the idea that there does not have to be just one way to define content for indexing. Pushed by the inherent tension between the cataloging level and indexing subdivisions, it was a natural progression in the Interclipper environment to move beyond “either/or” and simply say “yes” to both, defining longer passages that cover everything generally, and highlighting shorter passages within those units with the most compelling material.
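The idea of passages as time-code pointers, rather than physical pieces of a file, can be sketched as a small data structure. The Python below is a hypothetical illustration and does not reflect Interclipper's actual schema; the class name, field names, and sample values are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """A pointer into a media file: nothing is cut or copied (hypothetical model)."""
    media_file: str
    start_s: float   # in-point, seconds from the start of the file
    end_s: float     # out-point, seconds from the start of the file
    annotator: str
    note: str

    def __post_init__(self):
        if self.end_s <= self.start_s:
            raise ValueError("end_s must be after start_s")

    def overlaps(self, other):
        """True when both passages point at the same file and share some time."""
        return (self.media_file == other.media_file
                and self.start_s < other.end_s
                and other.start_s < self.end_s)

    def contains(self, other):
        """True when the other passage is nested entirely inside this one."""
        return (self.media_file == other.media_file
                and self.start_s <= other.start_s
                and other.end_s <= self.end_s)


# Invented sample data: one long passage, one nested quote, one crossing passage.
unit = Passage("interview01.wav", 0, 600, "steward", "General summary of the opening")
quote = Passage("interview01.wav", 125, 140, "intern", "Short quote")
crossing = Passage("interview01.wav", 580, 650, "volunteer", "Anecdote spanning a break")
```

Because each record stores only a file reference and two time codes, nothing prevents `quote` from sitting inside `unit`, or `crossing` from running past its end, and two annotators could describe the same range in different ways without conflict.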
The two-tiered model of Unit/Story essentially provides one layer that functions more like a catalog, and another that functions more like an index. The unit level contains sequential passages approximately 10 minutes in length, annotated in slightly over real time, to which broad controlled vocabulary terms can be applied. These units may vary from 5 to 15 minutes. Break points are decided by listening for a natural ending of a topic, a break before a new question, or another natural pause in the flow of speech. This structure creates a highly functional level of access, navigability, and browsing capacity. The structure is broad enough for a collection of such units to be built up efficiently, but also short enough that the summaries can be scanned quickly before deciding whether to play the entire passage or not.
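In practice the break points are chosen by a person listening to the interview. Purely to illustrate the rule of thumb (aim for about 10 minutes, stay between 5 and 15, cut at a natural pause), the Python sketch below picks break points from a list of pause times that an annotator has already noted. The function names, the greedy selection, and the fallback to the target length when no pause is available are all assumptions of this sketch, not part of the method as practiced.

```python
def choose_unit_breaks(pauses_s, duration_s, target_s=600, min_s=300, max_s=900):
    """Pick unit break points (seconds) from annotator-noted natural pauses.

    Greedy sketch: from the current unit start, take the pause closest to the
    target length that keeps the unit between min_s and max_s.
    """
    breaks = []
    start = 0.0
    # Stop once the remainder fits in a single unit.
    while duration_s - start > max_s:
        window = [p for p in pauses_s if start + min_s <= p <= start + max_s]
        if window:
            cut = min(window, key=lambda p: abs(p - (start + target_s)))
        else:
            cut = start + target_s  # assumption: no usable pause, cut at the target
        breaks.append(cut)
        start = cut
    return breaks


def units_from_breaks(breaks, duration_s):
    """Turn break points into consecutive (start, end) ranges covering the file."""
    edges = [0.0, *breaks, duration_s]
    return list(zip(edges[:-1], edges[1:]))


# Invented pause times for a 40-minute interview.
pauses = [280, 615, 1190, 1300, 1900]
breaks = choose_unit_breaks(pauses, 2400)
units = units_from_breaks(breaks, 2400)
```

Note that the last unit simply takes whatever time remains, so it can be shorter than five minutes; a human annotator would merge or adjust such a remainder by ear.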
The story level, in contrast, allows for more specific passages to be created, described, and indexed within and across the units over time. Detailed stories take more time to make, and unlike the relatively strict format of the units, stories can vary widely in size and nature, from great quotes that are only five seconds long to anecdotes that last several minutes. With the comprehensive, unit-level annotations in place, there is immense flexibility to build out the detailed story identification at a relaxed pace and by any number of project stewards, staff, interns, or volunteers in specific project or programming contexts, internally or externally, locally or remotely.
The advantages to this method are several. Because collections have both long and short interviews, Unit/Story unifies the scale of the navigable unit. This consistency makes it easier and more comfortable for users to understand what they are looking at. Since new segments are begun at points that are meaningful (estimated at 10 minutes, but divided during natural breaks), listening to the units is more pleasant than it would be if they had abrupt entry points due to arbitrary time intervals. Also, Unit/Story works for the multi-dimensional approach to indexing we use, referred to as CVS.
But, most importantly, it serves as a very flexible workflow model for project teams. Populating the collection with units represents an attainable level of work to be done in-house for basic collection stewardship, while detailed indexing is targeted to both internal and external user needs on an on-demand basis.
Details on the Two-Tiered Unit/Story Concept (See Figure 1 below):
1. Units are comprehensive, 5 to 15 minute segments, summarized in flow logs. The flow of text typically tracks the flow of speech in the audio-video, so something logged later in the unit is likely to be near the end of the summary, and vice versa, making it relatively easy to locate in the audio or video stream even without more precise referencing.
2. Basic instruction for the annotator is to provide a summary of the material so someone else can sense what is in the passage and decide if they want to listen to that passage. This is best done by the original interviewer, a librarian, researcher, another steward of the collection, or a combination of these individuals.
3. Units can be given unique descriptive titles, and/or assigned broad terms from a controlled vocabulary that can help link these approximately 10 minute passages to the same types of passages in other interviews. These controlled vocabularies can be developed a priori for certain interviews and/or refined once a number of these units are annotated, as informed by the content.
4. Stories are shorter passages that sit within, and sometimes span across, the larger units. They can vary in length and in nature, and may include, but are not limited to, stories; anecdotes; sound bites; and quotes created for specific uses such as presentations, research citations, inclusion in a book or documentary, or Web site highlights.
5. The number of stories identified within a unit may range from none to “a lot”. In cases where an interview is not of high quality or interest, or there is simply no time to identify these passages, there is no reason to enhance the summarized unit with specific stories. In other cases, several stories, multiple or alternate versions of stories, and stories within stories can be made as time and staffing allow.
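The five points above can be read as a small data model: units that cover an interview end to end, optional stories layered on top, and controlled vocabulary terms that link passages across interviews. The Python sketch below is one hypothetical way to express that model; the class names, fields, sample terms, and sample summaries are invented for illustration and are not the authors' database design.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Unit:
    start_s: float
    end_s: float
    summary: str                                   # the flow-log summary
    terms: set = field(default_factory=set)        # broad controlled vocabulary


@dataclass
class Story:
    title: str
    start_s: float
    end_s: float
    terms: set = field(default_factory=set)        # more specific index terms


@dataclass
class Interview:
    interview_id: str
    units: list = field(default_factory=list)      # comprehensive, sequential
    stories: list = field(default_factory=list)    # optional: may be empty

    def units_touched_by(self, story):
        """Units a story falls within or spans across."""
        return [u for u in self.units
                if story.start_s < u.end_s and u.start_s < story.end_s]


def build_term_index(interviews):
    """Map each term to every unit or story carrying it, across interviews."""
    index = defaultdict(list)
    for iv in interviews:
        for u in iv.units:
            for term in u.terms:
                index[term].append((iv.interview_id, "unit", u.start_s, u.end_s))
        for s in iv.stories:
            for term in s.terms:
                index[term].append((iv.interview_id, "story", s.start_s, s.end_s))
    return dict(index)


# Invented sample collection: one interview with a story, one with units only.
story = Story("First day on the job", 570, 640, {"work"})
iv1 = Interview("int01",
                units=[Unit(0, 600, "Early life and first job", {"work"}),
                       Unit(600, 1250, "Marriage and children", {"family"})],
                stories=[story])
iv2 = Interview("int02", units=[Unit(0, 540, "Moving for work", {"work", "migration"})])
index = build_term_index([iv1, iv2])
```

Looking up `index["work"]` returns passages from both interviews at both tiers, which is the cross-referencing the controlled vocabulary is meant to support, while `iv2` remains fully navigable even though no stories have been identified in it yet.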
The Unit/Story approach strikes a good balance between two often opposing responsibilities of a collection steward. The first is to provide high-quality, authoritative access as a librarian would. Though cataloging and indexing done by a human is itself never perfect, it represents a level of quality control, filtering, labeling, or abstracting, which, especially in a world of ubiquitous electronic material, is greatly needed. The second is to provide access on users’ terms. Since the breadth of audience and users for most collections is potentially worldwide, the notion of responsively and responsibly indexing for all of those users is next to impossible. Although the emerging reciprocal interactive Web 2.0 model is one that seems to push the focus away from librarians’ work to user tagging, we are embracing the benefits of both models in the Unit/Story approach: annotating and cataloging/indexing collections on a broad but thorough level, then allowing the detailed indexing of the collection to be incrementally populated by the stewards and/or the users, internal and external, over time.
Figure 1. Conceptual schematic of the Unit/Story architecture.
Building a Better Oral History Collection
Working on oral history indexing projects requires a suite of tools and techniques. Though the Interclipper software is the environment in which we created the Unit/Story method, there are other tools that are involved in the process. The ones we have identified and used for oral history indexing range from open-source to proprietary audio/video indexing tools such as: Microsoft® Excel and Access, Apple® iTunes, Audacity®, Pro Tools®, Adobe® Acrobat and Audition, Interclipper™ (Audio, Video, and DVD version), Omeka, Annotator’s Workbench, Stories Matter, Cindex™, and Zotero. We demonstrated—on three IMLS National Leadership grants and five Teaching American History grants—that combining tools such as these with tools the project team or participants already use yields productive results.
It is worth noting that almost none of these tools were designed for oral history audio/video indexing in particular. We have come to understand that this is intricate, involved work, and that no one platform can do it all for these large, complex collections. It has also been our observation that, no matter which tools we start with, the media, content, and metadata almost inevitably need to be migrated to a range of other storage and presentation platforms, including the Web. One particularly important destination for an indexed collection, often forgotten amid current digital-age enthusiasms, is print copies of the collection of summaries. Printed summaries on paper are proving far from irrelevant as a comfortable means of fluid access to oral history collections at the intra-interview level. We commonly create binders with all of the unit and story level summaries in hard copy. Once the collection is indexed in the database, cataloging the print copies in multiple ways or creating several indexes with different focuses is very easy. And current trends in electronic text publishing, especially interactive PDFs and similar formats that include internal hyperlinks and embedded audio and video files, only increase the evolving utility of print modes in which indexed oral histories can be encountered and explored far more easily and meaningfully than word-searched transcripts.
We work under a collaborative, team-based approach. Although much of the annotation is done by us, we typically aim to turn that work over to those who own or are primarily responsible for the collection. Our job as consultants is to guide the stewards’ in-house staff in specific tools and methods and, more importantly, to guide them through the decision-making process as they explore all of the options for gaining the meaningful access that technology now allows. “Getting things to work together” is a common way to describe the core of our work, whether it involves migrating the in- and out-points of the units and the stories from one database to another, or evaluating which combination of team members would be most effective on which aspects of the project.
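As one small example of what migrating in- and out-points can involve, the Python sketch below converts HH:MM:SS time codes to seconds and writes unit and story records to CSV, a format most databases and spreadsheets can import. The column names, the time-code format, and the sample rows are assumptions chosen for illustration; they do not describe the export format of Interclipper or any other tool named here.

```python
import csv
import io


def timecode_to_seconds(timecode):
    """Convert 'HH:MM:SS' to whole seconds."""
    hours, minutes, seconds = (int(part) for part in timecode.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_timecode(total_seconds):
    """Convert whole seconds back to 'HH:MM:SS'."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def export_passages(rows, out):
    """Write (interview_id, tier, in, out, summary) rows as CSV with points in seconds."""
    writer = csv.writer(out)
    # Hypothetical column names for the target database.
    writer.writerow(["interview_id", "tier", "in_s", "out_s", "summary"])
    for interview_id, tier, tc_in, tc_out, summary in rows:
        writer.writerow([interview_id, tier,
                         timecode_to_seconds(tc_in),
                         timecode_to_seconds(tc_out),
                         summary])


# Invented sample rows: one unit and one story from the same interview.
rows = [
    ("int01", "unit", "00:00:00", "00:10:15", "Early life and first job"),
    ("int01", "story", "00:09:30", "00:10:40", "First day on the job"),
]
buffer = io.StringIO()
export_passages(rows, buffer)
exported_lines = buffer.getvalue().splitlines()
```

Storing the points as plain seconds keeps the records independent of any one tool's time-code display, and `seconds_to_timecode` restores a readable form for print summaries.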
Though technology tools will remain the way we work with our material, we like to think of the Unit/Story approach as providing the right building blocks for the construction of an indexed collection. The model demands annotation of comprehensive units and allows for the open-ended creation of stories by and for any imaginable contextual set of users. This turns the focus of meaningful access to oral history away from technology alone, to human beings working with each other and with the technology, expanding skill sets while getting their hands dirty in their projects. Overall, building meaningful access to audio and video requires the right set of tools, well-chosen units of annotation such as those in our Unit/Story approach, the adaptation of existing tools while creating new ones, and a lot of hard work.
Adapted from Proceedings of the XVI International Oral History Conference, July 7-11, 2010, Prague, Czech Republic
Citation for Article
APA
Lambert, D., & Frisch, M. (2012). Meaningful access to audio and video passages: A two-tiered approach for annotation, navigation, and cross-referencing within and across oral history interviews. In D. Boyd, S. Cohen, B. Rakerd, & D. Rehberger (Eds.), Oral history in the digital age. Institute of Museum and Library Services. Retrieved from /2012/06/meaningful-access-to-audio-and-video-passages-2/.
Chicago
Lambert, Doug and Michael Frisch. “Meaningful access to audio and video passages: A two-tiered approach for annotation, navigation, and cross-referencing within and across oral history interviews,” in Oral History in the Digital Age, edited by Doug Boyd, Steve Cohen, Brad Rakerd, and Dean Rehberger. Washington, D.C.: Institute of Museum and Library Services, 2012, /2012/06/meaningful-access-to-audio-and-video-passages-2/
This is a production of the Oral History in the Digital Age Project (/) sponsored by the Institute of Museum and Library Services (IMLS). Please consult /about/rights/ for information on rights, licensing, and citation.