Digital Audio Recording: The Basics
by Doug Boyd
In the context of audio, “analog” refers to the method of representing a sound wave with voltage fluctuations that are analogous to the pressure fluctuations of the sound wave. Analog fluctuations vary continuously, whereas digital recording captures discrete values at each sample time. Simply put, “digital audio” refers to the representation of sound in digital form.
Portable Recorders
At the present time, there are four mainstream types of portable digital recorders: solid state, which uses flash memory; hard-disk based, which records to an internal or external hard drive; CD recorders; and direct-to-computer recording, which uses an analog-to-digital converter and sends the signal into the computer via a FireWire or USB connection. Extensive information on all four types of recording can be found online. My blog Digital Omnium (www.digitalomnium.com) contains informational videos about some of the more popular recorders used by members of the oral history community and links to other useful sites. CompactFlash, SD, SDHC, and SDXC are the primary flash media types (at present).
Solid State (Flash Memory) Recorders
- Familiar due to popularity of digital photography
- Flash memory is reusable
- Record at very high quality settings
- Upload at rapid pace
- File-based workflow
- Media is cheap (and getting cheaper)
- Portable, but requires a computer
- Fastest growing portable recorder market
Hard Disk Drive (HDD) Recorders
- Large capacity allows for longer recording time and higher recording quality
- Upload/transfer at rapid pace
- Record at very high quality settings
- File-based workflow
- Drives are getting cheaper
- Getting more portable
- Still associated with very high-end recording
Direct-to-Computer Recorders
- Requires a computer and a professional-level audio interface (A/D converter) with good microphone preamps. Internal, manufacturer-supplied sound cards and on-board microphone inputs on a computer are not recommended for interviewing.
- Large capacity allows for longer recording time and higher recording quality
- No need to upload until making a backup
- Record at very high quality settings
- File-based workflow
- Can yield noisy recordings because laptops are often extremely loud (fans, HDD)
- Requires audio-editing software
Compact Disc (CD-R) Recorders
- CD-Rs are accessible and inexpensive
- CDs are currently dominant in commercial market (but slipping)
- Limited to 80 minutes of recording and limited to 16-bit/44.1 kHz
- Familiar paradigm: record to media, remove media when finished, place media in a case when interview completed, label case, and store as master copy
- Larger recorders
- Many movable parts
- Must finalize discs before ejecting (usually takes 4 minutes)
- At “CD quality,” these recorders cannot record mono. (I am not recommending mono recording, just pointing this out.)
Recorder Inputs
Portable recorders possess a variety of input types:
XLR inputs are the highest quality analog inputs. The connection is a “balanced” signal. A balanced connection enables the linking of analog audio devices, including microphones, to a recorder through impedance-balanced cables. Usually associated with professional-level audio equipment, these allow for longer cable lengths and reduce the addition of external noise to the signal. Balanced cables have either XLR or TRS plugs. Professional-level digitization will usually involve balanced outputs on the analog playback device and balanced inputs on the analog-to-digital converter. XLR cables transport a mono signal. Stereo recorders with XLR inputs will contain two XLR inputs. Single Point Stereo microphones will contain a modified version of XLR. In certain cases, this can be a 5-pin connector that splits into two XLR male cables.
A ¼ in. TRS (tip-ring-sleeve) input uses the same balanced technology as the XLR connector, but in the form of a ¼ in. connector. The balanced TRS connector carries a mono signal and requires balanced inputs. This allows for greater compatibility with higher-end recorders. A ¼ in. balanced input is fairly rare in portable digital audio recorders.
The 1/8 in. mini-plug inputs are typically associated with lower-end recorders. On the connector, one ring signifies a mono plug and two rings signify a stereo plug. The mini-plug’s advantage is portability, but the preamps associated with this input type are usually sub-par. The mini-plug can be temperamental because it does not lock into the microphone input jack and because touching the plug while it is connected can cause static.
The Process of Digital Recording
A microphone converts sound waves into an electrical current that is analogous to the frequency and amplitude of the original sound wave.
The digital recorder’s microphone preamps boost the weak electrical signal coming from the microphone.
The digital recorder converts the analog signal into digital data (analog-to-digital conversion).
The quality of digital audio is measured by the following parameters:
Recorder Settings
Sample Rate is the number of samples or “snapshots” taken of the signal each second and is measured in hertz (Hz). The higher the sample rate (the more samples per second), the better the digital representation will be. CD quality equals 44,100 samples per second, or 44.1 kHz, and is the minimum recommended sample rate for field recordings.
Bit Depth refers to the number of bits used to represent a single sample. For example, 16-bit is a common sample size. While 8-bit samples take up less memory (and hard disk space), they are inherently noisier than 16-bit or 24-bit samples. The higher the bit depth, the better the recording; however, higher bit depths also lead to larger file sizes.
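The link between bit depth and noise can be illustrated with a minimal sketch. It rounds a test tone to the grid of values available at a given bit depth and reports the largest rounding error. The function names, the 0.9 amplitude, and the 1,000-sample test tone are assumptions made for illustration; real converters are more sophisticated than simple rounding.

```python
import math

def quantize(value, bit_depth):
    """Round a value between -1.0 and 1.0 to the nearest step at this bit depth."""
    steps = 2 ** (bit_depth - 1)  # a signed sample spends one bit on the sign
    return round(value * steps) / steps

def max_quantization_error(bit_depth, num_samples=1000):
    """Largest rounding error over one cycle of a test tone (hypothetical test signal)."""
    worst = 0.0
    for n in range(num_samples):
        x = 0.9 * math.sin(2 * math.pi * n / num_samples)
        worst = max(worst, abs(x - quantize(x, bit_depth)))
    return worst

for depth in (8, 16, 24):
    print(depth, "bit: worst-case error", max_quantization_error(depth))
```

Each additional bit halves the step size, so the rounding error (heard as noise) shrinks as bit depth grows.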
Bit Rate refers to a measurement of digital audio based on the following equation and is usually expressed in kilobits per second (kbps):
Bit rate = (bit depth) x (sampling rate) x (number of channels)
Again, CD quality equals 16-bit and should be the minimum bit depth used for field recordings. As flash-based storage media dramatically decrease in cost, 24-bit recording is beginning to catch on in the field. A bit depth of 24 bits is also becoming the standard for archival-quality analog-to-digital conversion, although many repositories and individuals continue to digitize using 16-bit.
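The bit rate equation above can be worked through in a few lines. This is a sketch for uncompressed audio only; the function name is hypothetical, and one kilobit is taken to be 1,000 bits.

```python
def pcm_bit_rate_kbps(bit_depth, sample_rate_hz, channels):
    """Bit rate = (bit depth) x (sampling rate) x (number of channels), in kilobits/second."""
    return bit_depth * sample_rate_hz * channels / 1000

cd_quality = pcm_bit_rate_kbps(16, 44_100, 2)   # 16-bit/44.1 kHz stereo
high_quality = pcm_bit_rate_kbps(24, 96_000, 2)  # 24-bit/96 kHz stereo

print(cd_quality)    # 1411.2
print(high_quality)  # 4608.0
```

For comparison, compressed formats are typically offered at far lower bit rates, which is why they take up so much less space.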
Channels
Single-channel recording is known as monaural or mono recording. Stereo recording involves the recording of two channels (left and right). In interviewing situations, the two channels associated with stereo recording allow the interviewer and the interviewee to each be isolated on a separate channel. This gives the recording a spatial dimension missing in mono recording and enables a listener or transcriber to isolate a single voice in playback. Single-point stereo microphones provide left-right separation but do so from a single source, yielding much less sound isolation. Stereo recording doubles the data footprint (or file size) of your recording. Some recorders are capable of using a stereo microphone setup but recording in mono by “mixing” the two channels together. The resulting file is half the file size of a stereo recording, but the channel isolation will be lessened. Some portable recorders will not allow you to record mono files.
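The idea of “mixing” two channels into one can be sketched as follows. Averaging the left and right samples is an assumption made for illustration; a given recorder may combine the channels differently. The function name and sample values are hypothetical.

```python
def mix_to_mono(left, right):
    """Combine two channels into one by averaging each pair of samples."""
    if len(left) != len(right):
        raise ValueError("both channels must contain the same number of samples")
    # Integer division rounds down, which keeps the result a whole-number sample.
    return [(l + r) // 2 for l, r in zip(left, right)]

interviewer = [1000, -2000, 300]  # left channel (hypothetical sample values)
interviewee = [3000, 2000, 100]   # right channel
mono = mix_to_mono(interviewer, interviewee)
print(mono)  # [2000, 0, 200]
```

The mono result holds one sample where the stereo recording held two, which is where the halved file size comes from. It also shows why isolation is lost: once averaged, the two voices cannot be separated again.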
Recording Quality and Compression
Digital recorders provide a variety of parameters regarding recording quality. Field recordings should be recorded without compression. Compressed recording formats are usually described by bit rate; for uncompressed audio, the bit rate is calculated from bit depth, sample rate, and the number of channels being recorded.
Lossy compression (such as MPEG compression) involves an algorithm that applies psychoacoustic principles to determine what can be taken out of an audio signal in order to make the file smaller. This involves degrading the signal. Although these algorithms make compression more and more imperceptible, derivative files made from a compressed master recording will be extremely low-quality and contain digital artifacts. Compounding compression will seriously degrade audio quality. Flash and HDD storage have dropped dramatically in price, so today there is no financial need to make a compressed recording. The most common compressed format for audio is MP3 [2]. Marantz has used MP2 on some of its recorders.
For archival purposes, uncompressed is the best way to record. Compressed audio is best utilized for creating Web-deliverable files, not for recording the original interview.
File Formats
Most portable field recorders utilize a technique called pulse code modulation (PCM) for digital conversion of an uncompressed analog signal. This digital data is then saved as a data file.
Uncompressed
The standard file formats associated with uncompressed recordings are Wave (.wav), Broadcast Wave, and AIFF (.aiff). Portable field recorders normally utilize Wave files. AIFF files usually are associated with Mac computer applications. Since both are uncompressed, the quality is the same. The Wave file is the more common file format that professional and prosumer portable field recorders utilize. Wave, unlike AIFF, enables the incorporation of metadata into the digital file itself. Many low-end consumer recorders (ones designed for dictation and usually purchased at office supply stores) record in a proprietary compressed format that can only be accessed using a proprietary software package. Because format compatibility and interoperability are of the utmost importance from an archival perspective, these low-end consumer recorders are not recommended for recording oral histories.
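To make the structure of an uncompressed Wave file concrete, the sketch below writes one second of 16-bit/44.1 kHz stereo PCM audio with Python's standard `wave` module and then reads the settings back. The 440 Hz test tone and the in-memory buffer are assumptions made for illustration; a recorder would write microphone data to its storage media instead.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100  # samples per second (CD quality)

# Build one second of a 440 Hz tone at half of full scale, identical on both channels.
frames = bytearray()
for n in range(SAMPLE_RATE):
    value = int(0.5 * 32767 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
    frames += struct.pack("<hh", value, value)  # left sample, right sample

buffer = io.BytesIO()  # stands in for a file on the recorder's media
with wave.open(buffer, "wb") as wav_out:
    wav_out.setnchannels(2)            # stereo
    wav_out.setsampwidth(2)            # 2 bytes per sample = 16-bit
    wav_out.setframerate(SAMPLE_RATE)
    wav_out.writeframes(bytes(frames))

buffer.seek(0)
with wave.open(buffer, "rb") as wav_in:
    params = wav_in.getparams()

print(params.nchannels, params.sampwidth * 8, params.framerate, params.nframes)
# 2 16 44100 44100
```

The channel count, bit depth, and sample rate are stored in the file header, which is how playback software knows how to interpret the PCM data that follows.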
Compressed
MPEG recording will dramatically decrease your data footprint and, thus, increase your recording time. The compression, however, degrades your recording quality. MPEG is ideal for placing audio files on the Web. Indeed, MP3 has become a standard compressed format accepted by almost all computer players. MP3 files are usually measured by bit rate rather than by sample rate and bit depth. More proprietary audio compression codecs include Windows Media Audio and RealAudio. There are several “lossless” compression codecs available, including FLAC (Free Lossless Audio Codec) and the Apple Lossless Audio Codec (ALE/ALAC). At this time, however, most recorders do not include these as a recording option.
File Size
File sizes for recordings are calculated by combining bit depth, sample rate, channels, and recording time.
16-bit/44.1 kHz/Stereo/.wav = 635.04 MB per hour
24-bit/96 kHz/Stereo/.wav = approximately 2.07 GB per hour
A file size calculator can be very useful for determining the amount of storage you will need, both for interviewing and for archiving your interviews to servers, external HDDs, digital tape backup, or CDs and DVDs. There are numerous variables entailed with compressed audio file size calculation including bit rate and the compression “codec” used to encode the file. As a general rule, choosing mono over stereo cuts file size in half. Unfortunately, the spatial dimension and separation of the original stereo recording will be lost if converted to mono.
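A basic file size calculator for uncompressed recordings can be sketched in a few lines. The function name is hypothetical, one megabyte is taken to be 1,000,000 bytes, and the small file header is ignored.

```python
def pcm_file_size_bytes(bit_depth, sample_rate_hz, channels, seconds):
    """Approximate size of an uncompressed recording, ignoring the file header."""
    bytes_per_sample = bit_depth // 8
    return bytes_per_sample * sample_rate_hz * channels * seconds

ONE_HOUR = 3600  # seconds

cd_stereo = pcm_file_size_bytes(16, 44_100, 2, ONE_HOUR)
cd_mono = pcm_file_size_bytes(16, 44_100, 1, ONE_HOUR)
high_stereo = pcm_file_size_bytes(24, 96_000, 2, ONE_HOUR)

print(cd_stereo / 1_000_000)    # 635.04 (MB per hour)
print(cd_mono / 1_000_000)      # 317.52 (MB per hour)
print(high_stereo / 1_000_000)  # 2073.6 (MB per hour)
```

Multiplying the per-hour figure by the expected number of interview hours gives a rough estimate of the storage needed for a project, before backups.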
Recording Levels (For more detail on this topic see the OHDA essay Achieving Good Audio Recording Levels, or visit the video tutorial Digital Audio Recording: Recording Levels.)
- You want to record as strong a signal as possible
- You do not want your recording levels to clip
- Higher bit depth recording is more forgiving when boosting the levels of a low-level recording.
- Locate a comfortable range for your peaks (usually between -12 dB and -6 dB) at the beginning of the interview. A recording that has average peaks under -16 dB will normally have a greater amount of noise when the levels are boosted to optimal levels. I prefer to stay away from averaging in the -3 dB range because of the unpredictability of an interview. If clipping occurs, don’t panic; gently back the levels down.
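The peak figures above can be related to the sample values in a file with a short sketch. It assumes 16-bit samples and treats the dB readings as levels relative to digital full scale, which is how the meters on digital recorders are generally marked. The function names, category labels, and sample values are hypothetical.

```python
import math

FULL_SCALE_16_BIT = 32768  # largest magnitude a signed 16-bit sample can hold

def peak_db(samples):
    """Peak level of 16-bit samples in dB relative to full scale (0 dB = maximum)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(peak / FULL_SCALE_16_BIT)

def describe_peak(level_db):
    """Classify a peak reading against the -12 dB to -6 dB target range."""
    if level_db > -0.1:
        return "at or near full scale: possible clipping"
    if level_db > -6:
        return "hot: little headroom left"
    if level_db >= -12:
        return "in target range"
    return "low: boosting later will raise noise"

level = peak_db([8192, -16384, 4096])
print(round(level, 2), describe_peak(level))  # -6.02 in target range
```

A peak at half of full scale sits at about -6 dB, which is why that range leaves room for an unexpectedly loud moment in an interview.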
Manual level control
Manual level control involves the operator adjusting the levels by use of the input level or recording level controls. When recording with manual level control it is best to use a limiter to protect against clipping.
Limiter
A limiter sets a threshold above which the signal will be gently pushed down in order to prevent clipping. This is preferred over ALC or AGC (see below) as it allows the operator to set optimal levels and minimizes noise while still protecting the recording from clipping. Limiters are not foolproof, however, and good levels must still be determined by the operator.
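The threshold behavior described above can be illustrated with a highly simplified sketch. It scales down only the portion of each sample that exceeds the threshold. The threshold, the ratio, and the function name are assumptions made for illustration; the limiters built into recorders track the signal over time and will not match this per-sample model.

```python
import math

def limit_sample(sample, threshold=0.5, ratio=10.0):
    """Reduce the part of a sample (full scale = 1.0) that exceeds the threshold."""
    magnitude = abs(sample)
    if magnitude <= threshold:
        return sample  # below the threshold the signal passes unchanged
    limited = threshold + (magnitude - threshold) / ratio
    return math.copysign(limited, sample)  # restore the original polarity

for value in (0.25, 0.8, 1.0, -1.5):
    print(value, "->", round(limit_sample(value), 3))
```

In this model a peak that would have exceeded full scale (such as -1.5) comes out well below it, while quiet passages are left untouched. Unlike automatic level control, nothing is boosted, so background noise is not raised.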
Automatic Level Control/Automatic Gain Control (ALC/AGC)
ALC and AGC are circuits in a recorder that determine an average optimal level. Use of one of these will minimize the risk of clipping but will typically not produce as high a quality of recording as manual level control, because they boost quiet moments in the recording up to record level and thus boost background noise.
It is also recommended that operators use headphones, at least at the beginning of the interview, to monitor the sound quality of the recording. Level meters will not indicate a minor buzz or extra noise from a ceiling fan or an air conditioning unit that you may not have noticed during setup. A microphone can greatly exaggerate noise unnoticed by the ear; this can often be rectified by a different (typically closer) microphone placement (see microphone section). Monitoring through headphones at the beginning of a recording can greatly improve sound quality and, on occasion, prevent the loss of an entire recording.
Final Thoughts on Digital Recording
- Use AC power if available, but keep freshly charged batteries in the recorder during recording. The accidental unplugging of digital recorders during an interview can result in the loss of the digital file. The batteries can serve as a backup.
- If possible, use a good external microphone. The better the microphone, the better the quality of the recording. An inexpensive field recorder with a good microphone can produce a better recording than an expensive field recorder with a poor built-in microphone. Although some built-in microphones now produce much better sound than they used to, external microphones continue to yield a better recording. Recorders with on-board microphones must be strategically placed between the interviewer and interviewee. The farther a microphone is placed from the sounds to be recorded, the noisier the recording.
- Record as uncompressed Wave files at a minimum quality setting of 16-bit/44.1 kHz.
- Setting record levels manually with use of a limiter requires some practice. But, once you have become comfortable with it, it will yield a higher quality recording than one produced with use of an automatic level control. Pay attention to recording levels throughout the interview. Recorders with limiters are the easiest way to protect an interview from clipping.
- Turn your cell phone off during an interview. A cell phone in vibrate mode may not ring, but cellular phones periodically “handshake” with the local tower. This signal can often interfere with recorders, resulting in unwanted noise on the recording.
- Do not be penny wise and pound foolish. Inexpensive field recorders that use cheap components produce poor recordings, especially when not used with an external microphone. Digital recorders with inexpensive preamps will yield noisier recordings than more professional-grade recorders, which nowadays are not that much more expensive. Better recorders are getting less expensive and can now be purchased new at prices starting at $200. OHA will continue to monitor recorder qualities and make “best practice” recommendations. Consult the “Recorders” section of the Web site or consult online resources for up-to-date recommendations.
- Always back up your recordings!
Citation for Article
APA
Boyd, D. A. (2012). Digital audio recording: the basics. In D. Boyd, S. Cohen, B. Rakerd, & D. Rehberger (Eds.), Oral history in the digital age. Institute of Museum and Library Services. Retrieved from https://ohda.matrix.msu.edu/2012/06/digital-audio-recording/.
Chicago
Boyd, Douglas A. “Digital Audio Recording: The Basics,” in Oral History in the Digital Age, edited by Doug Boyd, Steve Cohen, Brad Rakerd, and Dean Rehberger. Washington, D.C.: Institute of Museum and Library Services, 2012, https://ohda.matrix.msu.edu/2012/06/digital-audio-recording/
This is a production of the Oral History in the Digital Age Project (https://ohda.matrix.msu.edu) sponsored by the Institute of Museum and Library Services (IMLS). Please consult https://ohda.matrix.msu.edu/about/rights/ for information on rights, licensing, and citation.