Carpenters learn to “measure twice, cut once”. Ben Franklin advised us to “plough deep while sluggards sleep”. Well, I don’t have any catch phrases for audio production, except maybe “one coffee prevents too sloppy”, which probably isn’t memorable. Still, there is a need to answer questions I am asked repeatedly as editors and post production supervisors prepare for audio mixes.
As non-linear systems have become more technically proficient, the audio pipeline from source material to final mix has changed completely. These days audio edits from the “off-line” cut usually become part of the source audio for a final mix. While I still believe preparing audio for a mix is best left to those of us who do it daily, it is a current trend that audio preparation is included as part of the picture edit. So, in cases where the schedule cannot allow a separate audio preparation step or the program simply doesn’t need it, I am offering some guidelines which will help an editor make edit and track layout decisions which benefit the final mix. Thus is born Richard’s Audio Prep Guidelines.
I have been consistently amazed by the results of this paper, but it is clear that some people pay no attention to my suggestions whatsoever or consistently do the exact opposite. That’s human nature, I suppose. Please, please read and understand the section on track layout and checkerboarding of audio edits. It will take 30 minutes out of your workday. When you consider that you may be saving an hour or more out of every mix by following my suggestions, it seems like a pretty good deal.
Read Me If Nothing Else, a quick overview
I consider 8 tracks to be a minimum reasonable number for delivery to a mix, and I suggest the following layout as a starting point for you. Alter it as needed.
1. Sync sound/Voice over
2. Sync sound/Voice over
3. Sync sound/B roll
4. Sync sound/B roll
5. B roll/Music
6. B roll/Music
7. Music
8. Music
Fig. 1, 8 track layout
Track layout must be altered to suit each particular job or scene, but please try to stick to some kind of layout scheme! A random interspersing of edits among tracks is difficult to follow and work with. Please notice that stereo elements should be on “odd-even” pairs such as 5/6 or 7/8. A pair such as 6/7 is possible, but such “even-odd” pairs should be avoided. Most audio equipment is set up for “odd-even” pairing.
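The odd-even rule is easy to check mechanically. Here is a minimal sketch in Python; the function name is my own invention for illustration and does not come from any editing system.

```python
def is_odd_even_pair(left_track: int, right_track: int) -> bool:
    """Return True for an adjacent odd-even stereo pair such as 5/6 or 7/8."""
    # The left channel sits on an odd track, the right on the next track up.
    return left_track % 2 == 1 and right_track == left_track + 1


for pair in [(5, 6), (7, 8), (6, 7)]:
    verdict = "ok" if is_odd_even_pair(*pair) else "avoid"
    print(f"{pair[0]}/{pair[1]}: {verdict}")
```

A pair such as 6/7 is flagged as "avoid", which matches the advice above: possible, but not what most equipment expects.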
Make intelligent checkerboarding choices. What is checkerboarding? It is thoughtful separation of dissimilar audio material onto different tracks. During the final mix, these different tracks can be processed differently to achieve a smooth and cohesive result. Brainless checkerboarding of every edit can be as bad as checkerboarding none of them. If handles are available at checkerboard points, please include them. For information about handles and why and when to checkerboard, read on further. A common situation which complicates the mix needlessly is when edits have been checkerboarded without paying attention to whether sound is foreground or background. This can lead to tracks which contain, for instance, tightly interspersed interview and B-roll/effects. These often must be separated into a workable arrangement during the mix.
When digitizing into the nonlinear editor, remember that your audio will usually become part of the final audio. It is important that “clipping” (red lights) during digitizing not be allowed. Clipping causes an ugly distortion which can never be removed. Also, please do not combine original source audio tracks. Keep them separate, or if they are both identical, you may choose to digitize only one of them. Take particular care, when digitizing through a mixing board, that the editing system’s audio outputs do not feed its own audio inputs. This may or may not cause a very audible feedback howl, but it will certainly cause a particular audio “coloration” which is immediately identifiable to the practiced ear, and can never be fixed except by re-digitizing. By all means use a standard edit room and mixer setup for consistency and keep notes about how material was digitized. By keeping notes, you can repeat those settings if you return to that source tape for more material later on, and the new material will then edit into the old seamlessly. This is a VERY common situation, and a problem easily avoided.
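If you want to double-check digitized material rather than rely on the red lights alone, clipping can be spotted roughly in the sample data itself. The sketch below is a heuristic of my own, not a feature of any editing system: it assumes signed integer samples and treats three or more consecutive samples pinned at full scale as clipping. Both the function name and the run length of three are assumptions.

```python
def find_clipped_runs(samples, bit_depth=16, min_run=3):
    """Return (start_index, length) for each run of samples pinned at full scale.

    Heuristic assumption: a run of min_run or more full-scale samples
    indicates clipping rather than a single legitimate peak.
    """
    full_scale_pos = 2 ** (bit_depth - 1) - 1   # 32767 for 16-bit audio
    full_scale_neg = -(2 ** (bit_depth - 1))    # -32768 for 16-bit audio
    runs = []
    run_start = None
    for i, sample in enumerate(samples):
        pinned = sample >= full_scale_pos or sample <= full_scale_neg
        if pinned and run_start is None:
            run_start = i
        elif not pinned and run_start is not None:
            if i - run_start >= min_run:
                runs.append((run_start, i - run_start))
            run_start = None
    # Handle a run that continues to the very end of the data.
    if run_start is not None and len(samples) - run_start >= min_run:
        runs.append((run_start, len(samples) - run_start))
    return runs


clean = [0, 12000, 30000, 12000, 0]
hot = [0, 20000, 32767, 32767, 32767, 32767, 20000, 0]
print(find_clipped_runs(clean))  # []
print(find_clipped_runs(hot))    # [(2, 4)]
```

Any run it reports is a reason to lower the input level and re-digitize, since the distortion cannot be repaired afterwards.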
A Few Basic Mix Thoughts
This is probably obvious, and I don’t mean to talk down to anyone, but a really good job of audio preparation demands a basic understanding of what happens during a final mix. Audio editors must be able to think like a mixer and anticipate the problems. For programs whose audio is largely a collection of on-location recordings (“natural sound”) my mixing job can be defined by four steps. In order of importance, they are:
1. Balancing of foreground sound against background sound and music.
2. Processing for highest spoken word clarity.
3. Processing to reduce unwanted ambient noises.
4. Smoothing out of distracting ambient noise changes from edit to edit.
Fig. 2, mix priorities
In reality, step one (balancing of foreground sound against background sound and music) cannot be accomplished properly until the other three steps have been completed, but it is still the most important step. During this process, I must also separate English dialog and music from all other elements, in order to satisfy standard international delivery requirements.
Track Attack
A final audio mixdown often has two, three, four (or more) different sound elements playing simultaneously, like an on-camera interview with additional background ambiance and music underneath. Commonly many more than three or four tracks are used to organize these elements prior to a final mix, for several reasons. First, we humans like information which is logically organized. For instance, during the mix I want to trust that edits placed on “music tracks” will probably be music. It is a really convenient thing. See the “Track Grouping” section for more info.
Another reason to use a number of tracks is to allow alternate versions of something to exist for later selection. Two performances of the same dialog line would obviously require two tracks. Or perhaps a couple of different music selections or sound effects could be played side by side.
You may want to overlap or play simultaneously two different sounds together. For example, if you were editing a scene with a dramatic argument, you might want to have one actor “cut off” the other, or have them both talk simultaneously, even if the scene wasn’t originally shot that way. Using two or more tracks allows you to do this.
A fourth and very important reason (to me) for placing sound edits on several tracks is to make my job easier when I use processing tools.
Signal Processing
Steps 2 and 3 of my mixing job (attaining highest clarity and reducing unwanted ambient noise) rely on audio tools known collectively as processors. One of those, the equalizer, is my first tool to reduce background noise, enhance clarity, control sibilance, etc. An equalizer is a collection of audio tone controls, kind of like your home music system’s bass and treble controls on steroids. Generally speaking there is one equalizer collection per track of audio, which is set by ear for each and every piece of audio on that track, one piece at a time. In addition to equalizers I also use compressors/limiters, de-essers, and other sound manipulators, each adjusted to work with the others. As you might expect, vast sound differences from edit to edit require vastly different processor settings to make a similar sounding result. Track layout, that is, which sound edits are placed on which tracks at what time, can directly affect a mixer’s efficiency at adjusting processors during the mixdown. If you don’t understand why, keep reading. In fact, please keep reading anyway!
Checkerboarding
Checkerboarding is the technique of intelligently placing sound edits one after another on separate tracks, with or without an overlap. There are primarily two times when edit checkerboarding is crucial. The first and most obvious is when an actual overlap or crossfade between sounds is desired. The second is to aid the mixer in smoothing sound changes from edit to edit (step 4 on my list). The way track layout can be most useful to the smoothing process is often misunderstood, causing many track layout mistakes.
It stands to reason that if a number of sound edits came from the same location and were recorded at approximately the same time of day, there is a very good chance that they will require similar processor settings. Similar sounding edits grouped onto the same track(s) can help reduce mix time. Remember that mixing is generally a linear process. During the mix, as each new edit comes along, the processing for that track is already set for whatever was previously on that track. If the new edit’s qualities are closely related to the previous edit’s, then little or no processor readjustment is needed. Cool! If, for instance, you are cutting a scene which contains indoor and outdoor interviews, it would be best to put the indoor interviews on a different track than the outdoor interviews since each location will probably require very different processor settings. If one of the indoor locations was by an air conditioner vent, and the others were in a quiet room, the air conditioner portions would ideally go onto a third track. During the mix, processing can be set for each track. The second and third outdoor edits might sound just fine by using the settings from the first outdoor edit, or perhaps require only small “tweaks”. And with clever edit layout, that track’s processor settings would not have to be altered to accommodate an intervening indoor interview. (These rules tend to apply no matter whether the mixer is using older analog equipment or modern digital methods.)
This raises the question, “If several people are interviewed at the same location, do they all get the same processing and would they all be placed on the same track?” I’m afraid there is no pat answer. Often, assigning edits to tracks using broad criteria such as indoor/outdoor, traffic/birds or noisy/quiet is sufficient. While it is likely that different people in the same location would get similar processing, if not exactly the same, there is no guarantee. You can decide how to assign edits to tracks based upon how many tracks you have available and how different the edits sound to you. Trust your judgment.
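The indoor/outdoor/air conditioner example can be written out as a small sketch. It is only an illustration of the idea, not a replacement for your ears: the ambience labels, the function name, and the rule of sharing the last track when tracks run out are all my assumptions.

```python
def assign_tracks(edits, available_tracks):
    """Place edits (in timeline order) so each ambience keeps its own track.

    edits: list of (name, ambience_label) tuples.
    available_tracks: track numbers set aside for this kind of material.
    When there are more ambiences than tracks, the extras share the last
    track, which is a compromise rather than an ideal layout.
    """
    track_for_ambience = {}
    layout = []
    for name, ambience in edits:
        if ambience not in track_for_ambience:
            index = min(len(track_for_ambience), len(available_tracks) - 1)
            track_for_ambience[ambience] = available_tracks[index]
        layout.append((name, track_for_ambience[ambience]))
    return layout


scene = [
    ("interview A", "indoor quiet"),
    ("interview B", "outdoor"),
    ("interview C", "indoor quiet"),
    ("interview D", "indoor air conditioner"),
    ("interview E", "outdoor"),
]
for name, track in assign_tracks(scene, available_tracks=[1, 2, 3]):
    print(f"track {track}: {name}")
```

With three tracks available, the quiet indoor edits land on track 1, the outdoor edits on track 2, and the air conditioner edit on track 3, so each track's processing can stay put from one edit to the next. Consecutive edits with the same label stay on the same track, which agrees with the advice not to checkerboard when nothing changes.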
Is there a time when checkerboarding is NOT desired? ABSOLUTELY YES! If you are editing an interview, for instance, and at an edit point there is no change in background ambiance or voice quality, then do not checkerboard. This is common with a quiet setting, and less common during exterior scenes, due to undulating traffic, airplanes, dogs, and the like. I generally recommend that if you are listening on a reasonably good playback system (full range speakers or headphones in a quiet room), then the rule of thumb is if an edit sounds perfect to you there is probably no reason to checkerboard it.
A common situation is when an on-camera sound is featured, and is then lowered to make way for an entering off-camera voice-over. Start the voice-over on another track, please, while extending the sound-to-be-lowered on the same track (if possible).
Following checkerboard rules can lead to a trap. You should not allow foreground and background sounds to cohabitate on the same tracks during a scene. It is particularly difficult to work with an interview that is checkerboarded onto two or three tracks if there is also some B-roll sound interspersed onto some of the same tracks “in the holes”.
Handles for Smoothness
As you can see, it is very helpful within a scene to put different quality sounds on different tracks in order to facilitate processing. If these two different-quality sounds are contiguous (no silence between them), checkerboarding is crucial. And if there is a small amount of overlap (one track starts playing before another stops), then it is even easier to connect the sounds during the mix in a smooth and pleasing manner. These overlaps or “handles” can be as short as a frame, if that is all that can be given without including unwanted material. Handles of 10 frames or more are better when possible. Handles can be created where none exists naturally by finding a short bit of identical ambiance elsewhere and cutting it onto the desired piece of audio. Don’t create artificial handles if it can’t be done perfectly! This technique requires both the edit and the ambiance match to be absolutely perfect, a difficult if not impossible chore without subframe editing resolution. I have too often gotten edits where this was attempted with ambiance that not only didn’t match, but probably came from an entirely different location. Let me repeat. Don’t create artificial handles if it can’t be done perfectly! This is a common mistake which takes as much time to undo as it does to fix. See the upcoming “Handles for Delivery” section for more pleasurable reading on this.
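To give a sense of scale for handle lengths, here is a rough conversion from video frames to time and audio samples. The frame rate is my assumption (NTSC, 30000/1001 frames per second), as is the 48 kHz sample rate; substitute your own values.

```python
from fractions import Fraction

# Assumed frame rate: NTSC video, roughly 29.97 frames per second.
NTSC_FPS = Fraction(30000, 1001)


def handle_length(frames, sample_rate=48000, fps=NTSC_FPS):
    """Return (milliseconds, samples) for a handle of the given frame count."""
    seconds = Fraction(frames) / fps
    samples = seconds * sample_rate
    return float(seconds * 1000), float(samples)


for frames in (1, 10):
    ms, samples = handle_length(frames)
    print(f"{frames:2d} frame(s): {ms:.1f} ms, {samples:.1f} samples")
```

Under these assumptions a single frame covers about 1,600 samples, which is why an edit point that can only land on a frame boundary is a coarse tool, and why even a one-frame handle gives the mixer useful room to work.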
Grouping of Tracks and Track Layouts
The tracks themselves may be conceptually grouped together for additional efficiency and organizational clarity. An editor can generally create intelligent groups of tracks which apply to individual scenes or the entire show. For editing, I define four main sound groups, which usually apply to entire programs. They are:
1. Voice over, ADR, and other clean studio voice recordings.
2. Natural or “sync” sound, i.e. on-set dialog and interviews.
3. Sound effects, B-roll, and foley.
4. Music.
Fig.3, Four Sound Groups
As the mix is performed, sound elements in these four groups generally receive equalization and other processing which is appropriate for that group only. For instance, I process voice over in ways that may not be appropriate for music or sync sound. If an editor gives me audio edits grouped intelligently in these ways, my job is easier, and that can mean less time spent for a better mix.
Suggested Track Layouts
Ideally, a sound editor will lay out audio tracks which keep grouping and checkerboarding techniques in mind. Here is a suggestion for laying out a documentary show to 8 tracks which may contain mono and stereo material.
1. Sync sound/Voice over
2. Sync sound/Voice over
3. Sync sound/B roll
4. Sync sound/B roll
5. B roll/Music
6. B roll/Music
7. Music
8. Music
Fig. 4, 8 track documentary layout
The actual order of tracks is not important. You will notice that there are at least two sets of tracks for each category, to allow for checkerboarding. The idea is to keep like-material on the same groups of tracks. But what about tracks 5 & 6 which seem to break the rules? Normally it would not be desirable to place non-music material onto music tracks, but where track space is limited such compromises must be made. Experience has shown that this particular compromise often works well. Another compromise is an unavoidable interspersing of material between recurring elements on a track. For instance, a voice over will often be a clean studio recording which will have a different sound quality from everything else. It is acceptable if other material is also on that track, if there is simply nowhere else for the other material to go, but please try to keep a bit of silence separating vo and non-vo sound. It is also okay if the voice over must change tracks between sections or scenes, but please try to keep a section of vo together on the same track. Also apply this guide to any other important recurring element.
Unfortunately, we sometimes have to deal with even fewer tracks, as some older Avid systems are still limited to working reasonably with 4 tracks at a time, so more compromises must be made. In such cases, it is probably best to ignore all checkerboarding and handles rules, deliver the program to us as OMFI files, and allow us to do fine audio editing and checkerboarding. This is obviously a budgetary decision.
Here is a guideline for 4 track layout. Again, the idea is to keep recurring elements confined to the same tracks. If you must use track 1 for sync as well as voice over, keep a bit of silence between them each time.
1. Voice-over & sync
2. Sync and B-roll
3. Sync and B-roll
4. Music
Fig. 5, 4 track documentary layout
My track layout suggestions are just that – they are only suggestions.
Handles for Delivery
Handles refer to the sound that is immediately preceding or following an edit’s desired material. These handles are often very important to me. Which method you use to deliver sound to me can affect how you should treat them in your preparation. Delivery via OMFI export computer files will automatically include these handles, so you do not need to think much about them. If you are delivering on tape or with non-OMFI file formats (.WAV, AIFF, or SD2) I will only get what you choose to give me, and I would rather have too much than too little (within reason). Most current video editing systems (including Avid) do not have subframe editing capability, so if you are trying to edit into a spoken phrase and you cannot pick a perfect point to edit in or out, then choose to give me a bit more sound than is wanted. In other words, start an edit earlier rather than later. Likewise, pick an edit-out point that is later to include a bit of extra material. A single frame extra will usually be enough. I have subframe capability and can work with such problems more easily. For example, imagine that you are editing an interview and want to use a phrase that begins with “and I think”, but you would rather pick it up with “I think”. You have little choice but to pick an edit point that will be somewhere in the middle of the continuous sound “and I”. This invariably results in a pop or click at the very beginning of the edit, and this click may or may not sound okay in the edit room, but rest assured it will not sound okay in the mix room! In such a case, I suggest that you choose to pick up the edit at least 1 frame earlier (i.e. include a bit of the “and”) and let me find the best point. I can make finer edits, and I have different options for removing the click at the same time. Again, this is a less important consideration if you are delivering to me via OMFI export because I can always pull out the handle myself and get more.
Having said that, I would still rather you include more (if convenient for you) so that I may hear the handle upon first listen rather than be forced to pull it out to find what else may or may not exist.
Good sound editors know that clicks are created when you edit into ANY sound that is not silence, whether it is music, spoken words, ambiance or whatever. Sometimes the clicks aren’t objectionable, but they are always there. Please be careful. When I remove these clicks I also must remove a small portion of possibly desirable sound. It is better if you give me a bit too much and let me find a perfect edit point. Again, with any delivery method other than OMFI export, this is crucial. So, at the risk of monotonous repetition, if an edit is problematic, simply include a little more sound than you really want to keep.
Avid Add Edit command causes Pro Tools audio problems
There is an editing technique which is very common and can cause real headaches in the mix. It is the practice of creating edits in a clip and adjusting gains of the resulting subclips at points where music/B roll sound is to be faded up or down. Crossfades are often applied to smooth the changes. This is a very effective trick to roughly balance audio during your edit. The problem comes after the audio has been exported via OMFI to mix within Pro Tools. There are differences in the way Avid and Pro Tools perform crossfades. Even though Pro Tools can perform many, many different styles of crossfades, during the translation process from OMFI to Pro Tools a standard “equal power” crossfade is always used. Equal power crossfades work very well for most material (and can always be changed manually at each edit), but have the unfortunate side effect of causing a volume “bump” when crossfading across “add edits”. This “bump” only occurs when crossfading between two pieces which are exactly the same, which is what happens when the Add Edit command is used and then a crossfade applied.
If time permits, all “add edits” should be removed before audio is exported via OMFI. If the actual edit cannot be removed, at least remove the crossfade associated with it. I must remove them if they are present.
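The arithmetic behind the “bump” can be sketched in a few lines. This is an illustration only: I am assuming the common sine/cosine form of an equal power curve (the exact curve a given system uses may differ) and identical audio at equal gain on both sides of the edit, which is the add edit case.

```python
import math


def combined_level_db(position, curve):
    """Level in dB when identical audio sits on both sides of a crossfade.

    position runs from 0.0 (start of the fade) to 1.0 (end of the fade).
    """
    if curve == "equal_power":
        # Assumed sine/cosine law; real systems may use a different curve.
        fade_out = math.cos(position * math.pi / 2)
        fade_in = math.sin(position * math.pi / 2)
    elif curve == "linear":
        fade_out = 1.0 - position
        fade_in = position
    else:
        raise ValueError(f"unknown curve: {curve}")
    # Identical material is fully correlated, so the two sides add by amplitude.
    return 20 * math.log10(fade_out + fade_in)


for position in (0.0, 0.25, 0.5, 0.75, 1.0):
    equal_power = combined_level_db(position, "equal_power")
    linear = combined_level_db(position, "linear")
    print(f"{position:4.2f}  equal power {equal_power:+5.2f} dB   linear {linear:+5.2f} dB")
```

At the midpoint the equal power curve sums to roughly 3 dB above the original level, while a linear fade stays flat. With two genuinely different sounds the equal power curve behaves well, which is why the bump shows up only across add edits.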
A Final Reminder
There is an important benefit to transporting sound from the edit room to the mix room via Avid’s OMFI files, instead of multichannel audio or video tape. I retain the ability to work on audio edits using tools which the Avid does not provide, and I also get “handles”, the original sound just preceding or following edit points. This allows the Avid editor to be a bit sloppier with sound edits since they can be cleaned up effectively here. Unfortunately, such freedom bears a price tag. Extra time I spend editing is additional to the normal mix time. Audio editing time not spent in the edit room will be spent in the mix room! Therefore, delivering OMFI files does not remove any responsibility from the editor in any way. The real danger here is that a picture edit may be based on a flawed audio edit, which might have been avoided if time were taken with the audio in the first place. I have painful memories of dialog which was picture synched on both sides of an edit, while the edit itself was poorly chosen. Simply adding or removing a couple of frames would have made a better audio edit. Ouch.
I am usually under pressure to finish my job as quickly as possible. Therefore, I will use whatever techniques will make an acceptable final result as quickly as possible. I have found that undoing and re-editing sound is often NOT quick, so I ask that editors and audio preparers continue to take responsibility for their edits! Having said that, I acknowledge that I have more powerful tools, so some audio problems are better left for me to solve. Life is a compromise.
Frequently Asked Questions
What sample rate should my sequence use?
This depends on what is required for final delivery. Higher is “better”, but not always best! I usually prefer 48 kHz since digital video tape and most transmission methods use 48K, and therefore digital audio laybacks to them are simple. DVD video standards allow up to 96 kHz but the overwhelming majority of DVD audio is 48 kHz, which is the standard rate for Dolby Digital and DTS. If you have started at 44.1 kHz, for whatever reason, then stay with it. I can convert to another sample rate if delivery requires it. Many people use 44.1K, since this allows quick imports from CDs during the edit session, and uses somewhat less storage space. Many feature films are finished at 44.1K. While 48 kHz is theoretically better, it is a marginal difference at best. If you have a lot of CD source material and can import the CDs digitally to your sequence, then stick with 44.1K. Otherwise use 48K.
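For a sense of how much “somewhat less storage space” is, here is a quick calculation. It assumes uncompressed mono audio and decimal megabytes; the function name is mine.

```python
def megabytes_per_track_minute(sample_rate, bit_depth):
    """Storage for one minute of one uncompressed mono track, in decimal MB."""
    bytes_per_second = sample_rate * bit_depth // 8
    return bytes_per_second * 60 / 1_000_000


for sample_rate in (44100, 48000):
    for bit_depth in (16, 24):
        size = megabytes_per_track_minute(sample_rate, bit_depth)
        print(f"{sample_rate} Hz, {bit_depth} bit: {size:.3f} MB per track-minute")
```

At the same bit depth, 44.1K material takes about 8 percent less space than 48K material (44100 / 48000 is roughly 0.92), so the saving is real but modest.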
What about bit depth?
This is a number which is always 16 or 24 in our work. It refers to the number of digital bits which are strung together to make up a single digital audio word. The higher number allows softer sounds to be reproduced with lower distortion. How soft is “softer”? Let me just say that 16 bits allow reproduction of loud to soft sounds that exceed our requirements. Well recorded 16 bit material is excellent and more than we need. 24 bit audio allows for even better quality soft sounds and so is slightly preferred. Unfortunately, not all digital video formats allow 24 bit audio to be used. Digibeta, for example, truncates at 20 bits. Until recently, Avid software recognized only 16 bit audio. The best of all worlds is to give me 24 bit depth, if you have it. Otherwise, 16 and don’t sweat it.
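If you would like a number for “softer”, a common rule of thumb is about 6 dB of range per bit. The sketch below computes the theoretical ratio between the largest and smallest steps of an ideal converter; real equipment will deliver somewhat less, so treat the figures as a ceiling.

```python
import math


def theoretical_range_db(bit_depth):
    """Ratio of full scale to one quantization step, in dB, for an ideal system."""
    return 20 * math.log10(2 ** bit_depth)


for bit_depth in (16, 20, 24):
    print(f"{bit_depth} bit: about {theoretical_range_db(bit_depth):.1f} dB")
```

That works out to roughly 96 dB for 16 bit, 120 dB for 20 bit, and 144 dB for 24 bit.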
How many tracks should I put the audio on?
8 or 10 is most common. If you need more than 12, we should probably talk first.
Should I remove clip gains and volume automation?
No, not if you intend to use OMFI to transfer audio to the mix. I can simply turn off the volume changes without removing them. They are occasionally useful to either use or at least refer to. The one exception is the Avid “Add Edit” feature. This places edits in material that is otherwise continuous, and is usually used in conjunction with volume changes (fading music under dialog, for instance). These should be removed. If the actual edit cannot be removed, at least remove the crossfade associated with it.
If you are transferring your audio to tape before mixing, then I would suggest that you remove any level changes associated with music, but leave others in.
What video formats can Pharoah use?
We currently accept only BetaSP. VHS is never desirable. We can also accept Avid media directly for both audio and video playback within Pro Tools. QuickTime DV movie files are also very good. Tapes must be digitized; files usually must be converted for Pro Tools use. We currently use Avid Xpress Pro for all digitizing and conversion.
What audio formats can Pharoah take?
We can work from or dub to DAT (no timecode), DA88 digital 8 track, CD, audio cassette, and ¼” reel to reel. Nearly any computer based multimedia format is okay, Mac or PC. Any format not listed (16, 24, 48 track) will need to be transferred to DA88 or Pro Tools. We will be glad to assist.