Viewfinder · Sound · Volume 04

Sound is underpriced.
The specific reason your video underperforms.

Nine years of commissioning has taught us that most brand videos fail on the audio track. The visual production is competent and the writing is passable, but the sound design is treated as an afterthought. The audience notices, even when they can't tell you what they noticed.

By Petar Vukelić · 8 min read

Ask a marketer about the budget breakdown of their most recent brand video and you will almost always get the same rough shape. The largest single line is production — cameras, crew, locations, sometimes talent. The second largest is post-production — editing, colour, visual effects. The third is a smaller line for music or licensing. Somewhere down the list, if it appears at all, is sound design and audio mixing, typically bundled into "post" as an undifferentiated cost.

This shape is, in our nine years of commissioning experience, the specific structural reason that most brand videos underperform. The problem is not that the audio budget is small in absolute terms, though it usually is, but that the audio work is treated as an afterthought in the production process. Sound is commissioned late, revised little, mixed quickly, and delivered without the sort of specific attention the visual work receives at every step.

The audience, on the accounts we watch, notices the difference. They cannot, in most cases, articulate what they noticed. But their response to a well-sounded video is measurably different from their response to a badly-sounded one, holding everything else constant. This piece is about what that difference costs, and what the specific fix is.

What badly-sounded looks like

Bad audio, in commercial brand video, tends to manifest in a specific set of patterns. Dialogue and voice-over are often recorded on set with insufficient attention to acoustic treatment, producing tracks that are technically usable but audibly rough — room noise, reverb, slight EQ mismatches between takes. Post-production audio work then tries to compensate for the on-set problems using digital tools that produce their own audible artefacts. The result is a track that sounds, to a trained ear, "processed" — a specific and identifiable low-grade audio character that the audience registers as unpolished even when they cannot name the specific problem.

The music track, in the same videos, is typically pulled from a stock library at the last moment of the edit, without much consideration of whether the specific music choice supports the specific emotional arc of the piece. The music sits in the mix at a fixed level throughout the video, rather than being ducked, faded, or transitioned to support the visual and vocal storytelling. The overall balance between voice, music, and any sound effects is set by ear rather than to a specific loudness target, producing videos that are audibly quieter or louder than they should be relative to the platform's expected delivery standards.

All of these problems are, individually, small. In aggregate, they produce videos that sound distinctly amateur relative to the visual production standards, and that read to the audience as amateur even when the visual work is genuinely good.
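To make the fixed-level music problem concrete, here is a minimal sketch of ducking: lowering the music under speech and bringing it back afterwards with a gradual fade. The function names, the frame-based representation, and every level value are illustrative assumptions, not settings taken from our commissions or from any particular mixing tool.

```python
def duck_music(voice_active, base_db=-12.0, duck_db=-20.0, step_db=2.0):
    """Return a per-frame music gain (in dB) that drops under speech.

    voice_active: list of booleans, one per frame, True where dialogue
    or voice-over is present. All level values here are hypothetical.
    """
    gains = []
    current = base_db
    for active in voice_active:
        target = duck_db if active else base_db
        # Move toward the target by at most step_db per frame, so the
        # change is heard as a fade rather than an abrupt jump.
        if current > target:
            current = max(target, current - step_db)
        elif current < target:
            current = min(target, current + step_db)
        gains.append(current)
    return gains


def db_to_linear(db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


# One quiet frame, five frames of speech, then speech stops.
print(duck_music([False, True, True, True, True, True, False]))
```

A specialist mixer does this by ear and with far more nuance; the point is only that the music level is something that moves with the voice, not a single number set once.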

What well-sounded costs

The specific investment required to fix these problems is smaller than most marketing teams assume. A competent production sound recordist on-set, working with adequate microphones and adequate acoustic sense, produces raw material that requires much less post-production work than an inexperienced recordist does. A specialist sound designer working on the edit, briefed with the same care as the picture editor, produces mixes that are noticeably better than mixes produced by whoever the video edit house happens to have available. Neither of these is a large cost. Both are commonly excluded from commercial brand video budgets.

On the commissions we have run over the last three years, spending roughly an additional 8-14% of the total production budget on specific audio investment (a better on-set recordist, a specialist sound designer for post, an additional day of mix time, a properly supervised master delivery) produces videos that are, on subsequent audience testing, materially more effective than the same production with the industry-standard audio approach. Aided brand recall on the well-sounded videos runs, on our test data, roughly 20-30% higher than on the industry-standard versions. Purchase intent runs 12-18% higher. These are not small differences.
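As a rough worked example of what that 8-14% range means in money terms, the sketch below applies it to a hypothetical total budget. The budget figure and the function name are assumptions chosen for illustration; only the percentage range comes from the figures above.

```python
def audio_uplift_range(total_budget, low_pct=8, high_pct=14):
    """Return the (low, high) additional audio spend for a total budget."""
    return (total_budget * low_pct / 100, total_budget * high_pct / 100)


total = 100_000  # hypothetical total production budget, in any currency
low, high = audio_uplift_range(total)
print(f"Additional audio investment: {low:,.0f} to {high:,.0f}")
# prints "Additional audio investment: 8,000 to 14,000"
```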

"You cannot look at a video and see the audio budget. You can, however, watch a video and know within the first ten seconds whether the audio was treated as an afterthought or as work in its own right. The audience response follows the second impression more than the first."

Why the industry gets this wrong

The reason the industry systematically underprices audio work is, on our reading, structural rather than accidental.

The first structural reason is that the visual work is more visible in the production process. Everyone on set can see the camera. Everyone at the edit can see the picture cut. The audio work happens in specific technical spaces — recording booths, mixing rooms — that most people involved in commissioning never enter. What is not visible in the process is easily underweighted in the budget.

The second structural reason is that audio work is difficult to evaluate without training. A marketing lead reviewing a video edit can, with reasonable confidence, say whether the picture cut is working. The same marketing lead evaluating the audio mix is, in most cases, unable to distinguish between a mix that is technically competent and a mix that is truly good, and — more consequentially — is often unable to identify what is wrong with a mix that is genuinely bad. The evaluation happens at a lower level of expertise than the equivalent visual evaluation, which produces a lower standard being accepted.

The third structural reason is that specific audio problems are recoverable in post to a degree that specific visual problems are not. A slightly rough dialogue take can be cleaned up, mostly, with modern audio tools. A slightly out-of-focus shot cannot. This creates a specific asymmetry in the production budget: teams pay a premium for visual quality because they must, and they underpay for audio quality because they can, mostly, get away with it. The videos that would benefit from a real audio investment do not get one because the marginal problem is recoverable.

What to do

Three practical implications, in decreasing order of difficulty.

The first is to hire audio expertise into your commissioning process. If your team does not currently have someone who can evaluate an audio mix at the level a picture editor evaluates a picture cut, that person needs to exist in your workflow. This may be an in-house hire; it may be a specific external sound designer you consistently work with; it may be a briefly-retained consultant during commissioning cycles. The specific person matters less than the specific expertise being represented in the process.

The second is to add a specific audio line to your video production budget rather than bundling it into general post. Naming the line makes the investment visible. Making the investment visible makes it defensible in budget conversations. Making it defensible produces sustained investment over time. This is a small operational change with, in our experience, a disproportionate downstream effect.

The third is to establish specific audio-quality standards for your video output (loudness targets, dialogue clarity requirements, music-and-effects balance guidelines) and to review deliverables against those standards before accepting them. Most marketing teams do not currently have such standards. Teams that have them produce videos that sound consistently good; teams that do not produce videos whose audio quality varies, even within the same commissioning cycle.
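One way to picture the third step is as a written spec that each deliverable is checked against before sign-off. The sketch below is a minimal illustration under stated assumptions: the class, the field names, and every threshold are hypothetical house values, and the measurements are assumed to come from whatever loudness meter your sound designer already uses.

```python
from dataclasses import dataclass


@dataclass
class AudioSpec:
    # All thresholds are hypothetical; set them with your sound designer.
    target_lufs: float = -16.0              # integrated loudness target
    tolerance_lu: float = 1.0               # allowed deviation from target
    max_true_peak_dbtp: float = -1.0        # ceiling for true peak level
    min_voice_over_music_lu: float = 10.0   # how far voice sits above music


def review_deliverable(spec, integrated_lufs, true_peak_dbtp, voice_over_music_lu):
    """Return a list of plain-language problems; an empty list means pass."""
    problems = []
    if abs(integrated_lufs - spec.target_lufs) > spec.tolerance_lu:
        problems.append(
            f"Loudness {integrated_lufs} LUFS is outside "
            f"{spec.target_lufs} +/- {spec.tolerance_lu}"
        )
    if true_peak_dbtp > spec.max_true_peak_dbtp:
        problems.append(
            f"True peak {true_peak_dbtp} dBTP exceeds {spec.max_true_peak_dbtp}"
        )
    if voice_over_music_lu < spec.min_voice_over_music_lu:
        problems.append(
            f"Voice is only {voice_over_music_lu} LU above music; "
            f"minimum is {spec.min_voice_over_music_lu}"
        )
    return problems


# A delivery that is too quiet, peaks too high, and buries the voice.
for problem in review_deliverable(AudioSpec(), -20.0, -0.2, 6.0):
    print(problem)
```

The numbers matter less than the fact that they are written down: a reviewer without audio training can still reject a delivery that misses an agreed target.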

None of this is expensive. All of it is currently missing from most commercial video production. The gap is worth closing.