Evaluating a New Dolby Atmos Delivery Format: A Double-Blind Comparison of DD+JOC, AC-4 L4, and PCM

SUMMARY

Dolby Atmos is everywhere now, but many audiophiles and audio professionals have felt that the most common delivery codec, DD+JOC, is lacking in audio quality. Dolby’s newer AC-4 codec promises better fidelity, and in a double-blind Immersive Machines/AES New York test, listeners consistently chose AC-4 over DD+JOC as the truer version of the mix.

 


AES New York, Engine Room Audio and Immersive Machines listening session.

 

BACKGROUND

The vast majority of consumers listening to Dolby Atmos music and soundtracks on Apple Music, Netflix, HBO Max, etc. are not hearing the lossless, high-fidelity PCM render that was heard in the studio by the mixing and mastering engineers, or by the directors and producers during post-production. Instead, they’re hearing a lossy, data-compressed version of the original mix, delivered using a format commonly referred to as Dolby Digital Plus with Dolby Atmos, or “DD+JOC”, with the JOC standing for “Joint Object Coding.” When I first listened to new albums on Apple Music or TIDAL on my living room home theater system, DD+JOC was exciting and filled the room with a big sound in a way that stereo simply could not do. But over time, the limitations of DD+JOC became more apparent. And when listening to DD+JOC in the more critical setting of a professional mix room, and with an uncompressed reference available for comparison, its limitations are obvious. The spatial distinctiveness of individual objects is blurred and the soundstage folds in. When individual loudspeakers are soloed, as a professional mix room allows, channels often sound warbly, exhibiting that ‘space monkey’ sound you hear when too much noise reduction has been applied.

DD+JOC is fundamentally a clever retrofit. Dolby took the older Dolby Digital+ platform and found a way to add object audio coding and positional metadata without breaking compatibility with millions of legacy stereo and 5.1 systems. It was a smart engineering compromise and allowed Dolby Atmos to ship to the world quickly.

But what if you started fresh, designing a Dolby Atmos codec from the ground up, using modern algorithms and an immersive-first mindset? That’s Dolby AC-4.


When the Immersive Machines team added support for MP4 export in Immersive Master Pro, we integrated both DD+JOC and AC-4. During testing, it occurred to us that very few people had actually heard AC-4 on loudspeakers. TIDAL and Amazon Music have long delivered Dolby Atmos music using AC-4 IMS, which is specifically for binaural headphone and virtualized stereo speaker playback. But the newer version of AC-4 that supports both headphones and multichannel loudspeakers hadn’t been heard outside the lab. There were a lot of questions: How much better was AC-4 compared to DD+JOC? How would AC-4, with its tiny bit rate of 448 kbps, stand up next to lossless 7.1.4-ch PCM renders in a studio setting? Those PCM renders were playing at a data rate of 13,824 kbps (!).
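The 13,824 kbps figure follows directly from the channel count, bit depth, and sample rate. The sketch below reproduces that arithmetic, assuming the 24-bit, 48 kHz resolution given for the PCM render elsewhere in this article and treating 1 kbps as 1,000 bits per second. The function and constant names are illustrative only.

```python
def pcm_bitrate_kbps(channels: int, bit_depth: int, sample_rate_hz: int) -> float:
    """Uncompressed PCM data rate in kbps (1 kbps = 1,000 bits per second)."""
    return channels * bit_depth * sample_rate_hz / 1000


# 7.1.4 = 7 ear-level channels + 1 LFE channel + 4 height channels
CHANNELS_7_1_4 = 7 + 1 + 4

rate = pcm_bitrate_kbps(CHANNELS_7_1_4, bit_depth=24, sample_rate_hz=48_000)
print(f"{rate:,.0f} kbps")  # prints "13,824 kbps"
```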

With the support of the New York section of the Audio Engineering Society and Engine Room Audio, we designed a double-blind listening event where we could put all three codecs (DD+JOC, AC-4, and the reference 7.1.4-ch PCM render) up against each other in a controlled test, with the same ADMs used as the source material for each.* Several top engineers from wildly different genres generously contributed their mixes.

 

DD+JOC – AC-4 L4 – reference PCM

LISTENING

Everyone attending had an opportunity to sit in the sweet spot in Engine Room Audio’s excellent 7.1.4 mix room. I felt it was important to judge the sound not just with all speakers playing, but also with some speakers isolated. So for example, we would take a musical selection and:

  1. Listen to all codecs with all speakers on.

  2. Listen again with L/C/R off.

  3. Listen a third time while solo’ing individual speakers (like Left Top Front, or Right Rear Surround, etc.).



photo credit: Giulia Corrao

Listeners only knew that they were listening to A, B, or C, and I didn’t reveal what those letters were until the end. Listeners were asked to fill out a Google Form with a number of questions such as:

“When listening with all of the loudspeakers ON, how noticeable were artifacts (gating, swishing, graininess, loss of spatial precision, loss of frequency range)?”

“When listening while freely soloing individual loudspeakers, how noticeable were artifacts (gating, swishing, graininess, loss of spatial precision, loss of frequency range)?”

“Based on the musical examples, rate the overall audio quality of each encoder.”


RESULTS

I’ve always suspected that lossy codecs don’t distribute bitrate evenly across the immersive field. When listening to DD+JOC, the mission-critical L/C/R channels typically sound much better to my ears than the height or rear surround channels do. A large dose of psychoacoustic spatial masking is at work here. Your sonic attention is diverted to channels with greater fidelity, and as long as some signal is present in the other channels, the immersive impression is maintained. Loud sounds mask quieter sounds in full playback, and the codecs take advantage of this. A dynamic program with high-amplitude content in some channels could make artifacts more apparent in the other, lower-amplitude channels.

I personally find that all three codecs sound pretty good when playing compressed pop music in 7.1.4 with all loudspeakers turned on. And when we look at the survey results for the question “When listening with all of the loudspeakers ON, how noticeable were artifacts (gating, swishing, graininess, loss of spatial precision, loss of frequency range)?”, we see that compression artifacts were slightly more noticeable with DD+JOC compared to AC-4 or PCM.

But when listeners had the opportunity to solo individual loudspeakers (removing the psychoacoustic masking), artifacts in DD+JOC became very noticeable, while being slightly noticeable in AC-4 and barely noticeable in the reference PCM. Of course, by definition there are no artifacts in the reference. While some listeners confidently identified the total lack of artifacts in the reference PCM, other listeners were unsure, so the vote was split in an interesting way.



Rating the overall audio quality of each codec, the reference PCM was the strong preference, but only when listeners could solo loudspeakers. When listening in a conventional manner with all speakers on, the reference PCM and AC-4 were in a statistical dead heat. Very impressive. I would encourage anyone who does a similar comparison to remember that the vast majority of consumers can’t solo channels, and to not let the advertised bitrates bias your listening.


Near the end of the event, listeners began to share their strategies for identifying each codec, and as I hit the buttons on the console, there was near-unanimous agreement among these expert listeners.


Track A was DD+JOC at 768 kbps

Track B was AC-4 at 448 kbps

Track C was 7.1.4-ch 24-bit 48 kHz PCM at 13,824 kbps
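To put those three bit rates side by side, here is a small sketch that expresses the PCM reference as a multiple of each lossy stream. The rates are the ones listed above; the dictionary keys and the helper name are illustrative only.

```python
# Bit rates as revealed at the session, in kbps.
rates_kbps = {
    "A: DD+JOC": 768,
    "B: AC-4": 448,
    "C: 7.1.4 PCM": 13_824,
}


def ratio_to_reference(rate_kbps: float, reference_kbps: float) -> float:
    """How many times larger the reference stream is than the given stream."""
    return reference_kbps / rate_kbps


reference = rates_kbps["C: 7.1.4 PCM"]
for label, rate in rates_kbps.items():
    multiple = ratio_to_reference(rate, reference)
    print(f"{label:<14} {rate:>6} kbps   PCM is {multiple:.1f}x this rate")
```

By this arithmetic the PCM render carries 18 times the data of the DD+JOC stream and just under 31 times the data of the AC-4 stream.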



Some caveats: our sample size was 16 listeners, which is not large enough to be scientifically useful. Also, the majority of these listeners were established audio professionals, and were likely to be more discriminating than the average music consumer.
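One rough way to see why 16 listeners is a small sample is to look at how wide a confidence interval becomes at that size. The sketch below uses the Wilson score interval for a simple yes/no proportion. This is an illustration only, not the analysis used for the survey, which collected ratings, and the 8-of-16 outcome is a hypothetical input rather than a result from the session.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical outcome: 8 of 16 listeners prefer one codec over another.
low, high = wilson_interval(8, 16)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

With these hypothetical numbers the interval spans roughly 0.28 to 0.72, so an even split among 16 listeners is compatible with anything from a clear minority to a clear majority in a larger population.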


REFLECTION and QUOTES

One very thoughtful mix engineer kept going back and forth between B and C, and came to the conclusion that the lead vocal “sat better” in the B mix. That showed a preference for AC-4.

Another veteran New York mixer felt like the bass was rounder and fuller in A. And while he begrudgingly admitted that A suffered from “warbly space monkeys” in the surround channels, it was worth it for better bass. This launched into a conversation about how the sonic compromises of vinyl are appealing to people, and that maybe the technically limited DD+JOC was imparting a cool coloration that we should savor (I quickly excused myself from this conversation to get another slice of pizza). But after the event, I consulted with an expert who had an explanation for this perceived coloration, “There is a loss of inter-object decorrelation with DD+ JOC that causes build up in the decoded signal, likely coloring the sound. AC-4 does not have this limitation.”


When listening to A with individual loudspeakers in solo, I heard either warbling or a choppy, gating sound in the height channels. And overall, I felt a slightly grainy, indistinct quality. Listening to C, there was a glassy, smooth, solid quality to the tone in every channel. And if you put on a good mix and turned around and faced the back wall, the soundstage was clear both in width and height, something that A had trouble with. B maintained that glassy, smooth quality and well-defined soundstage. Listening closely to B, there was some unappealing swishing in certain channels, but B never had the choppy gating or space monkeys of A. B isn’t perfect. A sharp transient on one side of the room sometimes momentarily takes energy out of a channel on the other side of the room. This is rare, but there is some speculation that B’s variable bit rate is the reason for the energy shifts compared to A’s constant bit rate.


Some written quotes from our attendees:


“B and C had much more clarity compared to A”

“Various fluttering encoding artifacts were far more noticeable in A in the top / surround speakers when solo’d. Likely noise masking and low amplitude encoding artifacts.”


“Phase coherence in C and B were almost seamless. C stood out by a small margin. B had on the other hand better tonal balance.”


“A lacked “punch” and “depth” compared to B and C. B seemed to be very similar to C, but at times felt flatter than C.”


“Soloing individual speakers was most apparent.”

“C was the best”

“Listening to the all speakers except for the LCR speaker made me notice the quality much more.”


…and my all-time favorite comment from the night:

“A was bad and someone somewhere should feel bad about it.”


Finally, this quote:

“A sounded s—t. The vocal and low end balance were way off. B and C felt very close. If I had to pick I’d say B, but i don’t think in a double blind test I’d pick them out consistently.”



For reference, “B” was AC-4 running at 448 kbps. And “C” was a 7.1.4-ch PCM render at 24-bit 48 kHz resolution, a data rate of 13,824 kbps. This is over thirty times the data rate of AC-4, and some of these expert listeners (including myself) were occasionally fooled into thinking there was no difference or only a small difference between AC-4 and PCM. That’s a huge achievement and I am eagerly looking forward to the large music streaming service that will adopt AC-4 and provide it to their customers. Television streaming appears to be ahead in this regard: in a January 5th press release, Dolby announced that “Peacock is extending the availability of Dolby Vision and Dolby Atmos across live sports over the coming year. Peacock is also the first streamer to announce its commitment to support Dolby Vision 2 and Dolby AC-4, anticipated to launch later this year.”

“With Forthcoming Adoption of Dolby Vision 2 and Dolby AC-4, Plus Expansion of Dolby Vision and Dolby Atmos Across Live Sports, Peacock Is the Premier Destination for Streaming in Dolby”

Thank you for reading. If you want to hear these codecs for yourself, the new version of Immersive Master Pro will batch encode DAMFs and ADMs into DD+JOC and AC-4 as easily-distributable MP4s. The new Dolby Reference Player can play both of these codecs. macOS Sequoia, iOS, tvOS and countless video game systems and home theater systems can play DD+JOC. This gives producers and mixers new and more accurate ways to show clients how a mix will be heard in the real world. Email us at info@immersive-machines.com with any questions. And if you’re not already a member of the Audio Engineering Society, sign up today at https://aes2.org/.



Special thanks to Roey Shamir and Stephen Ward from the AES New York Section for making this an official event and helping me to host. Also thanks to Mark Christensen for use of Engine Room Audio’s 7.1.4 studio. Garrett Treanor of Immersive Machines meticulously prepared all the listening material. And finally, we had great music to use thanks to these engineers who sent mixes: Zach Szydlo, Gabriel Lundh, Anthony Rodriguez, Jeff Silverman, Justin Gray, Morten Lindberg, Mark Christensen, Roey Shamir and Nathaniel Reichman. Some of the big ears in attendance were Christopher Latina, Angela Piva, Nacor Zuluaga, John Henry Dale, Oscar Zambrano, Brad Leigh, Louis Manno and Thomas Mowrey.



https://aes2.org

https://engineroomaudio.com

https://www.immersivemasterpro.com

https://professionalsupport.dolby.com/s/article/What-is-AC-4?

*PCM is not technically a codec because there is no data compression. But for the purposes of this article we’ll use a shorthand and refer to it as a codec.


 

Batch convert your ADM and .atmos files into Dolby Atmos-encoded MP4s.

These MP4s are the ideal format when sharing a mix for client review.

Until now, Dolby Atmos MP4 delivery has primarily meant Dolby Digital Plus with Dolby Atmos. Immersive Master Pro 1.3 supports that workflow, and now adds Dolby AC-4 L4 as a new MP4 export option.

Immersive Master Pro 1.3 supports batch Dolby Atmos MP4 export in:

  • Dolby Digital Plus with Dolby Atmos (Film and Music modes)

  • Dolby AC-4 L4 (Music mode)

Hear your work exactly as your audience will hear it.

New and existing customers can try Immersive Master Pro 1.3 free for 30 days. As a thank-you to all the customers who supported us in our first year, upgrade pricing to 1.3 is very competitive.

 

Download a 30-day demo of Immersive Master Pro 1.3 here.
