Transient Shaped Dialog?

YouTube · 28 Jan 2015

I've recently started noticing something annoying in TV show dialog. I first heard it when watching Mozart in the Jungle on Amazon Video. But then I heard it on a recent episode of The Flash. I'm wondering if it is something new, or it's just one of those "once you hear it, you can't stop hearing it."

From abusing transient shapers, I know what an artificially boosted attack sounds like. That's what I'm hearing in the dialog of these shows. It's barely there, but there's a bit of a high-frequency percussiveness to the beginning of words, especially noticeable with male voices, or females with vocal fry.

Googling around, it does seem that some sound mixers have fallen in love with shapers to lower noise and level dialog. There's also the new Dialog Denoiser in iZotope's RX 4, which has recently come into being.

I'm wondering if anyone else has heard this artifact (or will now that I've mentioned it), and knows when it actually started showing up on dialog, and what is to blame.

28 Jan 2015

Interesting. I've always seen an increase in peak levels when adding attack, which leads to lower loudness when compared at the same peak level. Not sure how adding attack could level dialog, but adding sustain could potentially accomplish this.
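
As a rough illustration of the peak-versus-loudness point, here is a minimal Python sketch. Everything in it is an assumption made for the example: the 48 kHz sample rate, the 200 Hz test tone, and the `boost_attack` helper, which is a crude stand-in for a transient shaper and not how any real plug-in works. It boosts the first 10 ms of a tone burst by up to 6 dB, normalizes both versions to the same peak, and compares RMS level.

```python
import math

SAMPLE_RATE = 48_000  # Hz (assumed for this sketch)

def tone_burst(freq_hz=200.0, dur_s=0.2):
    """Constant-amplitude sine burst used as a stand-in for a voiced sound."""
    n = int(SAMPLE_RATE * dur_s)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

def boost_attack(signal, attack_ms=10.0, boost_db=6.0):
    """Hypothetical attack boost: extra gain that fades linearly to zero over attack_ms."""
    n_attack = int(SAMPLE_RATE * attack_ms / 1000)
    extra = 10 ** (boost_db / 20) - 1.0
    out = []
    for i, x in enumerate(signal):
        fade = max(0.0, 1.0 - i / n_attack)  # 1.0 at the onset, 0.0 after attack_ms
        out.append(x * (1.0 + extra * fade))
    return out

def peak_normalize(signal):
    """Scale so the largest absolute sample is exactly 1.0."""
    peak = max(abs(x) for x in signal)
    return [x / peak for x in signal]

def rms_db(signal):
    """RMS level in dB relative to full scale."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return 20 * math.log10(rms)

plain = peak_normalize(tone_burst())
shaped = peak_normalize(boost_attack(tone_burst()))
print(f"plain RMS:  {rms_db(plain):.1f} dBFS")
print(f"shaped RMS: {rms_db(shaped):.1f} dBFS")
```

With both versions peaking at full scale, the attack-boosted one should measure several dB lower in RMS, since the rest of the burst had to be turned down to make room for the transient.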

This effect could also come from using compression with slow attack times. I know it's not the same tool, but technically it's almost exactly the same process (basing this on the way transient shapers work).
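
To make the slow-attack idea concrete, here is a minimal sketch of a generic feed-forward compressor in Python. It is not a model of whatever was used on these shows; the threshold, ratio, time constants, and the helper names (`smoothing_coeff`, `compress`, `overshoot_ratio`) are all assumptions chosen for the example. It compares how much of the onset gets through with a 30 ms attack versus a 0.1 ms attack.

```python
import math

SAMPLE_RATE = 48_000  # Hz (assumed for this sketch)

def smoothing_coeff(time_ms):
    """One-pole smoothing coefficient for a time constant in milliseconds."""
    return math.exp(-1.0 / (SAMPLE_RATE * time_ms / 1000.0))

def compress(signal, attack_ms, release_ms=100.0, threshold=0.25, ratio=4.0):
    """Generic feed-forward compressor with a smoothed level detector."""
    a_att = smoothing_coeff(attack_ms)
    a_rel = smoothing_coeff(release_ms)
    env = 0.0
    out = []
    for x in signal:
        level = abs(x)
        # Detector rises with the attack time and falls with the release time.
        coeff = a_att if level > env else a_rel
        env = coeff * env + (1.0 - coeff) * level
        if env > threshold:
            # Above threshold, level changes are reduced by the ratio.
            gain = (env / threshold) ** (1.0 / ratio - 1.0)
        else:
            gain = 1.0
        out.append(x * gain)
    return out

def overshoot_ratio(out):
    """Peak of the first 10 ms divided by the peak of the last 100 ms."""
    head = max(abs(x) for x in out[: SAMPLE_RATE // 100])
    tail = max(abs(x) for x in out[-SAMPLE_RATE // 10 :])
    return head / tail

# One second of a full-scale 200 Hz tone that starts abruptly.
burst = [math.sin(2 * math.pi * 200.0 * i / SAMPLE_RATE) for i in range(SAMPLE_RATE)]
slow = overshoot_ratio(compress(burst, attack_ms=30.0))
fast = overshoot_ratio(compress(burst, attack_ms=0.1))
print(f"slow attack: onset peak is {slow:.2f}x the settled peak")
print(f"fast attack: onset peak is {fast:.2f}x the settled peak")
```

In this sketch the slow-attack setting should let the first few milliseconds through well above the settled level, while the fast-attack setting should not, which is the same front-loaded shape you would get from adding attack on a transient shaper.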

Not sure I want to learn to hear this artifact - maybe it will go out of fashion soon…

YouTube · 28 Jan 2015

Yeah, it may not be a transient shaper, per se, but something changing the shape of the attack on the way to achieving another goal, e.g. an automatic tool for more even dialog, without having to do the work of going in and adjusting levels of each line.

28 Jan 2015

OK, I may regret this because like you say once you hear it you can't "un-hear" it. But I'll ask anyway because I'd like to hear what you're talking about! Do you have any examples you can post here?

YouTube · 28 Jan 2015

I was thinking about doing just that. Here's a snippet of dialog from the most recent Flash episode. http://scuzzyeye.com/tests/Transient-Flash.wav

The worst offender is the "absolutely" that's said at about 2 seconds in, and then the "say" in "have to say" at 12 seconds is also pretty bad, and then "hero" at 12 seconds.

It's not just the attack at the beginning of a word after a silence; even individual phonemes can be affected, as is heard in "hero". It's like the entire envelope of the dialog is being adjusted millisecond by millisecond.

Maybe this has been going on for a while, but I'm just now noticing it. Or maybe it is a new, horrible piece of software.

28 Jan 2015

I'm definitely hearing heavy dynamics control (to describe it generically) - could simply be a brick wall limiter overused on dialog in an attempt to level the dialog "automagically". After looking at the waveforms, if a limiter is used it's being used on each actor individually.

Here's what the dialog clip looks like, to see relative levels, with the word "absolutely" circled:

And here's a zoomed in view of the word by itself, showing the transient peak at the front:

I'm not sure there's any way to know whether it was a transient shaper adding (enhancing) that transient, or a compressor with a slow attack time - both would have similar results in my experience.

I blame the loudness wars.

YouTube · 28 Jan 2015

As I was recording it, and saw the audio plot on the track, I knew that cliff face was going to be to blame.

I don't think it is something being applied per actor. But it does seem to stand out more on lower pitched sounds. As I said, the deeper male voices are usually hit with this artifact in almost every scene. But the females aren't as bad unless they make a creaky, fry sound. Then each little creak becomes sharp, even though there's only about 30 to 40 ms between them.

It does sound bad, right? I'm not imagining things?

Despondo · 29 Jan 2015

Interesting discussion and analysis here! I refuse to listen to the provided clip for fear of developing an ear for hearing this artifact in shows I watch. LOL!

29 Jan 2015

ScuzzyEye wrote:As I was recording it, and saw the audio plot on the track, I knew that cliff face was going to be to blame.

I don't think it is something being applied per actor. But it does seem to stand out more on lower pitched sounds. As I said, the deeper male voices are usually hit with this artifact in almost every scene. But the females aren't as bad unless they make a creaky, fry sound. Then each little creak becomes sharp, even though there's only about 30 to 40 ms between them.

It does sound bad, right? I'm not imagining things?

I don't think it's something being applied overall, as there are plenty of words that are louder than "Absolutely" and less compressed. This would only be true if the processing was applied per actor as far as I can figure.

Yes, it does sound bad, definitely over-processed for my tastes. What you describe, sounding worse on lower pitched sounds, is a common "fast release" artifact. Now a fast release could come from a limiter/compressor, or from an over aggressive "sustain" setting on a transient shaper (both involve the release stage of compression). A fast release, if fast enough, will actually attempt to track the WAVEFORM rather than the ENVELOPE of the lower frequency audio signals (probably stating the obvious here, apologies if so).
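
For anyone who wants to see the waveform-versus-envelope point in numbers, here is a minimal Python sketch. The detector (`peak_follower`, instant attack with exponential release), the `ripple` measure, the 48 kHz sample rate, and the 100 Hz test tone are all assumptions for illustration, not a description of any particular limiter or shaper.

```python
import math

SAMPLE_RATE = 48_000  # Hz (assumed for this sketch)

def peak_follower(signal, release_ms):
    """Level detector with instant attack and exponential release."""
    decay = math.exp(-1.0 / (SAMPLE_RATE * release_ms / 1000.0))
    env = 0.0
    out = []
    for x in signal:
        level = abs(x)
        # Jump up to any new peak immediately, otherwise decay toward zero.
        env = level if level > env else env * decay
        out.append(env)
    return out

def ripple(env):
    """Relative swing of the detector output over the second half of the signal."""
    tail = env[len(env) // 2 :]
    return (max(tail) - min(tail)) / max(tail)

# Half a second of a steady 100 Hz tone: its true envelope is flat.
tone = [math.sin(2 * math.pi * 100.0 * i / SAMPLE_RATE) for i in range(SAMPLE_RATE // 2)]
fast = ripple(peak_follower(tone, release_ms=1.0))
slow = ripple(peak_follower(tone, release_ms=200.0))
print(f"1 ms release:   detector swings by {fast:.0%} of its peak")
print(f"200 ms release: detector swings by {slow:.0%} of its peak")
```

The tone's level never changes, so an ideal envelope detector would output a flat line. With the 200 ms release the detector stays close to that, while with the 1 ms release it should dip deeply within every half cycle, meaning any gain derived from it is being modulated by the waveform itself.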

So if it's not a traditional compressor being used, it could be something like a transient shaper or even something like the Oxford Inflator or similar. Either way, as audio trends go, I would expect to hear more of this rather than less. ;(

YouTube · 29 Jan 2015

Despondo wrote:Interesting discussion and analysis here! I refuse to listen to the provided clip for fear of developing an ear for hearing this artifact in shows I watch. LOL!

I've not noticed it on Arrow, the show that crossed over to create The Flash, so the effect isn't everywhere yet.

I did look: Mozart in the Jungle and The Flash don't have any audio people in common.

YouTube · 29 Jan 2015

selig wrote:I don't think it's something being applied overall, as there are plenty of words that are louder than "Absolutely" and less compressed. This would only be true if the processing was applied per actor as far as I can figure.

Yes, it does sound bad, definitely over-processed for my tastes. What you describe, sounding worse on lower pitched sounds, is a common "fast release" artifact. Now a fast release could come from a limiter/compressor, or from an over aggressive "sustain" setting on a transient shaper (both involve the release stage of compression). A fast release, if fast enough, will actually attempt to track the WAVEFORM rather than the ENVELOPE of the lower frequency audio signals (probably stating the obvious here, apologies if so).

So if it's not a traditional compressor being used, it could be something like a transient shaper or even something like the Oxford Inflator or similar. Either way, as audio trends go, I would expect to hear more of this rather than less. ;(

What if it is not compressing overly loud dialog, but instead catching bits as soon as they cross some threshold, and boosting them up? Haven't I seen an effect like that somewhere?

It's not obvious from the short clip I posted, but in some scenes it really does seem to be boosting lines that would have been delivered more quietly.
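
One way to picture that kind of "boost once it crosses a threshold" behavior is an upward compressor with a floor. The sketch below is purely hypothetical: the function name `upward_gain_db` and the floor, target, and ratio values are made up for illustration, and nothing here is known to match the processing on either show.

```python
def upward_gain_db(level_db, floor_db=-50.0, target_db=-20.0, ratio=3.0):
    """Static gain curve (in dB) for a hypothetical upward compressor.

    Levels below floor_db are left alone (treated as noise), levels at or
    above target_db are left alone, and anything in between is pushed
    toward target_db according to the ratio.
    """
    if level_db < floor_db or level_db >= target_db:
        return 0.0
    return (target_db - level_db) * (1.0 - 1.0 / ratio)

for level in (-60.0, -50.0, -40.0, -30.0, -20.0, -10.0):
    print(f"{level:6.1f} dB in -> {upward_gain_db(level):5.1f} dB of boost")
```

Note the jump at the floor in this sketch: a sound that creeps just over it goes from no boost to the maximum boost at once, so a quiet onset would be lifted abruptly. That is only one possible reading of what is being heard.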

It should have been obvious, but it took you saying it before I thought about it. Once you're dealing with times less than 10 ms, you're into frequencies above 100 Hz. So yeah, instead of riding the sound's envelope, the processing becomes able to follow the bare waveform. It's definitely getting into that territory.
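
The arithmetic behind that is just frequency = 1 / period. A tiny Python check, using the 10 ms figure here and the 30 to 40 ms creak spacing mentioned earlier in the thread (the helper name `period_ms_to_hz` is made up for this example):

```python
def period_ms_to_hz(period_ms):
    """Convert a period in milliseconds to a frequency in hertz."""
    return 1000.0 / period_ms

for ms in (10, 30, 40):
    print(f"{ms} ms -> {period_ms_to_hz(ms):.1f} Hz")
# 10 ms -> 100.0 Hz
# 30 ms -> 33.3 Hz
# 40 ms -> 25.0 Hz
```

So a process reacting in under 10 ms can move within a single cycle of anything below about 100 Hz, and the individual fry pulses arrive at roughly 25 to 33 per second, slow enough for such a process to treat each one as its own event.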

I looked up the Oxford Inflator. Statements like the one on its product page, that it "can provide an increase in the apparent loudness of almost any programme, without obvious loss of quality or audible reduction of dynamic range," seem to encourage abuse.
