Extreme sampling rates for extreme pitch shifting or time stretching?

Marco Raaphorst
Posts: 2504
Joined: 22 Jan 2015
Location: The Hague, The Netherlands

18 Dec 2017

I always felt that using something like 192 kHz is simply marketing bullshit. But maybe there's one thing I had always forgotten about: pitch shifting and time stretching.

See for example http://www.musicofsound.co.nz/blog/why- ... mple-rates

selig
RE Developer
Posts: 11681
Joined: 15 Jan 2015
Location: The NorthWoods, CT, USA

18 Dec 2017

Years ago, around the time the higher sample rates initially became available and folks questioned "why", it was the film sound FX guys who first described this process. It's the same reason you film at higher frame rates (120 FPS and above) when you're going to slow it down significantly in post for the classic super slo-mo effect.

When folks first started investigating using higher sample rates in order to capture more data above the range of human hearing (to slow down later in post), they discovered some interfaces did better than others. The problem then, and probably still now, is that not all high sample rate interfaces actually give you extended flat frequency response above the range of human hearing. Why? Because that's not the goal of higher sample rates from every designer's perspective.

One primary advantage to using higher sample rates is to be able to use less steep (and thus more sonically transparent) filters on the A/D. Instead of a very steep filter starting around 19-20 kHz at a sample rate of 44.1 kHz, covering a range of only a few semitones from 19-20 kHz to Nyquist (22.05 kHz), if you use a sample rate of 96 kHz you can now use a very gentle (comparatively speaking) filter from 24 kHz to 48 kHz. Now you have an entire octave to work with instead of a few semitones, and the filter doesn't even begin sloping until comfortably above the range of human hearing.
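A quick sketch of that arithmetic, for anyone who wants to check the "few semitones versus an octave" comparison. The function name and the passband edges are assumptions for illustration, not the specs of any real converter; the width is simply the musical interval between the passband edge and Nyquist.

```python
import math

def transition_band_semitones(passband_edge_hz, sample_rate_hz):
    """Interval, in semitones, between the filter's passband edge and Nyquist."""
    nyquist_hz = sample_rate_hz / 2
    return 12 * math.log2(nyquist_hz / passband_edge_hz)

# (assumed passband edge, sample rate) pairs based on the figures in the post
for edge_hz, rate_hz in ((20_000, 44_100), (24_000, 96_000), (24_000, 192_000)):
    semitones = transition_band_semitones(edge_hz, rate_hz)
    print(f"{rate_hz} Hz, edge {edge_hz} Hz: "
          f"{semitones:.1f} semitones ({semitones / 12:.2f} octaves)")
```

With these assumed edges the 44.1 kHz case comes out at under two semitones, 96 kHz at one octave and 192 kHz at two octaves.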

This is one of the main reasons given for why higher sample rates actually sounded better (smoother filters), especially with cheaper interfaces that scrimped on the lower sample rates. So it wasn't so much that the higher sample rates sounded better in these interfaces, it was the lower sample rates sounding worse! In higher end converters you heard very little difference between sample rates, mainly due to the lower sample rates using better (more expensive) designs.

Taking the concept further, 192 kHz sample rates give you a full two octaves to work with for your A/D filters - but in many cases the filter "cutoff" is still around 20 kHz. Yes, there are some interfaces that shift the cutoff of the A/D filter way up so you can actually record the ultrasonic region, but you have to shop around and do a little research to find them - especially if you want a portable version for field recording. Even if your A/D has a cutoff around 20 kHz at higher sample rates, you can still record SOME ultrasonic energy - it's just not going to be as much and as flat as it would be if you had a converter specifically designed for this purpose.
:)
Selig Audio, LLC

Marco Raaphorst

18 Dec 2017

Thanks Giles! I can follow this theory. Can't hear it myself I think. My hearing stops at 15.5 kHz :)

EdGrip
Posts: 2343
Joined: 03 Jun 2016

19 Dec 2017

I always figured it was for activity like that: when you're recording a sample and you think you're going to be pushing it to the limits with pitch and time later.

Marco Raaphorst

19 Dec 2017

There's a thing I don't understand though. I understand that in video 120 FPS is simply using more frames. But in audio that doesn't happen. A higher sample rate is not sampling any frequency with more samples, right? It will only also record frequencies above 20 kHz, which can be pitched downwards. I am wondering though if it makes sense to pitch down something like 30 kHz. That's just noise. Downpitching needs a lowpass, not a boost of high frequencies.

I am curious to hear any real world examples of this. I know sound designers often claim that 192 kHz is great for time stretching but I am not sure. PaulStretch sounds superb using 44.1 kHz samples.

selig

19 Dec 2017

Frame rate and sample rate are very similar. Ever see the “wagon wheel” effect?
https://en.m.wikipedia.org/wiki/Wagon-wheel_effect

You are actually “seeing” aliasing! The sample rate (frame rate) isn’t fast enough to reproduce the highest frequency present in the original. The result is a distortion of the original.

Works with propellers too:


And with water:


As for audio, there is plenty of ultrasonic audio going on that we cannot hear. Bats are one example. If you want to record bats you need a recorder that captures signals above 20 kHz (google “bat detector”).

Sound artists such as Richard Devine have used these sounds:


But in general, if you shift a 44.1 kHz recording down an octave, and it contains energy in the top octave, you’ll end up with no energy in the “new” (post pitch-shifted) top octave. Shift it down two octaves and your highest frequency is now around 5 kHz, which sounds like a badly recorded version of the original in some respects.
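To put rough numbers on that, here is a minimal sketch. It assumes the recording really contains energy all the way up to Nyquist, which (as discussed earlier in the thread) not every converter delivers, and the function name is made up for the example.

```python
def highest_frequency_after_shift(sample_rate_hz, octaves_down):
    """Upper limit of the recorded content after pitching it down.

    Assumes the source holds energy right up to Nyquist (half the sample rate).
    """
    nyquist_hz = sample_rate_hz / 2
    return nyquist_hz / 2 ** octaves_down

for rate_hz in (44_100, 96_000, 192_000):
    for octaves in (1, 2):
        top_hz = highest_frequency_after_shift(rate_hz, octaves)
        print(f"{rate_hz} Hz source, {octaves} octave(s) down: "
              f"content tops out near {top_hz:.0f} Hz")
```

Under that assumption a 44.1 kHz source shifted two octaves down tops out near 5.5 kHz, while a 192 kHz source shifted by the same amount still reaches 24 kHz.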

I’ve used bird recordings slowed down 4x to mimic whales, after hearing whale sounds sped up that sounded like birds. I also gradually slowed down a creek recording to sound like you’re underwater - in both cases the lack of high end energy in the results worked in my favor, but I still wished I had the option for a full spectrum result after slowing down (this was 1992 technology, so no “PaulStretch”!).



Sent from some crappy device using Tapatalk

Marco Raaphorst

19 Dec 2017

Wagon wheel I know. So higher sampling rates contain higher resolution samples of all frequencies. Never knew that. I always felt a frequency is fixed, so recording something like 8 kHz in 44.1 or 96 contains exactly the same information. Unlike frame rates of video. But if I understand you correctly, it works like frame rates. So using a higher sampling frequency is a higher resolution.

I always felt that to do extreme pitch shifting you need oversampling or extremely high sampling frequencies but ONLY for processing, not for the source, because I always thought tone frequency = sampling frequency (ok, 2 times for Left and Right).

Although it's too heavy for me to understand, I now learned that higher sampling rates are higher resolution files. Mmm, I have been wrong for so many years I must say! Thanks a lot!

(always makes me feel I should stop thinking and just trust my ears, my knowledge in these things is always lacking...)

EdGrip

19 Dec 2017

Sample rate and bit depth are both resolution, one of time and one of amplitude.

Image

selig

19 Dec 2017

Only bit depth is resolution. Increasing sample rate doesn’t increase resolution - it increases frequency response. In other words, going from 44.1 to 96 kHz doesn’t increase the resolution of a 1 kHz signal in any way! It’s not “correct” to think of audio the same way you think of a 2D graphic image, mainly because there is “time” involved. That’s why video/film is actually a better comparison IMO.



Marco Raaphorst

19 Dec 2017

So 1 kHz recorded with 44.1 or 192 is the same?

I always thought that. But when you compared it to video frames I thought I was wrong for all those years. 120 frames simply captures more images but I always thought sound is not like video. We can only capture a frequency at a rate which is as fast as the frequency. A sampled frequency doesn't lack the images like a 50 frames video will do. Right?

selig

19 Dec 2017

Marco Raaphorst wrote:
19 Dec 2017

Wagon wheel I know. So higher sampling rates contain higher resolution samples of all frequencies. Never knew that. I always felt a frequency is fixed, so recording something like 8 kHz in 44.1 or 96 contains exactly the same information. Unlike frame rates of video. But if I understand you correctly, it works like frame rates. So using a higher sampling frequency is a higher resolution.
You were correct in thinking increased sample rate doesn't increase resolution, just like with video. In both cases it's when you try to represent a frequency beyond Nyquist (half the sample rate) that you run into aliasing, because there are not enough frames/samples to accurately reconstruct that frequency. So just like with audio, as you increase the frequency of something up to and beyond Nyquist, you start to see/hear the frequency go in the opposite direction! Pitch starts to decrease, wagon wheels start to turn the opposite direction - same exact thing happening for the same exact reason.
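A minimal sketch of that folding, using the same formula for frames and for audio samples. The function name is hypothetical, and the wheel is assumed to carry a single mark (a wheel with many identical spokes starts to alias at lower speeds); the audio line assumes the tone reaches the converter with no anti-aliasing filter in the way.

```python
def apparent_frequency(true_hz, sample_rate_hz):
    """Frequency that shows up after sampling, folded into -Nyquist..+Nyquist.

    A negative result means the motion appears reversed.
    """
    nyquist_hz = sample_rate_hz / 2
    return (true_hz + nyquist_hz) % sample_rate_hz - nyquist_hz

# Single-mark wagon wheel filmed at 24 frames per second
for wheel_cps in (10, 13, 23, 24):
    print(f"{wheel_cps} CPS wheel at 24 FPS looks like "
          f"{apparent_frequency(wheel_cps, 24):+.0f} CPS")

# Unfiltered 30 kHz tone sampled at 44.1 kHz
print(f"30 kHz tone at 44.1 kHz is heard at "
      f"{abs(apparent_frequency(30_000, 44_100)):.0f} Hz")
```

Below Nyquist the result equals the input; above it the value folds back down, and the sign flips for the wheel that seems to turn backwards.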
Marco Raaphorst wrote:
19 Dec 2017
I always felt that to do extreme pitch shifting you need oversampling or extremely high sampling frequencies but ONLY for processing, not for the source, because I always thought tone frequency = sampling frequency (ok, 2 times for Left and Right).

Although it's too heavy for me to understand, I now learned that higher sampling rates are higher resolution files. Mmm, I have been wrong for so many years I must say! Thanks a lot!

(always makes me feel I should stop thinking and just trust my ears, my knowledge in these things is always lacking...)
"Resolution" is a tough word to use for sample rate, because like with 2D graphics it implies the entire signal is changed as you increase sample rate. But that's not the case, and it's why most won't use that word when speaking of sample rate. The confusion comes when you post a 2D picture and say "see, more samples = more accurate waveform". But a sine wave at 1 kHz is as accurate at 44.1 kHz sample rate as it is at 96 kHz.

I tend to avoid the term "resolution" (which is a marketing term used to describe higher sample rates and larger bit depths) because it's misleading. Yes, higher sample rates and larger bit depths can indeed store more information. But it comes in the form of wider frequency response (for sample rate) and wider dynamic range (for bit depth).
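The point that a 1 kHz sine is captured just as accurately at 44.1 kHz as at 96 kHz can be checked numerically. The sketch below is an illustration under assumptions: it uses textbook Whittaker-Shannon (sinc) interpolation truncated to a finite number of samples, not the filter of any real D/A converter, and the function names are made up for the example.

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x) / (pi x), with the x = 0 case handled."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(freq_hz, sample_rate_hz, t_seconds, half_width=2000):
    """Rebuild a sampled sine at an arbitrary instant from nearby samples.

    The ideal sum runs over all samples; truncating it to
    2 * half_width + 1 terms leaves a small residual error.
    """
    position = t_seconds * sample_rate_hz      # time measured in samples
    centre = round(position)
    total = 0.0
    for n in range(centre - half_width, centre + half_width + 1):
        sample = math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        total += sample * sinc(position - n)
    return total

t = 0.0123456                                  # an instant between sample points
exact = math.sin(2 * math.pi * 1000 * t)
for rate_hz in (44_100, 96_000):
    rebuilt = reconstruct(1000, rate_hz, t)
    print(f"{rate_hz} Hz: rebuilt {rebuilt:+.5f}, exact {exact:+.5f}")
```

Both rates land on the true value to within the small error caused by truncating the sum, even though the chosen instant falls between sample points at either rate.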

Just like with sample rate, increasing bit depth doesn't change ALL audio signals. Each extra bit adds roughly 6 dB of dynamic range by lowering the noise floor; it doesn't redraw the loud parts of the signal more "accurately". A full scale sine wave is already far above the noise floor at 16 bits, so it sounds exactly the same with 16 bit resolution as it does with 24 bit resolution. It's when you lower the level toward the noise floor that a small bit depth quickly falls apart: at 1 bit there is only about 6 dB of dynamic range being stored, so there is almost no room for anything quieter than full scale. That's why the term "resolution" can be misleading if you keep "seeing" a 2D drawing of an audio waveform when you hear the word "resolution".

To re-cap. Sample rate affects frequency response, bit depth affects dynamic range (and thus also signal to noise, or noise level).
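The "about 6 dB per bit" rule of thumb can be sketched as follows. This is the ideal ratio between full scale and one quantisation step; real converters achieve less, and the function name is just for illustration.

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantisation step, in decibels."""
    return 20 * math.log10(2 ** bits)

for bits in (1, 16, 24):
    print(f"{bits:2d} bit: about {dynamic_range_db(bits):.1f} dB")
```

That gives roughly 6 dB for 1 bit, 96 dB for 16 bit and 144 dB for 24 bit.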

It's worth reposting this excellent video that explains all of this as clearly as I've ever seen. Jump to 8:42 for the section on bit depth:

selig

19 Dec 2017

Marco Raaphorst wrote:
19 Dec 2017
So 1 kHz recorded with 44.1 or 192 is the same?

I always thought that. But when you compared it to video frames I thought I was wrong for all those years. 120 frames simply captures more images but I always thought sound is not like video. We can only capture a frequency at a rate which is as fast as the frequency. A sampled frequency doesn't lack the images like a 50 frames video will do. Right?
24 FPS video can capture a 10 CPS wagon wheel just as accurately as 60 FPS, because 10 CPS is below the 12 CPS Nyquist limit of 24 FPS. But only 60 FPS (Nyquist limit 30 CPS) can accurately capture a 20 CPS wagon wheel - make sense?

Marco Raaphorst

19 Dec 2017

I still don't understand it.

"But a sine wave at 1 kHz is as accurate at 44.1 kHz sample rate as it is at 96 kHz."

Exactly what I always thought. So 192 doesn't make sense if you don't need frequencies higher than 20 kHz. Anything below that is captured perfectly using 44.1.

EdGrip

19 Dec 2017

I suppose it's resolution of time. If you take a measurement, and then another measurement a minute later, anything could have happened in the interim. Measuring more often = more information. A higher sample rate can resolve greater time detail.

It was my favourite image of the ones that came up on Google images.

I don't know much about the algorithms that very cleverly extrapolate what probably happened in the interim from the samples on either side. I gather that's where inter-sample peaks come from.

EdGrip

19 Dec 2017

I like resolution, as a term. I think of "definition" as being the marketing devil, as in "HD sunglasses".

selig

19 Dec 2017

EdGrip wrote:
19 Dec 2017
I suppose it's resolution of time. If you take a measurement, and then another measurement a minute later, anything could have happened in the interim. Measuring more often = more information. A higher sample rate can resolve greater time detail.

It was my favourite image of the ones that came up on Google images.

I don't know much about the algorithms that very cleverly extrapolate what probably happened in the interim from the samples on either side. I gather that's where inter-sample peaks come from.
There is no "information" on either side of the sample in digital sampling theory, because you sample a bandwidth limited signal (due to the filter in the A/D). Once the signal is bandwidth limited, you have essentially removed any information "between the samples", so there is no "interim" as you suggest.

We agree that measuring more often = more information. But in the case of audio, it's ONLY going to affect the frequency response. A higher sample rate can capture a higher frequency, that's it. With audio, there is no "time detail" parameter.

The confusion comes from the concept of "more information". For a 2D image, more information means more image resolution - the ENTIRE image is improved with more information/resolution. But with audio, more information either means wider frequency response or greater dynamic range, which does NOT affect the entire "image".

The only thing in common with both concepts is that there is a point in both scenarios in which more information/resolution no longer changes what we see/hear. That's where the "Law of Diminishing Returns" kicks in. ;)

And "intersample peaks" do not come from extra information hiding between the samples that a higher sample rate would have captured. The information was always there; the "problem" comes from using the sample "points" alone to measure the level in the digital domain, which under-reads it. The "inter-sample" peaks represent the actual waveform 100% accurately. We tend to see them as the "error", but it's the sample level peak meters that are showing us the inaccurate level in the first place.
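A small worked example of a sample-peak meter under-reading. It is a sketch under assumptions: a full-scale sine at exactly a quarter of the sample rate with a 45 degree phase offset, chosen so that every sample misses the crest of the wave. The function name is hypothetical.

```python
import math

def sample_peak(freq_hz, sample_rate_hz, phase_rad, count=64):
    """Largest absolute sample value of a full-scale (amplitude 1.0) sine."""
    return max(
        abs(math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase_rad))
        for n in range(count)
    )

rate_hz = 44_100
peak = sample_peak(rate_hz / 4, rate_hz, math.pi / 4)
print(f"sample peak {peak:.4f}, true peak 1.0000, "
      f"meter under-reads by {20 * math.log10(1 / peak):.2f} dB")
```

Every sample sits at about 0.707 of full scale, so a meter that only looks at sample values reads about 3 dB low, while the band-limited waveform those samples describe still peaks at 1.0.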

selig

19 Dec 2017

EdGrip wrote:
19 Dec 2017
I like resolution, as a term. I think of "definition" as being the marketing devil, as in "HD sunglasses".
It's a great term, just not for audio because of the inaccurate connotations IMO (as evidenced by the confusion it creates when applied to audio).
As a term, it can be defined many ways, including relating to vision:

a : the process or capability of making distinguishable the individual parts of an object, closely adjacent optical images, or sources of light
b : a measure of the sharpness of an image or of the fineness with which a device (such as a video display, printer, or scanner) can produce or record such an image, usually expressed as the total number or density of pixels in the image (for example, "a resolution of 1200 dots per inch").

But you won't find any mention of "audio" in the definition of the word. Why not just say that increasing sample rate affects frequency response, and increasing bit depth affects dynamic range?

normen
Posts: 3431
Joined: 16 Jan 2015

19 Dec 2017

EdGrip wrote:
19 Dec 2017
I suppose it's resolution of time. If you take a measurement, and then another measurement a minute later, anything could have happened in the interim.
No, not anything. And that's exactly the "secret" to why in digital audio [bit depth = S/N ratio] and [sampling frequency = highest captured frequency]. NOTHING else, no "resolution", no "in between". Look at that video Selig posted, it explains it in detail.

avasopht
Competition Winner
Posts: 3929
Joined: 16 Jan 2015

19 Dec 2017

selig wrote:
18 Dec 2017

One primary advantage to using higher sample rates is to be able to use less steep (and thus more sonically transparent) filters on the A/D. Instead of a very steep filter starting around 19-20 kHz at a sample rate of 44.1 kHz, covering a range of only a few semitones from 19-20 kHz to Nyquist (22.05 kHz), if you use a sample rate of 96 kHz you can now use a very gentle (comparatively speaking) filter from 24 kHz to 48 kHz. Now you have an entire octave to work with instead of a few semitones, and the filter doesn't even begin sloping until comfortably above the range of human hearing.
Well, ... you may also find better results running your sound card at 44.1 kHz because your analogue filter will be targeting 96/192 kHz and may use a steeper digital filter before downsampling to 44.1 kHz. That steeper digital filter may not exist at the higher sample rates for your sound card. For there to be a steep digital filter at your active samplerate, the sound card would have to actually sample at a higher rate.

The same thing happens with cameras. They have to take much higher resolution images and downsample (usually making use of noise reduction filters). Some cameras will actually take shots at their native resolution, but the individual pixels will look like garbage. You'd need to get a really old webcam to see this.

selig

19 Dec 2017

Not sure I totally follow your comparison to cameras - cameras have a fixed number of pixels in the sensor, so they HAVE to down-sample to achieve different/lower resolutions. Or you can just use a part (sub-set) of the sensor for a 1:1 ratio between sensor pixels and image pixels.

Digital audio does not have this restriction, and thus can use different sample rates because it does not have a “fixed resolution” like a digital camera.

The advantage you mention is only for cost savings as I understand it, as a steep digital filter is ‘cheaper’ to implement well than a steep analog filter. And since the converters can already sample at higher rates, why not use the higher rate for the analog conversion then down-sample and use a digital filter at that point - though I have to wonder if there’s extra latency involved in the conversion (which may be moot since there’s less latency with the higher sample rate)?
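The "sample fast, then filter digitally and down-sample" idea can be sketched like this. It is only an illustration under assumptions: a generic windowed-sinc (Hamming) FIR low-pass with 101 taps and a 21 kHz cutoff, running at 96 kHz and decimating by 2 to 48 kHz. None of these numbers or function names come from an actual sound card design.

```python
import math

def lowpass_taps(cutoff_hz, sample_rate_hz, num_taps=101):
    """Windowed-sinc (Hamming) low-pass FIR coefficients, unity gain at DC."""
    fc = cutoff_hz / sample_rate_hz            # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [tap / gain for tap in taps]

def filter_and_decimate(signal, taps, factor):
    """Apply the FIR filter and keep every factor-th output sample."""
    return [
        sum(taps[k] * signal[i - k] for k in range(len(taps)))
        for i in range(len(taps) - 1, len(signal), factor)
    ]

def tone(freq_hz, sample_rate_hz, count):
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

FAST_RATE_HZ = 96_000
taps = lowpass_taps(21_000, FAST_RATE_HZ)
for freq_hz in (1_000, 30_000):
    decimated = filter_and_decimate(tone(freq_hz, FAST_RATE_HZ, 2000), taps, 2)
    print(f"{freq_hz} Hz tone after decimation to 48 kHz: RMS {rms(decimated):.4f}")
```

The 1 kHz tone passes through at essentially full level, while the 30 kHz tone, which would otherwise fold down to 18 kHz at the 48 kHz rate, is strongly attenuated before the down-sampling step.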



avasopht

19 Dec 2017

selig wrote:
19 Dec 2017
Not sure I totally follow your comparison to cameras
Mainly because they also tend to employ the use of software filters (at least in your mobile phones).
selig wrote:
19 Dec 2017
The advantage you mention is only for cost savings as I understand it, as a steep digital filter is ‘cheaper’ to implement well than a steep analog filter.
Yep. I've no idea where the price cut off point is for when sound cards start using pure analogue anti-aliasing filters.

EdGrip

20 Dec 2017

Thanks Selig!
It's going to take some reading and head scratching to properly internalise how a wave can always be reconstructed perfectly from comparatively few sample points, irrespective of where in the cycle they happen to fall.

I think I get that *any* difference/variance between
- the mathematical curves described/generated from the sample points, and
- the original sound
...is, by its nature, higher-frequency content. So when you add more sample points to accurately encode those differences too, in fact all you're doing is recording higher frequencies than you were before. Is that about right?

selig

20 Dec 2017

Yes.

The thing that helped me understand how a wave can always be accurately reconstructed is this: the wave MUST be “bandwidth limited” for this to work.

In other words, you first define the range of possible/allowable frequencies. This is done with the A/D filter, which removes all “illegal” frequencies before digitizing.

That same filter is used as a reconstruction filter on the digital signal, thus ensuring that only “legal” frequencies are allowed to be converted back to analog.

This is how a sine wave can be accurately reconstructed from so few samples - because it can only be in a certain range, all other possibilities are excluded (filtered).

Also, remember that any frequency within the top octave WILL be a sine, because no matter what you “input”, all harmonics will be removed. Say you input a 15 kHz sawtooth wave. The first harmonic above the fundamental is one octave up, or 30 kHz in this case. But at a sample rate of 44.1 kHz, that harmonic will be filtered out (as well as the rest), leaving a sine.

Even if you use a higher rate, that 30 kHz harmonic won’t be reproduced by your amp/speaker in most cases, and even if it is, won’t be heard by our ears - so, a sine is what you get in all cases.
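As a minimal sketch of that sawtooth example, this lists which harmonics of a tone fit below Nyquist at a few sample rates. It assumes an ideal filter that removes everything at or above Nyquist; the function name is made up for illustration.

```python
def harmonics_below_nyquist(fundamental_hz, sample_rate_hz):
    """Harmonics (fundamental included) that fit below half the sample rate."""
    nyquist_hz = sample_rate_hz / 2
    harmonics = []
    multiple = 1
    while multiple * fundamental_hz < nyquist_hz:
        harmonics.append(multiple * fundamental_hz)
        multiple += 1
    return harmonics

for rate_hz in (44_100, 96_000, 192_000):
    print(f"15 kHz sawtooth at {rate_hz} Hz keeps: "
          f"{harmonics_below_nyquist(15_000, rate_hz)}")
```

At 44.1 kHz only the 15 kHz fundamental survives, which is why the result is a sine; the higher rates keep more harmonics, all of them above 20 kHz.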

And in the video I attached a few posts back, you can see that a sine wave can be perfectly reconstructed all the way up to just below Nyquist!

But the main take-away from this should be the term “bandwidth limited”, which is essential to understand when trying to grasp the “magic” behind digital theory IMO.



avasopht

20 Dec 2017

EdGrip wrote:
20 Dec 2017
Thanks Selig!
It's going to take some reading and head scratching to properly internalise how a wave can always be reconstructed perfectly from comparatively few sample points, irrespective of where in the cycle they happen to fall.
Another conundrum is how frequency modulation and *some distortion of a sample produce the same sidebands and harmonics as would be produced with a higher sample rate, or if they were performed by analogue processes (ignoring aliasing for the moment) ;)

Understanding that might give you plenty of insight.

Always remember that PCM is just a representation of the sound, but not the sound itself. Also note that there is a two way transformation between PCM and the frequency spectrum. What this means is that the frequency spectrum representation is sort-of contained within the PCM, and the PCM is sort-of contained within the frequency spectrum representation.

Typically in the digital world we perform this transformation on blocks of samples (known as windows) whose lengths are powers of two, such as 64, 128, 256, 512 and 1024.

The frequency spectrum, by the way, is represented as sine oscillators (known as bins) with frequencies at fixed intervals (e.g. 0 Hz, 10 Hz, 20 Hz, ...). If you took one oscillator, like 20 Hz, and tried to render it as a wave at the sample rate, you end up with those vertical bars that need a curve to pass through them. If you were to render all the sine oscillators and then add them together, you get back the original PCM wave.

A 64 frame window gives you 64 oscillators.

The two transformation processes are nearly identical, and there is a way to use one transformation to perform both jobs. That might not mean much to you now, but I just want to drum in the idea that a wave sort-of contains more than just a wave.

So you might be looking at these few sample points unable to discern how to draw the curve of the wave, but if you remember that the frequency spectrum is contained within that wave, it becomes apparent that you can just transform the PCM to the frequency spectrum and just sum the bins. You can now render the waveform at any sample rate, and even feed it to analogue oscillators to render it in all the continuous glory that is analogue synthesis.
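Here is a rough sketch of "transform the PCM to the frequency spectrum and sum the bins". It uses a naive discrete Fourier transform written out by hand (real software would use an FFT), the function names are made up, and the in-between rendering is only exact under the assumption made here that the window holds a whole number of cycles.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform: one complex bin per oscillator."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def render(bins, position):
    """Sum all the bin oscillators at a sample position, which may be fractional."""
    n = len(bins)
    total = 0j
    for k, value in enumerate(bins):
        # Bins in the upper half of the list stand for negative frequencies.
        cycles = k if k <= n // 2 else k - n
        total += value * cmath.exp(2j * math.pi * cycles * position / n)
    return (total / n).real

WINDOW = 64
CYCLES = 3   # assumed test tone: exactly 3 cycles per window
pcm = [math.sin(2 * math.pi * CYCLES * t / WINDOW) for t in range(WINDOW)]
bins = dft(pcm)

# Summing the bins at the original positions gives back the PCM samples...
print(max(abs(render(bins, t) - pcm[t]) for t in range(WINDOW)))
# ...and the same sum can be evaluated between the samples.
print(render(bins, 10.5), math.sin(2 * math.pi * CYCLES * 10.5 / WINDOW))
```

The first print shows a round-trip error at floating point noise level, and the second shows the summed oscillators landing on the true sine value halfway between two of the original sample points.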

As selig says above, some D/A converters use the same antialiasing filter as the A/D converter, but there are other D/A methods, such as 1-bit DAC.

There are other properties of waves and signal processing that explain why your speaker can reproduce multiple frequencies just as well as it could reproduce each individual frequency (except when it starts to distort, then they come with harmonics that can result in some cancellation).

*: Need to double check whether this applies to all distortion processes.

---

Note to self: videos and pictures would probably help.

Marco Raaphorst

20 Dec 2017

I can barely follow it. Gave up. Back to just listening for me :)
