What audio really is
Someone once claimed that where graphics are made of pixels that form the picture, audio is made of frames and has its own "pixels", just audio-wise. If that's true, it would mean digital audio can only contain a certain number of different frames. If so, I can already think of lots of benefits for developing things in audio science. Some of my tracks might even share some frames with the Prodigy's music.
I might be wrong, but I think the term "sample rate" is the audio equivalent of pixel resolution.
Yeah, that should at least be one thing that's related to it.
I don’t believe that’s quite right. as long as the sample rate is at least double the highest frequency you’re looking to reproduce, my understanding is that the audio will be played back perfectly (as a perfect sine wave). so it’s not like it’s being played back in a “pixelated” form, or in discrete steps; it’s actually a completely faithful recreation of the originally captured sound (unless there’s some kind of aliasing).
Selig would know for sure, but I’m fairly certain that’s how it works.
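For what it's worth, that aliasing caveat is easy to demonstrate with a few lines of Python (my own sketch, not from the thread; the tone and rate values are arbitrary):

```python
import math

def sample_tone(freq_hz, rate_hz, n_samples):
    """Return n_samples of a unit sine tone sampled at rate_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(n_samples)]

RATE = 8000                         # sample rate in Hz; Nyquist limit is RATE / 2 = 4 kHz
low = sample_tone(1000, RATE, 16)   # 1 kHz: below Nyquist, captured faithfully
high = sample_tone(7000, RATE, 16)  # 7 kHz: above Nyquist, aliases to 8000 - 7000 = 1 kHz

# After sampling, the 7 kHz tone is sample-for-sample identical to the 1 kHz
# tone (just phase-inverted), so the two are indistinguishable: that's aliasing.
assert all(abs(a + b) < 1e-9 for a, b in zip(low, high))
```

Anything below half the sample rate survives sampling intact; anything above it folds back down and masquerades as a lower frequency.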
So your claim is that there can be new audio that gets its own binary representation, because it's new?
I read the first chapter, but I'm not understanding it to its fullest, so I hope you can type your own, more understandable answer about it.
Is audio already settled, in the sense that no new audio can occur anymore?
Each individual sample has most likely been reproduced a million times by different people (as it is just a single 8/16/24/32 bit value). It's the sequence of different sample values that makes the music in the digital realm.
If you want to go deeper this video has the details on both audio and video: https://xiph.org/video/vid1.shtml
- pushedbutton
- Posts: 1541
- Joined: 16 Jan 2015
- Location: Lancashire, UK
Everything is an illusion, nothing really exists.
@pushedbutton on twitter, add me, send me a message, but don't try to sell me stuff cos I'm skint.
Using Reason since version 3 and still never finished a song.
This statement is an illusion as well.
As long as someone believes some illusions are more realistic than others, they can reuse the concept of existence to encapsulate those illusions.
I don’t understand what you’re asking here, but here’s some good info that might help: https://www.soundonsound.com/techniques/digital-myth
one of the basic takeaways is that the reproduced audio will perfectly represent everything up to half of the sampling rate (except for the possible aliasing artifacts I mentioned—it explains that too).
a better way to put it might be to say there are no “pixels” going in, and no “pixels” coming out. only internally, within the CPU, you can think of it as being pixelated.
Could you please answer this question then? Is there any possibility of new audio, or is there a limit to how many different frames there can be?
I can’t answer the question because I don’t understand what’s being asked. I’m just saying there are no audio “pixels” involved.
My question explained: since a .wav file is made by putting frames in a row to form a tube, aren't the frames technically subject to a limit, so that there's only a finite set of possible unique frames?
I’m still not sure I understand, but the number of possible discrete volume levels at a given point in an audio file is huge to begin with (depending on bit depth), and over the course of even a few samples, you have to multiply each of those counts together just to come up with the number of possible configurations. it quickly becomes astronomical.
for example, in a 16 bit recording, there are 65,536 possible volume levels per sample. within 2 samples, the number of possible different configurations is already at more than 4 billion. if you go out to 3 samples, that’s almost 281.5 TRILLION possible differences.
there is technically a numerical limit over a given amount of time at a given sampling rate, at a given bit depth, but the numbers are so far beyond staggering, they’re not worth thinking about.
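The arithmetic behind those figures is easy to check; here's a small sketch of the counting argument (Python handles the big integers natively; the one-second digit estimate is my own addition):

```python
import math

# Counting possible digital waveforms: a 16-bit sample can take 2**16 = 65,536
# values, and n samples in a row give 65,536**n possible combinations.
BIT_DEPTH = 16
levels = 2 ** BIT_DEPTH       # 65,536 possible volume levels per sample

two_samples = levels ** 2     # 4,294,967,296 (over 4 billion)
three_samples = levels ** 3   # 281,474,976,710,656 (about 281.5 trillion)

# One second of mono 16-bit audio at 44.1 kHz allows 65,536**44,100 distinct
# waveforms; estimate how many decimal digits that number has.
digits = math.floor(44100 * BIT_DEPTH * math.log10(2)) + 1

print(two_samples, three_samples, digits)
```

A single second of CD-quality audio already has a number of possible waveforms with over 200,000 decimal digits, which is why the limit is real but irrelevant in practice.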
Ok nice, so the answer is: there are limits on the possible unique frames, but the number of them is astronomical. Right?
well, if you’re talking about a single sample (or frame, as you put it), there are only 65,536 possibilities at 16-bit. but we don’t listen to music as single samples. when you put that number in context with the samples around it, the possible combinations become huge.
Yes, but it's an illusion that when you hear stuff there are lots of variations between the slices when you compare them. One frame covers such a tiny span of time, and each frame also has a lot in common with its brothers and sisters.
(my hole-ridden understanding coming up:)
"frame" doesn't really work because a sample is a number, representing a single point on a wave. It's like a single bar on a bar chart; it has an y axis, which is the height of the wave at that moment (the resolution/accuracy of this axis depends on the bit depth) but no x axis (time).
This can't be compared to a tube. It's one dimensional.
Once you have two samples in a row, then you've got a second dimension - the x axis, or time. This is the smallest thing that might be considered a "frame" - you have two bars on the bar chart, describing a change between two levels of a sound wave. The sample rate (Hz) tells you how many samples occur per second, so the time between sample 1 and sample 2 is a tiny fraction of a second (1/sample rate).
With two samples, we know how much the height of the wave has changed between two points, and how fast it changed. Those two nuggets of info add up to a rate of change, or frequency.
So if two samples together is only a frequency, you could ask "has this frequency been recorded before?". Probably, yes. There are a lot of audio files out there. Although as guitfnky said above, the number of possible configurations of two samples in a row is already over 4 billion, and that figure multiplies by 65,536 every time you add another sample. It's very unlikely your music shares 3 samples in a row with the Prodigy.
4 samples in a row would take about 0.00009 seconds to listen to at 44.1 kHz, or roughly a tenth of a millisecond.
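The timings above are just division by the sample rate; here's a quick check (my own sketch, assuming CD-quality 44.1 kHz):

```python
# How long do a few samples last at CD quality (44.1 kHz)?
RATE = 44100  # samples per second

for n in (1, 2, 4):
    seconds = n / RATE
    print(f"{n} sample(s) = {seconds:.8f} s = {seconds * 1000:.4f} ms")
# 4 samples last roughly 0.00009 s, i.e. about a tenth of a millisecond.
```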
"frame" doesn't really work because a sample is a number, representing a single point on a wave. It's like a single bar on a bar chart; it has an y axis, which is the height of the wave at that moment (the resolution/accuracy of this axis depends on the bit depth) but no x axis (time).
This can't be compared to a tube. It's one dimensional.
Once you have two samples in a row, then you've got a second dimension - the x axis, or time. This is the smallest thing that might be considered a "frame" - you have two bars on the bar chart, describing a change between two levels of a sound wave. The sample rate is the time that passes between sample 1 and sample 2, measured as a fraction of a second (Hz).
With two samples, we know how much the height of the wave has changed between two points, and how fast it changed. Those two nuggets of info add up to a rate of change, or frequency.
So if two samples together is only a frequency, you could ask "has this frequency been recorded before?". Probably, yes. There are a lot of audio files out there. Although as guitfnky said above, the possible configurations of two samples in a row is already 4 billion, and that figure multiplies by 65,536 every time you add another sample. It's very unlikely your music shares 3 samples in a row with the Prodigy.
4 samples in a row would take 0.00009 seconds to listen to, or about a tenth of a millisecond.
Congratulations, you've found out about quantisation noise. Now you could move on and discover more fascinating things about digital audio like aliasing and analogue reconstruction by finally watching the videos from here: https://xiph.org/video/
- Enlightenspeed
- RE Developer
- Posts: 1106
- Joined: 03 Jan 2019
Each "frame" of digital audio is a single sample that is a measurement of the amplitude at that point, and this is stored as a single integer ranging from 0 to 16,777,216 (assuming 24-bit). By manipulating a single sample's value, all you do is change the volume of that single sample.
The frequencies and timbres of the output audio are not determined by the internal values of the individual samples, but rather their aggregation over time.
Thus you cannot turn a single sample into a "sound" - if you copy that value across all other samples then what you actually get is a silent flat line, because no wave is being created.
I think this answers the question that you are struggling to ask correctly.
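The flat-line point can be verified with a toy DFT: a buffer of one repeated value has all its energy in the DC bin (a constant offset, inaudible as a tone), while a real tone has energy at its frequency. This is my own sketch, not from the thread; the buffer size and bin choices are arbitrary:

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive discrete Fourier transform; returns one magnitude per frequency bin."""
    N = len(signal)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * n / N)
                    for n, x in enumerate(signal))) for k in range(N)]

N = 64
flat = [0.5] * N                                              # one value copied everywhere
tone = [math.cos(2 * math.pi * 5 * n / N) for n in range(N)]  # 5 cycles per buffer

flat_bins = dft_magnitudes(flat)
tone_bins = dft_magnitudes(tone)

# The flat buffer has energy only in bin 0 (DC offset): no audible frequency.
assert all(m < 1e-9 for m in flat_bins[1:])
# The cosine concentrates its energy in bins 5 and N - 5.
assert tone_bins[5] > 1 and tone_bins[N - 5] > 1
```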
I would disagree. Each sample represents one impulse, a sinc(π t/T) function centred on it, so it spreads its influence into the inter-sample gaps between the other samples and contributes to the frequency content.
- Enlightenspeed
Really? I've always seen it regarded as amplitude, and from multiple sources.
I haven't moved into audio yet, so if you don't mind can you point me in the direction of any texts that cover it being taught in this manner?
Cheers,
Brian
I can't point to it on the internet, but it comes from analog signal reconstruction from sampled values, interpolation of the signal under the condition of a limited frequency band, the Nyquist ISI criterion, and such.
You can see a perfect illustration in iZotope RX: zoom in on a (silent) waveform until you can see individual samples, then try to move one sample.
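The reconstruction idea can be sketched directly: Whittaker-Shannon interpolation sums one sinc per sample and recovers the band-limited waveform between sample points. This is my own illustration (tone, rate, and buffer size chosen arbitrarily); the truncated sinc tails mean it's accurate only to within a small error:

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Whittaker-Shannon interpolation; t is measured in sample periods."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

RATE = 100.0   # sample rate (Hz)
FREQ = 3.0     # tone well below the 50 Hz Nyquist limit
N = 2000       # enough samples that truncating the sinc tails barely matters
samples = [math.sin(2 * math.pi * FREQ * n / RATE) for n in range(N)]

# Evaluate the reconstruction halfway BETWEEN two samples, deep in the middle
# of the buffer, and compare it with the true continuous waveform there.
t = 1000.5
true_value = math.sin(2 * math.pi * FREQ * t / RATE)
approx = reconstruct(samples, t)
assert abs(approx - true_value) < 1e-2   # matches to within truncation error
```

This is why each stored number influences the waveform between the other samples, not just at its own instant.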