Do muted sound layers in DAWs affect the sound quality and the size of the track?


What do you think?

Do muted sound layers in the composition make the sound quality after exporting worse or increase the track size unnecessarily?

I'm only asking because I sometimes export my compositions while working on a remix (sometimes a few synthesizers mess up a little after exporting a track), and I often leave in some muted layers that I'll continue with later.

Since I'm never really sure when I'm done with a remix, I usually just leave the muted layers in the composition.

Do you know if this has any negative effects on the exported remix?

I only leave those in if there aren't many. Otherwise when I'm done, I try to delete anything I know I won't use. I do know that if a soundfont makes a project file take too long to render, muting it won't help.

By the way, you could just try it yourself and see what happens. :P

Edited by timaeus222

It shouldn't, no. It only renders whatever is played, so no interference or added size should transfer to the rendered file.

I am no expert when it comes to all the algorithms and equations that go on underneath it all, though; it hurts my head! So I might be corrected, but I wouldn't have thought it would make any difference.

It would only really affect things within the project itself: it will add to the CPU use etc., but you can always freeze the tracks to free that up in most DAWs.

I tried something out.

I exported the track once with the muted extra layers still in it, and a second time after deleting the muted layers.

According to the file properties, both exports had exactly the same size, so I guess the muted layers don't appear in the exported version.

So I'm fairly sure this doesn't affect the quality of the exported song either.

(at least in my DAW; I'm not sure if this is common to every music production software)

So here's how audio works:

A WAVE file is simply a gigantic list of amplitude vs. time values. Every successive value is the next sample of audio. So if I have 3 seconds of audio at a 44100 Hz (44.1 kHz) sampling rate, that means 44100 of these values will represent the output signal for the first second. The next 44100 samples will be the next second, and so on.

When you mix signals together to create layered sounds, all that's happening (think about waves in physics) is that you're simply adding all of the values together. If this is challenging for you, read up on the concept of superposition. It basically means that if you have smaller "things" contribute to a system, you can express a net thing. Crude example, you can spend $40 on gas, $100 on games, make $500 all in a month. That's 3 separate things, but I can simply express it as one thing, a net gain of $360. No matter how many things happened to my finances that month, it reduces down to one number every month. This is the backbone of physics as well.

IllustrAddSyn.jpg

All of the dotted lines are separate waveforms (layers). Adding them all together gets you the solid waveform, and that's represented in digital signal as a string of values on the y-axis (in computer calculations this will be decimals from -1.0 to ~.999). This is true for simple waveforms, it's also true for every one of your instruments and mixer tracks. At each sample point in time, everything is getting added up into a single value. Doesn't matter if it's 3 instruments or 30. It still sums into one signal.
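Here is a minimal Python sketch of that summing step, just to make it concrete. The `mix` function and the tiny four-sample "layers" are made up for illustration; a real DAW does far more (gain, panning, effects), but the core idea is the same addition.

```python
def mix(layers, muted=()):
    """Sum equal-length layers sample by sample, skipping muted ones."""
    length = len(layers[0])
    out = [0.0] * length
    for index, layer in enumerate(layers):
        if index in muted:
            continue  # a muted layer contributes nothing, same as adding 0.0
        for i, sample in enumerate(layer):
            out[i] += sample
    return out


# Hypothetical four-sample layers (floats in the -1.0..1.0 range).
bass = [0.10, 0.20, -0.10, -0.20]
lead = [0.05, -0.05, 0.05, -0.05]
scratch = [0.30, 0.30, 0.30, 0.30]  # the layer we leave in but mute

with_muted = mix([bass, lead, scratch], muted={2})
without_layer = mix([bass, lead])

print(with_muted == without_layer)  # muting vs. deleting gives the same samples
print(len(with_muted))              # still one value per sample point
```

Three layers or thirty, the output list has exactly one number per sample point.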

In 16-bit audio, each sample will be an integer from -32768 to 32767 (2^16 or 65536 values in total). Since there are 8 bits in a byte, that means each sample is 2 bytes in size.
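A rough sketch of how a float sample could be turned into one of those 16-bit integers. The function name and the "multiply by 32768, then clip" rule are my assumptions for illustration; actual DAWs may scale slightly differently or add dither.

```python
import struct


def float_to_int16(x):
    """Map a float sample in roughly -1.0..1.0 to a signed 16-bit integer."""
    scaled = int(round(x * 32768))
    # Clip so that +1.0 lands on the largest representable value.
    return max(-32768, min(32767, scaled))


for value in (-1.0, 0.0, 0.5, 1.0):
    as_int = float_to_int16(value)
    packed = struct.pack("<h", as_int)  # little-endian signed 16-bit
    print(value, as_int, len(packed))   # every sample packs into 2 bytes
```

Whatever the value is, the packed sample is always 2 bytes long.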

Give it a moment of thought, and you'll quickly realize that no matter how many plugins, how many crazy effects, instruments, imported audio, recorded, processed, whatever...

The size of your wave file will still only ever be the sampling rate * the time length * the bit depth in bytes * the channels (so like, 2 if stereo), not including some miscellaneous bytes of headers and stuff that media players read before they get to the audio data in a wave file.

So that means if my song is in mono at 5 minutes long at 44100 Hz 16-bit, it is most certainly going to be around 300 seconds * 44100 samples per second * 2 bytes per sample = 26460000 bytes, or ~25 MB. If my final output is in stereo, that means there are two channels with that many samples in time, so it doubles to roughly 50 MB.
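The same arithmetic as a small Python sketch, for a five-minute (300-second) track. The function name is hypothetical, and it only counts the raw audio data, not the small header.

```python
def wav_data_bytes(sample_rate_hz, seconds, bit_depth, channels):
    """Size of the raw PCM audio data in bytes (header not included)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * seconds * bytes_per_sample * channels


mono = wav_data_bytes(44100, 5 * 60, 16, 1)
stereo = wav_data_bytes(44100, 5 * 60, 16, 2)

print(mono, round(mono / 1024**2, 1))      # bytes, and the same in MiB
print(stereo, round(stereo / 1024**2, 1))
```

Notice that nothing about instruments, plugins or layers appears in the formula.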

The only thing that differs with your project complexity is how long the computer takes to sum everything up into one signal. Your computer has to pump harder to do all the math when there's more math to do. However, this will only slow down the rendering process when the DAW exports. The final file is still just one signal, and for a given sampling rate, bit depth and channel count, the only determinant of size is the length in time of that signal.

To answer your side thought: yes, this is common to all DAWs. It doesn't matter if one DAW renders the silence of the muted track and the other one doesn't render it at all. The final output is still one signal. Going back to the finances example, rendering a muted track is just adding $0 to my net gain every month... and nothing changes.

Edited by Neblix

Thanks, man - nice explanation.

Though it's hard for me to believe that, with the same output quality options, the file size depends only on the length of the track and not on the density of the signals (number of layers and density of MIDI/sound events), if I've got it right.

But I'll check this after I've fixed my DAW.

Another way to think about is very simple. Imagine recording a band through a mixer which feeds into a good ol cassette deck. The mixer has multiple channels but you're mixing it down into the stereo cassette. If you mute the drummer, his track won't get to the tape despite playing his heart out. It's not much different with plugins in a DAW in most cases. It makes sense to me. ;)

I actually understood that since I took Computer Science. #nerd :tomatoface:

In short, the DAW reads everything that you want to render, even if it's muted, and rejects what is muted, making it take longer to render but still give the same final result, as long as the muted stuff doesn't end up lengthening the song/piece.

Edited by timaeus222

Layers and MIDI/sound events are things that only exist in your DAW. There's no MIDI in a wave file. There's no plugins or effects in a wave file. There's no layers. It's just a string of numbers.

If you import the file in something like Audacity, you can zoom in super close and see each number for every index in time (in my previous example, every 44100 positions is 1 second). What happens for output is that your soundcard will just take these numbers and generate a smooth voltage signal where the voltage will match each sample value and oscillate the speaker in that fashion.

screenshot.png

Imagine over time the speaker cone takes the position of every next y-value (amplitude) as your x position (time) increases. In this example, it wobbles a little bit forward and backward, and then starts pushing really hard forward and backward.

In other words, think of the signal (waveform) like the path of the speaker, where a sample value of 1.0 is the speaker cone pushing all the way forward, and -1.0 is pushing all the way back. All the numbers in between are appropriately all of the speaker cone positions between all the way out and all the way in.

If you're confused about why you can output a bunch of different instruments and frequencies with just one signal, read about Fourier's Theorem. Basically, it's the same as superposition. A bunch of things adding together can be expressed as one thing (like how 2 + 4 + 6 can be expressed like 12). A signal can be represented as a sum of a bunch of sine waves at different frequencies, so if you were to play out multiple instruments through individual speakers for each one, and then play everything as a summed signal through one speaker, you'd hear more or less the same thing.

This is because our ears automatically analyze the frequencies for us. There's a rolled-up sheet in there (the basilar membrane in the cochlea) where each spot responds to a particular frequency; when a spot vibrates, it tells our brain that we hear that frequency. Whether we add a bunch of sine waves together or just draw the square wave by hand, the result is mathematically equivalent; it doesn't matter how it was made beforehand.
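To see the "sum of sine waves" idea in numbers, here is a small Python sketch that builds a square wave from its odd harmonics (the standard Fourier series for a square wave). The function name and the 100 Hz test frequency are arbitrary choices of mine.

```python
import math


def square_from_sines(t, frequency_hz, harmonics):
    """Approximate a +/-1 square wave at time t by summing odd sine harmonics."""
    total = 0.0
    for k in range(harmonics):
        n = 2 * k + 1  # odd harmonics only: 1, 3, 5, ...
        total += math.sin(2 * math.pi * n * frequency_hz * t) / n
    return 4 / math.pi * total


frequency = 100.0
quarter_period = 1 / (4 * frequency)  # middle of the positive half-cycle

one_sine = square_from_sines(quarter_period, frequency, 1)
many_sines = square_from_sines(quarter_period, frequency, 200)

print(one_sine)    # a single sine overshoots the square wave's level of 1.0
print(many_sines)  # with many harmonics the sum settles close to 1.0
```

The more sine waves you add, the closer the sum gets to the shape you could have drawn directly.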

If you've done anything in Photoshop, just think of it as flattening the image. You're taking all the settings and generating just a raw image out of it. You can't go back to find all the blending options, the pen tool paths, the smartobjects, etc. It's just raw pixels. It's the same with rendering a wave file. Samples are like pixels but for audio.

Edited by Neblix

Another way to say it is to think of a graph made by tracking a single particle that oscillates up and down, always at the exact same height as the current position of the wave's tracing. i.e. the particle's oscillation correlates with the generation of the wave.

The photoshop example is also a good one of WAV file generation.

If this doesn't quite click, then try rendering one WAV with normal audio, and one WAV of just silence of the same exact length. It'll be the same exact file size, plus or minus a few bytes.

Edited by timaeus222

Not even; it'll be exactly the same.

0000000000000000

takes up the same amount of space as

1011010010100011

and every other 16-bit binary number. It takes up 16 bits.

So by induction, if the signal length is the same, the size will be the same.
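If you want to check this without a DAW, here is a sketch using Python's standard `wave` module, writing to memory instead of disk. The helper name `render_wav` and the 440 Hz test tone are my own choices. Note that this module always writes the same minimal header; a DAW may add extra chunks, which is one way two exports could differ by a few bytes.

```python
import io
import math
import wave


def render_wav(samples, sample_rate=44100):
    """Write 16-bit mono samples into an in-memory WAV file, return its bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        frames = b"".join(
            int(s).to_bytes(2, "little", signed=True) for s in samples
        )
        wav.writeframes(frames)
    return buffer.getvalue()


count = 44100  # one second of audio
silence = [0] * count
tone = [int(20000 * math.sin(2 * math.pi * 440 * i / 44100)) for i in range(count)]

silence_file = render_wav(silence)
tone_file = render_wav(tone)

print(len(silence_file), len(tone_file))    # identical sizes
print(len(silence_file) == len(tone_file))
```

Same number of samples in, same number of bytes out, regardless of what the samples contain.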

And think about this: instead of 0000000000000000, we can say something like 10000 0, meaning "there are 16 (10000 in binary) 0's". We have captured 16 0's while using fewer than 16 digits to say it. This is the simplest, crudest form of compression (run-length encoding).
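A tiny Python sketch of that "count the repeats" idea. The function names are made up, and it stores (value, count) pairs rather than the exact "10000 0" notation above; it is only meant to show why silence compresses well while a busy signal doesn't.

```python
from itertools import groupby


def run_length_encode(bits):
    """Collapse each run of identical characters into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(bits)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original string."""
    return "".join(value * count for value, count in pairs)


silent = "0000000000000000"
busy = "1011010010100011"

print(run_length_encode(silent))     # one pair covers all 16 zeros
print(len(run_length_encode(busy)))  # many short runs, so little is gained
print(run_length_decode(run_length_encode(busy)) == busy)
```

A plain WAV file doesn't do any of this, which is why silence and noise of the same length take the same space there.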

This is not how mp3 compression works, though. mp3 centers the lower frequencies in a mono channel (yes, mp3 is actually a very destructive compression because it destroys panning for the low end) and cuts out high frequencies. In that case, you've taken all the stereo low end and turned it to mono, making the low end HALF THE SIZE. It's very effective, but also very destructive if you crafted a stereo field down there.

EDIT: My explanation was slightly inaccurate. To read more about what actually happens with the stereo image, read about joint stereo.

Edited by Neblix

I added the uncertainty bit because for a test file, the size was not the same, but the size on disk was the same (though that's what matters IIRC).

25p65xz.jpg

Edited by timaeus222

Size on disk is a function of cluster size. In this case, anything generally close to either file size will have the same size on the disk (because it rounds up to the nearest cluster).
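As a rough illustration of that rounding, here is a Python sketch. The 4096-byte cluster size and the two file sizes are assumptions for the example, not the values from the screenshot, and real filesystems have extra special cases this ignores.

```python
def size_on_disk(file_size_bytes, cluster_bytes=4096):
    """Round a file size up to a whole number of clusters."""
    clusters = -(-file_size_bytes // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes


# Two hypothetical WAV files whose sizes differ by a few header bytes.
file_a = 88244
file_b = 88290

print(size_on_disk(file_a), size_on_disk(file_b))  # same number of clusters
print(file_a == file_b)                            # the actual sizes still differ
```

So matching "size on disk" only tells you the two files fall in the same cluster count, not that they are byte-for-byte the same length.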

So no, the files are not quite the same size. Are they precisely the same length, to the millisecond? Otherwise, I'd guess that there are differences in the headers, specifically the format chunk (which is variable size, see http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html).

Yes, it's the same file, but one has been filled with silence.

You're absolutely right, guys.

The file size doesn't depend on the signals or on the number of instrument layers playing at the same time.

It depends on the length of the track, even if you make the track longer without using any further effects or instruments for the additional time.

I've just checked this in my DAW.

Physics, dude - don't let me feel like a being that is torn between electromagnetic waves and interferences. :D

I just discovered something interesting: Audacity does indeed introduce distortions into your render if there are muted tracks in the project. Really bad ones. I can't think why. The distortions aren't there during playback within the program, they just show up when you export to .WAV.
