Creating a realistic impression of depth in stereo mixes

This might be interesting for all those who want to add more depth to their mixes and separate instruments, vocals and synths in a much cleaner and more realistic way.

Lots of you might already have mastered the skill of creating a vital and structured stereo panorama - the skill of placing instruments and other signal sources along the x-axis between the left and the right side to get a clean mix.
But even with well-structured panning of the sound sources between the left and the right side, the final soundtrack might still sound kinda flat.

So, a much more difficult - and highly desirable - goal is to create a greater impression of depth in your mix.
There are quite a few ways to do this, which I will explain in the following part of this text.

A - Recording your tracks with different microphone positions

In a recording studio with 2 well-chosen microphone positions (one microphone more on the left side, the other one more on the right side - and both at the front of the room) it can be kinda easy for bands to get a good stereo recording with useful depth information without doing too much in the DAW software afterwards.
Somewhere I've read or heard that Michael Jackson, for example, already created some depth impressions in his soundtracks during the recording process by singing into the microphones from various distances and positions within the room.

So, if you have a real band with real instruments and at least two good microphones for recording within a well-prepared room, you can create depth kinda easily by placing the instruments you want to record at different positions within the pickup range of the microphones.
If you place your instrument closer to the left microphone, you will later hear more of it on the left side and less on the right side of your speaker system.
If you place your instrument more in the rear of the room, a variety of different effects will shape the recorded signal and create the impression that the signal is coming more from the rear of the room.

These are the effects you might want to reproduce with the tools in your DAW to create an impression of depth, if you don't use microphones for recording (for example, if you are just working with synths or VSTi samples).

B - Reproducing those depth effects with the following settings or methods in your DAW

1) Volume (sound pressure) level (gain staging)
Louder signals tend to be perceived as closer, quieter signals as more distant (at least you will get this impression if a signal with a more or less constant sound pressure level is moving away from you or coming closer to you).
So make sure that the single tracks in your final mix sit at clearly different (and fitting) volume levels.
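To get a feeling for the numbers, here is a small sketch of the inverse distance law. It assumes an idealized point source in a free field (no reflections), where the level drops by about 6 dB for every doubling of the distance. The function name and the 1 m reference distance are just my own choices for illustration.

```python
import math

def distance_gain_db(distance_m, reference_m=1.0):
    """Level change in dB compared to the level at reference_m.

    Assumption: point source in a free field (inverse distance law).
    """
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(distance_m / reference_m)

# Print a rough fader offset for a few virtual distances.
for d in (1, 2, 4, 8, 16):
    print(f"{d:>2} m: {distance_gain_db(d):6.1f} dB")
```

In a real room the drop is smaller because the reverberation fills up the level, so treat these values as an upper bound for how far to pull a fader down.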

2) Frequencies (EQ adaptation)
Another effect, the absorption of sound in the air (sometimes called dissipation), changes the frequency content of sound events coming from a larger distance.
Especially the higher frequencies of distant sound events are damped much more than their lower frequencies (just compare the dull, rolling thunder of a far-away thunderstorm with the sharp crack of one that is nearly right above your head).
As far as I understand it, this is because the air itself absorbs sound energy on the way (through friction and molecular processes), and this absorption gets much stronger with rising frequency. The speed of sound, however, stays practically the same for all frequencies.
So, over longer distances you will hear less of the higher frequencies, and a sound event from a bigger distance is perceived as duller or less brilliant.
To create the effect of a bigger distance, you can put an EQ plugin with a high shelf filter (cutting the higher frequencies a little bit) on the sound source which you want to move more into the background.
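If you are curious what such a high shelf cut looks like in numbers, here is a minimal sketch of a digital high shelf filter. It uses the widely known biquad formulas from Robert Bristow-Johnson's "Audio EQ Cookbook", not the code of any particular EQ plugin. The -4 dB gain, the 6 kHz corner frequency and the 48 kHz sample rate are just assumed example values.

```python
import cmath
import math

def high_shelf_coeffs(gain_db, corner_hz, sample_rate=48000.0, slope=1.0):
    """Biquad high shelf (Audio EQ Cookbook form), normalized so a[0] == 1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * corner_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / 2.0 * math.sqrt((A + 1.0 / A) * (1.0 / slope - 1.0) + 2.0)
    k = 2.0 * math.sqrt(A) * alpha

    b0 = A * ((A + 1.0) + (A - 1.0) * cos_w0 + k)
    b1 = -2.0 * A * ((A - 1.0) + (A + 1.0) * cos_w0)
    b2 = A * ((A + 1.0) + (A - 1.0) * cos_w0 - k)
    a0 = (A + 1.0) - (A - 1.0) * cos_w0 + k
    a1 = 2.0 * ((A - 1.0) - (A + 1.0) * cos_w0)
    a2 = (A + 1.0) - (A - 1.0) * cos_w0 - k
    return [b0 / a0, b1 / a0, b2 / a0], [1.0, a1 / a0, a2 / a0]

def magnitude_db(b, a, freq_hz, sample_rate=48000.0):
    """Gain of the filter at freq_hz in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

# Example: push a source back with a gentle 4 dB cut above roughly 6 kHz.
b, a = high_shelf_coeffs(gain_db=-4.0, corner_hz=6000.0)
for f in (100.0, 1000.0, 6000.0, 16000.0):
    print(f"{f:>7.0f} Hz: {magnitude_db(b, a, f):6.2f} dB")
```

The low frequencies stay untouched while everything well above the corner frequency ends up around the chosen cut, which is the dulling effect described above.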

3) Time difference between perceived (or received) direct sound and its first stronger audible reflections (initial time delay gap - can be adjusted with pre-delay setting of your reverb plugins)
Just imagine a big wide hallway.
On the one end you are standing, on the other end a drummer is performing a slow drum beat.
Every time the drummer hits his percussion equipment, you will perceive the direct sound of this sound event first (because the direct way is the shortest way the sound can take, at a speed of around 340 m/s under normal air conditions at sea level).
A short time (maybe just a few milliseconds) afterwards, the first stronger reflections from the walls (which travel a longer way than the direct sound) will reach your ears.

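For a given position of the drummer, this time difference grows with the size of the room (longer and/or wider), because the first audible reflections have to travel a longer way.

But the time difference also depends on the distance between the drummer and you. If the drummer in the same hallway played only 1 meter in front of you, the direct sound would arrive almost immediately, while the first reflections would still have to travel to the walls and back, so the gap would be comparatively large. If he plays at the far end of the hallway, the direct path and the reflected paths are almost equally long, and the gap shrinks to very few milliseconds.

So, a longer pre-delay (say 50 ms instead of 5 ms) tends to suggest a bigger room with the sound source close to the listener, while a shorter pre-delay tends to push the sound source farther back into the room.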

But don't overdo this one, because the pre-delay should also fit the room size of your reverb plugin for creating a realistic spatial impression within the perception of the listener.
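The geometry behind the initial time delay gap can be checked with a few lines of code. This is only a sketch under simplified assumptions: listener and sound source both stand on the centre line of the hallway, only the first reflection from one side wall is considered (mirror image method), and the speed of sound is taken as 343 m/s (air at about 20 °C). The function name and the 2 m wall offset are my own example choices.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees Celsius

def initial_time_delay_gap_ms(source_distance_m, wall_offset_m):
    """Delay between the direct sound and the first side wall reflection.

    source_distance_m: distance between source and listener along the hallway
    wall_offset_m: distance from that line to the reflecting side wall
    """
    direct_path = source_distance_m
    # Mirroring the source at the wall puts its image 2 * wall_offset_m to the side.
    reflected_path = math.hypot(source_distance_m, 2.0 * wall_offset_m)
    return (reflected_path - direct_path) / SPEED_OF_SOUND_M_S * 1000.0

# Hallway with side walls 2 m away from the centre line.
for distance in (1.0, 5.0, 15.0, 30.0):
    gap = initial_time_delay_gap_ms(distance, wall_offset_m=2.0)
    print(f"source at {distance:>4.0f} m: gap of about {gap:.1f} ms")
```

In this simple model the gap gets shorter the farther the source moves away, and it gets longer when the walls move outwards, which is why the pre-delay should fit both the room size and the intended distance.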

4) Proportion between direct sound (dry signal) volume and reflections/reverberation (wet signal) volume
Imagine a classic side-scroller for the NES - like "Zelda II: The Adventure of Link", for example.
Link is just standing in the middle of the north castle where princess Zelda sleeps because of a magic spell.
Suddenly, one of those small fairies enters the castle to bring back the flute to Link, the magic flute he lost in the desert some time ago.

But the fairy is kinda playful and plays the flute straight in front of Link's face.
Just see Link's face as the listener or receiver (you) and the flute as the sound source, and try to imagine a full circle (360 degrees) around the flute that is always split into two angles. One angle shows the amount of direct sound (dry signal) hitting Link's face. The other angle takes up the rest of the full circle and shows the part of the sound which turns into reflections and arrives at Link's face as reflections shortly afterwards.

So, if the flute is played straight in front of Link's face, the angle (and also the amount) of the direct sound (the dry signal) might be a large part of the circle (maybe around 120 degrees). The rest of the flute sound will go above his head or behind the fairy, will turn into different kinds of reflections on the walls, the floor and the ceiling of the castle and might come back to Link's face as a mix of perceived reflections (the wet signal).

Link is kinda pissed off and tells the fairy to play the flute somewhere else, but not straight in front of him.
So, the fairy flies around 50 meters away towards one end of the castle and plays the flute again.
Now, the angle and amount of the direct sound (dry signal) hitting Link's face will be much smaller from the farther distance, and the angle and amount of the sound turning into audible reflections (wet signal) for Link will be much greater.

You can also adapt this little example to a three-dimensional room (so, the former full circle around the flute will become a full sphere around the flute, the former two-dimensional angles will become solid angles, Link's two-dimensional head will become a three-dimensional head and the two-dimensional NES castle might become a three-dimensional Wii U castle).

So, the proportion between the dry signal volume and the wet signal volume of a sound source can also create an impression of distance and depth in the perception of the listener.

So, simply use a reverb plugin on your sound source with which you can set the proportion of the dry and the wet signal.
If you add more of the dry signal to the sound source, the sound source might be perceived as closer.
If you add more of the wet signal to the sound source, the sound source might be perceived as farther away or coming more from the rear of the room.
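A rough way to put numbers on this dry/wet relation is the idea of the critical distance, the distance at which direct sound and reverberation are equally loud. The following sketch assumes an idealized room where the direct sound follows the inverse distance law and the reverberant level is the same everywhere. The function names and the critical distance parameter are hypothetical, and how the result maps to the mix knob of a specific reverb plugin is a further assumption you would have to check by ear.

```python
import math

def direct_to_reverb_db(distance_m, critical_distance_m):
    """Direct-to-reverberant ratio in dB (0 dB at the critical distance).

    Assumptions: direct sound drops with 1/distance, reverberant level is constant.
    """
    if distance_m <= 0 or critical_distance_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(critical_distance_m / distance_m)

def wet_energy_share(distance_m, critical_distance_m):
    """Share of reverberant energy in the total signal: 0 = all dry, 1 = all wet."""
    direct_energy = (critical_distance_m / distance_m) ** 2  # reverberant energy set to 1
    return 1.0 / (1.0 + direct_energy)

# Example room with an assumed critical distance of 4 m.
for d in (1.0, 2.0, 4.0, 8.0, 16.0):
    ratio = direct_to_reverb_db(d, critical_distance_m=4.0)
    share = wet_energy_share(d, critical_distance_m=4.0)
    print(f"{d:>4.0f} m: direct/reverb {ratio:+5.1f} dB, wet share {share:.2f}")
```

Close to the source the dry signal dominates, and beyond the critical distance the wet signal takes over, which matches the fairy flying to the far end of the castle.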

Just keep in mind that the room size setting of your reverb plugins only creates an impression of the space around the listener - on its own, it won't create a feeling of depth or distance between the sound source and the listener.

C - Using a 2-channel surround plugin to place your sound sources in a simulated room

This is a pretty interesting, intuitive and very precise tool with which you can place all of your sound sources freely in a two-dimensional interface of a simulated room (the x-axis covers the position between left and right, the y-axis the position between close/front and far/rear).

I already have such a 2-channel surround mode in my DAW software Samplitude Pro X4 Suite - but I never really dared to use it for my remixes because, back then, I didn't understand how to create an impression of depth and how much it benefits the clarity of the mix.
But these days, I'm gradually figuring out how to use it for making much cleaner and more structured mixes with a much more spatial impression.

The good thing is that you won't need a surround speaker system for this purpose. According to the manual, the surround-like stereo mix you create with this 2-channel surround mode is encoded in a way that makes it fully compatible with stereo speaker systems as well as real surround speaker systems.

So, all the spatial information (changes in position, loudness, frequencies and reverberation) of the placed sound sources in this virtual room will be fully reproduced on just two speakers (your studio monitors) or your headphones as well.

I'm not fully sure how this system works in every detail.
But I guess they might have used two well-placed recording microphones at the front of a bigger room with a certain distance to each other (just for the stereo imaging), measured the signal changes caused by various sound sources at different positions in the room (from close positions in front of the microphones, but also from more distant positions), derived some kind of algorithm from these signal changes and finally built a filter from this algorithm.
With the help of such a filter (it's still just my assumption that it might be a filter) you could reproduce the room information and signal changes for all possible positions in this simulated room kinda easily. It would also be much more precise: instead of the time-consuming procedure of working out the exact distance, the correct pre-delay or the proper damping of the high frequencies, you simply drag the sound source with the mouse to the exact position in the simulated room. And it would be pretty realistic, with less contradictory information that could impair the impression of depth.

So, if you place an instrument more in the rear of the simulated room, the perceived volume of the sound source will decrease, the perceived frequencies will change and the reverberation will also change - and all this complex stuff happens just by dragging the symbol for the sound source with the mouse through the virtual room in the really useful 2-channel surround mode interface.
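Just to illustrate the idea of turning a position in such a two-dimensional room into mix settings, here is a toy sketch. It is explicitly not how Samplitude or any other product works internally. It only combines a constant-power pan law for the x-axis with the inverse distance law for the y-axis, and the function name, the value ranges and the 20 m room depth are hypothetical.

```python
import math

def place_source(x, y, max_distance_m=20.0):
    """Map a position in a simulated room to left/right channel gains.

    x: -1.0 (hard left) .. +1.0 (hard right)
    y:  0.0 (front, 1 m away) .. 1.0 (rear, max_distance_m away)
    """
    if not (-1.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("x must be in [-1, 1] and y in [0, 1]")

    # Constant-power pan: left^2 + right^2 stays the same for every x.
    theta = (x + 1.0) * math.pi / 4.0
    left, right = math.cos(theta), math.sin(theta)

    # Depth: level drops with 1/distance, starting from 1 m at the front.
    distance_m = 1.0 + y * (max_distance_m - 1.0)
    level = 1.0 / distance_m

    return {
        "left_gain": left * level,
        "right_gain": right * level,
        "distance_m": distance_m,
    }

# A source slightly to the right and halfway into the room.
print(place_source(x=0.3, y=0.5))
```

A real tool would additionally adjust the EQ and the reverb for every position, as described above, so this only covers the volume part of the picture.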

Of course you can also automate the positions of the sound sources, double the signal sources, vary their distance to each other for a different stereo width or shift these sound sources in parallel or freely along the x- and y-axis through the simulated room.

But this should be just a small impression of the many things you can do in such a 2-channel surround mode.

I hope my reflections on these things might help some of the newcomers and all those who want to know much more about this topic. I tried to explain it in a way in which even I, as a former ecology student, could finally understand some of those very complex things much better. Back then I became kinda desperate with the higher levels of mathematics and physics, because building up knowledge mostly on ready-made formulas might not be the best way of truly understanding natural phenomena and other essential things in life.

Please correct me if I'm wrong with certain assumptions, and feel free to add to my writing if there are further important things that deserve to be mentioned in this topic.

Edited by Master Mi
  • 1 year later...

What do you think - could this be the future of three-dimensional panning for stereo and surround mixes?

Some days ago, I saw a pretty interesting VST plugin for kinda realistic three-dimensional panning which apparently works for stereo mixes as well (it seems to be similar to the 2-channel surround mode I've been using in my DAW for quite some time now):

I know, the plugin might use only the well-known parameters from classic audio productions for creating an image of depth or certain positions in a virtual room - just those parameters I've mentioned before, like volume (sound pressure) level (gain staging), frequencies (EQ adaptation), time difference between perceived (or received) direct sound and its first stronger audible reflections (initial time delay gap - or simply the pre-delay settings) or the proportion between direct sound (dry signal) volume and reflections/reverberation (wet signal) volume.

But with the help of those tools you might set a specific room position of a sound source much faster, much more precisely and in a much more visual and intuitive way than with the cold numeric values of the oldschool way of music production.
And - of course - three-dimensional panorama automations could be done with much less effort and calculating, in far less time and with much better results for realistic three-dimensional staging. You wouldn't need to be a supernatural professor of physics who has already internalized even the finest aspects of how sound propagates from a source in relation to the position and the constitution of the listener.

The interesting thing about this plugin is that you can not only set a sound source between left and right (width) or front and rear (depth) positions.
You can also set the height of a sound source (up and down) - which might also be useful for the surround formats of the future, where speaker units might be placed completely around the listener, as if the listener were inside a sphere full of speakers surrounding him along all three axes.

I don't know where those panorama plugins might go in the future - but I'm sure that they'll also add another component for specific sound directions and spherical angles of sound propagation in relation to the position of the listener.

But even the current possibilities of sound creation and separation of sound sources in a three-dimensional room within a normal stereo recording already seem to be gigantic, as you can hear in the following sound example apparently created with this plugin:

So, what do you think - is it worth going for such a three-dimensional panning plugin or is it rather redundant for audio productions?
