What Is Stereo Audio/Sound? (Vs. Mono & Surround)

Stereo audio has been a dominant format for analog and digital audio for decades, particularly in the music industry. It's a relatively simple format that does a good job of emulating the way we naturally hear the world around us (so long as we're monitoring it appropriately). And while it has its drawbacks, it remains one of the most important audio formats we have, if not the most important. If you're interested in learning the key aspects of stereo audio, you've come to the right place.

What is stereo audio? Stereophonic audio is a two-channel format complete with left and right audio channels. When played back on an appropriate and properly positioned system, stereo audio (reproduced as sound) offers depth and width (and arguably height) as the left channel reaches our left ear first and vice versa.

That's an incredibly condensed answer to a big question. There's a lot to unfold, so in this article, we'll discuss stereo audio in great detail, touching on how it relates to our auditory system, how it differs from other popular formats, and how we go about recording, mixing and listening to stereo audio.


Audio Vs. Sound

Before we get into the bulk of the information, let's quickly go over sound and audio.

The key difference between sound and audio is their form of energy. Sound is mechanical wave energy (longitudinal sound waves) that propagates through a medium causing variations in pressure within the medium. Audio is made of electrical energy (analog or digital signals) that represents sound electrically.

Sound is physical. Sound waves travel through the air (and other media), causing the medium's particles to oscillate. These oscillations produce small variations in pressure and displacement within the medium: the peaks of the sound wave correspond to maximum compression and the troughs to maximum rarefaction, repeating in cycles.

The frequency of these cycles is measured in Hertz (cycles/second). Sound waves are generally made of many overlapping frequencies that, when sounded together, yield the character of the sound itself.

There are three basic frequency ranges when it comes to human hearing:

  • Audible sound, as we hear it, is within the frequency range of 20 Hz and 20,000 Hz.
  • Infrasound is the inaudible sound below 20 Hz.
  • Ultrasound is the inaudible sound above 20,000 Hz.
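As a quick illustration of the three ranges above, here's a minimal Python sketch. The function name `classify_frequency` is my own (hypothetical), and the 20 Hz and 20,000 Hz limits are the nominal figures from the list, not hard boundaries that apply to every listener.

```python
def classify_frequency(freq_hz: float) -> str:
    """Label a frequency using the nominal 20 Hz to 20,000 Hz audible range."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    if freq_hz < 20:
        return "infrasound"
    if freq_hz > 20_000:
        return "ultrasound"
    return "audible"


# Try a few example frequencies (values chosen only for illustration).
for f in (5, 440, 25_000):
    print(f, "Hz:", classify_frequency(f))
```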

Audio, on the other hand, is a representation of sound, either as potential (stored) or kinetic (active) energy. That is, audio signal waveforms mimic those of the sound waves they represent, and in an ideal system, the reproduced sound would be identical to the sound the audio represents.

Audio can be analog (continuous) or digital (discrete) in storage. However, for playback to occur, the audio must be analog so that the alternating current can be properly amplified to drive the transducer (speaker or headphone driver) that will ultimately convert the audio into sound.

So sound waves are the actual longitudinal waves that interact within acoustic environments, whereas audio is an electrical (analog or digital) representation of sound.

Related Article On Sound & Audio

To learn more about sound and audio, check out my article What Is The Difference Between Sound And Audio?


The Human Auditory System

The human auditory system is rather complex, but I'd like to at least go over the basics here.

Sound waves cause the particles of a medium to vibrate. When those vibrating particles reach our ears, they cause our eardrums to vibrate in response.

The vibration of our eardrums effectively turns the sound wave energy into mechanical energy, and the vibration of our eardrums causes the bones of the middle ear (malleus, incus and stapes) to vibrate as well. This energy is transmitted through the bones and into the inner ear in the form of hydraulic energy in the fluid of the cochlea. This fluid stimulates the cochlea's hair cells, which convert the hydraulic energy into electrical signals that can be processed by our brains.

So, put simply, our auditory system allows us to perceive sound waves and hear the sounds around us. This happens nearly instantaneously, allowing us to sense the world around us.

An important aspect of our auditory system, especially for the purposes of this article, is that we have two ears, and in individuals with healthy hearing, both ears can hear sound sufficiently well.

Because we have two ears, we can localize sound sources around us. The seemingly insignificant time differences between a sound wave reaching one ear before the other are enough to give us an idea of where the sound is coming from.
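To get a feel for how small these time differences are, here's a rough Python sketch. It assumes a very simplified model (two ears as points in free space, ignoring the head itself), an ear spacing of 0.18 m and a speed of sound of 343 m/s. Those numbers and the function name are my own assumptions for illustration, not measurements.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SPACING_M = 0.18        # assumed: approximate distance between the ears


def interaural_time_difference(azimuth_deg: float) -> float:
    """Approximate arrival-time difference (seconds) between the two ears.

    azimuth_deg is the source angle from straight ahead (0 = front,
    90 = directly to one side). Simplified model: spacing * sin(angle) / c.
    """
    extra_path_m = EAR_SPACING_M * math.sin(math.radians(azimuth_deg))
    return extra_path_m / SPEED_OF_SOUND_M_S


for angle in (0, 30, 90):
    itd_us = interaural_time_difference(angle) * 1_000_000
    print(f"{angle:>2} degrees: about {itd_us:.0f} microseconds")
```

Even with the source directly to one side, this simple model gives a difference of well under a millisecond.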

Additionally, the shape of our ears has evolved to help with localization, and our central nervous system, ears and experience work together to tell us where sounds are coming from.

This typically holds true down to the lower frequencies, where the wavelengths become longer than the distance between our ears. At that point, it becomes much more difficult for our brains to determine the phase difference between the ears and, therefore, where the sound is coming from. This is why low-end information is often heard as being more omnidirectional.
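Here's a small Python sketch comparing wavelengths to the distance between our ears. The 0.18 m ear spacing and 343 m/s speed of sound are assumed round figures, and the function name is hypothetical. It's only meant to show the relationship (wavelength = speed of sound / frequency), not to pin down an exact frequency where localization stops working.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SPACING_M = 0.18        # assumed: approximate distance between the ears


def wavelength_m(freq_hz: float) -> float:
    """Wavelength in metres for a given frequency in air."""
    return SPEED_OF_SOUND_M_S / freq_hz


for freq in (100, 1_000, 5_000):
    length = wavelength_m(freq)
    relation = "longer" if length > EAR_SPACING_M else "shorter"
    print(f"{freq} Hz: {length:.3f} m ({relation} than the ear spacing)")
```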

So the differences in sound between our left and right ears tell us a lot about our environment. Add in the natural interaction of direct sound sources with their acoustic environments (the reflections, absorption, resonances and reverb of an acoustic space), and we can process sounds in 3-dimensional space.

Stereo audio, in the simplest way possible, mimics how our two ears hear the world around us by having a left channel (to stimulate the left ear primarily) and a right channel (to stimulate the right ear primarily).


What Is Stereo Audio?

Stereo audio, as I've mentioned, is a 2-channel audio format with a left channel and a right channel. It's really as simple as that. We'll get into the finer details in the upcoming sections, but let's touch on a few key points here (at the risk of repeating myself later in this article).

Stereo is often preferred due to its dimensional characteristics. In ideal setups, playing back stereo audio (reproducing it as sound) gives us a sense of width and depth that isn't possible with a single channel of audio (as is the case with mono audio).

There are effectively two methods for creating stereo audio.

The first is considered “true stereo,” where an array of microphones is carefully positioned and panned in the mixer to achieve a natural stereo recording complete with the direct sounds and all the natural reflections of the acoustic environment.

The second is “artificial stereo,” where individual tracks, including directly injected signals and overdubs, can be panned around the stereo image within the mixer and effects and processes (notably reverb and delay) can be incorporated to create a sense of space.

The left and right channels must have differences in their audio in order for stereo audio to be heard as stereo. If we had perfectly identical waves in the left and right channels, we might as well be working in mono.

However, it's important that some information is the same between the left and right channels (equal but opposite-polarity signals will effectively cancel each other out). The information that's the same between the two channels (the “sum” or “mid” information) makes up the “centre” of the stereo image, while the information that's different between the two channels (the “difference” or “side” information) makes up the stereo width.
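The sum and difference idea can be sketched in a few lines of Python. This is a minimal example, assuming each channel is a plain list of sample values and using the common convention of halving the sum and difference (other scalings exist). The function names are my own.

```python
def encode_mid_side(left, right):
    """Split left/right samples into mid (sum) and side (difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def decode_mid_side(mid, side):
    """Rebuild left/right samples from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Illustrative samples: the channels only differ on the second sample.
left = [0.5, 0.25, -0.5]
right = [0.5, -0.25, -0.5]
mid, side = encode_mid_side(left, right)
print("mid: ", mid)
print("side:", side)
```

When the two channels are identical, the side signal is all zeros, which is another way of saying we're effectively working in mono.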

Now, when it comes to playback, the listener is responsible for two things: first, setting up the playback system (notably the loudspeakers) in a way that allows the stereo image to be heard, and second, sitting in an appropriate position between the “left” and “right” speakers of the system. Improper speaker placement or listening position will create a scenario where the stereo image cannot be experienced as intended.

Alright, that's a good start. Let's now move on to some specifics of stereo audio and how we can use it, along with how it relates to other common formats.


Stereo Vs. Mono Audio

The main difference between stereo and mono audio is pretty simple—it's in the channel count.

  • Mono audio has a single channel.
  • Stereo audio has two channels.

Mono audio doesn't take advantage of our auditory system's two ears and cannot create a sense of width. Stereo audio, on the other hand, is the most basic way to create a sense of width and overall dimensionality, as it exploits our two-ear hearing systems with a channel for the left and a channel for the right.

But that doesn't mean that mono audio should never be used. In fact, there are plenty of use cases for mono audio, particularly where width isn't a concern for the audio—applications such as audiobooks, telecommunication (phone calls, video chat, etc.), intercoms, and single-driver playback systems.

Another benefit of mono audio is that it takes the responsibility for the stereo image away from the end listener (since there's no stereo image to begin with). This was one of the concerns as stereo recordings started to become popular, though stereo soon replaced mono as the standard for music production.

Additionally, since mono only has a single channel, there's no division of bitrate between channels in digital audio. This means that, at the same total bitrate and all else being equal, mono digital audio has more data available per channel than stereo digital audio, though this really isn't a concern with modern computers and digital storage.
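To put some numbers on that, here's a rough Python sketch. It assumes uncompressed PCM audio (where the data rate is simply sample rate × bit depth × channel count) and a naive even split of a fixed bitrate between channels. Real lossy codecs are smarter than an even split, so treat the second function as a simplification. The function names and example values are my own.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Data rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def per_channel_budget_bps(total_bitrate_bps: float, channels: int) -> float:
    """Naive even split of a fixed total bitrate across channels."""
    return total_bitrate_bps / channels


# CD-style settings, used here only as a familiar example.
print("mono PCM:  ", pcm_bitrate_bps(44_100, 16, 1), "bits per second")
print("stereo PCM:", pcm_bitrate_bps(44_100, 16, 2), "bits per second")

# A fixed 128 kbps budget, split evenly (a simplifying assumption).
print("mono budget:  ", per_channel_budget_bps(128_000, 1), "bits per second per channel")
print("stereo budget:", per_channel_budget_bps(128_000, 2), "bits per second per channel")
```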

Stereo audio, of course, has the major advantage of enhanced directionality, particularly in width. It's a go-to standard for music and still sees use in broadcasting and television, though surround sound formats are more popular for movies. That brings us to our next section.

Related Articles On Mono & Stereo Audio

To learn more about mono and stereo audio, check out the following articles:
Is Stereo Or Mono Audio Better? (Applications For Both)
How To Tell If An Audio Signal/File Is Mono Or Stereo


Stereo Vs. Surround Sound Audio

The term stereophonic, as defined by dictionary.com, pertains to “a system of sound recording or reproduction using two or more separate channels to produce a more realistic effect by capturing the spatial dimensions of a performance (the location of performers as well as their acoustic surroundings), used especially with high-fidelity recordings and reproduction systems.”

By that definition, both stereo and surround sound formats are stereophonic as they both yield dimensional sound upon playback.

The big difference? Stereo audio strictly has two channels, while surround sound has more than two channels. Once again, it's really that simple.

This “more than two channels” definition is fairly broad. At the smallest, we have the 2.1 format, which effectively adds a low-frequency channel intended for a subwoofer (the “.1”) to the basic stereo format (the “2”). On the larger side, we have the popular Dolby Atmos format, which allows up to 128 audio tracks, each with its own associated spatial audio description metadata (location or pan automation data, along with data about the sound's movement, type, intensity, speed and volume).

As you may infer from the importance of speaker placement in stereo, surround sound formats not only require more speakers (for each dedicated channel) to be experienced fully, but also require those speakers to be positioned correctly and the listener to be positioned properly within the surround sound playback system.

Stereo is still popular thanks to its relative simplicity—it offers great dimensional characteristics while being easily reproduced with a pair of speakers or stereo headphones.

Surround sound formats are more involved, requiring more effort to mix and more money for sufficient playback.


Recording In Stereo

Recording in stereo can be done in a few ways. The first is with stereo miking techniques, of which there are many.

Stereo miking techniques offer us “true stereo”, where two or more mic capsules can capture an actual acoustic environment, emulating the way we hear the world around us.

There are three main categories for two-mic systems and an additional “surround sound” category worth addressing. I'll do so in bullet point form:

  • Coincident pair: a pair of microphones with capsules positioned as closely as possible, pointed in different directions to capture sounds from different directions.
  • Near-coincident pair: a pair of microphones with capsules positioned closely together, but not immediately together, sometimes with a physical boundary between them. The differences in location and direction will yield a stereo image, often similar to our auditory system (our two ears are effectively a “near-coincident pair” with a boundary in between).
  • Spaced pair: a pair of microphones a good distance from one another, aimed at capturing a wider stereo image than we naturally can with our ears.
  • Surround sound arrays: arrays with multiple microphones aimed at capturing a “true surround sound” recording of the environment.

I should also mention here that these techniques don't necessarily require two microphones but rather two (or more) microphone capsules. There are stereo microphones on the market that combine two separate capsules (typically in a coincident pair setup) inside a single housing with a stereo output.

Beyond miking techniques, we can also directly record electric instruments that offer stereo outputs. Common examples are keyboards and synthesizers. In these cases, we can connect the stereo output of the instrument to our mixer or interface and record in stereo that way.

Additionally, many virtual instruments offer stereo outputs, so we can record their audio in stereo within our digital audio workstations.

But even if we have mono sources, we can effectively mix them in stereo, bringing us to our next section.


Mixing In Stereo

Yes, Dolby Atmos is seemingly here to stay (as of my writing this article). However, mixing in stereo is still important and will continue to be (I hope), so let's talk about it.

Mixing in stereo means we have to account for the left and the right channels. I'll reiterate here that while stereo width is achieved by having differences between these channels (the “difference” or “side” information), we actually want a good amount of equal information between the channels (the “sum” or “mid” information).

This is important to maintaining a strong centre image, which will help the mix translate to other systems. Some systems even play back audio in mono, and it's important that our stereo mix can be summed/collapsed to mono without losing too much of the information that makes it a great mix to begin with. In other words, we need mono compatibility from our stereo mixes.

It also helps with the overall phase cohesion of the mix, which is especially important for a solid low-end. Remember that, even in ideal listening environments, we'll need phase differences (cancellation) between the left and right channels in order to achieve width.

To put things more objectively, we want a phase correlation somewhere between 0 and 1. Phase correlation meters span continuously from -1 to +1, or from 180° to 0°. They can be put on stereo tracks or the stereo mix bus to meter the phase relationship between the left and right stereo waveforms.

At +1, we have a 100% correlation between the channels (they are exactly the same).

At 0, we have the “widest permissible left/right divergence” or the widest permissible stereo image.

Having the mix bus correlation meter moving between 0 and 1 is ideal. Smaller variations mean smaller differences in width.

At -1, our left and right channels are completely out of phase (equal but opposite in polarity) and will completely cancel each other out when summed to mono.
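For the curious, here's a minimal Python sketch of how a correlation value like this could be computed. It uses a normalized correlation over a whole block of samples. Actual meters typically work over short, moving time windows and may differ in detail, so this is an assumption-laden illustration and not how any particular meter is implemented. The function names are my own.

```python
import math


def phase_correlation(left, right):
    """Normalized correlation between two channels, from -1.0 to +1.0."""
    numerator = sum(l * r for l, r in zip(left, right))
    denominator = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    if denominator == 0:
        return 0.0  # at least one channel is silent, so no relationship to report
    return numerator / denominator


def sum_to_mono(left, right):
    """Collapse a stereo pair to mono by averaging the channels."""
    return [(l + r) / 2 for l, r in zip(left, right)]


# One full cycle of a sine wave (64 samples) as an illustrative test signal.
sine = [math.sin(2 * math.pi * n / 64) for n in range(64)]
inverted = [-s for s in sine]

print("identical channels:", round(phase_correlation(sine, sine), 3))
print("inverted channels: ", round(phase_correlation(sine, inverted), 3))
print("peak of mono sum (inverted pair):", max(abs(s) for s in sum_to_mono(sine, inverted)))
```

The inverted pair reads as -1 and sums to silence in mono, which is exactly the scenario we want to avoid on the mix bus.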

So that's the first thing to be mindful of when mixing in stereo.

The next point to make is that mixing allows us to craft “artificial stereo” results, where mono signals can be panned across the stereo spectrum and stereo effects (or panned mono effects) can be used to evoke a sense of dimensionality. In fact, we can craft nice, wide stereo mixes without any stereo recordings or effects whatsoever just by using our pan pots.

That's an important aspect of mixing. Balancing the stereo image is key to a great stereo mix, and panning is a primary tool for achieving this.
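To show what a pan pot is doing under the hood, here's a short Python sketch. It assumes a constant-power (sine/cosine) pan law, which is one common choice. Mixers and DAWs use a variety of pan laws, so don't take this as the behaviour of any specific product. The function names and the -1 to +1 pan range are my own conventions.

```python
import math


def constant_power_pan(pan: float):
    """Return (left_gain, right_gain) for pan in [-1.0, +1.0].

    -1.0 is hard left, 0.0 is centre, +1.0 is hard right.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and +1.0")
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0..pi/2
    return math.cos(angle), math.sin(angle)


def pan_mono_to_stereo(samples, pan: float):
    """Place a mono signal in the stereo image as a (left, right) pair."""
    left_gain, right_gain = constant_power_pan(pan)
    left = [s * left_gain for s in samples]
    right = [s * right_gain for s in samples]
    return left, right


for position in (-1.0, -0.5, 0.0, 0.5, 1.0):
    lg, rg = constant_power_pan(position)
    print(f"pan {position:+.1f}: left gain {lg:.3f}, right gain {rg:.3f}")
```

With this pan law, the squared gains always add up to 1, so the overall power stays constant as the signal moves across the stereo image.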

With that stated, stereo effects and “true stereo” recordings help with the realism of a stereo mix, so they should be utilized when possible, as well.


Stereo Playback

As we discussed earlier, stereo audio, by itself, cannot be experienced. It must be reproduced as sound in order for us to hear it, and since we're dealing with two channels, it's critical that we set up our stereo playback systems appropriately in order to experience stereo mixes as they were intended to be experienced.

First things first, we need (at least) two separate transducers to convert each of the stereo audio channels into sound waves. That typically means a matched pair of speakers or studio monitors or a pair of stereo headphones.

Next, we need to connect the audio outputs to the speakers or headphones. The left channel output must go to the left driver(s), and the right channel output must go to the right driver(s). This is generally made simple with the designs of our audio equipment, but it should still be stated.

After that, we need to ensure the speakers and listening positions are appropriate. Headphones make this easy, as they fit on, around or in the ears with the drivers positioned equally. With speakers, however, we must consider the acoustic environment (a topic for another article) and the relative positioning between us and the speakers.

Ideally, we want to form an equilateral triangle with the stereo pair of speakers and the listening position. The speakers should be spaced apart and we should be facing them, with the distance between our left ear and left speaker equalling that between our right ear and right speaker. Furthermore, speakers should generally be positioned, height-wise, so that the main driver is roughly at ear level at the listening position.
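The geometry of that equilateral triangle is easy to work out. Here's a minimal Python sketch that assumes the listener sits at the origin facing the speakers and that we're only looking at the floor plan (height is ignored). The function name and the 1.5 m spacing are my own choices for illustration.

```python
import math


def equilateral_layout(speaker_spacing_m: float):
    """Floor-plan (x, y) positions for an equilateral stereo setup.

    The listener is at (0, 0) facing the +y direction. Each speaker ends up
    the same distance from the listener as the speakers are from each other.
    """
    half_spacing = speaker_spacing_m / 2
    depth = speaker_spacing_m * math.sqrt(3) / 2  # listener to the line between speakers
    return {
        "listener": (0.0, 0.0),
        "left": (-half_spacing, depth),
        "right": (half_spacing, depth),
    }


layout = equilateral_layout(1.5)  # assumed spacing in metres
for name, (x, y) in layout.items():
    print(f"{name}: x = {x:.2f} m, y = {y:.2f} m")

# Angle of each speaker away from straight ahead, as seen by the listener.
left_x, left_y = layout["left"]
print("speaker angle:", round(math.degrees(math.atan2(abs(left_x), left_y)), 1), "degrees")
```

In this layout, each speaker sits 30 degrees off-centre from the listening position.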

The speakers should also be positioned appropriately within the room to minimize acoustic resonances for better sound reproduction.


Storing Stereo Audio

Finally, let's touch on the storage of stereo audio.

Nowadays, digital storage is most common, whether it's on a hard drive, the cloud (for streaming and otherwise), or a physical-digital medium like a compact disc (CD).
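As one example of digital storage, here's a minimal Python sketch that writes two channels into a 16-bit WAV file using the standard library's `wave` module. WAV is just one format among many, and I'm assuming the usual interleaved layout (left sample, right sample, left sample, and so on). The function name and test values are my own, and the sketch writes to an in-memory buffer instead of a real file to keep things tidy.

```python
import io
import struct
import wave


def write_stereo_wav(target, left, right, sample_rate=44_100):
    """Write two equal-length float channels (-1.0 to 1.0) as 16-bit PCM.

    target can be a file path or a seekable file-like object.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of samples")
    frames = bytearray()
    for l, r in zip(left, right):
        for sample in (l, r):  # interleave: left first, then right
            clipped = max(-1.0, min(1.0, sample))
            frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(target, "wb") as wav:
        wav.setnchannels(2)   # stereo
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Write a tiny three-frame example and read the header back.
buffer = io.BytesIO()
write_stereo_wav(buffer, [1.0, 0.0, -1.0], [0.0, 0.5, 0.0])
buffer.seek(0)
with wave.open(buffer, "rb") as wav:
    print("channels:", wav.getnchannels())
    print("frames:  ", wav.getnframes())
```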

Analog storage includes tape (so long as there are two channels) and vinyl (the inner wall of the groove carries the left channel, and the outer wall carries the right channel).

Related Article On Storing Audio

To learn more about storage, check out my article SSD Or HDD For Audio Engineering & Music Production?


Do loudspeakers need amplifiers? Nearly all professional and consumer-grade audio equipment will output line level signals, which are not strong enough to sufficiently drive speakers. Therefore, loudspeakers require power amplifiers to bring their audio up to speaker level. These amps can be separated or designed into the speakers.

Are headphone jacks stereo or mono? The headphone jacks in consumer and professional audio devices are nearly always unbalanced stereo, having a left channel, right channel and common ground/return to connect to TRS plugs. Less commonly, headphone jacks may be designed for unbalanced or balanced mono or even balanced stereo.

Related Articles On Loudspeakers, Amplifiers and Headphone Jacks

To learn more about loudspeakers, amplifiers and headphone jacks, check out the following articles:
Why Do Speakers Need Amplifiers? (And How To Match Them)
How Do Headphone Jacks And Plugs Work? (+ Wiring Diagrams)
