How Audio Technology Created Popular Music (and Changed the Way We Listen)

By Christopher Kent

It’s easy to take something extraordinary for granted when it’s everywhere around us. That applies to a lot of the things we accept as day-to-day parts of our modern lives, whether it’s having good food to eat, having comfortable homes to live in, or being able to talk to anyone anywhere just by picking up a phone and calling, texting or making a video call. In that spirit, I would argue that one of the most extraordinary things we take for granted is today’s audio technology. The kind of experience someone listening to music can have today is astonishing compared to the experience of listening to live music hundreds of years ago, or even compared to the recorded music of several decades ago.

For millennia, listening to music was a direct process; a live musician created a coherent sequence of sounds, creating an experience in the minds of the listeners — even if the only listener was the person creating the sounds. But as science and technology developed more and more rapidly in recent centuries, it was inevitable that someone would find a way to capture that live sound and make it possible to listen to it later.

Once that threshold was crossed, a long process began to unfold. Audio technology slowly became more and more sophisticated, beginning with non-electric recording devices that were created in the 1800s and eventually leading to the digital devices of today. This evolution made it possible to record and reproduce the sounds of traditional musical instruments and voices with astonishing accuracy, and it was accelerated by technological advances that allowed the creation of never-before-heard musical sounds.

Those of us accustomed to today’s spectacularly clear and creatively unlimited recordings don’t often think about how this happened. But looking back at the evolution of audio technology and the listening experience over the past 140 years is eye-opening. (Ear-opening, if you prefer.) Every new development in technology has taken listeners to new types of audio-generated internal experience. At first, being able to record and reproduce a traditional musical performance allowed listeners to enjoy and visualize the performance in their minds. Today, a recording may not have much to do with a live performance at all, and it can trigger an internal “virtual reality” experience that listeners 140 years ago could never have imagined.

As a recording artist myself, I’ve experienced many of these changes in a very personal way, because the resources available for recording my own songs gradually expanded. When I first began writing and recording songs, I had access to a reel-to-reel tape recorder from the 1950s that recorded on a single track. I had to use whatever instruments and voices were available, performing together live to create a monaural recording. Today, I record on a computer that allows me to add as many tracks of background parts and special effects as I want, one at a time, while using a digital sound-generating device — a synthesizer — that can create extraordinary familiar or unfamiliar sounds to enhance the resulting multi-track audio experience.

The evolution of audio recording technology has caused a revolutionary change in the way we listen to music. In addition, it’s made it possible for a listening experience to be shared by millions, resulting in what we now call “popular music.” That, in turn, has had a profound impact on our evolution as a society.

Here, I’d like to share some perspective on how these changes in technology occurred over the past 140 years, and how they’ve impacted the way we all listen to, and experience, music today.

To provide a framework for this discussion, let’s think about this in terms of seven levels of listening experience:

The Evolution of Listening to Music: An Overview

Level one (pre-1880): Live performance only
Level two (1880s): Capturing and reproducing sound — the birth of popular music
Level three (1941): Fidelity
Level four (mid-1950s): Stereo
Level five (early 1960s): Multitrack recording — creativity unleashed
Level six (1970s): Freedom from familiar sounds
Level seven (early 1980s): Computerized digital data manipulation

Level one: Live performance only (pre-1880)

Through most of human history, any music the average person heard was likely to be one person, or a group of people, singing or playing rhythm instruments. In recent centuries, some people heard singing in church, but the percentage of people able or willing to attend church was limited. Meanwhile, because most people lived in poverty, instruments were homemade or nonexistent. When sophisticated music such as complex piano pieces and symphonies began to be developed for the church or for the enjoyment of wealthy patrons, the number of people who actually got to hear it was vanishingly small. If you couldn’t be present for a musical performance, you were out of luck.

As a result, there was no such thing as “popular music.” People may have passed a few songs that meant something to them down to future generations, but that was limited to a family or tribe, if it happened at all.

Level two: Capturing and reproducing sound — the birth of popular music (1880s)

When technology finally advanced enough to make sound recording and playback possible, everything changed. Suddenly, a performance wasn’t isolated to one time and place; it could be recreated endlessly for others to enjoy.

The earliest sound-recording devices captured sound that was sung or played into a cone, which etched a line into a medium such as a rotating wax cylinder. The first example of sound being “recorded” — i.e., captured in a different medium — appears to have been done by Edouard-Léon Scott de Martinville, a French inventor, in 1857. He found a way to create a visual representation of sound on a hand-cranked cylinder. His device, which he called a phonautograph, successfully captured sound in another medium. But there was a huge drawback: The device couldn’t recreate the sound. The first device using this approach that could both record the sound waves and play them back was created by Thomas Edison. (If you’re curious, you can hear one of Edison’s first recordings online at this URL: www.facebook.com/watch/?v=337296323719780)

Edison’s invention was quickly turned into a commercial device that others could buy. The earliest type of phonograph that was commercially sold played sounds that had been recorded on a thin sheet of tinfoil wrapped around a grooved metal cylinder. During recording, a stylus connected to a diaphragm vibrated and made a groove in the foil as the cylinder rotated. Letting the stylus run through the groove again recreated the sound. Despite the mediocre sound quality, cylinders that could reproduce a captured performance became a popular commercial item for people who could afford them. (The idea of being able to reproduce a musical performance at all was unprecedented.)

The technique of recording onto a flat disc made of vulcanized rubber (instead of a cylinder) was patented in 1888 by Emile Berliner, improving the sound quality and volume — not to mention making the resulting recordings easier to mass produce and store. However, recordings released on discs didn’t outsell cylinders until about 1910. (It wasn’t until 1948 that Peter Goldmark invented long-playing vinyl records.)

The most noteworthy thing about these recordings, from the listener’s perspective, is that they didn’t sound very good. They captured the basic data, so that the listener could hear a speech, or tell that a band was playing or someone was singing, but it bore little resemblance to the sound quality you’d hear when attending a live performance. Part of the reason for this was the difficulty of creating a noise-free reproduction — any flaws in the physical cylinder or disc added distortion (or clicks or pops) to the playback. However, another problem was that this early technology was only able to capture mid-range frequency sound, not low or high frequencies. The result was that a listener could recognize what had been recorded, and could imagine hearing the person or instruments live, but the experience was almost an intellectual exercise.

Within a few decades, electric microphones had been developed and people began using them to create better input that captured a wider range of frequencies. At the same time, the materials used to capture and play back the sounds steadily improved. But even after the development of electric microphones and better media for recording, the fidelity of the reproduced sound was still nothing like a live performance. And, needless to say, all of these recordings were monaural; there was no stereo, multi-dimensional experience for the listener at all.

The lack of quality compared to a live performance was demonstrated in the 1940s when popular singer Bing Crosby tried recording his weekly radio program ahead of time, because doing it live every week was very stressful. The audience immediately objected, because they could tell that what they were hearing wasn’t live!

Level Three: Fidelity (1941)

It’s often been noted that war, ironically, can trigger advances in technology (for good or bad). That turned out to be the case when it came to sound reproduction.

The problem of making realistic-sounding recordings was solved by the Germans during World War II, when they perfected recording onto reel-to-reel machines that used magnetic tape. This approach created recordings of sufficient quality that a listener couldn’t tell that what they were hearing was prerecorded. That enabled Hitler to be somewhere else when he gave a speech over the radio — an important consideration when millions of people did not wish him well.

After the war, this new type of recording technology steadily improved, as did microphone technology. By the 1950s, consumer tape recorders began appearing on the market. (My father purchased one.) These machines were still monaural, meaning they could only record a single “track” of sound at a time, using a single microphone. But now recordings sounded much more like the live performance. The high- and low-frequency sounds were all there, and the distortions and clicks and pops associated with earlier recording technologies were gone.

When I began writing songs as a teenager, my dad’s tape recorder was the only equipment I had, so my earliest recordings were monaural; everyone who participated in the recording gathered around the microphone and performed together live while the tape captured the performance.

Level Four: Stereo (mid-1950s)

By the time my dad bought our reel-to-reel tape recorder in the 1950s, technology had taken another big leap: Stereo recording had appeared. This was made possible by placing two parallel tracks on existing reel-to-reel tapes instead of one. Recordings were now being made using two microphones pointed in different directions instead of one.

At first, it wasn’t clear that this idea would capture the imagination of the public. For decades, people had been accustomed to listening to sound coming from a single speaker, such as the one in a radio. This didn’t usually seem like a disadvantage, because most of the live music people heard came from a single location in space (for example, someone sitting at a piano across the room). The advantages of hearing in stereo weren’t obvious to many people, most of whom had never sat right in front of an orchestra with horns on one side and strings on the other. The potential benefits of stereo when listening to a single performer up close were not obvious, and even a full band would usually be seated far enough away that hearing it in stereo might seem unnecessary.

However, the theoretical advantages of stereo recording were clear — because people have two ears. We’re used to hearing in stereo in day-to-day life; it just wasn’t a notable part of listening to a speech or the live music of the time for the average person. Besides, up until then, just being able to experience a performance recorded by someone you’d never met, made at a different time and place, was magical enough.

Because stereo recording seemed like a novelty, the recording industry thought the idea needed to be promoted. In fact, my father purchased a vinyl record released in the 1950s that was designed to impress listeners with the benefits of stereo sound. It featured things like a train coming in from one side and then disappearing into the distance on the other side.

Some people loved the new listening experience. Nevertheless, I still recall that one of my relatives — a brilliant man — steadfastly refused to switch to stereo until he could no longer buy recordings in mono. (Mind you, he bought the highest-quality amplifier and speaker he could afford; he just thought stereo was a waste of money.)

To me, one of the most interesting aspects of this change was the way it altered the experience that listeners had. Yes, this was partly because of better recording technology that could reproduce quieter instruments and a broader spectrum of sound. But equally important was that stereo microphones essentially duplicated the effect of sitting right in front of the artist or band or choir or orchestra, where different parts of the performance were coming from different directions. The intimacy, clarity and three-dimensional experience provided by stereo recording made it very different from listening to earlier monaural recordings. Now you could not only imagine yourself at a performance; you were in the best seats in the house, directly in front of the musicians!

One of the interesting facts here is what stereo recording didn’t change: Recordings were still made live. Now, they were just recorded using two microphones instead of one. In other words, having stereo didn’t open up new ways of creating a musical product; it just made the listening experience more intimate and lifelike.

Level Five: Multitrack recording: Creativity Unleashed (early 1960s)

In the 1960s, the way recordings were made began to change. As the technology continued to improve, artists began recording on devices that created four parallel tracks on the tape, making two new things possible: First, parts could be added to a stereo recording at a separate time from the original stereo recording (even recorded at a different location); and second, recorded parts could be modified individually, changing the tone or volume of one track without changing the sound of the entire recording.

Once it became clear that this could impact creative freedom, artists began demanding more and more tracks to work with. By the end of the 1960s, artists like the Beatles and Simon and Garfunkel had pushed their studios to sync multiple machines so they could make 16, 24 and 36-track recordings. Having the ability to add and manipulate individual parts and sounds to create a finished recording turned the process into something vastly different from a basic live recording. It was a bit like moving from taking a picture of the night sky to painting Van Gogh’s Starry Night — in 3-D! Not surprisingly, the experience had by the listener was vastly different as well.

Meanwhile, another important change occurred. Until the 1970s, producing a high-quality recording outside of a professional studio was almost unheard of. However, the newer, more advanced recording technology gradually began to drop in price, allowing artists to create multitrack albums on their own. Singer-songwriter Todd Rundgren’s Something/Anything? album was one of the first to be created in a home studio. It produced hit singles, including a Top Ten hit, thus proving that recording outside a traditional studio was something that could lead to thoroughly professional results.

As an interesting side note, the advent of multiple-track recording also led to the brief popularity of “quad” recordings, meant to be played back on four speakers — two in front of the listener and two behind. At the time, this seemed like a logical development. After all, if you have more than two tracks in the recording, why not listen to the recording on more than two speakers? Why not have sounds coming from behind you, or moving around the room?

The reason this never really caught on isn’t hard to grasp: People only have two ears. Yes, our ears are capable of identifying the location from which a sound is coming (such as in front of us or behind us), so having four speakers instead of two wasn’t completely pointless; but most of the real-world things we listen to happen in front of us. Add to this the cost of purchasing new playback equipment and extra speakers, and it was an uphill struggle to get people to accept the new format.

Furthermore, our brains are very good at interpreting two-dimensional sounds as three-dimensional, even if they’re actually only coming from two speakers. For example, Isao Tomita’s synthesizer arrangement of Mercury, the Winged Messenger, from Gustav Holst’s orchestral suite The Planets features sounds whirling around the listener. You don’t need more than two speakers to believe that the sounds are whirling around you, even though technically, the sounds never actually go behind you. Your brain completes the experience without any additional technological help!

In short, all the extra expense of quadraphonic playback didn’t add much to the listening experience. Nevertheless, this idea is still popular when associated with movies. Today, theaters routinely place multiple speakers around the audience. This can be duplicated with an at-home theatrical set-up as well, if you’re willing to spend the money on the technology and speakers. But does it change the listening experience in a significant way? Not really.

Level Six: Freedom from familiar sounds (1970s)

For human beings, one of the hallmarks of listening, for millennia, was that any sound you heard came from a familiar source. (If the sound was unfamiliar, people knew that it could eventually be identified as something made by a natural source.) The advent of advanced technology finally changed that, making it possible to create sounds that weren’t familiar at all. This had a profound impact on recordings and the listening experience.

One of the earliest appearances of this idea was the invention of the Theremin (named after its inventor, Leon Theremin) in 1919. This device produces eerie, unearthly sounds that vary in pitch and intensity as your hands move around its vertical antenna, without actually touching it. Unfortunately, the earliest Theremins were extremely difficult to play, but they were used in several horror movie soundtracks, to good effect. A revised model that was easier to play was developed later. (The latter was used on the Beach Boys’ hit Good Vibrations.)

The eventual acceptance of “unnatural sounds” as a legitimate part of music was probably hastened by a development that happened in the late 1940s and went mainstream with the advent of rock and roll in the 1950s: distorted guitar sound. Rebellious teenagers in the 1950s and 1960s reveled in this new sound, no doubt partly because their parents hated it! It may not have been obvious at the time, but a threshold had been crossed: It was now possible — and acceptable — to create and employ sounds that were not “natural” in a setting outside of a horror movie. Nevertheless, taking this to the next musical level required a much greater technological leap than simple distortion.

It wasn’t until the mid-1960s that an inventor named Robert Moog made the creation of non-standard sounds something that musicians and record producers would consider using routinely, when he created a voltage-controlled oscillator that could be played using a keyboard. (Some of the Beatles’ later recordings incorporated Moog synthesizer sounds.) Unfortunately, the early Moog synthesizers were very large and not easy to operate. For that reason, the technology didn’t really take off until the summer of 1970, when Moog released the Minimoog Model D, a version of the system that was compact, portable and easier to play.

Notably, these early synthesizers were not digital; they created sound using more “old-fashioned” analog electronic circuitry. In the 1980s, synthesizers finally became digital, with computer technology taking over. This lowered the cost dramatically and expanded their use even further.

Many listeners initially found synthesized music distasteful, simply because it wasn’t familiar. The real issue was that traditional sounds referenced familiar sound sources, such as a human voice or an instrument. This allowed the listener to imagine the source while listening to the recording. But synthesized sounds didn’t come from a familiar source, making it impossible to imagine the source of the music in the traditional way. In other words, abstract sounds forced many people to digest what they were hearing in a new, more abstract way; they could no longer reference that familiar mental image of the performer.

As a result, those who accepted this new music began to change the way they listened. They began to let the music “plug into” their feelings and imagination directly, skipping the mental recreation of a performance. Of course, not every listener chose to go down this road, but those who did learned to appreciate music in ways not possible when the source of the music was more traditional.

One other aspect of synthesized music is important to note. Synthesizers, especially digital synthesizers, are capable of recreating familiar sounds as well, such as a trumpet or an orchestral string section. While this doesn’t challenge the listener in the same way more abstract sounds do, this capability caused a revolution at the creative end of the process, because it allowed musicians to “play” instruments that otherwise would require a professional expert (or groups of experts) to play. This has made it possible for independent artists who can’t afford to hire an orchestra to create fully produced recordings on a par with anything a major studio could create. That, in turn, has made nearly unlimited creative possibilities available to artists who aren’t affiliated with a major label. Professional-quality recordings with orchestral backgrounds can now be created by independent artists in their bedroom or garage.

Level Seven: Computerized digital data manipulation (early 1980s)

Computers began to be used as part of the audio recording process during the second half of the 20th century as they became ever more sophisticated. Although this change didn’t alter the resulting listening experience as blatantly as switching from mono to stereo recording, it took the quality of audio recordings to a greater level than was previously possible. This made the “virtual reality” part of the listening experience even more amazing by making the sounds clearer and more detailed, and sometimes even more unearthly.

Translating an audio signal into digital bits of information that a computer could manipulate wasn’t possible until computers became extremely fast and had a large enough memory to store huge quantities of digital data. The fast processing speed was crucial, because once you divide audio data into millions of tiny digital bits, you have to be able to put them out at an astonishing speed in order to get them to sound like music. But once that threshold was crossed, it became possible to record sound in a completely different way. This was a huge breakthrough for making recordings, for several reasons:

1. Representing sound in the form of millions of digital bits made it possible to preserve and recreate the original sound using entirely new types of media. For example, instead of creating a very long, spiral rut in a physical medium that reproduces sound when a needle is dragged through it (the way a vinyl record recreates sound), digital information could be stored on a surface and read by a laser, which is what happens when a CD is played.

This made a huge difference, because it solved several problems that were associated with vinyl records. For example, vinyl records can warp; dust and scratches can distort the sound; and — something that most people don’t realize — the sound of a recording had to be modified to record and reproduce clearly using a vinyl surface. For example, the relative speed of the needle in the groove gets slower as the needle approaches the center of a record. (It’s covering less distance each time the record spins.) As a result, low-frequency bass sounds don’t reproduce as well at that part of the record, which is why many vinyl records have the songs arranged with quieter songs close to the center of the record. (Check out the vinyl version of Bridge Over Troubled Water by Simon and Garfunkel, for example. Both sides of the record have quiet songs without much bass closest to the center of the record.)
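To put rough numbers on the slowing groove described above, here is a minimal Python sketch. It assumes a record turning at 33 1/3 rpm, and the two radii are assumed round figures chosen for illustration, not measurements from any particular record. It shows only how the groove speed under the needle shrinks toward the center; it does not model how the sound itself is affected.

```python
import math

RPM = 100 / 3  # 33 1/3 revolutions per minute (assumed LP speed)

def groove_speed_cm_per_s(radius_cm, rpm=RPM):
    """Linear speed of the groove passing under the needle at a given radius."""
    revolutions_per_second = rpm / 60
    circumference_cm = 2 * math.pi * radius_cm
    return circumference_cm * revolutions_per_second

# Assumed, illustrative radii: near the outer edge and near the label.
outer = groove_speed_cm_per_s(14.6)
inner = groove_speed_cm_per_s(6.0)

print(f"outer: {outer:.1f} cm/s, inner: {inner:.1f} cm/s, ratio: {outer / inner:.2f}")
```

With these assumed radii, the groove passes the needle at roughly 51 cm/s near the edge and roughly 21 cm/s near the label, so the same second of music is squeezed into well under half the length of groove.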

2. Switching to digital made it possible to solve problems created by having to combine 16, 24, 48 or more individual tracks into a single stereo recording. For example, as each track in a multi-track recording became sonically adjustable in more and more ways, the amount of information for the recording engineer or producer to keep track of, when mixing all of those tracks together to create the final stereo mix, increased dramatically. For instance, the vocal track in a recording could be made to have more or less reverb than the other sounds in the mix, or made to sound a little brighter so it stood out in the mix. This kind of adjustment to individual tracks could be done manually, up to a point, but as the number of tracks increased, doing so became more and more of a practical challenge. For a computer, of course, this isn’t a problem.

Furthermore, factors such as the volume of an individual track could now be adjusted during playback, adding to the complexity of making a real-time mix of multiple tracks into a stereo result. (This is often an issue, for example, with a vocal recording, where the singer’s live volume is rarely the same from start to finish.) When I first recorded in a multi-track studio, making a mix — combining all of those individual tracks into a single stereo track — was a challenge. Multiple people had to sit at a control board to make all of the changes to the individual tracks in real time as the recording was played back. Now, a computer can make the changes for you, and it makes those changes exactly the same every time, so if you don’t like the result you can go back and change one detail and run the whole process again.

Another issue that arose with multi-track recording was that combining many tracks into a final stereo mix often muddies the sound. But with digital technology, the producer can not only control the volume of each individual track, and where it’s placed in the “stereo spread” (for example, putting the tambourine off to one side and the background singers off to the other side), the producer can also alter each sound in very specific ways to avoid getting a muddy-sounding result when they’re combined. Today, most producers will boost different audio frequencies in the most important tracks so that they emphasize different audio ranges. Although few listeners could detect this, it results in our brains hearing those tracks distinctly. It’s a digital trick that would be almost impossible to do without a computer, but it makes the final mix sound clear, not muddy.
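For readers who like to see the idea in code, here is a small Python sketch of setting each track’s volume and its place in the “stereo spread,” then adding the tracks together. It is a simplified illustration, not how any particular recording program works. The function names, the sample values, and the “equal-power” panning formula are all assumptions chosen for the example, and the frequency adjustments mentioned above are left out.

```python
import math

def pan(sample, position):
    """Split one mono sample into (left, right).

    position runs from -1.0 (hard left) through 0.0 (center) to +1.0 (hard right).
    Uses an equal-power curve, one common choice among several.
    """
    angle = (position + 1) * math.pi / 4  # maps -1..+1 onto 0..pi/2
    return sample * math.cos(angle), sample * math.sin(angle)

def mix(tracks):
    """Combine mono tracks into a list of (left, right) stereo samples.

    tracks: list of (samples, volume, position) tuples, all the same length.
    """
    length = len(tracks[0][0])
    stereo = []
    for i in range(length):
        left = right = 0.0
        for samples, volume, position in tracks:
            left_part, right_part = pan(samples[i] * volume, position)
            left += left_part
            right += right_part
        stereo.append((left, right))
    return stereo

# Made-up sample values standing in for two short recorded tracks.
tambourine = [0.5, -0.5, 0.5, -0.5]
singers = [0.2, 0.4, 0.2, 0.0]

# Tambourine quieter and hard left; background singers hard right.
stereo_mix = mix([(tambourine, 0.8, -1.0), (singers, 1.0, 1.0)])
```

Because every setting is just a number, the computer can repeat exactly the same mix as many times as needed, and changing one detail means changing one value and running it again.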

3. Digital recording allows types of sound alteration that are limited — or impossible — in a non-digital format. For example, to add reverb to a recorded track, it was common in the 50s, 60s and 70s to play the recorded sound (such as the featured vocal) in a large underground chamber and re-record the result, making it sound as if the vocal had been recorded in a large hall or an “echoey” space to begin with. With digital manipulation, not only can the same effect be achieved inside the computer, but the exact nature of the “space” the person appears to be singing in can be adjusted in almost infinite ways.

Other digital effects include compression, which evens out the sound volume automatically, to whatever degree you want. Most pop recordings, for example, use compression not only to ensure that the lead vocal is a fairly consistent volume, but also to ensure that the overall recording stays at about the same volume level from start to finish. (This is a characteristic of pop music that sets it apart from, say, an orchestral recording of a classical piece, where the full range of volume from very quiet to very loud is usually left untouched.)
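The basic idea of compression can be sketched in a few lines of Python. This is a deliberately simplified illustration: the threshold and ratio values are assumptions chosen for the example, and it adjusts each sample on its own, whereas real compressors track the volume over time and apply their changes gradually.

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Turn down any sample that is louder than the threshold.

    Above the threshold, every `ratio` units of extra input
    produce only one unit of extra output.
    """
    result = []
    for sample in samples:
        magnitude = abs(sample)
        if magnitude > threshold:
            magnitude = threshold + (magnitude - threshold) / ratio
        result.append(magnitude if sample >= 0 else -magnitude)
    return result

# Made-up sample values: a vocal that swings between quiet and loud.
vocal = [0.1, 0.9, -0.3, -1.0, 0.5]
squeezed = compress(vocal)
```

In this sketch the quiet samples pass through unchanged, while the loudest ones are pulled back toward the threshold, which narrows the gap between the quietest and loudest moments of the performance.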

4. Computers not only create and alter sounds, they can also manipulate pre-recorded sequences of sound, such as drumming. Today, creating a drum part is vastly different from using a so-called “drum machine” from 30 years ago. The first drum machines could program individual drum sounds to play back in a certain order. In contrast, today, an artist can take drum sequences played by a top studio drummer and shape them into the part the artist wants to hear, including altering the tempo without changing the pitch of the sounds. The result is indistinguishable from a drummer actually playing the part. In essence, an artist can now create a professional drum part by simply knowing what he or she wants to hear, without spending 20 years learning to play the drums or hiring a professional drummer.

Digital recording has gradually lowered the cost of doing all of this. Not only did computers make it possible to manipulate the sound in remarkable ways, the steadily dropping cost of this technology has made it possible for ordinary people to make elaborate, professional-sounding recordings in their own homes. This has had a profound effect on the music industry, as artists not backed by giant industry corporations can record and release their own professional-quality recordings. Although some professional producers believe this has lowered the overall quality of released music, there’s no question that it has allowed artists of all kinds to participate in creating and publishing music — something that was unheard of prior to the 1980s.

It’s worth noting one downside of getting computers involved with recording: They’ve become so sophisticated today that they can now generate a recording by themselves, even writing the song, based on very simple information you provide. This makes it easier than ever for someone to create a song and a polished recording of it. Unfortunately, it also eliminates a lot of creativity that would previously have come from the artist. Whether this will turn out to be a good thing or a bad thing remains to be seen.

A Blessing Worth Appreciating

Listening to music today is an amazing experience, and it’s there for any of us to enjoy. I believe it’s one of the great blessings of being alive right now. Like most blessings, there’s no requirement that we take advantage of it. But what we listen to — and how we listen to it — changes our experience of the world around us, and that changes us. So, if it’s been a while since you listened to your favorite recordings, take an afternoon off and revisit the listening experience. It’s a blessing worth appreciating!