The Song of the Hydra: multiple lead vocals in modern pop music recordings

by Timothy Warner, Salford University


 1  Introduction

The human voice holds a special position in music: composers and commentators, for example, often make the simple distinction between vocal and instrumental forms. Unlike all other musical instruments, the human voice is able to combine speech – and hence language – and music in a single sonic gesture, two separate forms of expression encapsulated in one utterance. Furthermore, unlike the instrumentalist, every singer is able to bring the vast range of timbres and styles of delivery derived from the myriad subtleties of verbal communication to their art. The sheer scope of vocal utterance, coupled with its explicit linguistic message, inevitably gives singing a privileged position in music. Moreover, the voice, whether singing or speaking, is always with us: when it is not ringing in our ears, it is present in our thoughts. So when the autodidact Thomas Alva Edison (1847-1931) conducted his first experiments with the prototype phonograph – allegedly on August 12, 1877 [1] – it was perhaps hardly surprising that these initial trials consisted of him shouting into the machine the words of the nursery rhyme 'Mary Had a Little Lamb'. And from this moment, the human voice held an equally special and important position in the history of audio recording. In fact, virtually every phase of artistic and technological change in audio recording is punctuated and represented by recordings of the human voice. It is therefore ironic that John Philip Sousa (1854-1932), one of the first composers to benefit financially from the phonograph through the success of many recordings of his marches, should concern himself with the detrimental effects the phonograph may have on the human voice. Sousa writes: "Singing will no longer be a fine accomplishment; vocal exercises will be out of vogue! Then what of the national throat? Will it not weaken? What of the national chest? Will it not shrink?" [2].
This early emphasis on singing was consolidated when the first great record producer identified the first international recording star. When Fred Gaisberg (1873-1951) recorded the young Italian tenor Enrico Caruso (1873-1921) on 11 April 1902, he personally guaranteed the exorbitant £100 fee demanded by the singer, so convinced was he of the singer's potential with the new medium [3]. And as competition among singers grew during the first decade of the twentieth century, so ever greater care and attention to detail was devoted to the recording process itself: a "distance test" recording of Dame Nellie Melba (1861-1931) has survived [4], illustrating the lengths to which she went in order to ensure the ideal balance between voice and piano, and to achieve the highest recording level before distortion became unacceptable.
Example 1: Nellie Melba Distance Test

 2  A New Kind Of Singer

With the arrival of the electric recording process in 1925, singers and producers began to explore and develop new techniques that flew in the face of traditional musical performance practice. The crooner, a new kind of singer, emerged during this period; a singer who did not aim to project their unamplified voice into a vast auditorium, but instead relied on a closely placed microphone that could be subtly manipulated to support a whole range of new vocal performance gestures [5]. Singers realized that the new recording technology could be used to produce an unforced and intimate range of vocal sounds. Just as the use of the microphone changed the way singers expressed themselves musically, so the acoustic relationship between recording artiste and audience also began to change: the performer could now be "up close and personal", apparently speaking directly, and closely, to the listener. As Derek B. Scott relates, crooning "added to the intimacy of a musical commodity that could be consumed personally and privately in the home" [6].
Example 2: Bing Crosby's 'When the Blue of the Night'
From this moment the projected character of the singer becomes an increasingly important element of their performance: the sophisticated yet streetwise persona of a Sinatra or the reassuring homeliness of a Crosby became an intrinsic aspect of their art, and would have been less likely without the advantage of a closely placed microphone. So, when Sam Phillips discovered a singer who could convincingly combine the vocal styles of white country music and black rhythm and blues into an amalgam that seemed to speak directly to the increasingly economically empowered youth of the USA, the gyrating hips, sneering lips and gravity-defying quiff of the young rebel who was "all shook up" were part of that appeal. But Phillips also realized that the distinctive image and the sound of Elvis Presley's voice might not be quite enough to distinguish him from his competitors. Hence Phillips drew heavily on slapback echo derived from magnetic tape technology to provide an unnatural, almost spooky, quality to the lead vocal on the early Sun recordings [7]. And suddenly the voice in recordings of popular music did not just seem intimately close to the listener, it seemed internalized: like an imagined voice heard inside the listener's head. As Evan Eisenberg points out, effects like slapback echo "serve not to project musicians in exterior space, but to direct listeners' attention to different zones of interior space" [8]. It is significant that when Elvis went to RCA, "Steve Sholes [RCA producer] was adamant that Phillips's sonic treatments be adhered to as closely as other studios would allow" [9].
Example 3: Elvis at Sun - 'Tomorrow Night'
Clearly, when it comes to combining more than one voice on a recording, achieving the right blend and balance of timbre is crucial. In this regard it is interesting to note just how many popular music vocal ensembles are made up of people who are, or appear to be, related: the Mills Brothers, the Andrews Sisters, or the Everly Brothers, for example. As Ross Barbour, founding member of the Four Freshmen, relates: "There haven't been many great vocal groups that didn't have some family in them [...]. The vocal sound is similar by heredity and the pronunciation is the same, because the members grew up in the same environment" [10]. The Beach Boys took this a step further: for the vocal tracks on the song 'Surfin' USA', for example, Brian Wilson not only recorded the voices of his brothers and cousins, but also double tracked them, creating a vocal sound that was the result of both heredity and technology [11].
Example 4: The Beach Boys' 'Surfin' USA'
The closer the various voices that appear on a recording are to each other in terms of timbre, the more likely the listener is to believe the multiple vocal tracks are, quite literally, "singing with one voice". The process of multitrack overdubbing enables a singer to achieve the greatest level of timbral congruence: multiple vocal parts performed by the same voice. While this method offers enormous musical and sonic advantages, like the crooners' microphone technique and the slapback echo of rock and roll, it challenges traditional notions of song performance, privileges the artificial over the natural, and calls into question the social aspects of music-making and music reception.
Overdub recording has a long and rich history. Sidney Bechet's 1941 recording of 'The Sheik of Araby' [12], in which he played all the instruments, is an early example, and the pioneering work of Les Paul is well documented.
Example 5: Sidney Bechet's 'The Sheik of Araby'
Although the overdub process is technologically intensive, highly time-consuming and demands a radically new way of approaching musical creation, the success of recordings by The Beatles ensured that it increasingly became the standard way of making popular music recordings from the middle of the 1960s onwards [13]. Indeed, several different techniques associated with tape technology were explored by George Martin, Geoff Emerick, Ken Townsend and The Beatles during this period: vocals were routinely subjected to double-tracking, automatic double-tracking (A.D.T.), multiple recordings of the same voice and flanging, and were speeded up, slowed down or even played backwards. Many of these processes give the voice an unnatural, "out of this world" quality. It is noteworthy that George Martin writes of John Lennon wanting to "make real the voice he heard in his head" [14].
Example 6: The Beatles' 'Lucy in the Sky'
Since then, the use of multitrack overdubbing as a technique and the application of extensive signal processing, often for timbral modification or spatial positioning, have become established as a central part of the production of pop music recordings. As a consequence, listeners have grown increasingly accustomed to hearing a single, identifiable voice apparently performing contrapuntally with itself, often appearing in a series of sonic guises differentiated by timbral and spatial modification, within the same recording. This paper will go on to demonstrate how this increasingly pervasive practice is implemented through a brief analysis of the vocal tracks on a recent, successful chart pop recording.

 3  'Overprotected'

'Overprotected' [15] was the ninth UK top ten hit for Britney Spears. Entering the singles chart on 2 February 2002, its highest ranking was number four. As well as being released as a single, it appears on the album Britney [16] and the promotional video for the recording is included in the collection Britney: The Videos [17].
'Overprotected' was written and produced by Max Martin and Rami; it was recorded and mixed in Stockholm and New York, with the guitar tracks credited to Max Martin, the turntable work to Daniel Savio, and the "background vocals" to Britney Spears and BossLady. This recording highlights an important characteristic of much modern chart pop music: although one talks of "'Overprotected' by Britney Spears", Britney the person played a relatively minor creative role in the production. She did not write the music or lyrics, find, programme or sequence the many samples and synthesized parts, or even play any of the instruments heard on the recording. Similarly, Britney Spears the person played a relatively modest creative role in the promotional video: direction, choreography, even make-up, hairstyle and clothing were in others' hands. Hence Britney Spears the pop star, as opposed to the person, functions rather like a commercial product: a whole team of people are involved with the creation and marketing of a brand. Paul Morley neatly captures this essential point when he describes pop stars like Britney Spears as "fictional characters that other people have written a soundtrack about" [18].
The recording of 'Overprotected' follows a typical chart pop song structure: Introduction - Verse 1 - Bridge 1 - Chorus - Verse 2 - Bridge 2 - Chorus - Middle Eight - Chorus (X3). The arrangement consists largely of sampled and synthesized sounds with several guitar tracks that have been subjected to extensive signal processing (equalization, flanging, filtering, etc.). It also exhibits extreme attention to detail: the level, envelope, and timbre of each sound have been considered with the utmost musical care. The resulting recording has a textural complexity that is musically impressive, while still retaining a strong and direct sense of character (which is both immediately asserted and then sustained throughout), and a determined, almost relentless, momentum.
Like a great deal of modern pop music, 'Overprotected' emphasizes the manipulation of rhythm and timbre while melody and harmony, the elements often given greatest importance in traditional musical composition and analysis, are far more limited and predictable. Melodically, 'Overprotected' is characterized by simple, diatonic lines in C minor that often move by step and strongly reinforce the rather traditionally stated tonality of the piece: there are no "blue" notes or portamento ornamentation to suggest blues or rock, for instance. The harmony of the piece, based entirely on three note chords linked in predictable and traditional ways, is similarly unadventurous. Indeed, the only hint of modulation throughout this fiercely tonal piece comes with the Neapolitan sixth progression, a device established in the Baroque, that occurs on the final cadence of the chorus, on the word "overprotected".
Example 7: Britney Spears' 'Overprotected'
However, the treatment of the lead vocal is neither traditional nor predictable. While the melodic range is limited to a minor sixth and draws upon a small number of repeated rhythmic cells that strongly support the predominantly anapaestic rhythm of the lyrics, multitrack overdubbing and extensive signal processing are used to ensure an extremely high level of diversity. A wide range of techniques are employed on the voice including double- and triple-tracking, multitap delay, extreme panning, unusual equalization, proximity to the microphone (and hence the listener), and the varied application of a range of artificial reverberation settings. This is evident from the beginning of the recording with the short spoken introduction that lasts a mere 14 seconds. It starts with the whispering of several voices presented backwards and is followed by a few short phrases ("I need time, love, joy. I need space. I need me") that are broken up by being subjected to four different sonic treatments: there is a main voice, a voice panned hard left, another panned hard right and a final voice that has a relatively fast multitap delay.
Example 8: Intro to 'Overprotected'
Similarly, the first chorus begins with a couple of short questions and answers:
     "What am I to do to win my life?
     (You will find out don't worry)
     How am I supposed to know what's right?
     (You've just got to do it your way)"

The voices that pose these questions are triple-tracked with one voice placed centrally and the other two panned hard right and left, while the answers are double-tracked without panning but timbrally modified through the use of a vocoder. As the piece progresses, double-tracking, multitap delay, and vocoder effects are increasingly applied to short phrases or even single words within phrases. Moreover, secondary contrapuntal lines are increasingly added (e.g. "help the way I feel" in the second chorus) to sections when they are repeated. This breaking up of phrases through overdubbing and signal processing is most extensive in the latter part of the middle eight section and here a further technique is introduced: the sample-like stutter (on "what I, what, what, what I'm gonna"). This involves the rhythmic repetition of a word or part of a word, reminiscent of the digital sample manipulation so popular in the 1980s – the best-known example is, of course, the "N-N-N-Nineteen" that appears on Paul Hardcastle's 'Nineteen' [19]. Incidentally, it is rather ironic that a technique developed through keyboard control of short digital recordings should subsequently be emulated by singers and become an accepted part of modern pop vocal performance practice [20].
Example 9: 'Overprotected' Middle Eight
The fundamental point here is that the single lead vocal, which has been a staple part of popular song for so long, is being modified and fragmented for expressive purposes as a result of technological manipulation.
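For readers less familiar with the studio processes named in this section, the following is a minimal sketch of three of them: multitap delay, triple-tracking with hard panning, and the sample-like stutter. It is an illustration only and makes no claim about how 'Overprotected' was actually produced. It assumes that a vocal take can be treated as a plain list of numbers, and every function name, delay time and gain value is hypothetical.

```python
# Illustrative only: a "take" is a plain list of float samples, not real audio.
# All names and settings are hypothetical, not taken from the recording.

def multitap_delay(signal, taps):
    """Mix the dry signal with delayed, scaled copies of itself.

    taps is a list of (delay_in_samples, gain) pairs.
    """
    longest = max((delay for delay, _ in taps), default=0)
    out = list(signal) + [0.0] * longest
    for delay, gain in taps:
        for i, sample in enumerate(signal):
            out[i + delay] += gain * sample
    return out


def pan_hard(signal, side):
    """Return (left, right) channels with the take on one side or centred."""
    silence = [0.0] * len(signal)
    if side == "left":
        return list(signal), silence
    if side == "right":
        return silence, list(signal)
    # Centre: equal level in both channels (a simplification of real pan laws).
    return list(signal), list(signal)


def triple_track(centre, left_take, right_take):
    """Sum three takes into stereo: one central, two panned hard left/right."""
    n = max(len(centre), len(left_take), len(right_take))
    left = [0.0] * n
    right = [0.0] * n
    placements = ((centre, "centre"), (left_take, "left"), (right_take, "right"))
    for take, side in placements:
        l, r = pan_hard(take, side)
        for i in range(len(take)):
            left[i] += l[i]
            right[i] += r[i]
    return left, right


def stutter(signal, start, length, repeats):
    """Repeat a short slice in place, in the manner of a retriggered sample."""
    fragment = signal[start:start + length]
    return signal[:start] + fragment * repeats + signal[start + length:]
```

In this simplified model, a slapback echo of the kind discussed earlier is just a multitap delay with a single tap, for instance `multitap_delay(take, [(3, 0.6)])`, while `stutter(take, 1, 1, 3)` repeats one sample-long fragment three times before the take continues.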
If we accept that the use of the microphone enabled a new kind of intimate relationship between singer and listener, and that slapback echo and related magnetic tape-based effects tended to suggest an internalized voice to the listener – almost in the manner of a soliloquy – then multitracked, multiple lead vocals might be seen to function rather like an internalized drama: a series of voices (which are fundamentally identifiable by the listener as the same voice) discussing a topic. In this regard, it is perhaps appropriate to mention that the lyrics of 'Overprotected' explore an issue that is likely to be of primary concern to a young or adolescent audience (i.e. the bulk of those people who buy records by Britney Spears). One might also add that the use of multiple lead vocal parts in modern pop recordings might be echoed by, and have an impact upon, the accompanying visual imagery. Hence in a section of the video for Natasha Bedingfield's 'These Words' [21], for example, the separate, simultaneous voices in the recording are presented by several differently clothed Natashas, performing each particular line. Similarly, the back cover of the Britney Spears album on which 'Overprotected' appears carries multiple images of Britney.
Example 10: OHP of back cover image.

 4  Conclusion

Multitrack overdubbing and extensive signal processing have given rise to the inclusion of several different voices by the same singer on modern pop recordings. This practice has several important implications. On a purely musical level, the recording blatantly challenges notions of traditional performance practice: it simply could not be re-created live. Moreover, being able to combine several different versions of the same voice on a single recording produces a timbral synergy which surpasses that of the great vocal ensembles of the past. Instead of an ensemble of distinct singers, the listener is presented with an impossible "super-representation" of a single character. Finally, in terms of creativity, possibilities for expression and contrapuntal complexity are increased, giving rise to musical artefacts that display greater range and depth.
From an aesthetic perspective, a recording like 'Overprotected' is perhaps best listened to with headphones since the spatial manipulations that form a vital part of its character are more clearly audible. Yet the use of headphones, especially when connected to a portable, Walkman-like device, tends to undermine completely the traditional social aspects associated with music and music-making: the single listener is isolated from the outside world with their individual choice of pre-recorded music resonating inside their head. Furthermore, the multiple vocals found on many modern pop recordings would seem ideally suited to such a listening environment. They are rather like radio plays: multiple internalized soliloquies, approaching a particular, often intimate, issue from several perspectives, and would consequently seem incongruous if performed before a large audience in an arena. The title of this paper – The Song of the Hydra – makes the link between the practice of recording multiple lead vocals and a mythical beast, the many-headed monster of Greek mythology. Multiple lead vocals on a recording make explicit the mythical element present in all audio recording. As Evan Eisenberg points out, "there is no original musical event that a record records or reproduces. [...] The original musical event never occurred; it exists, if it exists anywhere, outside history. In short, it is a myth" [22]. Multiple vocal tracks on a single recording, like crooning and slapback echo, are a further indication of the fundamental and ever-growing differences between recording and traditional musical performance practice. Over the past 100 years recording has developed its own techniques and relationships with listeners that are not only alien (and inappropriate) to traditional performance practice but also simply unachievable within it.


[1] Roland Gelatt, The Fabulous Phonograph 1877-1977, Cassell and Company Ltd., 1977, p. 21.

[2] Quoted in ibid., p. 147.

[3] Jerrold Northrop Moore, Sound Revolutions. A biography of Fred Gaisberg, Sanctuary Publishing Ltd., 1999, pp. 91-96; Enrico Caruso, Opera Airs and Songs Milan 1902-04, 1985 (EMI CDH 7 61046 2).

[4] Nellie Melba, Melba, Nimbus Records, 1997 (NI7890).

[5] They Called It Crooning, 1984 (ASV CD AJA 5026).

[6] Derek B. Scott, From the Erotic to the Demonic. On Critical Musicology, Oxford University Press, 2003, p. 84.

[7] Elvis Presley, Elvis at Sun, 2004 (BMG 82876 61308 2).

[8] Evan Eisenberg, The Recording Angel. Music, Records and Culture from Aristotle to Zappa, Pan Books, 1988, p. 53.

[9] Mark Cunningham, Good Vibrations. A History of Record Production, Castle Communications, 1996, p. 33.

[10] Quoted in Charles L. Granata, I Just Wasn't Made for These Times. Brian Wilson and the Making of Pet Sounds, Unanimous Ltd., 2003, p. 40.

[11] The Beach Boys, Twenty Golden Greats, 1987 (EMI CDP 7 46738 2).

[12] Sidney Bechet, Really the Blues, 1993 (ASV CD AJA 5107).

[13] The Beatles, Sgt. Pepper's Lonely Hearts Club Band, 1987 (EMI CDP 7 46442 2).

[14] George Martin and William Pearson, With A Little Help From My Friends: The Making of Sergeant Pepper, Little Brown & Co., 1995, p. 79.

[15] Britney Spears, 'Overprotected' (CD Single), 2001 (Zomba 9253072/LC07925).

[16] Britney Spears, Britney, 2001 (BMG 82876 53637 2).

[17] Britney Spears, Britney: The Videos, 2001 (Zomba Video DVD 9222798).

[18] Paul Morley, speaking on Front Row, BBC Radio 4 (14 November 2003).

[19] Paul Hardcastle, 'Nineteen' (1985), Number Ones of the 80's, 1993 (EMI 7243 8 27013 2 3).

[20] See Timothy J. Warner, Pop Music – Technology and Creativity, Ashgate, 2003.

[21] Natasha Bedingfield, 'These Words', 2004 (Arista 82876639182).

[22] Evan Eisenberg, op. cit., p. 41.