The Representation Of Space In Audio And Audiovisual Works.

Michael Bates

Faculty of Architecture Design & Planning University of Sydney




Since the introduction of film sound in the late 1920s, the mainstream aesthetics of representing space in audio production have followed two parallel paths: one in music and one in cinema.



This paper offers a brief discussion of some key concepts related to the aesthetics of spatially arranging audio in music recording and cinema audio production, citing a series of exemplars from a short history of ‘staging’ sound in music production, cinema audio and sound design.



The paper further discusses the construction of space with sound and the arrangement of elements within that space, with reference to the shibbolethic promise of reproducing the ‘real’ that has pervaded the discourse around audiophile stereophonic recordings and cinema surround sound. It also looks at the advent of new contexts for the representation of space in audio and audiovisual works.



It identifies and discusses the possibility of an underlying aesthetic equivalent, in much of pop and rock music production as well as in mainstream cinema, of a dynamic aural mise en scène that has, for the most part, particularly in music production, gone unremarked upon in the critical literature.




“All spaces, whether they are moving-image spaces, landscape spaces, architectural spaces or sound spaces, contain within them their own contours, features, dynamics and hence perceptual logics.


In a movie theater, the image, no matter how much depth of field it contains, is bound in a two-dimensional box. Meanwhile the sound, even if it is a simple stereo image, is continually setting the boundaries of the perceptual space in a fluid way and can move around and occupy any part of that space or several at once.”(Jones 2006: 21)



The practices of audio production, and in turn sound design, have developed into complex forms of constructing discourse, with their own poetics: their own grammar, conceptual shorthand, and a multiplicity of genre structures and forms that are referenced, albeit often tacitly and seemingly transparently. This is particularly true when it comes to representing space across different media.



Ostensibly, for many audio practitioners, the spatial arrangement of sound in a mix is a fairly straightforward matter with few alternative approaches, but on closer examination there is a rich array of complex, often contradictory codes at play that have been employed throughout the history of audio production in both the music and film industries.



It is easy to assume that techniques that work in an aural-only mode will translate to a sound-for-picture mode or vice versa, or, for example, that multichannel techniques developed for cinema or home theatre will always be appropriate for music production, and again vice versa. In fact, the differences between audio-only and sound-for-picture modalities can be quite marked.



For example, voices are treated differently in the two media, although they cross-pollinate. A reverberant voice in the iconic Chess or Sun rock ‘n’ roll style is not generally employed in cinema and would not be read the same way. Reverberant voices in film tend to be used in voiceovers to represent inner thoughts, or to evoke an eerie, uncanny or disembodied threat, or, if synchronised to picture, perhaps a visiting alien or malevolent spirit rather than the teenage angst of the spirited outsider or their chorus. Yet, as we shall see, there are connections.



In the absence of image, the listener creates their own visual impression and, in terms of space, their own imagined geographies. As Martin Esslin succinctly put it:



‘The nature of man’s consciousness and sensory apparatus is predominantly visual, and inevitably compels him to think and imagine in visual images. Information that reaches him through other senses is instantly converted into visual terms.’(Esslin 1962:131)



When image is present, even an extremely panned sound may be perceived as coming from a different spatial origin if it is linked to a visual. This ‘spatial magnetisation’, according to sound film theorist Michel Chion, happens reflexively and ‘in spite of the evidence of our own senses.’ (Chion 1994: 70)



On the one hand picture respatializes audio, but on the other it can also increase our awareness of the boundaries and make audiences conscious of the replay mechanism (the Brechtian moment). Indeed, the dominant (Hollywood) visual paradigm is to eschew revelation of the mechanism, which often runs contrary to audio listening, which revels in production techniques: Denis Smalley’s concept of ‘technological’ or ‘recipe’ listening. (Smalley 1997)



With more music being mixed for DVD, issues in sound design for picture become increasingly relevant, and with the advent of HD-DVD, Blu-ray and 5.1 audio streaming it is more likely that producer/engineers used to a stereo aesthetic will be called upon to produce more spatial outcomes. Indeed, another ‘gimmick factor’ (see below) may come into play: product that commercially exploits spatialities in a similar way to how Sun and Chess produced their new, distinctive and saleable sounds in the 1950s. At a time when sales are declining as a consequence of new media, one can also see the commercial attraction of offering material only readily enjoyed on media that is much more difficult to pirate.



As such, it may be worthwhile to touch upon some of the underlying themes regarding representing or constructing space, particularly in popular music production and in sound design for cinema, noting similarities and differences between the two modes.


2  Space in Music


It is important to note a long pre-recording-era history of reverb and echo in the production of music, and indeed in the drama of mankind: for example, in the design and use of cathedral architecture for Gregorian chant, and in subsequent Renaissance musics.



Since antiquity, humans have been concerned with respatializing performed sound: off-stage voices and instruments in classical theatre, the resoundings of oracles from their shrines. In the ancient world there were deep linkages between reverberant space and the sacred or magical. Indeed, it has been argued (Reznikoff 2005) that prehistoric cave painting took place in sites chosen as much for their acoustic properties as for anything else, and would have been accompanied by ritual singing, reverberating and resonating through the dark alcoves. Reverberation was heard as a signifier of the power of the gods or their agency: the pharaoh, the king, the priest or the emperor. This in turn affected the choice or design of various performance spaces, which, again in turn, has shaped expectations of recorded music.



In Western music, a long lineage of composers has explored the arrangement of music in physical spaces: from the antiphonal Syriac chants of the third century to the polychoral work of Willaert and the Venetians Andrea and Giovanni Gabrieli, in works by Thomas Tallis and Bach, in the various spatial arrangements in works by Vivaldi, Berlioz, Tchaikovsky and Mahler, in George Ives with his mapped-out marching bands, and in John Cage, György Kurtág, Luigi Nono and Henry Brant, to name a few.



However, Michael Clarke argues, it is:


‘Not until Stockhausen that space is formally incorporated into the compositional process, given equal weight as other parameters and applied in various ways as movement, position (place and distance), envelopment, structural proportion, symmetry and etc.’ (Clarke 1998: 221-246)



Stockhausen’s methods of working with multiple channels of sound playback, rotating loudspeakers and, more recently, even mounting performers and speakers to be flown around in helicopters, in the Helicopter Quartet (1992-93), have rarely been emulated, either in terms of the spatial being so integral to the music or in the detailed techniques and scale of his approach to spatial composition. Arguably the first to compose quadraphonically, with Kontakte in 1960, Stockhausen also poses problems of recording and documentation: it is still difficult to imagine how to ‘reproduce’ works such as his 55-loudspeaker performances at Expo ’70 in Osaka through conventional audio playback systems.



Moreover, music itself has long developed spatial metaphors and codes. According to composer Thanos Chrysakis:



‘Composers have always created auditory spaces of different dimensions—that is, different kinds of musical spaces, from two-dimensional space (a simple melody) to more elaborate auditory dimensions (the dimension of depth through the permeability of the sounds, the dimension of directionality through their various speeds and movements and so on). Multidimensional composed auditory spaces can be found in—amongst other places—fugues, sonatas and contemporary music that uses net-like structures, sound masses, polyphonic multi-layering, etc.’ (Chrysakis 2006: 40-45)



In simple terms, composers, by using differences in dynamics, might imply closeness through loudness and relative distance through softness of instrument volume, while rises and falls in pitch might imply vertical ‘up’ and ‘down’ movement. This phenomenon has long been exploited in animation to underscore the movement of objects and characters, where it is dubbed ‘Mickey Mousing’, and is used, albeit usually more subtly, across a range of cinema soundtracks: for example, in Wings of Desire (1987), where the audience tracks the point of view of the angel as he flies out of the aeroplane window and then descends at the end of the shot into the apartment building, accompanied by rises and falls of pitch.
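The loudness-as-distance code has a straightforward acoustic analogue that mixers still exploit: level falls off roughly in inverse proportion to distance, so each doubling of distance drops a source by about 6 dB. A minimal sketch of this mapping (the function names and the 1 m reference distance are illustrative assumptions, not drawn from the paper):

```python
import math

def gain_for_distance(distance_m, ref_distance_m=1.0):
    """Inverse-distance law: each doubling of distance loses ~6 dB.
    Returns a linear gain for scaling a source's amplitude; sources
    closer than the reference distance are clamped to unity gain."""
    return ref_distance_m / max(distance_m, ref_distance_m)

def db(gain):
    """Convert a linear gain to decibels."""
    return 20.0 * math.log10(gain)

# A 'close' instrument at 1 m plays at full level; one 'placed' at 8 m
# is attenuated by roughly 18 dB and heard as farther away.
near = gain_for_distance(1.0)   # 1.0, i.e. 0 dB
far = gain_for_distance(8.0)    # 0.125, i.e. about -18 dB
```

The point of the sketch is only that a single scalar, applied per instrument, already implies a spatial stage; no actual room is needed.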



Many of these techniques were well in place by the eighteenth century, and as a consequence, says Peter Doyle:



‘Orchestral music came to have a kind of world making capability [and] these worlds became in turn emblems or externalisations of the composer’s.. inner state; the musical landscapes.. became a “screen” onto which was projected the inner state of the composer.’ (Doyle 2005:58)



Moreover, music composers, particularly in the popular music domain, had by the late nineteenth and early twentieth century (if not earlier) drawn on what Doyle (2005) describes as ‘a mélange of musical features.. what were rather crudely drawn geographic, cultural and racial conceits’ that included “Hawaiian”, “cowboy”, “Irish”, “Latin” and blackface tropes containing within them spatial signifiers.



Hence, once these passed into recording practice, for example, ‘the sliding steel guitar.. came to stand for “Hawaiian music” and the Hawaiian landscape’, while later the electric steel guitar would also come to be associated with the landscape of the “western”, then the surf (where the waves are heard as the ‘landscape’ of the sea) and the spy movie (the tough guy, by association with the cowboy). Hank Williams’s steel players would use ‘train whistle wah-wah [and] sobbing tremolo to reference “offstage places”, to provide a voice from outside the studio.’ (Doyle 2005:58)



Ironically, many of the richly developed spatial music codes and physical techniques of the pre-recording era had to be put aside with the arrival of recording, as space collapsed to a narrow mono bandwidth and limited dynamic range and spatial musics became impossible to record meaningfully. Many spatial concepts had to be rediscovered and/or reinvented, often to be used in widely different contexts.


3  Early Recording and the Quest for the Facsimile


Historically, almost every technical innovation regarding spatial audio seems to have been accompanied by a commercial appeal to aspirations for the realist replication of performance. From discussions of the ‘transparency’ of microphones and loudspeakers, to the ‘trimensionality’ of stereo in the 1930s and again of hi-fi in the 1950s, marketing appeals have been made on the basis of the ability either to replicate an original audio performance or to create or recreate a three-dimensional space. Similarly with Cinerama and then surround sound, and then IMAX and home theatre systems: spatial ‘reproduction’ is often presented simply as a range of technical audio engineering issues to be resolved, rather than recognizing the more complex underlying cognitive and psychological factors involved in spatial representation and the apprehension of audio space, most particularly memory and associative cues drawn from both the real world and other mediated content.



For example, the rhetoric of 1950s hi-fi advertising seems surprisingly resonant with recent appeals to similar desires and expectations in audio, home theatre, IMAX and broadband. In 1955, Ampex claimed their tape was:



‘The signature of perfection in sound.. [that] takes you beyond the hi-fi barrier.. and you’ve pierced the last barrier that stands between you and actually experiencing the complete realism of the original performance.’(Audio 1955)



Or the ‘New Bozak 304 luxury speaker system’ which, with its:


‘Stereo fantasy- adds the ultimate ingredient of realism.. Spatial Perception.. with two channel stereo program material, it recreates an unbroken front of living sound over the whole angular listening area, retaining throughout virtually the entire room, the sense of perspective and homogeneity of sound that is the essence of the ultimate in realism.’(Audio 1956-57)



The advertising offered to transport the listener to alpine peaks, tranquil forest settings, seasides and desert sands, and to experience other galaxies, earthquakes and even sounds from other worlds. Yet early stereo ‘product’ often had less to do with its much-hyped ‘realism’ and more to do with content that panned dramatically to draw attention to the replay medium, whether as an aesthetic or a merchandising ploy.



This same appeal to realism has continued to the present day. For example, when Dolby introduced their stereo systems to cinema:


‘The advertising for Dolby Stereo often emphasised the ‘realism’ of sound, presumably because it eradicated the difference between sound recording and the actual sound and also delivered sound material around the listener (offering the equivalent of sound patterning in daily life).’ (Whittington 2007:118)



As Peter Doyle writes:


‘Later still, stereo headphones, the Sony Walkman, hi-fi home stereos, large public sound systems, discos and so forth would all seek immersion responses from their listeners- the willing sense that the actual physical space the listener occupies is for the moment less real, less important, even less convincing than the virtual aural dimensionality within the recording.’(Doyle 2005:81)



The introduction of digital media such as CD, DVD and now Blu-ray and HD-DVD can be seen as further efforts to come closer to the oft-stated desire of reproducing ‘reality’ in one’s own living room: in other words, to deliver the facsimile.



It all seems to have started with Edison.



The early physical arrangements of recording elements were for a mono, narrow-bandwidth world. Pre-electric recordings necessitated a prioritising and discarding of instruments.[i] Musicians were strictly spatially arranged, with the singer or soloist closest to the recording horn, privileging the voice and letting the audience get closer to the performer than they would in a real performance space. Music needed to be performed forte with a narrow dynamic range, which undermined spatial metaphors in music. And so, as Peter Doyle argues, those early recordings ‘came imbued with their own spatial codes; they constructed their own proscenium arch’ (Doyle 2005:50), with the various placements of instruments and the content of the music composition constructed to superimpose fictional “sound pictures” onto the playback space, or at least that was the promise.



This ideology was typified in the release of content known as ‘descriptive recordings’. A blurb in one 1897 brochure for a piece called “Morning On the Farm” promised to open a window on a farmyard scene ‘so real and exact that it requires but a slight stretch of the imagination to place one’s self in that position’. (Doyle 2005: 50)



This window on a ‘virtual reality’ was in essence little more than the recording of barnyard chickens.



Indeed, in spite of the limitations of the nascent recording process, from as early as 1878, when Edison first wrote about the phonograph, he promoted the idea of a sound recording producing a perfect facsimile of the original sound and presented it as a fait accompli, claiming he had achieved:


“The captivity of all manner of sound waves [and].. their reproduction with all their original characteristics at will, without the presence or consent of the original source.”(Freire 2003: 67-71)



Subsequently, the pursuit of recreating the performance or the concert hall setting has remained a pervasive theme in the audio technology domain, particularly with respect to marketing equipment and content, often clouding the aesthetic mechanisms of that content.



Sergio Freire describes Edison’s later marketing strategy for his phonograph, from 1916 to 1926, as an intention ‘to demonstrate that it was actually impossible to distinguish the singer’s living voice from its recreation in the instrument’. ‘With such a strategy, relying on the equivalence of live performance and recording reproduction,’ he says, ‘Edison sought to disguise spatio-temporal disruptions brought about by the phonograph.’ (Freire 2003: 67)



Indeed, by the time the recording mechanism had shifted from a mechanical to an electrical one in the early 1920s, there were many who felt that, with the increase in recording bandwidth to 100-5000 Hz, this facsimile was not only desirable but truly achievable. For the first time, audible recording of background ambience, the sound of the performance space, was possible.



Harry F. Olson of RCA would opine in 1934:


‘The primary object of sound reproduction is the elimination of distortion and the reproduction at the listener's ears of sound waves identical in structure with those given out at the source.’ (Altman 1992)



This became the engineering litany, the textbook definition, or, as sound film theorist Rick Altman describes it, 'the sound engineer's basic assumption, his primary article of faith'. (Altman 1992)



As Ross Snyder would write in the AES journal as late as 1953:


“The search for ‘facsimile’ reproduction of sound has gone on for more years than the recent mushrooming of the high-fidelity industry might at first indicate.. It is, indeed, startling to realize that facsimile was accomplished as long ago as 1933.” However, though a number of parameters were established for evaluating the sound, “the fifth consideration turned out to be crucial: the preservation of spatial orientation.. No important number of observers were ever deceived into mistaking the copy for the original. In 1933 at the Chicago World's Fair, however, a binaural system, produced under laboratory conditions and presented under optimum circumstances, finally achieved the desired deception.. It remained, in that year, for the Bell Laboratories to duplicate the feat with loudspeakers, using a system of three rather than two channels, and labelled ‘stereophonic’.” (Snyder 1953)



Yet it was during this same era that audio producers began to find aesthetic alternatives to Olson’s engineering raison d’être, as the recording process shifted from one whose main purpose was to document or archive a performance to one of shaping the sound of recordings.



In the same year that Olson wrote his credo for transparency, an article appeared in Everyday Science and Mechanics magazine noting the aesthetic possibilities of the new ‘three-dimensional’ sound system demonstrated by Bell Labs:



‘The acoustic engineers can not only give a faithful reproduction of an orchestra but, with their filters, tone and volume controls, execute variations on it at will.. Electrically, we can have a thousand-piece orchestra. The artistic possibilities of “auditory perspective” are as yet only experimental, but they place a remarkable opportunity before composers and conductors.’ (Everyday Science & Mechanics 1934)



It is also interesting to note how, in contrast, over at RKO, film sound men such as Carl Dreher had been quick to pick up, as early as 1931 (one might argue by necessity), an aesthetic of overt deception:




'Since the reproduction of sound is an artificial process’, he said ‘it is necessary to use artificial devices in order to obtain the most desirable effects.’(Altman 1992)



Regardless of one’s aesthetic approach, something fundamental had shifted in recording, and it was very much to do with space:



‘With electric recording, suddenly it became possible to represent space in a wholly new way.. A listener might now apprehend a recording and experience a sense of physical space, other than one he was actually occupying.’ (Doyle 2005: 56-57)



Electric recording thus gave rise in the 1920s to two competing music recording philosophies that grew out of radio practice: on the one hand, sound would be captured on a single mike in an acoustically neutral studio; on the other, multiple mikes would be placed and, as a by-product, would pick up more of the room. The former, dry, ‘depthless’ and ‘placeless’ recordings would be regarded as a form of “realism” and used principally on ‘popular music’[ii], while the latter method, with an aural depth that came from picking up the room ambience, would be applied to classical orchestral music and be regarded as “romanticism”.



According to Read and Welch (Doyle 2005: 56-57), who identified this dichotomy, the former approach gave ‘an effect of intimacy, the orchestra and soloist being transported into the living room, the singer or soloist singing just for you’, while the latter transported the listener into the performance studio or auditorium. It is interesting to note that the ‘romanticist’ recording tends later to be regarded as the more purist and “realistic” approach (the argument that audiophile recordings which pick up the instruments and the room ambience in situ are more ‘honest’ and ‘real’), while the dry, close recordings have come to be regarded as part of the shaped and manipulated ‘product’. One can see how this grew out of the early stereo aspirations of Blumlein, with his ideas of reproducing the wavefront, and of Bell Labs’ idea of ‘facsimile’ sound reproduction in three-channel stereo. This dichotomy permeates arguments to the modern day about the merits of various recording and reproduction systems, particularly regarding which methods are the most transparent in reproducing the audio space of the performance.



Yet, according to Francis Rumsey, producing a verisimilitude of an acoustic environment is not at the pragmatic heart of most practitioners’ use of spatial audio:


‘The primary aim of most commercial media production is not true spatial fidelity to some notional original soundfield, although one might wish to create cues that are consistent with those experienced in natural environments.’ (Rumsey 2001:19)



In cinema, no one really expects to see ‘reality’ on the screen: take, for example, the use of constant cuts to close-ups or extreme close-up shots. So why would one expect the sound to match the cuts? In fact, audio perspective is constantly violated in film: sound perspective usually does not change between cuts to match the shot, since it was learned fairly early in sound film that doing so could be quite disturbing, as Jean-Luc Godard so clearly demonstrated and exploited as an aesthetic in his films from the early 1960s on, such as Masculin/Féminin (1966).



One could perhaps argue that in cinema a romantic ‘realism’ existed from the advent of film sound: a pretence that the sound on the screen was real when it was in effect accepted as always illusory, albeit with the mechanism of that illusion effaced (until Godard and company came along!)


4  Early Constructions of Recorded Space



One of the first consequences of the recording process was to change the relationship between performer and audience, allowing the listener to get closer than would ever have been possible in a performance space. With the advent of electric recordings it became possible for the producer to choose between a more ‘dry’, ‘close’ perspective on a recording and a more distant aural perspective that included the room ambience. Indeed, through mike placement, room positioning and the use of multiple microphones, various instruments could be located in very different spatial perspectives within the one recording. By the 1950s this idea of shaping a recording would come to be described as ‘making a record’ as opposed to ‘making a recording’.



Influenced by radio, the ability to admit the audience into a singer’s personal space lent nascent electric recordings a sense of intimacy and led to the possibility of a spatial reconfiguring in music arrangements, whereby ‘crooners’ such as Bing Crosby could gently, and seemingly personally, address their audience while their big band played in the background. As Howard Goodall describes it:



‘This was the aural equivalent of the cinema’s close-up.’(Goodall 2001:200)



However, while these warm, intimate spaces could be created as envelopes for one’s living room, it was also possible to transform recordings into spaces unrelated to the performing stage or studio, including more ‘existential’ spaces, or what Peter Doyle dubs ‘the acoustics of otherness’ (Doyle 2005:92).



Doyle posits an awareness in recording, as early as the 1930s:


‘of radio’s deterritorializing potentials’ where ‘sound qualities alone might be used to construct analogies to vision, with the potential to move between the real and the fantastic’, and cites a range of examples from the 1920s onwards where performers and producers begin to create a range of ‘virtual spaces’, including, through the use of room reverb[iii] in ‘cowboy’ music, pictorial evocations of the mythic and geographic American western landscape. Indeed this reverberation would become a spatial code used to create ‘a sonic scenery.. apparently controlled, fixed and framed as a romanticized, heroic, “western” landscape painting’, using miking techniques to locate ‘the central voice on-mic, with the other voices progressively further off-mic [so that] they set up a sense of here and there, of centre stage and side of stage, of self and other, apparently confident that record buyers would supply the absent visual accompaniment.’ (Doyle 2005:105)



Moreover, beyond the absent visual, the reverb and echo in the music also now came to signify the heroic aloneness of a solitary figure riding off into the sunset.



Meanwhile, Robert Johnson would be recorded in 1936 in the Gunter Hotel playing while facing the corner of the room. As Peter Doyle describes it:



‘By singing directly into the corner of the room, Johnson enacts a temporary exponential enlarging of his own subjectivity. The room architecture itself is enlisted so as to become a temporary aspect of the self. The walls ring, resonate and echo.. Johnson successfully invites the audience to consciously zero in on his space… The singer is a world-maker, a demiurge. He is both creator of the world and the actor within it.. Having set up this space Johnson then uses it as a stage, a mise en scène.’ (Doyle 2005: 78-82)



At that point, a recording was no longer a realist but poorer-quality facsimile of a performance; it had become ‘a record’, and the audience was invited to enter and immerse themselves within its subjective territory.


5  Fresh Sound



The idea of ‘making a record’, of achieving a “sound” as opposed to capturing a performance, has a long tradition in popular music. The ‘Sun’, ‘Chess’, ‘Liverpool’, ‘Motown’, ‘Seattle’, ‘Chicago’ and ‘Manchester’ ‘sounds’ are but a few that come to mind.



There can be a light year of difference between making a ‘recording’ and making ‘a record’. The latter is the search for sounds, for style, for statement beyond capturing a performance. There is a gimmick aspect to it, the sound of “the sound”, and certainly those at Sun and Chess records were aware of it.



As Phil Chess put it: “We saw that the original and fresh stuff, the fresh sound was what sold records.’(Doyle 2005:163)



What is important to realise is that there has often been a strong spatial dimension to those ‘sounds’, from the early rock ‘n’ roll slap echo to Phil Spector’s ‘Wall of Sound’ (an amorphous spatial mass drenched in chamber reverb) to Phil Collins’s compressed drum kit resounding in the Townhouse stone room on “In the Air Tonight”.



Les Paul’s and Bill Putnam’s explorations of sound in the late 1940s were influential and represented a further move away from representing performance or pictorial sound spaces.



In 1947, at Bill Putnam’s Universal studios, Jerry Murad and the Harmonicats recorded ‘Peg of My Heart’. Putnam is generally credited with that recording as the first use of artificial reverb (the echo chamber) on a pop music recording, initiating the ‘echo craze’ that ensued in the following years as the technique became an overt recording gimmick. He was most certainly not the first to use the technique on recordings: it had been in use in radio drama and cinema since the 1930s[iv], while Pierre Schaeffer at the same time was taking pre-recorded sounds, such as trains, replaying them with altered ambience and re-recording them. But according to Bruce Swedien, an engineer at Universal at the time:



‘It was the first pop music recording where artificially controlled reverb was used for artistic effect. Many of the recordings that were done prior to that had reverb, but it was part of the acoustics of the recording environment. Bill's contribution to the art was that he literally came up with the design of the way the echo or reverb sound is sent from the recording desk and the way it's returned to the mix so that it can be used in a variable amount’. (Swedien undated)
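The send/return topology Swedien describes can be reduced to a few lines: a variable portion of the dry channel is ‘sent’ to the effect, and the processed (‘wet’) signal is ‘returned’ to the mix at its own level, leaving the dry path untouched. The sketch below is an illustrative reduction only; the feedback delay line stands in for Putnam’s physical echo chamber, and all names and parameter values are invented for the example:

```python
def simple_echo(signal, delay_samples, feedback=0.4):
    """Stand-in for an echo chamber: a feedback delay line that
    returns only the delayed ('wet') component of its input."""
    buf = [0.0] * len(signal)
    for n in range(len(signal)):
        if n >= delay_samples:
            buf[n] = signal[n - delay_samples] + feedback * buf[n - delay_samples]
    return buf

def mix_with_send(dry, send_level, return_level, delay_samples=3):
    """Send/return routing: tap the dry channel at send_level, run the
    tap through the effect, and add the result back at return_level.
    The dry path is untouched, so the wet amount is freely variable."""
    sent = [send_level * x for x in dry]
    wet = simple_echo(sent, delay_samples)
    return [d + return_level * w for d, w in zip(dry, wet)]

# An impulse exposes the routing: the dry click passes through
# unchanged, followed by echoes whose level depends only on the
# send and return settings, not on the room the source was recorded in.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mixed = mix_with_send(impulse, send_level=0.5, return_level=0.8, delay_samples=3)
```

The design point Swedien credits to Putnam is visible in the two independent knobs: because the effect sits on a parallel path rather than in line, the reverb amount can be varied per channel, and even moment to moment, without altering the dry performance.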



Importantly, this ability to dynamically manipulate space has remained a dominant feature of much popular music recording to the present day. Producers began to use varying amounts of reverb between verses and choruses, and even between lines, ultimately with the effect of seeming to imbue the singer with a kind of shamanistic ‘dominion over space.’ (Doyle 2005:157)



From 1947 until the coming of rock in the early 1950s, a number of artists, including Patti Page, Les Paul and Mary Ford, Rosemary Clooney and Vaughn Monroe, made extensive use of spatialising effects in their music productions, in increasingly non-literal and expressionistic ways.



Les Paul’s success with his use of multitracking, delay, echo, reverb, phasing, flanging and tape speed manipulation ‘did much to free the recorded music product from the need for it to be a true, pure analogue of a prior real-world sonic-musical event’ (Doyle 2005:146). Yet Paul brought a different sensibility to spatialising practices and ‘conceived of his music on a vertical rather than a horizontal axis’, such that while the strategy might be spatial, it was not spatial in the sense of a place or a geographic landscape.



Yet this essentially West Coast-based school of using echo and reverb still framed the listening within a virtual proscenium arch and ‘rarely invited its listeners to “inhabit” its virtual spaces, to become co-creators of its inner worlds in the way that southern rockabilly or Chicago blues (sometimes) did.’ (Doyle 2005: 162)



Or rock ‘n’ roll.


6  Welcome to the Subterranean



Cinema in the 1940’s came to use reverb and echo as markers of the supernatural, the eerie and uncanny, establishing a cinematic device that would extend particularly to present-day horror and science fiction. Beyond the literal physical subterranean space, filmmakers began to use reverberant sounds as symbols of the unknown or the unknowable, of rumours and disembodied spirits.



‘When the lights dim the aural space swells out, but once this happens the position of the remote actants (ghosts, night sounds, cat people) can no longer with certainty be fixed… lacking the concreteness and fixity of the material daylight world the echoic world assumes the characteristics of a dreamed space- acausal, unstable, non-linear, non-rational.. allowing the aural mis en scene to expand nightmarishly.’ (Doyle 2005: pp 116-117)



This sonic cinematic trope would in turn become part of the spatial repertoire of the rock ‘n’ roll record, not as one used to evoke fear, but to evoke the calling down of the wayward spirit, or, rather, the spirit calling for the audience to come in to the subterranean world of the performer.



From around 1953, Sam Phillips’s recordings at Sun Records began to feature a slapback echo sound that soon came to be regarded as the trademark of the “rock ‘n’ roll sound” and, as a consequence of its commercial success, was widely emulated. While reverb techniques had already been used by Tin Pan Alley as well as the West Coast commercial mainstream, both Sun and Chess studios used the devices to create extreme non-realist spaces that the listener was invited to share with the performers. Space had become wild and chaotic. Slapback delay had not only a spatial impact but also strong, explosive rhythmic and timbral effects on the music that suggested disruptive ‘troublemakers on the threshold.’ (Doyle 2005:186)



Concepts of space in music recording according to Doyle (2005) had been:


‘contemptuously exploded.. [and] Phillips.. turn[ed] his mis en scene inside out by constructing an impossible, M. C. Escher–like space.’ Whereas ‘earlier spatial recording practices located their spaces generally within strict narrative frameworks.. rock ‘n’ roll spatialities.. were utopian.. They were about “no-place”.. “no particular place to go”.. an infinite transcendent now.’ (Doyle 2005: pp 233-234)



Reverb had come to have a new connotation- that of the marginalised outsider calling in a reverberant voice from a subterranean ‘underground’ for the sub-culture to join in and participate on the dance floor or in the milk bar as they replayed the record.



By the time Elvis Presley had left Sun in 1955 and recorded ‘Heartbreak Hotel’, the use of reverb had become a ‘gothic structure [with] Presley a weirdly inflated presiding presence within it’- a new “dark priest” in an “underground” cathedral of ‘adolescent angst.’ (Doyle 2005:209)


7  Acoustic Fantasies



By the 1960’s the aesthetic basis for a non-realist construction of recorded audio space was well established. The decade would see these ideas evolve more radically, as the spaces became at times psychedelic ones, and the era was seen as a new stage in ‘record making’ as musicians and their producers overtly chased new sounds.



The production of new acoustic fantasies was in part enabled by an acceleration of technical innovation in recording and performance technologies, including the introduction of portable delay units and the inclusion of reverb devices in amplifiers. It meant that instruments could exist in their own spatial territories while remaining part of an overall recording studio space within a recording. It also meant that musicians were able to quote prior spatial studio production practices and develop them as motifs within their own oeuvre.



The introduction of tape splicing (and later multitracking) meant that a band like the Beach Boys could, on “Good Vibrations”, record different vocal phrases in different studio spaces to produce different moods and feels. The superimposition of phasing effects across a track like The Small Faces’ “Itchycoo Park” could see a song’s entire acoustic space at times rendered otherworldly, in a style emblematic of an entire generation’s psychedelic mindset.



One of the most influential “sounds” was created by producer/composer/arranger Phil Spector. His trademark ‘Wall of Sound’ created an impossible sonic world, an acoustic mulch of elements achieved both by technical means - derived from Sam Phillips’s iconic echo chamber and the tape effects used at Sun Records - as well as by his over-the-top musical arrangements, with a great deal of instrument doubling (including drum kits, basses and rhythm guitars) to create an overwhelmingly grand sonic mass. Of course this approach was as much a textural as a spatial one, and it is interesting to find Spector, undoubtedly as a consequence, a great advocate of the mono mix. It is certainly light years away from a ‘realist’ recording aesthetic.



Arguably the most iconic non-realist aesthetic of the era is that of The Beatles and George Martin from Revolver onwards. In songs like ‘Tomorrow Never Knows’, spaces no longer simply become imaginary: time-space is reversed, distorted and fed back, non-acoustic worlds are layered with studio instruments and sound effects, and at times it seems everything is processed. John Lennon sings through a Leslie speaker, tapes are looped, drums and guitar solos are reversed. And yet overall the ‘space’ still has a coherence, a credibility, a poetic sensibility. For indeed, this arrangement of spaces is still unified as a dynamic mise en scène by the structure of the song.



As author Virgil Moorefield says of Spector and Martin: ‘although different in many ways, both of their approaches to production involved replacing the quest for the illusion of physical reality with a new aesthetic. The new sonic world they sought to create was the appearance of a reality, which could not actually exist, a pseudo-reality created in synthetic space. The manipulation of figure and ground through placement in the mix were all manifestations of a new aesthetic. Their interest lay not in replicating the natural world, but rather in transforming it into something else.. They were in effect attaining a status akin to that of a film director. The mise-en-scène, which had always been the domain of the record producer, widened in scope, as did his artistic role.’ (Moorefield 2005: viii-ix)



In essence, by the time The Beatles were making Sgt. Pepper’s as ‘a record’, one can hear in the music the landscapes of mini-movies being played out - movies more animated and surrealist than theatrical, and more reflective of inner states than external ones. In their wake, a wider array of musicians and their producers, from Pink Floyd to Brian Eno, David Bowie and Tom Waits, to name just a few, would endeavour to create in their recordings what Frank Zappa called a “movie for your ears” (Zappa 1969), in part by profoundly manipulating space in their recordings. Indeed, as Lemmy of Hawkwind/Motorhead would later explain of his music (Classic Albums 2005), many of the landscapes were not only born of actual LSD experiences but were intended to complement them as they were being experienced by the audience.


8  Non-Acoustic Instruments



By the end of the 1960’s, the territories of popular music also began to be inhabited by other instruments capable of evoking otherworldly spaces. Indeed, forever searching for new sounds, The Beatles’ Abbey Road marked one of the first uses of one of these instruments - the Moog synthesiser - in popular commercial music. One can’t imagine the ‘moviescapes’ of artists like Pink Floyd or the imaginary landscapes of Brian Eno’s ambient music without such instruments, and in fact they had been used in cinema to evoke exotic places for over a decade before coming into use in popular music.



New spaces created by the synthesiser reached popular culture through science fiction cinema in the 1950’s, beginning with Forbidden Planet (1956). This had of course been preceded by the theremin, most notably used in The Day The Earth Stood Still (1951) but first used in film by Shostakovich as early as 1930, and in use in orchestras from the late 1920’s, along with the Ondes Martenot and later the Hammond organ - all instruments that emanated solely from electric spaces. One should also mention the synthetic sounds produced by drawing directly onto optical film soundtracks, a practice that began in the early 1930’s.



Of course, in many forms of music, audio has been spatialised in a multiplicity of ways that have more to do with perceptual fantasies or acoustic fictions than the sound of real-world spaces. Electric and electronic spaces had long been explored outside the mainstream, most notably by the likes of Cage, Stockhausen and later Alvin Lucier and Morton Subotnick, so a rich tradition of spatial construction with non-acoustic instruments has long been at play. Examples abound, from musique concrète and electronic music to overlaid ambient soundscapes, cinematic atmospheres, sound art and techno, as well as the other worlds created by the BBC Radiophonic Workshop (and later EMS), such as those of Dr Who and the original Hitchhiker’s Guide to the Galaxy.



Indeed, one could argue there has been a continuum in the construction of non-realist spaces in music production from the Telharmonium[v] in 1897 onwards. From the dry uncanny voice of the theremin in the 1920’s to the acoustic neutralities of Kraftwerk’s sequencers, from the synthetic drum machines of Prince and hip hop to the digital detritus of the anacoustic worlds of glitch and noise, new spaces have been constantly invented and old ones reconfigured, all produced with sounds that have no origin in the acoustic world, and all then subsumed into the acoustic geographies of popular music.



One should also not forget the electric guitar creating its other worlds with feedback, distortion, tape delays and reverb. One need only think of the sonic worlds of Jimi Hendrix, or of the digital delays of The Edge’s guitar system creating textures with, as Robert Palmer eloquently put it:



‘A mystical, archetypal quality that brings together the acoustic qualities of mediaeval Christian, Muslim and Indian religious architecture, ancient Pythagorean notions of fundamental, cosmic “vibration”, Chinese and Indian metaphysical traditions as well as African hoodoo beliefs in “the Devil or Legba, the Yoruba/hoodoo god of the crossroads, the opener of paths between the worlds.”’ (Doyle 2005:29)



Indeed, to hear such music, said Palmer:


“is to immerse oneself in a clanging droning, sensurround of guitar harmonics within a precisely demarcated, ritually invoked sonic space. This is the movable Church of Sonic Guitar, a vast and vaulted cathedral vibrating with the patterns and proportions of sound-ratios tuned precisely enough to have pleased Pythagoras.” (Doyle 2005:29)



Moreover, the studio itself was coming to be regarded as a musical instrument in which to compose virtual spaces. As musician/producer Brian Eno (2004) later explained: ‘Immersion was really the point: we were making sound to swim in, to float in, to get lost inside.’



At the time, Eno declared: “It's now possible to make records that have music that was never performed or never could be performed and in fact doesn't exist outside of that record.. I want to think not in terms of evoking a memory of a performance, which never existed in fact, but to think in terms of making a piece of sound which is going to be heard in a type of location, usually someone’s house.” (Moorefield 2005:22)



Thus, says Virgil Moorefield: “realism, or the model of replicating a concert experience, just isn't the point anymore. What matters is the sonic experience the record offers, on its own terms, as sound. The producer is the director of the aural movie.. As of Sergeant Pepper, it had become quite clear to the pop world that performances and records were not necessarily linked.”  (Moorefield 2005:22)



Ironically, as a by-product, the problem for many musicians would then become how to reproduce that ‘record’ as live performance in the concert hall.


9  The Sound Stage & The Superimposition of Spaces


With the advent of multitrack recording it became easier to record and overlay multiple spaces within the same aural time frame. With close miking able to suppress natural room acoustics, mixes could dynamically reconstruct new configurations of space, with some instruments impossibly proximate and others emanating from virtual or electronic spaces. The practice itself was not new - multiple miking had been at play since the introduction of electrical recording - but with multitracking’s ease of use it became far more pronounced.



By the 1970’s, with the widespread adoption of multitrack recording and a proliferation of stereo music production, despite the growing number of aesthetic options, certain conventions for constructing the sound stage began to emerge, particularly in terms of stereo panning.



Stereo music albums had been available since the 1950’s but these tended to be expensive and primarily marketed for the audiophile, more so in a post-war Europe still in economic recovery. As mentioned previously, often that early stereo ‘product’ was concerned with content that panned dramatically to draw attention to the replay medium, or to musics that tended to be recorded with the ‘romanticist’ aesthetic, such as orchestral or jazz music, where producers aimed to reproduce the concert space in the living room.



In popular music, the change to producing and mixing The Beatles’ recordings in stereo is an illustrative one. Essentially, until the last few albums, material was produced and mixed primarily for mono playback, with stereo remix sessions taking place almost as an afterthought, often conducted in the absence of the producer and most certainly in the absence of the band. The ‘mono first’ approach could create challenging issues for the stereo remixer: for example, during the definitive mono mixdown of “I Am The Walrus”, a live BBC radio broadcast of ‘King Lear’ had been recorded onto the master over the playout, necessitating that the stereo version be spliced with a mono ending. Listening to the stereo mixes from this period, for example “Hello Goodbye”, one is struck by the extreme panning of musical elements: drums, piano and bass to the left; backing vocals, strings and electric guitar to the right; and the lead vocal shared between the two channels to achieve a mono centre. What is somewhat surprising, given the limitations of 4-track recording and the de rigueur hard panning, is how cohesive and spatially coherent the whole picture sounds when played back, despite a very different arrangement of the sound stage to what would come to be regarded as ‘conventional’.



Despite improvements in the technology, even by the final album, Abbey Road (1969), one can still hear similar spatial arrangements of instruments on the sound stage. For example, on “Come Together” the drums and electric piano are still panned hard left and the rhythm guitar hard right, although the bass and lead guitar are now starting to take a more central stage position or are more finely panned, while on other tracks a more finely arrayed spread of elements has begun to be implemented, most evidently in songs like “The End”. It is interesting to note that these stereo geometries were being produced at a time when the band had decided to make a ‘record’ that sounded more like the band performing on a stage - in other words, they were being drawn away from their world of acoustic fictions back to a ‘romanticist’ aesthetic.



Yet if at first sight the trend in these spatial geometries seems to suggest a reversion to the pursuit of the facsimile, what evolves in popular music production is something more complex, for encapsulated on the sound stage, within the veneer of a seemingly ‘realistic’ spatial arrangement, were the multiple layers of acoustic codes that had come before.



If, for example, one looks at a microhistory of drum recording practice in terms of changing fashions in spatial geometry and spatial construction, one can clearly see profound changes in aesthetic approaches, particularly regarding the arrangement of the kit on the sound stage. In earlier times a ‘good’ drum kit sound was ‘out there’ in the room. Gene Krupa’s big band drum sound of the late 1940’s and early 1950’s, for example, is distant in the mix (except in solos) and always reverberant, without any close miking. The bass drum is difficult to discern and the sound is spatially chaotic, conveying the kind of explosive exuberance that presages the rock ‘n’ roll spaces of the 1950’s.



By the time of Ringo’s solo on Abbey Road, the drums had become drier, much closer and more isolated (multiple miked and compressed) and had begun to be panned across the sound stage, with the respective distance between the tom toms impossibly wide. Somehow, as an image, in the same mix, we can hear the drums both intimately and yet spread out further than was ever physically possible on a real stage. 



Of course, it was also a question of style - at the same time that Ringo was measuring his beats, a wildly reverberant Keith Moon was being recorded with a liberal addition of room sound, creating spatial chaos in The Who. What is certain is that the ‘realist’ audiophile stereo techniques employed in recording, for example, an orchestra were nowhere in sight - instead a vast rhythmic mechanism was in the process of being constructed, one that would ultimately lead to the massive sounds of Public Image Ltd and later Nine Inch Nails.



By the end of the 1960’s what seemed to emerge was what has become the classic textbook method of arranging a drum kit across the sound stage. As Stanley Alten describes it in Audio In Media:



‘The bass and kick drum go in the middle of the mix, along with the snare that sits right on top of the kick. The hi hat is placed off to the right side. The tom toms are panned across the stereo field.’(Alten 1981: pp 571-72)



Indeed, a conventional mix- and nothing like Ringo being panned to the extreme right of stage. In actuality, what he seems to be describing is a subset of drum recording practices founded on the sounds and geometries of classic rock drummers such as Charlie Watts and John Bonham that have pervaded much of rock production both live and recorded ever since. Indeed, if one adds the other instruments following the rest of the recipe, the essential ‘classic rock’ spatial arrangement seems complete:



‘The lead vocal is in the centre of the mix and up front.. The background vocals.. are making a line behind it.. The string sound.. works best as a long, smooth line way in the back.. Acoustic piano is panned to 7 o’clock and 5 o’clock positions.. The rhythm guitar fills the space about 9 o’clock and 3 o’clock. A lead guitar, sax or synthesiser fits just to the left or right of centre. The lead instrument is kept a short distance from the bass and central drums..’ (Alten 1981: pp 571-72) and so on.



It is important to recognise - though the various texts do not usually point it out - that this arrangement might in fact be considered a single scene in what could be a much more dynamic spatial arrangement of the elements. Moreover, it is an arrangement that conceptually treats the mix as a construction of a band on stage, where of course the instruments and players are essentially in fixed positions, firmly framed by a theatrical proscenium arch in which the actors rarely change position.



Much to Alten’s credit, he does recognise that “there are many options in positioning various elements in an aural frame” and that “each musical style has its own values”. (Alten 1981: pp 571-72) Indeed, and those options might include dynamic constructions of the sound stage, where some elements are fixed but others move, a little like changes of lighting arrangements on a set, as well as the superimposition of spaces (conceptual as well as physical). In broader terms the idea of a dynamic spatial and temporal configuration is resonant with how Serge Lacasse ‘likens the concept of vocal staging to the broad theatrical notion of mise-en-scène, and that of setting to a particular effect of mise-en-scène occurring at a given time or lasting for a given duration.’ (Doyle 2005:29)
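Alten’s ‘clock’ positions are realised in practice through stereo panning. A minimal constant-power panner is sketched below; the mapping of clock positions onto a -1…+1 pan axis is an illustrative assumption of mine, not Alten’s.

```python
import math

def pan_gains(position):
    """Constant-power pan law. position runs from -1.0 (hard left)
    to +1.0 (hard right); returns (left_gain, right_gain) such that
    left**2 + right**2 == 1, so perceived loudness stays constant as
    an element moves across the sound stage."""
    theta = (position + 1.0) * math.pi / 4.0   # 0 .. pi/2
    return math.cos(theta), math.sin(theta)

# Hypothetical mapping of Alten-style clock positions onto the pan axis
CLOCK_POSITIONS = {
    "7 o'clock": -0.8, "9 o'clock": -0.5, "centre": 0.0,
    "3 o'clock": 0.5, "5 o'clock": 0.8,
}
```

A ‘scene change’ in the dynamic sense discussed above is then simply a repositioning of these gain pairs over time.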



Perhaps one of the best examples of this dynamic mise-en-scène would have to be Tony Visconti’s production of David Bowie’s vocal track on “Heroes” at Hansa Studios in Berlin in 1977:



‘'Heroes' was apparently recorded with three different mics — one close, one a few feet away and the third at the other end of the room — the latter two of which were gated with different thresholds in order to introduce increasing amounts of room ambience as Bowie sang louder.’(Sound on Sound 2001)



Bowie’s voice, like those of the rock ‘n’ roll shamans of the Chess and Sun studio era before him, roams across several spatial domains - from the close intimate crooner to the explosive outsider, from the subterranean cathedral to all stations between. Ironically, Visconti achieved this effect principally through mike placement, albeit complemented by various processes, yet the effect would be much emulated in the digital processing era to come - by Tom Waits, for example, on ‘Earth Died Screaming’, or by Nick Launay in his production of Peter Garrett’s voice on Midnight Oil’s Earth and Sun and Moon album.
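The gated three-mic arrangement described in the Sound on Sound quote can be sketched as signal flow. The model below is a deliberate simplification: each ‘room mic’ is just an attenuated copy of the voice rather than a genuinely reverberant signal, and the gains and thresholds are invented for illustration.

```python
import numpy as np

def gated_multi_mic(voice, distant_gains=(0.5, 0.25),
                    thresholds=(0.3, 0.7)):
    """The close mic always passes; each more distant 'mic' is gated
    so it only opens while the voice is loud enough to exceed its
    threshold, adding room ambience as the singer sings louder."""
    mix = np.copy(voice)
    level = np.abs(voice)
    for gain, threshold in zip(distant_gains, thresholds):
        room = voice * gain                      # toy distant mic
        mix += np.where(level > threshold, room, 0.0)
    return mix
```

The spatial result is the one described in the text: a quiet phrase stays intimate, while a loud one progressively unlocks larger rooms.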



As Launay explained:


‘From the point of view of creating moods, if a vocal was just a narrative about some external subject, not a personal one, then the vocal could be a bit more distant and treated. When it's a personal lyric, then you often find it's better to mix vocals up close so that it sounds like you're having a conversation.’ (Horton 1993)



Between verse and chorus, between one line and the next, even between words, producers would more than ever before encode changes of space in the vocals as part of the emotional grammar of the arrangement.



It is useful to consider, too, how seemingly unrelated technical considerations at times predetermined some of the spatial strategies involved in arranging elements within the sound stage. One can almost certainly trace the centre-dominated bass/bass drum mixing convention, at least in part, to problems in the disc cutting process (no longer an issue in the digital domain, although bass management can create other issues).



According to Bobby Owsinski (1999): ‘Because music elements tended to be hard panned to one side this caused some serious problems: if a low frequency boost was added to the music just on that one side, the imbalance in the low frequency energy would cause the cutting stylus to cut right through the groove wall when the master lacquer disc (the master record) was cut. The only way around this was to either decrease the amount of low frequency energy from the music to balance the sides, or pan the bass and kick and any other instrument with a low frequency component to the centre.’ (Owsinski 1999: pp 21-22)
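The modern form of the fix described above is often called ‘elliptical EQ’: folding low frequencies into the centre so both groove walls carry equal bass energy. A toy mid/side sketch follows; the one-pole filter is a crude stand-in for a real elliptical filter, and the function names are invented.

```python
import numpy as np

def one_pole_lowpass(x, alpha=0.05):
    """Crude one-pole low-pass filter isolating the low frequencies."""
    y = np.zeros_like(x, dtype=float)
    acc = 0.0
    for i, v in enumerate(x):
        acc += alpha * (v - acc)
        y[i] = acc
    return y

def fold_bass_to_centre(left, right, alpha=0.05):
    """Remove low frequencies from the side (difference) channel so
    bass energy lands equally on both groove walls, leaving the
    higher frequencies free to stay panned."""
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    side = side - one_pole_lowpass(side, alpha)  # strip lows from side
    return mid + side, mid - side
```

A signal that is already mono passes through unchanged; only hard-panned low-frequency content is pulled towards the centre.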



Similarly, the coming of multitracking gave rise, in some quarters, to a strong denial of acoustics in recording.



As Mark Cunningham (1998) points out: “the additional flexibility given to engineers through the use of sixteen-track machines explains the sudden clarity of rhythm tracks, particularly drums, which came to fruition in the early Seventies - a period when strict close miking was the order of the day and little or no room ambience would creep into the mix.”



This dryness would of course be over-corrected with the coming of digital processing in the late 1970’s, through the addition of artificial reverb, and may well help explain why it was applied so extremely.



From the late 1970’s forwards, coincident with the introduction of digital delay and digital reverb, the practice of superimposing spaces became more pronounced. Drum kits would come to have several spaces enfolded within their domain, or be impossibly scaled - a myriad of otherwise dry drum kits would resound with Yamaha-reverberated snares, while Phil Collins’ seemingly ‘gigantic’ snare drum[vi] became a prime example of a spatialised signature sound identified with the time.



Indeed, from this time in popular music production it is not unusual to find many contexts in which ‘spaces’ are layered. For a more subtle example one need only listen to some of the beautiful mixes by Daniel Lanois, such as The Neville Brothers’ album Yellow Moon, where different reverberant treatments on various instruments are superimposed and change structurally throughout the songs, creating shifts of intimacy and mood and evoking different abstract senses of ‘place’. Alternatively, one might consider the layers of re-reverberance that Eno and Budd and their collaborators used to spawn the ‘new age’ ambient aesthetic of oneiric soundscapes. Both Eno and Lanois, who have collaborated on many projects, not only practice techniques of superimposition but, expanding on Putnam’s idea of sending a sound to the echo or reverb and varying its return to the mixing console, send the ensuing reverberated signal to a second processor while mixing out the original (dry) sound, in so doing building more complex superimposed spatial textures while simultaneously effacing the acoustic sources.
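The cascaded routing described above - a reverb fed into a second reverb, with the dry source mixed out - can be sketched schematically. This is an illustration of the routing only, with a trivial comb ‘reverb’ and invented parameters, not the actual processors or settings used by any of the producers named.

```python
import numpy as np

def reverb_tail(x, delay_samples, feedback=0.6):
    """Toy reverb returning only the reflected energy (dry removed)."""
    y = np.copy(x)
    for i in range(delay_samples, len(y)):
        y[i] += feedback * y[i - delay_samples]
    return y - x

def effaced_space(dry, delay1=800, delay2=1300, blend=0.7):
    """The first reverb's wet-only output feeds a second reverb; the
    final mix contains only the two wet returns, so the original
    acoustic source is effaced."""
    first = reverb_tail(dry, delay1)
    second = reverb_tail(first, delay2)
    return blend * (first + second)
```

Because no dry path reaches the output, the listener hears only superimposed spaces with no audible source, which is precisely the effacement the text describes.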



From the 1980’s on, the sound of drums was completely redefined with the introduction of synthetic and then sampled instruments. From their non-acoustic origins, synthetic sounds brought new electronic geographies to the sound stage, while sampling meant that encoded spaces could be recaptured, reconfigured and overlaid with new spatial codes. The ‘record’ of the drum kit could be replayed as an instrument; the performance could become another loop. The loop could then be granulated, its space collapsed, and bits and bytes of it reassembled with other elements amidst the noise of new spatial arrangements. Moreover these virtual instruments, having not necessarily originated on an acoustic sound stage, could be given credible spatial geometries completely unrelated to physical space.



Indeed, the idea of an acoustic space became increasingly irrelevant for some, such as Trent Reznor of Nine Inch Nails:



“Everything was programmed. My idea of a drum is a button on a machine. When I hear a real drum kit...when someone hits a kick drum, it doesn't sound to me like what I think a kick drum is. Any time I've been faced with, 'Let's try miking up the drums', well, you put a mike up close, you put another one here, 300 mikes, gates, bulls---, overheads, bring 'em up and listen to it and it doesn’t sound at all like it did in the room. It sounds like a 'record-sounding drum kit.' It doesn’t sound like being in the room with live ringy drums. You read these interviews where producers will say, 'It sounds like you're in the room with the band'. No it doesn't. Nirvana's record doesn't sound like you're in the room with them. It might sound sloppy, and it sounds interesting, but it's not what it sounds like in the room, to me, anyway.”(Moorefield 2005:55)



Virgil Moorefield (2005:67) offers the following insightful response:



‘This blurring of the distinction between recordings of real instruments and recordings of recordings is an interesting game, and fits right in with Reznor's overall strategy of playing a kind of push-and-pull with the two poles of record production: the illusion of reality (“this was played by real people in a real setting”) and the reality of illusion (“this doesn't exist in the real world; we're making our own universe”).’



This convergence of sampling, superimposition techniques, and the use of microsound, glitch and noise elements has seen the production of musical soundscapes by such a diverse range of artists as Christian Fennesz, Matmos, Pan American and David Toop, their mediated spaces at times sounding like cinematic atmospheres and at others like post-apocalyptic fallout from a dysfunctional matrix. These post-digital spatial tropes, ‘thick with imaginings, memories, utopias, foreboding’ (Toop 2004:54), have found their way into the musical spaces - the ‘universes’ or ‘acoustic fictions’ - conceived by mainstream artists including U2, Radiohead, Björk, and even alt-rock bands such as Wilco on albums like Yankee Hotel Foxtrot.


10  Representing Space in Cinema



‘The cinema is a dream we all dream at the same time.’


Jean Cocteau. (Bergen 2005)



In cinema, indeed even before the moving picture was technically possible, the idea that synchronised sound and image might produce a facsimile of the world was being mooted. In its first review of Edison’s phonograph, Scientific American (1877) declared:


'It is already possible by ingenious optical contrivances to throw stereoscopic photographs of people on screens in full view of an audience. Add the talking phonograph to counterfeit their voices, and it would be difficult to carry the illusion of real presence much further.'



Even before the coming of film sound, two basic attitudes dominated the history of film theory and practice: a realist approach ‘celebrating’ the raw materials, and an ‘expressionist’ approach focusing on the power of the filmmaker to modify or manipulate reality. According to James Monaco (2000):



‘The first dichotomy of film aesthetics is that between the early work of the Lumière Brothers, Auguste and Louis, and Georges Méliès. The Lumières had come to film through photography. They saw in the new invention a magnificent opportunity to reproduce reality. On the other hand, Méliès, a stage magician, saw immediately film’s ability to change reality - to produce striking fantasies.’



Behind these two principles lies, on the one hand, a desire to garner audience participation and, on the other, a desire to engender detachment. A major force in early American cinema, D. W. Griffith described the two major ‘schools’ of film practice:


‘The American school says to you: “Come and have a great experience!” Whereas the German school says: “Come and see a great experience.”’ (Monaco 2000:287)



The idea of immersing the film spectator in the experience of the narrative has remained a significant part of the rhetoric of Hollywood film marketing to the present day, particularly in extending to the sound apparatus the role of creating that illusion. Yet, like any good magic show, the director as magician/illusionist has always endeavoured to efface the apparatus.



‘In this economic sense’, says Monaco, ‘movies are still a carnival attraction - rides on roller coasters through chambers of horror and tunnels of love - and Realism is totally beside the point.’ (Monaco 2000:385)



Except perhaps when you are selling the picture or the home theatre technology!



Initially, the development and economic success of film sound was dominated by a primary concern for synchronised dialogue. Early sound film had little concern with spatial information, and it would be many years before atmospheres became an integral part of the film narrative. Yet while early sound films were principally talking films intended to convey a sense of drama on a naturalistic level, in reality they functioned by way of what Eisenstein described as "highly cultured dramas" and other photographed performances of a theatrical sort (Eisenstein, Pudovkin & Alexandrov 1928), while the audio was reduced to a semblance of the reality it pretended to portray; and, as mentioned previously, directors soon learned to violate aural perspective in order to maintain audiovisual continuity and aural intelligibility.



Indeed, the space of much early film sound was very ‘boxed’ and theatrical, without atmosphere or ambience, due to technical limitations (the microphones were poor, the cameras themselves were noisy, and the early sound recording stages had poor acoustics and/or isolation). Perhaps because of those limitations, directors learned that the omission of sounds, including the use of silence in lieu of synchronised sound, could be a powerful evocative cinematic tool. Russian film theorists recognised and argued early on that film sound could be served as well by the asynchronous and contrapuntal use of sound.



Prior to the arrival of sound in the late 1920’s, film had developed a complex visual grammar, particularly through the use of montage. Moreover, perhaps because it took so many years to implement - the phonograph was around fifty years old by the time of the first film sound feature[vii] - an understanding of the symbolic use of sound through live music in the silent cinema theatre had become intrinsic to the dramatic form, with a well-developed system of leitmotifs providing a wide range of available codes. Indeed, an entire tradition of non-diegetic sound superimposition had been employed during the silent era - from the voice-over, to music and off-screen sound effects. With a parallel understanding of the non-literal use of sound developing in radio, by the time of the arrival of film sound a non-realist tradition was evolving, ready to underpin the synchronised dialogue once the technical means were available.



The early introduction of a separate film sound camera was a major ideological breakthrough, for it meant that sound and image could be separated and recombined with different elements, and with this reassociation came a fundamental realisation of the symbiotic relationship between sound and image, the idea that sound could modulate the meaning of image and vice versa. As Walter Murch describes it:



“This reassociation should stretch the relationship of sound to image wherever possible. It should strive to create a purposeful and fruitful tension between what is on the screen and what is kindled in the mind of the audience.. by virtue of their sensory incompleteness — an incompleteness that engages the imagination of the viewer as compensation for what is only evoked by the artist. Every successful reassociation is a kind of metaphor, and every metaphor is seen momentarily as a mistake, but then suddenly as a deeper truth about the thing named and our relationship to it. The greater the stretch between the "thing" and the "name," the deeper the potential truth.” (Jarrett October 2000)



Indeed, Murch argued it was in the manipulation of ‘incompleteness’ either visually or aurally that the true power of cinema lay:



‘That’s the key to all film for me—both editorial and sound. You provoke the audience to complete a circle of which you’ve only drawn a part.  Each person being unique, they will complete that in their own way. When they have done that, the wonderful part of it is that they re-project that completion onto the film. They actually are seeing a film that they are, in part, creating: both in terms of juxtaposition of images and, then, juxtaposition of sound versus image and, then, image following sound, and all kinds of those variations.’(Jarrett Spring 2000)



A good recent example of this is the film Inland Empire (2007), where director David Lynch repeatedly underscores seemingly innocuous visual settings with deep bass drones to insinuate the eerie, uncanny and nightmarish undertones of the narrative. Elsewhere, during the interrogation scene in Holden’s office in Blade Runner (1982), the electronic sounds of the ambient atmosphere are transformed by the play of audio and vision in the narrative into a reflection of the internalised consciousness of the replicant.



When this ‘reassociation’ was further facilitated in the mid 1930’s by the introduction of ‘looping’ (post synch dialogue) and the possibility of ‘multitracking’ the audio by using multiple strips of sound film with a synchroniser, the basic technical elements were in place to complete an ideological shift from the ‘capture’ of a live filmed performance in synchronisation with the image to the ‘building’ of an idealised soundtrack to be created after the picture had been edited. 



At first, with a minimum number of tracks available to superimpose, and a narrow bandwidth and dynamic range due to optical film constraints, much of that role of ‘composing atmospheres’ was given over to the music soundtrack, in addition to any essential sound effects - with its own recorded spatial tropes encapsulated, and drawing on a well-codified system of leitmotifs from the silent film era to construct intimations of emotional and dramatic space. However, it would only be a matter of time and the enabling technology before a wider range of sound sources and more complex spatial tropes could be used, both as an underscore and as an integral part of the narrative.



Directors came to understand that the unravelling of the mind space of film could in fact be more like a dream, and that the landscapes they were creating aurally were oneiric ones.



Indeed, as film editor/sound designer Walter Murch notes in explaining how film cuts work:



‘Well, although ‘day-to-day’ reality appears continuous, there is that other world in which we spend perhaps a third of our lives: ‘the night-to-night’ reality of dreams. And the images of dreams are much more fragmented, intersecting in much stranger and more abrupt ways than the images of waking reality- ways that approximate, at least, the interaction of cutting. In the darkness of the theatre, we say to ourselves, in effect, ‘This looks like reality but it cannot be reality because it is so visually discontinuous; therefore it must be a dream.’ (Murch 2001: 55)


11  Channeling Voice, Channeling Space


If, however, the film soundtrack has to a great extent been liberated by the possibilities afforded by the reassociation of sound and image, the cinema sound field that has evolved from a mono optical soundtrack with a narrow bandwidth and limited dynamic range, to a digital multichannel ‘superfield’, as Michel Chion describes it (Chion 1994: 132), is still dominated in the mainstream by a centre channel dialogue convention best summed up in the phrase ‘dialogue is king.’ In other words, the space of the voice in cinema is primary: it must be intelligible, and every other sound is subservient to it regardless of acoustic ‘realism’.



Indeed, according to Rick Altman:


‘So deep-rooted is Hollywood's dedication to dialog intelligibility, that nothing but perfectly understandable dialog could possibly satisfy spectator expectations.’ (Altman 1995)



Ostensibly, in both film sound and music production, this ‘vococentrism’ or privileging of the voice is a convention rarely deviated from, though it may occasionally be moderated. On the rare occasions when the code is fully transgressed, the effect is profound and can be disturbing. In Elem Klimov’s Come and See, Mike Figgis’ Leaving Las Vegas, and Steven Spielberg’s Saving Private Ryan, narrative events justify these profound moments of loss of intelligibility to great effect, while in Jean-Luc Godard’s films, such as Masculin Féminin, changes in visual perspective accompanied by hard cuts in aural perspective, with a resultant loss in dialogue intelligibility, dramatically jolt the audience’s sensibility, making them aware of the mechanism of the medium in lieu of the content of the film.



‘Vococentrism’ pervades music mixes too, paralleling the fixity of mainstream cinema. One might ask where, for example, are the Jean-Luc Godards of pop and rock music mixing? It is hard to find examples. Apocryphally, the ‘Glimmer Twins’, Jagger and Richards, once claimed in an interview that they used vocal masking to create a desire in the audience to hear a song again in order to discern the lyrics, but this is more a matter of nuance than extremes. These nuances, however, can be rich in meaning, particularly given the authoritative power of the voice. For example, in cinema the voiceover presents a character outside the time-space of the diegesis, and at times a similar thing happens in music mixes when a processed or dry proxemic voice distances the singer from the space of the band or the narrative of the song.



With the dialogue given its centre channel pedestal, it is informative to look at how the use of the other channels in cinema has evolved. It is important to consider that alternative channel configurations could have been used quite differently, resulting in an entirely different cinematic aesthetic.



Indeed the original Vitaphone system in 1926 deployed a speaker configuration that did not endeavour to emulate the diegetic sound space of the film but instead attempted to replicate the silent film theatre experience:


“While one speaker is maintained behind the screen--in order to reproduce infrequent speeches.. the other is located in the orchestra pit, pointing upwards, simulating the sound of the orchestra it has displaced.” [my emphasis] (Altman 1995)


When by 1929 this paradigm had shifted to one where sound emanated solely from behind the screen, abandoning the orchestra pit model, it meant that henceforth the space of non-diegetic music in cinema would co-exist with the screen space - in other words, the paradigm moved towards an even more symbolic mode of representing audio, less rooted in the real-world correspondences of sounds.



Despite the constraints of mono, directors, drawing on age-old theatre traditions of ‘voices-off’, quickly learned to make use of off-screen sounds to extend the narrative space and work as powerful dramatic devices. From the motivic off-screen whistle that alerts the audience to the identity of the child killer in Fritz Lang’s M (1931), to the iconic off-screen coyotes and the cock’s crow that recalls New Testament betrayals and extends the narrative world in Sergio Leone’s The Good The Bad And The Ugly (1966) as he amplifies characters’ faces in close-up, directors created subjective audio spaces to heighten the dramatic tension rather than simply echo the location of the visual space. Alfred Hitchcock, who would come to regard the use of sound effects as the equivalent of dialogue, would in Psycho (1960) use different kinds of rain for structural dramatic effect and, most notably, in the aftermath of the famous shower scene, use the relative silence of the stark motel atmosphere to amplify Anthony Perkins’ visual revulsion at his grisly ‘discovery’. In these effects were the beginnings of the truly fleshed-out multichannel atmospheres that were to follow, albeit in later decades.



In the late 1930’s experiments began to take place with multichannel audio[viii], most notably with Fantasia (1940) - more a novelty of panning in an animated film than a concern with realism - but it was not until the 1950’s that the first commercial efforts to present a ‘3D’ experience took place in Hollywood cinema, with the introduction of Cinerama, followed by the CinemaScope and Todd-AO formats.



Yet if the three-channel stereo that Bell Labs had heralded in the 1930’s might usher in the age of the facsimile, Cinerama, with its five behind-screen channels augmented by three additional speakers in the auditorium, was less concerned with fidelity to the screen image and more with providing spectacle. Indeed, with the centre channel still devoted to the primacy of the dialogue, the auditorium speakers were ‘used only intermittently, usually to reinforce spectacular visual effects, [and] surround sound worked directly against the ideal of spatial fidelity applied to the three directional front speakers’. (Altman 1995)



With little atmosphere to produce a 3D verisimilitude (not that it was ever the point), it was really the function of the grandly orchestral music soundtracks and the widescreen to give the audience a sense of spectacular scale in their cinema experience. Indeed, in films of this period and over the next two decades one can find more interesting constructions of space created by the use of a single mono channel in imaginatively codified ways.



Perhaps the most powerful and influential evocation of spatial scale by use of the orchestral score comes with Stanley Kubrick’s 2001 (1968), where the space of the screen is ‘upscaled’ by the all-consuming orchestral score that leads us into the film and masks the noiselessness of the vacuum. In reality space may be silent, but in 2001 the non-diegetic music expands the void and indeed seems to create visual space as well as weightlessness. It is also used structurally, with great narrative impact: its absence is a powerful dramatic contrast when HAL cuts the astronaut’s lifeline on his spacewalk, and true silence marks the deathly profound.



The 1970’s saw the introduction of Dolby Stereo in cinema. It is important to stress that stereo does not mean two-speaker replay (indeed, even in the 1930’s early stereo was envisaged with a minimum of three speakers), and Dolby Stereo was in fact a four-channel system with a left-centre-right front plane and a single surround channel. It reveals a great deal about the underlying aesthetics that it was regarded as an extended stereo system.[ix] In other words, like the 1950’s construct, it offered a front plane complemented by ambient augmentation, as opposed to a fully immersive surrounding field. At first, however:


“ A new generation of sound specialists labored mightily to employ the surround speakers to enhance spatial fidelity. Having failed to learn a lesson from the mistakes of Fifties stereo technicians, the sound designers of the post-Star Wars era regularly placed spatially faithful narrative information in the surround channel. Recalling the 3-D craze in the mid-Fifties, for a few years every menace, every attack, every emotional scene seemed to begin or end behind the spectators. Finally, it seemed, the surround channel had become an integral part of the film's fundamental narrative fiber.”(Altman 1995)



Yet, in practice, it was found that despite being able to localise sounds around the audience and add spatial sonic movement to them as in real life, the result really wasn’t the ‘fidelity’ that the sound specialists were ‘labouring mightily’ to achieve.



Instead ‘through its usage as an element of spectacle and through its identification with the genres of spectacle, stereo sound became associated for audiences not so much with greater realism as greater artifice.’ (Belton 1992)



The outcome of this realisation was that by 1983, in most Hollywood cinema production, ‘all narrative information would henceforth emanate from the front speakers, with the surrounds used for spectacular (but nonessential) enhancements. Thus freed from any responsibility to present narrative events or even spatial fidelity, the surrounds began a new career (especially in fantasy or horror films) as purveyors of spectacular effects… the surrounds were being liberated from the demands of spatial fidelity or narrative relevance.’ (Altman 1995)


In this context, with the coming of Dolby 5.1 Surround Sound in 1979, it is vital to note that of the first two cinema releases in that defining format (Superman has a claim to be the first), it is the sound design approach of Walter Murch for Apocalypse Now that establishes a dynamic spatial sound design model which essentially underscores much of the aesthetics of Hollywood cinema sound today.



At the same time, certain geometries or channel configurations, and conventions for their use, had become codified: the cinema surround stage, with its ‘vococentric’ centre channel, ambient surrounds and special effects LFE channel, would endeavour to keep all essential sounds in the front plane, and music and atmospheres usually out of the centre channel. Ironically, it was not the spatial arrangement of the speakers that had the most effect on the perceived ‘realism’ of the cinematic experience but the addition of the .1 LFE channel, which accentuated the visceral experience by forging a tactile link between the audience and the events on the screen, or, by way of counterpoint, adding a note of dread or awe.



It is interesting to note that in the opening surround sound sequence of Apocalypse Now the space is a surrealistic one, with synthetic helicopters creating an oneiric geography, while the real space, ‘Saigon’ (in reality Manila), is established by way of a detached voiceover. The time-space of the film is established by the non-diegetic soundtrack of the Doors. Meanwhile, the subsequent jungle atmosphere superimposed over Willard’s hotel room conveys the ‘mind space’ that sound designer Murch was seeking to evoke, setting the tone for the protagonist of the film.



It is a long way from the facsimile, but eminently engaging.



12  A Dynamic Audiovisual Mise En Scène



In cinema, aural space had become an integral part of the narrative mise en scène. It had come to have a power akin to that ascribed by Louis Giannetti to the visual arrangement of elements within the frame:



‘Space is one of the principal mediums of communication in film. The spatial structure of virtually any kind of territory used by humans betrays a discernable concept of power and authority.’ (Giannetti 1972: 66-67)



Indeed, with the introduction of multichannel audio in cinema, according to Michel Chion, cinema space has been transformed; a consequence of the resultant ‘wraparound superfield’ of ‘multitrack cinema’, Chion argues, has been to ‘progressively modify the structure of editing and scene construction,’ undermining, for example, the importance of the establishing or long shot. (Chion 1994: 150)



Some of the best audio practitioners make judicious and considered use of spatial ‘gestures’ as a form of dynamic mise en scène construction, i.e. moving between mono, stereo, surround fields and other multichannel configurations. Rather than thinking of a constant field that one seeks to emulate in detailed verisimilitude to our everyday audio/acoustic experience, they create a series of permutating spatial signifiers including, but not limited to, those similar to our ‘real’ world experience, realising that real-world experience too is the result of a subjective and cognitive activity.



Almost certainly one of the first directors to construct a dynamic audiovisual mise en scène, albeit working in mono, was Orson Welles. Welles, with his radio background, had an acute understanding of the evocative powers of sound, having produced the 1938 ‘War of the Worlds’ broadcast about an alien invasion that reportedly saw parts of the east coast of the USA in panic and, it was claimed, even led to several suicides.



A fine example of how changes in the sound space can act as a plot device comes in Welles’ film Touch of Evil (1958). In the climax of the film, Sgt Menzies, wired for sound with a radio mike, crosses a bridge in conversation with the antagonist, the corrupt cop Hank Quinlan, while being recorded by the hiding fugitive Mike Vargas, who hopes to get evidence on Quinlan to exonerate himself.



Quinlan suddenly hears his voice echoing and distorted by the recording under the bridge and realises he is being taped.



As Walter Murch describes it: ‘So that echo - that particular quality of sound - causes the plot to unravel: Quinlan accuses Menzies, there is a struggle, Menzies is shot, then Quinlan goes after Vargas, and then is shot himself by the dying Menzies. Welles hung the whole ending of the film on the ability of the people in it, and the audience, to understand a subtle nuance with the sound. That it’s the wrong echo.’ (Ondaatje 2002: 195)



Murch’s work on the prototypical Apocalypse Now shows us how we can profit by thinking of a dynamic construction of space in terms of an integrated audiovisual mise en scène, where every sensorial input is an element that furthers our engagement along the narrative journey. An excellent example of this is the ‘Do Lung’ sequence described by Murch at a London School of Sound lecture in 1998:



“The scene begins with the realistic sounds of bridge construction. You hear
arc welders, flares going off, machine guns and incoming artillery. As the scene
continues, though, you'll notice that the explosions and the machine guns are
gradually replaced by sounds of construction - the machine guns become rivet
guns, for instance, so there's already a subtle warping of reality taking place. Francis
called this scene 'the fifth circle of Hell'. Once the scene gets into the trench, the
dilemma is explained: there's a Vietnamese soldier out there, a sniper taunting the
Americans, and they're shooting wildly into the dark with an M50 machine gun,
but they just can’t get him. Finally, out of frustration the machine gunner asks for
'The Roach'. He turns out to be a kind of human bat; someone who has precise
echo-location instead of sight: if he can hear the sound of the voice, he can then
pinpoint his target, adjust his grenade launcher and, in the dark, shoot the sniper.



As 'The Roach' approaches the camera, the rock music that has been reverberating
in the air of the scene, coming from all speakers in the theatre, concentrates itself
in the centre speaker only and narrows its frequency range, seeming to come from
a transistor radio which Roach then clicks off, taking all the other sounds with
it. After a brief rumble of distant artillery, there is now silence except for some
kind of unexplained, slow metallic ticking and the calling of the sniper. Visually
you see the battle continuing - flashes of light, machine gun bursts, flare guns
- but there is nothing of that at all. You have entered into the skin of this human
bat and are hearing the world the way he hears it. He echolocates, shoots, there's
an explosion and then a moment of complete silence: even the metallic ticking is now gone. Willard asks Roach if he knows who's in command, and Roach answers
enigmatically: 'Yeah.' Then the scene is over, we shift location and the world of
sound comes flooding back in again.’(Murch: 1998)



The effect of these spatial shifts in perspective can be dramatic, as sound designer Gary Rydstrom describes:


‘You can shift focus on a cut instantaneously and it has the effect of a Godard jump cut. There’s something that shocks you and jumps you into the next sound.’ (LoBrutto 1994)



It is a technique that has now been well integrated into the sound designer’s repertoire and, indeed, has been employed not only in film but also in television, including even The Simpsons.



Indeed, the “Treehouse of Horror VI” (1995) episode, in which Homer accidentally steps into the third dimension, is a neat encapsulation of how changes in space may be encoded in an audiovisual medium. From a mono and ‘dead’ flatland cartoon world devoid of music and atmosphere, Homer transits to the stereo ‘Tronland’, replete with its aircon-like atmospheric noise and wind loops and incidents of stereo high-frequency movement to map the geography of the stereo space. Between worlds, Homer talks to Marge through a reverse gated reverb, while in an abject denial of astrophysics the black hole is designed as a large reverberant cavernous space to denote its immense scale. At the end, the sound of ‘the worst place yet’ - downtown L.A. - is produced as an atmosphere of highpass-filtered traffic, a mediated sound reminiscent of the actuality of a newscast or documentary but a poor, stylised facsimile of the real world.
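The ‘highpass-filtered traffic’ effect - stripping away low frequencies so an atmosphere reads as a small, mediated loudspeaker rather than a real space - can be approximated with a first-order highpass filter. The following is a minimal sketch in Python of a generic one-pole difference equation, not the episode’s actual processing chain:

```python
def highpass(samples, alpha=0.95):
    """First-order highpass: passes rapid changes, blocks the steady
    low-frequency 'body' of a signal. alpha (0..1) sets the cutoff;
    values closer to 1 give a lower cutoff.
    Difference equation: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    out = []
    y_prev, x_prev = 0.0, 0.0
    for x in samples:
        y = alpha * (y_prev + x - x_prev)
        out.append(y)
        y_prev, x_prev = y, x
    return out

# A sustained (DC) input decays toward zero: the weight of the sound
# is stripped away, leaving only the onset transient - the 'tinny'
# quality associated with broadcast actuality.
thinned = highpass([1.0] * 8)
```

Applied to a full-range traffic recording, the same operation removes the low rumble that would otherwise cue a large physical space.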



A more elaborate but equally dynamic interplay between space and image occurs in Mike Figgis’ film Timecode (2000). While the screen is divided into four quadrants, each carrying an ongoing part of the narrative shot on four cameras in real time, the single unifying factor of the film is the soundtrack. From the outset, the gaze of the audience is trained by the multichannel audio tracks. Dialogue, breaking the vococentric convention, starts without an accompanying synchronous image in a surround channel but snaps to the right front speaker as the image of the actor appears in the top right quadrant, matching the visual space with the sound source. In the nuanced drama that unfolds, Figgis uses movements of sound to direct the audience’s attention between quadrants, essentially using sound in a kind of quasi-editing process, creating tensions in the audience as the screen actions at times elicit a desire to eavesdrop on conversations unheard at the expense of those present in the mix. Three earthquakes ‘realistically’ rumbling from the LFE channel structurally unify the action on the four screens whilst viscerally connecting the narrative space with the viewer.



In each of these examples what becomes clear is that, in modern cinema, a dynamically constructed mix of elements - not only representing the space but amplifying character, playing a part in the narrative flow of events, and indeed directing our gaze - is of far more importance than recreating a simple ambience recorded in synchronisation with the action. Indeed, such ambiences are seldom used. Instead what we experience is a dynamic interplay of constructed spaces.



Even Saving Private Ryan, which in the press made much of its authenticity, may have used real sound sources in the main, but the sound design was still a constructed world moving between selected, focussed points of view, marked as much by the omission of sounds as by their inclusion. In the selective dialogue of the landing boats, in the underwater filtering, in Hanks’ shell-shocked tinnitus, in the LFE rumble of the approaching tanks, the effect is one of profound evocation but not reproduction. The spaces are multiple, constructed and intermixed, and ultimately cohesion, consistency and credibility are far more important than realistic reproduction.



This dynamic audiovisual mise en scène uses, in the main, the same technologies as pop and rock music, yet it draws on a different set of conventions and grammars, principally because every sound is framed with respect to image, and vice versa, in a symbiotic relationship.



In terms of multichannel production and composition of popular music, which has been pursued at least since Pink Floyd’s Dark Side of the Moon in 1973, aesthetic approaches tend to be idiosyncratic, while discussions are often distilled into simple issues of geometries and channels that can be summarised as five basic choices:



a) A stereo sound stage with its phantom centre, planes and horizon;


b) The cinema surround stage with its ‘vococentric’ centre channel, ambient surrounds and special effects LFE;


c) The extended stereo model with essentially a stage-to-audience perspective;


d) The ‘in the band’ music mix, with an immersive, often quadraphonic approach;


e) True spatial/multichannel experimentation (from pyrotechnic panning to variable soundstages).



In addition one could include the audiophile approaches of ‘realist’ techniques such as Ambisonics, but these have in the main been used for orchestral and jazz recordings.
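The ‘phantom centre’ of choice (a) arises from amplitude panning: a mono source fed equally to both speakers appears to sit between them. A minimal sketch of a constant-power pan law follows; the sine/cosine law shown is one common convention, not a prescription from any particular console or text:

```python
import math

def equal_power_pan(sample, pan):
    """Pan a mono sample across a stereo pair.

    pan runs from 0.0 (hard left) to 1.0 (hard right); 0.5 places the
    source at the phantom centre. A sine/cosine law keeps the total
    acoustic power (L^2 + R^2) constant as the source moves, so the
    perceived loudness does not dip mid-stage.
    """
    angle = pan * math.pi / 2          # map 0..1 onto 0..pi/2
    left = sample * math.cos(angle)
    right = sample * math.sin(angle)
    return left, right

# At the phantom centre each channel carries ~0.707 of the signal,
# matching the power of a hard-panned source.
l, r = equal_power_pan(1.0, 0.5)
```

Automating the pan value over time yields the ‘instruments moved around in space’ effect discussed below, while remaining confined to the stereo front plane.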



Often the aesthetic is as simple as a basic two-channel front plane mix, with instruments and effects moved around in space to achieve a sense of dynamic excitement. In other words, while there may be a dynamic spatial mise en scène, it tends to operate within a stereo, or at best extended stereo, plane, as opposed to a multichannel aesthetic. It is hard to find an example where the spatial shifts are as dramatic as those mentioned from cinema, where the sound transitions between mono, stereo, surround fields and other multichannel configurations in a structured way.



Perhaps because the market for the majority of music recordings is still primarily in a stereo format, and because music tends to be re-mixed for surround sound formats, if at all, usually with accompanying video (concert footage or video clips), the situation in some ways resembles the moment in the 1960’s when The Beatles first encountered stereo: a spatial aesthetic had yet to be fully appreciated and formulated, most certainly at the level of ‘making a record’, that is, at the compositional and production stage.



Thus far there are a few rare exceptions. Frank Zappa’s Quaudiophiliac is one, although hardly mainstream and quite eccentric in its approach to space, while the Flaming Lips’ Zaireeka is another project that comes to mind; then again, the rich underlying and dynamic multichannel aesthetic that one finds in film sound design still appears to be absent. Moving away from work produced primarily for an audio-only multichannel experience, works such as Daniel Lanois’ 5.1 remixes of Peter Gabriel’s Play The Videos tend toward a film aesthetic yet have a sense of artificiality about them, and a lack of coherence, as a consequence of not being originally and specifically designed, either visually or aurally, for a surround sound replay medium.


13  The Future


The technologies for recording and post-producing spatial audio and audiovisual work have become far more economically accessible and have begun to permeate audio, audiovisual and music practice widely. For example, the growing availability of impulse response based reverbs makes it possible not only to replicate the acoustics of a wide range of spaces, including electronic ones, and to switch dynamically between them, but also to emulate prior recording practices previously unfeasible or impracticable. For example, when Jimi Hendrix recorded ‘The Wind Cries Mary,’ the recording engineer, finding the guitar level was distorting, moved the microphone to the far side of the studio instead of turning down the guitar amplifier, resulting in the signature sound we know today. Using an impulse response of the studio, one can virtually replicate that room sound exactly and use that spatial code in an entirely new context.
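An impulse response based (‘convolution’) reverb works by convolving the dry signal with a recording of a space’s response to a single impulse, so that every echo and reflection of the room is stamped onto the new material. The toy illustration below uses direct-form convolution for clarity; production tools use FFT-based methods for speed:

```python
def convolve(dry, impulse_response):
    """Convolve a dry signal with a room's impulse response (IR).

    Each IR sample represents a delayed, scaled echo of the input, so
    the output at time n sums every input sample weighted by the echo
    pattern. The output is len(dry) + len(IR) - 1 samples long: the
    extra samples are the reverb tail ringing past the dry signal.
    """
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

# A single click through a toy IR (direct sound plus two echoes)
# reproduces the IR itself - which is exactly why recording a clap or
# sine sweep in a room captures its 'spatial code' for reuse.
wet = convolve([1.0], [1.0, 0.0, 0.5, 0.25])
```

Swapping the impulse response swaps the room: the same dry take can be placed in a concert hall, a tiled bathroom, or a vintage studio.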



Looking beyond the legacy of the twentieth century, much discussion of future spatial audio practice has centred on the technical means of transmission and reproduction, such as Holman’s 10.2 and NHK’s 22.2 systems. Yet if the past has given us one strong indication of future trends, it is that change will be driven as much by the consumer’s desire for content as by the technology itself. To that end, the creative use of multiple LFEs and spotlight speakers, the production of immersive environments for a range of contexts, and surrounding screens such as those used in i-cinema may prove of interest not only in new forms of cinema but also in principally audio-only production contexts.



14  Conclusions



As more artists move toward remixing or creating their material in surround sound, it is perhaps worth asking: will the dominant spatial arrangements in mixing evolve or have the essential alternatives already in practice been established?



Sound designers and music mixers tend by necessity to be pragmatic (‘whatever it takes’), relying in the main on tried and true formulae for success (swayed by commercial as well as aesthetic concerns) - yet scratch the surface and one usually finds some deep and serious thinking about the motives behind their technical decisions.



What at first glance would appear to be in practice a fairly straightforward choice of recording and mixing strategies is, when one looks more closely, a complex language with a long history that even touches on ontological issues regarding the representation and evocation of reality in art.



The advent of audio has been accompanied by a long history of promises and appeals to notions of ‘realism’.


However, as we have seen, an evocation of space is not simply a question of reproducing or creating physiological cues and ‘real world’ spatial geometries.



While some of the means of creating an apprehension of space are physical, such as speaker placement, panning, reverb, delay and echo, others are associative, involving symbols and metaphors. As opposed to music production, with sound for picture this process becomes more complex, not least because there can be asynchronous and/or contrapuntal relationships between sound and image spaces, each modulating the other.



Moreover, in both modes there can be the employment of proxemic or hyper-real perspectives, particularly with voice: the ‘in your face’/‘in your head’/‘radiophonic’/‘telephonic’ or ‘filmic’ voiceover or vocal line. Intimate radio voices are often employed both in sound for picture and in music recording; the spatial zones they occupy are otherwise usually reserved for our closest, most intimate friends.



The underlying basis for Hollywood scriptwriting is still Aristotle’s classical narrative model, which is about transcending the self by immersing our attention, not in realism but in escapism: being ‘detached’ and ‘out there’, identifying with the ‘other’ and experiencing catharsis. This still dictates much of entertainment production and shapes the way we approach the construction of narrative and/or musical space.



Yet the process is two-way: there is an interactive element even in the most passive of modes. The spectator contributes, interfaces and refashions the message according to their own unique model of the world.


In sensory incompleteness, they fill in the gaps; in sensory overload, they omit or select details to the same end, and so detach from other elements of the content.



As such, the representation of space is as much about evocation as it is about imitation, for even the best available facsimile can only result in a subjective version of the truth.



Finally, in both modes there is of course the site where Walter Murch asserts the highest-fidelity sound resides: in silence, where the imagination creates the perfect space in the mind of the listener, between the ears.



Michael Bates November 2007.



This paper is particularly indebted to Peter Doyle’s Echo and Reverb: Fabricating Space in Popular Music Recording 1900–1960, cited in the references below, for its invaluably rich and detailed analysis of early popular music recording.




References


Alten, Stanley R.  Audio in Media 3rd Edition Wadsworth Publishing 1981


Altman, Rick (ed) Sound Theory/Sound Practice New York Routledge 1992


Altman, Rick ‘The Sound of Sound: A Brief History of the Reproduction of Sound in Movie Theaters’ Cineaste, Vol. 21, 1995

Audio, Volumes 38–42, 1954–58. A magazine published by the AES in the 1950s and 60s.

See: , Jim ‘Applying Aural Research: The Aesthetics Of 5.1 Surround’ Undated


Belton, John  ‘1950’s Magnetic Sound: The Frozen Revolution’ in Sound, Theory Practice, ed. Rick Altman New York Routledge 1992


Bergen, Ronald The Coen Brothers Phoenix 2005.


Boon, Marcus ‘The Eternal Drone’ in Undercurrents Rob Young (ed) Wire London 2003.


Carr, Robert ‘The Production and Musical Perspectives of Humberto Gatica’ Recording Engineer/Producer Oct 1981 quoted in Alten, Stanley R.  Audio in Media 3rd Edition Wadsworth Publishing 1981


Chrysakis, Thanos ‘Leonardo Music Journal Artists Statements’ Leonardo Music Journal, Vol. 16, pp. 40–45, 2006


Chion, Michel Audio-Vision. Sound On Screen. Columbia University Press 1994


Clarke, Michael ‘Extending Contacts: The Concept of Unity in Computer Music’ Perspectives of New Music, Vol. 36, No. 1, 1998. Princeton: Princeton University Press


Classic Albums - Motorhead: Ace of Spades (2005) DVD Eagle Vision USA


Doyle, Peter Echo and Reverb: Fabricating Space in Popular Music Recording 1900-1960 Wesleyan University Press 2005


Eisenstein S. M., Pudovkin V. I., and Alexandrov G. V.  ‘A Statement’ First published in Zhizn Iskusstva on August 5, 1928.


Eno, Brian  ‘Ambient Music’ in Christoph Cox & Daniel Warner Audio Culture. Readings in Modern Music. New York 2004.


Everyday Science & Mechanics April 1934.


Esslin, Martin Mediations. Essays on Brecht, Beckett and the Media London 1962.


Flint, Tom “Should I always pan my vocals to the centre of the mix?” in Sound on Sound Magazine Feb 2005.


Freire, Sérgio “Early Musical Impressions from Both Sides of the Loudspeaker” in Leonardo Music Journal, Vol. 13, pp. 67–71, 2003


Gerzon, Michael “A year of surround-sound.” Hi-Fi News, August 1971


Giannetti, Louis Understanding Movies 3rd edition Prentice-Hall 1972.


Goodall, Howard Big Bangs: The Story of Five Discoveries that Changed Musical History. Vintage 2001


Hodgson, Jay ‘Outline for a Theory of Recording Practice With Reference to the Mix for Pink Floyd’s Speak To Me’ (1973) JARP Vol. 1(i) Feb 2007 Source:


Hoffman, Steve


Holman, Tomlinson 5.1 Surround Sound Up and Running Focal Press 2000.


Horton, Tim ‘The Launay Way’ Juke May 29, 1993


Jarrett, Michael “Sound Doctrine: An Interview with Walter Murch” Film Quarterly, Vol. 53, No. 3 (Spring, 2000), pp. 2-11


Jarrett, Michael “Stretching Sound to Help the Mind See” New York Times October 1, 2000


Jones, Stuart ‘space-dis-place: How Sound and Interactivity Can Reconfigure Our Apprehension of Space’ Leonardo Music Journal, Vol. 16, pp. 20–27, 2006


Levitin, D. J. (2006).  This Is Your Brain On Music: The Science of a Human Obsession.  New York: Dutton/Penguin.


Lewisohn, Mark The Complete Beatles Recording Sessions Hamlyn 1988


LoBrutto Vincent Sound On Film: Interviews With Creators Of Film Sound Praeger 1994


Love, Mike in interview with Richard Glover ABC Radio (2BL) 13 November 2007


Maaso, Arnt  ‘The Proxemics Of The Voice: An Analytical Framework For Understanding Sound Space In Mediated Talk’


Massey, Howard Behind the Glass Backbeat Books San Francisco 2000.


Milicevic, Mladen ‘Film Sound Beyond Reality: Subjective Sound In Narrative Cinema’


Monaco, James How to Read A Film- Movies, Media, Multimedia. 3rd Edition Oxford University Press 2000


Moorefield, Virgil Dissertation Abstract p i Republished as Virgil Moorefield The Producer As Composer: Shaping The Sounds of Popular Music MIT Press 2005


Murch, Walter ‘Dense Clarity, Clear Density’


Murch, Walter In the Blink Of An Eye 2nd Edition Silman-James Press 2001


Murch, Walter ‘Touch of Silence’ Friday 17 April 1998, Institut Français, London


Ondaatje, Michael The Conversations: Walter Murch and the Art of Editing Film Borzoi Books Toronto 2002


Owsinski, Bobby The Mixing Engineer’s Handbook Mix Pro Audio Series, Vallejo, CA, 1999.


Reznikoff, Iegor ‘On Primitive Elements Of Musical Meaning’ section 2 
Journal of Music and Meaning JMM 3, Fall 2004/Winter 2005,


Rumsey, Francis, 2001, Spatial Audio, Boston: Focal Press


Scientific American December 22, 1877.


Smalley, Denis ‘Spectromorphology: Explaining Sound-shapes’ Organised Sound 2/2, 1997


Snyder, Ross H  ‘History and Development of Stereophonic Sound Recording’ in Journal of AES April 1953


‘Advanced Gating Techniques: Part 2’ Sound on Sound May 2001


Swedien, Bruce quoted in Universal Audio History at


Weis, Elisabeth and John Belton (eds) Film Sound: Theory and Practice New York: Columbia University Press 1985.




Yu, Emily “Sounds of cinema: what do we really hear?” Journal of Popular Film and Television 22 July 2003


Zappa, Frank Liner notes to The Mothers of Invention Uncle Meat 1969 Reprise Records


Discography

Beach Boys ‘Good Vibrations’ Capitol 1966 (also on Smiley Smile, 1967)


Beatles, The ‘Tomorrow Never Knows’ Revolver EMI 1966


Beatles, The Abbey Road EMI 1969


Beatles, The Sgt. Pepper’s Lonely Hearts Club Band EMI 1967


Bowie, David “Heroes” on Heroes RCA Records 1977


Collins Phil ‘In The Air Tonight’ on Face Value Virgin Records 1981


Flaming Lips Zaireeka Warner Bros 1997


Gabriel, Peter  ‘Intruder’ on Peter Gabriel III 1980 Virgin records


Hendrix, Jimi ‘The Wind Cries Mary’ Single MCA Records 1967.


Jerry Murad and the Harmonicats ‘Peg o’ My Heart’ Vitacoustic Records 1947


Neville Brothers Yellow Moon A&M Records 1989


Pan American Quiet City Kranky 2004


Pink Floyd Dark Side of the Moon Harvest 1973


Presley, Elvis ‘Heartbreak Hotel’ RCA Records 1956


Small Faces ‘Itchycoo Park’ Single Immediate 1967


Stockhausen, Karlheinz Kontakte Wergo 1993


Stockhausen, Karlheinz Helicopter Quartet (1992–93) Auvidis 2000


Toop, David 37th Floor At Sunset Sub Rosa 2004


Wilco Yankee Hotel Foxtrot 2001 Nonesuch Records


Zappa, Frank Quaudiophiliac Rykodisc 2004


Filmography


Coppola, Francis Ford Apocalypse Now 1979


Spielberg, Steven Saving Private Ryan 1998


Figgis, Mike Leaving Las Vegas 1995


Figgis, Mike Timecode 2000


Gabriel, Peter Play The Videos Real World 2004


Godard, Jean-Luc Masculin Féminin 1966


Groening, Matt The Simpsons ‘Treehouse of Horror VI’ (1995)


Hitchcock, Alfred Psycho (1960)


Klimov, Elem Come and See 1985


Kubrick, Stanley 2001: A Space Odyssey (1968)


Lang, Fritz M (1931)


Leone, Sergio The Good, the Bad and the Ugly (1966)


Scott, Ridley Blade Runner 1982


Lynch, David Inland Empire 2007


Welles, Orson Touch of Evil 1958


Wilcox, Fred Forbidden Planet 1956


Wise, Robert The Day The Earth Stood Still 1951


Wenders, Wim Wings of Desire 1987


Notes

[i] The bandwidth (between around 160 and 2,000 Hz) favoured male voices and brass instruments.

[ii] It is interesting to note that there was a commercial component to this as well. Steve Hoffman writes that ‘when electric recording came in (1925), some record companies like Columbia and Victor, recorded in an ambient environment (churches, meeting halls, etc.) but, when Jukeboxes came in, the Jukebox operators demanded that the record companies deaden their sound. The metallic sound of the Jukeboxes made the records sound too thin. So, the record companies (hurting from the depression) did just that, just in time for the swing era.’ (Hoffman)

[iii] Achieved not by chamber reverb but by positioning instruments further off-mike in the room.

[iv] There are on-line references stating that ‘the original echo chamber at EMI's Abbey Road Studios in London was one of the first in the world to be specially built for recording purposes, when the studio was established in 1931’, but correspondence with the studios has, as of this writing, not revealed any further details as to its use. See

[v] A sound generation system patented by Thaddeus Cahill- designed to be played over telephone lines. It weighed over 200 tonnes.

[vi] First used in 1980 on Peter Gabriel’s track ‘Intruder’ and then on Phil Collins’s 1981 hit single ‘In the Air Tonight’, using the compressed talkback mike in the drum studio of the Townhouse.

[vii] Early synchronised sound for picture was delivered on several different mechanisms - either on a separate disk (sound on disc) e.g. Vitaphone or via an optical track on the filmstrip (sound on film).

[viii] The first film recorded in stereo was Love Finds Andy Hardy (1938), produced by MGM but released in mono.

[ix] Writing in 1971, Michael Gerzon discusses the idea of the extended stereo system as ‘spatial stereo’. See Gerzon, Michael “A Year of Surround-Sound” Hi-Fi News, August 1971.