Ron's right arm: tactility, visualization, and the synesthesia of audio engineering
Dr. Eliot Bates
University of Maryland, College Park
Abstract
Most scholarship on audio engineering
analyzes practices and practitioners in terms of musical and technical
knowledges. The few references to sensory perception typically center on
critical listening practices (“golden ears” engineers), audiophilia, and
technologies of audition. However, particularly in light of computer-based
workflows, the practice of audio engineering features carefully developed
synesthesias of critical listening, visualization of digital audio, and tactile
manipulations of interfaces, which can’t adequately be explained as cognitive
processes or as conscious knowledge.
I draw on literature in the emerging field of
sensory scholarship, in particular Brian Massumi’s theorization of synesthesia
and affect and Charles Hirschkind’s analyses of cultivated “sensoriums” in order
to show how practices of audio engineering can be productively theorized as a
strategic retraining of the senses. I draw diverse examples from field research
conducted in the US and Turkey. One example – Ron’s right arm – explores how
one audio engineer uses his right arm to “feel” when the bass is right in a
rock mix. Another example explores the creation of “büyük ses” (big sound) in
Anatolian “ethnic” music and the use of the ProTools edit window to “visualize”
bass. In both cases, bass is something that is felt or seen, but not
immediately audible. Through an attention to differing kinds of synesthesias,
we can better understand how audio engineers perform their craft.
1 Introduction
To the casual
observer it might look like Ron’s entire body is being supported by the
SoundWorkshop mixing board. Much of his body weight leans into the console,
which is coupled to his body through his right arm. Head faced down, he
concentrates deeply on the funk/jazz/dub mix in progress. Earlier in the
session, hours had been spent aligning and biasing the Otari MTR90 24-track
tape machine; selecting and tweaking the outboard EQs, compressors, and
reverbs; and getting the mix “in the ballpark.” However, “in the ballpark” won’t
do. Ron’s concentrating on the bass. In particular, he’s concentrating on his
right arm, for Ron knows his control room, he knows his monitors, he knows that
the bass is “in the pocket” when it feels a certain way in his right forearm.
The feeling is inaudible, or at least is subtle enough that ears alone cannot
be trusted to perceive it.
For Ron,
mixing is not just a process of aural listening. Mixes have a feel, and by that
I do not mean feeling as a metaphoric quality. Feel is tactile, feel can even
be visually mapped, particularly now that Ron’s mixing workflow is oriented
around a Samplitude Digital Audio Workstation. The right arm is still an
invaluable sensing organ, but it couples not just with the analog console but with
a synesthesia of visual and auditory practices.
I must
confess, my paper’s subtitle may be a bit misleading: “Tactility,
Visualization, and the Synesthesia of Audio Engineering.” More precisely, I am
not interested in replacing one universalizing discourse about audio
engineering practice with another universalizing discourse about the same.
Instead, I suggest that we, as scholars and engineers, should be
interested not only in synesthesia, but in differing,
historically, culturally, socially, and politically situated synesthesias, plural. Ron, the engineer with whom I begin my analysis, has
developed a unique and particular mode for audio engineering which differs from
those of other engineers, for example tonmeisters in Istanbul who also are concerned with bass but use different
conceptual, bodily, and sensory techniques for achieving their aims.
Despite the
differences, what appears to be generalizable across contexts is that audio
engineering practices are not reducible to one sense alone. Every widespread
form of engineering developed until today has depended on the body for the
manipulation of interfaces and on audition through headphone or loudspeaker
systems. All computer-based audio engineering technologies depend upon
the visualization of abstractions of sound and also a visualization of the
interface for manipulating sound. However, scholarship on audio engineering has
ignored the sensing body for the most part, focusing primarily on the products
of audio engineering (i.e., commercially released recordings), on
engineering-specific knowledge sets, and on engineering as an art form. My
presentation today is a small subset of a much larger project concerning
sensory practices and recording occupations. As such, my findings are tentative
and intended more to be provocative and exploratory than to stand as a finished
research project.
2 Theorizing the Senses
I argue that in order to understand the
production of affect, or perhaps the affect of production, we need to pay
attention to bodies, to the senses, to the practices of audio engineering and
musicianship. I draw on some seemingly unlikely sources for inspiration,
including Brian Massumi’s theorization of the relation between affect,
synesthesia, and virtuality; and Charles Hirschkind’s analysis of bodily
practices and cultivated “sensorium.”
“For the present is lost with the missing
half second, passing too quickly to be perceived, too quickly, actually, to
have happened” (Massumi 2002: 30).
Massumi, fusing the experimental
psychological research of Benjamin Libet with the philosophical writings of
Spinoza and Henri Bergson, is acutely interested in unpacking the relation
between so-called “free will” and functions of “higher” consciousness on the one hand, and
autonomic bodily reactions that occur in the brain but are outside of cognitive
processes per se on the other. Among the examples he considers is the mystery of the missing
half-second, which in brief is the strange phenomenon of the half-second lag
time between a non-anticipated sensory stimulus and the brain’s cognition of
and interpretation of its happening, which is then “back-dated” in time so that
the cognition and the stimulus seem to have been simultaneous. The most obvious example
of this is if you are suddenly burned by scalding water – you feel no pain for
at least one-half second, even though all of your pain receptors have fired, as
the sensation has yet to be interpreted as pain by your brain.
Why should we care about the missing
half-second? Audio engineering, like many musical-technical practices, involves
matters of extremely quick timing, of motor movements that appear to happen at will and to result
from conscious volition. However, it is impossible for actions of this rapidity
to be cognitive or entirely conscious. Therefore, an understanding of these
temporal micropractices (or perhaps, microtemporal practices) requires an
understanding of the training the body goes through in order to perform audio
engineering tasks. Extending this, I argue that audio engineering is
necessarily conceived not as a mono-sensory set of motor skills, but rather as
a particular kind of synesthesia. Returning to Massumi:
“Affect is synesthetic, implying a
participation of the senses in each other: the measure of a living thing’s
potential interactions is its ability to transform the effects of one sensory
mode into those of another. Affects are virtual synesthetic perspectives
anchored in the actually existing, particular things that embody them” (Massumi
2002: 35).
I am also influenced by the work of
Charles Hirschkind, who analyzes the practices of specific forms of Islam in
relation to what he terms a particular sensorium.
His work concerns the practice of “ethical listening” present in
the contemporary Egyptian da’wa movement, where
religious activity works towards cultivating a proper relation between hearing,
the heart, and ethical practice:
“I approach the question of the
sensorium... from the perspective of a cultural practice through which the
perceptual capacities of the subject are honed and, thus, through which the
world those capacities inhabit is brought into being, rendered perceptible”
(Hirschkind 2001: 624).
The idea of differing, cultivated
sensoriums allows us to understand the role of long-term repetitive practices
(in the case of Islam, ritualized weeping and “ethical hearing”; in the case of
engineering, special modes of listening and certain kinds of
engineering-specific tactility and vision) in creating particular modes of
being in the world. However, it would be a mistake to suddenly declare the
existence of a singular audio engineering sensorium. Rather, audio engineering
in different cultural, social, historical, architectural, and technological
contexts has come to depend on particular sensoriums. I will sketch out aspects
of a few contrasting examples.
3 Feeling and Timing
Let us consider the matter of audio
engineering knowledge, through a couple of examples of practices which show that
an analysis of knowledge alone is not enough to explain how it is that audio
engineering is performed. Returning to Ron’s right arm, his perceptual
practices, on an ongoing, real-time basis, inform his choice of EQ,
compression, and subharmonic synthesis strategies for addressing the sound of
bass in a mix. Although he has a few “default” EQ or compression settings that
might be a starting point for further tweaking, ultimately every song, every
different electric or acoustic bass instrument produces sequences of timbres
that impart unique challenges. There is no visual representation of the bass
recording that assists Ron in the process of mixing, and no formula or objective
measurement which adequately conceptualizes the field of practice. In other
words, it is not conceptualized or abstracted knowledge about bass that informs
his particular working style, but rather the simultaneous interaction between a
synesthetic sensory disposition towards bass sound and a loosely-structured
repertoire of signal processing techniques. Clients who come to Ron’s studio
often say he “knows” how to mix bass. They’re not totally right: Ron feels how to mix bass.
Ron’s right arm draws our attention to
the body as a perceiving organ, but other engineering practices are more apt
for demonstrating that missing half-second in action. Let’s consider the once
widespread practice of riding the fader during live vocal tracking. We could be
tempted to say that the engineer “knows” what the vocalist is going to sing and
anticipates the singer’s every move, but this misses the crucial detail that
the timing window is so narrow that it is physically impossible, due to that
half-second rule, for the engineer to have conceptualized every fader move they
make. An engineer who is comfortable with the technique of fader riding – and
this is a technique which requires a lot of practice – harnesses certain kinds
of knowledge, but ultimately what they do entails a semi-autonomic motor
process whereby their fader moves correlate near-immediately with raw sensory
data, including auditory stimulus (the sound of the singer) and visual
information (the sight of the singer). In short, fader riding is thoroughly
non-cognitive. Knowledge of fader-riding, largely a collection of reflections
on prior successful and unsuccessful attempts, does help shape a kind of
special disposition of the engineer. We could even say that, in part, knowledge
preforms the practice. But that knowledge alone
cannot explain the immediacy of the practice, let alone the sensorium of the
fader-riding engineer.
4 The production of büyük ses:[i] percussion editing in Istanbul
I conducted over two years of intensive
field research in numerous recording studios in Istanbul. Two of the things
that stood out to me included the incredible speed with which every part of the
recording process unfolded, and the considerable manual dexterity of everyone
involved, most notably the studio musicians and the engineers. To put it into
perspective, it was not uncommon for an entire album’s worth of 36- to 60-track
mixes to be conceived, arranged, tracked, edited, mixed, and mastered within a
five-day period. No prefabricated samples were used and only a single
musician was tracked at a time; tracking consisted primarily of acoustic studio
musician overdubs and doubles. This work unfolded in ProTools HD facilities
with a minimum of outboard gear, which were installed in fabric-covered
concrete rooms that were not originally designed for music tracking. Due to the
extreme nonlinearity of the rooms, and the presence of many null points and
standing waves, bass frequencies presented the most significant problem – how
to track them, how to fit them into a mix, how to perceive their balance within
a mix, how to engineer them.
The first point about Istanbul recording
production and the issues of tactility and synesthesia concerns the general
sense among engineers that the sound coming from speakers was misleading and
unlikely to translate from one room to another. Consequently, the visualization
of sample data in the ProTools editing window was used to garner critical
information about the sound of the mix. One interesting linguistic turn
epitomizes a common interchange between the arranger and engineer: bu miks
nasıl görünüyor? – bu miks bitmiş gibi görünüyor.
“How does this mix look? – It looks like it could be finished.”[ii]
In this case, the expression is not metaphoric, but rather literal, as both
parties stare intently at the LCD monitors, appraising the visual impression of
the waveform representations of the two-track mixdown as shown on the screen.
The visualization corresponds, perhaps, to an auditory image of what the mix
could sound like on an imaginary sound system in an imaginary studio, albeit
one that the arranger and engineer will never have access to.
A related issue concerns percussion
arrangement. The predominant arrangement aesthetic consists of numerous layered
percussion instruments of local and foreign origin, creating a complex,
polyrhythmic texture coalescing on a small number of strong accents. With the
multitude of instruments available today, and the seemingly limitless track
count afforded by ProTools HD, this has led to arrangements with ten or more
unique instruments playing simultaneously, many with significant energy in the
low bass frequency range. In analyzing one innovative percussion arrangement of
the song “Gülçini,” a traditional 7/8 horon
dance piece from the Eastern Black Sea of Turkey, it is instructive to see how
the primary four-bar groove was constructed (Figure 1). First, the basic askı-davul part was tracked. This part has the closest relation to what could
be considered an authentically local (asıl, yerli) performance of the rhythm for a kemençe horon ensemble and dance context.[iii]
Amongst the important musical features, this part has the correct pattern of
accents, the correct relative dynamic contrast between accented and unaccented
beats, and an appropriate groove (which can be defined as a pattern of
expressive microtimings, of events that don’t correspond with a metronomic
division of the bar).
Next, the two cajon drum rhythms were
overdubbed, with percussionist Soner Akalın listening to a mix of metronome and
the askı-davul part. These parts added
low-frequency energy on the downbeats of every measure, and on some of the third and
fifth beat accents as well. Following this, the udu, frame drum, and tambourine
parts were tracked separately to the mix of askı-davul and the two cajon drums. At this point, the seven-part percussion
arrangement had become quite a strong dance rhythm, with four drums providing
bass frequency components. As if this seven-part arrangement were not enough,
the song arrangers (Aytekin Gazi Ataş and Soner Akalın) decided that there was
a need for additional sounds more explicitly evocative of dance, so Aytekin and
Soner went into the tracking room, laid down a large plywood box, and
overdubbed themselves stomping on it. Four stereo tracks of that were created,
as well as three stereo tracks featuring aspirated inhales and exhales on
strategic downbeats. A staff-notated rendition of the complete percussion
arrangement can be seen in Figure 2.
Figure 3 shows the ProTools visualization
of just one measure, in particular the six parts that compete for space in the
bass frequency range. Among these parts, the event attacks on the strong beats
of the measure (one and five) appear to have been performed at different
moments in time, ranging from 30 ms prior to the bar line (the first cajon) to
5 ms after the bar line (the second cajon). While at first it may seem that
this is due to some inaccuracy on the part of the percussion performances, or a
lack of attention to quantizing the beats, this could not be further from the
case. The audible effect of this arrangement does not convey the sense of
multiple nonaligned attacks. Due to the nature of auditory event perception,
delays of less than 60 ms between events of similar timbres are typically not
perceived as separate entities but rather as acoustic reflections of a single
entity (Chowning 1999). Thus, the up-to-35 ms deviation in event attacks
produces an effect which sounds like one very long, evolving, and timbrally
complex bass drum sound.
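The arithmetic behind this claim can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two cajon offsets and the 60 ms figure come from the text above, the other two onset values are hypothetical placeholders, and treating the total spread of attacks as the relevant "delay" is a simplification of the perceptual account.

```python
# Onset offsets in milliseconds relative to the bar line (negative = early).
# Only the two cajon values are taken from the text; the rest are hypothetical.
FUSION_THRESHOLD_MS = 60.0  # delay below which similar events tend to fuse (per the text)

onsets_ms = {
    "cajon 1": -30.0,     # from the text: 30 ms before the bar line
    "cajon 2": 5.0,       # from the text: 5 ms after the bar line
    "askı-davul": -12.0,  # hypothetical value for illustration
    "udu": -4.0,          # hypothetical value for illustration
}


def onset_spread(onsets):
    """Return the time between the earliest and the latest attack, in ms."""
    values = list(onsets.values())
    return max(values) - min(values)


def fuses(onsets, threshold_ms=FUSION_THRESHOLD_MS):
    """True if the whole cluster of attacks falls inside the fusion window."""
    return onset_spread(onsets) < threshold_ms


spread = onset_spread(onsets_ms)
print(f"spread = {spread} ms, within fusion window: {fuses(onsets_ms)}")
```

With the values above, the spread is 35 ms, which is inside the 60 ms window, so under this simplified model the staggered strokes would be expected to read as a single event.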
There are two things at play here: the
first is a novel performance practice of studio percussionists, who while
tracking deliberately offset particular attacks not in order to alter the
groove, but instead to create parts that contribute to the illusion of a
single, huge bass drum sound. Studio musicians have also developed ways of
playing certain instruments while minimizing the volume of the attack component
of each separate event. These techniques differ markedly from any traditional
performance practice used for live performance on the same instruments.
The second element concerns the practice
of engineers in using digital editing to deliberately offset bass-intensive
events, when the effect isn’t successfully produced by the studio musician. As
I noted earlier, hearing bass in the studio is a precarious operation, and
therefore this latter operation is typically done visually by engineers, with
the stereo master output of the DAW being used to measure the success. I should
point out that no compression is ever used on percussion, and rarely is any EQ
applied to the bass frequencies of sound sources. In the absence of such processing, this combination
of practices is what keeps the peak amplitudes of multiple sound sources from
combining to overload the output of the mix.
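Why staggering attacks lowers the summed peak can be shown with a toy model. The sketch below is not a measurement of the “Gülçini” session: each stroke is a hypothetical decaying 60 Hz sine burst, and the sample rate, decay time, and 35 ms offset are assumptions chosen only for illustration.

```python
import math

SAMPLE_RATE = 8000  # samples per second; arbitrary for this sketch


def stroke(onset_ms, freq_hz=60.0, decay_ms=40.0, length_ms=300.0):
    """Synthesize a hypothetical bass drum stroke as a decaying sine burst.

    onset_ms must be non-negative; the stroke is silent before its onset.
    """
    if onset_ms < 0:
        raise ValueError("onset_ms must be non-negative")
    n = int(SAMPLE_RATE * length_ms / 1000.0)
    start = int(SAMPLE_RATE * onset_ms / 1000.0)
    out = [0.0] * n
    for i in range(start, n):
        t = (i - start) / SAMPLE_RATE  # seconds since the attack
        envelope = math.exp(-t / (decay_ms / 1000.0))
        out[i] = envelope * math.sin(2 * math.pi * freq_hz * t)
    return out


def mix_peak(tracks):
    """Sum the tracks sample by sample and return the peak absolute level."""
    return max(abs(sum(samples)) for samples in zip(*tracks))


single = mix_peak([stroke(0.0)])
aligned = mix_peak([stroke(0.0), stroke(0.0)])     # two attacks on the same sample
staggered = mix_peak([stroke(0.0), stroke(35.0)])  # second attack 35 ms later

print(f"single {single:.3f}, aligned {aligned:.3f}, staggered {staggered:.3f}")
```

In this model the aligned pair peaks at exactly twice the level of a single stroke, while the staggered pair peaks lower, because the second attack arrives after the first has partly decayed. Real drum recordings are far more complex, but the same principle applies to any two transients whose maxima no longer coincide.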
I began my discussion of Turkish
recording workflows by noting the speed of the recording process, as
well as a mistrust of, or perhaps nondependence on, studio listening. My
observations in Turkey indicate that specially cultivated visual practices – seeing when the mix is done – as well as precise motor control in studio
musicianship and mouse-keyboard based digital audio editing, form an integral
part of the synesthesias and sensoriums of recording professionals in Turkey.
Of course, listening does play a part in the creation of this music. However,
what is being listened for, and how studio-situated listening relates in
real-time to other sensory practices, and to what perhaps we might call a sense
of imaginary acoustic ideation, has a local specificity.
5 Conclusion
Steven Connor has written extensively on
the mistrust and fear of tactility and the sensing body that pervades Western
historical and ethnographic writing (Connor 2000, 2004). Such feelings run
deep, and have had pronounced effects both on the techniques and the subject
matters of academic scholarship, although this division between knowledge and
practice has not always been the norm. Inside the U.S. studio milieu, abstract
knowledge is highly prized as a mark of professionalism. However, the
discourses in and surrounding the studio sometimes blur an important reality:
that studio work is ultimately a practice, is a craft, is something that not
only requires touch and the sensing body but is first and foremost a tactile
art. To understand the practices of audio engineering is to understand the
synesthesias and sensoriums of audio engineering in different milieus.
Some scholars have written about “golden
ears” engineers, those who have amazing listening skills that allegedly surpass
those of mortal humans. I ask, what about the engineers with golden eyes and
golden forearms?
Acknowledgments
My research was logistically made
possible by a State Department Fellowship from ARIT (American Research
Institute in Turkey) and a Fulbright IIE grant. I wish to offer particular
thanks to Aytekin Gazi Ataş, Ömer Avcı, Benjamin Brinner, Ladi Dell’aira,
Jocelyne Guilbault, Charles Hirschkind, Ayşenur Kolivar, Urum Ulaş Özdemir, and
Paul Théberge, who provided invaluable comments on earlier versions of this
paper.
[i] Büyük ses literally means “big sound,” and is an
aesthetic feature of mixes that developed in the 1990s and became a standard
mix paradigm by the early 21st century. Although related to Western
concepts of loudness, the büyük ses aesthetic has indigenous origins, yet no known
precedents in Anatolian traditional performance practices. See Bates (2010) for a more extensive analysis
of büyük ses.
[ii] The producer, in
Turkey, is typically absent from the recording workflow and functions largely as
the financier of a project. The aranjor (arranger) refers to an individual who orchestrates a
piece and manages the workflow of the recording sessions, yet unlike Western
producers lacks the same degree of “creative liberty.” See Bates (2008) for more on Turkish studio
work.
[iii] The kemençe horon ensemble consists at a minimum of a solo singer playing
a kemençe (three-stringed box fiddle from the Eastern Black Sea region). More
commonly, the kemençeci (kemençe player)/singer is joined by a chorus of voices
provided by the horon (line) dancers. In the city of Trabzon and its surrounding
area, a single askı-davul drum may accompany the kemençe and singing. See Picken
(1975) for a historical account of the kemençe.
References
Bates, Eliot. 2008. Social interactions, musical arrangement, and the production of digital audio in Istanbul recording studios. Ph.D. Dissertation. University of California, Berkeley.
Chowning, John. 1999. Perceptual fusion and auditory perspective. In Music, cognition, and computerized sound: an introduction to psychoacoustics, edited by P.R. Cook. Cambridge, MA. MIT Press.
Connor, Steven. 2000. Dumbstruck: a cultural history of ventriloquism. Oxford. Oxford University Press.
Connor, Steven. 2004. The book of skin. Ithaca. Cornell University Press.
Hirschkind, Charles. 2001. The ethics of listening: cassette-sermon audition in contemporary Egypt. American Ethnologist 28(3): 623–649.
Iyer, Vijay. 2002. Embodied mind, situated cognition, and expressive microtiming in African-American music. Music Perception 19(3): 387–414.
Massumi, Brian. 2002. Parables for the virtual: movement, affect, sensation. Durham. Duke University Press.
Picken, Laurence Ernest Rowland. 1975. Folk musical instruments of Turkey. London. Oxford University Press.
Discography
Kabaosmanoğlu, Yaşar. 2006. ‘Gülçini’ on Rakani. Metropol Müzik Üretim.

Figure 1: "Gülçini" basic askı-davul rhythm

Figure 2: "Gülçini" nine-part percussion arrangement

Figure 3: ProTools visualization of the first measure of “Gülçini,” showing the staggered timings of low frequency percussion strokes.