Evening Will Come: A Monthly Journal of Poetics (Issue 29, May 2013)

Joshua Liebowitz
Musical Drift: Toward a Method of Sonopoetics

I wish you might live till there is nothing more to be said in music.

— Wolfgang Amadeus Mozart, 1777


In May of 2012 I began working in collaboration with Rodrigo Toscano to sculpt sound for his body-movement poem, Spine (Collapsible Poetics Theater, 2008). The question we asked ourselves was: in what kind of space could sound and poetics be unified? Since the piece is meant to be performed in a theatrical space, we decided that instead of having the performers speak or mouth the text, the sound should be mapped to their voices and to their body movements. What we didn’t expect was that in designing the piece this way, we’d set course beyond any frame of reference. We thought, then, that the process and what we found along the way should be documented. What follows is a declaration, and it is also a field guide.


For too long Sound has struggled under Music: the rigid nostalgia of western harmony and free trade exoticism found in the university; the evocative self-absorption of entertainment parading as ironic yet sensitive. It is time to free sound from music.

We can free sound from music by poetics.


There is no shortage of chatter on how much Music and Poetry have in common. We hear there is as much a syntax to music as there is to poetry; that poetry and music share rhythm, pitch, dynamics, tempo – timbre, even. What we don’t hear enough about, though, are the physical properties that give form to both in the first place. This is because the classical characteristics by which we understand poetics and so categorize Music and Poetry just don’t work when we start talking about durations, positions and tensions of speech units and space, which are the elements of poetics itself.

Sound does not ignore these elements. Sound is these elements.


At its most fundamental level, sound is a perceived event resulting from interactions between frequencies and amplitudes, displaced air molecules, and the mechanisms in our ears. In more everyday terms, what we call sound begins with an onset attack or stress that repeats and varies over a duration of time, causing the medium around it to vibrate in various degrees of intensity or tension, which we hear as positions of high and low, loud and soft.

As with poetics, sound is an event consisting of duration, position and tension.


It’s good to remember that sound and poetics are invisible. When we read or listen to a poetic text, our minds process time- and space-based acoustic information that we in turn interpret as sets of available meanings. The same holds true for sound in and of itself. Whether we are able to see the source of a sound or are only able to hear it, whatever we wish to call that sound is the result of a cognitive process that ascribes meaning to the audible event. We cannot see this process. We cannot see “meaning.” What we can do, though, is take the acoustic information leading up to what we interpret as meaning out of its initial analog domain, and represent that info in the digital domain.

The elements of meaning in sound and poetics need not remain invisible.


The ability to digitally represent acoustic information as visual info is nothing new. What is a departure is the ability to visually represent that data as more than just a soundwave consisting of pitch, time and loudness. Enter Spectral Processing. This software gives us the ability to convert pitch and loudness information at the time-based, macro level of a soundwave into a series of elementary frequencies and amplitudes at the position-based, micro level. The application then reconverts this quantum information into a new soundwave or soundfile so we can hear it and see it.1 Essentially, we get to place sounds under a microscope. And thanks to the expansion of the internet, digital audio tools like spectral processors that were once available only through the university or at a very high cost are now accessible, often for free, to anyone with access to a computer and an internet connection.
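
The round trip the paragraph describes can be sketched in a few lines. This is a minimal illustration of decomposing a soundwave into elementary frequencies and amplitudes and reconverting them into a new wave, not the SoundMagic Spectral software cited below; the 440 Hz test tone, one-second duration, and sample rate are illustrative assumptions, using only numpy.

```python
import numpy as np

sr = 44100                                   # sample rate, in samples per second
t = np.arange(sr) / sr                       # one second of time points
wave = 0.5 * np.sin(2 * np.pi * 440 * t)     # macro level: a 440 Hz soundwave

spectrum = np.fft.rfft(wave)                 # micro level: elementary frequencies and amplitudes
freqs = np.fft.rfftfreq(len(wave), 1 / sr)   # the frequency, in Hz, of each component
peak = freqs[np.argmax(np.abs(spectrum))]    # the loudest elementary frequency: 440.0

resynth = np.fft.irfft(spectrum, len(wave))  # reconvert the spectrum into a new soundwave
```

The spectrum is the “microscope” view: every elementary component of the wave, each with its own amplitude, from which the original can be rebuilt.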

What does Spectral Processing have to do with poetics?


Words, just like soundwaves, are also macro structures that contain information about pitch, time and loudness. And the same holds true at the atomic level: the speech units that underlie words act just like sound elements; the only difference is, we like to call them phonemes and morphemes. Down below, a speech event is initiated by an onset attack on the part of the speaker: air driven from the lungs sets the vocal tract vibrating in various degrees of tension. This gives the speech sounds duration and loudness. Up above, it is the variations in duration and loudness that we interpret as different pitches. In turn, our ability to generate and process different pitches forms the basis of all vowels, and thus syllables. If we were worried about convention, we’d simply say we stress these syllables in as many ways as possible to play with prosody and syntax – out would pop a poetics, and our case would be closed. But we’re not yearning for the safety of some imaginary and unprovable history. Instead, consider the following about speech units in relation to sound:

A frequency is an oscillation of matter given shape and duration by an initial onset attack that determines its amplitude, or loudness // A frequency’s amplitude interacts with the medium through which it propagates, and determines its position in that space // We hear this interactive position as pitch // Morphemes and phonemes – the elementary units that characterize speech – are collections of frequencies and amplitudes //

As with sound, poetics is an event of tension, position and duration.


When we put to spectrum the digital signal of a soundwave, an algorithm acting like a camera to a specified section of time within the wave snaps a shot, and catches highly precise positions of the frequencies and amplitudes within the wave, but in an enhanced or stretched-out version of time. Inversely, the algorithm snaps a shot encasing detailed durational info, but crunches the frequency and amplitude data into a series of clicks. Irrespective of the snapshot-route we choose, the result is a signal compound carrying with it some, but not all, information respecting frequency and amplitude, and some, but not all, info concerning time. This is because spectral processing performs by an algorithm called the Fast Fourier Transform. This transform converts time into a function of frequency, and frequency into a function of time – but it cannot do so simultaneously. The interesting thing about software performing by this algorithm is that its non-simultaneous operating procedure is actually a statement of the Uncertainty Principle in quantum mechanics. Put simply, the Uncertainty Principle states that a particle’s position and momentum cannot both be known with certainty at the same moment of measurement.
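
The non-simultaneity described above can be put in rough numbers. A minimal sketch, assuming a conventional 44.1 kHz sample rate and two arbitrary window sizes: the longer the section of time the transform snapshots, the finer its frequency positions but the coarser its timing, and vice versa; the product of the two resolutions stays fixed.

```python
sr = 44100  # samples per second (a conventional, assumed rate)

resolutions = {}
for window in (4096, 256):          # a long snapshot versus a short one
    freq_res = sr / window          # Hz between adjacent frequency positions
    time_res = window / sr          # seconds of duration one snapshot spans
    resolutions[window] = (freq_res, time_res)

# The 4096-sample window resolves frequency sixteen times more finely,
# but each of its snapshots smears sixteen times as much duration together.
```

Sharpen one measurement and the other blurs by exactly as much; no window size escapes the tradeoff.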

Since we know by Spectral Processing that sound and poetics share the same spatiotemporal core, the fact that the software we used to find this out measures position and time non-simultaneously presents us with a unique problem.

At the time of processing a wave, we cannot tell whether we are measuring micro data at the macro level of pitch, time and loudness, or macro data at the micro level of frequency and amplitude. Since we cannot differentiate between this information, we are faced with the implication that because frequencies and amplitudes have time, they can act like waves; because waves have pitch, position and time, they can act like particles. What we end up with is wave-particle duality at the level of sound, and so too, poetics – with neither capable of being measured with certainty.

An operating environment for unifying sound and poetics should reflect this uncertainty.


Once recorded, a poetic text – or any other text for that matter – becomes a soundfile from which we can extract samples to spectrally process. The borders of a sample picked for processing span the time of the section chosen within the soundfile or soundwave. We can extract multiple samples from the same location so that in one sample, duration is expanded, allowing the space of the particle positions to be represented more accurately; in the other, space is contracted so that duration can be captured more accurately. After extracting these samples and the spatiotemporal properties that comprise them, we can alter those properties in as many ways as we see fit – whether through noise gating or filtering; the possibilities for manipulation here are practically endless.2 But the important thing is that this allows us to form a poetics based not solely at the level of syllable and syntax, but one that also incorporates the elemental sound properties that comprise syllables and syntax.
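
One of the manipulations named above can be made concrete. A minimal sketch of a spectral noise gate, assuming numpy and a synthetic stand-in for the recorded text (a 330 Hz tone plus noise); the sample borders and the 20% threshold are arbitrary choices, not part of the method described:

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
rng = np.random.default_rng(0)
soundfile = np.sin(2 * np.pi * 330 * t) + 0.05 * rng.standard_normal(sr)  # stand-in recording

sample = soundfile[11025:15121]                  # borders span the chosen section of time
spectrum = np.fft.rfft(sample)                   # the sample's frequencies and amplitudes
gate = np.abs(spectrum) >= 0.2 * np.abs(spectrum).max()
gated = np.where(gate, spectrum, 0)              # silence every component below the threshold
new_wave = np.fft.irfft(gated, len(sample))      # the altered sample, audible again
```

The same spectrum could instead be filtered, stretched, or recombined; gating is only one of the endless manipulations the paragraph mentions.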

Because the data captured in our sample collection of textual substrata is spatiotemporal in character, instead of necessarily discernable tones and pitches, what we get are the sounds of multiple interactions of space and time: multiples of high-low, left-right, soft-loud, long-short, sounds shaped by contractions and expansions of time and space within the samples. The more we collect, the more we can effect a syntax of blur, where the parameters separating a micro sound space from a macro sound space can be combined and allowed to interact with one another.

Like a diagram in the hands of architects Ben van Berkel and Caroline Bos, the text can become an instrument.3 Made into a signal, rather than remaining a collection of signs, the text can interact with the same sonic particles that both constitute the text and are constituted by it. And as the process by which we uncover the text’s particles is itself non-simultaneous, discrete, invisible, the resulting interactive structure will not only rigorously represent the uncertainty of the text’s meaning, but also the process by which that uncertainty was achieved.

Invisible, spectral processing is itself the stuff of sound and poetics.


When the spectral samples we extract are combined with their larger grammatical reciprocals at the level of the original recorded text, we create a highly interactive sound structure, which is heard as a series of unfolding sound events. Within this structure, not only can the sequential unfolding of sounds precede and support the text, this unfolding crucially is the space where the original text is heard. Unlike the historic relationship between text and document, the space encompassing the text is generated by its sonic elements, and so dually functions as a text itself, and a space that interacts fully with the original text. As the space encircling the text is sounded, what were once margins become unmarginalized: this sound space can interact with itself, and in proximity to the original text.

Within these planes of interaction, the sonic space occurring before the first large-scale words are heard can be further processed to create particles of sound bearing but a trace of even their constituency as the atomic elements of the processed source text. This space, then, can have its own spatiotemporal interactions between sounds, united by way of their uncertain, discrete coming-to-be, and equally wandering: disappearing and reappearing in relation to both themselves, the mass sounding space they shape, and that same unmarginalized space which is heard before the first discernable words appear.

The sounds that shape the outer, textual space can also travel into the macro word formants themselves, with some being absorbed by these larger utterances, and some traveling around, through, above and below them if you like: only to return, or not return at a later time in the sound space as a whole. Interacting like this, not only do the multiple, spatiotemporal planes of sound deform and decontextualize their relations to the text, but they also affect the larger spatiotemporal characteristics of the text itself. By way of these interactions between physical properties at the quantum and classical levels, the boundaries between sound planes begin to overlap: the space becomes multi-dimensional.

In all ways connected to the original text, this multi-dimensional sound space leaks perimeter relations not just within, but at its outer perimeters too, as these surficial boundaries can always be extended – further processed, uncertainly. Topological and modular, this sound space is a manifold.

It is the manifold space that unifies sound and poetics. Logically derived from the instrument of the text by a rigorous, illogical process, this manifold is a continuous sound space containing discrete uncertainty.

A sono-poetic manifold is inharmonic, multi-dimensional, discordant: discursive.


The paradox that is discursion obfuscates boundaries between the sono-poetic manifold and any would-be performers. They needn’t read the text, nor even mouth it, as the text is within the manifold, and that is already performing. Essentially, any actions or body movements called for on the part of a corporeal performer signal that they must necessarily interact with the sonopoetic manifold, specifically because this invisible soundspace is performing in parallax to them.

This necessary interaction creates then another dimension of sound between those of the sonopoetic manifold, and the larger, Euclidean actions of the performers. In this plane, the micro and macro sounds within the manifold space can travel out amongst and around the performers as they can in their proximities to the text and to themselves. This very strange boundary space can be bent so that the sonopoetic manifold seems to enter the actions of the performers – enabling confusion as to whether the soundspace of the manifold is enveloped by the performers, or is enveloping them. The spatiotemporal properties that characterize the classical performance space itself also become malleable.

A sonopoetic manifold is an enigma.


The core of a sonopoetic method is mutable and rigorously absurd. The interior is arrived at by spectrally processing the larger properties of speech and space that characterize a text to reveal its quantum sound belly, where its poetic capacity resides. The process to spectra is itself poetic: inherently uncertain, it cannot reveal simultaneously the particle data of speech units and sound as they occur in time; but it does reveal their existence. Because of this duality, this observer effect, we cannot know whether the macro characteristics of a text are the result of its micro characteristics, or vice versa. The notion, then, of a complete syntactic delivery system at both levels is dissolved, and with it, clear semanticity. A sonopoetic method takes this into account: its logic is both arbitrary and discursive. It derives a continuous series of unfolding sound events from the continuous unfolding of the text. But it does so discretely; taking the sonic quanta out of the text for processing, then placing them in parallel, and arbitrarily within the unfolding of the text at its larger level of grammar. The result is not sound poetry, nor the poetics of sound. It is sonopoetic. It says that the spectral process, the interactions between atomic and Euclidean sounds, and the resulting multiples of space and time between these sounds, between the lines and letters, form a sonopoetic manifold, and that this sounding space is the poetics: is the poetry, is the music.

A method of sonopoetics says, harmony is dead; semantics is dead; long live the sonifold.


Not satisfied simply being décor for our psyches and our emotions, a sonopoetic method seeks to deaestheticize music and poetry by unmarginalizing and giving voice to the spatiotemporal sounds that form their practice. Sonopoetics strives for a balance between the hyper-physical and the hyper-emotional.

This is not strictly about dancing. This is particle audio.

Within the rigors and discursiveness of spectral processing we can create an invisible, enigmatic structure – a sonic architecture where program, site and plan are constituted entirely within, and by a chosen text. If we want to create new spaces by freezing, decomposing and recomposing syllables, durations, positions, tensions, sound particles and the components of their components, we should. We can.

In this way, a method of sonopoetics invites further investigation into the sound of space, and how space itself is understood. Using this method, we can bend and blend, contract and expand space and time by making curvatures and planes audible. So naturally it follows that sonopoetics has applications in other disciplines.

Above all, a method of sonopoetics encourages collaboration and participation. Aware of, and impressed by, the dense gravity of solipsism, sonopoetics seeks to de-emphasize the “me” in the artistic process, and in the receptive process, by being a laboratory for shared research and shared responsibilities in the exploration of ideas. Taking as its starting point the recorded text, the method prefers an interdisciplinary, signal-based communication route between two or more people, to enable horizontal consensus rather than vertical dialogue. Acknowledging that a complete system for the interpretation of meaning and semanticity might never be verifiable, the method understands the importance of an intersubjective approach. Rather than searching for differences amongst commonalities, a sonopoetic method searches for commonalities amongst differences.


1 See Michael Norris, “Understanding Spectral Processing,” SoundMagic Spectral manual (New Zealand School of Music, rev. August 2011).

2 For a more complete discussion, see Trevor Wishart, Audible Design: A Plain and Easy Introduction to Practical Sound Composition (Orpheus The Pantomime, 1994).

3 See Ben van Berkel and Caroline Bos, “Interactive Instruments in Operation Diagrams,” in Olafur Eliasson: Surroundings Surrounded, Essays on Space and Science, ed. Peter Weibel (Graz: ZKM; Cambridge: MIT Press, 2000–2001), 79–88.