Nicholas M. Collins
St John's College
Centre for Music and Science
Faculty of Music
University of Cambridge 2006
This dissertation is submitted for the degree of Doctor of Philosophy
245 pages
Abstract
Musical agents which can interact with human musicians in concert situations are a reality,
though the extent to which they themselves embody human-like capabilities can be called
into question. They are perhaps most correctly viewed, given their level of artificial intelligence
technology, as ‘projected intelligences’, a composer’s anticipation of the dynamics of a concert
setting made manifest in programming code. This thesis will describe a set of interactive systems
developed for a range of musical styles and instruments, all of which attempt to participate
in a concert by means of audio signal analysis alone. Machine listening, being the simulation
of human peripheral auditory abilities, and the hypothetical modelling of central auditory and
cognitive processes, is utilised in these systems to track musical activity. Whereas much of this
modelling is inspired by a bid to emulate human abilities, strategies diverging from plausible human
physiological mechanisms are often employed, leading to machine capabilities which exceed
or differ from their human counterparts. Technology is described which detects events from an
audio stream, further analysing the discovered events (typically notes) for perceptual features of
loudness, pitch, attack time and timbre. In order to exploit processes that underlie common musical practice, beat tracking is investigated, allowing the inference of metrical structure which
can act as a co-ordinative framework for interaction. Psychological experiments into human
judgement of perceptual attack time and beat tracking to ecologically valid stimuli clarify the
parameters and constructs that should most appropriately be instantiated in the computational
systems. All the technology produced is intended for the demanding environment of realtime
concert use. In particular, an algorithmic audio splicing and analysis library called BBCut2
is described, designed with appropriate processing and scheduling faculties for realtime operation.
Proceeding to outlines of compositional applications, novel interactive music systems are
introduced which have been tested in real concerts. These are evaluated by interviews with
the musicians who performed with them, and an assessment of their claims to agency in the
sense of ‘autonomous agents’. The thesis closes by considering all that has been built, and the
possibilities for future advances allied to artificial intelligence and signal processing technology.
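For illustration only, the following Python sketch shows one standard approach to the kind of event detection summarised in the abstract: a short-time Fourier transform, half-wave rectified spectral flux, and median-based peak picking over the resulting detection function. The function name, frame sizes and threshold are assumed values for a generic textbook method, not the specific detection algorithms developed in the thesis.

import numpy as np

def spectral_flux_onsets(x, sr, frame=1024, hop=512, threshold=1.5):
    """Detect note onsets in a mono signal x (1-D numpy array) via
    half-wave rectified spectral flux with median-based peak picking.
    Returns a list of onset times in seconds."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    prev_mag = np.zeros(frame // 2 + 1)
    flux = np.zeros(n_frames)
    for i in range(n_frames):
        seg = x[i * hop: i * hop + frame] * window
        mag = np.abs(np.fft.rfft(seg))
        # Half-wave rectification: only increases in spectral energy count,
        # so sustained tones do not continually retrigger detections.
        flux[i] = np.sum(np.maximum(mag - prev_mag, 0.0))
        prev_mag = mag
    onsets = []
    for i in range(1, n_frames - 1):
        local = flux[max(0, i - 8): i + 8]
        # Peak pick: keep a local maximum that exceeds a scaled median
        # of its neighbourhood in the detection function.
        if flux[i] == local.max() and flux[i] > threshold * np.median(local) + 1e-9:
            onsets.append(i * hop / sr)
    return onsets

In a realtime concert setting of the kind described above, such a detection function would be computed incrementally, frame by frame, with each detected event then passed on for feature analysis (loudness, pitch, attack time, timbre).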
Contents
1 Introduction
1.1 Personal Motivations and Thesis Structure
1.2 Interactive Instrument Research in Computer Music
1.3 Psychological Issues
1.4 Signal Processing Issues
1.5 Aims and Implementation
2 Beat Tracking and Reaction Time
3 Automatic Segmentation
4 Realtime Beat Tracking Algorithms
5 Automated Event Analysis
6 BBCut
7 Interactive Music Systems
7.1 Precursors
7.2 Machine Enhanced Improvisation
7.2.1 Sat at Sitar
7.2.3 DrumTrack
7.3 Baroqtronica: The Art of Machine Listening
7.3.1 Substituet
7.3.2 Ornamaton
7.4 Conclusions
8 Conclusions
8.1 Intelligent Agents?
8.1.1 Autonomy
8.1.2 The Shape of Musical Actions
8.1.3 Interactive Music Systems as Agents
8.2 Machine Listening Research
8.2.1 Event Detection and Analysis
8.2.2 Beat Tracking
8.2.3 BBCut3?
8.3 Research Outcomes
8.4 Compositional Outcomes