
Plymouth brain-computer music interfacing project: from EEG audio mixers to composition informed by cognitive neuroscience

Eduardo R. Miranda
Faculty of Arts, Interdisciplinary Centre for Computer Music Research (ICCMR), University of Plymouth, Plymouth PL4 8AA, UK
E-mail: [email protected]

Abstract: This paper reports the current level of development achieved by our research into brain-computer music interfacing (BCMI), which is aimed at special needs and Music Therapy, in addition to the entertainment industry. It surveys the technology developed to date at ICCMR and glances at work-in-progress designs informed by cognitive experiments. Research into BCMI involves three major challenging problems: (1) the extraction of meaningful control information from signals emanating from the brain, (2) the design of generative music techniques that respond to such information and (3) the definition of ways in which such technology can effectively improve the lives of people with special needs and address therapeutic needs. This paper focuses on the first two challenges, particularly the music technology side of BCMI research, concentrating on the creative computer music component of our research rather than on the minutiae of its underlying scientific protocols and experimental methods.

Keywords: brain-computer music interfaces; EEG-controlled music; music technology; computer-generated music; interactive music systems.

Reference to this paper should be made as follows: Miranda, E.R. (2010) 'Plymouth brain-computer music interfacing project: from EEG audio mixers to composition informed by cognitive neuroscience', Int. J. Arts and Technology, Vol. 3, Nos. 2/3, pp.154–176.

Biographical notes: Eduardo R. Miranda is a Professor in Computer Music and Head of the Interdisciplinary Centre for Computer Music Research (ICCMR) at the University of Plymouth, UK. He coauthored the book New Digital Musical Instruments: Control and Interaction Beyond the Keyboard (A-R Editions, 2006) and has authored a number of research papers on new technologies for music. Eduardo is an active composer using new computer technology to create music. His solo album CD Mother Tongue (Sargasso, London) includes a selection of works from his research into new possibilities and technologies for composing with voice, both natural and synthesised.

Copyright © 2010 Inderscience Enterprises Ltd.

1 Introduction

A brain-computer interface (BCI) allows a person to control computers by means of commands expressed by signals read directly from their brain using appropriate brain scanning technology. Basically, there are two approaches to reading brain signals: invasive and non-invasive. Whereas invasive approaches require the placement of sensors inside the skull, non-invasive methods use technology that can read brain signals from outside the skull. Currently, the most viable non-invasive practice for BCI is to read the electroencephalogram (EEG), with electrodes placed on the scalp.

We are interested in developing brain-computer music interfacing (BCMI) technology aimed at special needs and Music Therapy, in particular for people with severe physical disability but able brain function. This paper presents the current level of development achieved by our research into BCMI. Our research addresses three major challenging problems of BCMI:

1 the extraction of meaningful control information from signals emanating directly from the brain

2 the design of generative music techniques that respond to such information

3 the definition of ways in which such technology can effectively improve the lives of people with special needs and address therapeutic needs.

This paper reports on the first two challenges, particularly the music technology side of BCMI. It draws attention to the creative computer music component of our research rather than to the minutiae of its underlying scientific protocols and experimental methods.

Why BCMI for special needs and Music Therapy? Our work is motivated by the extremely limited opportunities for active participation in music making available to people with severe physical disability, despite advances in technology. This prevents them from engaging in the many emerging community initiatives that provide opportunities for recreational music-making, including performance, composition and creative workshops. Moreover, opportunities for engaging in therapy as part of a rehabilitation programme are also limited. At present, there are a number of systems available for recreational music-making and Music Therapy for people with physical disabilities. In short, these systems are controlled primarily with gestural devices (mostly using customised switches and potentiometers), which can meet the needs of most people with movement disorders who otherwise would not have access to music-making opportunities. However, such access is still denied to those with more complex physical conditions. For example, severe brain injury, spinal cord injury and Locked-in Syndrome result in weak, minimal or no active movement, which therefore prevents the use of gestural devices. These patient groups are currently either excluded from music recreation and therapy, or are left to engage in a less active manner through listening or choice-making only. For many with disability, BCMI technology has the potential to enable more active participation in recreational and therapeutic opportunities.

This paper is structured as follows: it begins with a brief introduction to basic scientific notions that are needed to follow the paper. Next, it introduces three proof-of-principle systems, which use the EEG to control an audio mixer, compose melodies and activate rules of a generative music system, respectively. It then presents two experiments aimed at identifying patterns of brain activity associated with cognitive tasks deemed to be musical. The first is an EEG experiment aimed at identifying EEG information correlated with the notion of active listening and the second is a functional Magnetic Resonance Imaging (fMRI) experiment aimed at identifying brain activity correlated with tonal processing. The paper concludes with a discussion of ongoing work towards the implementation of two BCMI systems inspired by the results of those experiments. One is aimed at generating variations of short musical passages and the other at generating tonal chord progressions.

2 The electroencephalogram

Neural activity generates electric fields that can be recorded with electrodes (Misulis, 1997). The EEG is the visual plotting of this signal; however, nowadays the initials EEG are used to refer to the electric fields themselves. These electric fields are extremely faint, with amplitudes in the order of only a few microvolts, and the signals need to be amplified significantly before they can be used.

There are two conventions for positioning the electrodes on the scalp: the 10–20 Electrode Placement System (as recommended by the International Federation of Societies for EEG and Clinical Neurophysiology) and the Geodesic Sensor Net (developed by a firm called Electrical Geodesics, Inc.). The former is more popular and is the convention adopted for most of the systems described below: it uses electrodes placed at positions that are measured at 10% and 20% of the head circumference (Figure 1). In this case, the terminology for referring to the position of the electrodes uses a key letter that indicates a region on the scalp and a number: F = frontal, Fp = frontopolar, C = central, T = temporal, P = parietal, O = occipital and A = auricular (the ear lobe; not shown in Figure 1). Odd numbers are for electrodes on the left side of the head and even numbers are for those on the right side.

A configuration of electrodes on the scalp is referred to as a montage. Montages fall into one of two categories: referential or bipolar. Referential means that the reference for each electrode is in common with other electrodes; for example, each electrode may be referenced to an electrode placed on the earlobe. Bipolar means that each channel is composed of two neighbouring electrodes; for example, channel 1 could be composed of Fp1–F3, where Fp1 is the active electrode and F3 is the reference, then channel 2 could be composed of Fp2–F4, where Fp2 is the active electrode and F4 is the reference, and so on.

The EEG expresses the overall activity of millions of neurons in the brain in terms of charge movement. In the case of non-invasive EEG, the electrodes can detect this only on the scalp. This is a difficult signal to handle because it is filtered by the meninges (the membranes that separate the cortex from the skull), the skull and the scalp before it reaches the electrodes. Moreover, this signal needs to be further scrutinised in order to be of any use for a BCI. Power spectrum analysis is most commonly used to analyse the EEG and is briefly introduced below. For a review of various EEG analysis methods, please refer to Sanei and Chambers (2007).

Power spectrum analysis is primarily based upon Fourier-based techniques, such as the discrete Fourier transform (DFT). DFT analysis breaks the EEG signal into different frequency bands and reveals the distribution of power between them. This is useful because it is believed that specific distributions of power in the spectrum of the EEG can encode particular states of mind.
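To illustrate how such an analysis might be carried out, here is a minimal Python sketch that decomposes a signal into the bands of Table 1. It is not the analysis software used in the systems described below; the sampling rate, band edges and window length are illustrative assumptions, and Welch's method (an averaged DFT) stands in for whatever spectral estimator a given system employs.

```python
import numpy as np
from scipy.signal import welch

FS = 256  # assumed sampling rate in Hz

# EEG rhythm bands as in Table 1 (boundaries are conventional approximations)
BANDS = {'delta': (0.5, 4), 'theta': (4, 8), 'alpha': (8, 13),
         'low_beta': (13, 20), 'medium_beta': (20, 30), 'gamma': (30, 40)}

def band_powers(eeg, fs=FS):
    """Estimate the distribution of power across EEG bands.

    Welch's method averages DFTs over overlapping windows, giving a
    smoother spectral estimate than a single DFT of the whole signal.
    """
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)  # 2-second windows
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum()
            for name, (lo, hi) in BANDS.items()}

# Synthetic check: 10 s of noise plus a 10 Hz component -> alpha dominates
t = np.arange(0, 10, 1 / FS)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(len(t))
print(band_powers(eeg))
```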

Figure 1 The 10–20 electrode placement system

Table 1 Bands of EEG activity and associated mental states for a healthy young adult

EEG rhythm                            Frequency band       Mental association
Delta                                 δ < 4 Hz             Sleep
Theta                                 4 Hz < θ ≤ 8 Hz      Drowsiness, trance, deep relaxation, deep meditation, hypnosis
Alpha                                 8 Hz < α ≤ 13 Hz     Relaxed wakefulness (normally generated by closing the eyes)
Low beta                              13 Hz < β ≤ 20 Hz    Awake, alertness, moderate mental activity
Medium beta                           20 Hz < β ≤ 30 Hz    High alertness, intense mental activity
Gamma (also referred to as high beta) γ > 30 Hz            Hyper-awareness, stress, anxiety

As far as BCI systems are concerned, the most important EEG activity lies below 40 Hz. There are six recognised bands of EEG activity below 40 Hz, also referred to as EEG rhythms, each of which is associated with specific states of mind. The exact boundaries of these bands are not clearly defined and the meaning of the associations can be contentious. Nevertheless, Table 1 gives what are generally accepted to be consensual values and associations.

The great majority of those who have attempted to use the EEG to generate music have done so by direct sonification of EEG signals (e.g. Hinterberger and Baier, 2005). For reviews of the field of BCI please refer to Berger et al. (2008) and Dornhege et al. (2007). In a nutshell, as early as 1934, a paper in the journal Brain reported a method to listen to the EEG (Adrian and Matthews, 1934). But it is now generally accepted that it was Alvin Lucier who composed the first musical piece using EEG, in the mid-1960s: Music for Solo Performer (Lucier, 1976). Lucier placed electrodes on his own scalp, amplified the signals, and relayed them through loudspeakers that were “directly coupled to percussion instruments, including large gongs, cymbals, tympani, metal ashcans, cardboard boxes, bass and snare drums…” (Lucier, 1980). The low frequency vibrations emitted by the loudspeakers set the surfaces and membranes of the percussion instruments into vibration.

In the early 1970s, David Rosenboom began systematic research into the potential of EEG to generate music. In 1990, he introduced a musical system whose parameters were driven by EEG components believed to be associated with shifts of the performer's selective attention (Rosenboom, 1990). Rosenboom explored the hypothesis that it might be possible to detect certain aspects of our musical experience in the EEG signal. This was an important step for BCMI research, as Rosenboom pushed the practice beyond the direct sonification of EEG signals towards the notion of digging for potentially useful information in the EEG to make music with.

3 Proof-of-principle systems

3.1 BCMI audio mixer

The BCMI audio mixer was implemented to let a user control the faders of a music mixer with their EEG (Miranda and Soucaret, 2008). For instance, assume a piece of music recorded onto three tracks: the first track contains the beat, which has a constant rhythm (bass and drums). The second and the third tracks contain piano and guitar solos, respectively. The system uses information from EEG power spectrum analysis to control the faders for the second and third tracks: the power of (low and medium) Beta rhythms controls the fader for track two and the power of Alpha rhythms controls the fader for track three. Therefore, if the system detects prominent Alpha rhythms in the EEG, then the guitar solo sounds louder than the piano. Conversely, if the system detects prominent Beta rhythms, then the piano solo sounds louder than the guitar (Figure 2).

The system was implemented using Waverider, an affordable four-channel EEG amplifier manufactured by MindPeak, combined with Reason, an off-the-shelf, relatively easy to use music production software manufactured by Propellerhead. The BCMI audio mixer is easy to set up and customise; it requires only basic knowledge of music technology to assign different tracks for EEG control or replace the tracks. The montage was referential, with electrodes placed at Fp1, Fp2, O1 and O2. The signals of the four electrodes are added prior to executing the analysis. The signals were filtered in order to remove signal interference originating from eye movements and facial muscles (EMG); this is a standard EEG procedure that takes place in all other systems and experiments reported in this paper.

Colleagues working at ICCMR trained themselves to control the mixer with no difficulty. After a few sessions they were able to increase and decrease the power of their Alpha rhythms in relation to their Beta rhythms practically at will, therefore exercising almost full control of the mixer.
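The mapping from band powers to fader positions could look like the following sketch. The published system drove Reason's mixer from Waverider hardware rather than from code like this, so the normalisation and the 0–127 MIDI-style value range are assumptions; only the Alpha-to-guitar and Beta-to-piano assignment follows the text.

```python
import numpy as np
from scipy.signal import welch

FS = 256  # assumed sampling rate in Hz

def band_power(eeg, lo, hi, fs=FS):
    """Power of an EEG signal within [lo, hi) Hz, via Welch's method."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)
    return psd[(freqs >= lo) & (freqs < hi)].sum()

def fader_levels(eeg, fs=FS):
    """Map relative Alpha vs. Beta power to two fader values (0-127)."""
    alpha = band_power(eeg, 8, 13, fs)
    beta = band_power(eeg, 13, 30, fs)   # low plus medium Beta
    total = alpha + beta
    return {'track2_piano': int(127 * beta / total),    # prominent Beta -> louder piano
            'track3_guitar': int(127 * alpha / total)}  # prominent Alpha -> louder guitar

# Synthetic example: a strong 10 Hz (Alpha) component should favour the guitar
t = np.arange(0, 4, 1 / FS)
eeg = np.sin(2 * np.pi * 10 * t) + 0.3 * np.random.randn(len(t))
print(fader_levels(eeg))
```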

Figure 2 BCMI audio mixer controls the faders of the mixer of an off-the-shelf music production software (see online version for colours)

3.2 EEG melodies

Having established that it is possible to control the faders of a music mixer with the EEG, we moved on to investigate the possibility of generating the actual musical tracks with the EEG. We developed a method to produce melodies from the 'topological' behaviour of the EEG across an electrode montage (Soucaret, 2008). In this case we did not sum the EEG from all electrodes prior to analysis. Rather, we considered the EEG signal of each individual electrode separately in order to infer possible trajectories of specific types of EEG information across a referential montage of 14 electrodes, as listed in Table 2; see Figure 1 for the placement scheme.

As an example, let us assume that we are interested in tracking the behaviour of the overall EEG amplitude. Figure 3 plots the amplitude of the EEG on each electrode for approximately 190 sec. Each plot is divided into five windows of approximately 38 sec each; the size of this window is arbitrary. The average amplitude is calculated for each window and the electrode with the highest value is singled out (shaded windows in Figure 3). The example in Figure 4 shows how the power of the EEG varied across the montage: the area with the highest EEG power moved from electrode 2 (Fp2) to 1 (Fp1), then it moved to electrode 5 (F4) followed by electrode 6 (F8), where it remained for two windows.

Table 2 The montage of 14 electrodes used in EEG melodies

Electrode number   Electrode name
1                  Fp1
2                  Fp2
3                  F7
4                  F5
5                  F4
6                  F8
7                  T3
8                  T4
9                  T5
10                 P3
11                 P4
12                 T6
13                 O1
14                 O2

Figure 3 The varying amplitude of the EEG on 14 different electrodes for approximately 190 sec (see online version for colours)

Figure 4 Tracking the behaviour of the amplitude of the EEG signal across a montage of electrodes (see online version for colours)

Note: In this example, the area with the highest EEG power moved from electrode 2 (Fp2) to 1 (Fp1), then it moved to electrode 5 (F4), followed by electrode 6 (F8), where it remained for two windows.

The method to produce melodies works as follows: we associate each electrode with a musical note (Table 3), which is played when the respective electrode is the most active with respect to the EEG information in question. The associations between notes and electrodes are arbitrary. In the case of our example, the trajectory shown in Figure 4 would have generated the melody shown in Figure 5. (Rhythm is allocated by means of a Gaussian distribution function, which is not relevant to the discussion here.)

A number of analyses can be performed in order to track the behaviour of other types of EEG information. For instance, we generated two concurrent melodies by tracking the trajectories of Alpha rhythms and Beta rhythms simultaneously. We also generated polyphonic music by tracking other types of EEG information simultaneously, such as correlations between electrodes or sets of electrodes, synchronisation between one or more electrodes, and so on.
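The core of the method, windowing the multi-channel signal, singling out the most active electrode per window and emitting its associated note, can be sketched as follows, using the note associations of Table 3 (shown below). The window length, sampling rate and mean-amplitude criterion are illustrative; the actual ICCMR software is not reproduced here.

```python
import numpy as np

# Electrode-to-note associations from Table 3 (ordered by electrode number)
NOTES = ['A4', 'A4#', 'B4', 'C5', 'C5#', 'D5', 'D5#', 'E5',
         'F5', 'F5#', 'G5', 'G5#', 'A6', 'A6#']

def melody_from_trajectory(eeg, fs, window_s=38):
    """Sketch of the EEG melodies method.

    Splits a multi-channel recording (shape: channels x samples) into
    windows, finds the electrode with the highest mean amplitude in each
    window and emits the note associated with that electrode.
    """
    win = int(window_s * fs)
    n_windows = eeg.shape[1] // win
    melody = []
    for w in range(n_windows):
        segment = np.abs(eeg[:, w * win:(w + 1) * win])
        loudest = segment.mean(axis=1).argmax()  # most active electrode
        melody.append(NOTES[loudest])
    return melody

# Example: 14 channels of random EEG-like data, ~190 s at 128 Hz -> 5 notes
fake_eeg = np.random.randn(14, 190 * 128)
print(melody_from_trajectory(fake_eeg, fs=128))
```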

Table 3 Associations between musical notes and the electrodes of a given montage

Electrode number   Electrode name   Musical note
1                  Fp1              A4
2                  Fp2              A4#
3                  F7               B4
4                  F5               C5
5                  F4               C5#
6                  F8               D5
7                  T3               D5#
8                  T4               E5
9                  T5               F5
10                 P3               F5#
11                 P4               G5
12                 T6               G5#
13                 O1               A6
14                 O2               A6#

Figure 5 Melody generated from the behaviour of EEG power shown in Figure 4

The system was implemented using a 32-channel EEG system manufactured by g.Tec, combined with software developed at ICCMR to analyse the EEG and generate the melodies. Although it takes much longer to place 14 electrodes on the scalp of the user, the EEG melodies system is, like the BCMI audio mixer, relatively easy to run. However, in contrast with the previous system, where the subjects were able to steer the mixer at will, here the subjects were not able to control the music in the same way. Much more research is needed to establish the meaning of the trajectories in terms of cognition or state of mind, particularly with respect to gaining a better understanding of how they could be used to control music. Nevertheless, it was possible to produce interesting, pleasant music by forging crafty associations of electrodes and notes, combined with careful generation of rhythmic figures.

3.3 BCMI piano

It is clear that it is very hard to compose music with a BCMI system by simply mapping EEG onto notes directly. One approach to address this problem is to furnish the system with artificial intelligence (AI) to allow for EEG control of some form of decision making within an ongoing composition process.


Rather than triggering musical notes one by one, BCMI piano uses information from EEG power spectrum analysis to steer an intelligent system that composes music online. The system sends out continuous MIDI information to play the music on a MIDI-enabled acoustic piano (Figure 6). We also apply Hjorth's (1970) analysis to measure the complexity of the signal; this information is used to control the tempo and the loudness of the music. The EEG is sensed by means of a bipolar montage of seven pairs of electrodes, as follows: G–Fz, F7–F3, T3–C3, O1–P3, O2–P4, T4–C4 and F8–F4. As in the case of the BCMI mixer, the EEG signals of the seven channels were filtered in order to remove signal interference and added together prior to executing the analyses.

BCMI piano implements a generative system that composes music using rules extracted from a given corpus of examples. It extracts sequencing rules from the examples and creates a transition matrix representing the transition-logic of what-follows-what. New musical pieces in the style of the ones in the training corpus are generated by sequencing building blocks of music material (also extracted from the examples in the corpus) in a domino-like manner. Although this type of self-learning sequence generator could be used for any type of musical element (such as musical note, chord, bar, phrase, section, etc.), BCMI piano focuses on short vertical slices of music such as a bar, or half-bar. We use a simple statistical predictor whose predictive characteristics are determined mainly by the harmonic identity of the slices.

In a nutshell, the system works as follows: every time it has to produce a bar of music, it checks the power spectrum of the EEG at that moment and triggers generative music commands associated with the most prominent EEG rhythm in the signal; these associations are arbitrary and can be modified. The system is initialised with a reference tempo (e.g. 120 beats per minute), which is constantly modulated by the results from the signal complexity analysis. For more details, please refer to Miranda (2007).

Figure 6 The music is played on a MIDI-enabled acoustic piano (see online version for colours)
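The domino-like sequencing described above can be sketched as a first-order transition table over bar-sized slices keyed by their harmonic identity. This is a simplified stand-in for the actual generator (which was implemented in LISP); the toy corpus of chord labels and all function names are illustrative.

```python
import random
from collections import defaultdict

def learn_transitions(corpus):
    """Build a what-follows-what table from sequences of music slices,
    each slice keyed by its harmonic identity (here, a chord label)."""
    table = defaultdict(list)
    for piece in corpus:
        for current, following in zip(piece, piece[1:]):
            table[current].append(following)
    return table

def generate(table, start, n_bars, rng=random.Random(0)):
    """Chain slices domino-style: each new bar is drawn from the slices
    that followed the current one somewhere in the training corpus."""
    out = [start]
    for _ in range(n_bars - 1):
        choices = table.get(out[-1]) or list(table)  # fall back on a dead end
        out.append(rng.choice(choices))
    return out

# Toy corpus of harmonic labels standing in for bar-sized slices of music
corpus = [['C', 'F', 'G', 'C'], ['C', 'G', 'C', 'F', 'G', 'C']]
table = learn_transitions(corpus)
print(generate(table, start='C', n_bars=8))
```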

Figure 7 An example output where the piece alternates between the styles of Schumann and Beethoven as the EEG jumps back and forth from bar to bar between Alpha and Beta rhythms

The system can generate music for the piano that contains, for example, more Schumann-like elements when Alpha rhythms are more prominent in the spectrum of the subject's EEG and more Beethoven-like elements when Beta rhythms are more prominent. Figure 7 shows an example where the EEG jumped back and forth from bar to bar between Alpha and Beta rhythms. The system was implemented using the same 32-channel EEG system used in EEG melodies. The intelligent generative music system was implemented in LISP. As in the case of the BCMI audio mixer, after a few training sessions colleagues in the laboratory were able to increase and decrease the power of their Alpha rhythms in relation to their Beta rhythms practically at will, and were therefore able to voluntarily switch between two styles of music. The signal complexity measurement tended to be higher when Beta rhythms were more prominent in the signal. Pieces in the style of Beethoven are played slightly louder and faster than pieces in the style of Schumann.
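Hjorth's (1970) time-domain measures, used above to modulate tempo and loudness, can be computed directly from the signal and its successive derivatives. A minimal sketch follows; using the discrete difference as the derivative is a standard approximation, and the exact normalisation used in BCMI piano is not specified in the paper.

```python
import numpy as np

def hjorth_parameters(x):
    """Hjorth (1970) activity, mobility and complexity of a 1-D signal.

    Activity is the signal variance; mobility is the ratio of the standard
    deviation of the derivative to that of the signal; complexity is the
    mobility of the derivative divided by the mobility of the signal.
    """
    dx = np.diff(x)
    ddx = np.diff(dx)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

# A noisy ('Beta-like') signal yields higher complexity than a slow sine
t = np.linspace(0, 2, 512)
print(hjorth_parameters(np.sin(2 * np.pi * 10 * t)))
print(hjorth_parameters(np.random.randn(512)))
```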

4 Moving forward

One of the major challenges that need to be addressed in order to move our research forward is to discover meaningful musical information in brain signals for BCMI control beyond the standard EEG rhythms. We have been performing a number of brain imaging experiments aimed at gaining a better understanding of brain correlates of music cognition. In the following sections we report on the results of two such experiments.


4.1 The active listening experiment

The objective of this experiment was to test the hypothesis that it is possible to detect information in the EEG indicating when a subject is engaged in one of two mental tasks: active listening or passive listening. In this context, the active listening task is to replay the experience of hearing some music, or part of that music, in the 'mind's ear'. Conversely, passive listening is to listen to music without making any special mental effort. In day-to-day life we are likely to be listening passively if we are relaxing to peaceful music or engaged in some other task whilst listening to light music in the background.

Three young male non-musicians participated in the experiment. The experiment was divided into six blocks of trials, giving the participants the chance to relax between blocks. Each trial lasted for 8 sec and contained two parts: a rhythmic part, lasting for the entire trial, and a melodic riff part, lasting for the first half of the trial. (A 'riff' is a short catchy musical passage that is usually repeated many times in the course of a piece of music.) It was during the second half of each trial that the mental task was performed. The rhythmic part comprised four repetitions of a 1-bar rhythm loop. Two repetitions of a 1-bar riff loop, starting at the beginning of the trial and terminating halfway through, were superimposed on the rhythmic part (Figure 8). In total, there were 15 unique riff loops: five played on a synthesised piano, five using a typical electronic timbre (General MIDI timbre #55, 'synth voice') and five on an electric guitar. The music was in the style of a pop club-like dance tune at 120 beats per minute, four beats per bar. The background rhythm looped seamlessly for the entire duration of each trial block. Blocks were named after the task the participant was instructed to perform on that block, and they were ordered as shown in Table 4. Each of the 15 riff parts was presented four times in each block in random order.

Figure 8 Participants listened to 4-bar trials containing a looped riff lasting for 2 bars (see online version for colours)

Table 4 The experiment was divided into six blocks of trials

Block   Subject 1   Subject 2   Subject 3
1       Active      Passive     Counting
2       Passive     Counting    Active
3       Counting    Active      Passive
4       Active      Passive     Counting
5       Passive     Counting    Active
6       Counting    Active      Passive

Note: Blocks were named after the mental task the subjects were instructed to perform.

Participants were instructed to perform one of three mental tasks while listening to a continuous sequence of trials:

1 Active listening: listen to the looped riff that lasts for 2 bars, then immediately after it finishes, imagine that the riff continues for another 2 bars until the next trial begins.

2 Passive listening: listen to the entire 4-bar trial with no effort; just relax and focus on the continuing background part.

3 Counting task: listen to the looped riff that lasts for 2 bars, then immediately after it finishes, mentally count the following self-repeating sequence of numbers (i.e. mentally spoken): 1, 10, 3, 8, 5, 6, 7, 4, 2, 1, 10, etc.

The classification task was to determine the class of 2-second multi-channel EEG segments, where class(1) = active listening, class(2) = passive listening and class(3) = counting task. Counting was included as a control task to verify that the EEG features that might allow for the differentiation between the imagery and relaxed listening tasks were not merely a function of a concentrating versus a non-concentrating state of mind. Only the last 4 seconds (i.e. the second half of each trial) were considered for analysis. These 4-second segments were further divided into 2-second segments; thus each trial yielded two segments. There were 120 trials for each of the three conditions and each subject produced a total of 720 segments: 240 segments for each condition. The data were randomly partitioned into a training set and a testing set with a split ratio of 9:1, resulting in 648 training segments and 72 testing segments.

A 128-channel Geodesic System, manufactured by Electrical Geodesics, was employed for EEG acquisition. This system uses a montage of 128 electrodes forming a geodesic structure, referred to as the Geodesic Sensor Net. We employed a linear autoregression algorithm to represent the EEG data in a compressed form in terms of estimations of spectral density in time (Anderson and Sijercic, 1996; Peters et al., 1997). Then, a classic single hidden-layer static neural network (Multi-Layer Perceptron), with a variable number of hidden units and up to three output units, was used for the classification task. The network was trained in batch mode for 50 epochs, using a scaled conjugate gradient algorithm, as described by Bishop (1995). More details about data preparation, representation and classification can be found in Duncan (2001) and Miranda et al. (2003). In short, the data were divided into two sets: a training set E and a test set T. The training set E was used to train the neural network to recognise the mental tasks of the elements that were left in T. In total, there were 768 inputs to the network.
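The pipeline of compressing each multi-channel segment into autoregression coefficients and classifying them with a single hidden-layer network might be sketched as follows. scikit-learn's MLPClassifier stands in for the original scaled-conjugate-gradient network, and the AR order, hidden-layer size and toy data dimensions are assumptions; real EEG, unlike the random stand-in below, carried the structure that produced the scores in Table 5.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

def ar_coefficients(x, order=6):
    """Least-squares AR model of one channel: predict x[t] from its lags."""
    lags = np.column_stack([x[order - k - 1:len(x) - k - 1] for k in range(order)])
    coeffs, *_ = np.linalg.lstsq(lags, x[order:], rcond=None)
    return coeffs

def features(segment, order=6):
    """Concatenate per-channel AR coefficients (input: channels x samples)."""
    return np.concatenate([ar_coefficients(ch, order) for ch in segment])

# Illustrative stand-in data; small sizes keep the demo fast
rng = np.random.default_rng(0)
X = np.array([features(rng.standard_normal((8, 250))) for _ in range(60)])
y = rng.integers(0, 3, size=60)  # three mental-task classes
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=500).fit(X_tr, y_tr)
print(clf.score(X_te, y_te))  # chance level here, since the data are random
```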


The network was reset, retrained and re-assessed ten times with different permutations of training and testing segments. Classifications were made between 2-second multi-channel segments belonging to pairs of conditions (for 2-way classification) and to all three conditions (for 3-way classification). The average classification scores, including confidence limits and standard deviation, for each subject are shown in Table 5. Remarkably, the classification scores are above 90% accuracy. We acknowledge that these results may not be statistically robust because the experiment involved only three subjects. Nevertheless, they encouraged us to work towards a BCMI capable of figuring out whether a subject is actively listening to music, or passively listening to it without any special mental effort. This notion is supported by a number of reports on experiments looking into musical imagination (e.g. Limb and Braun, 2008; Meister et al., 2004; Miranda et al., 2005; Petsche et al., 1996).

Table 5 Average classification scores for the active listening experiment

Subject   Classification task           Mean    Min.    Max.    Deviation   Confidence
1         Active × Passive              0.998   0.979   1.000   0.007       ±0.007
1         Active × Counting             0.996   0.979   1.000   0.009       ±0.009
1         Passive × Counting            0.994   0.979   1.000   0.010       ±0.010
1         Active × Passive × Counting   0.998   0.958   1.000   0.015       ±0.016
2         Active × Passive              0.994   0.979   1.000   0.010       ±0.010
2         Active × Counting             0.973   0.896   1.000   0.031       ±0.032
2         Passive × Counting            0.954   0.896   1.000   0.038       ±0.039
2         Active × Passive × Counting   0.951   0.903   0.986   0.023       ±0.024
3         Active × Passive              0.973   0.958   1.000   0.014       ±0.014
3         Active × Counting             0.992   0.979   1.000   0.011       ±0.011
3         Passive × Counting            0.994   0.958   1.000   0.014       ±0.014
3         Active × Passive × Counting   0.985   0.958   1.000   0.015       ±0.016

4.2 fMRI tonality experiment

In this section, we present an fMRI study of tonality, focusing in particular on the difference in neural processing of tonal and atonal stimuli and on neural correlates of distance around the circle-of-fifths, which describes how close one key is to another (Durrant et al., 2009). Tonality is concerned with the establishment of a sense of key, which can be established by a monophonic (single) melodic line, with harmony implied, but can also have that harmony explicitly created in the form of chord progressions (homophony). Tonality also defines clear expectations, with the chord built on the first tone (or degree) taking priority, and the chords based on the fourth and fifth degrees also particularly important: they are the only other chords whose constituent tones are taken entirely from the seven tones of the original scale, and they occur with greater frequency than other chords. In tonal music, the chord based on the fifth degree is often followed by the chord based on the first degree; in musical jargon, a dominant-tonic progression. This special relationship also extends to different keys, with the keys based on the fourth and fifth degrees of a scale being closest to an existing key (based on the first degree of the scale) by virtue of sharing all but one scale tone with that key. This gives rise to the circle-of-fifths (Shepard, 1982), where a change (or modulation) from one key to another is typically to one of these other keys that are close in this way. Hence, we can define the closeness of keys based on their proximity in the circle of fifths, with keys whose first-degree scale tones are a fifth apart sharing most of their scale tones and being perceived as closest to each other (Figure 9).

Each stimulus consisted of 16 isochronous events lasting 500 ms each, with each stimulus therefore lasting 8 sec, without gaps in between. Each event consisted of a chord recognised in Western tonal music theory, with each chord being in root position (i.e. the bass note of the chord is also the foundation note). The sense of key (or lack of key) was given by the sequence of 16 chords, rather than by individual chords. For a single run, stimuli were ordered into 24 groups of three stimuli with no gaps between stimuli or groups. The first stimulus in each group was always a tonal stimulus presented in the home key of C major; the second was always a tonal stimulus that could either be in the distant key of F# major, the closely-related key of G major, or the same key of C major. In order to reset the listener's sense of relative key, the third stimulus in each group was always an 'atonal' stimulus; that is, a chord sequence without any recognisable key (Figure 10).

Sixteen volunteers (9 female, 7 male; age 19–31) from the participant pool at the Leibniz Institute for Neurobiology gave informed consent to take part in the experiment, which was approved by the Ethics Committee of the University of Magdeburg.

Five experimental conditions were therefore defined: distant, close, same, initial and atonal (no key). As a result of the contiguity of groups, the first stimulus in each group followed the atonal stimulus in the previous group (except for the first group), which was defined as the initial condition. The distant and close conditions therefore defined changes from one key to another (distant or close, respectively), whereas the same condition defined no change of key (i.e. the next stimulus was in the same key). The atonal condition defined a lack of key, which was included here as a control condition. The stimuli were ordered such that all tonal stimuli were used an equal number of times, and the conditions appeared in all permutations equally in order to control for order effects.

Figure 9 The circle-of-fifths of Western tonal music

Note: The outer set of letters represents the circle for major keys; the inner set of letters represents the circle for the most closely related minor keys. The closeness of keys is given by their radial distance, with the closest keys occupying adjacent segments.
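The closeness of keys can be made concrete with a small helper. The pitch classes and the distance convention are standard music theory, though this exact function is an illustration rather than anything used in the experiment: multiplying a pitch class by 7 modulo 12 gives its position on the circle, because a perfect fifth spans 7 semitones.

```python
# Pitch classes of major-key tonics, C = 0 ... B = 11
PITCH = {'C': 0, 'C#': 1, 'D': 2, 'D#': 3, 'E': 4, 'F': 5,
         'F#': 6, 'G': 7, 'G#': 8, 'A': 9, 'A#': 10, 'B': 11}

def fifths_distance(key_a, key_b):
    """Steps around the circle of fifths between two major keys."""
    pos_a = (PITCH[key_a] * 7) % 12  # position on the circle of fifths
    pos_b = (PITCH[key_b] * 7) % 12
    d = (pos_b - pos_a) % 12
    return min(d, 12 - d)

print(fifths_distance('C', 'G'))   # 1: closely related (the close condition)
print(fifths_distance('C', 'F#'))  # 6: maximally distant (the distant condition)
```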


Figure 10 Example of stimuli

Note: Top: Tonal stimulus in the key of C major (initial and same conditions). Middle: Tonal stimulus in the key of F# major (distant condition). Bottom: Atonal stimulus in no obvious key.

In order to draw the attention of the participants to the tonal structure of the stimulus stream, the behavioural task in the scanner was to click the left mouse button when they heard a change to a different key (distant, close and initial conditions) and to click the right mouse button when they heard a change to no key (atonal condition). As the participants were non-musicians, the task was explained as clicking in response to a given type of change, so as to avoid misunderstandings as to the meaning of 'tonal', 'atonal' and 'key'.

A detailed description of the data acquisition procedures and analysis methods is beyond the scope of this paper; it can be found in Durrant et al. (2009). Basically, the fMRI data were acquired on a Siemens Trio 3T MRI scanner equipped with an eight-channel head coil. The perceived scanner noise was attenuated by earplugs (24 dB) and ear muffs (20–30 dB) in which MRI-compatible electrodynamic headphones were integrated (Baumgart et al., 1998). Each stimulus block lasted 8 sec and was immediately followed by the next stimulus block. Analysis was performed with a general linear model (GLM) (Friston et al., 1995, 1999).

The behavioural results are shown in Table 6, which gives the percentage of trials that contained a left- or right-click for each condition. Second-level one-way ANOVAs were performed for the left- and right-click results, respectively, across all participants. The distant, close and initial conditions had a significantly higher number of left-click responses than the same and atonal conditions. Conversely, the atonal condition had a significantly higher number of right-click responses than the other conditions. These results confirm that the participants, all of whom were non-musicians, could perform the behavioural task satisfactorily. They suggest that the participants must have had some awareness of the tonal structure of the stimuli.


The fMRI results are summarised in Table 7 and partially shown in Figure 11. In particular, we studied the balanced contrast of same vs. close and distant in order to evaluate activation related to key changes. We found eight active clusters for this contrast, which represent a diverse network of neural activity. The bilateral activation of the transverse temporal gyri, popularly known as auditory cortex, is of particular interest. This area showed a systematic increase in BOLD (blood-oxygen-level dependent) amplitude in both hemispheres with increasing distance in key. That is, we found relatively high activity for the distant key changes, slightly less, but still significant, activity for the close key changes, and much less activity for no key changes. This finding suggests that auditory cortex may not be limited to low-level individual pitch (or single note) processing, as commonly thought, but is also involved in higher-order tonal processing. This is significant for our research into BCMI because it indicates fairly well-defined potential sources of control information associated with tonality and modulation. By way of a related experiment, we cite the ambitious work of Janata et al. (2002), which reported the identification of a 'tonal map' in rostromedial prefrontal cortex. To the best of our knowledge, however, this finding has not since been verified by other researchers.

Table 6 Behavioural results, showing the percentage of trials that contained a left- or right-click, aggregated over all participants in the experiment

Condition   Left click   Right click
Distant     89.453       11.328
Close       83.594       0.7812
Same        26.563       3.5156
Initial     68.62        4.2969
Atonal      14.193       83.984

Table 7 Anatomical results of GLM analysis contrasting conditions with and without a key change

Location                               Tal X   Tal Y   Tal Z   Cluster size
(1) Right transverse temporal gyrus    51      17      10      1,023
(2) Right insula                       36      17      13      948
(3) Right lentiform nucleus            24      1       1       750
(4) Right caudate nucleus              14      4       22      1,443
(5) Left anterior cingulate gyrus      1       41      11      2,574
(6) Left cingulate gyrus               0       16      35      786
(7) Left superior frontal gyrus        12      50      36      2,241
(8) Left transverse temporal gyrus     51      18      11      981

Note: All active clusters preferentially favour key change stimuli. (Tal X, Y and Z are Talairach coordinates, which are used to describe the location of brain structures.)

Figure 11 Clusters of activation for the contrast same and distant and close key, including bilateral activation of auditory cortex (1 and 8) (see online version for colours)

5 Discussion: on converting research results into practice

In this section, we present two work-in-progress systems whereby we are attempting to apply the results of the previous experiments in practice.

5.1 Towards an active listening BCMI

The active listening experiment inspired the development of a BCMI where the user would eventually be able to affect the music being produced online by focusing attention on specific constituents of the piece. An initial prototype was designed to produce two tracks of music inspired by the scheme established for the active listening experiment (Figure 8): it comprised a rhythmic track and a solo track that is generated by a transformational rule (Miranda, 2000).

The system works as follows: firstly, the neural network is trained to identify when the incoming EEG corresponds to active or passive listening, as described in the experimental procedure. Then, the rhythmic part is continuously played and a riff is played sporadically. Immediately after a riff is played, the system checks the subject's EEG. If it detects active listening behaviour, then the system activates a transformation rule to generate a variation of the riff that has just been played. Otherwise, it does not generate anything and waits for the subject's response to the next sporadic riff. Sporadic riffs are always a repetition of the last played riff; in other words, the riff does not change until the system detects active listening behaviour (Figure 12). The initial riff is given by default. Needless to say, the subject who plays the music should be the same person as the one who trained the system.

In practice, we found it difficult to reliably detect active listening behaviour when subjects were consciously trying to change the riffs online. We believe that the problem is mainly to do with the fact that the subjects did not receive sufficient training before they tested the system. We are currently working towards a methodology to train subjects.

Figure 12 The overall functioning of the active listening BCMI (see online version for colours)
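The decision loop just described might be sketched as follows. Everything here is a hypothetical stand-in: the classifier, transformation rule, audio back-end and EEG stream are toy placeholders that only illustrate the control flow of the prototype.

```python
import random

def active_listening_loop(eeg_stream, classifier, transform, initial_riff, bars=32):
    """Sketch of the prototype's decision loop (all names are illustrative).

    After each sporadic riff, the trained classifier inspects the user's
    EEG; 'active' listening triggers a transformation rule that varies the
    riff, otherwise the same riff is repeated next time.
    """
    riff = initial_riff
    for bar in range(0, bars, 4):      # a sporadic riff every 4 bars, say
        play(riff)                      # hypothetical audio back-end
        segment = next(eeg_stream)      # EEG captured just after the riff
        if classifier(segment) == 'active':
            riff = transform(riff)      # generate a variation of the riff
    return riff

# Toy stand-ins so the sketch runs end to end
play = lambda r: print('riff:', r)
stream = iter(lambda: random.random(), None)     # endless fake EEG 'segments'
classify = lambda s: 'active' if s > 0.5 else 'passive'
transpose = lambda r: [(n + 2) % 12 for n in r]  # a simple transformation rule
active_listening_loop(stream, classify, transpose, initial_riff=[0, 4, 7, 4])
```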


5.2 Towards controlling tonality with a BCMI

The results of the fMRI tonality experiment inspired the development of a BCMI whereby the user would eventually be able to steer the tonality of a piece of music based on the degree of activation of the auditory cortex. However, only the generative music system has been fully implemented so far. We adopted a computational paradigm referred to as the constraint satisfaction problem (Anders and Miranda, 2008) to implement a generative music system that produces sequences of homophonic chord progressions online (Figure 13). The input to the system is a stream of pairs of hypothetical brain data, which controls higher-level aspects of a forthcoming chord progression. The first value of the pair specifies whether a progression should form a cadence, which clearly expresses a specific key (cadence progression), or a chord sequence without any recognisable key (key-free progression). Additionally, if the next progression is a cadence progression, then the key of the cadence is specified by the second value of the pair.

Figure 13 Extract from a sequence of chord progressions generated by the constraints-based generative system

Note: In this case the system produced a sequence in C major, followed by a sequence in no particular key and then a sequence in A major.


Each chord progression consists of n major or minor chords (in the example, n = 16), and different compositional rules are applied to cadence and key-free progressions. For instance, in the case of a cadence, the underlying harmonic rhythm is slower than the actual chords (e.g. one harmony per bar), and all chords must fall in a given major scale. The progression starts and ends on the tonic chord, and intermediate root progressions are restricted by Schoenberg's rules for tonal harmony (Schoenberg, 1986). For a key-free progression, rules enforce that all 12 chromatic pitch classes are used. For example, the roots of consecutive chords must differ, and the set of all roots in the progression must express the chromatic total. Also, melodic intervals must not exceed an octave. A custom dynamic variable ordering speeds up the search process by visiting harmony variables first (the root and whether the chord is major or minor), then the pitch classes and finally the pitches themselves. The value ordering is randomised, so we always get different solutions. A simplified sketch of a cadence search of this kind is given after this section's closing remarks.

Despite ongoing attempts at using fMRI for BCI (Weiskopf et al., 2004), fMRI is still impractical: fMRI scanning is simply too expensive to run, the equipment is not portable and it is not health-and-safety-friendly for continuous usage, to cite but three impracticalities. Moreover, fMRI scanners produce noise during the scan, which is very inconvenient for a musical application. Unless EEG technology is surpassed by something more convenient and reliable, it remains the best technology for BCMI. Therefore, further research is needed in order to establish whether the fMRI BOLD activations we found in auditory cortex can be detected in the EEG. Also, further investigation is needed in order to establish whether such activation can be produced voluntarily. Would subjects be able to learn to produce bilateral activations of the transverse temporal gyrus simply by imagining tonal progressions?
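The sketch below condenses the cadence rules above (start and end on the tonic, roots drawn from the scale, restricted root motion, randomised value ordering) into a toy backtracking search. It is not the actual constraint system used at ICCMR: the move set is a drastic simplification of Schoenberg-style root-progression constraints, and voicing, pitch-class and octave-interval rules are omitted.

```python
import random

MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]  # semitone offsets of scale degrees 1-7
QUALITY = ['maj', 'min', 'min', 'maj', 'maj', 'min', 'dim']  # diatonic triads

def cadence_progression(n=8, seed=0):
    """Backtracking search over scale degrees 0-6 for an n-chord cadence."""
    rng = random.Random(seed)
    moves = [3, 5, 1]  # up a 4th, down a 3rd, up a 2nd (in degrees, mod 7)

    def extend(seq):
        if len(seq) == n:
            return seq if seq[-1] == 0 else None   # must end on the tonic
        for m in rng.sample(moves, len(moves)):    # randomised value ordering
            result = extend(seq + [(seq[-1] + m) % 7])
            if result:
                return result
        return None

    return extend([0])                             # start on the tonic

degrees = cadence_progression()
print([(MAJOR_SCALE[d], QUALITY[d]) for d in degrees])  # (root offset, quality)
```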

6 Final remark

The introduction listed three major challenging problems of BCMI, and this paper focused on the first two. The third involves bringing the technology to the real world. This is perhaps the most challenging of the problems because it requires the design of meaningful assistive scenarios and clinical applications, and methods for evaluation. Establishing a robust testing programme for a specific group of patients is also challenging because it is not common to find a statistically plausible number of patients sharing the same type of disability and/or clinical condition within the same institution. We are currently liaising with carers and medics to make more progress on this front.

Acknowledgements

The author would like to thank his research team at ICCMR and other collaborators for their contributions at various stages of the work reported in this paper: Vincent Soucaret, for his work on the BCMI audio mixer and EEG melodies; Bram Boskamp, for his work on the generative music system for BCMI piano; Alex Duncan, Ken Sharman and Kerry Kilborn, for their work on the active listening experiment; Simon Durrant, for his work on the fMRI experiment; and Torsten Anders, for his work on the generative music system using constraints programming. Many thanks to André Brechmann and Henning Scheich at the Leibniz Institute for Neurobiology, Magdeburg, Germany, for their assistance with the fMRI experiment and the opportunity to use their Siemens Trio 3T MRI scanner. Part of this research was conducted in the context of the EPSRC-funded project 'Learning the Structure of Music' (Le StruM), grant number EPD063612-1.

References

Adrian, E.D. and Matthews, B.H.C. (1934) 'The Berger rhythm: potential changes from the occipital lobes in man', Brain, Vol. 57, No. 4, pp.355–385.

Anders, T. and Miranda, E.R. (2008) 'Constraint-based composition in realtime', Proceedings of the International Computer Music Conference (ICMC2008), Belfast, UK.

Anderson, C. and Sijercic, Z. (1996) 'Classification of EEG signals from four subjects during five mental tasks', Solving Engineering Problems with Neural Networks: Proceedings of the Conference on Engineering Applications in Neural Networks (EANN'96), pp.407–414.

Baumgart, F., Kautisch, T., Tempelmann, C., Gaschler-Markefski, B., Tegeler, C., Schindler, F., Stiller, D. and Scheich, H. (1998) 'Electrodynamic headphones and woofers for application in magnetic resonance imaging scanners', Medical Physics, Vol. 25, pp.2068–2070.

Berger, T.W., Chapin, J.K., Gerhardt, G.A., McFarland, D.J., Principe, J.C., Soussou, W.V., Taylor, D.M. and Tresco, P.A. (Eds.) (2008) Brain-Computer Interfaces: An International Assessment of Research and Development Trends. Heidelberg: Springer.

Bishop, C.M. (1995) Neural Networks for Pattern Recognition. Oxford: Oxford University Press.

Dornhege, G., Millán, J.R., Hinterberger, T., McFarland, D.J. and Müller, K.R. (Eds.) (2007) Toward Brain-Computer Interfacing. Cambridge, MA: The MIT Press.

Duncan, A.A. (2001) 'EEG pattern classification for the brain-computer musical interface', PhD thesis, University of Glasgow.

Durrant, S., Hardoon, D.R., Brechmann, A., Shawe-Taylor, J., Miranda, E.R. and Scheich, H. (2009) 'GLM and SVM analyses of neural response to tonal and atonal stimuli: new techniques and a comparison', Connection Science, Vol. 21, Nos. 2–3, pp.161–175.

Friston, K.J., Holmes, A.P. and Price, C.J. (1999) 'Multisubject fMRI studies and conjunction analyses', NeuroImage, Vol. 19, pp.385–396.

Friston, K.J., Holmes, A.P. and Worsley, K.J. (1995) 'Statistical parametric maps in functional imaging: a general linear approach', Human Brain Mapping, Vol. 2, pp.189–210.

Hinterberger, T. and Baier, G. (2005) 'Parametric orchestral sonification of EEG in real time', IEEE MultiMedia, Vol. 12, No. 2, pp.70–79.

Hjorth, B. (1970) 'EEG analysis based on time series properties', Electroencephalography and Clinical Neurophysiology, Vol. 29, pp.306–310.

Janata, P., Birk, J.L., van Horn, J.D., Leman, M., Tillmann, B. and Bharucha, J.J. (2002) 'The cortical topography of tonal structures underlying Western music', Science, Vol. 290, pp.2167–2170.

Limb, C.J. and Braun, A.R. (2008) 'Neural substrates of spontaneous musical performance: an fMRI study of jazz improvisation', PLoS ONE, Vol. 3, No. 2, e1679. doi:10.1371/journal.pone.0001679.

Lucier, A. (1976) 'Statement on: music for solo performer', in D. Rosenboom (Ed.), Biofeedback and the Arts, Results of Early Experiments. Vancouver: Aesthetic Research Center of Canada Publications.

Lucier, A. (1980) Chambers. Middletown, CT: Wesleyan University Press.

Meister, I.G., Krings, T., Foltys, H., Boroojerdi, B., Muller, M., Topper, R. and Thron, A. (2004) 'Playing piano in the mind – an fMRI study on music imagery and performance in pianists', Cognitive Brain Research, Vol. 19, No. 3, pp.219–228.

Miranda, E.R. (2000) Composing Music with Computers. Oxford: Elsevier/Focal Press.

Miranda, E.R. (2007) 'Brain-computer music interface for composition and performance', Int. J. Disability and Human Development, Vol. 5, No. 2, pp.119–125.

Miranda, E.R., Roberts, S. and Stokes, M. (2005) 'On generating EEG for controlling musical systems', Biomedizinische Technik, Vol. 49, No. 1, pp.75–76.

Miranda, E.R., Sharman, K., Kilborn, K. and Duncan, A. (2003) 'On harnessing the electroencephalogram for the musical braincap', Computer Music Journal, Vol. 27, No. 2, pp.80–102.

Miranda, E.R. and Soucaret, V. (2008) 'Mix-it-yourself with a brain-computer music interface', Proceedings of the 7th ICDVRAT with ArtAbilitation, Maia/Porto, Portugal.

Misulis, K.E. (1997) Essentials of Clinical Neurophysiology. Boston, MA: Butterworth-Heinemann.

Peters, B.O., Pfurtscheller, G. and Flyvberg, H. (1997) 'Prompt recognition of brain states by their EEG signals', Theory in Biosciences, Vol. 116, pp.290–301.

Petsche, H., von Stein, A. and Filz, O. (1996) 'EEG aspects of mentally playing an instrument', Cognitive Brain Research, Vol. 3, No. 2, pp.115–123.

Rosenboom, D. (1990) 'The performing brain', Computer Music Journal, Vol. 14, No. 1, pp.48–65.

Sanei, S. and Chambers, J.A. (2007) EEG Signal Processing. Hoboken, NJ: Wiley & Sons.

Schoenberg, A. (1986) Harmonielehre (7th ed.). Wien: Universal Edition.

Shepard, R.N. (1982) 'Structural representations of musical pitch', in D. Deutsch (Ed.), The Psychology of Music. Oxford: Oxford University Press, pp.344–390.

Soucaret, V. (2008) 'Audio control with biosignal for music', MSc dissertation, Faculty of Technology, University of Plymouth, UK.

Weiskopf, N., Mathiak, K., Bock, S.W., Scharnowski, F., Veit, R., Grodd, W., Goebel, R. and Birbaumer, N. (2004) 'Principles of a brain-computer interface (BCI) based on real-time functional magnetic resonance imaging (fMRI)', IEEE Transactions on Biomedical Engineering, Vol. 51, No. 6, pp.966–970.
