Visualising Music for Performance Flexibility

Paul Lyons and Yiding Zhang

Institute of Information Sciences and Technology, Massey University, Private Bag 11-222, Palmerston North, NEW ZEALAND
[email protected], [email protected]

Abstract
We describe a visualisation system which allows an artist to perform music to a pre-recorded accompaniment without some of the constraints that normally render such a performance dull and lifeless. Musicians who perform with an accompanist or a band can't always practise together, and may be unable to accept gigs because of the lack of availability of the other musician(s). In these circumstances, some artists have tried performing to pre-recorded accompaniments, but these are inflexible; it is difficult, or impossible, to achieve variations in tempo, spur-of-the-moment repetitions or omissions of sections of the music, and solo breaks of indeterminate length. The system described here is a prototype for a computer-controlled accompanist capable of all of these variations. It is based on three premises. First, the accompaniment is recorded as MIDI data, in which the various sections that are, or may be, performed are individually identified. Secondly, the arranger creates a representation of the "syntax" of the piece of music that shows repeated sections and alternative sections, using a visual structure similar to the syntax diagrams commonly used for defining computer programming languages. Thirdly, the performer interacts with this graphical representation of the high-level structure of the piece, choosing alternatives and altering tempi, while the music is playing.

Keywords: Music syntax, ACCOMPANIST, music interface, Graphical User Interface

1. Introduction
ACCOMPANIST is a computer system - in the strict sense of an integrated unit involving both software and hardware components - that adds value to a MIDI-recorded accompaniment. It gives performers who work with a pre-recorded accompaniment a degree of control over the large-scale musical structure of the piece as it is being performed. This overcomes problems of inflexibility in conventional recorded music. The system uses the computer for these purposes:
• It provides a means for constructing and displaying a representation of the default structure of a piece of music - a music structure notation, if you will.
• While the performance is under way, it provides a means for the performer to specify, in real time, a variation of the default structure by specifying the parts of a MIDI file which are to be replayed.
• It provides a means for extracting those parts of a MIDI file at the behest of the performer and for

stitching them together in an order specified by the performer, to produce a continuous piece of music.
• It provides a means for varying the playback tempo.
• It provides a means for restarting the playback, in synch, after an arbitrary-length delay in which musicians have been performing without accompaniment.

The use of a computer to control playback of MIDI-encoded music is hardly novel. The innovation in the present system lies in its combination of touchscreen input and a graphical interface, which allows the performer to interact directly with the music structure notation, modifying its pattern while the performance is under way.

2. Background
2.1 The Problem
The ACCOMPANIST system has wide general applicability, but it has grown from a personal need. We will explain the specific situation before moving on to a solution that can be used in a wide variety of situations. The first author of this paper is a member of a (mainly) jazz vocal group comprising four singers and four instrumentalists, and also of a church music group. Both groups find difficulty in scheduling practices and gigs, because instrumentalists are often unavailable. Neither group can rely on the availability or consistency of its accompaniment.

2.2 Recorded Accompaniment - The Obvious Solution
An obvious solution to the problems outlined above is to pre-record an accompaniment on cassette tape or CD. Alternatively, with a pre-recorded full MIDI accompaniment, a group could simply mute the tracks associated with instruments whose players do turn up on the night. Although some musicians work very successfully with such systems, pre-recorded accompaniments are unsuitable for music performance styles that involve even a slight degree of flexibility. Nearly all music, not just improvisational jazz, requires flexibility and communication between performers. Tempi, for example, are rarely rigidly defined, or at least adhered to, and a group will often, by impromptu common consent, repeat or omit a section. Jazz-related styles may follow a rigid structure for part of a piece, but incorporate a four-, eight-, twelve-, etc., bar improvisational band break. A pre-recorded accompaniment could include a blank section as long as the band break, but would almost inevitably restart too early or too late. Furthermore, some members of a group - particularly the rhythm section - can feel that a recorded beat usurps their authority. A pre-recorded accompaniment is unsuitable in all of these circumstances.

2.3 What Would an Ideal Solution be Like?
Commercially available generative systems that produce an automatic accompaniment based on a key or a chord sequence, a tempo, and a musical style are usually generic, with no concession to a melody line or tempo variations. They are often also musically uninteresting. If an accompaniment is an important part of the musical statement, or if a singer relates the pitch or time of an entry to a "landmark" in the accompaniment, then this approach is inadequate. A preferable solution would insert

fragments of MIDI accompaniment, for this sort of short-term precision, into a longer-term, customisable framework. The performer would see a musical "syntax", comprising a predefined sequence of named MIDI sections, such as Verse 1, Bridge, etc., with alternative accompaniments for some parts of the music. During playback, the computer could generate a variation of this sequence, at the performer's request, by any of the following customising actions: jumping to a new section, repeating a section, choosing an alternative accompaniment for a particular section, or altering the number of repetitions of a particular section. Interaction with the system would involve a perceptually intuitive, or at least easily learnt, representation of music "syntax", to minimise the difficulty of interacting with the syntax during a performance, when a performer's higher-level cognitive skills may be temporarily swamped by physical and emotional involvement in the music.

2.5 Technology Requirements
The aims presented in the previous section are admirable. Are they achievable? MIDI files (5) contain not only a sequence of note-on and note-off events, but also information about the beat rate, and they have provision for the labelling of individual sections and for specifying timing throughout the track. Using this information, it is possible in principle to pick out sections of the file and generate a MIDI stream containing the same note sequences, but with altered timing, so that the sections are strung together in a different order from the original file while maintaining musical synchronisation between the concatenated sections. A MIDI stream can therefore be filtered so as to conform to tempo variations that may occur during a performance.
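This section-extraction idea is easy to prototype with modern tools. The sketch below is an illustration under stated assumptions, not ACCOMPANIST's implementation: it uses the Python mido library to cut a single MIDI track at its Marker meta-events and restitch the named sections in a performer-chosen order. The file name and the section names are hypothetical examples.

```python
# Minimal sketch: reorder the named sections of a MIDI file using its
# Marker meta-events. Assumes the 'mido' library and a single-track file;
# the section names ("intro", "chorus", ...) are hypothetical examples.
import mido

def split_at_markers(track):
    """Return {section_name: [messages]} by cutting at Marker meta-events."""
    sections, current, name = {}, [], None
    for msg in track:
        if msg.is_meta and msg.type == 'marker':
            if name is not None:
                sections[name] = current
            name, current = msg.text, []
        elif name is not None:
            current.append(msg)
    if name is not None:
        sections[name] = current
    return sections

def stitch(sections, order, ticks_per_beat):
    """Concatenate sections in the requested order into a new MIDI file."""
    out = mido.MidiFile(ticks_per_beat=ticks_per_beat)
    track = mido.MidiTrack()
    out.tracks.append(track)
    for name in order:
        for msg in sections[name]:
            track.append(msg.copy())
    return out

src = mido.MidiFile('song.mid')
sections = split_at_markers(src.tracks[0])
# One performer-chosen traversal: intro, two chorus/verse pairs, outro.
performance = stitch(sections,
                     ['intro', 'chorus', 'verse 1', 'chorus', 'verse 2', 'outro'],
                     src.ticks_per_beat)
performance.save('performance.mid')
```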

Fig. 1: The Smart Music Interface

A second component of the system is the physical interaction interface. Mouse-based, (typewriter-)keyboard-based, and push-button-switch-based interfaces are inappropriate for this sort of interaction. The mouse is too indirect for use during active music-making. The authors feel, without experimental justification, that using a mouse to control a sequence of musical sections would overload some of the mental pathways that are used for other aspects of live music-making. Relating the physical movement of a mouse to the actions taking place on the screen is not a completely intuitive action; the present authors hypothesise that it may involve mental pathways that are also involved in reading music or formulating speech, so that it would be difficult to perform these activities simultaneously. Typewriter-style keyboards are even more indirect, and formulating written commands to control the musical events while the music is playing and the performer

is undertaking some other activity such as singing, is unlikely to produce successful spontaneous musical expression. Push-buttons might be used during the performance for selecting which section of music should follow the current one. In small numbers, push-buttons are a less cognitively demanding sort of interface, but they are at their best in electronic keyboards, where each push-button has a well-defined single role; the musical pattern construction envisaged in the current system is just a little too free to suit a push-button interface. The reader might well object that musicians think nothing of making several hundred button-choices per minute when using one of the most successful musical interfaces of all, the piano keyboard. This, with its 88 keys, is far more complex than any push-button interface that a performer could use to choose the identity of the next section of music to play when the current one has finished. The reader would be right; musicians do interact in very complex ways with piano keyboards, but it takes most people a great deal of effort and time to learn how to play them, and to learn the theory of equal temperament, keys, and harmony that is hidden behind the geometrical pattern of a piano keyboard. This effort is repaid, for musicians at any rate, by the joy of being able to produce beautiful sound sequences of a seemingly infinite variety and harmonic complexity. Learning to "play" a button-based interface for organising the playback sequence of pre-recorded sections of accompaniment does not promise the same emotional returns and feeling of direct involvement in producing the music as learning to play a piano. Consequently it is doubtful that a similarly complex interface would seem to repay the effort involved in learning how to "play" it. Further, it is difficult to see how a push-button interface could easily be related to the macrostructure of a wide variety of music.

Fig. 2: Cakewalk's In Concert system presents information about song structure textually

An interface technology that does not suffer from these disadvantages is the touchscreen, either resistive or capacitive (which can be activated just by touch) or electromagnetic (which requires an electromagnetic activator, usually a "pen"). With this technology, it is possible to manipulate a pictorial representation of objects in a target environment even more directly than with conventional, mouse-based "direct manipulation" interfaces. It is also possible to represent the structure of a piece of music as an intuitively simple graphic. That isn't to say that pieces of music with a complex structure will suddenly look simple, but that graphical presentation techniques can present such a structure without amplifying the complexity that is inherent in the music. Pen-based systems allow fine manipulation, which may not be necessary or appropriate under the conditions of a live music performance. The screen resolution of pure touch systems, on the other hand, may be slightly too low to allow easy interaction with a detailed graphical representation of the structure of a piece of music. ACCOMPANIST's present playback system is implemented on a stand-alone computer, for convenience. A production system could be an embedded system, incorporated into an electronic keyboard to allow a performer to augment the accompaniment, or it could be a single-function component of a MIDI setup.

2.6 Similar and complementary systems
Cakewalk In Concert (1) allows a keyboard player to play one part from a multi-part MIDI file. The In Concert software adjusts the speed and volume of the other parts to follow the keyboard player's lead. The keyboard performer can skip or repeat a section, and the In Concert software will follow the lead. Not all performers will wish to play a

Fig. 3: The Band-in-a-Box interface

keyboard, or even be able to play a keyboard accurately enough to generate the MIDI signal that In Concert uses as a trigger and synchronisation signal for its playback. Furthermore, the interface, shown in Fig. 2, presents information about location within the music digitally - in the sense that the digital watches that were temporarily fashionable in the 1980s presented time digitally. Complex information about temporal relationships, such as occurs between sections in a piece of music, is better presented via a pictorial, analog interface than via a textual, digital one. Coda Music (2) produce an interactive, computer-driven accompaniment system called Smart Music Studio (previously Vivace), which accompanies vocalists and wind players using a pre-recorded MIDI accompaniment. The real-time adaptability of the software is limited. It does not allow a performer to alter the number of repetitions of a section, or to choose one of several alternative


accompaniments for a particular section of the music. However, it listens to and follows a soloist's spontaneous tempo changes (see the "Soloist follows accompaniment & Accompaniment follows soloist" controls in Fig. 1), and it compares the soloist's pitch with what the sheet music calls for, and adapts to the performer as necessary. In spite of this flexibility, the product is marketed as a practice tool rather than a performance instrument, and the user must set up the configuration in a separate preliminary phase before starting the playback. There are a number of accompaniment-generation systems, like JAMMER (3), which is described thus:
Creating professional accompaniment tracks in JAMMER is fast and easy. Just enter some chords on JAMMER's lead sheet, pick a musical style and then press the Compose button. JAMMER instantly creates and plays full professional arrangements of accompaniment in the style you have chosen.
or PG Music's (4) Band-in-a-Box (see Fig. 3):
Band-in-a-Box is an automatic accompaniment program for your computer. You type in the chord symbols to any song, using standard chord symbols like C, Fm7, & Cm7b5, choose a style and press PLAY. Band-in-a-Box then generates a pro quality 5 instrument accompaniment of bass, drums, piano, guitar and strings in over 100 styles of music.
Such systems would be an ideal complement to ACCOMPANIST, as they address the problem of generating an accompaniment, but not the problem of customising the accompaniment easily during the performance (though Band-in-a-Box allows a performer to jump to any position in a song). Alternatively, ACCOMPANIST could be added to one of these products as a high-level flexible control system. There are some quite sophisticated systems for improvising accompaniment to a human performance. William Walker's (6, 7) ImprovisationBuilder analyses the humans' musical output and improvises jazz

Fig. 4: The Finite State Machine used by ImprovisationBuilder to control the musical conversation (its states, per the figure, include "Human plays melody, Computer comps", "Computer plays melody, Human comps", "Human solos, Computer comps", "Computer solos, Human comps", "Human plays four, Computer comps", "Computer plays four, Human comps", "Computer plays melody, Computer comps", and "Ending")

accompaniment and solo sections based on a conversational model of jazz performance. Whereas the commercial applications described above are not sufficiently adaptive in real time to provide a high degree of flexibility, the ImprovisationBuilder system is far more adaptive than is suitable for groups that don't improvise solos or accompaniments. IB (ImprovisationBuilder) is an interesting system because of the way it uses the large-scale structure of a piece of music to coordinate the human and computer contributions to the music. Fig. 4 shows, as a Finite State Machine, the structure of an IB jazz improvisation, based around variations on the melody and harmonic structure of a particular song. Each complete repetition of the harmonic structure is called a chorus. The first chorus, called the Head, is an initial statement of the actual melody from either human(s) or computer (accompanied by the other member of the pair), and it is followed by an improvised melodic solo chorus, again from either human(s) or computer. This has the same length as the melody, and follows the same chord structure. Then human and computer "trade fours" - play alternating four-bar sections - following the harmonic structure of a chorus, until a chorus has been completed; then a chorus comprising a recapitulation of the melody occurs, followed by an ending. Although IB uses an explicit representation of the large-scale structure of the music within the system, the musicians have very little control of the path that the music follows through that Finite State Machine, being able to make choices only at the start and at the points in the FSM where the arrows cross. This is not to say that IB will generate mechanical-sounding music; the microstructure of the computer's solos is not predetermined, and the microstructure of its accompaniment is based on the human-generated solo, so there is considerable scope for the human participant(s) to influence the style of the music, if not its structure. This may sound like a significant restriction of the freedom of jazz improvisers, but in fact it suits the prevailing style of small-group jazz, where musicians are often called upon at short notice to perform with musicians they have never worked with before. It is their knowledge of, and willingness to conform to, a shared structure that allows these performances to proceed.
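The conversational structure of Fig. 4 is compact enough to capture in a few lines of code. The following toy encoding is ours, not ImprovisationBuilder's implementation; the state names follow the figure, and the transition structure is simplified from the description above.

```python
# Toy encoding of an ImprovisationBuilder-style jazz-form FSM.
# State names follow Fig. 4; transitions are simplified for illustration.
import random

JAZZ_FORM = {
    'start': ['human plays melody / computer comps',      # the Head, from
              'computer plays melody / human comps'],     # either partner
    'human plays melody / computer comps': ['computer solos / human comps'],
    'computer plays melody / human comps': ['human solos / computer comps'],
    'computer solos / human comps': ['human plays four / computer comps'],
    'human solos / computer comps': ['computer plays four / human comps'],
    'human plays four / computer comps': ['computer plays four / human comps',
                                          'recapitulation'],
    'computer plays four / human comps': ['human plays four / computer comps',
                                          'recapitulation'],
    'recapitulation': ['ending'],
    'ending': [],
}

def traverse(fsm, choose, start='start', limit=20):
    """Walk the FSM from `start`; `choose` picks a path wherever it branches.

    These branch points are the only moments of human control; `limit`
    merely stops the four-trading loop running for ever in this demo."""
    state, steps = start, []
    while fsm[state] and len(steps) < limit:
        state = choose(fsm[state])
        steps.append(state)
    return steps

# Example: random choices at each branch point (seeded for repeatability).
random.seed(4)
for chorus in traverse(JAZZ_FORM, random.choice):
    print(chorus)
```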

3. ACCOMPANIST
The considerations described in Section 2 above have led to the design of ACCOMPANIST (Adaptable Computer-Controlled Output of a MIDI Pre-recorded Accompaniment, Not In Strict Time). This combination of software and hardware displays a graphical representation of the structure of a piece of music and allows the musician to control such parameters as the playback speed, the number of repetitions of a section of the music, and which of several alternative versions of a section get played.

3.1 Design of the playback system
The description given here outlines a design for a complete version of the ACCOMPANIST system. A prototype designed to test the concept has been implemented; a description of this prototype follows later in the paper. ACCOMPANIST is divided into two parts. The first is the Capture System. This enables an arranger to edit a pre-existing MIDI file to incorporate a default syntax for the piece of music. A syntax is a sequence of sections that may be concatenated to create a valid piece of music. Let us consider a conventionally constructed song with two verses, separated by choruses, preceded by an intro, and followed by an outro. In a slight modification of EBNF (Extended Backus-Naur Form) formal syntax notation, we might represent this structure as:

<song> ::= <intro> { ( <chorus> | <band break> ) <verse> }*2 ( <chorus> | <band break> ) <outro>

This is to be read as: the <song> consists of an <intro>, followed by 2 repetitions of the sequence { <chorus> or <band break>, followed by <verse> }, followed by a <chorus> or a <band break>, and then the <outro>. The modification of EBNF is the use of 2 to denote a specific number of repetitions. Normally the * appears on its own after a set of curly braces, and implies 0 or more repetitions of the items enclosed in the braces. The power of this notation is very great. It can describe much more complex structures than this simple song structure: EBNF is routinely used for defining the syntax of complete programming languages. Powerful though it may be, EBNF is not easy to understand. It would not be the basis of an intuitive interface for displaying a song's structure to a performing musician and allowing that musician to control the sequence of sections. An alternative to EBNF, the syntax diagram (or tramline diagram, or railway line diagram), is also commonly used. In this notation (in general, a multi-level Finite State Machine), the syntactic structure of a language is presented as a picture. Any traversal of the diagram that starts at the left-hand end, follows the line, and finishes at the right passes through a valid sequence of symbols from the language. Therefore, our song structure could be defined by the Finite State Machine shown in Fig. 5, and any sequence of sections that occurs in a traversal of the diagram, following the path from the left and ending at the right, defines a valid rendering of the song. In this diagram, arrowheads are only used on left-pointing lines, to reduce the "busyness" of the diagram. As in our slight

Fig. 5: A song structure as a conventional syntax diagram (the diagram's sections are intro, chorus, verse, band break, and outro; the number 2 marks the repetition count)

modification of EBNF, the number (under the semicircle) is a slight extension of the conventional notation, which should be taken to mean that the traversal should pass that point twice. This diagrammatic notation is far more easily comprehensible than EBNF and, equally importantly, is susceptible to interaction by direct manipulation during the course of a performance. The ACCOMPANIST notation for representing the structure of a piece of music is therefore based on syntax diagrams. The Capture System is a Graphical User Interface for producing such a diagram and creating links between the components of the diagram and sections in a MIDI file. The second part of ACCOMPANIST is the playback system. This must allow the musician:
• to view the default structure of the piece of music,
• to choose alternatives from within the default structure of the piece of music,
• to override the default playback order of sections within the default structure of the piece of music,
• to override the default number of repetitions of a section, or group of sections,
• to control the tempo of the music, and
• to synchronise the start of the section following a solo or band break of arbitrary length.
The conventional syntax diagram notation has been modified slightly to support these requirements. In Fig. 6, the music follows the path of the heavy playback line, from the left to the double bar line at the right. Sections are shown in named boxes. The boxes are angled as a compromise between legibility - for which they would ideally be oriented horizontally - and packing efficiency - for which they would ideally be oriented vertically. The section currently being played is shown in reversed colour - in this monochrome paper, with white type on a black background.

Fig. 6: An ACCOMPANIST Section Diagram

When the playback reaches a pick-up pointer, the section of the MIDI file that the pick-up pointer points to is played. Thus the first section to be played is the intro. It will be followed by the chorus. The chorus is visually juxtaposed with another section, a band break, to indicate that these are alternatives. The pick-up pointer is not merely a passive indicator of the prescribed path of the music, but also an active interaction tool. The performer can drag a pick-up pointer from one alternative section to another, either while the first alternative is playing, whereupon the second alternative will start playing at the same point in its sequence (which may produce some odd effects, but need not be totally unmusical), or - more commonly - before the alternatives start to play. In the latter case the previous section along the playback line will simply be followed by the selected alternative. The playback line may include loops, to identify sections, or sequences of sections, that are repeated. Numbered switchpoints are used to specify that a particular path will be followed on iteration n, where n is one of the numbers in the range specified on the arrowhead. So the switchpoint shown in Fig. 6 means "Take the top path after iterations (verses) 1 to 4, and the bottom path after iteration 5." The values on a switchpoint's arrowheads are set by the arranger, but the performer can override them by touching the arrowhead (or the section of the playback line leading from the arrowhead), to indicate that the next time the playback path reaches the switchpoint, it will follow the specified path, irrespective of the number of the iteration. Verse numbers are used to identify the iteration that is currently playing. Verse numbers for verses that have not yet begun are aligned horizontally below the playback line. Verse numbers for verses that have finished are aligned at 45° above the playback line. The number on the playback line is the number of the currently playing verse. When a new iteration of a loop begins, the verse numbers advance one position. Therefore all of the verse numbers are shown in a horizontal line underneath the playback line before a loop has started playing for the first time; after it has finished, they are all aligned at an angle of 45° above the playback line. In addition to acting as feedback to the performer, these numbers are also a performance control. The performer can drag either end of the strip, in either direction, to increase or decrease the current verse number.
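The switchpoint rule just described reduces to a small piece of logic. The following sketch is a hypothetical rendering, not ACCOMPANIST's code; the class and method names are ours.

```python
# Hypothetical switchpoint logic: choose a path from the iteration number,
# unless the performer has touched an arrowhead to force a path.
class Switchpoint:
    def __init__(self, ranges):
        # `ranges` maps a path name to the iterations on which it is taken,
        # e.g. {'top': range(1, 5), 'bottom': range(5, 6)} for Fig. 6.
        self.ranges = ranges
        self.override = None          # set when the performer touches a path

    def force(self, path):
        """Performer override: take `path` on the next arrival, regardless."""
        self.override = path

    def choose(self, iteration):
        if self.override is not None:
            path, self.override = self.override, None   # one-shot override
            return path
        for path, iterations in self.ranges.items():
            if iteration in iterations:
                return path
        raise ValueError(f'no path defined for iteration {iteration}')

# The switchpoint of Fig. 6: verses 1-4 loop back, verse 5 continues on.
sp = Switchpoint({'top': range(1, 5), 'bottom': range(5, 6)})
print([sp.choose(i) for i in range(1, 6)])   # top, top, top, top, bottom
sp.force('bottom')                           # performer cuts the loop short
print(sp.choose(2))                          # bottom
```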

The section diagram is only part of the interface that the performer can use to control the playback of the music. Below it on the screen, there is a tempo controller, also touch-activated. This comprises two sub-controls, a beat-capture region and a pair of speed controls (see Fig. 7). The Beat Capture region of the interface (a simple rectangle) allows a performer to tap at a multiple of the beat rate (usually 1, but sometimes 1/4, 1/2, 2 or other more exotic multiples). ACCOMPANIST adapts the playback speed to the performer's beat rate in real time.
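Deriving a tempo from the performer's taps is straightforward. The sketch below is a minimal illustration of the beat-capture idea, assuming tap timestamps in seconds; the function name and the averaging scheme are ours, not ACCOMPANIST's.

```python
# Hypothetical beat-capture: estimate tempo from the performer's taps.
# `multiple` is the tapped multiple of the beat rate (1, 1/4, 1/2, 2, ...).
def tempo_from_taps(tap_times, multiple=1.0):
    """Return beats per minute from a list of tap timestamps (seconds)."""
    if len(tap_times) < 2:
        return None                       # need two taps for one interval
    intervals = [b - a for a, b in zip(tap_times, tap_times[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / (mean_interval * multiple)

taps = [0.0, 0.5, 1.0, 1.5]               # one tap per beat, every half second
print(tempo_from_taps(taps))              # 120.0 bpm
```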

Fig. 7: The tempo controller

A performer can also use the continuous speed change controls. These are large triangular regions. Pressing the broad inner end of one of these regions effects a slow alteration in the playback speed; pressing the thin outer end effects a rapid alteration in the playback speed. Finally, the performer can set the speed directly, using the slider control at the top of the controller. The slider also acts as a feedback device; it moves when the performer taps on the beat pad or on the accelerate/decelerate triangles. The tempo controller can be used for synchronisation at the start of a piece, and after band breaks. When the arranger is defining the musical syntax for a piece of music, she or he can insert null sections (such as the band break in Fig. 5 and Fig. 6) which, because of their improvisatory nature, may have an arbitrary duration. In either case, the synchronisation involves tapping on the beat control for two bars. In the case of synchronisation at the start of the piece, this simply involves tapping at the speed with which the music will start. ACCOMPANIST expects two bars of this input, derives a speed from it, and starts the accompaniment at this speed. For synchronising the restart after a band break, the situation is slightly more complex, because both the tempi and the time signatures of the break and the following section may differ. Further, there is no MIDI information associated with the arbitrarily long blank section that "plays" during the band break, so ACCOMPANIST has access to no speed or time signature data during this time. Consider a band break during which the band is playing in Common Time (4/4) without MIDI accompaniment, at 120 beats per minute. It would be technologically simple for ACCOMPANIST to capture speed information as the performer counted in two bars of the following, accompanied, section in, let us say, 3/4 time at 100 beats per minute.

However, most performers would find it musically difficult to count the new section in at the new rate while singing, or even just listening to, the old section at the old rate. Accordingly, it is possible to associate a nominal time signature and playback speed with a blank section, even though there is no associated MIDI file. Using this information and the real time signature and playback speed information from the following section, ACCOMPANIST determines the speed ratio between the two sections. Then the performer can count out the last two bars of the blank section, in time with the live performance. In these circumstances, ACCOMPANIST calculates s2, the speed of the following section, according to the following equation:

s2 = s1 × S2/S1

where S1 is the nominal speed of the blank section, S2 is the speed of the following MIDI section, and s1 is the speed at which the performer counts out the last two bars of the blank section.
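A worked example, using the nominal figures above: if the performer counts the last two bars of the 120-beats-per-minute break at 126 beats per minute (five per cent fast), the nominally 100-beats-per-minute following section restarts at 105 beats per minute, preserving the performer's acceleration. In code:

```python
def restart_speed(s1, S1, S2):
    """Speed for the section after a band break: s2 = s1 * S2 / S1."""
    return s1 * S2 / S1

# Nominal break speed 120 bpm; following section nominally 100 bpm;
# the performer counts the last two bars in at 126 bpm (5% fast),
# so the accompaniment restarts 5% fast as well.
print(restart_speed(s1=126, S1=120, S2=100))   # 105.0
```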

The ACCOMPANIST interface also displays two more controls: a representation of the first track of the MIDI file for the currently playing section, displayed using conventional music notation, and a bar counter, comprising nothing more complex than a series of numbered rectangles. These are displayed below the section diagram, and a highlighted line connects the current note in the conventional notation to the current bar in the bar representation and the current section in the section diagram. The performer can drag this line forward or backward a note, a bar, or a section at a time, to change the part of the music that is playing. This flexibility is essential when a member of an ensemble loses a beat, or a bar, or jumps to the wrong section or incorrectly repeats a section. A section arrangement can reference MIDI data in two ways. If the sections are stored in separate files, then ACCOMPANIST's playback system can simply play those files in the order specified by the human section arranger. This would be a suitable approach if the section arranger is also responsible for creating the MIDI data. An alternative approach is to use the Marker meta-events that may be inserted into MIDI files to identify the start of a significant part of the sequence, using a text string such as a rehearsal letter or a section name like "First Verse". Such Markers may or may not already be present in a pre-recorded MIDI accompaniment from another source, so it is necessary for ACCOMPANIST to provide an interface with which the section arranger can edit Markers. The interface for generating a section diagram therefore contains two main features, one much more complex than the other. The simpler feature is a dialog box for selecting a file containing a section. The more complex feature is an interface component for editing markers in a MIDI sequence. This component should allow the arranger:
• to view the MIDI file using conventional musical notation or a piano roll representation,
• to play the file,
• to identify the playback point on the music notation,
• to stop and restart the playback at arbitrary points, and
• to annotate a particular point in the score with the text for a marker (a section name).
All of these features are provided by commonly available composition software such as Cakewalk, Finale, Evolution, or a host of others. As the current aim of the ACCOMPANIST project is to investigate the validity of the overall concept, rather than to produce a fully integrated system, it has seemed appropriate to omit this interface from the present ACCOMPANIST system, and to include it later. For the moment, therefore, annotation of MIDI files to allow ACCOMPANIST to identify sections is left to other software.

3.2 HCI Considerations
The development of the ACCOMPANIST interface is underpinned by the authors' philosophy about Human-Computer Interaction. Briefly, this may be summarised by the catchphrase "Perception is better than Analysis." The underlying idea is that our interrelationships with tools are largely based on perceptual clues. A well-designed handle looks like something you can grasp or turn. It may have other properties as well, like a smooth surface without sharp edges, but its first responsibility is to remind the user of its function by its form. The perceptual pathways by which this association is accomplished are not necessarily intuitive; the association may have to be learnt, but if the object is well designed, it will be easily learnt, and after it has been learnt it should not require conscious thought. Door handles (if designed for ease of use and not as an overt architectural statement) and conventional musical notation are extreme examples of this. The use of door handles is easily learnt, and allows the user to achieve a single, mundane end. Music notation is difficult to learn, but eventually becomes reflexive and subconscious. The far greater rewards obtained from learning music notation outweigh the pain of learning it. Observation of music pupils who have to consciously analyse the meaning of notes on the staff before playing them, and of accomplished musicians who have internalised the mental processes of notational analysis so that, subconsciously, they translate the notes directly into a set of appropriate hand movements, will convince anyone

instantly of the benefit of perception over analysis. The difficult task - especially when one is trying to foist a new notation onto musicians, whose minds are already overloaded with symbols - is to devise a notation which both taps into existing perceptual pathways and can be learnt quickly - which implies simplicity - and is powerful enough to represent a wide variety of situations - which implies complexity. Ideally, users new to the notation would find it so simple that they would be unaware of learning it. Section sequences could be specified textually, for example. A song with the same structure as the example shown in Fig. 6 could be specified by a textual representation something like this:

intro
chorus | band break  verse 1
chorus | band break  verse 2
chorus | band break  verse 3
chorus | band break  verse 4
chorus | band break  verse 5
chorus | band break  verse 6
outro

which is unambiguous, but uses space inefficiently and, by virtue of being "unrolled," hides the high-level structure of the piece. Because of its relentlessly textual nature, it is difficult to keep track of where one is "up to" in the sequence - though in a live interface using a keyboard, with the structure represented on a screen, it would be possible to highlight the name of the currently playing section. More complex pieces of music would have an unwieldy representation using this notation. A shorter, but more obtuse, representation might be

<intro> 6*( <chorus> | <band break> <verse> ) <outro>

We deliberately do not write out the "reading" of this legend, in order to emphasise the analysis that is required to understand it. Note that bracketing symbols are required to delimit the effect of the 6* multiplier, and also to allow phrases like "band break" to contain a space character. This notation could be learnt, and probably internalised like conventional musical notation, but it does not instantly tap into existing structures in the reader's perceptual system. Nor does it offer an easy way of altering the sequence of sections as the music plays. The visualisation shown in Fig. 6 is designed to overcome these deficiencies. It shows the repeated structure of the music as a loop. The eye traces around the path and

sees that the items inside the loop will be encountered a number of times. There is no need for textual analysis. The graphical loop symbol is instantly distinguishable from the graphical section symbol. By contrast, in a textual notation, because of the similarity of style of individual characters, the user must parse the information in the loop to find the symbol that denotes the end of the repeated material. For these reasons, the visualisation is based on the metaphor of a path, with branches and stopping points. The intention is to tap into our subconscious understanding of the visual appearance of a path. The detailed graphical representation of the path is an important aspect of this visualisation. Our perceptual system is preset to take notice of certain cues, such as edges and corners. Consequently, boxes made of lines (which we may think of as a pair of edges back-to-back) drawn in black on a white background are highly attention-grabbing, and may divert the performer's attention from the section names. In future versions of ACCOMPANIST, the appearance of the interface will be optimised to emphasise the most important components and de-emphasise others. The use of a touchscreen was justified earlier. ACCOMPANIST could be operated with a touchpad or even with a mouse, but a feeling of genuine direct manipulation of the diagram components will only be achieved if the user can actually touch the place on the screen where a component is displayed, and see it react to the touch instantly. Again, this is necessary to align the interface as directly as possible with the user's perceptual system.

3.3 TRAINEE
TRAINEE, the present implementation of ACCOMPANIST, falls short of the goals listed above, but is just about powerful enough to act as a testbed for the general idea. Specifically, it allows an arranger to construct a section diagram for a piece of music with an arbitrary number of sections, linked into a path, with loops around some or all of the sections, and with switchpoints and loop counters. TRAINEE has been implemented on a Fujitsu Stylistic 1000 pen computer running Windows 95. For the purposes of demonstrating the system, it has been divided into an interface component and a MIDI controller component. The interface runs on the Fujitsu tablet, and because this does not have a sound card installed, it sends control commands ("start section x at time t1," "start section y at time t2," etc.) to the Rogus system, which is resident on another PC. This PC also stores the MIDI file(s) and has a sound card installed.
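This division of labour implies a very small one-way command protocol. The paper does not specify the format of the messages sent to Rogus, so the sketch below is a hypothetical reconstruction: timestamped "start section" commands serialised as JSON over UDP, with an assumed address for the playback PC.

```python
# Hypothetical control protocol between the TRAINEE interface tablet and
# the playback PC: timestamped "start section" commands over UDP.
# The message format and address are assumptions, not the paper's spec.
import json
import socket

PLAYBACK_PC = ('192.168.0.2', 9000)      # assumed address of the MIDI host

def send_start(sock, section, start_time):
    """Tell the playback PC to start `section` at `start_time` (seconds)."""
    msg = json.dumps({'cmd': 'start', 'section': section, 't': start_time})
    sock.sendto(msg.encode('utf-8'), PLAYBACK_PC)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_start(sock, 'intro', 0.0)           # "start section x at time t1"
send_start(sock, 'chorus', 8.0)          # "start section y at time t2"
```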

There is no communication path back to the tablet from the second PC, as all timing for the sections can be derived from the MIDI files. The interface that has been implemented for TRAINEE is not as elegant as the interface shown in Fig. 6, and the user actions that are required to produce a section diagram are unnecessarily complex. Nevertheless, the system has allowed us to demonstrate that the overall ACCOMPANIST concept - presenting the performer with a graphical representation of the music's macro-structure, via a touchscreen which she or he can use to manipulate it - is viable, and future work should produce a system that surpasses the "ideal" design given above, both in terms of graphical elegance and usability.

4. Future developments
4.1 Graphical improvements
TRAINEE's section diagrams were oriented horizontally, by analogy with conventional musical notation. Orienting them vertically would improve the legibility and packing efficiency of section names. Fig. 8 shows how the interface would look with this alteration (and some other minor graphical improvements).

4.3 Default alternatives
Currently, when an arranger constructs a set of alternative sections, the playback defaults to the first alternative, and the performer overrides this default by tapping another section or by dragging the switchpoint to it. This has to be done at every performance. In future versions of ACCOMPANIST, therefore, it will be possible for the arranger to associate each section in a set of alternatives with one or more verse (iteration) numbers. Then that section will be played by default during that verse. Of course, the performer will still be free to choose one of the alternative sections, by the mechanism described above, and therefore to override the default.

Fig. 8: Proposed future ACCOMPANIST interface (a vertically oriented section diagram - intro, chorus / band break, verse, outro - with verse numbers, a 1-5/6 switchpoint, a bar counter, and the tempo slider)

5. References
[1] "Cakewalk In Concert," URL: http://cakewalk.com/Press/icfnl.htm, January 1998.
[2] "Smart Music," Coda Music Technology, Inc., 6210 Bury Drive, Eden Prairie, MN 5346-1718, USA.
[3] "JAMMER Professional: Music Composition Software for Windows," URL: http://www.soundtrek.com/accompaniment.htm, 1998. SoundTrek, 3408 Howell St, Suite F, Duluth, GA 30096.
[4] "Band-in-a-Box," PG Music Software, URL: http://www.pgmusic.com/index.html.
[5] "MIDI Specification, 1.0," International MIDI Association, 5316 West 57th Street, Los Angeles, CA 90056, 1983.
[6] William Walker and Brian Belet, "Applying ImprovisationBuilder to Interactive Composition with MIDI Piano," Proc. 1996 International Computer Music Conference, Hong Kong.
[7] William F. Walker, "A Computer Participant in Musical Improvisation," Proc. 1997 Conference on Human Factors in Computing Systems (CHI 97).