A Tutorial on
Sound and Music Computing
Xavier Serra Universitat Pompeu Fabra Barcelona, Spain
[email protected]
Index
Introduction to the SMC field
  Basic challenges
  Some historical references
Current research trends
  Sound synthesis and processing
  Sound/Music description and understanding
  Music performance and interaction
Future challenges in SMC
[This tutorial also included a section on “Large scale physical modeling synthesis”, given by Stefan Bilbao, which is not included here.]
SMC tutorial
Introduction to Sound and Music Computing
1. Basic challenges
2. Some historical references
Music communication framework (Moore, 1990)
[Diagram elements: Composer, Symbolic representation, Performer, Temporal controls, Instrument, Source sound, Room, Sound field, Listener (Perception, Cognition), “Musical” knowledge base, “Physical” knowledge base.]
SMC disciplines (Moore, 1990)
[Diagram: SMC surrounded by related disciplines and topics: Music, Computer Science, Psychology, Physics, Engineering; Theory, Performance, Composition, Artificial Intelligence, Programming, Psychomusicology, Cognition, Psychoacoustics, Acoustics, Digital signal processing, Digital hardware, Device design.]
General references
Moore, F. R. 1990. Elements of Computer Music. Englewood Cliffs (N.J.): Prentice Hall.
Roads, C. 1996. The Computer Music Tutorial. Cambridge (Mass.): MIT Press.
Polotti, P. and D. Rocchesso. 2008. Sound to Sense - Sense to Sound: A state of the art in Sound and Music Computing. http://smcnetwork.org/public/S2S2BOOK1.pdf
Sound and Music Computing Network, http://www.smcnetwork.org
Computer Music Journal, http://www.mitpressjournals.org/cmj
International Computer Music Conference, http://www.computermusic.org/
Introduction: Basic Challenges
Modeling sounding objects
String vibration
Membrane vibration
Modeling perception
Diagram of inner ear
Modeling cognition
(Peretz and Coltheart, 2003)
Introduction: Some historical references
“I dream of instruments obedient to my thought and which with their contribution of a whole new world of unsuspected sounds, will lend themselves to the exigencies of my inner rhythm.” (Varèse, 1937)
Historical keywords
1950s-1960s. Algorithmic composition (Hiller, Xenakis)
1950s-1960s. Sound synthesis (MUSIC-V)
1960s-1970s. Sound analysis/synthesis
1970s. Music workstations (Chant, 4A)
1980s. Physical models, Interactive systems
1990s. Music information retrieval
Hiller, L. & L. Isaacson. 1959. Experimental Music. McGraw-Hill Book Company, Inc.
“The process of composition can be understood as the extraction of order from the chaotic multiplicity of possibilities”
Information Theory. Method of Monte Carlo. Markov Chain.
“The Illiac Suite” for String Quartet, 1957
Four experiments:
Monody, two and four voices
Four voices, first species counterpoint
Experimental music
Music with Markov chains
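As a rough illustration of the "Music with Markov chains" idea, the sketch below draws a melody by Monte Carlo sampling from a first-order Markov chain. The transition table, the note names and the function name generate_melody are invented for this example; they are not taken from Hiller and Isaacson's experiments.

```python
import random

# Hypothetical first-order transition probabilities between notes
# (illustrative values only, not from the Illiac Suite).
TRANSITIONS = {
    "C": {"D": 0.5, "E": 0.3, "G": 0.2},
    "D": {"C": 0.4, "E": 0.6},
    "E": {"D": 0.3, "F": 0.4, "G": 0.3},
    "F": {"E": 0.6, "G": 0.4},
    "G": {"C": 0.5, "E": 0.3, "F": 0.2},
}


def generate_melody(start, length, rng):
    """Sample a note sequence where each note depends only on the previous one."""
    melody = [start]
    while len(melody) < length:
        options = TRANSITIONS[melody[-1]]
        notes = list(options)
        weights = [options[note] for note in notes]
        # Monte Carlo step: draw the next note according to the weights.
        melody.append(rng.choices(notes, weights=weights, k=1)[0])
    return melody


melody = generate_melody("C", 16, random.Random(1957))
print(" ".join(melody))
```

Passing a seeded random.Random keeps the result repeatable; a higher-order chain would condition on more than one previous note.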
Xenakis, I. 1963. Musiques Formelles. Revue Musicale.
glissandi in "Pithoprakta"
Philips Pavilion at Expo 58 in Brussels, by Le Corbusier
Mathews, M. 1969. The Technology of Computer Music. MIT Press.
MUSIC I-II: first real computer synthesis program, developed by Max Mathews of Bell Laboratories in 1957.
MUSIC III in 1960 introduced the concept of a “unit generator”.
Newman Guttman, 1957: “The Silver Scale”
Daniel Arfib, 1979: “Le Souffle du Doux”
Mathews, M. and Moore, R. 1970. Groove - A Program to Compose, Store, and Edit Functions of Time. Communications of the ACM.
Groove: Generated Real-time Operations On Voltage-controlled Equipment
Emmanuel Ghent, 1970: “Phosphones”
The GROOVE System at the Bell Telephone Labs, c. 1970
Chowning, J. 1973. The Synthesis of Complex Audio Spectra by Means of Frequency Modulation. Journal AES.
Sound examples: FM with vibrato; From a bell to a voice; DX7 Rhodes
Chowning, 1977: “Turenas”
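A minimal sketch of simple frequency modulation, assuming the usual form e(t) = A sin(2π f_c t + I sin(2π f_m t)) with carrier f_c, modulator f_m and modulation index I. The function name fm_tone and the parameter values are illustrative choices, not settings quoted from the paper or from the sound examples.

```python
import math


def fm_tone(carrier_hz, modulator_hz, index, duration_s,
            sample_rate=44100, amplitude=1.0):
    """Return samples of a sine carrier whose phase is modulated by a sine."""
    n_samples = int(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        modulation = index * math.sin(2 * math.pi * modulator_hz * i / sample_rate)
        phase = 2 * math.pi * carrier_hz * i / sample_rate + modulation
        samples.append(amplitude * math.sin(phase))
    return samples


# A non-integer carrier-to-modulator ratio gives an inharmonic spectrum
# (assumed example values).
bell_like = fm_tone(200.0, 280.0, 5.0, 0.1)
# With index 0 the result is a plain sine wave at the carrier frequency.
plain = fm_tone(440.0, 100.0, 0.0, 0.01)
```

Raising the index spreads energy into more sidebands around the carrier, so a time-varying index is the usual way to shape the spectrum over a note.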
Moorer, J. A. 1975. On the Segmentation and Analysis of Continuous Musical Sound by Digital Computer. Ph.D. thesis, Stanford University.
Reference | Institute | Performance | Knowledge used
Moorer75 | Stanford University | Polyphony: 2 (severe limitations on content). Sounds: violin, guitar. Note range: 24. | Heuristic approach.
Chafe82, 85, 86 | Stanford University | Polyphony: 2 (presented simulation results insufficient). Sound: piano. Note range: 19. | Heuristic approach.
Maher89, 90 | Illinois University | Polyphony: 2. Sounds: clarinet, bassoon, trumpet, tuba, synthesized. Note ranges: severe limitation, pitch ranges must not overlap. | Heuristic approach.
Katayose89 | Osaka University | Polyphony: 5 (several errors allowed). Sounds: piano, guitar, shamisen. Note range: 32. | Heuristic approach.
Nunn94 | Durham University | Polyphony: up to 8 (several errors allowed, perceptual similarity). Sound: organ. Note range: 48. | Perceptual rules. Architecture: bottom-up abstraction hierarchy.
Kashino93, 95 | Tokyo University | Polyphony: 3 (quite reliable). Sounds: flute, piano, trumpet, automatic adaptation to tone. Note range: 18. | Perceptual rules, timbre models, tone memories, statistical chord transition dictionary. Architecture: blackboard, Bayesian probability network.
Martin96 | MIT | Polyphony: 4 (quite reliable). Sound: piano. Note range: 33. | Perceptual rules. Architecture: blackboard.
Klapuri 2001: original, transcription
Grey, J. M. 1975. An Exploration of Musical Timbre. Ph.D. thesis, Stanford University.
Factors determining the timbre of a musical sound:
• Loudness
• Amplitude envelope
• Fluctuations of pitch and intensity
• Formant structures
• Temporal evolution of spectral distribution
Moorer, J. A. 1978. The use of the phase vocoder in computer music applications. Journal AES.
STFT:
\[ X_l(k) = \sum_{n=0}^{N-1} w(n)\, x(n + lH)\, e^{-j\omega_k n}, \qquad l = 0, 1, \ldots \]
Inverse STFT:
\[ s(n) = \sum_{l=0}^{L-1} \mathrm{Shift}_{lH,n} \left[ \frac{1}{K} \sum_{k=0}^{K-1} X_l(k)\, e^{j\omega_k m} \right] \]
Pitch transposition (by Dolson)
Time stretch (by Dolson)
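The STFT analysis and overlap-add resynthesis can be sketched as follows. The sketch assumes ω_k = 2πk/N, K = N, a periodic Hann window and a hop of H = N/2, so that overlapping windows sum to one; a naive DFT is used for clarity rather than an FFT. The function names stft and istft and the test signal are choices made for this example.

```python
import cmath
import math


def stft(x, window, hop):
    """X_l(k) = sum_n w(n) x(n + l*hop) exp(-j*2*pi*k*n/N) for each frame l."""
    size = len(window)
    frames = []
    for start in range(0, len(x) - size + 1, hop):
        frame = [window[n] * x[start + n] for n in range(size)]
        frames.append([
            sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size)
                for n in range(size))
            for k in range(size)
        ])
    return frames


def istft(frames, hop, length):
    """Inverse DFT of every frame, shifted by l*hop and summed (overlap-add)."""
    size = len(frames[0])
    out = [0.0] * length
    for l, spectrum in enumerate(frames):
        for m in range(size):
            value = sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / size)
                        for k in range(size)).real / size
            if l * hop + m < length:
                out[l * hop + m] += value
    return out


N, HOP = 16, 8
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]
x = [math.sin(2 * math.pi * 3 * n / 64) + 0.5 * math.cos(2 * math.pi * 7 * n / 64)
     for n in range(64)]
frames = stft(x, hann, HOP)
y = istft(frames, HOP, len(x))
```

With these settings every sample covered by two frames is reconstructed exactly (up to rounding); the first and last half-windows are attenuated because only one frame covers them. Transformations such as time stretching would modify the frames or the synthesis hop before resynthesis.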
Roads, C. 1978. Granular Synthesis of Sound. Computer Music Journal.
"All sound is an integration of grains, of elementary sonic particles, of sonic quanta." -Xenakis (1971)
Helmuth’s example
Cadoz, C. 1979. Synthèse sonore par simulation des mécanismes vibratoires. Thèse.
Samson, P. R. 1980. A general-purpose digital synthesizer. Journal of the AES.
256 generators (waveform oscillators with several modes and controls, complete with amplitude and frequency envelope support).
128 modifiers (each of which could be a second-order filter, random-number generator, or amplitude modulator, among other functions).
64 Kwords of delay memory with 32 access ports, which could be used to construct large wavetables and delay lines.
A modifier could be combined with a delay port to construct a high-order comb filter or Schroeder allpass filter, the fundamental building blocks of digital reverberators.
Four digital-to-analog converters.
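The two reverberator building blocks named above can be sketched in software as follows. This uses the textbook difference equations (feedback comb: y[n] = x[n] + g·y[n-D]; Schroeder allpass: y[n] = -g·x[n] + x[n-D] + g·y[n-D]) and is not a description of how the Samson hardware implemented them. Function names and parameter values are assumptions for the example.

```python
def feedback_comb(x, delay, g):
    """Feedback comb filter: y[n] = x[n] + g * y[n - delay]."""
    y = []
    for n, sample in enumerate(x):
        delayed_out = y[n - delay] if n >= delay else 0.0
        y.append(sample + g * delayed_out)
    return y


def schroeder_allpass(x, delay, g):
    """Schroeder allpass: y[n] = -g * x[n] + x[n - delay] + g * y[n - delay]."""
    y = []
    for n, sample in enumerate(x):
        delayed_in = x[n - delay] if n >= delay else 0.0
        delayed_out = y[n - delay] if n >= delay else 0.0
        y.append(-g * sample + delayed_in + g * delayed_out)
    return y


# Impulse responses with a 3-sample delay and gain 0.5 (illustrative values).
impulse = [1.0] + [0.0] * 199
comb_ir = feedback_comb(impulse, 3, 0.5)
allpass_ir = schroeder_allpass(impulse, 3, 0.5)
allpass_energy = sum(s * s for s in allpass_ir)
```

The comb response is a train of echoes decaying by g every D samples. The allpass response has the same echo spacing but unit total energy, which is why it adds echo density without coloring the long-term spectrum.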
Karplus, K. and A. Strong. 1983. Digital synthesis of plucked-string and drum timbres. CMJ.
Plucked-string model
Jaffe, 1988: “Silicon Valley Breakdown”
Physical model of a flute
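A minimal sketch of the plucked-string idea: a delay line filled with noise is read in a loop, and each sample is replaced by the average of two adjacent samples, so high frequencies die away faster than low ones. This is a common simplified variant (the loop here computes y[n] = 0.5·(y[n-p] + y[n-p+1])), and the function name, sample rate and seed are assumptions for the example.

```python
import random


def karplus_strong(frequency_hz, duration_s, sample_rate=8000, rng=None):
    """Plucked-string tone from a noise burst recirculated through an averager."""
    rng = rng or random.Random(0)
    period = int(sample_rate / frequency_hz)  # delay-line length in samples
    # The initial noise burst plays the role of the pluck.
    line = [rng.uniform(-1.0, 1.0) for _ in range(period)]
    out = []
    for i in range(int(duration_s * sample_rate)):
        pos = i % period
        current = line[pos]
        out.append(current)
        # Two-point average acts as a gentle lowpass in the feedback loop.
        line[pos] = 0.5 * (current + line[(pos + 1) % period])
    return out


tone = karplus_strong(220.0, 0.5)
```

The pitch is set only by the delay-line length, so the tuning is quantized to integer periods in this sketch; finer tuning needs a fractional delay, which is left out here.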
Cope, D. 1987. An Expert System for Computer-assisted Composition. CMJ.
The EMI system is based on:
deconstruction (analyze and separate into parts)
signatures (commonality - retain that which signifies style)
compatibility (recombinancy - recombine into new works)
Examples: EMI Bach Invention; EMI Beethoven sonata; EMI Joplin music
Puckette, M. 1988. The Patcher. Proceedings of the ICMC.
Lindemann, E. et al. 1991. The Architecture of the IRCAM Musical Workstation. CMJ.
Feiten, B. and S. Guenzel. 1994. Automatic Indexing of a Sound Data Base using Self-Organizing Neural Nets. CMJ.
Music Information Retrieval
[Diagram elements: Soundfile; Low-level Descriptors extractors; LLD XML file; Music Description Extractors (Segmentation, Melody, Rhythm, Instrument); MusicD XML file; GUI-accessible functionalities: Content Navigation, Content Visualization, Content Search & Retrieval, Content-based Transformations.]
Historical references (1 of 2)
Hiller, L. and L. Isaacson. 1959. Experimental Music. McGraw-Hill Book Company, Inc.
Xenakis, I. 1963. Musiques Formelles. Revue Musicale n°253-254.
Mathews, M. 1969. The Technology of Computer Music. MIT Press.
Mathews, M. and R. Moore. 1970. Groove - A Program to Compose, Store, and Edit Functions of Time. Communications of the ACM 12:715.
Chowning, J. 1973. The Synthesis of Complex Audio Spectra by Means of Frequency Modulation. JAES 21(7):526-534.
Moorer, J. A. 1975. On the Segmentation and Analysis of Continuous Musical Sound by Digital Computer. Ph.D. thesis, Dept. of Computer Science, Stanford University.
Grey, J. M. 1975. An Exploration of Musical Timbre. Ph.D. thesis, Dept. of Psychology, Stanford University.
Moorer, J. A. 1978. The use of the phase vocoder in computer music applications. JAES 26(1/2):42-45.
Historical references (2 of 2)
Roads, C. 1978. Granular Synthesis of Sound. CMJ 2(2):61-62.
Cadoz, C. 1979. Synthèse sonore par simulation des mécanismes vibratoires. PhD thesis. Grenoble: I.N.P.
Samson, P. R. 1980. A general-purpose digital synthesizer. JAES 28(3):106-113.
Karplus, K. and A. Strong. 1983. Digital synthesis of plucked-string and drum timbres. CMJ 7(2):43-55.
Cope, D. 1987. An Expert System for Computer-assisted Composition. CMJ 11(4):30.
Puckette, M. 1988. The Patcher. Proceedings of the ICMC 1988.
Lindemann, E. et al. 1991. The Architecture of the IRCAM Musical Workstation. CMJ 15(3):41-49.
Feiten, B. and S. Guenzel. 1994. Automatic Indexing of a Sound Data Base using Self-Organizing Neural Nets. CMJ 18(3):53-65.
Current Research Trends
1. Sound synthesis and processing
2. Sound/Music description and understanding
3. Sound/Music interaction
Trends: Sound synthesis/processing
Tradition of sound synthesis
Synthesis with physical models
Digital implementation of an acoustic system
Sampling
Ex.: Fairlight (1980) Pioneered two innovations that transformed music making, namely sampling and sequencing.
Spectral processing
[Diagram elements: Original Sound; Spectral Analysis (Fourier Analysis); Original Spectrum; Feature Extraction; Original Feature; Transform.; Transformed Feature; Feature Addition; Transformed Spectrum; Spectral Synthesis; Transformed Sound.]
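Spectral representation
\[
\begin{aligned}
x(t) &= A_0 + \sum_{k=1}^{N} A_k \cos(2\pi f_k t + \phi_k) \\
     &= A_0 + \sum_{k=1}^{N} \mathrm{Re}\left\{ A_k e^{j(2\pi f_k t + \phi_k)} \right\} \\
     &= A_0 + \mathrm{Re} \sum_{k=1}^{N} A_k e^{j\phi_k} e^{j 2\pi f_k t} \\
     &= X_0 + \sum_{k=1}^{N} \left( \frac{X_k}{2} e^{j 2\pi f_k t} + \frac{X_k^*}{2} e^{-j 2\pi f_k t} \right)
\end{aligned}
\]
where \( X_k = A_k e^{j\phi_k} \) and \( X_0 = A_0 \).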
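The equivalence between the sum-of-cosines form and the complex-exponential form can be checked numerically. The sketch below evaluates both for a few partials; the partial amplitudes, frequencies and phases are arbitrary values chosen for the example, and the function names are hypothetical.

```python
import cmath
import math


def sum_of_cosines(t, a0, partials):
    """x(t) = A0 + sum_k A_k cos(2*pi*f_k*t + phi_k)."""
    return a0 + sum(a * math.cos(2 * math.pi * f * t + phi)
                    for a, f, phi in partials)


def sum_of_phasors(t, a0, partials):
    """x(t) = X0 + sum_k (X_k/2) e^{j2*pi*f_k*t} + (X_k*/2) e^{-j2*pi*f_k*t}."""
    total = complex(a0)
    for a, f, phi in partials:
        xk = a * cmath.exp(1j * phi)  # complex amplitude X_k = A_k e^{j phi_k}
        total += 0.5 * xk * cmath.exp(2j * math.pi * f * t)
        total += 0.5 * xk.conjugate() * cmath.exp(-2j * math.pi * f * t)
    return total


# (amplitude, frequency in Hz, phase in radians): illustrative values only.
partials = [(1.0, 220.0, 0.0), (0.5, 440.0, 0.3), (0.25, 660.0, -1.1)]
times = [0.0, 0.0013, 0.0101, 0.25]
real_form = [sum_of_cosines(t, 0.1, partials) for t in times]
complex_form = [sum_of_phasors(t, 0.1, partials) for t in times]
```

Each positive-frequency phasor is paired with its complex conjugate at the negative frequency, so the imaginary parts cancel and the complex form returns a real signal.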
Spectral analysis
Spectral transformations
[Figure: spectrogram of a vocal sound, original sound and transformation; amplitude (Amp) versus frequency (f) plot showing partials f_{i-1}, f_i, f_{i+1}, f_{i+2} and the desired timbre envelope.]
Audio mosaicing
Tristan Jehan, 2005
(Schwarz, 2007)
Concatenative synthesis
Instrument sonic space:
A: possible sounds produced by the instrument
B: sounds produced by the performer playing the instrument
recorded audio samples
Ex.: Vocaloid (Yamaha & MTG-UPF, 2005)
[Synthesizer diagram elements: Performance Score; performer model (performance trajectory generator, performance DB); Performance trajectory; sound rendering; Sound.]
Synthesis based on gestures
gestures generated by the system
Ex.: Violin synthesis
Synthesis/processing references
Serra, X. 1997. Musical sound modeling with sinusoids plus noise. In C. Roads, S. Pope, A. Piccialli, and G. De Poli, editors, Musical Signal Processing, pages 91-122. Swets & Zeitlinger Publishers, Lisse, the Netherlands.
Bonada, J. and X. Serra. 2007. Synthesis of the Singing Voice by Performance Sampling and Spectral Models. IEEE Signal Processing Magazine, 24(2):67-79.
Smith, J. O. 2006. Physical audio signal processing: for virtual musical instruments and digital audio effects. http://ccrma.stanford.edu/~jos/pasp/
Zölzer, U., editor. 2002. DAFX: Digital Audio Effects. John Wiley & Sons.
Rocchesso, D. and F. Fontana, editors. 2003. The Sounding Object. Edizioni di Mondo Estremo.
Schwarz, D. 2007. Corpus-Based Concatenative Synthesis. IEEE Signal Processing Magazine, 24(2):92-104.
International Conference on Digital Audio Effects, http://www.dafx.de/
Trends: Sound Description and Understanding
Taxonomy of musical features
(Lesaffre et al., 2003)
Audio content analysis
• Content: the implicit and explicit information that is related to a sound or a piece of music and that is embedded in the signal itself.
[Diagram: levels of abstraction between Signal and Content; content is either manually labelled or automatically extractable.]
• Goal: automatically describe and work with (search, edit, transform) audio data in a meaningful way.
Audio content classification
Levels of description
Low-level (signal-centered) descriptors: computed from the audio signal in a direct or derived (e.g. spectral analysis) way: average energy, spectral centroid, MFCCs, …
Mid-level (object-centered) descriptors: requiring an induction operation or data modeling: key, genre, instrument, …
High-level (user-centered) descriptors: requiring a user model: mood (e.g. happy, sad), …
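Two of the low-level descriptors named above, average energy and spectral centroid, can be computed directly from one frame of samples. The sketch below uses a naive DFT and defines the centroid as the magnitude-weighted mean frequency, which is one common definition among several; function names, frame length and sample rate are assumptions for the example.

```python
import cmath
import math


def average_energy(frame):
    """Mean of the squared sample values in the frame."""
    return sum(s * s for s in frame) / len(frame)


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 (naive DFT, fine for short frames)."""
    size = len(frame)
    return [
        abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size)
                for n in range(size)))
        for k in range(size // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_hz = sample_rate / len(frame)
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


# A 1000 Hz sine that falls exactly on a DFT bin (64 samples at 6400 Hz).
SAMPLE_RATE = 6400
frame = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(64)]
energy = average_energy(frame)
centroid = spectral_centroid(frame, SAMPLE_RATE)
```

For a pure sine the centroid sits at the sine frequency; for real sounds it moves up as more energy appears in the upper partials, which is why it is often used as a rough correlate of brightness.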
Facets of music content
[Diagram: Music Content Analysis, with facets Melody / Harmony, Timbre, Rhythm and Structure.]
Structure description
Partitioning the sound stream into homogeneous regions.
Detecting special roles for the segmented regions: intro, verse, chorus, bridge.
Other segments can also be identified: instrumental / singing; solo / ensemble; chords…
(Ong, 2006)
Structure description
(Ong, 2006)
Tonal description
Extract:
Melody (predominant melody or score)
Harmony (chords)
Key, modulations
Much research is related to automatic transcription of music (Klapuri PhD 2004):
Fundamental frequency / multipitch estimation (de Cheveigné)
Melody extraction (predominant pitch, note segmentation)
Still unsolved, even for monophonic signals.
Pitch class distribution of a piece
Mid- and high-level features -> apply a tonal model / musical analysis (Krumhansl, Leman, Temperley, …)
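One way to apply a tonal model to a pitch class distribution is template matching: correlate the distribution with a key profile rotated to each of the 12 tonics, in major and minor, and keep the best match. The sketch below assumes the input is a non-empty list of MIDI note numbers and uses the Krumhansl-Kessler probe-tone profiles as they are commonly cited; it is an illustration of the idea, not the method of any specific system mentioned here.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Krumhansl-Kessler key profiles (values as commonly cited in the literature).
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pitch_class_distribution(midi_notes):
    """Relative frequency of each of the 12 pitch classes (non-empty input)."""
    counts = [0.0] * 12
    for note in midi_notes:
        counts[note % 12] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def estimate_key(distribution):
    """Return (tonic name, mode) of the best-correlated rotated profile."""
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            score = correlation(distribution, rotated)
            if best is None or score > best[0]:
                best = (score, NOTE_NAMES[tonic], mode)
    return best[1], best[2]


# C major arpeggio as MIDI notes: C4, E4, G4, C5.
key = estimate_key(pitch_class_distribution([60, 64, 67, 72]))
```

With audio input, the note counts would be replaced by a chroma-like distribution estimated from the spectrum, which is where the unsolved transcription problems mentioned above come back in.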
Tonal description
Rhythm description Extraction of the metrical structure of a piece
(Gouyon, 2005)
Rhythm description
(Gouyon, 2005)
Semi-automatic annotation
Which tags should be used for the unannotated song?
[Figure: a collection of songs, each annotated with tags such as rock, pop, metal, heavy metal, thrash, death, gothic, indie, twee, noise pop, 80s, 90s, fast, loud, weird, quirky, edgy, fierce, cute, sweet, fun, playful, guitar, drums, concert; unannotated songs are marked “???”.]
Semi-automatic annotation
Tags are suggested using content-based similarity.
The user chooses the right ones from this limited set.
[Figure: the same collection of tagged songs; the unannotated songs now show suggested tags (for example rock, fast, quirky, weird, cute, drums) taken from their most similar annotated neighbours.]
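The tag suggestion step can be sketched as a nearest-neighbour vote: find the annotated songs whose audio descriptors are closest to those of the unannotated song, pool their tags, and offer the most frequent ones to the user. The feature vectors, the Euclidean distance, the song data and the function name suggest_tags are all assumptions made for this example.

```python
import math
from collections import Counter


def suggest_tags(query_features, annotated_songs, k=3, max_tags=5):
    """Suggest the most common tags among the k most similar annotated songs."""
    ranked = sorted(
        annotated_songs,
        key=lambda song: math.dist(query_features, song["features"]),
    )
    votes = Counter(tag for song in ranked[:k] for tag in song["tags"])
    return [tag for tag, _count in votes.most_common(max_tags)]


# Hypothetical two-dimensional descriptors (e.g. loudness, brightness).
collection = [
    {"features": (0.9, 0.8), "tags": ["metal", "loud", "thrash"]},
    {"features": (0.8, 0.9), "tags": ["metal", "loud", "90s"]},
    {"features": (0.1, 0.2), "tags": ["pop", "cute"]},
    {"features": (0.2, 0.1), "tags": ["pop", "sweet", "90s"]},
]
suggestions = suggest_tags((0.85, 0.85), collection, k=2, max_tags=2)
```

The output is only a shortlist; the final choice stays with the user, which is what makes the annotation semi-automatic.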
Cover detection
Research in:
Descriptor sequences
Descriptor similarity
Sequence alignment
Applications: audio segmentation, mood classification, perceptually based descriptor similarity measures, song hierarchies, visualization, sequence prediction, rights management...
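Sequence alignment lets two descriptor sequences be compared even when one version is played at a different tempo. The sketch below uses dynamic time warping (DTW) as a simple stand-in; the slides do not say which alignment method is used, and the toy sequences, the absolute-difference cost and the function name are assumptions for the example.

```python
def dtw_distance(a, b, cost=lambda x, y: abs(x - y)):
    """Dynamic time warping distance between two descriptor sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(a[i - 1], b[j - 1])
            d[i][j] = step + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Toy descriptor sequences: an original, a half-tempo cover, an unrelated song.
original = [0, 2, 4, 5, 7]
slow_cover = [0, 0, 2, 2, 4, 4, 5, 5, 7, 7]
other_song = [7, 5, 4, 2, 0]
cover_distance = dtw_distance(original, slow_cover)
other_distance = dtw_distance(original, other_song)
```

Real cover detection also has to cope with transposition and changed structure, so the per-frame cost would compare tonal descriptors in a key-invariant way rather than raw numbers.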
Cover detection
Towards semantic descriptors
Music complexity
Genre
Mood
????
Music complexity
Acoustic complexity: loudness fluctuations
Timbre complexity
Rhythm complexity -> “Danceability” descriptors
Tonal complexity
Ex.: Music retrieval
Efficient management of sound archives, music retrieval, …
MusicSurfer
Ex.: Personal annotation
Good Vibrations, a Winamp plugin for building “personomies” and automatically annotating collections (Sandvold, Celma, Herrera, 2005)
Ex.: Music recommendation
(Celma, 2006)
Music description references
Orio, N. 2006. Music Retrieval: A Tutorial and Review. Foundations and Trends in Information Retrieval, 1(1):1-90.
Casey, M. et al. 2008. Content-Based Music Information Retrieval: Current Directions and Future Challenges. Proceedings of the IEEE, April 2008.
International Conference on Music Information Retrieval, http://www.ismir.net
Trends: Music performance/interaction
Modeling performance
[Diagram labels: obtain High-quality recordings; encode in a Machine representation; analyze Expressive aspects of recordings and Structure of pieces; extract a Symbolic description; Machine learning; Models; Synthesized score; Automatic expression.]
New musical instruments
Ex.: Reactable, a musical instrument based on a tangible interface
Performance/interaction references
Gabrielsson, A. 2003. Music Performance Research at the Millennium. Psychology of Music, 31(3):221-272.
Widmer, G. and W. Goebl. 2004. Computational Models of Expressive Music Performance: The State of the Art. JNMR 33(3):203-216.
Jordà, S. 2005. Digital Lutherie: Crafting musical computers for new musics performance and improvisation. PhD thesis, Pompeu Fabra University, Barcelona.
Camurri, A. et al. 2000. Expressiveness and physicality in interaction. JNMR 29(3).
International Conference on New Interfaces for Musical Expression, http://www.nime.org/
Future Challenges in SMC
Bridging the semantic gap
[Figure: Music information plane. Media: Audio (music recordings), Text (lyrics, editorial text, press releases, …), Image (video clips, CD covers, printed scores, …). Levels: Signal features, Content Objects, Human Knowledge, with the semantic gap marked. Labels in the plane include loudness, time, timbre, spectrum, intensity, pitch, duration, frequency, melody, harmony, rhythm, dynamics, genre, source, similarity, music scores, semantic features, labels, tags, sentences, nouns, verbs, adjectives, articles, numbers, signs, scenes, shot rhythm, motions, contrasts, textures, colors, shapes, graphic style, expectations, memories, understanding, personal identity, opinions, emotions.]
Bridging the semantic gap
[Figure: the same music information plane (Audio, Text, Image; Signal features, Content Objects, Human Knowledge; semantic gap), now overlaid with the approaches that address the lower levels: Signal processing, Statistical modeling, Machine learning, Music theory, Web mining.]
Bridging the semantic gap
[Figure: the same music information plane (Audio, Text, Image; Signal features, Content Objects, Human Knowledge; semantic gap), overlaid with the full set of approaches: Signal processing, Statistical modeling, Machine learning, Music theory, Web mining, Computational musicology, Music cognition, Reasoning rules, Text understanding, Multimodal processing, Ontologies, Computational neuroscience.]
Modeling music making
(Leman, 2007)
Modeling music communication
[Diagram: COMPOSITION (Scores A to D), PERFORMANCE (Performers A to D with Instruments A to D) and AUDIENCE (individuals 1 to 3), connected by four channels.]
Compositional channel: musical message + role in the performance (solo, accompanist, etc.)
Instrumental channel: sound-producing and modifying movements / actions, haptic feedback
Sonic channel
Visual channel
Modeling social interaction
simple structure of a social network
Concluding remarks
Definition of the field: Sound and Music Computing research approaches the whole sound and music communication chain from a multidisciplinary point of view. By combining scientific, technological and artistic methodologies it aims at understanding, modeling and generating sound and music through computational approaches.
from http://smcnetwork.org/roadmap
Thanks!!
Xavier Serra Universitat Pompeu Fabra Barcelona, Spain
[email protected]
Stefan Bilbao University of Edinburgh, UK
[email protected]