In the Proceedings of ECMAST'98, 3rd European Conference on Multimedia Applications, Services and Techniques, Berlin, Germany, 26-28 May 1998.
Design and Implementation of an ATM based Distributed Musical Rehearsal Studio*

Dimitri Konstantas (1), Yann Orlarey (2), Simon Gibbs (3) and Olivier Carbonel (2)

(1) University of Geneva - Centre Universitaire d'Informatique, 24 rue General-Dufour, CH-1211 Geneva 4, Switzerland, e-mail: [email protected]
(2) GRAME, 6, quai Jean Moulin, F-69202 Lyon, France, e-mail: [email protected], [email protected]
(3) GMD, IMK.VMSD, Digital Media Lab, Schloss Birlinghoven, D-53724 Sankt Augustin, Germany, e-mail: [email protected]

* This work was supported by the Swiss Federal Government with the OFES grant No. 95.0493 and by the Commission of the European Union in the frame of the ACTS project DVP (AC 089).
Abstract. A Distributed Musical Rehearsal is one of the most demanding telepresence applications, in terms of video and audio quality and transmission latency. In the frame of the EU ACTS project Distributed Video Production (DVP) we developed an ATM based environment for the organization of Distributed Musical Rehearsals. In this paper we describe the technical specifications of the installations and the organization and studio set-up of the distributed musical rehearsal environment, present our implementation of the environment and give the results obtained from the organized distributed rehearsal trials.
1 Introduction
The recent technological advances in digital media and communication networks are rapidly changing media production environments. Media producers are now able to collaborate remotely on media productions using digital video technology and broadband communication networks. The EU ACTS Distributed Video Production (DVP) project [1] aims to take existing broadband technology, mainly ATM, as a starting point and to develop services (applications) on top of ATM that directly address the needs of the television industry. Several broadcasters are included in the consortium, providing user input and participating in field trials. One of the four pilot applications of the DVP project is the Distributed Rehearsal, which aims at developing a studio-based teleconferencing service enabling small groups of geographically separated actors and musicians to conduct rehearsals as if face-to-face. The target is to produce a natural immersive teleconference environment allowing several people to work together as if present in the same (rehearsal) room. Musical rehearsal, being one of the most demanding teleconference applications, was chosen as the pilot application for the development of the teleconferencing service. Towards this target a two-site set-up was implemented. The first site is installed at the German
Research Center for Information Technologies (GMD) at Sankt Augustin, just outside Bonn, and the second site at the Centre Universitaire d'Informatique of the University of Geneva. A number of tests and rehearsal trials were performed, aiming at testing and fine-tuning the equipment and at measuring the qualitative and quantitative performance of the system, based on a methodology developed within the project. In this paper we present an overview of the distributed rehearsal environment and describe the trials performed. In Section 2 we give an overview of the functional requirements for the organization of a distributed rehearsal, while in Section 3 we present the environment. Section 4 describes the rehearsal trials we organized and discusses related issues and problems; Section 5 outlines the evaluation results and in Section 6 we present our conclusions.
2 Functional Specifications for a Distributed Musical Rehearsal
The target of a musical rehearsal is to prepare the musicians and the conductor for the final performance. The rehearsals of a modern music piece, leading from the first meeting of the musicians and conductor to its public performance, are organized, in general, into three major phases:
1. During the first phase, which we will call the protocol phase, the musicians and the conductor get acquainted with the musical piece. The conductor explains to the musicians the different notations of the musical piece and the gestures that he will be using to direct the orchestra.
2. During the second phase, which we will call the rhythm phase, the musical piece is played in successive fragments, with the tempo slowed down in the delicate parts. The musical piece is played "out of time", linearly, so that the musicians can get a first notion of the complete musical piece. Once this notion has been obtained, it is possible to start more detailed work on the different fragments.
3. Finally, the third phase, which we will call the sound phase, deals with the sound considerations of the rehearsed musical piece. The sound level and quality are fine-tuned and possible problems are resolved.
The protocol phase is in general short (30 to 60 minutes), while the rhythm and sound phases are very long (hundreds of hours). It must be noted that the three phases can be (and in general are) intermixed, and musicians and conductor can go from one phase to another according to the needs and problems faced during the rehearsal. It is also common for the musical piece to be partitioned into smaller fragments, each of which is rehearsed independently with alternating protocol, rhythm and sound phases. Other types of musical rehearsals (classical, jazz, etc.) might have different organizational phases, but the requirements are practically the same for all of them. During a rehearsal (and of course a performance) the conductor coordinates the musicians using body language. That is, in addition to the hand movements for keeping the rhythm, the conductor expresses the strength of the music with the way he breathes and/or the expression of his face, gives the starting or preparation signal to a musician with a quick glance and/or a movement of his head, etc. The musicians constantly observe the conductor and follow his signals while, in parallel, they listen to the music of the other musicians, synchronizing and controlling their own tempo.
For his part, the conductor also constantly observes the musicians, verifying, for example, whether a musician is ready (has taken the appropriate position) to start playing, and listens to the intensity and rhythm of the individual instruments and of the overall orchestra. In synchronizing the musicians the conductor takes into consideration the layout (distance, position) of the musical instruments in the room as well as the speed of the musicians' reflexes.

2.1 Two-site Distributed Musical Rehearsal
In a localized rehearsal the musicians and the conductor are physically in the same room. In a distributed rehearsal the musicians and the conductor will be distributed over two or more sites. The simplest distribution of the rehearsal is when we have only two sites. In this case we have two alternatives for the distribution of musicians and conductor: we can have the musicians at one site and the conductor at the other, or we can have part of the musicians and the conductor at one site and the rest of the musicians at the other site. Since a distributed rehearsal using telepresence techniques is a novel experiment, for the definition of the functional requirements we considered the two-site case where the conductor is at one site and the musicians are at the other site.

Distributed Rehearsal Requirements. In general the requirements for organizing a distributed musical rehearsal are those of a telepresence session. That means that the basic need is the existence of (at least) a video wall, with minimum dimensions of 2x3 meters and high resolution video projection, and at least a hi-fi audio system. The goal is to give the rehearsal participants the impression that they are physically in the same room. However a musical rehearsal not only needs higher audio quality (at least 20 Hz to 22 kHz) but in addition requires an accurate 3D restitution of the sound space and very low transmission latency. This is due to the fact that the conductor also needs to be able to identify the exact position of each musical instrument and to synchronize as accurately as possible with the musicians. It is clear that the requirements of the three rehearsal phases, in terms of audio and image quality as well as synchronization and delays, are different. We thus need to define up to which point the three phases of the localized rehearsal can be distributed, and under what environment characteristics each rehearsal phase is possible.
• During the protocol phase the conductor and the musicians communicate orally, explaining gestures and going over the musical piece. Thus the requirements for audio and video are relatively low, at the level of a simple teleconference. No special needs for tight synchronization or high quality of audio and video exist.
• During the rhythm phase the musical piece is traversed in "slow motion" so that the musicians and the conductor can become familiar with its tempo and divisions. Thus good quality of sound and image is required, as well as relatively good synchronization of audio and video. However the transmission delays can be rather high (at the level of 200 ms), since the conductor synchronizes the musicians in a rough way.
• During the sound phase the musical piece is performed as originally composed. The requirements in audio quality and audio-video synchronization are increasingly high. In addition the transmission delays must be as small as possible and in no case greater than 80 to 100 ms. This delay corresponds to a distance of about 30 meters between the conductor and musicians, which is not uncommon for large orchestras (a quick check of this correspondence is sketched below).
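The correspondence between the latency budget and a physical distance in a hall can be checked with a back-of-the-envelope calculation. The sketch below is only illustrative: the 80-100 ms figures come from the requirement above, while the speed of sound and the rounding are our assumptions.

```python
# Relating a one-way latency budget to the distance at which direct sound
# would arrive with the same delay (illustrative check, not project data).
SPEED_OF_SOUND_M_S = 343.0   # approximate speed of sound in air at 20 degrees C

def equivalent_distance_m(one_way_delay_ms: float) -> float:
    """Distance covered by sound in the given one-way delay."""
    return SPEED_OF_SOUND_M_S * (one_way_delay_ms / 1000.0)

for delay_ms in (80, 100):
    print(f"{delay_ms} ms one-way delay corresponds to ~{equivalent_distance_m(delay_ms):.0f} m")
# 80 ms -> ~27 m, 100 ms -> ~34 m: consistent with the ~30 m depth of a large orchestra.
```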
It is clear that the rehearsal phases have increasing technical requirements, with the sound phase having the strictest ones. Thus the stricter the requirements a distributed rehearsal environment fulfills, the more useful it will be for the organization of distributed rehearsals. Since, on the other hand, the three rehearsal phases will in general be intermixed, an environment satisfying the requirements of only the first two phases might not be useful for more than one rehearsal.

Transmission Delay Requirements. During the rehearsal the conductor and the musicians must be able to perceive each other, in terms of vision and sound, as if they were in the same room. This means that the conductor should be able to see/hear the reaction to his gestures from the musicians with a maximum delay of about 160 ms. This corresponds to a one-way transmission delay of 80 ms.

Video Quality Requirements. Since conducting an orchestra is done using body language, the video quality should be high enough to allow the musicians to correctly see and interpret the conductor's signals. That means that we must have high resolution video and that the musicians should even be able to see where the conductor is looking. The conductor, on the other hand, should be able to correctly identify the position of each musician, as if they were physically in the same room. Thus the projected video should preserve the natural dimensions and perspective of the orchestra and conductor. Fortunately the conductor and musicians do not change position during the rehearsal, meaning that we can statically calculate the right perspective for the projected video.

Audio Quality Requirements. The exact reproduction of the music is of major importance for the evolution of the rehearsal. Furthermore, since the conductor needs to be able to identify the exact location of an instrument, the sound space also needs to be reproduced. This means that the microphones not only need to be of high quality but must also be placed at carefully studied points, so that the reproduction of the sound space can be as accurate as possible.

Rehearsal Studio. Of major importance for the distributed rehearsal application is the organization of the rehearsal studios and the layout of the equipment required for the capture and reproduction of image and sound. The target is that at the remote site the image and sound are reproduced as accurately as possible.

Video Capture and Reproduction. In a localized rehearsal the musicians and the conductor are physically in the same room. In the distributed rehearsal the musicians and the conductor will be logically in the same room. That means that the video images will be projected on a large screen (2x3 m) (video wall) which, for the two peers, will act as if a frame were placed between them in the rehearsal studio (Fig. 1, Fig. 2). Obviously the luminosity of the video wall should be strong enough so that both sites can have sufficient ambient light. Since the perspective of the positions of the musicians and the conductor should be preserved, video capture should be made from a position between the screen and the musicians/conductor. Thus we cannot use large cameras, which would hide the screen, but must use non-intrusive miniature cameras.
Fig. 1. Conceptual view of the rehearsal
Fig. 2. Physical partitioning and layout of the rehearsal
Furthermore, a perspective correction of the video might be required, depending on the characteristics of the camera, so that the physical proportions of the participants are accurately reproduced. Needless to say, we must have at least 25 frames per second for the transmitted video.

Audio Capture and Reproduction. For a distributed musical rehearsal the audio quality is the prime factor for its success. We must not only reproduce the sound in high quality but also preserve the depth and direction information [4]. The conductor should be able to understand the direction of each sound produced by the musicians. Thus the major issue is capturing the sound so that it can be reproduced as accurately as possible. One method for capturing the sound is to place a microphone in front of each musician. However this suppresses all information regarding the placement of the musicians, and in order to correctly reproduce the music for the conductor we must use specialized virtual reality software, like Le Spatialisateur [6] or the Interactor [5], to recreate the spatial information of the sound. A second method is to use a small number of microphones (2 to 4, depending on the configuration) that capture the sound along with all its spatial information. For the case where the conductor is at one site and the musicians at the other site, the calculations are relatively simple and two microphones placed at the right positions will be sufficient [11]. For more "complex" configurations (more sites, conductor sharing a site with part of the musicians) this solution might not be adequate and virtual reality techniques will need to be employed. However we did not study these techniques since they were out of the scope of the project. A third method is to use a dummy head, placed approximately where the conductor's head would be if he were physically present in the room, with microphones capturing the sound at exactly the position of the ears, with all the vibrations and interferences that would have been captured by a person's ears. The sound space needs to be reproduced accurately only for the conductor. For the musicians the only sound source is the conductor, who is at a specific place in front of them, and thus a speaker behind the screen will be sufficient. For the reproduction of the sound at the conductor's site we have two possibilities: the first is to use headphones and the second to use loud-speakers. Using headphones has the advantage that, with the use of virtual reality technology [6][5], we can correctly reproduce the sound with all its spatial information [8]. However this solution requires the installation of sensors able to capture the position and orientation of the conductor's head. The use of loud-speakers simplifies the audio installation and provides a non-intrusive solution, but it has the disadvantage that, if no virtual reality techniques and no tracking of the conductor are used, the spatial information will be correctly reproduced only
at one specific point and the conductor will have a reduced localization capability.
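To make the role of the capture geometry more concrete, the following minimal sketch (not part of the trial software; the 17 cm spacing and the far-field approximation are our assumptions) estimates the inter-microphone time difference on which a two-microphone or dummy-head pickup relies to preserve the direction of a source:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air at room temperature (assumed)
MIC_SPACING = 0.17       # m, roughly the ear spacing of a dummy head (assumed)

def inter_mic_delay_ms(azimuth_deg: float) -> float:
    """Extra travel time to the far microphone for a distant source at the given azimuth
    (far-field, plane-wave approximation)."""
    return MIC_SPACING * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND * 1000.0

for angle in (0, 30, 60, 90):
    print(f"source at {angle:2d} degrees -> ~{inter_mic_delay_ms(angle):.2f} ms inter-microphone delay")
```

Differences of a few tenths of a millisecond are enough for a listener to localize a source, which is why the placement of the two microphones (or of the dummy head) matters far more than their number.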
2.2 Multi-site Distributed Rehearsal
In a multi-site distributed rehearsal, or in a two-site set-up with one group of musicians and the conductor at one site and a second group of musicians at the other site, the requirements become more complex. First, the logical layout of the orchestra needs to be defined in a way that makes a physical representation possible. That means that if, for example, we have a three-site set-up, the two musician sites will each need two projection screens, one in front of them for the conductor and one on the side for the second group of musicians. For the conductor the images of the two sites will be merged into one in order to give the impression of a complete orchestra. However the most important challenge in multi-site set-ups is the synchronization of the music played by the musicians. Assuming we have even a very small delay in the transmission of the video and audio, for example 20 ms, it is impossible for the music coming from all sites to be synchronized at more than one site. In the three-site example, if the conductor is listening to the synchronized music from the two musician sites, then each musician site will hear the other site with a delay of 20 ms. Thus the musicians will not be able to synchronize among themselves. This of course does not mean that it is not possible to conduct a rehearsal, but that a different way of organizing and conducting the rehearsal will be needed.
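The following minimal timing sketch illustrates this point for the three-site example. The 20 ms one-way delay is the value used above; the assumptions that every link has the same delay and that each musician site plays exactly when the conductor's gesture arrives are ours.

```python
# Simplified timing model of the three-site example: the conductor beats at t = 0 ms,
# each musician site plays the moment the gesture arrives, and sound travels back
# over the network with the same one-way delay on every link (an assumption).
ONE_WAY_MS = 20

play_time = {"site A": ONE_WAY_MS, "site B": ONE_WAY_MS}    # when each site plays (ms)

# At the conductor's site both streams arrive together, i.e. mutually synchronized.
heard_by_conductor = {site: t + ONE_WAY_MS for site, t in play_time.items()}
print("conductor hears:", heard_by_conductor)                # {'site A': 40, 'site B': 40}

# At site A: its own playing at 20 ms, site B's playing 20 ms later.
heard_by_site_a = {"own playing": play_time["site A"],
                   "site B": play_time["site B"] + ONE_WAY_MS}
print("site A hears:   ", heard_by_site_a)                   # {'own playing': 20, 'site B': 40}
```

The music is thus aligned only at the conductor's site; each musician site hears the other offset by the one-way delay, which is why a different way of organizing the rehearsal is needed.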
3 Overview of a Distributed Rehearsal Studio Installation
Based on the above requirements we installed two distributed rehearsal studios, one at the University of Geneva and one at GMD. Although the two studios are not identical, the basic technology used (encoding, video/audio capture and reproduction, etc.) is the same. The major difference between the two studios is that the GMD studio has far greater processing power and can integrate virtual studio techniques in the video production. In the next sections we will concentrate on the infrastructure that is directly related to the distributed rehearsal environment. More details regarding the virtual studio installations at GMD can be found in [3][7]. A diagram of the studio set-up and network installations is given in Fig. 3.

Video Capture and Reproduction. For the video capture, non-intrusive micro-cameras (Panasonic) as well as small consumer digital cameras (SONY DCR-VX1000E) were used. The projection was made using two tri-tube low luminosity projectors (SONY - 230 ANSI lumen) at GMD and a high luminosity light-valve projector (BARCO 8100 - 800 ANSI lumen) at CUI. The video walls used standard medium quality screens with a size of 2x2.6 meters.

Audio Capture and Reproduction. To accurately reproduce the sound of the orchestra at the conductor's site, two different sound capture systems were tested: a dummy head and a dual-microphone set-up, both combined with a matrix for the correct 3D reproduction of the sound.
Fig. 3. Distributed rehearsal studio installation: CUI and GMD sites, each with AVA-300/ATV-300 codecs, video switch, audio mixer/matrix and video wall, interconnected over the Swiss and German PTT ATM network (24 Mbps) through a DEC GigaSwitch and Fore ATM switches, with an HP ATM traffic analyser for control and monitoring.

Video and Audio Encoding and Transmission. The video and audio were digitally encoded and transmitted over ATM lines. The codecs used were the FORE StreamRunner AVA/ATV (formerly named Nemesys AVA-300, ATV-300). The video was encoded as an MJPEG stream and the audio was digitized in DAT quality (48 kHz sampling). The bandwidth used for the transmission of the video (non-interlaced, PAL 25 fps) was between 12 and 14 Mbps (depending on the image complexity), and for the audio (DAT stereo) 1.5 Mbps. Initially the video encoding-decoding delay was 70 ms while the transmission delay was 11 ms (which makes a total delay of 81 ms). However, experimentation with the codecs allowed us to reduce the video encoding-decoding delay to 46 ms (by sending only even fields). The audio encoding-decoding delay, on the other hand, was 6 ms, and we introduced a 20 ms buffering in order to eliminate buffer underflows producing an annoying clicking in the audio. Thus the total audio delay was 31 ms. It should be noted that the total bandwidth required for interlaced PAL video at 25 fps was around 29 Mbps, while the available ATM bandwidth we had was 24 Mbps. It was for this reason that we transmitted non-interlaced video (that is, one field per frame), which required half the bandwidth. In total (audio + video) we used a bandwidth of approximately 15.5 Mbps. The encoded audio and video were transmitted over AAL5, while the codecs are able to work with cell losses reaching up to 25% (which of course creates some artifacts in the reproduced video and audio). The ATM switching was done using a DEC GigaSwitch and Fore switches, while the control and ATM traffic analysis was done using a SUN Sparc station and an HP ATM traffic analyzer. The ATM connection from the University of Geneva to GMD was done using PVCs. It is interesting to note that we were unable to use SVCs due to incompatibilities in the UNI3 signaling protocols between the DEC and Fore switches.
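A quick consistency check of these figures is sketched below; the 48 kHz stereo audio, the 46 ms and 11 ms delays and the 80 ms target come from the text, while the 16-bit sample size and uncompressed audio transport are our assumptions.

```python
# Back-of-the-envelope check of the bandwidth and delay figures quoted above
# (illustrative only; 16-bit samples and uncompressed audio transport are assumed).
TARGET_ONE_WAY_MS = 80                       # requirement from Section 2.1

audio_mbps = 48_000 * 16 * 2 / 1e6           # sample rate * bits per sample * channels
print(f"DAT stereo bitrate ~ {audio_mbps:.2f} Mbps")          # ~1.54 Mbps, close to the 1.5 Mbps used

video_one_way_ms = 46 + 11                   # MJPEG encode/decode (even fields) + ATM transmission
within = "within" if video_one_way_ms <= TARGET_ONE_WAY_MS else "over"
print(f"video one-way delay ~ {video_one_way_ms} ms ({within} the {TARGET_ONE_WAY_MS} ms budget)")
```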
The choice of MJPEG encoding over other standards, like MPEG or H.261, was made for several reasons. The most important was that all existing MPEG codecs had (and still have) encoding delays of more than 200 ms, whereas we require at most 80 ms. Some companies have announced MPEG codecs with an I-frame encoding delay of less than 60 ms, which however are expected to appear on the market in mid-1998. A second reason for choosing the specific MJPEG codecs was their lower cost: existing MPEG codecs were (and still are!) at least one order of magnitude more expensive. Other low bandwidth encoding standards like H.261 were rejected from the start due to their low audio/video quality. It should be noted that for our application the major consideration was not to reduce network bandwidth but to achieve high audio/video quality.
4 Distributed Rehearsal Trials
Of the several tests and trials performed, the most characteristic ones were a distributed singing rehearsal and the two distributed musical rehearsals [2]. All trials were organized between GMD (Sankt Augustin) and the University of Geneva. The singing rehearsal took place on the 30th of May 1996 and was an early trial, where instead of the video wall at CUI we used a large (117 cm) television screen. The two full scale distributed musical rehearsal trials were organized on November 15, 1996 and May 30, 1997 with the GRAME EOC orchestra. The total duration of each rehearsal was 6 hours. In the November 1996 rehearsal trial objective evaluation tests were performed, while in the May 1997 rehearsal trial designation tests were organized.

4.1 Distributed Singing Rehearsal
On the 30th of May 1996 we organized a distributed singing rehearsal trial of a duet with piano. The pianist (L. Baghdassarian) and one singer (L. Dami) were at CUI - Geneva (Fig. 4.) while the second singer (F. Meylan) was at GMD - Bonn. The songs played were extracts from Handel's "Israel in Egypt" and Britten's "Abraham and Isaac". The total duration of the rehearsal was 2 hours. Because no spatial sound information was needed for this rehearsal, monophonic audio channels were used for the audio transmission. The audio sent from CUI to GMD was the mix of the piano and singer signals, while the audio from GMD to CUI was the signal of the singer. It should be noted that the early version of the codecs' control software used did not allow any control over the audio delay and buffering. As a result the audio delay was the same as the video delay, namely around 80 ms.

Issues of the Distributed Singing Rehearsal. The goal of the distributed singing rehearsal was, on the one hand, to give us a first idea of the technical problems and issues related to the organization and set-up of the rehearsal and, on the other hand, to give a first subjective appreciation of the feasibility and limitations of a distributed rehearsal. The most important issue in the singing rehearsal trial was the synchronization of the singers. The problem was that we had a delay of about 80 ms (one way) in the transmission of sound and image, and thus it was impossible for the singers to be synchronized at both sites. If at one site the local singer was synchronized with the remote singer, then the remote singer would perceive his peer with a delay of 160 ms.
Fig. 4. Singing Rehearsal - CUI set-up
Fig. 5. Singing Rehearsal - Video from GMD
The way to resolve this problem is to use the notion of a central point where the singing is synchronized. This can be either of the two sites or even a third site. In fact singing under this type of situation, that is, with long delays and synchronization of the audio at a central point, appeared and was mastered during the Renaissance in the Cori Spezzati style, where, in a large church, the conductor stood in the middle of the church and the singers were distributed all around on the church's balconies. Due to the size of the church the sound delay between the singers could reach 200 ms or more (60 m distance), and the only point where the sound was synchronized was the center of the church, where the conductor was standing. Of course the difference between the Cori Spezzati and the distributed rehearsal is that in the latter we not only have a delay in the sound but also in the image. Thus the singers must anticipate the conductor (or each other). Nevertheless a similar case is faced in operas when singers sing behind the stage, invisible to the conductor. The singers behind the stage have to anticipate the conductor in order to synchronize with the singers on stage. The degree to which synchronization can be achieved in the presence of delays also depends on the musical piece played. In the distributed singing rehearsal two different songs were rehearsed: one by Handel (from Israel in Egypt) and one by Britten (from Abraham and Isaac). The Handel piece, having a regular rhythm, made it easier to anticipate the remote singer. The Britten piece, on the other hand, did not have a regular rhythm, being free time music, and thus it was more difficult to anticipate the remote singer and synchronize. However in both cases, after some trial and error, the singers managed to synchronize their singing. In the distributed singing rehearsal set-up at CUI, the image received and displayed was a mix of the GMD and CUI singers (Fig. 5.). The two images were mixed at GMD from their local video and the signal transmitted from CUI, and sent back to CUI. As a result the image of the CUI singer that was projected at CUI had a delay of about 160 ms. This confused the CUI singer, since he was seeing himself with quite a long delay. An important technical problem we faced in the distributed singing rehearsal was the fine tuning of the audio signals to fit the acoustics of the studios. In the distributed singing rehearsal set-up the sound from the remote site was reproduced using loud-speakers. This way the microphone of the singer also captured the reproduced sound of the remote singer, which was then transmitted back to the remote site.
As a result each singer heard the remote singer and his own voice delayed by approximately 160 ms. These artifacts and echo were extremely confusing for the singers. Since echo cancellers could not be used in this case, we adjusted the gain of the microphones and the volume of the loud-speakers in order to eliminate the effect.

4.2 November 1996 Distributed Musical Rehearsal
The piece retained for the first distributed musical rehearsal trial, "Dérives", was composed in 1984 by Pierre Boulez for six instruments: piano, vibraphone, violin, cello, flute and clarinet. The musicians were installed in Geneva while the conductor was in Germany. A dummy head was placed approximately where the conductor's head would be if he were physically present in the rehearsal room (Fig. 6.). The distributed musical rehearsal trial was organized into four phases. The first phase consisted of tuning the equipment for the correct capture and reproduction of the sound. The second phase was dedicated to the quantitative quality control of the installation. The third phase was the main part of the trial, where the orchestra rehearsed the retained musical piece (Fig. 7.). Finally, in the fourth phase the musicians and the conductor were interviewed independently in order to obtain a subjective measurement of the distributed rehearsal environment.

Evaluation Methodology. A very important issue for the distributed rehearsal system is the measurement of its quality and, in consequence, of the limits of its usability. The quality of the system can be measured subjectively and objectively. Although subjective measurements are very important for the users of the system, we needed objective measurements in order to evaluate different options and technology choices. For this reason we developed within the project a methodology for the objective measurement of the system quality [9]. The methodology is based on the fact that in a musical rehearsal the conductor must control, identify and possibly correct notes that come out wrong in the musicians' parts. During the second phase of the trial, specific scores were given to the musicians and the conductor. The scores given to the musicians contained various errors (like time errors, pitch errors, dynamic errors, etc.) when compared to the score given to the conductor. The errors ranged from very easy to detect to very difficult. The conductor was then asked to detect the errors in the musicians' scores.
Fig. 6. November 1996 Distributed Musical Rehearsal - Conductor’s view/dummy head
Fig. 7. November 1996 Distributed Musical Rehearsal - CUI set-up
Fig. 8. May 1997 Distributed Musical Rehearsal - Conductor’s view
Fig. 9. May 1997 Distributed Musical Rehearsal - Conductor at GMD
By reproducing the same test in a local situation with different but equivalent test scores, and by comparing the errors found in the distributed rehearsal with those found in the local rehearsal, we were able to establish a concrete measurement of the system quality. It should be noted that the rehearsed piece "Dérives" was later performed in concert without additional rehearsals, a fact which indicates that the distributed rehearsal was effective for the preparation of the musicians and the conductor.
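As a purely illustrative example of how such a comparison can be scored, the sketch below uses invented counts of planted and detected errors; the actual test scores and results are those reported in [9] and summarized in Section 5.1.

```python
# Hypothetical scoring of the error-detection comparison described above.
# The counts are invented for the example; the real figures are in [9].
planted_errors = 20
detected = {"localized rehearsal": 18, "distributed rehearsal": 8}

for setting, found in detected.items():
    print(f"{setting:22s}: {found}/{planted_errors} errors detected ({found / planted_errors:.0%})")

relative_quality = detected["distributed rehearsal"] / detected["localized rehearsal"]
print(f"distributed relative to localized: {relative_quality:.0%}")
```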
4.3 May 1997 Distributed Musical Rehearsal
A second distributed musical rehearsal was organized on May 30, 1997. In this rehearsal the retained piece was H.P. Platz's "Piece Noire", for 12 musicians (Fig. 8., Fig. 9.). The total duration of the rehearsal was 6 hours. In contrast with the first distributed musical rehearsal, which had a strong research element, this was a fully professional rehearsal, part of the rehearsal series for the performance of the piece. Based on the observations and analysis of the first trial, a number of improvements were made to the system. The most important ones were the use of a higher quality dummy head (Fig. 10.), improving the acoustic discrimination capability of the conductor, and a more detailed perspective correction of the projected video at both sites.
Fig. 10. May 30, 1997 Distributed Musical Rehearsal - CUI set-up and dummy head
Fig. 11. May 29, 1997 - Designation Tests
The major issue of this second rehearsal was the study of the conductor's designation capabilities: that is, how well the conductor can identify the position of the musicians and how well the musicians can identify which of them is designated by the conductor. A set of designation tests was performed on May 29, 1997, with students from the University of Geneva as subjects. During these tests, different groups of students were positioned at specific places in front of the video wall and were designated by the "conductor" in a number of designation sequences (Fig. 11.).
5 Evaluation Results
The analysis of the evaluation results, which are described in detail in [9], allowed us to identify a number of issues that could contribute to the improvement of the overall system. Nevertheless we must note that the evaluation results are not easy to understand and interpret. More trials with more detailed experimentation are needed in order to fully understand the implications and related issues.

5.1 Objective Quality Evaluation
The objective evaluation results (Fig. 12.) indicate that the overall quality of the DR system is about 40% of that of a normal localized rehearsal. However it is interesting to note that the performance differs drastically from instrument to instrument. For example, the performance for the flute is better in the distributed environment than in the local one, the performance for the violin is equal in both environments, while the performance for all other instruments is inferior in the distributed environment. In our opinion this is due to the fact that the audio capture system behaves differently for each musical instrument, depending on its frequency range and its harmonics. Due to the digitization of the sound it is probable that some phase information and high frequency harmonics are lost. Another factor that might contribute to the performance degradation of the system is the difference in clock rates between the analog-to-digital and digital-to-analog hardware [10]. In addition, the acoustics of the local and remote rehearsal rooms contribute greatly to the rehearsal performance. While the local rehearsal was done in an acoustically tuned theater, the distributed rehearsal was performed in a room where no special considerations were taken for its acoustics.
Fig. 12. Objective evaluation results
Fig. 13. Designation evaluation results
Finally we must note that, due to the small number of trials performed, the statistical sample is small. A greater number of trials was needed in order to obtain higher confidence results, which however was not possible within the budget and timeframe of the project.

5.2 Designation Tests Evaluation
The second set of evaluation tests we performed, which also affect the overall rehearsal performance, concerned the designation capabilities of the conductor. For these tests we used a number of students in the place of the musicians. The results, shown in Fig. 13., indicate clearly that the performance, that is the ability of the subjects to identify when they are designated, improves dramatically with two to three 10-minute training sessions. As can be seen, after the short training sessions the performance reached the 100% level. If we now consider that in the distributed rehearsal evaluation the musicians had no training at all in the DR environment, we can very well expect an improvement in their collaboration with the conductor. In fact this was clearly observed in the second trial, where we had 12 musicians, of which 5 had been present in the first trial. For these musicians the participation in the rehearsal was easier and more natural, allowing them to concentrate more on their work. Thus we strongly believe that even a couple of sessions will allow the musicians to become familiar with the DR system and, in consequence, achieve an improvement in the system performance.

5.3 ATM Measurements
Throughout the DR tests and trials we performed a large number of measurements of the ATM QoS for different loads and configurations. The measurements included cell delay variation, cell interarrival variation and cell loss ratio. The measurement results showed that the cell delay and interarrival time are drastically affected by the load. Fig. 14. and Fig. 15. give the cell delay under two different loads, 16 Mbps and 4 Mbps respectively. As can be seen, for a 16 Mbps load the mean cell delay is 12.01 ms with a variation of 0.95 ms, while for a 4 Mbps load the mean is 10.96 ms with a variation of 0.0076 µs. An interesting result of our measurements was that we observed no cell losses: in all our tests the cell loss ratio was always zero.
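For illustration, the sketch below shows how a mean cell delay and a delay variation of the kind reported in Fig. 14 and Fig. 15 can be derived from per-cell timestamps. The timestamp values are invented and the peak-to-peak estimator is an assumption; the trial figures are those quoted above.

```python
# Illustrative computation of mean cell delay and cell delay variation from
# per-cell timestamps (hypothetical values; the real results are in Fig. 14/15).
import statistics

# (send_time_ms, receive_time_ms) pairs for a handful of cells -- invented
cells = [(0.0, 12.0), (0.5, 12.6), (1.0, 13.1), (1.5, 13.4), (2.0, 14.0)]

delays = [rx - tx for tx, rx in cells]
mean_delay = statistics.mean(delays)
cdv_peak_to_peak = max(delays) - min(delays)   # one common way to express delay variation
print(f"mean cell delay = {mean_delay:.2f} ms, delay variation = {cdv_peak_to_peak:.2f} ms")
```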
Fig. 14. ATM cell delay statistics - 16 Mbps
Fig. 15. ATM cell delay statistics - 4 Mbps
6 Conclusions
In the frame of the ACTS DVP project, an immersive teleconference environment was set up between the University of Geneva and the German National Research Center for Information Technologies, with a distributed musical rehearsal as the pilot application. The system was based on ATM lines and MJPEG video encoding, using video walls and 3D sound capture and recreation. A number of musical rehearsal trials were organized and the system was evaluated using a methodology developed in the project. The evaluation, based on technical and objective measurements, indicates that the system can be useful to professional musicians for conducting rehearsals over long distances. However, apart from the measurable technical characteristics of a rehearsal, there are other elements that cannot be measured, like the relation between the conductor and the musicians, the "feeling" of the rehearsal room, etc. It is for these reasons that we do not expect, for the time being, that a distributed rehearsal system can completely replace a localized rehearsal. Nevertheless the advantages of such an environment can be numerous, both in saving time and money and in bringing together artists and ideas from around the world. A distributed rehearsal system can give musicians the possibility to work together with musicians and conductors from distant countries, allow music students to follow courses given by famous composers and conductors without the need to travel to the other side of the world, and even allow a first-level selection of musicians for an open place in an orchestra.
References
1. Distributed Video Production - DVP, ACTS project AC 089, http://www.gmd.de/DVP/
2. DVP Work Package 4.3 Distributed Rehearsal, http://cuiwww.unige.ch/OSG/projects/dvp/
3. Breiteneder, C., Gibbs, S., and Arapis, C., "TELEPORT - An Augmented Reality Teleconferencing Environment", 3rd Eurographics Workshop on Virtual Environments, Monte Carlo, Monaco, February 1996.
4. Canévet, G., "La localisation auditive des sons dans l'espace", Proceedings of the Rencontres Musicales Pluridisciplinaires Informatique et Musique - Le Son & l'Espace, Ed. GRAME, March 31 - April 1, 1995, Lyon, France, p. 3.
5. Jaffrennou, P. A., "De la Scénographie Sonore", Proceedings of the Rencontres Musicales Pluridisciplinaires Informatique et Musique - Le Son & l'Espace, Ed. GRAME, March 31 - April 1, 1995, Lyon, France, p. 87.
6. Jot, J-M., and Warusfel, O., "Le Spatialisateur", Proceedings of the Rencontres Musicales Pluridisciplinaires Informatique et Musique - Le Son & l'Espace, Ed. GRAME, March 31 - April 1, 1995, Lyon, France, p. 103.
7. Gibbs, S., Arapis, C., Breiteneder, C., Lalioti, V., Mostafawy, S., and Speier, J., "Virtual Studios: The State of the Art", Eurographics'96 STAR Reports.
8. Marsault, X., "La simulation des ambiances sonores réalité virtuelle", Proceedings of the Rencontres Musicales Pluridisciplinaires Informatique et Musique - Le Son & l'Espace, Ed. GRAME, March 31 - April 1, 1995, Lyon, France, p. 23.
9. Orlarey, Y., Carbonel, O., Konstantas, D. and Gibbs, S., "Distributed Rehearsal Evaluation", GRAME, September 1997, http://cuiwww.unige.ch/OSG/projects/dvp/53.deliverable.ps.Z
10. Robertson, G., and McAuley, D., "Sample Rate Synchronization across an ATM Network", Proceedings of the International Computer Music Conference 97, September 25-30, 1997, Thessaloniki, Greece, p. 271.
11. Williams, M., "Enregistrement et Reproduction de l'Environnement Sonore Naturel", Proceedings of the Rencontres Musicales Pluridisciplinaires Informatique et Musique - Le Son & l'Espace, Ed. GRAME, March 31 - April 1, 1995, Lyon, France, p. 31.