Next-Gen Sound Rendering with OpenCL
Lakulish Antani¹·², Nico Galoppo¹, Adam Lake¹, Arnon Peleg¹
¹Intel Corporation  ²University of North Carolina at Chapel Hill
© Copyright Khronos Group, 2011 - Page 1
Next-Generation Sound Experiences
• Real-time interactive apps today: advanced rendering & physics simulation for a realistic user experience
• Realistic sound can add an extra dimension of realism
  - Current stereo / surround sound is only the beginning!
  - The future is true 3D sound
  - Games, tele-presence, simulators…
• The sound rendering pipeline is well suited to an OpenCL implementation
  - Framework for writing data-parallel code
  - Not only GPU: pick the most appropriate device for execution (e.g. CPU if the GPU is busy)
• Result: an immersive 3D sound experience in real-time on any device
What Is Sound Rendering?
• Visual rendering: lets you see through the eyes of a virtual character
• Sound rendering: lets you listen through the ears of a virtual character
Sound Rendering Pipeline
[Diagram: Synthesis (sound emitted at the source) → Propagation (sound bounces through the environment to the listener) → Auralization (sound reaching the listener is reproduced using the user's audio system)]
• Three main components
  - Synthesis: How and where are sounds created?
  - Propagation: How do sounds echo and reverberate through a scene and reach the listener?
  - Auralization: How can the 3D listening experience be recreated for the user?
Sound Rendering Pipeline
[Diagram: Sound Synthesis → Sound Propagation → Auralization, implemented in OpenCL. Inputs include .wav files, VoIP streams, collision events, scene geometry, the listener position, and the speaker configuration; the output is multichannel audio. Sound is modeled as rays bouncing in the scene*.]
* Using the open-source Embree raytracer: http://intel.ly/embree
Sound Propagation
• Propagation features modeled:
  - Distance attenuation (inverse distance model)
  - Occlusion
  - First-order specular reflection
• Compute reflections of sound using the image-source method
  - Using "virtual" sound sources (analogous to VPLs)
• Increases the number of potential audio sources
  - Dependent on #triangles in the scene (demo: 80k triangles)
  - Culled by a raytracer visibility check (demo: ~200-300 sources on average)
Propagation increases the complexity of auralization.
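The image-source construction above can be sketched in plain C. This is a minimal illustration under stated assumptions, not the demo's implementation: it mirrors a point source across the plane of one reflecting surface (unit normal n, any point p on the plane); the names `Vec3` and `image_source` are illustrative.

```c
#include <math.h>

typedef struct { float x, y, z; } Vec3;

static Vec3  sub3(Vec3 a, Vec3 b)    { Vec3 r = {a.x-b.x, a.y-b.y, a.z-b.z}; return r; }
static float dot3(Vec3 a, Vec3 b)    { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3  scale3(Vec3 a, float s) { Vec3 r = {a.x*s, a.y*s, a.z*s}; return r; }

/* First-order image source: mirror the real source across the plane of a
 * reflecting surface (unit normal n, point p on the plane). The result acts
 * as an ordinary "virtual" point source; the raytracer's visibility check
 * then decides whether it actually contributes to the listener. */
Vec3 image_source(Vec3 source, Vec3 n, Vec3 p)
{
    float d = dot3(sub3(source, p), n);   /* signed distance to the plane */
    return sub3(source, scale3(n, 2.0f * d));
}
```

Each reflecting triangle in the scene contributes one such virtual source, which is why the source count scales with triangle count until visibility culling trims it.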
Auralization
• Input: the sound field at the listener
• Output: signals at the user's speakers which reproduce the sound field at his/her actual position
• The main focus of this demo: bringing the sounds of the virtual world into the living room
Auralization
• Two broad approaches
  1. Most interactive applications assume a point listener
     - Analogous to a pinhole camera
     - Requires lots of speakers (7.1+)
  2. Computing separate sound fields for each ear allows for true 3D sound
     - Called binaural rendering
     - Analogous to stereoscopic 3D graphics
     - Requires headphones
     - More accurate for a single listener
• We will discuss both approaches
Auralization: Point Listener
• Based on amplitude panning
  - Compute a panning coefficient for each speaker
  - Each speaker scales the source signal by its panning coefficient
  - Speakers may be positioned in 3 dimensions (need at least 6 speakers)
[Diagram: a source and listener in the virtual world mapped to a user surrounded by speakers in the real world; example panning coefficients 0.9, 0.75, 0.2, 0.15, and 0.0 — the 0.0 speaker receives no sound because nothing arrives from that direction.]
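The coefficients in the diagram can be reproduced with a small host-side sketch. This is an illustrative assumption, not the demo's code: it takes the dot product of unit direction vectors (listener→source vs. listener→speaker) and clamps negative values to zero, which is why a speaker facing away from the source gets a coefficient of 0.0. (The `Spatialize_AP` kernel shown later normalizes by the speaker count instead.)

```c
#include <math.h>

/* Hypothetical amplitude-panning sketch: the coefficient for a speaker is
 * the cosine of the angle between the source direction and the speaker
 * direction (both unit vectors from the listener), clamped at zero so that
 * speakers pointing away from the source stay silent. */
float panning_coeff(const float src_dir[3], const float spk_dir[3])
{
    float d = src_dir[0]*spk_dir[0]
            + src_dir[1]*spk_dir[1]
            + src_dir[2]*spk_dir[2];
    return d > 0.0f ? d : 0.0f;   /* no sound from this direction */
}
```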
Auralization: Binaural Rendering
• Determine the sound arriving at each ear separately
• Given source and listener positions:
  - Apply a pair of filters, called head-related transfer functions (HRTFs)
  - Look up HRTFs from a large table of measurements
• HRTF databases are publicly available¹
[Diagram: source and listener in the virtual world; headphone-wearing user in the real world.]
¹ MIT KEMAR dataset: http://sound.media.mit.edu/resources/KEMAR.html
Auralization Pipeline Implementation
[Diagram: each source's position and signal (from Propagation) feeds a pipeline of Spatialize → Mix → IFFT → Playback. Blue blocks are implemented in OpenCL; green blocks are implemented in XAudio2.]
Real-time Sound Rendering Requirements
• Hard real-time constraint on the processing pipeline
• Audio data divided into frames of 4096 samples
  - Corresponds to ~100 ms of sound for 44.1 kHz audio (4096 / 44100 ≈ 93 ms)
  - Want: latency of a single frame
• Audio processed by OpenCL kernels one frame at a time
Hard limit: ~100 ms time budget per frame
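The per-frame budget follows directly from the frame size and sample rate; a one-line sketch of the arithmetic (the function name is illustrative):

```c
/* Duration of one audio frame in milliseconds. 4096 samples at 44.1 kHz is
 * about 93 ms, which the slides round to a ~100 ms budget per frame. */
float frame_duration_ms(int num_samples, float sample_rate_hz)
{
    return 1000.0f * (float) num_samples / sample_rate_hz;
}
```

All kernels for a frame (Spatialize, Mix, IFFT) must complete within this window, or playback will starve.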
Auralization Kernels: Spatialize
• Given listener & source positions, modify the input signal to position it correctly in 3D space
• Output per input sample:
  - Amplitude panning: 1 sample/speaker
  - Binaural rendering: 2 samples (L/R)
• Each work-item processes one sample of the frame for one source
  - #work-items = #sources × frame size
[Diagram: pipeline with Spatialize highlighted (Spatialize → Mix → IFFT → Playback); the work grid for Spatialize is 2D, with samples (e.g. 4096) along one axis and sources along the other.]
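The 2D work grid maps one work-item to one (sample, source) pair. A small sketch of the index bookkeeping, assuming samples run along dimension 0 and sources along dimension 1 as in the kernels that follow (function names are illustrative):

```c
#include <stddef.h>

/* Total work-items launched for Spatialize: one per sample per source. */
size_t spatialize_work_items(size_t num_sources, size_t frame_size)
{
    return num_sources * frame_size;
}

/* Recover the (sample, source) pair from a flattened work-item id, matching
 * a row-major 2D NDRange of (frame_size, num_sources). */
void unflatten(size_t flat_id, size_t frame_size,
               size_t *sample_index, size_t *source_index)
{
    *sample_index = flat_id % frame_size;
    *source_index = flat_id / frame_size;
}
```

At the demo's typical ~250 virtual sources and 4096-sample frames, this is roughly a million work-items per frame, which is ample parallelism for either a GPU or a multi-core CPU.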
Amplitude Panning Kernel

kernel void Spatialize_AP(…)
{
    size_t sampleIndex = get_global_id(0);
    size_t sourceIndex = get_global_id(1);
    …
    // generate one sample per speaker
    for (uint i = 0; i < NUM_SPEAKERS; i++) {
        float4 sourcePos  = …;
        float4 speakerPos = …;

        // compute the panning coefficient (simple dot product)
        float panningCoeff = dot(sourcePos, speakerPos) / NUM_SPEAKERS;

        // generate the sample (coherent memory access)
        outputSample[i] = panningCoeff * inputSample;
    }
}

Total cost = #sources × #speakers × #samples per frame
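A scalar C reference of the kernel's per-work-item work is useful for validating an OpenCL port against host output. This sketch mirrors the loop above; the explicit parameters stand in for the elided `…` arguments, which is an assumption:

```c
/* Scalar reference for one (sample, source) work-item of Spatialize_AP:
 * each speaker scales the input sample by dot(sourcePos, speakerPos)
 * divided by the speaker count, as in the kernel. */
void spatialize_ap_reference(const float source_pos[4],
                             const float speaker_pos[][4],
                             int num_speakers,
                             float input_sample,
                             float output_sample[])
{
    for (int i = 0; i < num_speakers; i++) {
        float d = source_pos[0] * speaker_pos[i][0]
                + source_pos[1] * speaker_pos[i][1]
                + source_pos[2] * speaker_pos[i][2]
                + source_pos[3] * speaker_pos[i][3];
        output_sample[i] = (d / num_speakers) * input_sample;
    }
}
```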
Binaural Rendering Kernel

// Multiply two complex numbers
float2 cmul(float2 a, float2 b)
{
    return (float2)(a.x*b.x - a.y*b.y, a.x*b.y + a.y*b.x);
}

kernel void Spatialize_Binaural(…)
{
    size_t sourceIndex = get_global_id(0);
    size_t sampleIndex = get_global_id(1);
    size_t numSamples  = get_global_size(1);

    // use the source position to index into the KEMAR HRTF lookup table
    float4 sourcePos = …;
    int phi   = (int) floor(degrees(atan2(sourcePos.x, sourcePos.z)));
    int theta = clamp((int) floor(degrees(asin(sourcePos.y)) * 0.1f), -4, 9) + 4;
    int filterIndex = filterIndices[181*theta + ((phi
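The `cmul` helper indicates the HRTFs are applied in the frequency domain: after transforming the frame, each bin of the signal spectrum is multiplied by the corresponding bin of the selected HRTF (once per ear), and the pipeline's IFFT stage converts the result back to the time domain. A minimal scalar sketch of that per-bin multiply, with illustrative names:

```c
typedef struct { float re, im; } Complex;

/* Complex multiply, equivalent to the kernel's cmul() on float2. */
static Complex cmul_c(Complex a, Complex b)
{
    Complex r = { a.re*b.re - a.im*b.im, a.re*b.im + a.im*b.re };
    return r;
}

/* Frequency-domain filtering: multiply the signal spectrum by the HRTF
 * spectrum bin by bin. Run once per ear with that ear's HRTF. */
void apply_hrtf_spectrum(const Complex *signal, const Complex *hrtf,
                         Complex *out, int num_bins)
{
    for (int i = 0; i < num_bins; i++)
        out[i] = cmul_c(signal[i], hrtf[i]);
}
```

Multiplication in the frequency domain is equivalent to convolving the time-domain signal with the HRTF impulse response, but at O(n log n) total cost including the transforms rather than O(n·m) for direct convolution.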