Next-Gen Sound Rendering with OpenCL
Lakulish Antani¹,², Nico Galoppo¹, Adam Lake¹, Arnon Peleg¹
¹Intel Corporation, ²University of North Carolina at Chapel Hill

Next-Generation Sound Experiences
• Real-time interactive apps today: advanced rendering & physics simulation for a realistic user experience
• Realistic sound can add an extra dimension of realism
  - Current stereo / surround sound is only the beginning!
  - The future is true 3D sound
  - Games, tele-presence, simulators…
• The sound rendering pipeline is well suited to an OpenCL implementation
  - Framework for writing data-parallel code
  - Not only GPU: pick the most appropriate device for execution (e.g. CPU if the GPU is busy)
• Result: an immersive 3D sound experience in real time on any device

What Is Sound Rendering?
• Visual rendering: lets you see through the eyes of a virtual character
• Sound rendering: lets you listen through the ears of a virtual character

Sound Rendering Pipeline
• Three main components:
  - Synthesis: How and where are sounds created? (Sound is emitted at the source)
  - Propagation: How do sounds echo and reverberate through a scene and reach the listener? (Sound bounces through the environment to the listener)
  - Auralization: How can the 3D listening experience be recreated for the user? (Sound reaching the listener is reproduced using the user's audio system)

Sound Rendering Pipeline
[Pipeline diagram: .wav files, VoIP streams, and collision events feed Sound Synthesis; scene geometry and the listener position feed Sound Propagation, implemented in OpenCL with sound modeled as rays bouncing through the scene*; the speaker configuration feeds Auralization, which outputs multichannel audio.]
*Using the open-source Embree raytracer: http://intel.ly/embree

Sound Propagation
• Propagation features modeled:
  - Distance attenuation (inverse-distance model)
  - Occlusion
  - First-order specular reflection
• Compute reflections of sound using the image-source method
  - Using "virtual" sound sources (analogous to VPLs); see the sketch below
• Increases the number of potential audio sources
  - Dependent on the number of triangles in the scene (demo: 80k triangles)
  - Culled by a raytracer visibility check (demo: ~200-300 sources on average)
• Propagation increases the complexity of auralization
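For illustration, a first-order image source is obtained by mirroring the real source position across the plane of a reflecting surface. The helper below is a minimal sketch in OpenCL C; the function name and the plane representation (unit normal n with w = 0, offset d) are our own, not from the slides:

// Mirror a source position across a wall plane to obtain its
// first-order image source. Points p on the plane satisfy
// dot(n, p) + d == 0; positions are stored as float4 with w = 0.
float4 ImageSource(float4 sourcePos, float4 n, float d)
{
    // signed distance from the source to the plane
    float dist = dot(sourcePos, n) + d;
    // reflect the source to the opposite side of the plane
    return sourcePos - 2.0f * dist * n;
}

Each image source is then kept only if a visibility ray from it to the listener passes through the reflecting surface, which is the raytracer culling step mentioned above.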

Auralization
• Input: the sound field at the listener
• Output: signals at the user's speakers which reproduce the sound field at the listener's actual position
• Goal: bring the sounds of the virtual world into the living room
• The main focus of this demo

Auralization
• Two broad approaches:
1. Point listener (assumed by most interactive applications)
   - Analogous to a pinhole camera
   - Requires lots of speakers (7.1+)
2. Binaural rendering: compute a separate sound field for each ear for true 3D sound
   - Analogous to stereoscopic 3D graphics
   - Requires headphones
   - More accurate for a single listener
• We will discuss both approaches

Auralization: Point Listener
• Based on amplitude panning
  - Compute a panning coefficient for each speaker
  - Each speaker scales the source signal by its panning coefficient
  - Speakers may be positioned in 3 dimensions (need at least 6 speakers)
[Figure: a source and listener in the virtual world mapped onto the user's real-world speakers, with example panning coefficients 0.9, 0.75, 0.2, 0.15, and 0.0; a speaker away from the source direction receives no sound.]

Auralization: Binaural Rendering
• Determine the sound arriving at each ear separately
• Given the source and listener positions:
  - Apply a pair of filters, called head-related transfer functions (HRTFs)
  - Look up the HRTFs from a large table of measurements
• HRTF databases are publicly available¹
[Figure: a source and listener in the virtual world; the user in the real world.]
¹MIT KEMAR dataset: http://sound.media.mit.edu/resources/KEMAR.html

Auralization Pipeline Implementation
[Pipeline diagram: per-source position and signal (from Propagation) → Spatialize → Mix → IFFT → Playback]
• Spatialize, Mix, and IFFT implemented in OpenCL
• Playback implemented in XAudio2
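Sketched below is what one iteration of the per-frame host loop might look like, assuming a single in-order command queue and a mono float output frame; all names (ProcessFrame, spatialize, mix, ifft, outputBuffer, hostFrame) are hypothetical, not from the slides:

#include <CL/cl.h>

// Process one audio frame: run the OpenCL stages in order, then read
// the finished frame back for submission to XAudio2.
void ProcessFrame(cl_command_queue queue,
                  cl_kernel spatialize, cl_kernel mix, cl_kernel ifft,
                  cl_mem outputBuffer, float *hostFrame,
                  size_t frameSize, size_t numSources)
{
    // Spatialize runs over a 2D grid (see the Spatialize slide below)
    size_t grid[2] = { frameSize, numSources };
    clEnqueueNDRangeKernel(queue, spatialize, 2, NULL, grid, NULL, 0, NULL, NULL);
    clEnqueueNDRangeKernel(queue, mix,  1, NULL, &frameSize, NULL, 0, NULL, NULL);
    clEnqueueNDRangeKernel(queue, ifft, 1, NULL, &frameSize, NULL, 0, NULL, NULL);

    // blocking read: the frame must be complete before the playback deadline
    clEnqueueReadBuffer(queue, outputBuffer, CL_TRUE, 0,
                        frameSize * sizeof(float), hostFrame, 0, NULL, NULL);
    // hostFrame is then queued on an XAudio2 source voice
    // (IXAudio2SourceVoice::SubmitSourceBuffer)
}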

Real-time Sound Rendering Requirements
• Hard real-time constraint on the processing pipeline
• Audio data is divided into frames of 4096 samples
  - Corresponds to ~93 ms of sound for 44.1 kHz audio (4096 / 44100 ≈ 0.093 s)
  - Want: latency of a single frame
• Audio is processed by the OpenCL kernels one frame at a time

Hard limit: ~100 ms time budget per frame

Auralization Kernels: Spatialize
• Given the listener & source positions, modify the input signal to position it correctly in 3D space
• Output:
  - Amplitude panning: 1 sample/speaker
  - Binaural rendering: 2 samples (L/R)
• Each work-item processes one sample of the frame for one source (see the host-side sketch below)
  - #work-items = #sources × frame size
[Figure: work grid for Spatialize, with samples (e.g. 4096) along one axis and sources along the other]
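On the host side, this mapping is just the global work size of a 2D NDRange launch. The sketch below is our own illustration, not code from the slides; the function and variable names are hypothetical:

#include <CL/cl.h>

// Launch Spatialize over a 2D work grid: one work-item per
// (sample, source) pair, i.e. #work-items = frame size * #sources.
void EnqueueSpatialize(cl_command_queue queue, cl_kernel spatialize,
                       size_t frameSize, size_t numSources)
{
    // dimension 0 indexes the sample, dimension 1 the source
    size_t global[2] = { frameSize, numSources };
    clEnqueueNDRangeKernel(queue, spatialize, 2, NULL, global, NULL,
                           0, NULL, NULL);
}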

Amplitude Panning Kernel

kernel void Spatialize_AP(…)
{
    size_t sampleIndex = get_global_id(0);
    size_t sourceIndex = get_global_id(1);
    …
    // generate one sample per speaker
    for (uint i = 0; i < NUM_SPEAKERS; i++)
    {
        float4 sourcePos  = …;
        float4 speakerPos = …;

        // compute the panning coefficient: a simple dot product
        float panningCoeff = dot(sourcePos, speakerPos) / NUM_SPEAKERS;

        // generate the sample (coherent memory access)
        outputSample[i] = panningCoeff * inputSample;
    }
}

Total cost = #sources × #speakers × #samples per frame

Binaural Rendering Kernel

// Multiply two complex numbers
float2 cmul(float2 a, float2 b)
{
    return (float2)(a.x*b.x - a.y*b.y, a.x*b.y + a.y*b.x);
}

kernel void Spatialize_Binaural(…)
{
    size_t sourceIndex = get_global_id(0);
    size_t sampleIndex = get_global_id(1);
    size_t numSamples  = get_global_size(1);

    // use the source position to index into the KEMAR HRTF lookup table
    float4 sourcePos = …;
    int phi   = (int) floor(degrees(atan2(sourcePos.x, sourcePos.z)));
    int theta = clamp((int) floor(degrees(asin(sourcePos.y)) * 0.1f), -4, 9) + 4;
    int filterIndex = filterIndices[181*theta + ((phi
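The slide's code is truncated mid-expression here. As a hedged sketch only, a plausible continuation would wrap the azimuth into the table's range, fetch the left/right HRTF spectra for filterIndex, and multiply them with the source's frequency-domain sample using cmul; the buffer names inputSpectrum, hrtfL, hrtfR, outputL, and outputR below are hypothetical, not from the slides:

    // hypothetical continuation, not from the original slides:
    // apply the left and right HRTFs to this source's frequency-domain sample
    size_t idx = sourceIndex * numSamples + sampleIndex;
    float2 in = inputSpectrum[idx];
    outputL[idx] = cmul(in, hrtfL[filterIndex * numSamples + sampleIndex]);
    outputR[idx] = cmul(in, hrtfR[filterIndex * numSamples + sampleIndex]);
}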