Predicting the Perceptual Service Quality Using a Trace of ... - CiteSeerX

Predicting the Perceptual Service Quality Using a Trace of VoIP Packets C. Hoene, S. Wiethölter, A. Wolisz September 29th, 2004. Qofis’04, Barcelona, Spain

Technische Universität Berlin

URL: www.tkn.tu-berlin.de

Content

Introduction/Background Our Approach Playout Scheduling Listening-only Tests

Delay Spikes Non-Random Packet Losses

Conclusions

VoIP transmission of a telephone call

How to Judge the Quality of a Call? Measure the quality of a VoIP system: Gather packet loss rates, mean delay, etc.

In the end the service quality is important

Human based listening tests are extensive

ITU P.862 (PESQ algorithm) measures speech quality

Easy but inaccurate

Compares original sample with the transmitted version calculates Mean Option Score (MOS) (1=bad, 5=excellent)

ITU G.107 (E-Model) predicts quality of tel.-system

Considers echo, loudness, coding, packet loss rate, delay, … Result: R Factor (0=bad, 70=toll quality, 100=excellent)

Limits of PESQ and E-Model PESQ and E-Model cannot judge the impact of: Impact of playout scheduling cannot be predicted. Variable packet delay Ö adaptive playout time Non-random packet losses Not applicable to assess VoIP packet traces resulting from simulations or experiments

Predicting Playout Scheduling

A VoIP phone can implement any playout scheduler. How to know which is going to be used? Actually, you cannot know it. But you know the most common, published playout scheduler schemes, which are

Fixed-deadline and adaptive from Van Jacobsen, Mills, Schulzrinne, Ramjee, and Moon

Calculate packets' playout times and the mean transmission delay.

Consider only speech frames during voice activity because during silence a human cannot identify the delay.

Ö We can predict the behavior of most phones !

Assessing a Trace of VoIP Packets Our solution: Using the most common different playout schedulers and encoding schemes combine of E-Model and PESQ (MOS to R formula approved by the ITU)

PESQ calculate speech quality E-Model combines MOS rating and transmission delays

Speech Quality and Transmission Delay

Speech Recordings

Audio

Audio

PESQ Reference Degraded

Degraded Audio

Speech Coding

Experiment or Simulation

Playout Buffer

(coding distortion)

(loss or delay)

(loss or scheduling)

Decoder PLC Mean Delay

E-Model

MOS Ie Id

Result: R Factor

Equations – referring to the SPECTS’04 paper…

Content



Conclusions

PESQ not verified Öconduct listening tests

Can PESQ measure playout rescheduling? or non-random packet losses? PESQ can measure variable playout but its performance has never been verified. It has been tested only for random/bursty but not for non-random losses. Conduct formal listening-only tests!

Test Sample Design

Construct packet traces with delay spikes! Delay spikes with a height of 50 to 300ms and a width of height+10%. 1 Spike in one sample

What to do? 1. Drop late packets 2. Or delay playout?

Playout Decisions 1

2

3

4

5

6

2 2+ 2++ 3

4

5

Legend: 1...6 Frame seq. no. 2+; 2++ Conceal ed frame

Ori gi nal

1

2 2+ 2++ 5

a) Packet loss

1. 2. 3.

6

1

b) Positive adapt ation

6

1

2

5

6

c) Negative adapt ati on

DROP: Drop late packets (a) ADAPT: Delay playout to late packets (b) ADAPT&FALLBACK: Delay playout (b) and return to normal during the next silence period (c)

Listening-Only Tests Follow ITU P.830 but Multiple person at the same time Use native and foreign speakers Use linear MOS scale (no discrete scale!) Sample Sequence: For each sentence 1. 2. 3.

Original at the beginning, MNRU with 5dB (worst),15,25,35 (best) Randomly distributed: packet loss samples

Produced on CD containing all the samples

Studio...

Listening-Test and Sample Design Different encoding schemes:

G.711, G729

Produced delay spikes (only during voice activity)

Ö

with length 50,100,150,200,250,300ms with width=length*1.1

Different samples with different length 5-10s), German (using Kiel Corpus Vol. I) 220 samples (including MNRU samples as a reference) reference

Frame Analysis

speech properties

PESQ

PESQ MOS

R X

seed

packetization

Loss Generator loss rate

rate/mode

Coding algorithm

sample

speaker

language

Speech Recordings

importance

Decoding PLC

Listeningonly Tests

MOS

(Correlation)

Rating Performance

Three parts: intro, first and second half. First is best. Native speaker and foreign speaker rate as good. Some persons did not kept track with the sample no.

Human LQS-MOS vs. PESQ LQO-MOS

MOS (PESQ)

4

3

2 y = 0,9403x - 0,2139 R= 0,8316 1 1

2

3 MOS (Humans)

4

MNRU vs. MOS: Correlation: R=0.977 4

MOS

3

2

1 5

15

25

35

45

MNRU MOS

PESQ MOS

MOS cited

scaled MOS

Results for G.711 coding

MOS Variance vs. Prediction Performance Correlation between PESQ and humans

1,0 0,9 0,8

trend line: R=0.768

0,7 0,6 0,5 0,4 0,3 0,2 0

0,2 0,4 0,6 0,8 1 PESQ sample variance for different impairment patterns

Content



Conclusions

Introduction

Losing one Voice-Over-IP packet impairs the perceptual quality in a wide range, depending on

the frame speech properties the encoder/decoder/concealment algorithms decoders resynchronization time after loss (especially low-rate decoders might maintain a wrong state after loss for the following frames.) the surrounding speech.

Example: Discontinuous Transmission (DTX)

Speech frames during silence are less important Lower frame rate during silence

Packet Importance

The packet’s importance is the quality degradation that its loss would cause.

Definition: The importance of frame losses is the difference between the speech quality due to coding loss and the quality due to coding loss and frame losses, times the length of the analyzed sample. Conducted some million PESQ ratings Dropped only one VoIP packet in a sample

Just try it: www.tkn.tu-berlin.de/research/mongolia Select predefined parameter sets Compression?

chose sample Listen to it! amount of loss Choose an another loss pattern

Drop only frames with an importance between min. and max.

Sample statistics

Talking or silence? Voiced or unvoiced?

Judge the speech quality by your self!

Conclusion

Presented an approach to assess VoIP packet traces Combined PESQ, E-Model and playout scheduling because

Verified PESQ for

Speech frames differ in their importance Playout scheduler are not standardized Playout rescheduling harms speech quality. non-random packet losses (R=0.94) and delay spikes (R=0.87).

Precision performances depend on the variance of the samples’ speech quality.

Assessing VoIP: Summary

Open-source software including

Implementation of all common playout schedulers G.711 and G.729 coding Sample database including human rating results

Verified software in various research projects,

e.g. voice over WLAN, impact of handover, ad-hoc. http://www.tkn.tu-berlin.de/reseach/qofis C. Hoene, S. Wiethölter, and A. Wolisz, "Predicting the Perceptual Service Quality Using a Trace of VoIP Packets", In Proceedings of QofIS’04, Barcelona, Spain, September 2004. C. Hoene, H. Karl, and A. Wolisz, "A Perceptual Quality Model for Adaptive VoIP Applications", In Proceedings of International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS'04), San Jose, California, USA, July 2004, Paper won the Best Paper Award of the conference. C. Hoene and E. Dulamsuren-Lalla, "Predicting Performance of PESQ in Case of Single Frame Losses", In Proc. MESAQIN 2004, Prague, CZ, June 2004. S. Möller and C. Hoene, "Information About a New Method For Deriving the Transmission Rating Factor R From MOS in Closed Form", ITU, May 2002, Temporary Document for the study group 12.

Predicting the Perceptual Service Quality Using a Trace of ... - CiteSeerX

Predicting the Perceptual Service Quality Using a Trace of ... - CiteSeerX

Suggest Documents

PREDICTING PRODUCT AESTHETIC QUALITY USING ... - CiteSeerX

Service quality in Medellin hotels using perceptual ...

Perceptual visual quality metrics: A survey - CiteSeerX

Reversible watermarking using a perceptual model - CiteSeerX

a perceptual interface using integral projections - CiteSeerX

Guaranteeing Quality of Service for the Web using ... - CiteSeerX

Service Quality - Developing a Service Quality Index (SQI) - CiteSeerX

Predicting short-transfer latency from TCP arcana: A trace ... - CiteSeerX

Dynamically Predicting the Quality of Service: Batch, Online, and ...

The PROVE Trace Visualisation Tool as a Grid service - CiteSeerX

Perceptual Considerations for Quality of Service ... - Brunel University

Perceptual quality feedback based progressive frame ... - CiteSeerX

Perceptual Quality Measurement and Control: Definition ... - CiteSeerX

Predicting energy measurements of service-enabled ... - CiteSeerX

A perceptual study - CiteSeerX

Service Quality of mHealth - CiteSeerX

Service Quality Dimensions - CiteSeerX

Perceptual Sound Quality Assessment

Quality-of-Service Routing in Ad-Hoc Networks Using ... - CiteSeerX

Measuring ATM video quality of service using an objective ... - CiteSeerX

Using SLA Context to Ensure Quality of Service for ... - CiteSeerX

PERCEPTUAL QUANTITATIVE QUALITY ...

Using SLA Context to Ensure Quality of Service for ... - CiteSeerX

Quality of Service using Traffic Engineering over MPLS - CiteSeerX