ATLAS: automatic temporal segmentation and annotation of lecture videos based on modeling transition time

Rajiv Ratn Shah*, Yi Yu*, Suhua Tang#, Roger Zimmermann*, Anwar D. Shaikh*

*School of Computing, National University of Singapore
#School of Informatics and Engineering, The University of Electro-Communications
{rajiv,yuy,rogerz}@comp.nus.edu.sg, [email protected]

ACM Multimedia Grand Challenge, November 5, 2014
VideoLectures.NET Challenge (MediaMixer, transLectures): Temporal segmentation and annotation of lecture videos


VideoLectures.NET Challenge
- The number of online lecture videos available is increasing rapidly
- There is still insufficient accessibility and traceability of lecture video contents
- It is very desirable to enable people to navigate and access specific topics within lecture videos


ATLAS ATLAS is our solution to this challenge ATLAS works in two phases 

Transition times prediction



Text annotations determination



Demo



Contributions
- ATLAS has two main novelties:
  - An SVMhmm model is proposed to learn temporal transition cues
  - A fusion scheme is suggested to combine transition cues extracted from heterogeneous information of lecture videos
- Text annotations corresponding to these temporal segments are determined by assigning the most frequent N-gram token of the subtitle resource tracks (SRT) block under consideration
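The annotation rule above (pick the most frequent N-gram token of an SRT block) can be sketched as follows. This is a minimal illustration, not the ATLAS implementation: the function name `most_frequent_ngram`, the regular-expression tokenizer, the lowercase normalisation, and the optional stopword filter are all assumptions made for the example.

```python
import re
from collections import Counter


def most_frequent_ngram(srt_block_text, n=2, stopwords=frozenset()):
    """Return the most frequent word n-gram in one SRT block, or None.

    Hypothetical helper: tokenisation and stopword handling are assumptions.
    """
    # Lowercase and split into simple word tokens (assumed tokenizer).
    tokens = [
        tok
        for tok in re.findall(r"[a-z0-9']+", srt_block_text.lower())
        if tok not in stopwords
    ]
    # Build all contiguous n-grams over the token sequence.
    ngrams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return None
    # Counter.most_common(1) yields the n-gram with the highest count.
    return Counter(ngrams).most_common(1)[0][0]


block = "Support vector machines are useful. Support vector machines scale well."
label = most_frequent_ngram(block, n=3)
```

Here `label` would serve as the text annotation for the segment that the SRT block belongs to. How ties are broken and which N is used are not stated in the slides.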



System Overview
[Figure: system diagram in which transition times TT1 and TT2 feed a Fusion block]
- Temporal transition: SVMhmm + fusion
- Text annotations: N-gram token of SRT
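The slides name a fusion step that combines TT1 and TT2 but do not give its rule. As a rough illustration only, one simple possibility is to pool both candidate lists and collapse candidates that fall within a short window into their mean. The function name `fuse_transition_times`, the `window` parameter, its 10-second default, and the averaging rule are all assumptions made for this sketch, not the ATLAS method.

```python
def fuse_transition_times(tt1, tt2, window=10.0):
    """Pool two lists of candidate transition times (assumed to be seconds)
    and merge candidates lying within `window` of a cluster's first member.

    Hypothetical fusion rule for illustration; the slides do not specify one.
    """
    pooled = sorted(list(tt1) + list(tt2))
    fused = []
    cluster = []
    for t in pooled:
        # Close the current cluster once a candidate falls outside the window.
        if cluster and t - cluster[0] > window:
            fused.append(sum(cluster) / len(cluster))
            cluster = []
        cluster.append(t)
    if cluster:
        fused.append(sum(cluster) / len(cluster))
    return fused


fused = fuse_transition_times([60.0, 300.0], [64.0, 500.0])
```

In this example the two nearby candidates at 60 s and 64 s collapse into a single transition, while the isolated candidates pass through unchanged.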


Evaluation
[Figure: timeline comparing predicted transition times PTT1, ..., PTTN with actual transition times ATT1, ..., ATTM, marking the time differences td1, ..., tdn, extra PTTs, and a missed ATT]

A = (a1, a2, a3, …, ap)
T = (t1, t2, t3, …, tq)

PTT = Predicted Transition Time, ATT = Actual Transition Time, td = time difference between ATT and nearest PTT

tdi = | PTTi - ATTj |    // tdi is the time diff between PTTi and the nearest ATTj

If tdi < 5         Score(PTTi, ATTj) = 1.0
ElseIf tdi < 10    Score(PTTi, ATTj) = 0.8
ElseIf tdi < 15    Score(PTTi, ATTj) = 0.6
ElseIf tdi < 20    Score(PTTi, ATTj) = 0.4
ElseIf tdi < 25    Score(PTTi, ATTj) = 0.2
Else               Score(PTTi, ATTj) = 0.0

\[ \mathrm{Precision} = \frac{\sum_{k=1}^{r} \mathrm{Score}(PTT_i, ATT_j)}{N}, \qquad \mathrm{Recall} = \frac{\sum_{k=1}^{r} \mathrm{Score}(PTT_i, ATT_j)}{M} \]

where N = # ATTs, M = # PTTs and r = # (PTTi, ATTj) pairs
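The scoring scheme on this slide can be sketched in Python as below. Several points are assumptions rather than statements from the slides: times are taken to be in seconds; (PTTi, ATTj) pairs are formed by a greedy one-to-one match on smallest time difference, so that extra PTTs and missed ATTs contribute nothing; and precision is normalised by the number of PTTs and recall by the number of ATTs, following the conventional definitions. The slide's legend for N and M and its timeline labels (PTTN, ATTM) do not obviously agree on which count is which, so that mapping in particular should be read as a guess. The names `score`, `match_pairs`, and `evaluate` are hypothetical.

```python
def score(td):
    """Tiered score for a time difference td, using the slide's thresholds."""
    for limit, value in ((5, 1.0), (10, 0.8), (15, 0.6), (20, 0.4), (25, 0.2)):
        if td < limit:
            return value
    return 0.0


def match_pairs(ptts, atts):
    """Greedy one-to-one pairing of PTTs and ATTs by smallest time difference
    (assumed matching rule). Returns a list of (ptt_index, att_index, td)."""
    candidates = sorted(
        (abs(p - a), i, j)
        for i, p in enumerate(ptts)
        for j, a in enumerate(atts)
    )
    used_p, used_a, pairs = set(), set(), []
    for td, i, j in candidates:
        if i not in used_p and j not in used_a:
            used_p.add(i)
            used_a.add(j)
            pairs.append((i, j, td))
    return pairs


def evaluate(ptts, atts):
    """Return (precision, recall) as summed pair scores over each count."""
    if not ptts or not atts:
        return 0.0, 0.0
    total = sum(score(td) for _, _, td in match_pairs(ptts, atts))
    return total / len(ptts), total / len(atts)


precision, recall = evaluate([12.0, 100.0, 250.0, 400.0], [10.0, 108.0, 300.0])
```

In the example, the first two predictions land 2 and 8 time units from an actual transition (scores 1.0 and 0.8), the third is 50 units off (score 0.0), and the fourth is an extra PTT with no partner.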



Results

Effect of fusion


Conclusion
- ATLAS determines the temporal segmentation by fusing the transition cues computed from the visual content and the text analysis
- ATLAS annotates the text corresponding to the determined temporal transitions
- ATLAS improves accessibility and traceability within lecture video content



Acknowledgment
- The authors are very grateful to Nimisha Drolia and the anonymous reviewers for constructive suggestions to improve the quality of this work.
- This research has been supported by the Singapore National Research Foundation under its International Research Centre @ Singapore Funding Initiative and administered by the IDM Programme Office through the Centre of Social Media Innovations for Communities (COSMIC).



Thank You




System Overview (2)
[Figure: system diagram with blocks labeled N-gram language model and SVMhmm model, whose transition times TT1 and TT2 feed a Fusion block]
- Temporal transition: SVMhmm + fusion
- Text annotations: N-gram token of SRT


System Overview (3)
[Figure: system diagram highlighting the SVMhmm model, with outputs labeled TT1 and TT2]



System Overview (4)
[Figure: system diagram highlighting the N-gram language model, with outputs labeled TT1 and TT2]

