CROSS-VALIDATION BASED DECISION TREE CLUSTERING FOR HMM-BASED TTS

Yu Zhang 1,2, Zhi-Jie Yan 1 and Frank K. Soong 1
1 Microsoft Research Asia, Beijing, China
2 Shanghai Jiao Tong University, Shanghai, China

Introduction

◮ Conventional HMM-based speech synthesis
 ⊲ spectrum, excitation, and duration features are modeled and generated in a unified HMM-based framework
 ⊲ a decision tree, together with the ML and MDL criteria, is used for parameter tying
◮ Conventional decision tree based context clustering
 ⊲ ML-based greedy tree-growing algorithm
 ⊲ MDL-based stopping criterion
◮ Cross-validation based decision tree
 ⊲ improves the conventional greedy splitting criterion
 ⊲ proposes a new stopping criterion for node splitting

Main problems
◮ Splitting criterion: the greedy search is sensitive to a biased training set
◮ Stopping criterion: not effective when the training data are not asymptotically large, and requires a manually tuned threshold

MDL-based decision tree clustering

[Figure: a node S_m is split by a context question q (e.g., "R-voiced?"); the "Yes" branch leads to S_mqy and the "No" branch to S_mqn]

◮ ML criterion for node splitting
 ⊲ δ^ML(D^m)_q = L(S_mqy) + L(S_mqn) − L(S_m)
◮ MDL criterion for stopping
 ⊲ δ^MDL(D^m)_q = δ^ML(D^m)_q − αL log G   (1)
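As a concrete illustration, the two criteria above can be sketched for single-Gaussian leaf models. This is a minimal sketch, not the authors' implementation: the diagonal-Gaussian model, the variance floor, and the reading of α as the penalty weight, L as the feature dimensionality, and G as the current number of leaf nodes are assumptions made here for illustration.

```python
import math

def node_log_likelihood(X):
    """Log-likelihood of data X (list of equal-length feature vectors) under
    its own ML diagonal Gaussian: -N/2 * sum_d (log(2*pi*var_d) + 1)."""
    N, L = len(X), len(X[0])
    ll = 0.0
    for d in range(L):
        col = [x[d] for x in X]
        mean = sum(col) / N
        var = sum((v - mean) ** 2 for v in col) / N + 1e-8  # ML variance + floor
        ll += -0.5 * N * (math.log(2.0 * math.pi * var) + 1.0)
    return ll

def ml_split_gain(X, yes):
    """delta_ML(D^m)_q = L(S_mqy) + L(S_mqn) - L(S_m) for one question q,
    where `yes` flags the vectors that answer the question with "yes"."""
    Xy = [x for x, f in zip(X, yes) if f]
    Xn = [x for x, f in zip(X, yes) if not f]
    return (node_log_likelihood(Xy) + node_log_likelihood(Xn)
            - node_log_likelihood(X))

def mdl_split_gain(X, yes, alpha, G):
    """delta_MDL = delta_ML - alpha * L * log G (Eq. (1)); split only if > 0."""
    return ml_split_gain(X, yes) - alpha * len(X[0]) * math.log(G)
```

A large α makes the penalty dominate, so splitting stops earlier; this is the manually tuned threshold the poster's "Main problems" refer to.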

Cross-validation based decision tree clustering

◮ Divide the training data D^m into K subsets D^m_1, D^m_2, ..., D^m_K at each node; for each fold k, a model Λ^m_k is trained on D^m \ D^m_k
◮ Node splitting criteria
 ⊲ evaluate the likelihood on the K validation sets:
  L^CV_k(D^m_k) = Σ_{x ∈ D^m_k} log P(x | Λ^m_k)
 ⊲ likelihood increase obtained by splitting with question q:
  δ^CV(D^m_k)_q = L^CV_k(D^mqy_k) + L^CV_k(D^mqn_k) − L^CV_k(D^m_k)
 ⊲ select the best question over all validation sets:
  q_m = arg max_q Σ_k δ^CV(D^m_k)_q
◮ Node stopping criteria
 ⊲ node splitting intuitively stops when
  Σ_k δ^CV(D^m_k)_{q_m} < 0
 ⊲ the MDL criterion (Eq. (1)) can also be used:
  Σ_k δ^CV(D^m_k)_{q_m} − αL log G < 0
 ⊲ in our experiments, the first criterion gives good results
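The fold-based question selection and the intuitive stopping rule can be sketched as follows. The helpers `train(data) -> model` and `loglik(model, data)` are hypothetical placeholders (any generative model works); the interleaved fold assignment in `kfold` is an implementation choice, not taken from the paper.

```python
def kfold(data, K):
    """Partition the node's data D^m into K disjoint subsets D^m_1..D^m_K."""
    return [data[k::K] for k in range(K)]

def cv_gain(folds, question, train, loglik):
    """sum_k delta_CV(D^m_k)_q: total held-out log-likelihood gain of q.

    For each fold k, the parent and child models are trained on the
    remaining folds (children on the yes/no partition induced by q) and
    evaluated on the held-out fold D^m_k.
    """
    total = 0.0
    for k, held_out in enumerate(folds):
        rest = [x for j, fold in enumerate(folds) if j != k for x in fold]
        parent = train(rest)
        child_yes = train([x for x in rest if question(x)])
        child_no = train([x for x in rest if not question(x)])
        total += (loglik(child_yes, [x for x in held_out if question(x)])
                  + loglik(child_no, [x for x in held_out if not question(x)])
                  - loglik(parent, held_out))
    return total

def best_question(folds, questions, train, loglik):
    """q_m = argmax_q sum_k delta_CV(D^m_k)_q; a negative best gain means
    the intuitive criterion says: stop splitting this node."""
    scored = [(cv_gain(folds, q, train, loglik), q) for q in questions]
    gain, q = max(scored, key=lambda t: t[0])
    return (q, gain) if gain >= 0.0 else (None, gain)
```

Because the gain is measured on held-out folds, an overfitting split scores negative on its own, which is why no hand-tuned threshold is needed.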

Experiments

Experimental Setup
◮ Training database
 ⊲ Mandarin corpus, 16 kHz, 1,000 sentences, female speaker
 ⊲ 40th-order LSP + gain, F0
 ⊲ first- and second-order dynamic features
 ⊲ 25,761 different rich-context phone models
◮ MDL-based decision tree
 ⊲ standard MDL method: α = 1.0
 ⊲ tuning α on the development set
◮ Cross-validation based decision tree
 ⊲ stop node splitting using the intuitive criterion
 ⊲ MDL-based criterion (Eq. (1))

◮ Determining the number of cross-validation folds K

Table: Log spectral distortion for different K on the development set

K        | 4    | 6    | 8    | 10   | 14
LSD (dB) | 5.32 | 5.33 | 5.32 | 5.32 | 5.31

◮ Objective Test Results

[Figure: log spectral distance (dB), RMSE of F0 (Hz/frame), and RMSE of duration (ms/phone) plotted against the number of states, comparing MDL trees (α = 1.0, 0.8, 0.5) with the cross-validation tree, which stops automatically]

◮ Subjective Test Results

[Figure: subjective preference test results]
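For reference, the log spectral distortion reported in the table can be computed, under a common convention (the paper's exact variant, e.g. LSP-derived spectra, may differ), as the per-frame RMS difference of log-magnitude spectra in dB:

```python
import math

def lsd_db(spec_ref, spec_test):
    """Log spectral distortion (dB) for one frame: RMS over frequency bins of
    the difference of 20*log10 magnitudes. Average over frames for a corpus
    score. Inputs are positive magnitude spectra of equal length."""
    diffs = [20.0 * math.log10(a / b) for a, b in zip(spec_ref, spec_test)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```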

Conclusions

◮ Use cross-validation in decision tree clustering for HMM-based TTS
◮ Propose a new splitting criterion and a new stopping criterion for tree building
◮ Compared with the conventional method, cross-validation yields better performance at a similar model size
