Using Weights in mclust
Thomas Brendan Murphy
University College Dublin, Ireland
[email protected]

Luca Scrucca
Università degli Studi di Perugia, Italy
[email protected]

June 2012

1 Introduction

The mclust package (Fraley and Raftery, 2002, 2006) can be extended to allow for each observation to have a weight which controls its contribution to the model fit. A brief background on how weights can be included is given in Section 2. A function me.weighted(), which is included in mclust, provides a weighted version of the me() function in mclust. The use of me.weighted() is demonstrated on two examples in Section 3 and Section 4.

2 Background

The log-likelihood function $l$ and the complete-data log-likelihood function $l_c$ for the finite mixture of normal distributions, as used in mclust, are of the form

$$
l = \sum_{i=1}^{n} \log\left[ \sum_{g=1}^{G} \tau_g f(x_i \mid \mu_g, \Sigma_g) \right] \tag{1}
$$

and

$$
l_c = \sum_{i=1}^{n} \sum_{g=1}^{G} z_{ig} \log\left[ \tau_g f(x_i \mid \mu_g, \Sigma_g) \right]. \tag{2}
$$

Suppose we introduce weights for each observation, so that the weight for observation $i$ is given by $w_i$. Then we can define the weighted log-likelihood $l^{(w)}$ and the weighted complete-data log-likelihood $l_c^{(w)}$ to have the form

$$
l^{(w)} = \sum_{i=1}^{n} w_i \log\left[ \sum_{g=1}^{G} \tau_g f(x_i \mid \mu_g, \Sigma_g) \right] \tag{3}
$$

and

$$
l_c^{(w)} = \sum_{i=1}^{n} \sum_{g=1}^{G} z_{ig} w_i \log\left[ \tau_g f(x_i \mid \mu_g, \Sigma_g) \right]. \tag{4}
$$
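To make equation (3) concrete, the weighted log-likelihood can be evaluated directly from a fitted mclust model using the dens() function, which returns the mixture density over the groups for each observation. The weight vector below is a purely illustrative choice:

```r
library(mclust)

data(iris)
X <- iris[, 1:4]
fit <- Mclust(X)

# Illustrative weights: down-weight the first ten observations
w <- rep(1, nrow(X))
w[1:10] <- 0.5

# log of the mixture density for each observation i:
# log sum_g tau_g f(x_i | mu_g, Sigma_g)
logdens <- dens(data = X, modelName = fit$modelName,
                parameters = fit$parameters, logarithm = TRUE)

# Weighted log-likelihood l^(w), as in equation (3)
lw <- sum(w * logdens)
```

Setting every weight to one recovers the ordinary log-likelihood (1), so in that case lw should agree with fit$loglik.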

The E-step of the EM algorithm for fitting the normal mixture model with weights is unchanged by the introduction of the weights. However, the M-step is changed in that the values $z_{ig}$ in the complete-data log-likelihood function (2) are replaced by $z_{ig} w_i$ in the weighted complete-data log-likelihood (4). Thus, if these changed values are supplied to the mstep() function (and related functions) in mclust, then the weights are used in an appropriate manner for model fitting. The me.weighted() function implements the EM algorithm for fitting normal mixtures with weights, where the first step of the algorithm is an M-step; it is similar in syntax to the me() function and takes the same arguments, together with an optional additional vector of weights. Note that if any of the weights are greater than one, then all of the weights are rescaled to have a maximum of one. Also, the parameter estimates are not affected by multiplying the weights by a constant value.
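The rescaling behaviour can be checked directly by fitting the same model with a weight vector and with a constant multiple of it. The random weights below are an illustrative choice, and the starting z matrix is taken from an unweighted Mclust fit:

```r
library(mclust)

data(iris)
X <- iris[, 1:4]
fit <- Mclust(X)

set.seed(1)
w <- runif(nrow(X))

# Fit with weights w and with 2*w; since weights exceeding one are
# rescaled to a maximum of one, and a constant rescaling does not
# affect the estimates, the two fits should agree
fit1 <- me.weighted(data = X, modelName = fit$modelName,
                    z = fit$z, weights = w)
fit2 <- me.weighted(data = X, modelName = fit$modelName,
                    z = fit$z, weights = 2 * w)

all.equal(fit1$parameters$mean, fit2$parameters$mean)
```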

3 Example: Parameter Sensitivity

We can use mclust to carry out a model-based clustering of the iris data (Anderson, 1935). However, we may wish to assess the sensitivity of the fit to the deletion of an observation from the data. This can easily be achieved by assigning the deleted observation, j, a weight wj = 0 and every other observation a weight wi = 1 for i ≠ j. First, we use the Mclust function to complete a model-based clustering of the iris data. The output is stored in the object fit.

> library(mclust)
> data(iris)
> X <- iris[, 1:4]
> fit <- Mclust(X)

Next, we give observation j = 1 a weight of zero and every other observation a weight of one, and refit the model using me.weighted(), starting from the estimated z matrix of the original fit. The output is stored in the object fitnew, and the estimated covariance matrices from the two fits can then be compared.

> j <- 1
> w <- rep(1, nrow(X))
> w[j] <- 0
> fitnew <- me.weighted(data = X, modelName = fit$modelName,
+                       z = fit$z, weights = w)
> print(fit$parameters$variance$sigma)
, , 1

             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length     0.150651    0.130801    0.0208446   0.0130910
Sepal.Width      0.130801    0.176045    0.0160325   0.0122145
Petal.Length     0.020845    0.016032    0.0280826   0.0060157
Petal.Width      0.013091    0.012215    0.0060157   0.0104237

, , 2

             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length      0.40004    0.108654      0.39940    0.143682
Sepal.Width       0.10865    0.109281      0.12389    0.072844
Petal.Length      0.39940    0.123890      0.61090    0.257389
Petal.Width       0.14368    0.072844      0.25739    0.168082

> print(fitnew$parameters$variance$sigma)
, , 1

             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length     0.153558    0.133385    0.0214014   0.0134660
Sepal.Width      0.133385    0.179578    0.0165283   0.0125735
Petal.Length     0.021401    0.016528    0.0285980   0.0060943
Petal.Width      0.013466    0.012573    0.0060943   0.0106006

, , 2

             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length      0.40066    0.108856      0.40018    0.144000
Sepal.Width       0.10886    0.109222      0.12420    0.072928
Petal.Length      0.40018    0.124198      0.61194    0.257876
Petal.Width       0.14400    0.072928      0.25788    0.168274

Thus, we can see the impact of deleting a single observation from the data. The above code can be easily adapted to consider deleting data subsets of size greater than one.
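For example, a full leave-one-out sensitivity analysis might loop over the observations, refit with each deleted in turn, and record a summary of the change in the estimates; monitoring the component means, as below, is just one illustrative choice of summary:

```r
library(mclust)

data(iris)
X <- iris[, 1:4]
fit <- Mclust(X)
n <- nrow(X)

# Largest absolute change in the component means when observation j
# is deleted (given weight zero)
maxchange <- numeric(n)
for (j in 1:n) {
  w <- rep(1, n)
  w[j] <- 0
  fitj <- me.weighted(data = X, modelName = fit$modelName,
                      z = fit$z, weights = w)
  maxchange[j] <- max(abs(fitj$parameters$mean - fit$parameters$mean))
}

# The observations whose deletion moves the means the most
head(order(maxchange, decreasing = TRUE))
```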

4 Example: Bootstrapping

The inclusion of weights also facilitates computing bootstrap standard errors for parameter estimates; the use of bootstrapping (Efron, 1979; Efron and Tibshirani, 1993) to compute standard errors for mclust parameter estimates is explored in Everitt and Hothorn (2010), O’Hagan (2012) and O’Hagan et al. (2012). The following code chunk uses B bootstrap samples to find standard errors for the mclust model parameters.


Each bootstrap sample can be represented by a vector of weights, where the weight for an observation is the number of times it appears in the bootstrap sample; these counts follow a multinomial distribution. The bootstrap standard errors are then the standard deviations of the parameter estimates across the B weighted fits.

> library(mclust)
> data(iris)
> X <- iris[, 1:4]
> fit <- Mclust(X)
> n <- nrow(X)
> B <- 100
> meanboot <- array(NA, dim = c(dim(fit$parameters$mean), B))
> for (b in 1:B)
+ {
+   w <- as.vector(rmultinom(1, size = n, prob = rep(1/n, n)))
+   fitb <- me.weighted(data = X, modelName = fit$modelName,
+                       z = fit$z, weights = w)
+   meanboot[, , b] <- fitb$parameters$mean
+ }
> se <- apply(meanboot, c(1, 2), sd)
