The FAST Toolkit for Unsupervised Learning of HMMs with Features
Yun Huang† (YUH43@PITT.EDU), Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA
José P. González-Brenes† (JOSE.GONZALEZ-BRENES@PEARSON.COM), School Research, Pearson, Philadelphia, PA, USA
Peter Brusilovsky (PETERB@PITT.EDU), Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA
† Both authors contributed equally.
Abstract

FAST (González-Brenes et al., 2014) is a toolkit that enables efficient unsupervised learning of Hidden Markov Models (HMMs) with features. The toolkit allows emission and transition probabilities to be parameterized by features extracted from the observations. FAST uses a recent variant of the Expectation-Maximization (EM) algorithm (Berg-Kirkpatrick et al., 2010) that allows unsupervised learning of directed graphical models with features. This variant of EM parameterizes conditional probabilities with a maximum entropy model, instead of the typical conditional probability tables. FAST recently became available at the machine learning open-source repository mloss¹, but has been available elsewhere for over a year. The current version was originally designed for predicting student performance, but it is not limited to this application.

In student modeling, HMMs are a popular tool for inferring student knowledge from performance data. Unfortunately, they do not allow modeling the feature-rich data that it is now possible to collect in modern digital learning environments. Because of this, many ad hoc HMM variants have been proposed, each modeling a specific feature of interest. For example, variants have studied the effect of students' individual characteristics, the effect of help in a tutor, and subskills. These ad hoc models are successful for their own specific purpose, but are specified to model only a single feature. Our experiments suggest that using features can improve the AUC of HMMs by up to 25%. Moreover, FAST can be 300 times faster than models created in BNT-SM (Chang et al., 2006), a toolkit that facilitates the creation of Dynamic Bayesian Networks for educational applications. We are currently pursuing a minor refactoring of the documentation and the API to make FAST more general beyond student modeling applications.

¹ https://mloss.org/software/view/609/

Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015. JMLR: W&CP volume 37. Copyright 2015 by the author(s).
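The core idea — replacing an HMM's conditional probability tables with a maximum entropy (log-linear) model over features, as in Berg-Kirkpatrick et al. (2010) — can be sketched as follows. This is an illustrative NumPy sketch of the parameterization, not FAST's actual API; the function name, feature shapes, and the student-modeling interpretation in the comments are our own assumptions.

```python
import numpy as np

def maxent_cpt(weights, features):
    """Build a conditional probability table P(obs | state) from a
    log-linear (maximum entropy) model: P(o | s) is proportional to
    exp(w . f(s, o)), normalized over the observations for each state.

    weights:  (D,) parameter vector w
    features: (S, O, D) feature vector f(s, o) per (state, observation) pair
    """
    scores = features @ weights                    # (S, O) log-potentials
    scores -= scores.max(axis=1, keepdims=True)    # subtract max for stability
    exp_scores = np.exp(scores)
    # Normalize each row so it is a valid emission distribution per state.
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

# Illustrative setup: 2 hidden states (e.g., "knows the skill" vs. "does
# not"), 2 observations (correct / incorrect), 3 features per pair.
rng = np.random.default_rng(0)
features = rng.normal(size=(2, 2, 3))
weights = rng.normal(size=3)
cpt = maxent_cpt(weights, features)
print(cpt.sum(axis=1))  # each state's emission distribution sums to 1
```

In the featurized EM variant, the E-step runs the usual forward-backward recursions on the probabilities produced this way, and the M-step fits `weights` by gradient-based optimization of the expected log-likelihood rather than by normalizing counts — which is what lets arbitrary overlapping features parameterize the HMM.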
References

Berg-Kirkpatrick, Taylor, Bouchard-Côté, Alexandre, DeNero, John, and Klein, Dan. Painless unsupervised learning with features. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 582–590. Association for Computational Linguistics, 2010.

Chang, Kai-min, Beck, Joseph, Mostow, Jack, and Corbett, Albert. A Bayes net toolkit for student modeling in intelligent tutoring systems. In Intelligent Tutoring Systems, pp. 104–113. Springer, 2006.

González-Brenes, J. P., Huang, Y., and Brusilovsky, P. General features in knowledge tracing: Applications to multiple subskills, temporal item response theory, and expert knowledge. In Mavrikis, Manolis and McLaren, Bruce M. (eds.), Proceedings of the 7th International Conference on Educational Data Mining, London, UK, 2014.