On Extending Neural Networks with Loss Ensembles for Text Classification

Hamideh Hajiabadi
Ferdowsi University of Mashhad (FUM)
Mashhad, Iran

Diego Molla-Aliod
Macquarie University
Sydney, New South Wales, Australia



arXiv:1711.05170v1 [cs.CL] 14 Nov 2017

Reza Monsefi
Ferdowsi University of Mashhad (FUM)
Mashhad, Iran

Abstract

Ensemble techniques are powerful approaches that combine several weak learners to build a stronger one. As a meta-learning framework, ensemble techniques can easily be applied to many machine learning techniques. In this paper we propose a neural network extended with an ensemble loss function for text classification. The weight of each weak loss function is tuned within the training phase through the gradient propagation optimization method of the neural network. The approach is evaluated on several text classification datasets. We also evaluate its performance in various environments with several degrees of label noise. Experimental results indicate an improvement of the results and strong resilience against label noise in comparison with other methods.

1 Introduction

In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance (Mannor and Meir, 2001). It has been proved that ensemble methods can boost weak learners, whose accuracies are only slightly better than random guessing, into arbitrarily accurate strong learners (Bai et al., 2014; Zhang et al., 2016). When it is not possible to directly design a strong, complex learning system, ensemble methods offer a practical alternative. In this paper, we are inspired by ensemble techniques to combine several weak loss functions in order to design a stronger ensemble loss function for text classification.

We focus on multi-class classification, where the class to predict is encoded as a vector y with the one-hot encoding of the target label, and the output of a classifier ŷ = f(x; θ) is a vector of probability estimates of each label given input sample x and training parameters θ. A loss function L(y, ŷ) is then a positive function that measures the error of estimation (Steinwart and Christmann, 2008). Different loss functions have different properties, and some well-known loss functions are shown in Table 1. Different loss functions lead to different Optimum Bayes Estimators, each with its own unique characteristics. Thus, in each environment, the choice of loss function affects performance significantly (Xiao et al., 2017; Zhao et al., 2010).

In this paper, we propose an approach for combining loss functions which performs substantially better, especially in the presence of annotation noise. The framework is designed as an extension to regular neural networks, where the loss function is replaced with an ensemble of loss functions, and the ensemble weights are learned as part of the gradient propagation process. We implement and evaluate our proposed algorithm on several text classification datasets; a minimal illustrative sketch of the ensemble loss is given at the end of this section.

The paper is structured as follows. An overview of several loss functions for classification is briefly given in Section 2. The proposed framework and the proposed algorithm are explained in Section 3. Section 4 contains experimental results on classifying several text datasets. The paper is concluded in Section 5.
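The following is a minimal sketch of such an ensemble loss, not the exact implementation used in this paper. The particular weak losses (cross-entropy, squared error, and a multi-class hinge) and the softmax parameterisation of the mixing weights are illustrative assumptions; the gradient of the ensemble loss with respect to the weight logits follows in closed form from the softmax derivative, so the weights can be updated by the same gradient step that trains the network.

import numpy as np

def cross_entropy(y, y_hat, eps=1e-12):
    """Weak loss 1: categorical cross-entropy for a one-hot target y."""
    return -float(np.sum(y * np.log(y_hat + eps)))

def squared_error(y, y_hat):
    """Weak loss 2: sum of squared differences between target and estimate."""
    return float(np.sum((y - y_hat) ** 2))

def multiclass_hinge(y, y_hat, margin=1.0):
    """Weak loss 3: Crammer-Singer-style hinge, applied here to the
    probability estimates for simplicity."""
    true = int(np.argmax(y))
    violations = y_hat - y_hat[true] + margin
    violations[true] = 0.0
    return float(np.maximum(violations, 0.0).max())

WEAK_LOSSES = [cross_entropy, squared_error, multiclass_hinge]

def ensemble_loss(y, y_hat, alpha):
    """Convex combination of the weak losses with weights softmax(alpha).
    Returns the ensemble loss and its gradient w.r.t. alpha, so the mixing
    weights can be learned alongside the network parameters."""
    p = np.exp(alpha - alpha.max())
    p /= p.sum()                               # softmax: positive weights summing to 1
    losses = np.array([L(y, y_hat) for L in WEAK_LOSSES])
    L_ens = float(p @ losses)
    grad_alpha = p * (losses - L_ens)          # d L_ens / d alpha_j = p_j (L_j - L_ens)
    return L_ens, grad_alpha

# Toy usage: one training example of a 3-class problem.
y = np.array([0.0, 1.0, 0.0])                  # one-hot target label
y_hat = np.array([0.2, 0.5, 0.3])              # predicted class probabilities
alpha = np.zeros(len(WEAK_LOSSES))             # learnable ensemble weight logits
loss, grad = ensemble_loss(y, y_hat, alpha)
alpha -= 0.1 * grad                            # one gradient-descent step on the weights

The softmax parameterisation above is one simple way to keep the mixing weights positive and normalised; the weighting scheme actually used in this paper may differ.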

2 Background

A typical machine learning problem can be reduced to an expected loss function minimization problem (Bartlett et al., 2006; Painsky and Rosset, 2016). Rosasco et al. (2004) studied the impact of choosing different loss functions from the viewpoint of statistical learning theory. In this section, we briefly review several well-known loss functions for classification.
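For concreteness, the expected loss minimization problem referred to above can be written in its standard textbook form (not quoted from this paper): given a loss L and data drawn from an unknown distribution P, the learner seeks

f^{*} = \arg\min_{f} \; \mathbb{E}_{(x,y)\sim P}\big[L(y, f(x))\big] \;\approx\; \arg\min_{f} \; \frac{1}{n} \sum_{i=1}^{n} L\big(y_i, f(x_i)\big),

where the expectation is approximated by the empirical average over the n training examples.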

Table 1: Well-known loss functions for classification.

Name of loss function           L(y, ŷ)
Zero-One (Xiao et al., 2017)    L_{0-1} = 0 if z ≥ 0; 1 if z < 0
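As a brief illustration (not part of the original table), the zero-one loss can be computed as follows for the multi-class, one-hot setting used in this paper. The binary margin form assumes z = y · f(x) with labels in {-1, +1}, which is the usual reading of z; its definition is not given above, so this reading is an assumption.

import numpy as np

def zero_one_loss(y, y_hat):
    """Zero-one loss for a one-hot target y and probability estimates y_hat:
    0 if the highest-scoring class matches the target label, 1 otherwise."""
    return 0.0 if np.argmax(y_hat) == np.argmax(y) else 1.0

def zero_one_margin(z):
    """Binary margin form: assuming z = y * f(x) with y in {-1, +1},
    the loss is 0 when the prediction has the correct sign (z >= 0), else 1."""
    return 0.0 if z >= 0 else 1.0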
