Using Deep Neural Networks for Extracting Sentiment Targets in Arabic Tweets

Ayman El-Kilany¹, Amr Azzam¹, and Samhaa R. El-Beltagy²

¹ Faculty of Computers and Information, Cairo University, Giza 12613, Egypt
² Center for Informatics Science, Nile University, Giza 12588, Egypt

[email protected], [email protected], [email protected]
Abstract. In this paper, we investigate the problem of recognizing the entities targeted by the sentiment expressed in Arabic tweets. To do so, we train a bidirectional LSTM deep neural network with a conditional random fields classification layer on top of the network to learn the features of this specific set of entities and extract them from Arabic tweets. We evaluated the network's performance against a baseline method that combines a regular named entity recognizer with a sentiment analyzer. The deep neural network shows a noticeable advantage in extracting sentiment target entities from Arabic tweets.

Keywords: Sentiment Target Recognition, Long Short-term Memory Networks, Recurrent Neural Networks, Conditional Random Fields, Arabic
1 Introduction

Sentiment target recognition is the task of identifying the subset of entities in a piece of text that are targeted by the text's overall sentiment. It links sentiment analysis and named entity recognition, as it seeks to recognize only those entities that are the focus of a tweet's positive or negative sentiment. The task we explore is formulated as follows: given some input text annotated with either positive or negative sentiment, identify the entity in that text that is targeted by the given sentiment. For example, the text "So happy that Kentucky lost to Tennessee!" contains two entities: 'Kentucky' and 'Tennessee'. The overall sentiment of this tweet is positive, so the sentiment target recognizer must identify the target of this positive sentiment, which is 'Tennessee'. Sentiment target recognition in this setting is a specific case of a more general framework which assumes that each entity in the text may carry its own sentiment. In this paper, we propose a solution that finds the targets of the overall text sentiment rather than breaking the text into multiple sentiments, each with its own targets. This setting allows us to tackle the sentiment target recognition problem as a specific case of named entity recognition.

In this paper, we propose two models for the sentiment target recognition task. All our experiments are carried out on Arabic text. The first model is a baseline which utilizes a named entity recognizer [1][2] and a sentiment analyzer [3] to recognize the target entities of a given text's sentiment. The second model is a deep neural network for named entity recognition which is provided with target named entities in the training phase in order to identify them during the extraction phase. The network we used consists of a bidirectional LSTM [4] with a conditional random fields [5] classifier as a layer on top of it. A word2vec model for Arabic words was built in order to represent the word vectors in the network. The deep neural network model shows a noticeable advantage over the baseline model in detecting sentiment targets in a dataset of Arabic tweets. Fig 1 and Fig 2 depict the components of the baseline and deep neural network models, respectively.
Fig 1: Components of the baseline model
Fig 2: Components of the deep neural network model
The rest of this paper is organized as follows: Section 2 provides background on tasks related to sentiment target recognition; Section 3 describes the collection and annotation of Arabic tweets; Section 4 details the process of building the word embeddings model, while Section 5 describes both the baseline model and the deep neural model; Section 6 presents an evaluation of the proposed model through a comparison of its performance against the baseline model; and Section 7 concludes this paper.
2 Background

The task of sentiment target recognition falls somewhere between stance detection, targeted sentiment analysis, and named entity recognition. While named entity recognition seeks to find all entities in a text, targeted sentiment analysis seeks to assign each entity (or aspect of an entity) in a given text its relevant sentiment, which is highly dependent on the context in which it appears. Stance detection, on the other hand, seeks to determine whether the author of a given text is for or against some predefined 'target', which can be an entity or an issue. Many models have been developed for each of these tasks, but recently the use of deep neural network models for all of them has been investigated, as detailed below.

Long Short-term Memory networks (LSTMs) [6] are a special type of recurrent neural network [7] which operates on sequential data. Given an input sequence of vectors, an LSTM network returns a decision about each vector in the sequence. LSTMs are designed to handle dependencies in long sequences by using a memory cell that preserves the state produced by operations on earlier vectors in the sequence, preventing that information from being forgotten over many iterations. In [8] and [4], bidirectional LSTMs were trained for the tasks of named entity recognition, part-of-speech tagging, and chunking without the inclusion of any handcrafted features. Word embeddings [9] and character embeddings [8] were used as the only input representations for the sentence sequences in the LSTM network. Given the bidirectional scheme, each word's representation vector was derived from both its right and left contexts in order to capture its state correctly. Recognizing named entities with an LSTM network was found to achieve state-of-the-art performance in four different languages [8].

Detecting the polarity of aspects within English text is another task for which a multitude of systems have been developed using deep neural approaches [10]. An example of such a system is presented in [11], where the authors used a deep convolutional neural network to address the problem. In [8], a conditional random fields (CRF) classifier was used on top of an LSTM network in order to find the best tagged sequence of words and named entities for an input sequence of words. The CRF classifier was used to model tagging decisions jointly rather than predicting the tag for each word independently. A CRF classifier was also used in [12] for the targeted sentiment analysis task in order to extract each entity contained within a given text and assign it a sentiment value, using a set of handcrafted features for both named entities and sentiments.
Recurrent neural networks have also been used for the stance detection task. An example is presented in [13], where a gated recurrent neural network is used to identify the sentiment of some input text given a target. The target word(s), as well as the words of its left and right contexts, are represented explicitly in the network in order to obtain a better classification of the sentiment related to that target.

Using a deep neural network to solve problems similar to those presented in [14] has a clear advantage: important features are discovered automatically through the network's hidden layers. This automatic feature discovery makes it possible to capture a variety of hidden information within text without having to worry about feature engineering. In this work, we use a bidirectional LSTM to learn the features of sentiment targets, which are usually named entities, rather than learning the features of all named entities. Tackling this problem with a regular named entity recognizer would be challenging, since the handcrafted features of named entities and of sentiment targets are not expected to differ.
3 Data Collection and Annotation

When we embarked on the task of sentiment target recognition in Arabic tweets, we discovered that no previous work had been carried out in this area. We therefore decided to build our own dataset. To create this dataset, we used a previously collected set of 480,000 tweets. That set had been collected by searching the Twitter API using entries from the NileULex sentiment lexicon [15]. Tweets from this set were randomly selected, annotated with sentiment using the sentiment analyzer described in [3], and annotated with entities using a modified version of the named entity recognition system described in [1]. Tweets that were annotated as neutral or that had no entities were discarded. A total of 12,897 tweets were generated this way. This number was much larger than the number we aimed to annotate, because we knew that some tweets would be ambiguous, while others would have incorrectly detected entities.

The next step was revising and correcting the automatically assigned labels, as well as specifying the sentiment target for each tweet. To facilitate this process, a web-based, user-friendly application was developed. In cases where the assigned sentiment was wrong or where the detected entity was incorrect, the annotator was asked to correct it. The annotator was also asked to specify the target entities for each tweet, either by selecting from the already identified entities or by manually entering the target entity. The annotation process was carried out by a paid expert annotator. To ensure that the resulting annotations would be of high quality, we periodically collected random annotated samples and examined their correctness. The annotator was aware that we were doing this, and the agreement was that s/he would only get paid after the annotation process was completed. This process resulted in the annotation of over 5000 target entities.

For the sentiment target recognition task, we created a set of tags similar to those of the IOB format [8] used for named entity recognition tagging.
The sentiment target recognition tagging scheme has only three tags: B-target, I-target, and O. A word is tagged with B-target if it is the first word (beginning) of a target entity in a tweet. The I-target tag is used when the target is made up of a sequence of words: while the first word in such a case is tagged with B-target, the remaining words in the sequence are tagged with I-target. Any word which is not part of a target is tagged with O. This dataset is available upon request from any of the authors, and will also be made available online.
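To make the scheme concrete, here is how the introductory English example "So happy that Kentucky lost to Tennessee!" would be tagged. This is a hypothetical illustration, since the dataset itself consists of Arabic tweets:

```python
# Hypothetical token/tag pairs under the B-target/I-target/O scheme.
# 'Tennessee' is the target of the positive sentiment; 'Kentucky' is an
# entity but not a target, so it receives the O tag like ordinary words.
# A multi-word target such as 'New York' would be tagged B-target, I-target.
tagged_tweet = [
    ("So", "O"), ("happy", "O"), ("that", "O"),
    ("Kentucky", "O"), ("lost", "O"), ("to", "O"),
    ("Tennessee", "B-target"), ("!", "O"),
]
```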
4 Building Word Embeddings

One of the goals of this work was to experiment with a bidirectional Long Short-Term Memory (BI-LSTM) neural network. Previous work on text-related tasks has shown that the performance of such networks is usually enhanced by the use of word embeddings. Word embeddings [16] are a feature-learning model used in NLP tasks to transform input words into real-valued vectors. The model constructs low-dimensional vector representations for all words in a given input corpus, such that words sharing common contexts in that corpus are mapped to vectors that are close to each other in the vector space. As such, word embeddings produce an expressive representation of words that captures their semantics. Using word embeddings within deep learning models for NLP tasks such as semantic parsing, named entity recognition, and sentiment analysis has been shown to yield a great performance boost. These tasks usually have only a limited set of labeled instances for training, while word embeddings can be constructed from huge unlabeled corpora, so it is very beneficial to use pre-trained word embeddings in order to obtain a generalized model with reliable estimates of the problem parameters.

For English, it is quite easy for researchers wishing to experiment with word embeddings to use publicly available pre-trained embeddings. For Arabic, however, such embeddings did not exist, so we had to build our own. Training a word embeddings model usually requires a large input corpus. The corpus we used consisted of fifty-four million Arabic tweets collected over a duration of four months (during 2015) by researchers at Nile University. In this work, we applied two embedding models to generate a vector for each corpus word.

The first model was built using word2vec [16], trained with the skip-gram architecture. Building the model involved a series of experiments with different settings and configurations. The tool used for building the word2vec model was gensim¹, a widely used Python library for topic modeling and other text-related tasks. In the end, we used a model that produces vectors of length 100, with a window size of 5 and a minimum word frequency cut-off of 5. The advantage of this method is that it preserves the contextual similarity of words; moreover, word2vec learns word order from the contexts in the large corpus of tweets.

¹ https://pypi.python.org/pypi/gensim
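As an illustration, a minimal gensim sketch with the settings reported above (vector size 100, window 5, minimum count 5, skip-gram). The corpus variable and file name are placeholders, and note that gensim 4.x renamed the `size` parameter to `vector_size`:

```python
from gensim.models import Word2Vec

# Placeholder corpus: each tweet pre-tokenized into a list of words.
# The real corpus consisted of fifty-four million Arabic tweets.
tweets = [["مثال", "تغريدة", "إيجابية"]] * 10

model = Word2Vec(
    tweets,
    size=100,      # embedding dimensionality (vector_size in gensim >= 4.0)
    window=5,      # context window size
    min_count=5,   # minimum word frequency cut-off
    sg=1,          # 1 = skip-gram (0 would be CBOW)
)
model.save("arabic_tweets_w2v.model")  # hypothetical output path
vec = model.wv["تغريدة"]               # 100-dimensional vector for a corpus word
```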
The second embeddings model uses character-level embeddings, which derive a representation for every character, to create word embeddings. The model we used is similar to the one presented in [8]. The character embeddings are randomly initialized; then each character of every word in the sentiment target recognition dataset is passed to a bidirectional LSTM in the forward and backward directions. The embedding of each word is generated by concatenating the bidirectional LSTM representations of the word's characters. The advantage of character embeddings is that they allow for building task-specific embeddings generated from the labeled training data. In addition, they can handle the out-of-vocabulary problem.
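A minimal PyTorch sketch of this idea follows. The dimensions are illustrative rather than the ones used in the paper, and the character vocabulary handling is assumed:

```python
import torch
import torch.nn as nn

class CharBiLSTMEmbedding(nn.Module):
    """Character-level word embedding in the spirit of [8] (a sketch)."""

    def __init__(self, n_chars, char_dim=25, hidden_dim=25):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)  # randomly initialized
        self.bilstm = nn.LSTM(char_dim, hidden_dim,
                              bidirectional=True, batch_first=True)

    def forward(self, char_ids):
        # char_ids: (1, word_length) indices of one word's characters
        chars = self.char_emb(char_ids)
        _, (h_n, _) = self.bilstm(chars)
        # Concatenate the final forward (h_n[0]) and backward (h_n[1])
        # hidden states to form the word's character-level embedding.
        return torch.cat([h_n[0], h_n[1]], dim=-1)  # shape (1, 2 * hidden_dim)

# Usage with a hypothetical 3-character word over a 100-character vocabulary:
embedder = CharBiLSTMEmbedding(n_chars=100)
word_vec = embedder(torch.tensor([[5, 17, 42]]))  # shape (1, 50)
```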
5 The Implemented Models

The goal of our work was to experiment with a bidirectional Long Short-Term Memory (BI-LSTM) neural network for the task of sentiment target recognition. However, since this task had not been addressed before for Arabic, we needed to provide a baseline model to which obtained results could be compared. So in this paper, we present two different models for sentiment target recognition in Arabic tweets. The first is the baseline model, which relies on a windowing method to detect target entities. The second is our proposed approach: a sequence tagging model based on a hybrid of a bidirectional LSTM neural network and a conditional random fields (CRF) classifier. Both models are detailed in the following subsections.

5.1 The Baseline Model

For the baseline model, we followed a windowing approach which detects sentiment targets based on the distance between each entity in the tweet and the sentiment words that appear in that tweet. In this model, input tweets pass through a cleaning step that removes special characters and normalizes the tweet words for a consistent representation. The cleaned tweets are given to a modified version of the named entity recognizer described in [1] to detect the named entities in each tweet. The tweets are then passed to the Nile University sentiment analyzer for tweet sentiment identification and sentiment word detection [3]. For the latter task, the sentiment analyzer produces two lists per tweet: one of its negative words and one of its positive words. The implemented algorithm then selects the target entity based on calculating the Euclidean distance between the detected entities and the words present in the sentiment lists. The algorithm's details are shown in Fig 3; a simplified sketch follows.
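The core selection step can be sketched as follows. This is a simplification under assumed inputs (token positions for entities and sentiment words); the full procedure in Fig 3 also covers cleaning and tie handling:

```python
def select_target(entity_positions, sentiment_positions):
    """Pick the entity whose token position is closest to any sentiment word.

    entity_positions: dict mapping each detected entity to its token index.
    sentiment_positions: token indices of positive/negative sentiment words.
    """
    return min(
        entity_positions,
        key=lambda entity: min(abs(entity_positions[entity] - s)
                               for s in sentiment_positions),
    )

# Hypothetical tweet: the entity at index 3 is nearer to the sentiment
# word at index 1 than the entity at index 6, so it is chosen as target.
print(select_target({"entityA": 3, "entityB": 6}, [1]))  # -> entityA
```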
5.2 The Deep Neural Network Model

The proposed sentiment target recognition model is based on a hybrid sequence tagging approach that combines a deep neural network (a bidirectional LSTM) with a top layer consisting of a conditional random fields (CRF) classifier. The proposed model follows an architecture similar to those introduced in [14] and [4] for NLP tasks such as named entity recognition and part-of-speech tagging. The proposed bidirectional LSTM-CRF model architecture and how it is applied to the sentiment target recognition task are explained in the following subsections.
Fig 3: Algorithm for the window-based method for detecting sentiment targets

5.2.1 Bidirectional Long Short-term Memory Networks (BI-LSTMs)

Recurrent Neural Networks (RNNs) [7] are a class of neural networks that can preserve historical information from previous states. RNNs provide information persistence through loops, or iterations. These loops allow an RNN to represent information about sequences: the RNN takes a sequence of vectors (x1, x2, ..., xt) as input and produces another sequence (h1, h2, ..., ht) as output that represents information about the input sequence up to time t. RNNs have been applied to a variety of tasks including speech recognition [17] and image captioning [18]. RNNs have also been used for NLP tasks such as language modeling, named entity recognition (NER) [8], and machine translation [19]. However, RNNs suffer from the problem of long-term dependencies: they struggle to remember information over long spans, and in practice an RNN is biased toward the most recent vectors in an input sequence.

An LSTM [6] is a special kind of RNN designed to solve the long-term dependency problem through the use of several gates, among them the forget gate. These gates give an LSTM the ability to control the flow of information into the cell state.
The gates prevent back-propagated errors from vanishing or exploding, using sigmoid layers that determine which information to remember and which to forget. In this research, we have used the following LSTM implementation:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \quad (1)$$
$$c_t = (1 - i_t) \odot c_{t-1} + i_t \odot \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \quad (2)$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) \quad (3)$$
$$h_t = o_t \odot \tanh(c_t) \quad (4)$$

where $\sigma$ is the sigmoid function and $\odot$ is element-wise multiplication.
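For concreteness, a NumPy sketch of a single step of this LSTM variant (equations 1 to 4) is shown below. The weight shapes are assumptions: the peephole weights W_ci and W_co are taken to be diagonal and stored as vectors acting element-wise, as in [8]:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W):
    """One step of the coupled-gate LSTM of equations (1)-(4).

    W is a dict of parameters; W['ci'] and W['co'] are diagonal
    (peephole) weights stored as vectors, hence the element-wise '*'.
    """
    i_t = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev
                  + W["ci"] * c_prev + W["bi"])                      # eq. (1)
    c_t = (1 - i_t) * c_prev + i_t * np.tanh(W["xc"] @ x_t
                  + W["hc"] @ h_prev + W["bc"])                      # eq. (2)
    o_t = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev
                  + W["co"] * c_t + W["bo"])                         # eq. (3)
    h_t = o_t * np.tanh(c_t)                                         # eq. (4)
    return h_t, c_t
```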
The bidirectional LSTM network that we used consists of two distinct LSTM networks. The first generates a representation of the left context of the current word t, while the second provides the representation of the right context by passing over the tweet in the reverse direction. The first is commonly called the forward LSTM and the second the backward LSTM. Each word in a bidirectional LSTM is represented by concatenating the outputs of the forward and backward LSTMs. The resulting vector provides an effective representation of each word, as it takes the word's context into consideration.

5.2.2 The Conditional Random Fields (CRF) Tagging Model

Sentiment target recognition can be modeled as a binary classification problem where each word/term in a tweet is classified as either a target or not. However, such a model does not consider the dependencies between the words/terms of a sentence. Since natural language enforces grammatical constraints to build interpretable sentences, tagging words independently is problematic for a classification model. For the sentiment target recognition task, we take the relationships between the words in a sentence into consideration in order to construct a solid interpretation of the state of every word. Towards this end, we used conditional random fields. The conditional random fields (CRF) model is a statistical method applicable to machine learning tasks that need structured prediction. A CRF model can tag each word in a sentence as a target or not, while taking into consideration the neighboring words that represent its context.

5.2.3 BI-LSTM-CRF

In this work, we augmented a bidirectional LSTM with a CRF layer, in addition to a word embeddings layer, to build and train a bidirectional LSTM-CRF sequence tagging neural network for the sentiment target recognition task, where a tag denotes either a target or a non-target word. Given a tweet X of n words (x1, x2, ..., xn), each word is represented as an m-dimensional vector created by the word embeddings layer. Specifically, in our model, the embedding vector of each word in a tweet is obtained by concatenating the word's word2vec representation with its character-level embeddings representation; these vectors form the weights of the first layer of our model. The bidirectional LSTM layer uses the first layer's output to generate the features for the CRF layer. The advantage of using a bidirectional LSTM is its ability to represent the features to the right and to the left of the word for which the model is trying to predict a tag (the right and left contexts). The CRF layer receives these contextual features extracted by the bidirectional LSTM and predicts a tag for the current word based on the tags of the previous and subsequent words in the sentence. Fig 4 gives an overview of the interaction between the embeddings layer, the bidirectional LSTM layer, and the CRF layer, with an example of Arabic tweet words and their predicted tags.
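A compact PyTorch sketch of how these layers fit together is given below. The dimensions are assumptions, a single embedding lookup stands in for the concatenated word2vec and character-level vectors, and the CRF decoding is sketched separately after equations (5) to (8):

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """BI-LSTM feature extractor producing per-tag scores (matrix L) for a CRF."""

    def __init__(self, vocab_size, emb_dim=150, hidden_dim=100, n_tags=3):
        super().__init__()
        # In the paper the embedding concatenates word2vec and
        # character-level vectors; a single lookup stands in for both here.
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim,
                              bidirectional=True, batch_first=True)
        self.to_tags = nn.Linear(2 * hidden_dim, n_tags)

    def forward(self, word_ids):
        # word_ids: (1, n) word indices of one tweet
        features, _ = self.bilstm(self.embed(word_ids))
        return self.to_tags(features)  # emission scores L, shape (1, n, n_tags)
```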
Fig 4: Overview of BI-LSTM-CRF model layers with an embeddings layer

For the proposed bidirectional LSTM-CRF sentiment target recognition model, we define the following notation:

T: the number of distinct tags.
C: a transition matrix in which entry C_{i,j} is the score of transitioning from tag i to tag j, calculated from the training data; C is a square matrix of size T.
L: a matrix of output scores of the bidirectional LSTM, of size n × T, in which each word of the tweet has a score for each distinct tag.

For any input tweet X of n words (x1, x2, ..., xn), the network is expected to predict a sequence of tags y = (y1, y2, ..., yn). The score function is defined as the sum of the transition scores from C and the emission scores from L:

$$s(X, y) = \sum_{i=1}^{n-1} C_{y_i, y_{i+1}} + \sum_{i=1}^{n} L_{i, y_i} \quad (5)$$

A softmax function is applied to produce the probability of the tag sequence y:

$$p(y \mid X) = \frac{e^{s(X, y)}}{\sum_{\tilde{y} \in Y} e^{s(X, \tilde{y})}} \quad (6)$$

where Y is the set of all possible tag sequences for a sentence X. During training, the objective of the sentiment target recognition model is to maximize the log-probability of the correct tag sequence:

$$\log p(y \mid X) = s(X, y) - \log \sum_{\tilde{y} \in Y} e^{s(X, \tilde{y})} \quad (7)$$

During prediction, the output sequence is chosen as the sequence that attains the maximum score among all possible sequences:

$$y^* = \underset{\tilde{y} \in Y}{\arg\max}\; s(X, \tilde{y}) \quad (8)$$
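To illustrate equations (5) and (8), here is a small NumPy sketch of the sequence score and of Viterbi decoding over the transition matrix C and emission matrix L, with shapes as defined above:

```python
import numpy as np

def sequence_score(L, C, y):
    """Equation (5): emission plus transition scores of tag sequence y."""
    emissions = sum(L[i, y[i]] for i in range(len(y)))
    transitions = sum(C[y[i], y[i + 1]] for i in range(len(y) - 1))
    return emissions + transitions

def viterbi_decode(L, C):
    """Equation (8): highest-scoring tag sequence by dynamic programming."""
    n, T = L.shape
    best_score = L[0].copy()            # best score of a path ending in each tag
    backpointer = np.zeros((n, T), dtype=int)
    for i in range(1, n):
        # candidate[j, k]: score of extending a path ending in tag j with tag k
        candidate = best_score[:, None] + C + L[i][None, :]
        backpointer[i] = candidate.argmax(axis=0)
        best_score = candidate.max(axis=0)
    tags = [int(best_score.argmax())]
    for i in range(n - 1, 0, -1):       # follow back-pointers to recover the path
        tags.append(int(backpointer[i, tags[-1]]))
    return tags[::-1]
```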
6 Performance Evaluation

The objective of our evaluation was to test our hypothesis that using a deep neural network to discover a set of features for a challenging task such as sentiment target recognition, in a complex language like Arabic, can outperform a baseline solution to the task. Towards this goal, training and testing sets were derived from the annotated Arabic tweets dataset described in the data collection and annotation section. From those tweets, 3000 were selected: two thousand were used for the training set, while the remaining 1000 were used for testing. Tweets in both the training and testing sets were cleaned by removing all special characters. They were also normalized as follows: the characters "إ", "أ", and "آ" were replaced with "ا", "ى" was replaced with "ي", and "ة" was replaced with "ه".

The sentiment target recognition tagging network, a bidirectional LSTM augmented with a CRF layer as described in the previous section, was built by adapting the implementation and network structure made available by the authors of [8] for tagging named entities. Given the relatively small size of our training set, the network was trained for 100 epochs in order to set the neuron weights. To analyze the effect of the word2vec vectors on the results, the bidirectional LSTM-CRF network was trained once with word vectors from the word2vec model concatenated with word vectors from the character embeddings model, and once with word vectors from the character embeddings model only. About 94% of the words in the training dataset had a vector representation in our set of word2vec vectors.

Both our model and the previously described baseline method aim at recognizing the sentiment targets in each Arabic tweet of the test set. The precision, recall, and F-score metrics were used to compare the models. An annotated sentiment target is considered to be detected by a model if any of its constituent words is detected.
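This lenient matching criterion can be sketched as follows, under assumed list-of-words inputs:

```python
def is_detected(gold_target, predicted_words):
    """A gold target counts as detected if any of its words is predicted.

    gold_target: the annotated target string (possibly multi-word).
    predicted_words: set of words the model tagged as part of a target.
    """
    return any(word in predicted_words for word in gold_target.split())
```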
Table 1 shows the scores of the baseline method, the bidirectional LSTM-CRF with character embeddings only, and the bidirectional LSTM-CRF with character embeddings and word2vec.

Table 1: Baseline and BI-LSTM model scores

Model                                                Precision   Recall   F-Score
Baseline Method                                      0.449       0.519    0.4815
BI-LSTM-CRF with character embeddings only           0.732       0.615    0.668
BI-LSTM-CRF with character embeddings and Word2Vec   0.737       0.714    0.726
The results in Table 1 show the clear superiority of the bidirectional LSTM-CRF network over the baseline method with respect to all metrics, especially when word embeddings are used. Handcrafting a set of features for sentiment target recognition is challenging, especially since most targets are simply a special case of named entity; through training, however, the bidirectional LSTM network was able to derive features that distinguish sentiment target entities from other entities. The results also show that the two bidirectional LSTM-CRF networks (with and without word2vec embeddings) achieved comparable precision, but the network with the word2vec model achieved much higher recall. This means that the bidirectional LSTM network with the word2vec model discovered more target entities in the tweets while maintaining the same extraction precision. The improvement achieved by the bidirectional LSTM-CRF network with the word2vec model emphasizes the power of word embeddings for improving the performance of NLP tasks.
7 Conclusion

This paper presented a model for detecting sentiment targets in Arabic tweets. The main component of the model is a bidirectional LSTM deep neural network with a conditional random fields (CRF) classification layer and a word embeddings layer. Results obtained by experimenting with the proposed model show that it is capable of addressing the sentiment target recognition problem despite the problem's inherent challenges. The experiments also revealed that the use of a word2vec model enhances the model's results and contributes to the overall effectiveness of the network in extracting sentiment targets. Comparing the proposed model against a baseline method showed a considerable advantage in terms of results.

Acknowledgments. We would like to thank Abu Bakr Soliman for building the user interface used for annotating the sentiment target data and for collecting the 54 million tweets that we used for building our word embeddings model. This work was partially supported by ITIDA grant number [PRP]2015.R19.9.
8 References

1. Zayed, O., El-Beltagy, S.R.: Named Entity Recognition of Persons' Names in Arabic Tweets. In: Proceedings of Recent Advances in Natural Language Processing (RANLP 2015), Hissar, Bulgaria (2015).
2. Zayed, O., El-Beltagy, S.R.: A Hybrid Approach for Extracting Arabic Persons' Names and Resolving their Ambiguity from Twitter. In: Métais, E. et al. (eds.) Proceedings of the 19th International Conference on Application of Natural Language to Information Systems (NLDB 2015), Lecture Notes in Computer Science (LNCS), Passau, Germany (2015).
3. El-Beltagy, S.R., Khalil, T., Halaby, A., Hammad, M.H.: Combining Lexical Features and a Supervised Learning Approach for Arabic Sentiment Analysis. In: CICLing 2016, Konya, Turkey (2016).
4. Huang, Z., Xu, W., Yu, K.: Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv preprint arXiv:1508.01991 (2015).
5. Lafferty, J., McCallum, A., Pereira, F.: Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In: Proceedings of the Eighteenth International Conference on Machine Learning (ICML), pp. 282–289 (2001).
6. Hochreiter, S., Schmidhuber, J.: Long Short-Term Memory. Neural Comput. 9, 1735–1780 (1997).
7. Schuster, M., Paliwal, K.K.: Bidirectional Recurrent Neural Networks. IEEE Trans. Signal Process. 45, 2673–2681 (1997).
8. Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., Dyer, C.: Neural Architectures for Named Entity Recognition. In: Proceedings of NAACL-HLT 2016 (2016).
9. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J.: Distributed Representations of Words and Phrases and Their Compositionality. In: Proceedings of the 26th International Conference on Neural Information Processing Systems, pp. 3111–3119. Curran Associates Inc., USA (2013).
10. Pontiki, M., Galanis, D., Papageorgiou, H., Androutsopoulos, I., Manandhar, S., AL-Smadi, M., Al-Ayyoub, M., Zhao, Y., Qin, B., De Clercq, O., Hoste, V., Apidianaki, M., Tannier, X., Loukachevitch, N., Kotelnikov, E., Bel, N., Jiménez-Zafra, S.M., Eryiğit, G.: SemEval-2016 Task 5: Aspect Based Sentiment Analysis. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 19–30. Association for Computational Linguistics (2016).
11. Khalil, T., El-Beltagy, S.R.: NileTMRG: Deep Convolutional Neural Networks for Aspect Category and Sentiment Extraction in SemEval-2016 Task 5. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), San Diego, California, USA (2016).
12. Mitchell, M., Aguilar, J., Wilson, T., Van Durme, B.: Open Domain Targeted Sentiment. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1643–1654 (2013).
13. Zhang, M., Zhang, Y., Vo, D.-T.: Gated Neural Networks for Targeted Sentiment Analysis. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, Arizona, USA. Association for the Advancement of Artificial Intelligence (2016).
14. Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P.: Natural Language Processing (Almost) from Scratch. J. Mach. Learn. Res. 12, 2493–2537 (2011).
15. El-Beltagy, S.R.: NileULex: A Phrase and Word Level Sentiment Lexicon for Egyptian and Modern Standard Arabic. In: Proceedings of LREC 2016, Portorož, Slovenia (2016).
16. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient Estimation of Word Representations in Vector Space. ICLR Workshop (2013).
17. Sak, H., Senior, A.W., Beaufays, F.: Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition. CoRR abs/1402.1128 (2014).
18. Mao, J., Xu, W., Yang, Y., Wang, J., Yuille, A.L.: Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN). CoRR abs/1412.6632 (2014).
19. Zhang, B., Xiong, D., Su, J.: Recurrent Neural Machine Translation. CoRR abs/1607.08725 (2016).