Text Content Reliability Estimation in Web Documents: A New Proposal

Luis Sanz, Héctor Allende, and Marcelo Mendoza
Department of Informatics, Universidad Técnica Federico Santa María, Chile
[email protected], {hallende,mmendoza}@inf.utfsm.cl

Abstract. This paper illustrates how a combination of information retrieval, machine learning, and NLP corpus annotation techniques was applied to the problem of text content reliability estimation in Web documents. Our proposal for text content reliability estimation is based on a model in which reliability is a similarity measure between the content of the documents and a knowledge corpus. The proposal includes a new representation of text which uses entailment-based graphs. We then use the graph-based representations as training instances for a machine learning algorithm, allowing us to build a reliability model. Experimental results illustrate the feasibility of our proposal through a comparison with a state-of-the-art method.

Keywords: Text reliability, content-based trust, textual entailment.

1 Introduction

Text content reliability can be defined as the degree to which the text content is perceived to be true [18]. Content reliability is a criterion that, after topic relevance, is one of the most influential aspects to be considered when assessing the relevance of a Web publication [14]. However, it is very difficult to measure because it is related to a qualitative property of the information. In this article we introduce an approach for content reliability estimation that can be applied to Web documents. The techniques applied in this article jointly provide an effective method to automatically obtain a text reliability measure that can be used to assess a variety of Web publications. These techniques include a text segmentation strategy based on the syntactic-grammatical structure of the text, a new text representation based on its sentences, and the estimation of a distance measure between its content fragments and a knowledge corpus. A key element of our approach is the use of the entailment structure of each document to build a reliability model. We use these entailment structures, which we call entailment-based graphs, to represent how reliable the content of a document is with respect to a knowledge corpus. Then, by considering gold standard reliability scores and by using each graph as a training instance, we build a training dataset which allows us to learn a reliability model. To illustrate the

feasibility of our proposal, we conduct an evaluation of our methods by using the assessments provided in the Automatically Evaluating Summaries of Peers task (AESOP Task) [4] as gold standard scores. We then apply learning strategies to build our reliability models. We conduct a comparison against ROUGE-SU4 [9], a state-of-the-art summarization method which exhibits similar properties with respect to the problem of reliability estimation. The remainder of this article is organized as follows. A discussion of related work can be found in the next section, where we discuss credibility, reputation, and the relation of reliability to summarization. Section 3 includes a formal approach to the problem, a general view of the proposal, and an illustrative example of our strategy. Section 4 presents experimental results obtained from a comparison between our approach and an alternative method. Finally, the implications and findings of this article are discussed in Section 5.

2 Related Work

2.1 Credibility, reputation, and summarization

Some approaches that emerged from Information Retrieval (IR) and Natural Language Processing (NLP) have dealt with the problem of assessing text reliability. In the IR field, the most common approach is called credibility. Most credibility studies have focused on the analysis of user behavior and the way in which users evaluate the veracity of publications [10,11,16]. Credibility analysis has paid little attention to the analysis of text content. There are also specific attempts in the credibility analysis field aimed at the reliability assessment of blog content [13,7,17,1]. In summary, credibility is applied indiscriminately to multiple concepts besides content reliability measurement, so the latter can be seen as a subarea of the former. Other remarkable attempts in the IR field related to reliability measurement focus on the Information Quality and Cognitive Authority framework [2,14]. The most common strategies for reliability measurement in IR rely on the analysis of reputation, from votes of users or authors and, occasionally, on content comparison. A major drawback of this approach is that many publications are written by anonymous or unknown authors and, moreover, the publications of a given author have variable reliability levels. Note that a good reputation does not necessarily imply a high level of reliability. In the NLP field, the issue of reliability measurement has been handled in very specific cases, with most of the efforts related to summarization [8,3,12,9,6]. We need to explain how summarization is related to text reliability, focusing in particular on its relation with the AESOP Task. We address this issue in the following section.

2.2 AESOP Task, legibility, responsiveness, and pyramid score

For summarization, the Automatically Evaluating Summaries of Peers Task (AESOP Task [4]) has concentrated efforts from more than 30 universities from different countries to evaluate concepts related to text content quality, becoming a major international endeavor dedicated to this topic. The challenge proposed in the AESOP Task and its benchmark has the following characteristics. A set of summaries was produced for 44 topics. Every topic was formed by 20 documents, where every text corresponds to an article published by an international news agency. The articles were extracted from the AQUAINT-2 collection, an LDC English Gigaword subset which collects approximately 2.5 GB of text, with around 907,000 documents covering the period between October 2004 and March 2006. The articles were written in English and were obtained from a variety of agencies, including Agence France-Presse, the Central News Agency of Taiwan, the Xinhua News Agency, and the New York Times, among others. The documents were selected by experts of the National Institute of Standards and Technology (NIST). The selection was based on the name and a brief description of the topic. For each topic 118 summaries were made, of which 8 were written manually and 110 were generated automatically by using different automatic summarization techniques.

Then 3 measures related to the content quality of each summary were assessed: legibility, responsiveness, and pyramid score. To assess legibility, expert evaluators assigned to each summary a numeric value from 1 to 5, related to how fluent and readable the summary is, without taking its content into account. To measure responsiveness, the evaluators assigned to each summary a numeric value from 1 to 5 based on their perception of how well the summary addresses the topic. The pyramid score [12] was assessed for each summary based on the level of concordance between the summary content and the descriptive text of each topic. A summary set was defined for each topic, composed of the set of Summary Content Units (SCUs) which describe the topic. These content fragments were manually identified by a group of experts. A weight was assigned to every SCU, depending on the number of model summaries it matches. Thus, the pyramid score for a summary was calculated as the total weight divided by the maximum weight obtainable by a summary of average extent (where the average extent was determined by the SCU count in the model summaries corresponding to the assessed topic); a minimal sketch of this computation is given at the end of this subsection.

Our approach for reliability estimation takes advantage of the existence of a benchmark dataset for summarization. We will use these scores for the construction of reliability training datasets, allowing us to learn reliability models. We will compare our results with a state-of-the-art summarization method, ROUGE-SU4 [9], to illustrate the feasibility of our approach. As we will explain in Section 3, the reliability of a document with respect to a knowledge corpus can be modeled as a summarization process: a document is a good match for a corpus if the corpus is likely to generate the document.
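The following sketch illustrates the pyramid score computation described above; the data layout and the function name are hypothetical and are not part of the AESOP benchmark tooling.

    def pyramid_score(summary_scus, scu_weights, avg_scu_count):
        """Illustrative pyramid score: total weight of the SCUs matched by the
        summary, normalized by the maximum weight achievable by a summary that
        contains avg_scu_count SCUs (the average extent of the model summaries)."""
        total_weight = sum(scu_weights[scu] for scu in summary_scus if scu in scu_weights)
        # The maximum weight is obtained by taking the avg_scu_count heaviest SCUs.
        max_weight = sum(sorted(scu_weights.values(), reverse=True)[:avg_scu_count])
        return total_weight / max_weight if max_weight > 0 else 0.0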

3 A New Proposal for Content Reliability Measuring

3.1 Reliability definition

Now we introduce a formal definition of reliability, which allows us to discuss how it can be estimated. Let H be a set of possible hypotheses. A hypothesis h ∈ H is a statement to which a truth value can be assigned. It is assumed that h is expressed as a text. A set of truth assignments w for every possible proposition can be considered; w represents a mapping from H to {0 = false, 1 = true}, i.e. w : H → {0, 1}. Let T be a space of possible texts and let t ∈ T be a specific text. A set of hypotheses can be extracted from t and, with respect to w, we can build a set of pairs Ht = {(ht1, wt1), ..., (hti, wti), ..., (htn, wtn)}, where each wti represents the truth value for hti. Notice that in our approach we assume the principle of the excluded middle (principium tertii exclusi). Now we can assume that a body of knowledge is consolidated in a specific knowledge corpus C, whose hypotheses can be perceived as reliable. Thus, we can build a set of pairs CC = {(hCC1, wCC1), ..., (hCCi, wCCi), ..., (hCCz, wCCz)}, where each wCCi represents the truth value for a hypothesis hCCi extracted from the corpus. The content reliability of a text t with respect to CC can be represented by the joint distribution p(Ht, CC | H1), where H1 is the hypothesis which states that Ht and CC were generated by the same truth assignment function w. The reliability of t with respect to C can be estimated by the amount of information that t represents with reference to C. This amount of information can be measured in terms of coding length cl(t), that is, the negative logarithm of the probability of t. Reliability, then, is defined as the gain (in terms of compression) in coding length obtained when codifying t with C known:

reliability(t, c) = log p(t, c | H1) − log p(t).    (1)

This measure can also be seen as a log-likelihood statistic:

reliability(t, c) = log [ p(Ht, CC | H1) / p(Ht, CC | H0) ],    (2)

where H0 denotes the independence hypothesis.
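As a minimal operational restatement of Equations 1 and 2 (the probability estimates themselves are left unspecified by the model and are assumed here to be supplied by some external estimator):

    import math

    def reliability_coding_gain(p_tc_h1, p_t):
        """Equation 1: gain in coding length for t when the corpus C is known,
        log p(t, c | H1) - log p(t), with both probabilities supplied externally."""
        return math.log(p_tc_h1) - math.log(p_t)

    def reliability_log_likelihood(p_h1, p_h0):
        """Equation 2: log-likelihood ratio between H1 (Ht and CC share the same
        truth assignment) and the independence hypothesis H0."""
        return math.log(p_h1 / p_h0)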

3.2 A document-corpus reliability representation

The proposed method for content reliability measurement has the following characteristics:

– The method uses a knowledge corpus as a point of reference, built manually from texts that exhibit a high reliability level.
– The reliability of a text is seen as a measure of similarity between the text and a knowledge corpus.
– Our method uses a representation in which the text to be assessed and the knowledge corpus are decomposed into content fragments, based on their syntactic-grammatical structure. We use a content fragment decomposition strategy based on a discourse commitment extraction algorithm [5].
– Our text representation is based on entailment-based graphs, which represent textual entailment (TE) relationships among the content fragments that compose each document and the corpus.
– We use the entailment-based graphs and a set of human expert scores for the construction of a training dataset. Then, we apply support vector regression to build reliability models.
– Our reliability model considers the entailment structure of each document instead of a standard text-based representation.

A key component of our approach is the discourse commitment extraction algorithm, which allows us to represent each document by its content fragments. This method considers the following steps:

1. Text enrichment: The text is processed by conducting part-of-speech tagging, named entity recognition, pronominal and nominal coreference identification, lexical dependency parsing, and probabilistic context-free grammar extraction.
2. Decomposition: The text is decomposed into content fragments by detecting sentence connectors. These connectors are inferred from the representations obtained in the text enrichment step by applying heuristics.

For further details of the content fragment decomposition strategy please see [5]. Notice that we can measure a textual entailment distance among content fragments. To do this, we propose to use the edit distance function for textual entailment defined by Negri et al. [15] and implemented in EDITS (Edit Distance Textual Entailment Suite). Then, for each document - knowledge corpus pair, we build a bipartite entailment-based graph, with the nodes on one side corresponding to content fragments extracted from the knowledge corpus and those on the other side to content fragments extracted from the document to be assessed. The arcs among them represent textual entailment relationships weighted by using the edit distance entailment function. In Figure 1 we illustrate this process, and a minimal code sketch of the graph construction is given after the figure. We use each entailment-based graph as a representation of the reliability that each document exhibits with respect to the corpus. According to Equation 1, the reliability(t, c) function is modeled as the gain in coding length obtained for t when C is known. Now we propose to use these graphs for reliability estimation. To do this we use the graphs as training instances of a machine learning algorithm in order to build a reliability model. Then, to assess the reliability of an unseen document, we construct its entailment-based graph and, by using the reliability model, estimate its reliability.

Fig. 1. Bipartite entailment-based graph for document reliability assessment.
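The sketch below illustrates, under our own simplifying assumptions, how such a bipartite graph could be assembled. The entailment_distance callable stands in for the EDITS edit distance function [15], whose actual API is not reproduced here, and the content fragments are assumed to have been extracted already.

    def build_entailment_graph(corpus_fragments, doc_fragments, entailment_distance):
        """Weighted bipartite graph: one node per corpus fragment CC_j and one per
        document fragment t_i; the arc (i, j) is weighted by the textual entailment
        edit distance between the two fragments."""
        graph = {}
        for i, t_i in enumerate(doc_fragments, start=1):
            for j, cc_j in enumerate(corpus_fragments, start=1):
                graph[(i, j)] = entailment_distance(t_i, cc_j)
        return graph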

3.3 Learning to estimate reliability

In this article we explore how we can use our entailment-based graphs as training instances of a reliability model. To follow a supervised approach we need to label each training instance according to its perceived reliability. Many approaches can be explored to conduct this task, such as crowdsourcing or expert labeling. In this article we consider the latter course of action. We assume that for each document - corpus pair, a reliability score is provided by, for example, expert labeling. A key element of our approach is how to use our entailment-based graphs as inputs of a support vector regression model. We propose to transform each graph into a vector representation as follows. For each content fragment of the corpus CCj, we calculate a document - corpus weight ηCCj given by:

ηCCj = Σ_{i=1}^{x} wij,    (3)

where each wij is the edit distance for textual entailment between CCj and a content fragment ti extracted from the document, and x is the number of content fragments in the document. Then, a vector representation for the document is built over the corpus space, where each dimension represents a corpus fragment CCj, and the j-th component of the vector corresponds to ηCCj. In this way, each document - corpus pair can be represented by a vector constructed from its entailment-based graph. Each training instance is then composed of the document - corpus vector (the feature vector) and its score (the label). Finally, using this dataset it is possible to conduct a machine learning process for reliability model estimation. A sketch of this transformation and of the regression step follows.
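In the minimal sketch below, the graph is assumed to be a dictionary keyed by (document fragment, corpus fragment) index pairs, as produced by the construction step above, and scikit-learn's SVR is used only as an illustrative stand-in for the regression tool.

    import numpy as np
    from sklearn.svm import SVR

    def graph_to_vector(graph, n_corpus_fragments):
        """Equation 3: the j-th component is the sum, over all document fragments,
        of the entailment edit distances w_ij attached to corpus fragment CC_j."""
        vec = np.zeros(n_corpus_fragments)
        for (i, j), w_ij in graph.items():
            vec[j - 1] += w_ij          # graph keys are 1-indexed (t_i, CC_j) pairs
        return vec

    def train_reliability_model(graphs, scores, n_corpus_fragments):
        """Point-wise learning: each (vector, gold score) pair is one training instance."""
        X = np.vstack([graph_to_vector(g, n_corpus_fragments) for g in graphs])
        y = np.asarray(scores)
        model = SVR()                   # stand-in regressor; kernel and parameters untuned
        model.fit(X, y)
        return model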

3.4 An illustrative example

Now we illustrate our proposal. Let CC be a knowledge corpus composed of the following texts:

– txt1: All men are mortals and they fear death, so they study medicine to heal their body and not die
– txt2: Medicine has made great strides over the past 100 years. Its advances have allowed the extension of life

Let t be a text to be analyzed: Men no longer fear death and are no longer dedicated to medical school.

Text enrichment. The text txt1 is processed according to the following steps:

1) Part-of-speech tagging (POS tagging): A POS tagging process is conducted over the text. The tags for txt1 are shown in Table 1.

Table 1. POS tags for txt1.

Id  Word      Lemma     POS   |  Id  Word      Lemma     POS
1   All       all       DT    |  12  study     study     VBP
2   men       man       NNS   |  13  medicine  medicine  NN
3   are       be        VBP   |  14  to        to        TO
4   mortals   mortal    NNS   |  15  heal      heal      VB
5   and       and       CC    |  16  their     they      PRP$
6   they      they      PRP   |  17  body      body      NN
7   fear      fear      VBP   |  18  and       and       CC
8   death     death     NN    |  19  not       not       RB
9   ,         ,         ,     |  20  die       die       VB
10  so        so        IN    |  21  .         .         .
11  they      they      PRP   |

2) Named Entity Recognition (NER): A NER process is conducted over each text. In the case of txt1 no entity was detected.

3) Detection of nominal and pronominal coreferences: Nominal and pronominal terms are identified. In the case of txt1 we detect the following coreference:
– coreferent: mortals.
– coreferred: they.

4) Syntactic dependency analysis: A syntactic analysis is conducted over each text. The dependency analysis corresponding to txt1 is presented in Table 2.

5) Dependency parsing of probabilistic context-free grammar (PCFG tree):

(ROOT (S (S (S (NP (DT All) (NNS men)) (VP (VBP are) (NP (NNS mortals)))) (CC and) (S (NP (PRP they)) (VP (VBP fear) (NP (NN death))))) (, ,) (IN so) (S (NP (PRP they)) (VP (VBP study) (NP (NN medicine)) (S (VP (TO to) (VP (VP (VB heal) (NP (PRP their) (NN body))) (CC and) (RB not) (VP (VB die))))))) (. .)))
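A rough sketch of the enrichment step is shown below. The paper does not name a specific NLP toolkit, so spaCy is used here purely for illustration of the POS tagging, lemmatization, NER, and dependency parsing; coreference resolution and PCFG parsing are not covered by this sketch.

    import spacy

    nlp = spacy.load("en_core_web_sm")   # illustrative model choice, not from the paper

    def enrich(text):
        """Per-token POS/lemma information, named entities, and dependency
        relations, roughly mirroring Tables 1 and 2."""
        doc = nlp(text)
        pos_table = [(tok.i + 1, tok.text, tok.lemma_, tok.tag_) for tok in doc]
        entities = [(ent.text, ent.label_) for ent in doc.ents]
        dependencies = [(tok.dep_, tok.head.text, tok.text) for tok in doc]
        return pos_table, entities, dependencies

    pos, ents, deps = enrich("All men are mortals and they fear death, "
                             "so they study medicine to heal their body and not die")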

The same text enrichment pre-processing is applied to txt2 and t.

Decomposition. To decompose each text we detect sentence connectors. Punctuation, coreferences, POS tags, and the PCFG structure are considered to conduct this process, according to the heuristics proposed by Hickl & Bensley [5].

Table 2. Syntactic dependency analysis of txt1.

1. det(men-2, All-1)
2. nsubj(mortals-4, men-2)
3. cop(mortals-4, are-3)
4. nsubj(fear-7, they-6)
5. conj_and(mortals-4, fear-7)
6. dobj(fear-7, death-8)
7. dep(mortals-4, so-10)
8. nsubj(study-12, they-11)
9. ccomp(mortals-4, study-12)
10. dobj(study-12, medicine-13)
11. aux(heal-15, to-14)
12. xcomp(study-12, heal-15)
13. poss(body-17, their-16)
14. dobj(heal-15, body-17)
15. xcomp(study-12, not-19)
16. conj_and(heal-15, not-19)
17. dep(heal-15, die-20)

These heuristics decompose txt1, txt2, and t into the fragments shown in Table 3.

Table 3. The content fragment decomposition process. Corpus segments are denoted by CCj and text fragments by ti.

id   Proposition
CC1  All men are mortals
CC2  mortals fear death
CC3  mortals study medicine
CC4  mortals study medicine to heal their body
CC5  mortals study medicine to not die
CC6  Medicine has made great strides
CC7  Medicine has made great strides over the past 100 years
CC8  Medicine advances have allowed the extension of life
t1   Men no longer fear death
t2   Men are no longer dedicated to medical school

Then we build a bipartite entailment-based graph with the nodes on one side corresponding to content fragments extracted from the corpus (CCj) and those on the other side to content fragments extracted from the document to be assessed (t1 and t2). The arcs among them are weighted by using the edit distance entailment function. Table 4 shows the distance values obtained, where Aij denotes the arc between document fragment ti and corpus fragment CCj.

Table 4. The edit distance values used for weighting the entailment-based graph.

Arc  dij    Arc  dij    Arc  dij    Arc  dij
A11  0.089  A15  0.165  A21  0.056  A25  0.137
A12  0.532  A16  0.238  A22  0.203  A26  0.218
A13  0.199  A17  0.232  A23  0.147  A27  0.243
A14  0.162  A18  0.171  A24  0.157  A28  0.160

Finally, the vector representation obtained for t by applying Equation 3 is the following:

       CC1    CC2    CC3    CC4    CC5    CC6    CC7    CC8
t  →   0.145  0.735  0.346  0.319  0.302  0.456  0.475  0.331
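As a quick check, this vector can be reproduced directly from the edit distances of Table 4 by applying Equation 3; the snippet below uses those values verbatim.

    # Edit distances from Table 4, keyed by (document fragment i, corpus fragment j).
    distances = {
        (1, 1): 0.089, (1, 2): 0.532, (1, 3): 0.199, (1, 4): 0.162,
        (1, 5): 0.165, (1, 6): 0.238, (1, 7): 0.232, (1, 8): 0.171,
        (2, 1): 0.056, (2, 2): 0.203, (2, 3): 0.147, (2, 4): 0.157,
        (2, 5): 0.137, (2, 6): 0.218, (2, 7): 0.243, (2, 8): 0.160,
    }

    # Equation 3: sum over the document fragments t_1, t_2 for each corpus fragment CC_j.
    vector = [round(sum(distances[(i, j)] for i in (1, 2)), 3) for j in range(1, 9)]
    print(vector)  # [0.145, 0.735, 0.346, 0.319, 0.302, 0.456, 0.475, 0.331]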

4 Experimental Results

4.1 Data preparation

We built a dataset for reliability model estimation by considering the data instances of the AESOP Task. In particular, we considered each topic as a particular knowledge field, and the summaries produced by the group of NIST experts as the corpus for each topic. Thus, for each topic (we considered 44 topics) we have 8 expert summaries which together constitute its corpus. The set of automatically generated summaries was divided into two parts, one to be used as a training set and the other to be used for testing. For each topic, the AESOP Task provides 110 summaries. We used 55 for training, reserving the other 55 for testing. This division was conducted randomly. We evaluated our approach with the pyramid and the responsiveness scores; we discarded the legibility score in this article. We considered two learning strategies: a point-wise strategy, in which each training instance is a vector - label pair, and a list-wise strategy, in which each training instance is a sorted list of vector - label pairs. For the list-wise learning approach, each list was randomly generated. For each topic and for each score, 100 lists were randomly generated, with 10 vector - label pairs in each one; we thus obtained a total of 8,800 lists (100 for pyramid and 100 for responsiveness for each of the 44 topics). For each summary in the testing set we built its entailment-based graph, measuring the outcome of the reliability model obtained by using point-wise learning. For the evaluation of the list-wise approach, we generated 8,800 testing lists, following the same process considered in the training phase. We explored the use of Support Vector Regression (SVR) for model estimation. We used the Kernel Methods Matlab Toolbox implementation of this algorithm. A minimal sketch of the list generation step is given below.
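This sketch assumes that the (feature vector, gold score) pairs of a topic are already available; the function and parameter names are ours.

    import random

    def make_listwise_instances(pairs, n_lists=100, list_size=10, seed=0):
        """pairs: (feature vector, gold score) tuples for one topic.
        Returns n_lists randomly sampled lists of list_size pairs each."""
        rng = random.Random(seed)
        return [rng.sample(pairs, list_size) for _ in range(n_lists)]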

4.2 Results

We used a loss function as the performance measure, calculated as follows. Let x be a set of testing documents; y(), the gold standard ranking function; and g(), our reliability function. The loss function that assesses g() is given by f_loss = − log(P(y | x, g)), where

P(y | x, g) = Π_{i=1}^{n} [ exp(g(x_{y(i)})) / Σ_{k=i}^{n} exp(g(x_{y(k)})) ]

and y(i) is the index of the document at position i. Finally, a global loss value, given by the sum of the per-list loss values, was calculated. A minimal sketch of this computation is given below.
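The sketch assumes that the denominator of the likelihood above is the usual sum of exponentiated scores over the remaining documents in the list (our reading of the extracted formula, which is ambiguous on this point).

    import math

    def listwise_loss(scores_in_gold_order):
        """-log P(y | x, g) for one list, where scores_in_gold_order[i] = g(x_{y(i)}),
        i.e. the model scores arranged according to the gold-standard ranking."""
        n = len(scores_in_gold_order)
        loss = 0.0
        for i in range(n):
            denom = sum(math.exp(s) for s in scores_in_gold_order[i:])
            loss -= math.log(math.exp(scores_in_gold_order[i]) / denom)
        return loss

    # The global loss reported in Table 5 is the sum of the per-list loss values.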

Table 5 shows the results obtained for the approach proposed in this article, which we call RTE Graph, as well as for the state-of-the-art summarization method ROUGE-SU4. These results were obtained by using SVR, where a tuning process was conducted for parameter optimization.

Table 5. Global loss values for the RTE Graph and ROUGE-SU4 approaches.

Approach                 Loss pyramid score   Loss responsiveness score
RTE Graph (list-wise)    9,008                13,223
RTE Graph (point-wise)   9,307                15,334
ROUGE-SU4                12,913               16,572

Table 5 shows that our method achieves a better global performance than ROUGE-SU4, using both the list-wise and the point-wise strategy. In particular, the list-wise approach outperforms the point-wise approach, with this difference being more significant when we consider the responsiveness model. We also evaluated the specific contribution of each element considered in our entailment-graph representation. We started this analysis by evaluating the impact of the edit distance textual entailment function. To do this we calculated the mean distance value for each document in the dataset and compared it against the document's pyramid score. Table 6 shows these results.

Table 6. Comparing pyramid scores and edit distance RTE mean scores.

                       mean edit distance (RTE graph)
5% highest scores      0.1506
5% lowest scores       0.1892

As Table 6 shows, the summaries with the 5% highest pyramid scores have a smaller mean distance than the summaries with the 5% lowest pyramid scores. This indicates that the use of the edit distance textual entailment function allowed us to build an entailment representation which correlates well with the gold standard. Regarding the decomposition method considered for content fragment detection, significant improvements were obtained by using coreference identification. This can be observed in Table 7. Table 7 shows that the use of coreferences is a very effective strategy for the decomposition phase. Its impact on the performance of the method illustrates that the decomposition step in particular is critical for our approach.

Table 7. Loss values with and without the use of coreferences in the decomposition process.

                              loss pyramid score   loss responsiveness
with use of coreferences      9,008.56             13,223.39
without use of coreferences   9,846.34             14,324.06

5 Concluding Remarks

In this article we introduced a new document representation which allows us to decide whether a corpus is likely to generate a given document. By using textual entailment relationships among content fragments extracted from a corpus and the documents to be assessed, we built entailment-based graphs. We use these graphs and a set of human expert scores to train a machine learning model. By comparing the outcomes of the model over a set of testing documents, we conclude that our approach is feasible, outperforming a state-of-the-art summarization method in terms of information loss. We introduced a text representation which models whether a corpus is likely to generate the text. To obtain this representation we used several NLP techniques, and their computational costs are very significant compared to standard term-based representations. This fact limits the use of our approach for on-line document ranking. However, our approach can be considered a semantic indexing strategy, with these computational costs incorporated into off-line indexing processes. Our approach to text reliability is close to summarization, but they are different concepts. Notice that a good summary is a reliable text, but the opposite is not necessarily true. We take advantage of the first fact (a good summary is a reliable text) to explore the feasibility of our entailment representation. In this article our main interest was the construction of a reliability estimation model. The merit of this article is to illustrate that this approach is feasible. However, there are many open issues for the near future. We are exploring the use of language models for text reliability estimation, trying to address the dependence of our approach on the existence of gold standard scores. We are also currently exploring how to use our graphs to extract reliability measures without using supervised learning. Finally, another important issue is the construction of benchmarks for the evaluation of these strategies.

Acknowledgments. Mr. Sanz was supported by a Mecesup postgraduate fellowship grant Nr. FSM0707, Mr. Allende was supported by projects Fondecyt 1110854 and Basal FB0821 CCTVal FB/13HA10, and Mr. Mendoza was supported by projects UTFSM DGIP 24.11.19 and Fondef DO9I1185.

References

1. Al-Eidan, R.M.B., Al-Khalifa, H.S., Al-Salman, A.S.: Towards the measurement of Arabic weblogs credibility automatically. In: Proceedings of the 11th iiWAS Conference, pp. 618–622. ACM, New York, NY, USA (2009)
2. Cusinato, A., Della Mea, V., Di Salvatore, F., Mizzaro, S.: QuWi: quality control in Wikipedia. In: Proceedings of the 3rd Workshop on Information Credibility on the Web (WICOW '09), pp. 27–34. ACM, New York, NY, USA (2009)
3. Dang, H.T.: Overview of DUC 2005. In: Proceedings of the 2005 Document Understanding Workshop. Vancouver, B.C., Canada (2005)
4. Dang, H.T., Owczarzak, K.: Overview of the TAC 2008 update summarization task. In: Proceedings of the First Text Analysis Conference (TAC 2008), pp. 1–16. Gaithersburg, Maryland, USA (2008)
5. Hickl, A., Bensley, J.: A discourse commitment-based framework for recognizing textual entailment. In: Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pp. 171–176. ACL, Prague, Czech Republic (2007)
6. Hovy, E., Lin, C.Y., Zhou, L.: Evaluating DUC 2005 using Basic Elements. In: Proceedings of DUC 2005, pp. 1–6. Vancouver, B.C., Canada (2005)
7. Juffinger, A., Granitzer, M., Lex, E.: Blog credibility ranking by exploiting verified content. In: Proceedings of the 3rd Workshop on Information Credibility on the Web (WICOW '09), pp. 51–58. ACM, New York, NY, USA (2009)
8. Kolluru, B., Gotoh, Y.: On the subjectivity of human authored short summaries. In: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation, pp. 12–18. ACL, Michigan, USA (2005)
9. Lin, C.Y.: ROUGE: A package for automatic evaluation of summaries. In: Proceedings of the ACL Workshop on Text Summarization Branches Out, pp. 9–17. Barcelona, Spain (2004)
10. Metzger, M.J.: Making sense of credibility on the web: Models for evaluating online information and recommendations for future research. Journal of the American Society for Information Science and Technology 58(13), 2078–2091 (2007)
11. Metzger, M.J., Flanagin, A.J., Medders, R.B.: Social and heuristic approaches to credibility evaluation online. Journal of Communication 60(3), 413–439 (2010)
12. Nenkova, A., Passonneau, R.: Evaluating content selection in summarization: The pyramid method. In: Proceedings of the HLT-NAACL Conference. Association for Computational Linguistics, New York, NY, USA (2004)
13. Nichols, E., Murakami, K., Inui, K., Matsumoto, Y.: Constructing a scientific blog corpus for information credibility analysis. In: Proceedings of the PACLING Conference, pp. 1–6. Sapporo, Japan (2009)
14. Rieh, S.Y.: Judgment of information quality and cognitive authority in the Web. Journal of the American Society for Information Science and Technology 53(2), 145–161 (2002)
15. Negri, M., Kouylekov, M., Magnini, B., Mehdad, Y., Cabrio, E.: Towards extensible textual entailment engines: The EDITS package. In: AI*IA 2009: Emergent Perspectives in Artificial Intelligence. Reggio Emilia, Italy (2009)
16. Vega, L.C., Sun, Y.T., McCrickard, D.S., Harrison, S.: Time: a method of detecting the dynamic variances of trust. In: Proceedings of the 4th Workshop on Information Credibility (WICOW '10), pp. 43–50. ACM, New York, NY, USA (2010)
17. Weerkamp, W., de Rijke, M.: Credibility improves topical blog post retrieval. In: Proceedings of ACL-08: HLT, pp. 923–931. ACL, Columbus, USA (2008)
18. Xu, Y.C., Chen, Z.: Relevance judgment: What do information users consider beyond topicality? Journal of the American Society for Information Science and Technology 57(7), 961–973 (2006)
