
2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

An Evolutionary-Based Method for Reconstructing Conversation Threads in Email Corpora

Mostafa Dehghani

Masoud Asadpour

Azadeh Shakery

Intelligent Information Systems lab School of ECE, College of Engineering, University of Tehran, Tehran, Iran [email protected]

Social Networks lab School of ECE, College of Engineering, University of Tehran, Tehran, Iran [email protected]

Intelligent Information Systems lab School of ECE, College of Engineering, University of Tehran, Tehran, Iran [email protected]

978-0-7695-4799-2/12 $26.00 © 2012 IEEE. DOI 10.1109/ASONAM.2012.195

Abstract: Email is a type of Web data that is produced in enormous quantities. Detecting the conversation threads contained in email corpora is beneficial for various applications, including discussion search, expert finding, and even email clustering and classification. A conversation thread in an email corpus can be defined as a cluster of emails exchanged among the same group of people, by reply or forwarding, on the same topic. Under this definition, we can define a parent-child relation between emails, so email conversation threads have a tree structure. This paper presents a new approach, based on genetic programming, for reconstructing conversation threads in email data. The approach treats the detection of email conversation threads as an optimization problem and exploits genetic programming to search intelligently in the space of possible solutions. Unlike several earlier studies of this problem, this work concentrates on detecting the accurate structure of conversation threads with high recall. This paper provides a comprehensive evaluation on the BC3 data set. Preliminary results suggest that our method provides acceptable precision and higher recall than existing methods.

Keywords: email; conversation; emails thread; genetic programming

I. INTRODUCTION

Email has become one of the most popular tools for handling conversations among people. Many messages are sent and received every day, and a great deal of information is exchanged by email. Emails are not completely independent of each other, because an email may be written as a response to another email. Detecting these dependencies can therefore improve the quality of email data analysis. A conversation thread in an email corpus can be defined as a cluster of emails exchanged among the same group of people, by reply or forwarding, on the same topic. Based on this definition, we can consider a parent-child relation between emails, in which the child of an email is an email composed as a reply to, or a forwarding of, the parent email. In this way, conversations have a tree structure. Detecting conversation threads in email data is beneficial for many applications, including but not limited to improving the user experience in email management and increasing the quality of discussion search in public emails.

In this paper, we suggest a novel approach for reconstructing email conversation threads with an accurate structure. We map the problem of reconstructing conversation threads to an optimization problem. We represent a conversation with a single rooted tree, generated from the parent-child relations among the emails of the conversation. The target of the optimization problem is to find a jungle of the best rooted trees in the space of all possible rooted trees. The first solution that comes to mind is exhaustive search, but this is infeasible because of the very high cost of searching the whole space of possible solutions. It has been proven that the number of different jungles of rooted trees with $n$ labeled nodes is:

$$J(n) = \sum_{k=1}^{n} \binom{n-1}{k-1}\, n^{n-k}, \tag{1}$$

which is greater than the Catalan number [1]. For example, there are 16 different jungles of rooted trees with just 3 nodes. Clearly, the space of all possible solutions is very large. To overcome this difficulty, some prior work [2][3] completely ignores the tree structure of conversations and finds conversations as clusters of emails, without considering any tree structure inside them. The emails that belong to the same conversation are then arranged in chronological order, so the structure of each conversation becomes linear. In effect, these methods consider a search space that is much smaller than the real one. Other work [2][4] makes assumptions that restrict the original search space, for example that all emails of a conversation have the same subject line. With this greedy assumption, the search space is reduced and only a small part of the solution space is explored to find the structure of a conversation. These methods therefore lose part of the solution and are trapped in local optima, since the assumption that all the emails in a conversation thread have the same subject is not always true. Users may change the subject during a conversation, and two separate conversations may have the same subject, especially a general subject such as “Call for Participation”.

In this work, we suggest an intelligent evolutionary search over the whole solution space to find the global optimum instead of an approximate solution. We exploit a genetic programming approach to search the space of all possible solutions. We evaluate our proposed method on a real dataset and compare the results with some existing approaches in terms of numerical measures. The results show that our method is highly effective in reconstructing email conversation threads.

The rest of the paper is organized as follows. In Section II, we begin with a brief overview of related work. In Section III, we describe our approach and formalize the problem of reconstructing email conversation threads with genetic programming. Section IV gives our experimental results and evaluation analysis. Finally, we conclude and make recommendations for further study in Section V.
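As a quick check on the size of the search space discussed in the introduction, the count $J(n)$ of jungles of rooted trees on $n$ labeled nodes can be evaluated directly. The sketch below is illustrative only; the function name `count_rooted_forests` is ours, not the paper's. By the binomial theorem the sum collapses to $(n+1)^{n-1}$, which makes the growth rate easy to see.

```python
from math import comb


def count_rooted_forests(n):
    """Number of jungles (forests) of rooted trees on n labeled nodes.

    Evaluates J(n) = sum_{k=1..n} C(n-1, k-1) * n^(n-k) term by term.
    """
    return sum(comb(n - 1, k - 1) * n ** (n - k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        # The closed form (n + 1)^(n - 1) follows from the binomial theorem.
        print(n, count_rooted_forests(n), (n + 1) ** (n - 1))
```

For $n = 3$ this gives 16, matching the example in the text, and the value grows faster than exponentially in $n$, which is why exhaustive search is ruled out.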

II. RELATED WORK

Extensive research has been done over the past years on using email structure and features to detect conversation threads. Existing methods can be categorized into two groups: (1) metadata-based and (2) content-based. Metadata-based approaches use email header fields such as IN-REPLY-TO and REFERENCES. However, since these header fields are optional for email clients, they are not always available, and such data alone are not sufficient to reconstruct all email conversations accurately. One of the most popular algorithms for email threading is the Zawinski algorithm [5], which is based on metadata.

There are also several approaches that use the content of emails. Lewis and Knowles [6] regarded email threading as a retrieval problem. They studied five retrieval strategies to indicate whether one message is a response to another. Their results showed that the most effective strategy is to use the quotation of one email as a query and the unquoted part of other emails as documents. They focused only on the body text of emails and did not consider other features. Shen et al. [7] proposed a method based on a single-pass clustering algorithm for detecting conversation threads in chat data, and used some linguistic features to improve the quality of the detected conversations. They did not consider the tree structure of conversations; they focused only on detecting clusters of messages that belong to the same conversation and arranged them in chronological order, so the detected conversations have a linear structure. Wu and Oard [2] grouped emails into the same conversation thread if they have the same subject line and at least one sender/recipient in common.

Another related work is the method of Aaron and Jen-Yuan [8], who investigated how to identify email conversations with regard to missing message recovery. They assumed that all emails in the same conversation have the same subject line and that the lifetime of a conversation is usually less than a fixed period of time. Thus, they divided emails into groups such that all emails in each group have the same subject line and the maximum time distance between any two emails in a group is less than a fixed time window. Then, they tried to reconstruct the tree structure of email threads by detecting parent-child relations among the emails of the same group. Although their assumptions are suitable for reducing the search space for detecting the tree structure of conversations, they concluded in their evaluation that these assumptions, especially "same subject line for all emails in a same thread", are not completely accurate. Wang et al. [9] sharpened the distinction between the definitions of email conversation and email thread. They extracted email threads using the Zawinski algorithm and then merged and decomposed the extracted threads based on their subject lines in order to reconstruct the conversations.

Recently, Joshi et al. [4] used segmentation and detection of near-duplicate emails to find and organize messages that should be grouped together based on reply and forwarding relationships. They assumed that when an email replies to another email, the content of the old email exists in the newly composed email as quoted text in a separate segment, and they reconstructed conversation threads from these segmentation patterns. Erera and Carmel [3] clustered emails into coherent conversations using a similarity function that takes into consideration all relevant email attributes, such as subject, participants, date of submission, and content. Their approach is the most similar to ours in some aspects, but their method cannot reconstruct the accurate structure of conversation threads. There is also previous work that addressed the conversation reconstruction problem on forum data [10] and employed conversation threads to improve forum search [11].

While these methods are highly effective in detecting threads, they may fail to detect all conversations and to reconstruct their true structure. In fact, some prior work focuses only on detecting clusters of emails that belong to the same conversation and is not capable of reconstructing the accurate structure of threads. Some work also makes assumptions that are not consistent with our problem definition. In this work, we concentrate on reconstructing email conversation threads without any incorrect restricting assumption about either the structure or the content of conversations.

III. METHODOLOGY

Genetic Programming (GP) is a model for testing and selecting the best choice among a set of results, each represented by a complex structure such as a tree. GP generates solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover. In this work we use the GP model to reconstruct conversation threads in email corpora. The GP model rests on several pillars: representation of solutions, fitness function, initialization, selection strategy, and reproduction strategy. We continue by explaining each of these elements for the problem of reconstructing email conversation threads.

A. Representation

In our method, we characterize each conversation thread with a single chromosome. Thus, in the GP model, every chromosome is represented by a rooted tree such that the root of the tree corresponds to the first email in the conversation and the other tree nodes are organized according to the parent-child relations among the other emails in that conversation.


B. Initialization

In every GP model, an initial population is needed to start searching the space of possible solutions with the individuals in that population. To achieve good exploration, this population must be diverse enough to cover the whole search space. RAMPED HALF-AND-HALF is the traditional tree generation algorithm for genetic programming [12]. RAMPED HALF-AND-HALF takes a tree depth range $(D_{Min}, D_{Max})$ and picks a random value within that range. Then, with probability 1/2, it uses the GROW algorithm to generate the tree; otherwise it uses the FULL algorithm with the chosen depth. With this approach, the number of possible rooted trees with $n$ nodes is $O(n^n)$. We use a heuristic to prune the space of all possible trees by removing solutions that are not feasible for our problem. We reduce the search space from rooted trees to ordered rooted trees, since the emails in a conversation have a chronological order: a parent email always has an earlier time than its children. We apply this pruning both when generating the initial population and during reproduction in our GP method.
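The chronological pruning can be illustrated with a small generator of feasible trees. This is a minimal sketch under our own assumptions, not the RAMPED HALF-AND-HALF procedure itself: emails are identified by their index, `timestamps` and `d_max` are hypothetical inputs, and each email simply picks a uniformly random earlier email as its parent, subject to the depth cap.

```python
import random


def random_chronological_tree(timestamps, d_max, rng=random):
    """Build one random rooted tree over emails 0..len(timestamps)-1.

    Returns a dict mapping child -> parent (the root maps to None).
    Every parent is no later than its child, and no node lies deeper
    than d_max below the root.
    """
    if not timestamps:
        return {}
    if d_max < 1 and len(timestamps) > 1:
        raise ValueError("d_max must be at least 1 for more than one email")
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    root = order[0]
    parent, depth = {root: None}, {root: 0}
    for node in order[1:]:
        # Emails are placed in chronological order, so every node already in
        # the tree is a time-feasible parent; only the depth cap filters.
        candidates = [p for p in parent if depth[p] < d_max]
        chosen = rng.choice(candidates)
        parent[node] = chosen
        depth[node] = depth[chosen] + 1
    return parent


if __name__ == "__main__":
    times = [5, 1, 3, 9, 7]  # hypothetical send times
    print(random_chronological_tree(times, d_max=2, rng=random.Random(0)))
```

Because infeasible parents are never proposed, every generated individual already satisfies the time constraint, so no repair step is needed after initialization.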

C. Fitness Function

To evaluate the fitness of individuals from several different aspects, our method uses a multi-objective function. We first define the fitness components and then combine them with a weighted-sum strategy. The fitness of chromosome $C$ is defined as follows:

$$F(C) = W_t F_t(C) + W_S F_S(C) + W_P F_P(C) + W_C F_C(C). \tag{2}$$

In the above equation, $F_t(C)$ is a function that gives the time-fitness of chromosome $C$ according to the lifetime of its equivalent conversation and the number of emails in that conversation:

$$F_t(C) = \frac{n}{\mathrm{diff}(C)}, \tag{3}$$

where $n$ is the number of nodes in the chromosome and $\mathrm{diff}(C)$ is the time difference, in days, between the root of the chromosome and the email with the latest time in the chromosome. Unlike prior work, we do not impose a tight time window to limit the lifetime of conversations. We let conversations continue over a long period of time and give them a relatively good score if they include many emails.

$F_S(C)$ calculates the subject-fitness of chromosome $C$ from the aggregated similarity between the subjects of all parent-child pairs of emails in the conversation. This value is logarithmically normalized by the number of emails:

$$F_S(C) = \frac{\sum_{e_i, e_j \in C,\; e_i \to e_j} Sim_S(e_i, e_j)}{\log n}, \tag{4}$$

where $Sim_S(e_i, e_j)$ is the subject similarity of two emails $e_i$ and $e_j$ according to a normalized term-overlap measure. Let $S_{e_k}$ be the set of words belonging to the core subject of email $e_k$. Then, the subject similarity of two emails is defined as:

$$Sim_S(e_1, e_2) = \frac{2\,|S_{e_1} \cap S_{e_2}|}{|S_{e_1}| + |S_{e_2}|}. \tag{5}$$

In equation (2), $F_P(C)$ evaluates the relationships among the people who participate in a conversation. In this work we consider both the local closeness and the global closeness of the participants of a conversation. In other words, in addition to the relations among people that arise from their participation in the conversation in question, we consider their closeness in the social network of their communication. $F_{1p}(C)$ denotes the function that gives the similarity of the participants of conversation $C$, based on the similarity of the participants of all pairs of emails in that conversation. Thus, $F_{1p}(C)$ is calculated as follows:

$$F_{1p}(C) = \frac{\sum_{e_i, e_j \in C} Sim_p(e_i, e_j)}{2 \log_2 n}, \tag{6}$$

where $Sim_p(e_i, e_j)$ is the similarity of the participants of two emails $e_i$ and $e_j$. To estimate this value, normalized overlap similarity is used:

$$Sim_p(e_1, e_2) = \frac{2\,|P_{e_1} \cap P_{e_2}|}{|P_{e_1}| + |P_{e_2}|}, \tag{7}$$

where $P_{e_k}$ is the set of all participants of email $e_k$, including the sender and all recipients (To and Cc).

On the other hand, we define $F_{2p}(C)$ to calculate the closeness of the participants of emails that belong to the same conversation with regard to their social network. The idea is that people who are closer to each other in terms of communication are more likely to contribute to a particular conversation. It is calculated as follows:

$$F_{2p}(C) = \frac{\sum_{p_i, p_j \in p(C)} closeness_{SN}(p_i, p_j)}{2 \log |p(C)|}. \tag{8}$$

The above equation aggregates the closeness among the participants of one conversation, where $p(C)$ is the set of participants in conversation $C$. To estimate the closeness of participants we use their social communication network. First we create a directed graph representing the social network of email communication. In that graph, nodes represent participants, and for each email in the dataset, the node that represents the sender of the email is connected to the nodes that represent its recipients. Then, we can calculate the closeness of two nodes in this graph using social network distance measures. In this paper, we exploit neighborhood overlap, which is a simple measure of the closeness of two nodes. Let $adj(p)$ denote the set of all neighbors of node $p$ in the social network graph, including $p$ itself; the closeness of two nodes is defined as:

$$closeness_{SN}(p_i, p_j) = \frac{|adj(p_i) \cap adj(p_j)|}{|adj(p_i) \cup adj(p_j)|}. \tag{9}$$

Thus $F_P(C)$, the participant-fitness of chromosome $C$, is calculated as a combination of the local closeness and the global closeness of participants:

$$F_P(C) = W_{F1} F_{1p}(C) + W_{F2} F_{2p}(C). \tag{10}$$

In the above equation, $W_{F1}$ and $W_{F2}$ control the contribution of each part of the participant-fitness. Intuitively, $W_{F1}$ should be set higher than $W_{F2}$ so that local closeness matters more than global closeness.
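The two set-overlap measures behind the participant-fitness can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the function names and the `(sender, recipients)` input format are assumptions, and for simplicity the neighbor sets treat the communication graph as undirected, whereas the text builds a directed graph.

```python
from collections import defaultdict


def dice_overlap(a, b):
    """Normalized overlap 2|A ∩ B| / (|A| + |B|), used for Sim_S and Sim_p."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def build_adjacency(emails):
    """Neighbor sets from (sender, recipients) pairs; each node includes itself.

    Assumption: edges are treated as undirected when collecting neighbors.
    """
    adj = defaultdict(set)
    for sender, recipients in emails:
        adj[sender].add(sender)
        for r in recipients:
            adj[sender].add(r)
            adj[r].update((r, sender))
    return adj


def neighborhood_overlap(adj, p, q):
    """closeness_SN: |adj(p) ∩ adj(q)| / |adj(p) ∪ adj(q)|."""
    adj_p, adj_q = adj.get(p, {p}), adj.get(q, {q})
    return len(adj_p & adj_q) / len(adj_p | adj_q)


if __name__ == "__main__":
    mails = [("a", ["b", "c"]), ("b", ["c"]), ("d", ["a"])]  # toy data
    graph = build_adjacency(mails)
    print(dice_overlap({"a", "b", "c"}, {"b", "c", "d"}))
    print(neighborhood_overlap(graph, "a", "b"))
```

Both measures lie in [0, 1], so the sums in the fitness terms grow with the number of pairs considered, which is what the logarithmic normalization compensates for.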


One of the most important aspects of the quality of a detected conversation is the similarity among the contents of the emails that belong to it. We capture this aspect in the fitness function with $F_C(C)$, which scores a conversation by aggregating the content similarity between all parent-child emails in the conversation. $F_C(C)$ is calculated as:

$$F_C(C) = \frac{\sum_{e_i, e_j \in C,\; e_i \to e_j} Sim_C(e_i, e_j)}{\log n}, \tag{11}$$

where $Sim_C(e_i, e_j)$ is the content similarity between emails $e_i$ and $e_j$. There are many methods for estimating the similarity between text documents. Among them, we use the vector space model, which represents each document as a vector of term weights and estimates the similarity of documents by calculating the similarity between their term vectors. In this paper, we represent the content of each email with a term vector, set the term weights with the TF-IDF method, and use cosine similarity to estimate the similarity of the vectors.

D. Selection

In the GP model, during each successive generation, a proportion of the existing population is selected to breed a new generation. In our method, we use a combination of the Roulette Wheel method and Elitism to select individual solutions. In each iteration we select $W$ individuals such that $\alpha W$ individuals are taken from the best individuals in terms of the fitness function and $(1-\alpha)W$ individuals are selected through a fitness-proportionate strategy. In this way, fitter solutions, as measured by the fitness function, are more likely to be selected. We assign each individual in the population a selection probability calculated as:

$$P(C_k) = \frac{F(C_k)}{\sum_{i=1}^{N} F(C_i)}. \tag{12}$$

Therefore each individual has a chance, proportional to its probability, of being selected to generate the next generation.

E. Reproduction

To generate the next generation from the individuals chosen by the selection strategy, we use duplication, mutation, and crossover. We generate $\beta\%$ of the next generation by duplicating each individual $N(C)$ times, where:

$$N(C_k) = \frac{\beta}{100}\, N\, \frac{F(C_k)}{\sum_{i=1}^{W} F(C_i)}, \tag{13}$$

where $W$ is the number of selected individuals and $N$ denotes the size of the population. Duplication supports exploitation in our GP method through inheritance of the good properties of the last generation. We generate the remaining $(100-\beta)\%$ of the next generation with mutation and crossover. For each new individual, we first decide whether to generate it through mutation, with probability $\lambda$, or through crossover, with probability $1-\lambda$.

If mutation is chosen, we select an individual from the selected individuals and choose one of its nodes at random. Then, we produce a new subtree using the initialization strategy. After that, we remove the subtree rooted at the chosen node from the selected individual and replace it with the newly generated subtree if the following conditions are satisfied:

• The time of the parent of the chosen node in the individual is earlier than the time of the root of the newly generated subtree.
• The depth of the tree after replacement does not exceed $D_{Max}$.

The first condition ensures that reproduction generates feasible solutions, and the second controls the size of the chromosome to prevent bloating.

If crossover is chosen, we select two individuals from the selected individuals and choose a node from each at random. Then, we exchange the subtrees rooted at the chosen nodes if the time and size constraints are satisfied for the new trees.

After a mutation or crossover operation, we add the newly bred individual to the new generation and repeat the process until the size of the new generation equals that of the initial population. We set $\beta = 20\%$. It has been shown experimentally that crossover tends to cause fitness to rise more rapidly with a larger population [13]. In our problem the population is large, so we set $\lambda = 0.25$.

For reproduction in our GP method, we additionally use the idea of simulated annealing. We control the diversity of generations according to the number of iterations of the algorithm. It is desirable to generate new individuals with high diversity in the early stages and to decrease the degree of diversity gradually in the following stages. To achieve this goal, when we breed a new individual by mutation or crossover, it is added to the next generation if its fitness is higher than that of its parents; otherwise, we add the new individual with the following probability:

$$P_{add}(C_{new}) \propto e^{\frac{F(C_{old}) - F(C_{new})}{T}}, \tag{14}$$

where $T$ is the number of iterations up to the current stage. This technique lets our GP method produce new generations with regard to both exploration and exploitation.

F. Termination

The process of generation, fitness evaluation, and selection is repeated until a termination condition is reached. In our method, the termination condition is satisfied when the average fitness of the individuals in the population does not increase during three consecutive stages.

With these elements in place, our method first chooses the initial population and then repeats the following steps until termination: (1) calculate the fitness of each individual, (2) select the best-fit individuals, and (3) breed new individuals through crossover and mutation to generate a new generation.
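The selection and duplication steps can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: `select` and `duplication_counts` are hypothetical names, fitness values are assumed non-negative with at least one positive, the elite count is rounded to an integer, and the roulette-wheel draws are made with replacement over the whole population (the text does not say whether elites are excluded from the wheel).

```python
import random


def select(population, fitness, w, alpha, rng=random):
    """Pick w individuals: about alpha*w elites plus roulette-wheel draws."""
    ranked = sorted(population, key=fitness, reverse=True)
    n_elite = round(alpha * w)
    elites = ranked[:n_elite]
    # Fitness-proportionate sampling: P(C_k) = F(C_k) / sum_i F(C_i).
    weights = [fitness(c) for c in population]
    drawn = rng.choices(population, weights=weights, k=w - n_elite)
    return elites + drawn


def duplication_counts(selected, fitness, beta, n):
    """Copies per selected individual: (beta/100) * n * F(C_k) / sum_i F(C_i).

    Values are real numbers; how they are rounded is left open here.
    """
    total = sum(fitness(c) for c in selected)
    return [beta * n * fitness(c) / (100 * total) for c in selected]


if __name__ == "__main__":
    scores = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0, "e": 1.0}  # toy fitness
    chosen = select(list(scores), scores.get, w=4, alpha=0.5,
                    rng=random.Random(1))
    print(chosen)
    print(duplication_counts(chosen, scores.get, beta=20, n=100))
```

Summing the duplication counts over the selected individuals gives exactly $\beta N / 100$, so duplication fills the intended share of the next generation before mutation and crossover supply the rest.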

IV. EXPERIMENTS

In this section, we present experimental results that indicate the effectiveness of our method. We first describe the test collection, then state the evaluation measures used, and finally describe and analyze the experimental results.


A. Data Collection

The dataset we use for experimental evaluation is the BC3 corpus [14], which is a subset of W3C. The W3C corpus is derived from a crawl of the World Wide Web Consortium’s sites at w3c.org. The BC3 corpus contains 269 emails and 40 conversation threads, with an average of about 6.7 emails per thread. The dataset is annotated with human-written abstract summaries and is also labeled for speech acts. The longest conversation comprises 11 emails, the maximum depth of the conversation trees is 6, and the maximum degree of nodes is also 6. The tagged conversation threads of the BC3 corpus are shown in Fig. 1. The text content of the BC3 emails has 3222 sentences overall. The corpus contains 162 IDs, so the BC3 social network graph has 162 nodes. The total number of edges in the social network graph is 423; if we consider the weighted graph, the number of edges is 319, with a maximum weight of 8. The social network graph of email communications in the BC3 corpus is shown in Fig. 2.

Figure 1. Conversation threads in BC3 Corpus

Figure 2. Social network graph of email communications in BC3 corpus

B. Experimental Results

Upon termination of the algorithm, a dynamic threshold is used to select the best chromosomes as the final results. For dynamic thresholding, we sort the chromosomes by fitness, find the maximum difference between the scores of two successive chromosomes, and cut the list at this point. We tested our algorithm in different cases and tuned the parameters of our method. The tuned parameter values are given in Table I.

TABLE I. PARAMETERS VALUE

Parameter   Explanation                                        Value
DMax        Maximum depth of chromosomes                       15
Wt          Weight of time-fitness                             0.05
WS          Weight of subject-fitness                          0.25
WC          Weight of content-fitness                          0.45
WP          Weight of participant-fitness                      0.25
-           Number of Initial Population                       100
-           Number of Selected Individuals in each iteration   20

Our evaluation process is based on comparing the tagged conversation threads in BC3 with the conversations detected by our algorithm. To evaluate the effectiveness of our method, we use several measures. Let $N_A$ denote the number of email pairs that belong to the same detected conversation and for which there is a tagged conversation in BC3 containing both emails of the pair. Also let $N_{TC}$ denote the number of email pairs that belong to a single tagged conversation. It is clear that:

$$N_{TC} = \sum_{C_i \in TC} \frac{numC_i\,(numC_i - 1)}{2}, \tag{15}$$

where $numC_i$ is the number of emails in tagged conversation $C_i$. We define the Precision of the algorithm in detecting conversations, compared with the tagged conversations, to be:

$$Precision = \frac{N_A}{N_{TC}}. \tag{16}$$

Note that in the best case the value of Precision is 1.0, since then any pair of emails belonging to a tagged conversation also belongs to the corresponding detected conversation. This measure determines the quality of conversation thread detection regardless of structure. To evaluate the power of the algorithm in reconstructing conversations with an accurate structure, we define $Precision_{PC}$ as follows:

$$Precision_{PC} = \frac{PC_A}{PC_{TC}}, \tag{17}$$

where $PC_A$ denotes the number of parent-child relations that belong to one of the conversations detected by our algorithm and for which there is a tagged conversation in BC3 containing that relation, and $PC_{TC}$ denotes the number of all parent-child relations that exist in the tagged conversations. $Precision_{PC}$ shows the ability of our algorithm to detect parent-child relations and, consequently, to detect the accurate structure of conversation threads. In addition, to assess the ability of our method to reconstruct all conversation threads, we calculate the Recall measure.
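The pairwise measures defined in this subsection can be computed with plain set operations. The sketch below follows the definitions of Precision and the parent-child variant as stated in the text; the function names and the input formats (lists of email IDs per conversation, and `(parent, child)` tuples for edges) are our own assumptions, and conversations are assumed not to share emails.

```python
from itertools import combinations


def same_thread_pairs(threads):
    """All unordered email pairs that share a conversation."""
    pairs = set()
    for thread in threads:
        pairs.update(frozenset(p) for p in combinations(thread, 2))
    return pairs


def pairwise_precision(detected, tagged):
    """N_A / N_TC: detected same-thread pairs confirmed by the tagging,
    divided by the number of same-thread pairs in the tagged conversations."""
    tagged_pairs = same_thread_pairs(tagged)
    n_a = len(same_thread_pairs(detected) & tagged_pairs)
    return n_a / len(tagged_pairs)


def parent_child_precision(detected_edges, tagged_edges):
    """PC_A / PC_TC over directed (parent, child) relations."""
    detected_edges, tagged_edges = set(detected_edges), set(tagged_edges)
    return len(detected_edges & tagged_edges) / len(tagged_edges)


if __name__ == "__main__":
    tagged = [[1, 2, 3], [4, 5]]    # toy gold conversations
    detected = [[1, 2], [3, 4, 5]]  # toy system output
    print(pairwise_precision(detected, tagged))
    print(parent_child_precision({(1, 2), (2, 3), (4, 5)},
                                 {(1, 2), (1, 3), (4, 5)}))
```

In the toy data the tagged conversations contain 3 + 1 = 4 pairs, which agrees with summing $numC_i(numC_i - 1)/2$ over the two tagged conversations.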

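The dynamic thresholding used in Section IV-B to pick the final chromosomes (sort by fitness, find the largest gap between successive scores, cut there) is short enough to sketch directly. The name `dynamic_cut` and the `(chromosome, score)` input format are illustrative assumptions; when several gaps tie for the maximum, this sketch cuts at the first one.

```python
def dynamic_cut(scored):
    """Keep the chromosomes ranked above the largest drop in fitness.

    `scored` is a list of (chromosome, fitness) pairs.
    """
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    if len(ranked) < 2:
        return ranked
    # Gap i is the score drop between rank i and rank i + 1.
    gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ranked[:cut]


if __name__ == "__main__":
    final = [("t3", 0.40), ("t1", 0.90), ("t4", 0.35), ("t2", 0.85)]
    print(dynamic_cut(final))  # keeps t1 and t2, the scores above the big drop
```

The cut point adapts to the score distribution of each run, so no fixed fitness threshold has to be tuned by hand.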

We tested the methods of Lewis and Knowles [6] and Wu and Oard [2], which do not consider the structure of conversations, and the method of Joshi et al. [4], which can reconstruct conversation structure, on BC3, and we report their best results. The precision and recall of our results compared with the other methods are given in Table II. As can be seen, our method is highly effective in reconstructing all conversation threads. Joshi et al. assumed that when an email replies to another email, the content of the old email exists in the newly composed email as quoted text in a separate segment, and they reconstructed conversation threads from patterns of near-duplicate segments. In BC3, nearly 30% of reply emails do not contain quoted text; therefore the method of Joshi et al. cannot obtain a high recall in thread detection. Wu and Oard grouped emails into the same conversation thread if they have the same subject line and at least one sender/recipient in common. Unlike them, our method does not make any restricting assumption to reduce the search space, and no solution is inaccessible to it. Lewis and Knowles focused only on the body text of emails and did not consider other features, whereas we take all important features of emails into account in this task. Therefore, our method is highly effective in reconstructing conversations in terms of precision and recall.

TABLE II. PRECISION AND RECALL

Method                     Precision   PrecisionPC   Recall (regardless of   Recall (regarding
                                                     the structure)          the structure)
Our Evolutionary Method    0.747       0.796         0.937                   0.876
Wu and Oard method         0.314       -             0.511                   -
Lewis and Knowles method   0.575       -             0.425                   -
Joshi et al. method        -           0.873         -                       0.63

Another evaluation is also of interest. After sorting the chromosomes upon termination of the algorithm, instead of using the dynamic threshold to select the best of them as the final results, we select R chromosomes, where R is the smallest number such that all tagged conversation threads appear in the descending sorted list before position R. Then, we calculate the Precision of our method on this set. In other words, we calculate the R-Precision measure. Table III presents the R-Precision of our results.

TABLE III. R-PRECISION

R-Precision   R-PrecisionPC
0.889         0.784

These results show that our method detects all conversations at an acceptable ranking.

V. CONCLUSION AND FUTURE WORK

The goal of this work was to investigate a method for reconstructing email conversation threads with an accurate structure. We modeled the problem of reconstructing conversation threads as an optimization problem and used genetic programming to solve it. Finally, we evaluated our work on the BC3 email corpus and showed that exploiting genetic programming improves the reconstruction of conversation threads in email corpora.

Because some parameter values are difficult to determine, it could be beneficial to make those parameters part of the solution. For example, we could represent chromosomes in our model by a jungle of rooted trees instead of a single rooted tree. The threshold used to select the final results would then be part of the solution and would be determined automatically. Another way to improve the quality of the detected conversations is to use semantic similarity among the contents of emails, for example using speech acts, to estimate content similarity more accurately, and to use better closeness measures to estimate the similarity of people with regard to their social network.

ACKNOWLEDGMENT

Author M. D. would like to thank Maedeh Mosharraf and Morteza Mohgheghi for their valuable comments on the paper, and the people of the Intelligent Information Systems laboratory for useful early discussions. This research is partially supported by the Research Institute for ICT.

REFERENCES

[1] B. Sagan, “A note on Abel polynomials and rooted labeled forests,” Discrete Math., vol. 44, pp. 293–298, 1983.

[2] Y. Wu and D. W. Oard, “Indexing emails and email threads for retrieval,” in Proceedings of the 28th Annual International ACM SIGIR Conference, Salvador, Brazil, pp. 665–666, 2005.

[3] S. Erera and D. Carmel, “Conversation detection in email systems,” in Proceedings of the IR Research, 30th European Conference on Advances in Information Retrieval (ECIR'08), pp. 498–505, 2008.

[4] S. Joshi, D. Contractor, K. Ng, P. M. Deshpande, and T. Hampp, “Auto-grouping emails for faster e-discovery,” Proceedings of the VLDB Endowment, vol. 4, no. 12, pp. 1284–1294, 2011.

[5] J. Zawinski, “Message threading,” www.jwz.org/doc/threading.html, 2000. Accessed: 11/23/2011.

[6] D. Lewis and K. A. Knowles, “Threading electronic mail: a preliminary study,” Information Processing & Management, pp. 209–217, 1997.

[7] D. Shen, Q. Yang, J. T. Sun, and Zh. Chen, “Thread detection in dynamic text message streams,” in Proceedings of the 29th Annual International ACM SIGIR Conference, New York, NY, USA, pp. 35–42, 2006.

[8] H. Aaron and Y. Jen-Yuan, “Email thread reassembly using similarity matching,” in Proceedings of the Third Conference on Email and Anti-Spam (CEAS), 2006.

[9] X. Wang, M. Xu, N. Zheng, and N. Chen, “Email conversations reconstruction based on messages threading for multi-person,” in Proceedings of the 2008 International Workshop on Education Technology and Training & 2008 International Workshop on Geoscience and Remote Sensing, Volume 01 (ETTANDGRS '08), IEEE Computer Society, Washington, DC, USA, pp. 676–680, 2008.

[10] E. Aumayr, J. Chan, and C. Hayes, “Reconstruction of threaded conversations in online discussion forums,” ICWSM, 2011.

[11] H. Duan and C. Zhai, “Exploiting thread structures to improve smoothing of language models for forum post retrieval,” ECIR'11, pp. 350–361, 2011.

[12] J. R. Koza, “Genetic programming: on the programming of computers by means of natural selection,” MIT Press, Cambridge, MA, USA, 1992.

[13] S. Luke and L. Spector, “A revised comparison of crossover and mutation in genetic programming,” in J. Koza et al., editors, Proceedings of the Third Annual Genetic Programming Conference, San Francisco, CA, Morgan Kaufmann, pp. 208–213, 1998.

[14] J. Ulrich, G. Murray, and G. Carenini, “A publicly available annotated corpus for supervised email summarization,” AAAI08 EMAIL Workshop, Chicago, USA, 2008.