Mining Contexts for Recommending Source Locations to Explore
Seonah Lee and Sungwon Kang Dec. 14, 2013
International Workshop on ICT, 2013
Outline
1. Introduction
2. Proposed Approach
3. Evaluation
4. Conclusion

1. Introduction
- Research Background
- Related Work
- Research Question

Research Background (1/2)
The relative cost of software evolution now represents more than 90% of a system's total cost [Erlikh 2000].
In software evolution tasks:
- Programmers spend more than 50% of their time understanding source code [Fjeldstad 1983]
- Programmers spend 35% of their time navigating code bases [Ko 2005]
Activities: Understanding, Navigating, Changing
Software evolution task: the smallest identifiable and essential piece of a job that serves as a unit of work that changes a software system (i.e., fixing bugs and enhancing features)

Research Background (2/2)
Programmers look for new source locations that may be related to a given task [Letovsky 1986] [Ko 2005] [Latoza 2010].
The number of potential navigation paths increases exponentially.
[Figure: source code views (e.g., createArrowMenu)]

Related Work
Interaction History-based Code Recommenders
Mylyn [Kersten 2006]
- Displays a collection of source locations when a programmer selects a task ID
- Counts the frequencies of source locations
- Requires a programmer's manual indication of a task
TeamTracks [DeLine 2005]
- Recommends source locations historically associated with the location that a programmer selects
- Determines associations between source locations
- Limited to recommending co-visited source locations

Research Question
To effectively guide programmers' code navigation, collections of source locations to explore should be provided automatically.
How can these collections of source locations be automatically created and visualized?
[Figure: a programmer visits source locations; the collection of source locations relevant to a given task is the unknown]

2. Proposed Approach
- Definition (Navigation Context)
- Principles (for Mining Contexts)
- Steps (for Mining Contexts)
- Tool (for Mining and Recommending Contexts)

Definition (Navigation Context)
Conceptually: the information that a programmer needs to explore and understand during a software evolution task
Technically: a collection of source locations that are frequently visited and relevant to similar tasks

Principles (for Mining Contexts)
Relevance by Frequency: The source locations that programmers frequently visited are likely to be highly relevant to the tasks of the programmers.
Relevance by Context: If a source location of a navigation sequence is highly relevant to a task, it is likely that the other source locations in the same navigation sequence are relevant to the same task.

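The two principles can be illustrated with a minimal Python sketch. It assumes navigation sequences are given as strings of single-letter location names (hypothetical sample data) and that one location, `d`, is already known to be relevant. `co_visited` is an illustrative helper, not part of the NavClus tool.

```python
from collections import Counter

def co_visited(sequences, relevant):
    """Rank locations that share a navigation sequence with a relevant location.

    Relevance by Context: only sequences containing `relevant` contribute.
    Relevance by Frequency: locations are ranked by how often they were visited.
    """
    counts = Counter()
    for seq in sequences:
        if relevant in seq:
            counts.update(loc for loc in seq if loc != relevant)
    return counts.most_common()

# Hypothetical navigation sequences; each letter stands for a source location.
sequences = ["abc", "abd", "bd", "efg", "efc", "fc", "f", "cabx", "bd"]
print(co_visited(sequences, "d"))  # [('b', 3), ('a', 1)]
```

Here `b` ranks first because it appears in all three sequences that contain `d`.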
Steps (for Mining Contexts in Programmer Interaction Histories)
Navigation Context = Retrieve (Mine (Segment (InteractionHistories)), Navigation Path)
Interaction Traces:
a b c a b d b d, e f g e f c f c f, c a b x b d
Segment:
(a b c) (a b d) (b d) (e f g) (e f c) (f c) (f) (c a b x) (b d)
Mine:
Micro-clustering (Cosine Similarity: A • B / (||A|| ||B||))
{ (a, 2), (b, 2), (c, 1), (d, 1) }
{ (b, 2), (d, 2) }
{ (e, 2), (f, 2), (g, 1), (c, 1) }
{ (f, 2), (c, 1) }
{ (c, 1), (a, 1), (b, 1), (x, 1) }
Macro-clustering (K-nearest clustering)
{ (b, 5), (a, 3), (d, 3), (c, 2), (x, 1) }
{ (e, 2), (f, 4), (c, 2), (g, 1) }
Retrieve (TF • IDF Similarity):
Navigation path: { c, d }
Retrieved context: (b, 5), (a, 3), (d, 3), (c, 2), (x, 1)

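The Segment and Retrieve steps, and the cosine similarity used in mining, can be sketched in Python as follows. This is a sketch under stated assumptions, not the NavClus implementation. The segmentation rule (start a new segment when a location repeats within the current one) is inferred from the example traces. The TF-IDF weighting in `retrieve` is one common scheme chosen for illustration. The clustering itself is not reproduced, because the slides do not give its thresholds or parameters. All function names are hypothetical.

```python
from collections import Counter
from math import log, sqrt

def segment(trace):
    """Split a trace into segments (assumed rule: a repeated location starts a new segment)."""
    segments, current = [], []
    for loc in trace:
        if loc in current:
            segments.append(current)
            current = []
        current.append(loc)
    if current:
        segments.append(current)
    return segments

def cosine(a, b):
    """Cosine similarity A . B / (||A|| ||B||) between two frequency vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(contexts, path):
    """Return the context that best matches the path under an assumed TF-IDF weighting."""
    n = len(contexts)

    def idf(loc):
        df = sum(1 for ctx in contexts if loc in ctx)
        return log(n / df) if df else 0.0

    def score(ctx):
        total = sum(ctx.values())
        return sum(ctx[loc] / total * idf(loc) for loc in path if loc in ctx)

    best = max(contexts, key=score)
    # Most frequently visited locations first.
    return sorted(best.items(), key=lambda item: -item[1])

# Interaction traces from the slide, one letter per source location.
traces = ["abcabdbd", "efgefcfcf", "cabxbd"]
segments = ["".join(s) for t in traces for s in segment(t)]
print(segments)

# Two micro-clusters from the slide, compared by cosine similarity.
micro1 = Counter(a=2, b=2, c=1, d=1)
micro2 = Counter(b=2, d=2)
print(round(cosine(micro1, micro2), 3))

# Macro-clusters from the slide, queried with the navigation path {c, d}.
contexts = [Counter(b=5, a=3, d=3, c=2, x=1), Counter(e=2, f=4, c=2, g=1)]
print(retrieve(contexts, ["c", "d"]))
```

With this weighting, `c` occurs in both contexts and therefore carries no weight, so `d` alone selects the first context, which matches the retrieved context on the slide.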
Tool (for Mining & Recommending Contexts)
A graphical code recommender that visualizes source locations to visit
It incorporates the proposed approach
- Display history
- Display recommendations
- Collect interaction traces
- Jump to source locations
- Update diagram layout
[Figure: tool screenshot showing the Recommendations and Histories views]

3. Evaluation
- Evaluation Plan
- Evaluation Results

Evaluation Plan
Experiment in an early phase
- Simulation using experimental data: 12 programmers performed the same 4 tasks
- User study (Wizard-of-Oz study): 11 programmers performed the same 4 tasks
Application in a later phase
- Simulation using real data: 4,397 interaction traces, extracted from the Eclipse Bugzilla system
- User study (diary study): 10 programmers used the tool in their environment for a month

Evaluation Results
Simulations
NavClus showed two times higher recommendation accuracy than TeamTracks.
F-measure by project:
             Platform  PDE    ECF    MDT    Mylyn
TeamTracks   0.058     0.055  0.091  0.034  0.082
NavClus      0.122     0.2    0.191  0.051  0.14
User Studies
- Wizard-of-Oz study: 9 out of 11 programmers evaluated it positively ("It provides a crucial hint", "Uh, here are all answers")
- Diary study: limited to individual use of the tool. Although all 10 programmers rated NavClus highly, it was not because of the NavClus recommendations

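As a rough check on the "two times" claim, the per-project F-measure values can be averaged and compared. The sketch below assumes the balanced F-measure (the harmonic mean of precision and recall), since the slides do not state which variant was used. The unweighted mean across projects is a choice made here for illustration, and the value-to-project mapping follows the results chart as read from the slide.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)

# F-measure per project, as read from the results chart.
teamtracks = {"Platform": 0.058, "PDE": 0.055, "ECF": 0.091, "MDT": 0.034, "Mylyn": 0.082}
navclus = {"Platform": 0.122, "PDE": 0.2, "ECF": 0.191, "MDT": 0.051, "Mylyn": 0.14}

ratio = mean(list(navclus.values())) / mean(list(teamtracks.values()))
print(f"NavClus / TeamTracks mean F-measure ratio: {ratio:.2f}")
```

Under these assumptions the ratio comes out at about 2.2, which is consistent with the reported result.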
4. Conclusion
- Conclusion
- Future Work

Conclusion
RQ: How can navigation contexts be automatically created and visualized?
Navigation context: the information that a developer needs to explore and understand during a software evolution task
- We propose a clustering technique that automatically forms past programmers' navigation contexts
- We implemented the NavClus tool and investigated its effectiveness in real-world development

Future Work
Comparison of Recommendation Techniques
- Data clustering
- Association rule mining
- Hidden Markov model
Additional User Studies for Collaboration
- Contextual knowledge transfer
- Training newcomers

Questions?
Seonah Lee