Front. Comput. Sci.
RESEARCH ARTICLE
An Evaluation Framework for Software Crowdsourcing

Wenjun WU 1, Wei-Tek TSAI 2,3, Wei LI 1

1 State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China
2 School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85281, USA
3 Department of Computer Science and Technology, INLIST, Tsinghua University, Beijing 100084, China
© Higher Education Press and Springer-Verlag Berlin Heidelberg 2012
Abstract Recently, software crowdsourcing has become an emerging area of software engineering. Few papers have presented a systematic analysis of the practices of software crowdsourcing. This paper first presents an evaluation framework to evaluate software crowdsourcing projects with respect to software quality, costs, diversity of solutions, and the competitive nature of crowdsourcing. Specifically, competitions are evaluated by the min-max relationship from game theory among participants, where one party tries to minimize an objective function while the other party tries to maximize the same objective function. The paper then defines a game theory model to analyze the primary factors in these min-max competition rules that affect the nature of participation as well as software quality. Finally, using the proposed evaluation framework, this paper illustrates two crowdsourcing processes, Harvard-TopCoder and AppStori. The framework demonstrates the sharp contrasts between the two crowdsourcing processes, as participants behave quite differently when engaging in these two projects.

Keywords crowdsourcing, software engineering, competition rules, game theory

Received Month dd, yyyy; accepted Month dd, yyyy
E-mail: [email protected]
1 Introduction
Crowdsourcing has captured the attention of the world recently [1]. Numerous tasks or designs conventionally carried out by professionals are now being crowdsourced to the general public to perform in a collaborative manner. Specifically, crowdsourcing has been used for identifying chemical structures, data mining, medical drug development, logo design, and software development. Software crowdsourcing platforms including Apple's App Store, TopCoder [2], and uTest [3] demonstrate the advantage of crowdsourcing in terms of software ecosystem expansion and product quality improvement. Apple's App Store is an online iOS application market [4][5], where developers can directly deliver their creative designs and products to smartphone customers. These developers are motivated to contribute innovative designs for both reputation and payment through the micro-payment mechanism of the App Store. In less than four years, Apple's App Store has become a huge mobile application ecosystem with 150,000 active publishers, and has generated over 700,000 iOS applications [6]. Around the App Store, there are many
community-based, collaborative platforms that incubate smartphone applications. For example, AppStori [7] introduces a crowd funding approach to build an online community for developing promising ideas about new iPhone applications. Another crowdsourcing example, TopCoder, creates a software contest model where programming tasks are posted as contests and the developer of the best solution wins the top prize. Following this model, TopCoder has established an online platform to support its ecosystem and gathered a virtual global workforce with more than 250,000 registered members and nearly 50,000 active participants. All these TopCoder members compete against each other in software development tasks such as requirement analysis, algorithm design, coding, and testing. In all these practices of software crowdsourcing, the way of organizing software development is changing from the traditional software factory or open source team to decentralized, peer-production-based ecosystems of software developers. Different from open source software, the openness of software crowdsourcing does not refer to free access to source code. Instead, it denotes an OPEN call for participation in any task of software development, including documentation, design, coding, and testing. These tasks are normally conducted by either internal members within a software enterprise or people from contracting firms. But in software crowdsourcing, all the tasks can be assigned to anyone from the general public. Moreover, software crowdsourcing introduces more explicit incentives, such as financial rewards, to motivate community developers than open source software does. Thus, it greatly extends the concept of the open source community into the notion of a market-driven software ecosystem. Although such a paradigm shift indicates a new trend in software engineering, the fundamental principles behind software crowdsourcing have not been explored yet. We need guidelines to determine which parts of a complex software system should be crowdsourced. And more importantly, what are the ultimate goals of a software crowdsourcing process? A successful software crowdsourcing project must be able to broaden participation, ensure the quality of solutions, encourage diversity of solutions, identify potential talents and
maximize learning for both active participants and passive observers. This paper examines a variety of issues in software crowdsourcing including quality, costs, diversity of solutions, and competition scenarios. Many authors have argued that crowdsourcing in general encourages creativity and problem solving [8], but software crowdsourcing has many unique features and issues. Specifically, software crowdsourcing needs to support:
● The rigorous engineering discipline of software development, such as the rigid syntax and semantics of programming and modeling languages, and documentation and process standards such as UML, CMMI [9], and 2167A [10];
● The creativity aspects of software requirement analysis, design, testing, and evolution;
● The Big Data aspects, as numerous data will be generated and need to be analyzed for the evaluation of both clients and developers, and their products;
● The psychology issues of crowdsourcing such as competition, incentive, recognition, openness, sharing, collaboration, and learning. The atmosphere must be competitive while at the same time friendly, sociable, and educational, offering personal fulfillment for participants, requesters, and administrators;
● The financial aspects of all the parties including requesters, crowdsourcing platforms, and participants;
● Quality aspects, including objective qualities such as functional correctness, performance, security, reliability, maintainability, and safety, and subjective qualities such as usability;
● The liability issues in case a software failure causes harm: who is responsible for the software faults? Developers? Administrators? Requesters? Users?
● Reputation of all parties including requesters, administrators, evaluators, and participants.
This paper is organized as follows: Section 2 enumerates the major goals of software crowdsourcing and formulates the relevant factors; Section 3 proposes an evaluation framework to analyze software crowdsourcing processes; Section 4 models the competition mechanism defined in the evaluation framework via game theory; Section 5 uses the proposed
evaluation framework to present the Harvard-TopCoder and AppStori development processes; related work and the conclusion can be found in Sections 6 and 7.
2 Characterizing Software Crowdsourcing Processes

Software crowdsourcing can be broadly classified into two categories: 1) outsourcing only; and 2) outsourcing with competitions. TopCoder is an example of crowdsourcing with competitions, where people contribute their software by participating in competitions, whereas uTest is an example of outsourcing without competitions. This paper focuses on competition-based software crowdsourcing.

2.1 Goals of Software Crowdsourcing

Common software crowdsourcing goals are listed below.

Quality software: Obtaining quality software is a common goal for software crowdsourcing projects. But software quality has many dimensions, including reliability, performance, security, safety, maintainability, controllability, and usability. Thus crowdsourcing organizers need to define specific software quality goals and their evaluation criteria. Quality software often comes from competent contestants who can submit good solutions for rigorous evaluation.

Rapid acquisition: Instead of waiting for a software team to develop software, crowdsourcing organizers may post a competition, hoping that some teams have already developed something similar and can deliver the needed software rapidly. Their goal is to reduce the time to acquire the software product, rather than acquiring quality software.

Talent identification: A crowdsourcing organizer may be more interested in identifying talents as demonstrated by their performance in the competition. Many outstanding TopCoder contestants have been identified and offered jobs by leading IT companies. This is also a reason that talented programmers participate in crowdsourcing competitions.

Cost reduction: A common goal for a crowdsourcing organizer is to acquire software at a low cost. However, in addition to paying winning teams, crowdsourcing projects also need to pay for competition organization
and award prizes. Furthermore, software cost often includes future software maintenance. The objective is to minimize the cost of software crowdsourcing, including organization and evaluation costs and the price to be paid. The "price" may consist of both monetary and recognition rewards. An unknown organization may run a competition with a high monetary reward to attract contestants to participate, but a leading company may pay a low price and still attract many contestants, as winning the competition will significantly improve a contestant's reputation ranking.

Solution diversity: Fault-tolerant systems often require multiple versions of software to be used as redundancy, so that if one of them fails, the remaining versions can be used for recovery. In competition-based software crowdsourcing, multiple versions of the software may be developed by contestants, and they can be used to enhance reliability. Schenk and Guittard [11] argued that it is important to have contestants from different regions of the world and with diverse backgrounds and professional skills work together to produce quality software. The objective is to maximize the diversity in crowdsourcing, which means more competitive contestants with diverse backgrounds producing multiple versions of the software.

Idea creation: In some crowdsourcing competitions, the goal is to get new ideas from contestants, hoping that these ideas can lead to real solutions later. The problem to be addressed may be very challenging, and thus organizers are not interested in the solutions submitted, but in the brilliant ideas in those solutions that often suggest new research directions.

Broadening participation: In some crowdsourcing competitions, the goal is to recruit as many contestants as possible. One reason is to obtain quality solutions, as the organizer can select the best solution among all the submissions; another possibility is that the organizer is mainly interested in making the event known among potential participants or observers, as publicity is the real goal. A competition with a large number of contestants requires significant management overhead to organize and evaluate submissions.

Participant education: In some software crowdsourcing competitions, organizers are not
interested either in the solutions submitted or in identifying talents; they are interested in teaching contestants new knowledge or skillsets via competitions. Such a crowdsourcing project can be exemplified by an educational game site named nonamesite.com, where high school students can participate in online competitive computer games. This web site is funded by DARPA to teach students STEM (Science, Technology, Engineering, and Mathematics) knowledge via competitions.

Fund leveraging: Sometimes a crowdsourcing project aims to stimulate other organizations and even companies to sponsor the same project [12]. For example, a national science organization may fund a competition on addressing the challenges of renewable energy, knowing that other organizations will participate in this competition by funding their own teams to explore innovative ideas on the topic. In this case, the science organization achieves its goal that more money will be available to support research in this specific area.

Marketing: Another goal for crowdsourcing organizers is to raise the publicity of their organization among potential participants. They often conduct competitions on a regular basis so that many people will recognize the name of the organization and the nature of its business.

Different targets: Crowdsourcing projects have different targets, as shown in Table 1.

Table 1 Multiple Crowdsourcing Targets

Targets: Solutions submitted
Rationale: 1) To obtain quality solutions at low cost; 2) To acquire solutions rapidly; 3) To obtain ideas in the solutions.

Targets: Participants
Rationale: 1) To identify talents; 2) To make the project or organization recognized; 3) To create a community of users with certain knowledge; 4) To teach participants certain knowledge.

Targets: Observers
Rationale: 1) To make them aware of the crowdsourcing organizations; 2) To motivate them to fund similar projects.
2.2 Goal-Driven Software Crowdsourcing Design

The design of a software crowdsourcing process is essentially determined by the goals of the software crowdsourcing project. Crowdsourcing organizers need to carefully examine the basic goals in the different stages of the software project to be crowdsourced. For example, during the phase of concept formation in requirement analysis, the top goal is to ensure the diversity of proposals so that the most innovative idea can be selected. Thus, the organizer must promote wider participation from the community to increase the diversity of solutions for a better-quality software product. But this increases both the competition prize and the administrative overhead of the competition, which leads to higher crowdsourcing costs. During the development phase, which aims at top-quality solutions, the organizers may introduce a higher incentive prize to attract more teams with excellent skills to compete with each other, or spend more money on extensive quality assurance, or both. Therefore, it is important to prioritize these goals and design the appropriate scope and boundaries for the elements of a crowdsourcing project, such as the problem statement, the competition prize, the eligibility of participants, and the competition rules.

After the priority of the specific goals is set, appropriate cooperation and competition mechanisms for the project must be devised in order to steer the process towards these goals. There are two major styles of strategies in the design of competition and awarding mechanisms: non-collaborative and collaborative competition mechanisms. First, when a project really intends to find the best talents in the labor market for developing high-quality software products, it needs to take a strong competition strategy, where superior contestants outperform the others by producing better products with fewer faults. Second, if education is the primary goal, the number of learners and the amount of knowledge they learn are the important criteria, while the quality of the solutions submitted by participants may not be a priority beyond providing an assessment of these participants. Thus, a mild and constructive competition strategy should be adopted to encourage everyone to share knowledge and experience with peers.

Section 3 presents an evaluation framework to formalize the design of software crowdsourcing and examine the major design factors.
3 Evaluation Framework for Software Crowdsourcing

Our evaluation framework consists of three parts: the participation-output method, the submission-award method, and the min-max competition model. The participation-output method defines a few common patterns for crowd production through the whole lifecycle of software development. The submission-award method analyzes a few awarding schemes for winners based on the merit of their submissions. The min-max competition model categorizes the major competition strategies.

3.1 Participation-Output Analysis

The competition can be N → N, where N people produce N different outputs. This is the case where each team produces exactly one output. This can be done if multiple teams participate in the competition, but each can submit at most one solution. Thus, all the teams can work on their components simultaneously but independently. As N increases, both the diversity and the cost increase. Specifically, a higher award price should be set to attract more people to compete, and more effort is needed for evaluation.

The competition can be N → M, where N people produce M outputs and M > N. This is similar to N → N except that teams may produce multiple outputs for the competition. This will happen if the prize at stake is high and contestants are willing to compete even knowing that some of their solutions will surely be rejected. This phenomenon has happened before: a Harvard-TopCoder competition reported 600 submissions from 125 individuals [13], so each team (assuming one team has one person only) submitted on average more than 5 entries to the competition. In this case, N = 125 but M = 600, and if only the top two teams win, teams have submitted
solutions knowing that most if not all of their solutions will be rejected. Interestingly, the competition is stiff, yet people were willing to compete.

Similarly, the competition can be N → M, where N people produce M outputs but N > M. This is the case where multiple teams work together to produce one output, and thus the competition is not as stiff as in the previous case. Furthermore, such a competition may no longer be fair, as some submissions are produced by individual teams but some by multiple teams. Potentially, in the case where the work is N → M and M > N, it is possible that teams produce more than one output and, at the same time, multiple teams work together to produce one output. As this is not commonly practiced, this kind of competition scenario will not be discussed further in this paper. As M decreases, the game becomes less competitive, and once M < N, the psychology among contestants changes significantly, as contestants start to help each other and form partnerships.

When M = 1, the competition is N → 1. In this case, all the N contestants work together to produce one submission, and the competition is actually friendly among contestants. However, the competition can also be N → part(1), where N people work on one submission and each team is responsible for one part of the submission only, with no overlap between parts. This process can be used when the software is decomposed into components that can be independently developed or evaluated. The competition is not fair, as each team receives a different problem to solve. The work can also be N → ovpart(1), where N people work on one submission and each team is responsible for a part, but a part may overlap with other parts. This process can be applied when the software can be decomposed into components that may not be independently developed or evaluated. The competition is not fair for the same reason as before. Table 2 summarizes these scenarios.
Table 2 Crowd-Work Analysis Space

Relationship: N → N, N ≥ 2
Description: Each team produces one output.
Characteristics: Simultaneous work possible; as N increases, the game becomes more competitive.

Relationship: N → M, M > N, N ≥ 2
Description: Each team can produce multiple outputs.
Characteristics: Simultaneous work possible; as N or M increases, the game becomes more competitive, and the reward must increase.

Relationship: N → M, M < N, N ≥ 2
Description: Multiple teams work together to produce one output.
Characteristics: Collaborative teams need to coordinate with each other to produce a quality output. As M decreases, the game becomes less competitive.

Relationship: N → 1, N ≥ 2
Description: All the teams work on the same output.
Characteristics: Teams need to synchronize to work together.

Relationship: N → part(1), N ≥ 2
Description: Each team works on one partition of the output, and the partitions do not overlap.
Characteristics: Each part may be worked on simultaneously without much coordination. As the number of partitions increases, the complexity of system integration increases.

Relationship: N → ovpart(1), N ≥ 2
Description: Similar to N → part(1), except that a partition may overlap with another partition.
Characteristics: Similar to N → part(1), except that teams need to synchronize to work together.
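The patterns in Table 2 can also be captured in a few lines of code. The following Python sketch is illustrative only; the labels and the classification rule are a minimal reading of Table 2, not part of the framework's formal definition:

```python
# A minimal sketch of the participation-output patterns in Table 2.
# The function name and thresholds are illustrative, not part of the framework.

def classify(n_teams: int, m_outputs: int,
             partitioned: bool = False, overlapping: bool = False) -> str:
    """Map a (teams, outputs) configuration to a Table 2 pattern."""
    if partitioned:
        return "N->ovpart(1)" if overlapping else "N->part(1)"
    if m_outputs == 1:
        return "N->1"                      # all teams work on one output
    if m_outputs == n_teams:
        return "N->N"                      # each team submits exactly one output
    return "N->M (M>N)" if m_outputs > n_teams else "N->M (M<N)"

# Harvard-TopCoder competition reported in [13]: 125 individuals, 600 submissions.
print(classify(125, 600))   # N->M (M>N): highly competitive
print(classify(10, 1))      # N->1: collaborative
```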
All the tasks of software development, including project concepts, requirement analysis, design, coding, and testing, can be crowdsourced. However, they require different game arrangements, as specified in the following tables. Project concepts involve developing new ideas for software development. AppStori is an example that allows software developers to create new ideas and seek crowd funding for financial support. In general, multiple teams will come up with project concepts, yet the funding is limited, thus not all project concepts will be developed later. In rare cases where few project concepts are available, the crowd will have few projects to select from. Table 3 shows these competition scenarios for project conceptualization.

Table 3 Project Concepts Development and Evaluation

Development: N → N, each team comes up with its own concept.
Evaluation: As only a few concepts will be selected for funding and implementation, the evaluation can be competitive to eliminate alternatives.

Development: N → M, M > N, each team may come up with more than one concept.
Evaluation: As M increases, more alternatives need to be eliminated.

Development: N → M, M < N, teams collaborate to develop concepts.
Evaluation: As M decreases, fewer alternatives need to be eliminated.

Development: N → 1, all the teams work together to develop one concept.
Evaluation: As only one project concept is available, the evaluation focuses on better components or structure within the solution.
Once a project concept is selected, both the requirement and design tasks can be crowdsourced.
Teams will come up with different requirements based on their own interpretation of the same project concept, and follow up with their own design solution. However, eventually in most cases, only one final requirement and
design document will be selected for implementation. In the Harvard-TopCoder process, the requirement and design will be done by Harvard researchers, and only
the components will be crowdsourced. Table 4 shows the analysis of requirement and design in software crowdsourcing.
Table 4 Requirement Analysis and Design Development and Evaluation

Development: N → N, each team analyzes requirements and develops its own specifications.
Evaluation: As only a few specifications will be implemented, the evaluation can be competitive to eliminate alternatives.

Development: N → M, M > N, each team analyzes requirements, but may come up with more than one specification (design).
Evaluation: As M increases, more alternatives need to be eliminated.

Development: N → M, M < N, teams collaborate to analyze requirements and develop specifications (designs).
Evaluation: As M decreases, fewer alternatives need to be eliminated.

Development: N → 1, all teams work together to analyze requirements and develop one specification (design).
Evaluation: As only one specification (design) is available, the evaluation focuses on better components or structure within the specification.

Development: N → part(1) and N → ovpart(1): similar to N → 1, except that each team develops its own sub-specification (sub-design) within the specification (design).
Evaluation: Similar to N → 1, except that extra evaluation is needed for integrating all the design pieces.
Code development is the core of software crowdsourcing, as it is most commonly practiced. Table 5 summarizes the issues for component code development and evaluation.

Table 5 Component Code Development and Evaluation

Development: N → N, each team analyzes the component specification and develops its own code.
Evaluation: As only a few code components will win the competition, the evaluation can be competitive to eliminate alternatives.

Development: N → M, M > N, each team analyzes the component specification, but develops more than one component to compete.
Evaluation: As M increases, more alternatives need to be eliminated.

Development: N → M, M < N, teams collaborate to analyze the component specifications and develop the component code together.
Evaluation: As M decreases, fewer alternatives need to be eliminated.

Development: N → 1, all teams work together to analyze the component specification and develop one component together.
Evaluation: As only one component is available, the evaluation focuses on better sub-components or structure within the component.

Development: N → part(1) and N → ovpart(1): similar to N → 1, except that each team is responsible for developing one sub-component to satisfy the specification.
Evaluation: Similar to N → 1, except that extra evaluation is needed for integrating all the components.
If multiple software components based on the same component specifications have been developed, it will be necessary to test all components to select the final winners. Each crowdsourced team can develop test cases to evaluate all the components, and test cases can
be pooled together as they are based on the same specifications. There is no case for N → M with M > N, as each team will develop at most one set of test cases. Table 6 summarizes the issues related to test and evaluation of components.
Table 6 Test and Evaluation of Software Components Submitted

Test Cases Development: N → N, each team analyzes the software specification and the code to develop its own test cases.
Test Execution and Evaluation: Each team can evaluate the test cases developed by other teams, and each team can execute its own test cases to evaluate the submitted software.

Test Cases Development: N → M, M < N, teams collaborate to analyze the software specifications and the code, and develop test cases together.
Test Execution and Evaluation: Each collaborative team can evaluate the test cases developed by other teams, and each team can execute its own test cases to evaluate the submitted software.

Test Cases Development: N → 1, all teams work together to perform all the test tasks.
Test Execution and Evaluation: All teams work together to evaluate the test cases developed, and evaluate the submitted software by running all the test cases developed.

Test Cases Development: N → part(1) and N → ovpart(1): similar to N → 1, each team is responsible for developing test cases for a part of the software.
Test Execution and Evaluation: The same as N → 1.
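The pooled cross-evaluation scheme of Table 6 can be sketched in a few lines. The components and test cases below are toy stand-ins for a hypothetical "double the input" specification, intended only to show how pooled tests score competing submissions:

```python
# A toy sketch of the pooled test-case scheme behind Table 6: every team's
# test suite is run against every submitted component, and a component's
# score is the fraction of pooled tests it passes. All data is hypothetical.

from typing import Callable, Dict, List

def cross_evaluate(components: Dict[str, Callable[[int], int]],
                   pooled_tests: List[Callable[[Callable[[int], int]], bool]]
                   ) -> Dict[str, float]:
    return {name: sum(t(fn) for t in pooled_tests) / len(pooled_tests)
            for name, fn in components.items()}

# Components implementing the same specification (here: double the input).
components = {"team_a": lambda x: 2 * x,
              "team_b": lambda x: x + x,
              "team_c": lambda x: x * x}
# Test cases contributed by different teams, pooled together.
pooled = [lambda f: f(2) == 4, lambda f: f(3) == 6, lambda f: f(0) == 0]
print(cross_evaluate(components, pooled))  # team_c fails f(3) == 6
```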
3.2 Submission-Award Analysis

One of the key discriminating factors in software crowdsourcing is the number of winners out of M submissions. If every submission wins, the contest is not competitive. However, if M is large but only a few will win, the contest may become excessively competitive. The competitive level of contests must be controlled; otherwise it will have a negative impact on the motivation of potential participants. If a crowdsourcing contest is overly competitive, few people are willing to participate; if it is not competitive at all, the reward of winning is diminished, and thus there is no incentive to compete.
For example, TopCoder has reported that the best number of contestants is two, as only the top two coders will win a prize [14]. One way to overcome an excessively competitive nature is to raise the award significantly. This is similar to a lottery: the chance of winning is extremely small, but the payout is lucrative, and thus many people are willing to participate. Table 7 summarizes the issues of the submission and awarding mechanism. Here we assume that rewards for winners include both monetary award and reputation.
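As a rough back-of-the-envelope sketch of this trade-off, assume each of M contestants is equally likely to be among the K winners (a simplification; the skill-based analysis in Section 4 drops it). The expected payoff (K/M)·award − effort then shows why a large M demands a larger award; all figures below are invented for illustration:

```python
# A back-of-the-envelope sketch: if each of M contestants is equally likely
# to be among the K winners, expected payoff is (K/M)*award - effort.
# The numbers are illustrative only.

def expected_payoff(m_contestants: int, k_winners: int,
                    award: float, effort: float) -> float:
    return (k_winners / m_contestants) * award - effort

for m in (2, 10, 100):
    print(m, round(expected_payoff(m, k_winners=2, award=1000.0, effort=50.0), 1))
# 2 950.0   -> everyone wins: no competition, easy positive payoff
# 10 150.0  -> moderately competitive
# 100 -30.0 -> excessively competitive: the award must rise to keep entrants
```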
Table 7 Award Based on Evaluation of Software Components Submitted

Relationship: M → 1, M ≥ 2
Description: M products will be evaluated and there is only one winner.
Characteristics: More competitive as M increases.

Relationship: M → 2, M ≥ 2
Description: M products will be evaluated but only the top two will win.
Characteristics: More competitive as M increases.

Relationship: M → K, M ≥ 2, M > K ≥ 1
Description: Only the top K products among M products will win.
Characteristics: Competitive if M >> K; the game becomes less competitive as (M − K) decreases.

Relationship: M → M, M ≥ 2
Description: M products; everyone wins.
Characteristics: Not competitive.

3.3 Min-Max Analysis

In [15], a software development process can have a sub-process where one party tries to minimize an objective function, yet the other party tries to maximize the same objective function, as though both parties compete with each other in a game. For example, a specification team needs to produce quality specifications for the coding team to develop the code; the specification team will minimize the bugs in the specification, while the coding team will identify as many bugs as possible in the specification before coding. The min-max process is important as it is a quality
assurance mechanism, and often a team needs to perform both. For example, the coding team needs to maximize the bug identification in the specification, but
it also needs to minimize the number of bugs in the code it produces. Table 8 summarizes the min-max relationship.
Table 8 Min-Max Definition

Offense Activities: Evaluate the inputs, including any input documents, prototypes, interviews, and relevant materials that will be used in performing the tasks. Goal: maximize the number of faults found in the input documents, and provide feedback to those who prepared the inputs.

Defense Activities: Evaluate the outputs, including any deliverables such as documents and software. Goal: minimize the number of bugs that will be found by other teams or people (the crowd).
The min-max can be further classified into three categories, as shown in Table 9.
Table 9 Various Min-Max Definitions

Weak Min-Max (wmm): The two teams involved in the min-max are mostly collaborative, and evaluation will not be based on the bugs identified.

Min-Max (mm): The two teams engaged in the min-max may work together to solve a problem, but the number of bugs identified is used to evaluate the performance of the two teams, with one team scoring high (and the other team scoring low) if the number of bugs is high.

Strong Min-Max (smm): The two teams engaged in the min-max compete: not only is the number of bugs identified used as a key performance measure, but each team also tries to eliminate its competitor.
Definition (weak min-max, wmm): wmm(A, B) means that team A has a min-max relationship with team B, where A produces an output for B. A needs to minimize the number of bugs in the output, while B needs to identify as many bugs as possible in the same output. Furthermore, the number of bugs is not used as a performance measure for either team A or team B.

Definition (min-max, mm): mm(A, B) means that team A has a min-max relationship with team B, where A produces an output that B uses. While A needs to minimize the number of bugs in the output, B needs to maximize the identification of bugs. Furthermore, the number of bugs identified is used as a performance measure for both A and B.

Definition (strong min-max, smm): smm(A, B) means mm(A, B), and both teams try to eliminate each other to win the game.

Note that wmm, mm, and smm are all irreflexive, because it is impossible to have wmm(X, X) for any team X. Furthermore, in general they are neither symmetric nor transitive: mm(A, B) does not necessarily mean mm(B, A). However, in some software crowdsourcing processes, both mm(A, B) and mm(B, A) are valid, and thus these relations are not antisymmetric either, where antisymmetric means that at most one of R(x, y) and R(y, x) holds for a relationship R. Some notable properties of these relationships are:
● smm(A, B) => mm(A, B);
● mm(A, B) => not wmm(A, B);
● not mm(A, B) => not smm(A, B);
● wmm(A, B) => not mm(A, B);
● wmm(A, B) => not smm(A, B);
● not wmm(X, X), not mm(X, X), not smm(X, X) for all X.
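These properties can be made concrete with a small sketch. The following Python fragment encodes wmm/mm/smm as mutually exclusive labels on ordered team pairs (the team names are hypothetical) and checks the implications listed above:

```python
# A small sketch encoding the wmm/mm/smm relationships of Table 9 as mutually
# exclusive labels on ordered team pairs, with the implication properties
# listed above checked as assertions. The labels are illustrative.

relations = {("crowd", "researchers"): "mm",
             ("researchers", "crowd"): "wmm",   # mm(A,B) need not imply mm(B,A)
             ("team_i", "team_j"): "smm"}

def smm(a, b): return relations.get((a, b)) == "smm"
def mm(a, b):  return relations.get((a, b)) in ("mm", "smm")  # smm(A,B) => mm(A,B)
def wmm(a, b): return relations.get((a, b)) == "wmm"          # wmm excludes mm, smm

for (a, b) in relations:
    assert a != b, "irreflexive: no team plays min-max against itself"
    assert not (wmm(a, b) and mm(a, b))   # wmm(A,B) => not mm(A,B)
    assert (not smm(a, b)) or mm(a, b)    # smm(A,B) => mm(A,B)
print(mm("crowd", "researchers"), mm("researchers", "crowd"))  # True False
```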
4 Game Theory Interpretations
In this section, we present a game theory model to elaborate the three styles of min-max mechanisms and discuss the primary factors in these games, including the effort to finish the competing tasks, the award prize, and the skill levels of the competing players.
4.1 Non-Cooperative Game and Strong Min-Max
In a product-oriented software crowdsourcing process intended to identify the best talents and improve the quality of software, the organizer often introduces programming contests with a strong min-max mechanism to eliminate less skillful contestants. Such software crowdsourcing can be modeled as a non-cooperative game where each participant tries their best to beat their competitors, and thus the parties may mutually destroy each other as the worst outcome for all participants. Assume the cost to build a program for the contest is c, and the award for winning the contest is w, with w > c. The talent of a player can be measured by a programming skill rate ranging from 0 to 1. For example, if a participant's programming skill rate is 1, he should be the top contestant in this game, which means his submission is likely to be the best program as valued by the review team. Since this game offers few stimuli for people to learn from each other, we can assume that the two players do not exchange their source code. According to the min-max definition in Table 9, a player has two optional strategies: (1) Defense Strategy, where the player only focuses on developing his source code and improving the quality of his own program; (2) Offense Strategy, where the player focuses on building test cases that can expose bugs in his opponent's software. Although this game should be an N-player game, we model it as a two-player game for simplicity and compute its Nash equilibrium in terms of mixed strategies; the mixed-strategy analysis of the two-player game actually represents a statistical decision among all the participants. The strategic form of this non-cooperative game is defined in Table 10.
Table 10 The Strategic Form of the Strong Min-Max Process

A\B        Defense                    Offense
Defense    (S_a w − c, S_b w − c)     (−c, 0)
Offense    (0, −c)                    (0, 0)
When both players choose the Defense strategy, their payoffs depend on their coding skill levels. Let S_a and S_b represent player A's and player B's skill rates; then player A's and B's award expectations are S_a w − c and S_b w − c. Assume S_a w > c and S_b w > c.
When one of the players chooses the Offense strategy, he will not gain anything in the game but will make the other player's software submission fail the quality test and lose the game. So the offense player's payoff is zero, and the defender's payoff is −c. When both players follow the Offense strategy, neither of them will present high-quality software to win the game. Apparently, in the pure-strategy case, the Nash equilibrium lies at (Defense, Defense), where the two players only care about the quality of their own software submissions. However, such a balance can easily be shifted if a newcomer with a lower skill level tries to find the problems in the work of senior developers and eliminate them. Computing the Nash equilibrium in mixed strategies, we get the following equations, where α denotes the probability for player A to choose the Defense strategy and β denotes the probability for player B to take the Defense strategy.
α(S_b w − c) − c(1 − α) = 0   (1)

β(S_a w − c) − c(1 − β) = 0   (2)

Thus, we can calculate

α = c / (S_b w),  β = c / (S_a w)   (3)
The Nash equilibrium reveals that every player makes his decision based on the skill level of his opponent. The higher skill rate a player’s opponent has, the less chance he chooses the Defense strategy to improve his code.
Instead, he will aggressively try to find more bugs in his opponent's code to sabotage the opponent's leading position in the game. Since everyone tries to eliminate the contestants with higher skill levels, nobody has motivation to share knowledge and help peers upgrade their skills. From the perspective of the game organizer, this property of the equilibrium also serves the purpose of identifying talents: only the best player can deliver high-quality software and survive this intense competition.
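As a numeric illustration of Eq. (3), the sketch below computes the mixed-strategy equilibrium and verifies the indifference condition of Eq. (1); the award, cost, and skill rates are assumed values for illustration, not data from any competition:

```python
# A numeric sketch of the mixed-strategy equilibrium in Eq. (3). The award w,
# cost c, and skill rates are illustrative values, not data from the paper.

def strong_minmax_equilibrium(w: float, c: float, s_a: float, s_b: float):
    """alpha/beta = probability that A/B plays Defense (Eq. 3)."""
    alpha = c / (s_b * w)   # A defends less often against a stronger opponent
    beta = c / (s_a * w)
    return alpha, beta

alpha, beta = strong_minmax_equilibrium(w=1000.0, c=300.0, s_a=0.9, s_b=0.6)
print(round(alpha, 3), round(beta, 3))  # 0.5 0.333

# Indifference check for B: defending against A's mix pays the same as
# offending (whose payoff is 0), per Eq. (1).
w, c, s_b = 1000.0, 300.0, 0.6
assert abs(alpha * (s_b * w - c) - (1 - alpha) * c) < 1e-9
```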
The allocation of a player's effort between code enhancement and bug finding depends on the value of c/w. When the cost c is higher than w/2, α is greater than 0.5, indicating that every player is willing to put more effort into improving his own code, because the increasing complexity of the software task demands that all the participants in the programming contest spend more time and attention on development.

4.2 Coordination Game and Min-Max

In a software crowdsourcing process intended to identify the best solutions and improve the quality of software, the organizer often introduces a medium min-max and rewarding mechanism to balance competition and collaboration. Such software crowdsourcing can be modeled as a cooperative game that encourages co-creativity by allowing everyone to contribute and gain rewards through participation. Cooperative games tend to have a positive influence on player creativity, as players are willing to help each other improve their performance. All parties can gain from the game mutually if they trust each other and are willing to make mutually satisfying group decisions. In some scenarios where people have not yet formed a well-functioning community, they may also take a tit-for-tat strategy, cooperating with others on the condition that other players make reciprocal contributions. The strategic form of this game is defined in Table 11. We define a new award θ_a, θ_b for the Offense strategy. When player A takes the Defense strategy and player B takes the Offense strategy, A's award will not be affected; instead, he can fix the bugs in his software and get the award based on the quality of his software.

Table 11 The Strategic Form of the Min-Max Game

A\B        Defense                    Offense
Defense    (S_a w − c, S_b w − c)     (S_a w − c, θ_b)
Offense    (θ_a, S_b w − c)           (0, 0)

Following the same mixed-strategy analysis, the indifference conditions are:

α(S_b w − c) + (1 − α)(S_b w − c) = α θ_b   (4)

β(S_a w − c) + (1 − β)(S_a w − c) = β θ_a   (5)
Thus, we can compute the probability for each player to take the Defense strategy:

α = (S_b w − c) / θ_b,  β = (S_a w − c) / θ_a   (6)

Again we assume S_a w > c and S_b w > c: the complexity of the task does not make the cost go beyond the coding reward. In contrast to the Nash equilibrium of the previous game, the skill level of the opponent player becomes a positive factor in the player's defensive decision. Moreover, it is no longer the only dominant factor: the ratio between the offense award and the coding award determines the probabilities α and β. When the opponent's coding reward S_b w goes higher, player A is more likely to improve the quality of his software submission. When the offense award θ_b increases up to 2(S_b w − c), α drops down to 0.5, indicating that player A spends less effort on improving his code. In most circumstances, the award for testing and troubleshooting should be set between S_b w − c and S_b w, which ensures that α stays higher than 0.5. Therefore, an appropriate awarding mechanism can motivate all the players to enhance their software implementations and demonstrate their capabilities to gain more rewards in the game. At the same time, it also encourages players with superior skills to write test cases to help weak players improve their software.
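The threshold behavior just described can be checked numerically. The sketch below evaluates Eq. (6) for assumed parameter values and shows α falling to 0.5 exactly when θ_b reaches 2(S_b w − c):

```python
# A numeric sketch of Eq. (6) and the threshold discussed above: alpha drops
# to 0.5 exactly when the offense award theta_b reaches 2*(S_b*w - c). All
# parameter values are illustrative.

def coordination_equilibrium(w, c, s_a, s_b, theta_a, theta_b):
    alpha = (s_b * w - c) / theta_b   # P(A plays Defense), Eq. (6)
    beta = (s_a * w - c) / theta_a    # P(B plays Defense), Eq. (6)
    return alpha, beta

w, c, s_a, s_b = 1000.0, 200.0, 0.8, 0.7
for theta_b in (s_b * w - c, 1.5 * (s_b * w - c), 2 * (s_b * w - c)):
    alpha, _ = coordination_equilibrium(w, c, s_a, s_b,
                                        theta_a=500.0, theta_b=theta_b)
    print(round(theta_b, 1), round(alpha, 3))
# 500.0 1.0, 750.0 0.667, 1000.0 0.5 -- keeping theta_b within
# [S_b*w - c, S_b*w] keeps alpha above 0.5, as recommended in the text.
```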
4.3 Collaborative Game and Weak Min-Max

In a learning-oriented software crowdsourcing process intended to promote knowledge sharing and collective intelligence, the organizer often sets up a collaborative environment with a weak min-max mechanism to encourage people to participate and make contributions. Such software crowdsourcing can be modeled as a collaborative game, or public goods game, where each player's effort contributes positively to the welfare of the whole community. The fundamental problem in this game is the tension between private and common interests. When two people collaboratively work on the same project, ideally they should dedicate their resources and effort to accomplishing the common goal. However, since there is no penalty on faulty code delivered by the two players, they may choose to take the "free rider" strategy, increasing their own payoff by withholding their contribution while still benefiting from the public pool. Many researchers [16][17] have studied this problem and modeled it as a prisoner's dilemma game. We extend their game models to describe the weak min-max mechanism.

Assume there are two distinctive players: player A is only responsible for developing software entities, whereas player B takes care of testing the code and generating bug reports. The award for developers is not based on the performance of individuals as in the previous two game models. Table 12 displays our model for the weak min-max. For the testing player, as long as he fulfills his obligation, he can gain or share the same award as the developing player. For the developer, when the testing player cooperates with him, he can obtain w − c, where w is the award for the project and c is the cost. But when the tester takes the "free rider" option and fails to ensure the quality of the product, the developer's payoff is reduced to w̄ − c, whereas the tester gets the whole reward without any contribution.

Table 12 The Strategic Form of the Weak Min-Max Game

A\B          Don't Work      Offense
Don't Work   (0, 0)          (0, θ)
Work         (w̄ − c, w)      (w − c, w − c)

Let α and β represent the probabilities for player A and player B to choose the Work and Offense strategies, respectively. We then have Eqs. (7) and (8):

α w = α(w − c) + (1 − α)θ   (7)

0 = β(w − c) + (1 − β)(w̄ − c)   (8)

Thus, we can compute the Nash equilibrium

α = θ / (θ + c),  β = (c − w̄) / (w − w̄)   (9)

For player A, his motivation to work in this game depends on the offense gain θ, the cost of the task, and the quality penalty (w − w̄) caused by problems in the delivered software products. Eq. (9) shows that an increase in both the offense gain and the quality loss leads to more incentive for player A to improve the software product: because an increase in θ stimulates more testing effort, player A needs to work harder to fix the bugs found by player B. Furthermore, when the quality penalty rises, it pushes both player A and player B to work harder to maintain software quality and lower the quality penalty. Interestingly, in both Eq. (6) and Eq. (9), the cost c is a reverse factor for players to take higher initiative in the game. This deserves further investigation in our future work.
5 Illustration

Project leaders often design their crowdsourcing processes based on the stakeholders and the nature of the projects. This section demonstrates two contrasting examples of software crowdsourcing, Harvard-TopCoder and AppStori, to illustrate the differences. Harvard-TopCoder is a joint project funded by Harvard University to take advantage of the TopCoder platform for biomedical research. AppStori is a community-based, collaborative platform for the development of smartphone applications. It provides a "preview" window for iPhone application enthusiasts to choose their favorite ideas, and to support and actively engage in promising projects. Figures 1 and 2 show the development processes of the two projects.

5.1 Harvard-TopCoder Processes

The development process of the Harvard-TopCoder project can be characterized as a process with N participants producing M components, where all components are evaluated by the same N participants and G competition administrators. The Catalysts and Researchers go over the biomedical research problem, and the Researchers prepare the final problem statements for the competition. The crowd reviews the problem statements prepared by the Researchers and comes up with solutions, and the solutions are evaluated by the Researchers. Thus, we can define three competitive relationships between the stakeholders: wmm (Catalysts, Researchers), wmm (Researchers, crowd), and mm (crowd, Researchers). The first two relationships are collaborative, where both parties discuss and refine the problem statements. The third relationship, between the crowd and the researchers, is more competitive because all the M submissions need to be rigorously evaluated by the G researchers. Note that the G Researchers also prepare test cases and use these test cases to score the solutions submitted by the crowd.
Fig. 1 Harvard-TopCoder algorithm development process
This TopCoder process can be characterized as follows:

Harvard-TopCoder Process
Construction: N → M, N ≥ 2, M > N;
Evaluation: G → M, and G ≥ 1;
wmm (Catalysts, Researchers); wmm (Researchers, crowd); mm (crowd, Researchers);
Quality: high, as M is reported to be very large and the process is thus highly competitive.
Cost: unknown, but Harvard's name tag carries a high reputation reward.
Diversity of solutions: high, as M is large.
Nature of competition: competitive and fair.
This competition process can be immediately strengthened by having participating teams evaluate the solutions prepared by other teams, with the top two contestants winning. Thus, the revised process has two new constraints: (Reward: M → 2), and smm (team i, team j) for all teams i and j in the crowd with i ≠ j. The revised process can be formulated as follows:

Revised Process (1)
Construction: N → M, N ≥ 2, M > N;
Evaluation: (N + G) → M, and G ≥ 1;
Reward: M → 2;
wmm (Catalysts, Researchers); wmm (Researchers, crowd); mm (crowd, Researchers); and smm (team i, team j).

The revised competition is considered more competitive, as each participant tries to eliminate other contestants from winning. If a component has been detected to contain a bug, it will be eliminated from further consideration. This process of elimination continues until no more components can be eliminated, and the top two components receive an award.

The revised competition can be modified again to pursue better-quality solutions. For example, instead of the original N people evaluating the products, a different set of L people with G game administrators evaluates the N outputs with only two winners. Let crowd1 and crowd2 denote the N developers and the L reviewers, respectively.

Second Revised Competition
Construction: N → M, N ≥ 2, M > N;
Evaluation: (L + G) → M, and G ≥ 1;
Reward: M → 2; Reward-for-evaluation: L → 2;
wmm (Catalysts, Researchers); wmm (Researchers, crowd1); mm (crowd1, Researchers); wmm (Researchers, crowd2); mm (crowd2, Researchers); mm (crowd1, crowd2).

Both crowds can have two winners to be awarded. Their submissions, either biomedical algorithm implementations or test cases, must be rigorously evaluated by the researchers. Also, the selected test cases are used to validate the quality of the submitted solutions. Therefore, there are three regular min-max games among the two crowds and the researchers.

Another modification can be made in which N people produce M products, and the same N people together with another set of L people participate in the evaluation of the M products. Furthermore, only the top two teams will win, and only the top two evaluators will win. This can be expressed as follows:

Third Revised Competition
Construction: N → M, N ≥ 2, M > N;
Evaluation: (L + N + G) → M, L ≥ 2 and G ≥ 1;
Reward: M → 2; Reward-for-evaluation: L → 2;
wmm (Catalysts, Researchers); wmm (Researchers, crowd1); mm (crowd1, Researchers); wmm (Researchers, crowd2); mm (crowd2, Researchers); mm (crowd1, crowd2); smm (team i, team j) for teams i and j in crowd1.

One can see that the third revised game is more competitive than the first revised competition because more people are involved in evaluation, as (L + N + G) > (N + G). Similarly, it is more competitive than the second revised competition because (L + N + G) > (L + G).

The game theory model defined in Section 4 formulates the min-max relationships in all versions of the Harvard-TopCoder process. Both the Harvard-TopCoder process and the revised processes have elements of a cooperative game, as the Catalyst and Researcher teams work as groups with a team goal and objective and do not need to go through defense-offense steps for quality assurance. However, the individuals in the crowd need to compete with each other to gain the limited rewards. Evaluation of crowd submissions indicates min-max competition. Such a competition tends to become more intense when the evaluation is undertaken by the same crowd.

Although Section 4 already describes a game model of the competition mechanism, it does not analyze the amount and structure of the award prize or its impact on the incentives of crowdsourcing participants. Here, we start with a simple two-party game model to discuss these factors. Assume only two parties A and B are interested in joining the competition with winning award price; A will win with probability P_A (assuming B also participates) by spending effort E_A, and B will win with
probability P_B by spending effort E_B. Note that every player in TopCoder has a skill rating score based on his performance in each contest [18][19]. Before each game, players can check the ranking list to estimate their winning probabilities. The two-party join decision is given in Table 13.

Table 13 The Strategic Form of the Join Decision

A\B           B joins not           B joins
A joins not   (0, 0)                (0, price − E_B)
A joins       (price − E_A, 0)      (P_A·price − E_A, P_B·price − E_B)
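A small sketch of the join decision in Table 13, for two equally competitive players (P_A = P_B = 0.5); the prize and effort figures are invented for illustration:

```python
# A sketch of the join decision in Table 13 for two equally competitive
# players (P_A = P_B = 0.5). Prize and effort figures are illustrative.

def both_join_is_equilibrium(price: float, e_a: float, e_b: float,
                             p_a: float = 0.5, p_b: float = 0.5) -> bool:
    # "A joins, B joins" is a Nash equilibrium when each expected payoff
    # P_i * price - E_i is positive, so neither player gains by dropping out.
    return p_a * price - e_a > 0 and p_b * price - e_b > 0

print(both_join_is_equilibrium(price=1000.0, e_a=400.0, e_b=450.0))  # True
print(both_join_is_equilibrium(price=1000.0, e_a=600.0, e_b=450.0))  # False: A stays out
```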
The price includes both the monetary reward P_M and/or the reputation reward P_R: price = P_M + P_R. If players A and B are equally competitive, then P_A = P_B = 0.5. Apparently, when 0.5·price − E_A and 0.5·price − E_B are both positive, the decision "A joins, B joins" becomes a Nash equilibrium. Otherwise, if 0.5·price − E_A is positive, A will participate in the competition; similarly, B will join if 0.5·price − E_B is positive.

The above game theory analysis can be extended to the case where multiple people join the competition. Assume K people have already decided to join, and party K+1 is contemplating joining the competition. As the first K persons have already determined to join, Table 14 has one row only; in this table, we consider the total payoff of the K people.

Table 14 The Strategic Form of the N-Party Join Decision

           K+1 joins not                 K+1 joins
K joined   (Σ P_i·price − Σ E_i, 0)      ((1 − P_{K+1})·price − Σ E_i, P_{K+1}·price − E_{K+1})

In other words, when party K+1 joins the competition, each already-joined contestant has less chance of winning. Thus, party K+1 will join only if P_{K+1}·price − E_{K+1} is positive, where P_{K+1} is the probability for party K+1 to win. If the price is high enough, more people will be willing to participate, as their payoffs can be positive.

TopCoder Phenomenon: The multiparty payoff analysis can explain the phenomenon that the recommended number of TopCoder contestants for each competition is two [13]. Specifically, if two highly ranked participants have declared that they will enter the competition, other lower-ranked participants will choose not to participate, as the expected payoff P_3·price − E_3 is too small for the effort.

Estimating Reputation Value: While the recommended number of contestants for TopCoder is two, the Harvard-TopCoder project has reported a large number of submissions, and teams have even made multiple submissions knowing that almost all of them would not win. This is because of the intrinsic value of the competition due to the reputation of the organizers. Thus, the true price = price_M + price_R, where price_M is the announced competition price and price_R is the intrinsic reputation price of the competition. This can be used to estimate the intrinsic reputation price. Specifically, for a competition to have K submissions, the overall payoff must satisfy

price_M + price_R − Σ E_i ≥ 0;  so  price_R ≥ Σ E_i − price_M   (10)

where Σ E_i can be estimated and price_M is the announced price. For example, if a competition has a price tag of $200K, and each submission requires on average 20 hours of effort at a $150 per hour rate, then the total effort for the competition with 600 contestants can be calculated as Σ E_i = 20 × 150 × 600 = $1.8M. As the announced price is $200K, the implied intrinsic reputation value is at least $1.6M, much higher than the announced price.
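This estimate is easy to reproduce. The sketch below implements Eq. (10) with the figures of the worked example (600 contestants, 20 hours each at $150/hour, against a $200K announced price):

```python
# A sketch of the intrinsic-reputation estimate in Eq. (10), reproducing the
# worked example above.

def implied_reputation_value(contestants: int, hours_each: float,
                             hourly_rate: float, announced_price: float) -> float:
    total_effort = contestants * hours_each * hourly_rate   # sum of E_i
    return total_effort - announced_price   # price_R >= sum(E_i) - price_M

print(implied_reputation_value(600, 20, 150.0, 200_000.0))  # 1600000.0 -> at least $1.6M
```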
Min-Max Optimization for Price and Number of Participants: Once the intrinsic reputation value is known, an organizer can determine the optimal number K of participants and the monetary award price for a given competition. This is again like game theory, where one tries to minimize one aspect but maximize the other. Specifically,

price_M + price_R − E·K ≥ 0   (11)

where E is the average effort, K is the desired number of participants to be maximized, and price_M is the desired price target to be minimized. The above formula can be rewritten as

price_M ≥ E·K − price_R   (12)

For example, if price_R is $1.6M as in the previous example and each team needs to spend $3,000 on the competition, then even if price_M is zero, about 533 teams are willing to compete due to the intrinsic reputation value. If the announced price is $1M, then about 866 teams are willing to join the competition due to both the reputation value and the monetary award. Eq. (12) actually indicates that even if contestants need to pay the organizer to join a competition with a zero target price (so price_M is negative), a small number of teams are still willing to participate due to the high reputation value.

Competition Design Tradeoffs: The above analysis demonstrates interesting tradeoffs:
● If the goal is to obtain quality solutions at a low cost, it is important that a recognized organizer runs the competition.
● If cost saving is the primary goal, ask a reputable organizer to run the competition with a low or zero award price.
● If the goal is to obtain quality solutions and broaden participation, it is important to have a high award price and a reputable organizer running the competition.

5.2 AppStori Processes

The AppStori process is illustrated in Fig. 2 with multiple groups of stakeholders.

Fig. 2 AppStori software crowdsourcing process
This process can be characterized as follows:

AppStori Process
Construction: N → N, N ≥ 2;
Evaluation: G → N, and G ≥ 1;
wmm (Project Team, Review Board); wmm (Project Team, Funding Contributor Crowd); wmm (Project Team, Beta Tester Crowd);
Quality: may be high if N keeps going up.
Cost: crowd funding; on average, the initial funding for each project ranges from $5,000 to $20,000.
Diversity of solutions: high, as N is large.
Nature of competition: collaborative and fair.

In contrast to the competitive nature of the coding stage in the Harvard-TopCoder process, this process encourages collaboration. Each AppStori project adopts an agile development methodology with a small but cohesive team working on the coding. The team has weak min-max relationships with the other parties in the process, including the AppStori review board, the funding contributor crowd, and the beta tester crowd.

wmm (project team, review board): the founders of the project team can post a project proposal specifying the features of the mobile application, the project
budget, the potential revenue model in the App Store, and milestones. They need to minimize the issues in their proposal in front of the AppStori review board to make the proposal become an online AppStori project. The AppStori review board screens the submitted proposals by maximizing the weaknesses found in each proposal, to ensure the quality of the accepted AppStori projects.

wmm (project team, funding contributor crowd): every AppStori project is funded by the crowd. On the web page of a project, a progress indicator shows how much donation the project has received and how much remains to be raised by the deadline. The project team must endeavor to push the project toward the milestones defined in the proposal and minimize the probability that the donation crowd will withdraw their contributions.

wmm (project team, beta tester crowd): any AppStori community member can sign up to become a beta tester in a project. The team must work collaboratively with beta testers from the community to minimize the bugs in the code. After the beta testing, the project team submits the product to Apple's App Store review board, which often assigns two reviewers to check the product following the App Store review guidelines [20]. The quality of the finished product and good communication with the community determine whether the project can eventually reach its funding goal.

As a crowdfunding-driven software crowdsourcing site, AppStori presents a different way to run crowdsourcing projects. There is neither an explicit ranking of individual designers and developers nor a contest prize for each project. Instead, each AppStori team estimates the effort for its project, including its deadline and budget, and reaches out to the crowd for funding support. Therefore, the success of the project totally depends upon the dedication of the project team and the involved evaluators. From the perspective of the AppStori platform, it must constructively help application teams refine their work to attract more donors for funding. We can apply the weak min-max model defined in Eqs. (7)-(9) to further elaborate this observation. Since the award is actually the budget estimated by the AppStori team, who is player A in Table 12, we can assume c ≈ w. And when the project fails to meet
the quality requirement, the project will lose all the donations, so the award in that case drops to w = 0. Thus, Eq. 9 can be recomputed as:

(p_A, p_B) = (1, 1), for c ≤ w.    (13)
Apparently, the project team and the evaluator crowd can always maintain 100% working momentum as long as the project budget is properly estimated.
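To make the condition behind Eq. 13 concrete, the following is a minimal payoff comparison for the project team (Player A); it is a sketch under the assumptions stated above, namely that working costs c and secures the crowd-funded award w, while shirking causes the project to fail and forfeits all donations:

% Payoffs for Player A under the stated assumptions: working delivers the
% product and collects the award w at effort cost c; shirking fails the
% project and forfeits all donations.
\[
  u_A(\text{work}) = w - c, \qquad u_A(\text{shirk}) = 0 .
\]
% Working is a best response exactly when the budget covers the effort,
% which is the condition in Eq. (13):
\[
  u_A(\text{work}) \ge u_A(\text{shirk})
  \;\Longleftrightarrow\; c \le w
  \;\Longrightarrow\; (p_A, p_B) = (1, 1).
\]

A symmetric comparison can be written for the evaluator crowd under analogous assumptions.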
6 Related Work

Many researchers have analyzed the economics of crowdsourcing contests. Archak and Sundararajan [14] used game theory to analyze crowdsourcing contests, particularly the optimal prize structure for contest design. Archak [21] then extended the approach to study the impact of TopCoder's reputation system on TopCoder community members, and analyzed how principal factors such as project payment and requirements affect the quality of the competition outcome; the work presents an in-depth analysis of the reputation system and the registration strategies used by contestants. Similarly, other researchers [22][23] modeled crowdsourcing as an auction and leveraged auction theory to build models of reward systems and effective strategies for crowdsourcing participants. DiPalantino and Vojnović [22] proposed an all-pay auction model to describe the contest process of crowdsourcing and to capture the essential relationship between rewards and participation. All these efforts have a focus quite different from this paper's: their goal is to study the mechanisms of crowdsourcing systems, such as pricing and bidding strategies and rewarding rules. Their discussions are mostly within the scope of classic auction theory and do not address issues in software development, such as maximizing software quality and creativity via crowdsourcing. Bacon et al. [24] introduced a new paradigm of software evolution through a market-driven mechanism that offers rewards for developers, testers, and bug reporters to bid for bug-fixing tasks; they also defined the notion of "sufficient correctness" in the context of crowdsourcing and designed the components of the market mechanism. Bullinger and Moeslein [25] listed several factors in
designing crowdsourcing contests: media (online, offline, or mixed), organizer (companies, government agencies, non-profits, or individuals), participants (individuals, teams, or both), contest period (from very short term to very long term), reward/motivation (monetary rewards, reputation rewards, or both), evaluation (such as jury evaluation, peer evaluation, and self-evaluation), and types of deliverables (such as concepts, prototypes, and solutions). They also suggested several research topics in crowdsourcing. Leimeister et al. [26] proposed the concept of "activation-enabling" as the basis for using competitions in software crowdsourcing and open innovation, and presented many factors relevant to IT-based crowdsourcing; for example, the motivation for participation can be learning, self-marketing, social motives, or direct compensation.

Software crowdsourcing processes can be considered a specific kind of software development in which crowds are involved in certain aspects of the development. By comparing software crowdsourcing processes with traditional software development, one may identify their similarities and differences, as well as the reasons crowdsourcing can yield quality software products [13]. One reason software crowdsourcing can produce quality software is its stiff competition rules and multiple phases of defense and offense (or min-max), as in competitive games among participants. The min-max analysis has been applied to two software crowdsourcing processes, TopCoder and AppStori [13]. Some have also suggested that software crowdsourcing can be used to develop ultra-large systems (ULS) [27].

Bratvold and Armstrong [28] suggested that crowdsourcing projects break down a complex task into many microtasks so that each microtask can be crowdsourced, for a variety of reasons including security. Specifically, because each microtask covers only a small percentage of the overall project, the crowd participating in the project cannot learn the trade secrets of the project. According to their data, only 17% of those who indicate interest in crowdwork are actually active. The quality of work can be assured by redundancy, peer review, and a "gold" process. A gold process issues tasks with known solutions to the crowd, and the quality of the work returned by individuals can be determined by comparing their answers with the gold standard; only those who have passed previous gold tests are allowed to participate in future crowdsourcing tasks (a minimal sketch of such a gold filter is given after the feature list below). They also suggested that a crowd platform should have these four features:
Access to a crowd of at least 100,000 workers;
An API for clients to collect and integrate large numbers of submissions for evaluation;
A platform that is secure for all parties, including clients and workers;
An understanding of the business processes of large corporations such as Fortune 500 companies.

Their recommendations are consistent with the results of this paper, as the quality of the software turned in can be enhanced by crowd review.
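As an illustration of the gold process described above, the following is a minimal sketch in Python; all identifiers (task ids, answers, function names) are hypothetical and are not taken from [28] or any specific platform API. Workers are screened against tasks with known answers, and only those who have passed every previous gold test are admitted to future tasks.

from collections import defaultdict

# Gold tasks with known answers; the ids and answers are illustrative only.
GOLD_TASKS = {"gold-1": "42", "gold-2": "blue"}

# Tracks which gold tasks each worker has passed so far.
passed: defaultdict[str, set] = defaultdict(set)

def submit_gold_answer(worker_id: str, task_id: str, answer: str) -> bool:
    """Score a submission by comparing it with the gold-standard answer."""
    if GOLD_TASKS.get(task_id) == answer.strip().lower():
        passed[worker_id].add(task_id)
        return True
    return False

def can_take_real_task(worker_id: str) -> bool:
    """Admit only workers who have passed every gold test issued so far."""
    return passed[worker_id] == set(GOLD_TASKS)

if __name__ == "__main__":
    submit_gold_answer("alice", "gold-1", "42")
    submit_gold_answer("alice", "gold-2", "Blue")  # matching is case-insensitive
    print(can_take_real_task("alice"))  # True: all gold tests passed
    print(can_take_real_task("bob"))    # False: no gold tests passed

In practice, gold tasks would be interleaved unannounced with real tasks, and redundancy and peer review would supplement this filter, as the authors suggest.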
7 Conclusion

This paper analyzes software crowdsourcing processes and examines their key characteristics. Specifically, the paper proposes a novel evaluation framework to analyze the collaborative and competitive nature of software crowdsourcing processes toward different goals, such as broadening participation, seeking innovative ideas, and promoting collective learning. On the collaboration side, the framework defines the weak min-max mechanism to describe software crowdsourcing as a cornerstone principle of software ecosystems and the driving force for promoting participation and learning within them. On the competition side, the framework defines the min-max mechanism and gives a game-theoretic model showing how such mechanisms identify highly skilled talent and improve the quality of software. We evaluated two crowdsourcing examples with this framework and illustrated their processes. The detailed analysis reveals the inherent causality between their goal priorities and their crowdsourcing mechanisms, demonstrating the effectiveness of our framework. We plan to conduct crowdsourcing experiments to further validate the framework and extend its mathematical model. Based on this framework, we are also developing software tools to help software architects design software crowdsourcing processes and construct crowdsourcing platforms, making crowdsourcing a viable environment for producing high-quality software and promoting software engineering education.

Compared with other software development approaches, such as plan-driven and agile development, crowdsourced software development provides flexible settings, as most software development activities can be crowdsourced, including concept development, requirement analysis, specification, design, coding, testing, and modification. The min-max nature built into the competition rules ensures that the delivered products have a certain level of quality, especially when multiple parties are involved.
Acknowledgements This project is sponsored by the U.S. National Science Foundation (project DUE 0942453), the National Natural Science Foundation of China (No. 61073003), the National Basic Research Program of China (No. 2011CB302505), the Open Fund of the State Key Laboratory of Software Development Environment (Nos. SKLSDE-2009KF-2-0X and SKLSDE-2012ZX-19), and Fujitsu Laboratory.
References
1. A. Doan, R. Ramakrishnan, and A. Y. Halevy, "Crowdsourcing systems on the World-Wide Web," Communications of the ACM, Vol. 54, No. 4, April 2011, pp. 86-96.
2. K. R. Lakhani, D. A. Garvin, and E. Lonstein, "TopCoder: Developing software through crowdsourcing," Harvard Business School Case 610-032, 2010.
3. uTest, https://www.utest.com/, retrieved on Jan 12, 2013.
4. J. Bosch, "From software product lines to software ecosystems," in Proceedings of the 13th International Software Product Line Conference (SPLC '09), 2009.
5. S. Jansen, A. Finkelstein, and S. Brinkkemper, "A sense of community: A research agenda for software ecosystems," in Companion Volume of the 31st International Conference on Software Engineering (ICSE-Companion 2009), 2009.
6. App Store Metrics, http://148apps.biz/app-store-metrics/, retrieved on June 25, 2012.
7. AppStori, http://appstori.com/, retrieved on June 10, 2012.
8. A. Kittur, "Crowdsourcing, collaboration and creativity," XRDS, Vol. 17, No. 2, Winter 2010, pp. 22-26.
9. Capability Maturity Model Integration, Software Engineering Institute, Carnegie Mellon University, CMU/SEI-2002-TR-012, Pittsburgh, PA, 2002.
10. DOD-STD-2167A: Military Standard: Defense System Software Development, United States Department of Defense, June 4, 1985.
11. E. Schenk and C. Guittard, "Crowdsourcing: What can be outsourced to the crowd, and why?" Workshop on Open Source Innovation, Strasbourg, France, 2009.
12. R. Tong and K. Lakhani, "Public-private partnerships for organizing and executing prize-based competitions," Harvard University Berkman Center for Internet and Society, Research Publication No. 2012-13.
13. Harvard University Clinical and Translational Science Center, "Algorithm development through crowdsourcing," http://catalyst.harvard.edu/services/crowdsourcing/, retrieved on June 10, 2012.
14. N. Archak and A. Sundararajan, "Optimal design of crowdsourcing contests," in Proceedings of the International Conference on Information Systems (ICIS), Association for Information Systems, 2009, pp. 1-16.
15. W. Wu, W. T. Tsai, and W. Li, "Creative software crowdsourcing," to appear in International Journal on Creative Computing, 2013.
16. C. Y. Baldwin and K. B. Clark, "The architecture of participation: Does code architecture mitigate free riding in the open source development model?" Management Science, Vol. 52, No. 7, July 2006, pp. 1116-1127.
17. D. G. Rand, A. Dreber, and T. Ellingsen, "Positive interactions promote public cooperation," Science, Vol. 325, September 2009, pp. 1272-1275.
18. R. Herbrich and T. Graepel, "TrueSkill™: A Bayesian skill rating system," Microsoft Research Technical Report MSR-TR-2006-80, 2006.
19. TopCoder Inc., Algorithm Competition Rating System, http://apps.topcoder.com/wiki/display/tc/Algorithm+Competition+Rating+System, retrieved on Jan 12, 2013.
20. Apple App Store Review Guidelines, https://developer.apple.com/appstore/guidelines.html, Sept. 9, 2010, retrieved on June 10, 2012.
21. N. Archak, "Money, glory and cheap talk: Analyzing strategic behavior of contestants in simultaneous crowdsourcing contests on TopCoder.com," in Proceedings of the 19th International Conference on World Wide Web (WWW 2010), 2010.
22. D. DiPalantino and M. Vojnović, "Crowdsourcing and all-pay auctions," in Proceedings of the 10th ACM Conference on Electronic Commerce (EC '09), 2009.
23. J. J. Horton and L. B. Chilton, "The labor economics of paid crowdsourcing," in Proceedings of the 11th ACM Conference on Electronic Commerce (EC '10), 2010, pp. 209-218.
24. D. F. Bacon, Y. Chen, D. Parkes, and M. Rao, "A market-based approach to software evolution," in Companion of the 24th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA '09), Orlando, FL, October 25-29, 2009.
25. A. Bullinger and K. Moeslein, "Innovation contests – Where are we?" in AMCIS (Americas Conference on Information Systems) 2010 Proceedings, Paper 28.
26. J. M. Leimeister, M. Huber, U. Bretschneider, and H. Krcmar, "Leveraging crowdsourcing: Activation-supporting components for IT-based ideas competition," Journal of Management Information Systems, Vol. 26, No. 1, 2009, pp. 197-224.
27. R. Kazman and H.-M. Chen, "The Metropolis model: A new logic for development of crowdsourced systems," Communications of the ACM, Vol. 52, No. 7, July 2009.
28. D. Bratvold and C. Armstrong, "The Definitive Guide to Microtasking," 2013, www.dailycrowdsource.com.

Wenjun Wu is a professor in the School of Computer Science and Engineering at Beihang University. From 2006 to 2010, he was a research scientist at the Computation Institute (CI) at the University of Chicago and Argonne National Laboratory. From 2002 to 2006, he was a member of technical staff and a postdoctoral research associate at the Community Grids Lab at Indiana University. He received his BS, MS, and PhD degrees in Computer Science from Beihang University in 1994, 1997, and 2001, respectively. He has published over fifty peer-reviewed papers in journals and conferences. His research interests include crowdsourcing, green computing, cloud computing, eScience and cyberinfrastructure, and multimedia collaboration.

Wei-Tek Tsai is currently a professor in the School of Computing, Informatics, and Decision Systems Engineering at Arizona State University, USA. He received his PhD and MS in Computer Science from the University of California at Berkeley, and his SB in Computer Science and Engineering from MIT, Cambridge. He has produced over 300 papers in various journals and conferences, received two Best Paper awards, and has been awarded several guest professorships. His work has been supported by the US Department of Defense, Department of Education, National Science Foundation, the EU, and industrial companies such as Intel, Fujitsu, and Guidant. In the last ten years, he has focused his energy on service-oriented computing and SaaS, and has worked on various aspects of software engineering, including requirements, architecture, testing, and maintenance.

Wei Li is a professor and a member of the Chinese Academy of Sciences. He received his PhD in Computer Science from the University of Edinburgh and his BS degree in Mathematics from Peking University. He is the director of the State Key Laboratory of Software Development Environment and vice-chair of the Chinese Institute of Electronics. He was president of Beihang University from 2002 to 2009. Currently, he serves as Editor-in-Chief of Science China Information Sciences and as an editor of the Journal of Computer Science and Technology and the International Journal of Advanced Software Technology (Science in China Press). He has published over 100 papers and one book.