Hwang, G.-J., Yin, P.-Y., Hwang, C.-W., & Tsai, C.-C. (2008). An Enhanced Genetic Approach to Composing Cooperative Learning Groups for Multiple Grouping Criteria. Educational Technology & Society, 11 (1), 148-167.
An Enhanced Genetic Approach to Composing Cooperative Learning Groups for Multiple Grouping Criteria

Gwo-Jen Hwang
Department of Information and Learning Technology, National University of Tainan, Taiwan
[email protected]
Peng-Yeng Yin, Chi-Wei Hwang Department of Information Management, National Chi Nan University, Taiwan
Chin-Chung Tsai Graduate School of Technological and Vocational Education, National Taiwan University of Science and Technology, Taiwan
ABSTRACT
Cooperative learning is known to be an effective educational strategy for enhancing the learning performance of students. The goal of a cooperative learning group is to maximize all members' learning efficacy. This is accomplished by promoting each other's success through assisting, sharing, mentoring, explaining, and encouragement. To achieve the goal of cooperative learning, it is very important to organize well-structured cooperative learning groups. In this study, an enhanced genetic algorithm is proposed to organize cooperative learning groups that meet multiple grouping criteria. To show the usefulness of the algorithm, this study presents a case in which, for a given course, the teacher sets the grouping criteria that each concept of a certain course topic be precisely understood by at least one of the peer students in each group, and that the average learning achievement of each group be approximately identical. Based on our enhanced genetic algorithm, an assistant system for organizing cooperative learning groups has been developed. Experimental results show that the enhanced approach is able to efficiently organize cooperative learning groups that better fit the instructional objectives set by the instructor.
Keywords
Web-based learning, cooperative learning, meta-heuristic algorithm, genetic algorithm
1. Introduction
Computer-supported cooperative learning has been a key trend in e-learning, since it highlights the importance of social interactions as an essential element of learning and allows teachers to get directly involved in design activities (Hernández-Leo et al., 2006). "Cooperation" in this context means working together to accomplish common goals. Within the realm of cooperative activities, individuals seek outcomes that are beneficial to all members of the group. Cooperative learning refers to the instructional use of small groups so that students work together to maximize the learning efficacy of all group members (Johnson et al., 1991; Johnson & Johnson, 1999; Huber, 2003). Well-organized cooperative learning involves people working in teams to accomplish a common goal, under conditions in which all members must cooperate in the completion of a task and each individual member is accountable for the complete outcome (Smith, 1995). During the past decades, hundreds of studies have compared the effectiveness of cooperative, competitive, and individualistic efforts, carried out by a wide variety of researchers using different methods (Smith, 1995; Keyser, 2000; Ramsay et al., 2000; Rachel & Irit, 2002; Veenman et al., 2002). However, how to group students in a cooperative learning context has not been sufficiently discussed (Zurita et al., 2005). Moreover, when grouping students in a cooperative learning context, instructors may consider more than one factor or criterion, and educators in different educational contexts or backgrounds may have different sets of grouping criteria. In this paper, we formulate a Multi-Criteria Group Composition (MCGC) problem to model the composition of cooperative learning groups that meet multiple grouping criteria for various instructional purposes, and propose an enhanced genetic algorithm to cope with the problem. To evaluate the performance of the proposed algorithm, a series of experiments has been conducted comparing the novel approach with other previously employed methods.
2. Relevant Research
In past decades, cooperative learning researchers have shown that positive peer relationships are an essential element of success during the learning process, while isolation and alienation can lead to failure (Tinto, 1993). In a cooperative learning group, students are assigned to work together with the awareness that success fundamentally depends upon the efforts of all group members. The group goal of maximizing all members' learning provides a compelling common purpose, one that motivates members to accomplish achievements beyond their individual expectations. Students promote each other's success through helping, sharing, assisting, explaining, and encouraging. They provide both academic and personal support based on a commitment to, and caring about, each other. All of the group members are taught teamwork skills and are expected to use them to coordinate their efforts and achieve their goals (Smith, 1996).

2.1 Web-based Cooperative Learning
Many studies have documented the effectiveness of cooperative learning in the classroom (e.g., Mevarech, 1993; Smith, 1996; Ghaith & Yaghi, 1998) and have pointed out problems to be aware of, such as the "free-rider" effect (Hooper, 1992; Johnson & Johnson, 1990). With the rapid progress of computer and network technologies, researchers have attempted to construct interactive cooperative learning systems based on experience with in-class cooperative learning (Klingner & Vaughn, 2000; Porto, 2001; Swain, 2001; Ghaith, 2002). In recent years, some cooperative learning activities have been performed in web-based learning environments, so that students in different locations can cooperate to learn (Kirschner, 2001; Johnson et al., 2002; Sheremetov & Arenas, 2002; Macdonald, 2003). For example, Sun and Chou (1996) presented the CORAL system, which promotes cooperative and collaborative learning by providing windows that convey both verbal messages, such as voice, and nonverbal messages, such as facial expressions, to increase the social presence of the system, that is, the degree to which the system permits users to interact with others as if they were face to face. Furthermore, Pragnell et al. (2006) implemented a web-based environment and conducted two experiments to assess the quantity and quality of the interaction promoted by the system, and how factors such as gender, background knowledge, and role affect communication.

2.2 Composition of Cooperative Learning Groups
A variety of practical applications have shown that well-structured cooperative learning groups perform quite differently from poorly structured ones (Sun & Chou, 1996; Zurita et al., 2005). Therefore, it is an important issue to know how to organize cooperative learning groups in a way that benefits all of the students in a class. Slavin (1989) suggested that the best group size is from two to six, and that the composition of cooperative learning groups should take multiple criteria, such as personal interests, motivational orientations, learning achievements, and sex of the students, into consideration. Other researchers have also indicated that different grouping criteria for small groups may affect the learning performance and social behavior of the students (Beane & Lemke, 1971; Dalton et al., 1989; Hooper & Hannafin, 1988). Therefore, it is necessary to study the composition of cooperative learning groups to meet multiple grouping criteria.
Nevertheless, as the number of students in an online course can be very large, it is almost impossible for a teacher to organize a set of cooperative learning groups that meets multiple criteria by hand. To cope with this problem, in the following sections a multi-criteria group composition problem is defined, and an enhanced genetic approach is proposed to efficiently compose well-structured cooperative learning groups that meet the grouping criteria.
3. Multi-Criteria Group Composition (MCGC) problem
Researchers have indicated that different grouping criteria may affect the learning performance and social behavior of the students; however, it is almost impossible for a teacher to manually compose cooperative learning groups to meet multiple grouping criteria, especially for web-based courses with a large number of students. In this section, a
Multi-Criteria Group Composition (MCGC) problem is formulated to model the composition of cooperative learning groups that meet multiple grouping criteria for various instructional purposes. In an MCGC problem, the grouping criteria can include a variety of factors related to the learning characteristics of the students (e.g., motivational orientations, learning achievements, and learning styles). For example, in a "Management Information Systems (MIS)" course, a hypothetical case assumes that the teacher attempts to design a cooperative learning activity to observe whether students with poorly learned concepts can improve with the assistance of other students who have better conceptual understanding, when the group composition is based on the students' conceptual assessment results. Therefore, in composing the cooperative learning groups, the teacher may set the criterion that, for each concept to be learned, at least one of the members in each group has learned the concept well. Moreover, it is assumed that the teacher wishes to control two additional factors: the number of students and the average learning achievement in each group. The concepts to be learned are denoted by C1, C2, …, Cn (for example, a certain MIS topic, decision support and expert systems, involves several concepts, such as the meanings of logical operators, knowledge representation, facts, rules, inference chaining, knowledge acquisition, elements, constructs, ratings, and uncertainty). The students in this course are denoted by S1, S2, …, Sm and will be partitioned into r groups. The notations used in this example are presented in Table 1.

Table 1. Notations used in the MCGC problem for the MIS course
Notation | Description
n | Number of concepts to be learned in a course
m | Number of participating students
r | Number of cooperative learning groups
Ci | The ith concept to be learned
Si | The ith participating student
Gi | The ith cooperative learning group
fi | Pre-testing score of the ith student
Lik | Binary indicator; Lik = 1 means the ith student already learned the kth concept before the grouping, and 0 otherwise
xij | Binary decision variable; xij = 1 means the ith student is assigned to the jth group, and 0 otherwise
The group-composition problem of the MIS course described above can be interpreted as an MCGC problem of finding an optimal composition of cooperative learning groups under the constraints that (1) each participating student is assigned to exactly one group, (2) the difference between the numbers of students in different groups is no more than one, and (3) each concept to be learned in the course is precisely understood by at least one of the students in each group, so that every student can learn all the required concepts from the members of his/her own group. The optimization objective of this MCGC problem is to minimize the maximal difference of the average pre-testing scores between any two groups. Formally, the MCGC problem is formulated as follows:
$$\text{Minimize } Z(x_{ij}) = \max_{1 \le j \le r}\left\{ \frac{\sum_{i=1}^{m} x_{ij} f_i}{\sum_{i=1}^{m} x_{ij}} \right\} - \min_{1 \le j \le r}\left\{ \frac{\sum_{i=1}^{m} x_{ij} f_i}{\sum_{i=1}^{m} x_{ij}} \right\} \quad (1)$$

subject to

$$x_{ij} \in \{0, 1\}, \quad 1 \le i \le m, \; 1 \le j \le r \quad (2)$$

$$\sum_{j=1}^{r} x_{ij} = 1, \quad 1 \le i \le m \quad (3)$$

$$\left| \sum_{i=1}^{m} x_{ij_1} - \sum_{i=1}^{m} x_{ij_2} \right| \le 1, \quad 1 \le j_1 < j_2 \le r \quad (4)$$

$$\sum_{i=1}^{m} x_{ij} L_{ik} \ge 1, \quad 1 \le j \le r \text{ and } 1 \le k \le n \quad (5)$$
In the above formulation, the objective function (1) minimizes the maximal difference between the mean pre-testing scores of any two distinct groups. Constraint (2) states that xij is a binary variable taking the value 0 or 1. Constraint (3) stipulates that each student is assigned to exactly one group. Constraint (4) ensures that the difference between the numbers of students of any two groups is no more than one. Constraint (5) guarantees that each concept is understood by at least one of the students in each group. To cope with MCGC problems, in the next section we propose an enhanced genetic algorithm which can derive quality solutions within reasonable time.
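To make the formulation concrete, the following is a minimal Python sketch that evaluates a candidate assignment against the model above. The list-based encoding and all names (e.g., `evaluate`) are our own illustration rather than part of the paper; Constraints (2) and (3) are implicit in this encoding, since each student carries exactly one group index.

```python
def evaluate(assign, f, L, r):
    """Return (Z, feasible) for a candidate grouping of the MCGC problem.

    assign[i] in 0..r-1 -- group of student i (encodes the x_ij of Eqs. (2)-(3))
    f[i]                -- pre-testing score of student i
    L[i][k]             -- 1 if student i already learned concept k, else 0
    """
    m, n = len(assign), len(L[0])
    groups = [[i for i in range(m) if assign[i] == j] for j in range(r)]

    # Constraint (4): group sizes differ by at most one (and no group is empty).
    sizes = [len(g) for g in groups]
    size_ok = min(sizes) > 0 and max(sizes) - min(sizes) <= 1

    # Constraint (5): every concept is known by at least one member of every group.
    concept_ok = all(any(L[i][k] for i in g)
                     for g in groups for k in range(n))

    # Objective (1): maximal gap between the group mean pre-testing scores
    # (empty groups are skipped; they already fail the size check).
    means = [sum(f[i] for i in g) / len(g) for g in groups if g]
    return max(means) - min(means), size_ok and concept_ok
```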
4. An Enhanced Genetic Algorithm to the MCGC Problem
In this section, we first give the historical background of genetic algorithms (GAs) and review the state of the art. An enhanced genetic algorithm (EGA) for the MCGC problem is then proposed, and the differences between the additional features of the EGA and standard GAs are described in detail. Finally, a comprehensive example is provided to illustrate how our algorithm works.

[Flowchart: chromosome encoding and initial population generation → fitness evaluation → selection → crossover → mutation → stopping criterion satisfied? If no, loop back to fitness evaluation; if yes, terminate and output the best solution.]
Figure 1. Flowchart of the SGA

4.1. Review of the GA Paradigm
Holland (1975) and his students coined the term "genetic algorithms" and developed the first implementations in the early 1970s. Their algorithms mimic natural selection and genetics to enforce the principle of "survival of the fittest", known as the Darwinian Rule, in interpreting evolution (Balady, 2002). A simple GA (SGA) starts with an initial population, which is a pool of candidate solutions encoded on the basis of the decision variables of the optimization problem at hand. The SGA improves the average fitness of the population by repeatedly
applying three genetic operators, namely selection, crossover, and mutation. When a pre-specified criterion for evolution convergence is met, the algorithm terminates and outputs the best solution seen so far. The flowchart of the SGA is presented in Fig. 1. In the following we describe each SGA component in detail.

Chromosome encoding and initial population generation
To apply a GA to an optimization problem, we need to encode each candidate solution of the problem at hand into a chromosome, which is a sequence of genes. The SGA usually encodes the genes as binary values, although real numbers and character symbols are allowed in more sophisticated versions. The genetic operators are applied to the chromosome form to perform the evolution. However, the chromosome needs to be decoded back to its original solution to assess how well that solution solves the optimization problem. A GA fosters a population of chromosomes. The initial population is usually generated at random in order to sample the solution space uniformly and prevent modeling bias. The population evolves from generation to generation through the application of genetic operators, and the population size, denoted by Pop, is kept constant throughout the evolution.

Fitness evaluation
According to the Darwinian Rule, fitter individuals have a higher chance of survival. To evaluate the fitness of each chromosome, we need to define a fitness function that assesses how well the solution decoded from the chromosome solves the addressed problem. Usually the objective function of the problem can be used for this purpose. However, the yielded solutions could be infeasible when the problem constraints are strict; in that case we can incorporate a penalty into the fitness to reflect the violation.

Selection
Selection is the process that performs the survival of the fittest. A chromosome survives with a probability proportional to the ratio between its fitness and the total fitness of the population in which it lives. Let the fitness of chromosome $Y$ be $\mathit{Fitness}(Y)$. The selection probability of chromosome $Y_i$ is determined by

$$\Pr(Y_i) = \frac{\mathit{Fitness}(Y_i)}{\sum_{j=1}^{Pop} \mathit{Fitness}(Y_j)}.$$

Roulette-wheel selection is one of the most popular selection schemes, and it proceeds as follows. To perform the selection according to the probability distribution $\Pr(Y)$, the accumulated selection probability $A_i = \sum_{j=1}^{i} \Pr(Y_j)$ is computed. Then, a random real number $q$ is drawn from a uniform distribution $U[0.0, 1.0]$. The roulette-wheel selection picks $Y_i$ if $A_{i-1} < q \le A_i$. The selection process is repeated until the population is fully filled.

Crossover
Crossover is a genetic process in which segments of genes are exchanged among chromosomes in the hope that fitter combinations of genes are obtained. After selection, the mating parents are randomly selected from the population and undergo crossover with a probability Pc. The SGA employs single-point crossover. Let chromosomes α and β be the mating parents, α = [ 1 0 1 1 0 0 1 0 1 1 ], β = [ 0 0 1 1 1 1 1 0 0 1 ]. A random gene position s is generated, say s = 4, and the offspring α′ and β′ are obtained by exchanging all the bits after position s:
α′ = [ 1 0 1 1 1 1 1 0 0 1 ],
β′ = [ 0 0 1 1 0 0 1 0 1 1 ].
Note that α′ and β′ retain the common genes (called a schema) of their parents, i.e., [#011##10#1], where '#' is a don't-care bit. This means the crossover operation enables the SGA to explore the constrained hyperspace defined by [#011##10#1].

Mutation
Crossover provides a genetic mechanism for conducting evolution in the schema space of the existing genes. Another essential genetic operator, called mutation, is a random alteration of existing genes that guarantees a non-zero probability of evolving toward any feasible sequence and regenerates lost genetic material. Mutation is performed with a very low probability Pm. For a binary-coded chromosome, mutation is simply performed by bit flipping, i.e., flipping between 0 and 1. Let [ 1 0 1 1 0 0 1 0 1 1 ] be the selected chromosome to which mutation is to be applied. A random gene position, say 4, is chosen and the allele at that position is altered. Therefore, the mutated chromosome becomes [ 1 0 1 0 0 0 1 0 1 1 ].

Stopping criterion
The evolution of natural organisms never stops unless they become extinct. For a GA program, however, we need to set up a stopping criterion such that a quality solution to the problem at hand can be obtained in reasonable time. Commonly adopted stopping criteria include (1) setting a maximal number of generations, (2) setting a minimal quality level for the finally obtained solution, and (3) setting a maximal number of generations between two consecutive improvements of the best solution observed so far.

Since the 1990s, many variations of GA technology have been developed. For example, Deb & Agrawal (1995) developed new genetic operators for conducting evolution with real-number-encoded chromosomes in continuous search spaces. Adaptive strategies for varying the control parameters and chromosome encodings have been proposed to improve performance (Srinivas & Patnaik, 1994). Damidis (1994) reviewed several parallel GAs which divide the problem into chunks and solve the chunks simultaneously using isolated GAs running on distributed processors, so that the problem is solved more efficiently. Individuals can be exchanged among the processors (a mechanism called migration), and the resulting performance has been shown to be superior to that of a single population. To cope with multiobjective optimization problems, a Nondominated Sorting GA (NSGA) has been developed by Srinivas & Deb (1994). Their method can identify Pareto-optimal solutions which compromise among multiple objectives.

4.2. EGA to the MCGC Problem
In this section, we propose an enhanced GA (EGA), which adds several new features and advances GA technology in a number of aspects, so that it is customized to the scenarios of the MCGC problem. The differences between the EGA and traditional GAs are described in detail.

4.2.1. Chromosome Encoding
In GAs, a chromosome represents a candidate solution to the addressed problem. For our MCGC problem formulation (see Eqs. (1)-(5)), the traditional binary encoding rule would use an m×r binary vector X = [x11, x12, …, x21, x22, …, xm1, xm2, …, xmr]. However, Constraint (3) states that each student should be assigned to exactly one group; thus, only m bits of X take value 1 and the remaining m(r − 1) bits take value 0, making X a sparse vector. Instead of the above m×r vector representation, we present another chromosome encoding rule which is more memory-saving and computationally efficient. Let the chromosome be represented by Y = [y1, y2, …, ym], where yi ∈ [1, r] is an integer indicating the index of the learning group to which Si is assigned.
The advantages of employing our chromosome encoding rule over the traditional binary encoding rule are twofold. (1) The storage and the encoding/decoding complexity for the chromosomes are both reduced. (2) Constraints (2) and (3) are implicitly satisfied by our chromosome encoding rule, because xij = 1 for j = yi and xij = 0 for all j ≠ yi, so the encoding itself stipulates that each student is assigned to exactly one group. To ensure the satisfaction of Constraint (4), which states that the difference between the numbers of students in different groups is no more than one, we enforce the grouping rule that either ⌊m/r⌋ or ⌊m/r⌋ + 1 elements of Y may take the same value when generating the initial chromosome population. As for Constraint (5), which requires that each concept to be learned is already understood by at least one of the students in each group, it is very hard to devise an appropriate rule for composing such groups directly. Alternatively, we relax this constraint by allowing infeasible solutions which violate Constraint (5), and decrease the fitness of such solutions; the details are described in the next subsection. For example, the chromosome [2, 1, 1, 3, 2, 3] indicates the group composition where the 2nd and 3rd students are assigned to the first group, the 1st and 5th students to the second group, and the 4th and 6th students to the third group, so each student is assigned to exactly one group and each group has the same number of students. This implies that Constraints (2)-(4) are always met if our chromosome encoding rule is adopted. However, the satisfaction of Constraint (5) is not guaranteed, and it will be accounted for by the fitness function.
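The size-balanced initialization described above is straightforward to realize. The following Python sketch (function name ours) builds a random chromosome whose group sizes are ⌊m/r⌋ or ⌊m/r⌋ + 1, so Constraints (2)-(4) hold by construction.

```python
import random

def random_chromosome(m, r):
    """Random Y = [y1, ..., ym] with every group of size floor(m/r) or
    floor(m/r) + 1; group labels are 1-based, as in the paper."""
    base, extra = divmod(m, r)
    labels = []
    for j in range(1, r + 1):
        # the first (m mod r) groups receive one extra member
        labels += [j] * (base + (1 if j <= extra else 0))
    random.shuffle(labels)
    return labels

# e.g., an initial population of five chromosomes for m = 6 students, r = 3 groups
population = [random_chromosome(6, 3) for _ in range(5)]
```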
4.2.2. Fitness Function
To evaluate the fitness of each chromosome, we need to define a fitness function to assess how well the solution decoded from the chromosome solves the addressed problem. Clearly, the objective function Z(xij) (see Eq. (1)) can be used to assess the quality of each chromosome Y; that is, the smaller the difference of the average pre-testing scores between any two groups constructed by the solution, the better the quality of chromosome Y. However, this value is discredited if the solution decoded from Y violates Constraint (5). Thus, we decrease the fitness of Y if it violates Constraint (5) by using the following penalty function:

$$\mathit{Penalty}(x_{ij}) = n - \frac{\sum_{j=1}^{r} \sum_{k=1}^{n} \min\left\{1, \sum_{i=1}^{m} x_{ij} L_{ik}\right\}}{r}. \quad (6)$$

Precisely speaking, the penalty function computes the average number of concepts already known by the members of the same group and subtracts this number from n (the total number of concepts). Hence, the fewer concepts covered by the members of each group, the greater the penalty given to the solution which constructs the grouping. Now, we are able to define a fitness function as follows:

$$\mathit{Fitness}(Y) = 1 - \left(\alpha \times Z(x_{ij})\right)^2 - \beta \times \mathit{Penalty}(x_{ij}) \quad (7)$$
Because the values of the objective function and the penalty function fall in different ranges, 0 ≤ Z(xij) ≤ F and 0 ≤ Penalty(xij) ≤ n, where F is the maximum pre-testing score value, we adopt two normalization parameters, α and β, to transform the values of the separate terms into the same range [0.0, 1.0]. As such, the fitness value will not be dominated by a great value of either F or n. Furthermore, we use a quadratic form for the α×Z(xij) term and a linear form for the β×Penalty(xij) term (note that both terms have been normalized to the value range [0.0, 1.0]) to decrease the impact of the objective function upon the fitness and increase the influence of the penalty function. As such, the yielding of infeasible solutions can be avoided soon after the evolution starts. When all the solutions in the chromosome population become feasible (with a zero value of the β×Penalty(xij) term), the evolution turns to decreasing the value of the α×Z(xij) term as much as possible to maximize the fitness. Hence, the greater the fitness value of the chromosome Y, the better the quality of the corresponding solution. The fitness function provides a comparison basis for performing the survival-of-the-fittest rule. For example, let the maximum pre-testing score value be F = 100 and the total number of concepts be n = 10, so we can set the normalization parameters to α = 0.01 and β = 0.1 to scale the α×Z(xij) term and the β×Penalty(xij) term into the same value range [0.0, 1.0]. Assume that a chromosome Y constructs a grouping resulting in a maximal difference of 20 between the mean pre-testing scores of any two groups, so the objective value is Z(xij) = 20. Further assume that the average number of concepts already known by the members of each group in the same grouping is 2.5, so chromosome Y receives a penalty value of Penalty(xij) = 10 − 2.5 = 7.5. Therefore, the fitness of chromosome Y is Fitness(Y) = 1 − (0.01×20)² − (0.1×7.5) = 0.21.
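As a concrete companion to Eqs. (1), (6), and (7), the following Python sketch computes the fitness of an integer-encoded chromosome. The function name and argument layout are our own; the paper specifies only the formulas. With alpha = 0.01 and beta = 0.1, a grouping with Z = 20 and an average concept coverage of 2.5 reproduces the worked value 0.21.

```python
def fitness(Y, f, L, n, r, alpha, beta):
    """Fitness of Eq. (7): 1 - (alpha*Z)^2 - beta*Penalty.

    Y[i] in 1..r -- group of student i; Y is assumed to satisfy
                    Constraints (2)-(4), as the encoding guarantees.
    f[i]         -- pre-testing score of student i
    L[i][k]      -- 1 if student i already learned concept k, else 0
    """
    m = len(Y)
    groups = [[i for i in range(m) if Y[i] == j + 1] for j in range(r)]

    # Objective (1): maximal difference between group mean pre-testing scores.
    means = [sum(f[i] for i in g) / len(g) for g in groups]
    Z = max(means) - min(means)

    # Penalty (6): n minus the average number of concepts covered per group.
    covered = sum(min(1, sum(L[i][k] for i in g))
                  for g in groups for k in range(n))
    penalty = n - covered / r

    return 1 - (alpha * Z) ** 2 - beta * penalty
```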
4.2.3. Selection
The selection process stipulates that individuals that are better adapted to the environment are more likely to survive. Based on preliminary experiments, we decided to implement roulette-wheel selection instead of other selection schemes such as tournament selection (Goldberg, 1989), because the former can quickly filter out less-fit chromosomes that encode infeasible solutions. As the MCGC problem is strictly constrained (each concept should already be learned by at least one of the members of each group), it is hard to conduct the evolution when many infeasible solutions are present in the population. Roulette-wheel selection can effectively distinguish between feasible and infeasible solutions and is better suited to the scenario of the MCGC problem. For example, let the population consist of four chromosomes Y1, Y2, Y3, and Y4, let Y1 and Y2 be feasible and Y3 and Y4 be infeasible, and let their fitness values be 0.75, 0.80, 0.30, and 0.20, respectively. The selection probability of Y1 is computed as 0.75/(0.75+0.80+0.30+0.20) = 0.37. The selection probabilities of the other chromosomes can be computed similarly; they are 0.39, 0.15, and 0.09, respectively. As the selection probabilities of Y1 and Y2 are significantly greater than those of Y3 and Y4, the infeasible chromosomes which violate the MCGC constraints are quickly filtered out of the evolving population.

4.2.4. Crossover
Crossover allows the chromosomes in the mating pool to exchange their genetic material so that better combinations of genes can be preserved in their offspring. In optimization terms, crossover enables the genetic algorithm to extend the exploration toward better building blocks, and the average fitness of the population improves. Here, we develop a new crossover operator to cope with the constraints raised by the MCGC problem. Our crossover operator has several distinct features compared to the traditional crossover operators employed in the literature. The details are presented below.
Step 1: Let the two mating chromosomes be Yi and Yj. Generate two crossover points at random, say pos1 and pos2. Let the gene values at pos1 and pos2 of Yi be Yi(pos1) and Yi(pos2), and those of Yj be Yj(pos1) and Yj(pos2), respectively. Exchange the gene values at pos1 and pos2 between Yi and Yj.
Step 2: Choose another gene of Yi whose value is equal to Yi(pos1) (the post-exchange value) and replace its value with Yj(pos1); likewise, choose another gene of Yi whose value is equal to Yi(pos2) and replace its value with Yj(pos2). As such, the number of students assigned to each group is not affected by the crossover operation.
Step 3: Similar to Step 2, perform the same replacement on chromosome Yj.
The purpose of our new crossover operator is to guarantee that the offspring produced by the crossover operation still satisfy Constraint (4); that is, the difference between the numbers of students of any two groups is no more than one. For example, let the mating chromosomes be Yi = [3, 1, 3, 2, 2, 1] and Yj = [2, 1, 3, 1, 3, 2]. We first generate two crossover points at random, say 1 and 4. The gene values of the two chromosomes at the two points are exchanged, and we obtain two intermediate chromosomes Yi = [2, 1, 3, 1, 2, 1] and Yj = [3, 1, 3, 2, 3, 2]. The two intermediate chromosomes violate Constraint (4); for example, three students are assigned to the first group by Yi but only one student to the third group.
So we need to readjust the student assignment of the intermediate chromosomes. For the intermediate chromosome Yi, we can choose to replace the 5th gene, whose value is equal to that of the 1st gene, by 3, and further replace the 6th gene, whose value is equal to that of the 4th gene, by 2. So we obtain the final chromosome [2, 1, 3, 1, 3, 2]. We observe that the new chromosome satisfies Constraint (4) after the readjustment. Therefore, our proposed crossover operator is customized to the MCGC problem and ensures that the yielded offspring still satisfy the problem constraints.

4.2.5. Mutation
Mutation is an occasional alteration of the genes, and it is performed with a low probability Pm. Its purpose is to increase the gene diversity of the population and escape the barrier of local optima. Our mutation operator proceeds as follows. If chromosome Yi is determined to undergo mutation, generate two points at random, say pos1 and pos2. Let the original gene values at pos1 and pos2 of Yi be Yi(pos1) and Yi(pos2), respectively. Swap Yi(pos1) and Yi(pos2)
to generate a new chromosome and remove the old one. Our mutation operator guarantees that the numbers of students assigned to the groups are not affected by the mutation operation; only the assigned students are changed, so Constraints (2)-(4) are still satisfied. For example, let the chromosome to undergo mutation be [3, 2, 1, 2, 3, 1]. We first generate two mutation points, say 2 and 5. The gene values at the two points are exchanged and we obtain the mutated chromosome [3, 3, 1, 2, 2, 1]. We see that the numbers of students in the groups are not changed but the assigned students are different. Therefore, our new mutation operator is different from the traditional bit-flipping one, because the latter cannot cope with the MCGC constraints.

4.3 Illustrative Examples
Assume that the MIS course has 10 concepts to be learned, C1, C2, …, C10, which denote Data vs. Information, Computer Equipment for Information Systems, Closed Systems, Computer Components, Human-Computer Synergy, Strategic Advantage, Strategic Alliance, Strategic Information System (SIS), Productivity, and Reengineering, respectively. Six students, denoted by S1, S2, …, S6, join this course and will be assigned to three cooperative learning groups. From the pre-testing results, we know which concepts each student has understood; the concepts already learned and the pre-testing scores of the individual students are shown in Table 2.

Table 2. Pre-test results of individual students
Student | Concepts already learned | Pre-testing score
S1 | C1, C2, C3 | 83
S2 | C2, C3, C8, C9 | 65
S3 | C1, C5, C8, C9 | 72
S4 | C4, C6, C8, C10 | 75
S5 | C1, C6 | 90
S6 | C2, C6, C7 | 81

Chromosome encoding
With our chromosome encoding rule, we can generate a population of five chromosomes as follows.
Y1 = [ 1 1 2 2 3 3 ]
Y2 = [ 1 2 1 2 3 3 ]
Y3 = [ 1 2 3 3 1 2 ]
Y4 = [ 2 3 2 1 1 3 ]
Y5 = [ 3 2 3 1 2 1 ]
All five chromosomes meet the requirements of Constraints (2)-(4). In particular, the number of students assigned to each group is the same.

Table 3. The values of the α×Z(xij) term, the β×Penalty(xij) term, the fitness, and the selection probability for each chromosome in the initial population
Chromosome | α×Z(xij) term | β×Penalty(xij) term | Fitness(Yi) = 1−(α×Z(xij))²−(β×Penalty(xij)) | Pr(Yi) = Fitness(Yi) / Σ_{j=1}^{5} Fitness(Yj)
Y1 | 0.12 | 0.467 | 0.5186 | 0.188
Y2 | 0.155 | 0.433 | 0.542975 | 0.197
Y3 | 0.135 | 0.433 | 0.548775 | 0.199
Y4 | 0.15 | 0.433 | 0.5445 | 0.198
Y5 | 0.005 | 0.4 | 0.599975 | 0.218
Fitness function and roulette-wheel selection
The fitness of each chromosome is evaluated according to the fitness function (see Equation (7)). The normalization parameters α and β are set to 0.01 and 0.1 because there are 10 concepts and the maximal testing score is 100. Table 3 lists the values of the α×Z(xij) term, the β×Penalty(xij) term, and the fitness for each chromosome in the initial population. Next, we compute the selection probability of each chromosome Yi by Pr(Yi) = Fitness(Yi) / Σ_{j=1}^{5} Fitness(Yj); the results are shown in the last column of Table 3. It is observed that Y5 has the highest selection probability, 0.218, because the values of its α×Z(xij) term and β×Penalty(xij) term are the smallest among all chromosomes. The selection probabilities of Y2, Y3, and Y4 are similar and rank in the middle, while Y1 has the lowest selection probability, 0.188. The roulette-wheel selection then selects chromosomes according to Pr(Yi) to form a mating pool. Assume that Y5 is selected twice, Y2, Y3, and Y4 are selected once, and Y1 is eliminated in the selection process, as shown below.
Y1 ← Y5 = [ 3 2 3 1 2 1 ]
Y2 ← Y3 = [ 1 2 3 3 1 2 ]
Y3 ← Y5 = [ 3 2 3 1 2 1 ]
Y4 ← Y2 = [ 1 2 1 2 3 3 ]
Y5 ← Y4 = [ 2 3 2 1 1 3 ]

Crossover
We show the crossover of Y4 and Y5 as an example; the crossover of the other chromosome pairs can be derived similarly. First, two random crossover points, say 1 and 5, are generated. Second, the gene values at positions 1 and 5 are exchanged between Y4 and Y5, so they become
Y4 = [ 2 2 1 2 1 3 ]
Y5 = [ 1 3 2 1 3 3 ]
We then randomly choose two other locations (say 2 and 3) in Y4 whose gene values are equal to Y4(1) = 2 and Y4(5) = 1, and replace their gene values by 1 and 3, respectively. So now Y4 becomes
Y4 = [ 2 1 3 2 1 3 ]
Analogously, Y5 can be further modified as
Y5 = [ 1 3 2 2 3 1 ]
As such, the number of students assigned to the groups is not affected by the crossover operation.

Mutation
After the crossover process for all pairs of chromosomes has been completed, the mutation operator is activated with a very low probability. Assume that Y4 is to be mutated. Two random locations are first generated, say 4 and 6. The gene values at these positions are swapped, and Y4 finally becomes
Y4 = [ 2 1 3 3 1 2 ]
We observe that the number of students assigned to the groups is not changed during any step (selection, crossover, or mutation) of the genetic algorithm. Thus, the satisfaction of Constraints (2)-(4) is guaranteed throughout the whole process of our algorithm, while the requirement of Constraint (5) is reflected in the β term of the fitness function, and the selection process will rule out the infeasible chromosomes with positive β×Penalty(xij) term values.
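To tie the pieces together, here is a compact, self-contained Python sketch of the EGA's main loop with the roulette-wheel selection, size-preserving crossover, and swap mutation described above. It is our illustrative reading of Sections 4.2.3-4.2.5, not the authors' code; the fitness function is passed in as a parameter (e.g., an implementation of Eq. (7)), and the rates `pc` and `pm` are hypothetical, since the paper does not report them.

```python
import random

def roulette_select(pop, fits):
    """Roulette-wheel selection: survival probability proportional to fitness."""
    q = random.uniform(0.0, sum(fits))
    acc = 0.0
    for chrom, fit in zip(pop, fits):
        acc += fit
        if q <= acc:
            return chrom[:]
    return pop[-1][:]  # guard against floating-point round-off

def _repair(Y, pos, incoming, outgoing):
    """Give one surplus member of the 'incoming' group back to the 'outgoing'
    group so that group sizes are unchanged (Steps 2-3 of Section 4.2.4).
    A full implementation would also guard against touching the other
    crossover point."""
    for t in range(len(Y)):
        if t != pos and Y[t] == incoming:
            Y[t] = outgoing
            return

def crossover(Yi, Yj):
    """Size-preserving two-point crossover of Section 4.2.4."""
    Yi, Yj = Yi[:], Yj[:]
    p1, p2 = random.sample(range(len(Yi)), 2)
    old_i = {p1: Yi[p1], p2: Yi[p2]}
    old_j = {p1: Yj[p1], p2: Yj[p2]}
    for p in (p1, p2):  # Step 1: exchange gene values at both points
        Yi[p], Yj[p] = old_j[p], old_i[p]
    for p in (p1, p2):  # Steps 2-3: repair group sizes in both offspring
        if old_i[p] != old_j[p]:
            _repair(Yi, p, old_j[p], old_i[p])
            _repair(Yj, p, old_i[p], old_j[p])
    return Yi, Yj

def mutate(Y):
    """Swap mutation of Section 4.2.5: exchanging two genes keeps group sizes."""
    a, b = random.sample(range(len(Y)), 2)
    Y[a], Y[b] = Y[b], Y[a]

def evolve(pop, fitness, generations=100, pc=0.9, pm=0.05):
    """One possible generational scheme (N = len(pop), K = generations)."""
    best = max(pop, key=fitness)
    for _ in range(generations):
        fits = [fitness(Y) for Y in pop]
        pop = [roulette_select(pop, fits) for _ in pop]
        for a in range(0, len(pop) - 1, 2):
            if random.random() < pc:
                pop[a], pop[a + 1] = crossover(pop[a], pop[a + 1])
        for Y in pop:
            if random.random() < pm:
                mutate(Y)
        best = max(pop + [best], key=fitness)
    return best
```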
5. Experiments and Analysis
The performance of the proposed EGA approach is analyzed according to experimental evidence. In this section, we first demonstrate a web-based system which accommodates the MCGC problem using the EGA method, and perform parameter tuning on the proposed algorithm. Second, the performance of our algorithm is evaluated by comparing it with other competing algorithms. Finally, the robustness of our algorithm is verified by measuring the variance between repeated runs and across different problem scenarios.
Figure 2. Automatic analysis showing the degree of understanding for each concept involved in the course
[Figure 3 is a screenshot listing, for each participating student, the student ID, subject, score, and well-understood concepts.]
Figure 3. Pre-testing score and the concepts that are well understood by each participating student
5.1 An Application of the EGA Approach
To assist teachers in conducting multi-criteria group composition for on-line courses, a web-based system has been implemented and merged into the Intelligent Tutoring, Evaluation and Diagnosis III (ITED III) system developed by our research team. To use the system, the teacher first selects an on-line learning course and specifies the starting and due dates and times of the pre-testing for the students to attend. The participating students need to complete the pre-testing during the specified time frame through our on-line testing interface. After the pre-testing, the system automatically analyzes to what extent each student understands each concept involved in the course. Fig. 2 shows the analysis result, using a continuous scale of [0.0, 1.0] to measure the degree of understanding (where 0.0 means the student did not understand the concept at all and 1.0 indicates full understanding). Based on the analysis, the system reports the pre-testing score and the concepts that are well understood by each student, as shown in Fig. 3. Finally, the embedded EGA algorithm determines the best group membership for each student (see Fig. 4), such that the objective value of the grouping is optimized and the MCGC constraints are satisfied simultaneously.
Figure 4. The embedded EGA algorithm determines the best group composition of the students

There are two critical parameters that affect the performance of our EGA algorithm: the number of chromosomes (N) and the maximal number of generations (K). To obtain the best performance, we repeatedly ran the EGA with the various values of N and K listed in Table 4. The computational time in seconds (t) and the optimal fitness value (F) obtained by the EGA with the various values of N and K are shown in Table 5. The best performance is observed when our algorithm runs for 100 generations with 20 chromosomes, where the computational time consumed (22.562) is moderate and the optimal fitness value obtained (0.936) is the best among all trials. Hence, in the following experiments, we set the EGA parameters N and K to 20 and 100, respectively.

Table 4. Combinations of various values of N and K
N | K
10 | 10, 100, 200, 300, 400, 500, 1000
20 | 10, 100, 200, 300, 400, 500, 1000
30 | 10, 100, 200, 300, 400, 500, 1000
Table 5. Computational time and the optimal objective value obtained by the EGA with various values of N and K
K | N = 10 (t, F) | N = 20 (t, F) | N = 30 (t, F)
10 | 1.453, 0.702 | 2.468, 0.803 | 3.515, 0.823
100 | 12.703, 0.881 | 22.562, 0.936 | 32.234, 0.917
200 | 25.265, 0.921 | 44.718, 0.898 | 63.859, 0.923
300 | 37.796, 0.880 | 67.031, 0.804 | 95.640, 0.855
400 | 50.328, 0.917 | 89.250, 0.926 | 127.906, 0.874
500 | 63.140, 0.833 | 111.453, 0.843 | 158.922, 0.929
1000 | 126.109, 0.872 | 222.625, 0.827 | 319.343, 0.911

5.2 Performance analysis
We evaluate the performance of the EGA by comparing it with several competing algorithms on a large simulation dataset of MCGC problems. The characteristics of the dataset and the competing algorithms are described in the following.

Simulation dataset
The MCGC parameters involve the number of concepts (n), the number of participating students (m), the number of learning groups (r), the pre-testing score of each student (fi), and the concepts already understood by each student (Lik). We generate a large simulation dataset of MCGC problems at random by varying the MCGC parameters. The combinations of the parameter values are shown in Table 6.

Table 6. Combinations of the MCGC parameter values
m | n | r
50 | 10, 20, 30, 50 | 10
100 | 10, 20, 30, 50 | 20
300 | 10, 20, 30, 50 | 60
500 | 10, 20, 30, 50 | 100
1000 | 10, 20, 30, 50 | 200
1500 | 10, 20, 30, 50 | 300
2000 | 10, 20, 30, 50 | 400
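For reference, a random instance in the spirit of this dataset could be generated as follows. The paper does not give the exact distributions of scores and concept indicators, so the uniform choices below are our assumption.

```python
import random

def random_instance(m, n, seed=None):
    """Random MCGC instance: pre-testing scores f and concept indicators L.
    Uniform distributions are an assumption, not the authors' specification."""
    rng = random.Random(seed)
    f = [rng.randint(0, 100) for _ in range(m)]                    # scores f_i
    L = [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]  # L_ik
    return f, L

# e.g., the smallest setting of Table 6: 50 students, 10 concepts
f, L = random_instance(m=50, n=10, seed=1)
```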
Competing algorithms
• Enhanced Genetic Algorithm (EGA): The EGA refers to our algorithm described in Section 4. The EGA parameters are optimally tuned according to the preliminary experiments presented in Section 5.1. In particular, all of the EGA runs use 100 generations with 20 chromosomes.
• Exhaustive Method (EM): The EM exhaustively enumerates all possible solutions to the MCGC problem, so the true optimal fitness value can be found. However, as the MCGC problem is NP-hard, only small problems can be solved by the EM.
• Greedy Method (GM): The GM first orders the students (Si) according to their pre-testing scores (fi) and the number of concepts (Σ_{k=1}^{n} Lik) they already know. In this order, each student is processed in turn and allocated to the group whose current members' average pre-testing score is closest to fi (a sketch of this heuristic follows the list).
• Random Method (RM): The RM repeatedly generates candidate solutions to the MCGC problem within a given CPU time limit. The optimality of the final solution is not guaranteed because the RM does not enumerate all possible solutions.
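The following is a minimal Python sketch of the GM as we read its description; the exact ordering and tie-breaking rules are not spelled out in the paper, so those details are assumptions.

```python
def greedy_grouping(f, L, r):
    """Greedy Method (GM) sketch: process students in descending order of
    (pre-testing score, concepts known) and assign each to the group whose
    current mean score is closest to the student's own."""
    m = len(f)
    order = sorted(range(m), key=lambda i: (f[i], sum(L[i])), reverse=True)
    cap = -(-m // r)  # ceil(m/r): no group grows beyond this size
    members = [[] for _ in range(r)]
    totals = [0.0] * r

    def gap(j, score):
        # distance between group j's current mean and the student's score;
        # empty groups are preferred so that every group is filled early
        return abs(totals[j] / len(members[j]) - score) if members[j] else -1.0

    for i in order:
        open_groups = [j for j in range(r) if len(members[j]) < cap]
        best = min(open_groups, key=lambda j: gap(j, f[i]))
        members[best].append(i)
        totals[best] += f[i]
    return members
```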
Experiment 1
In this experiment, we compare the results obtained by the EGA with those obtained by the EM. Since the results obtained by the EM are the true optimal fitness values, we can assess how effective the EGA is by examining the difference between their results. Table 7 shows that the EM can only solve the two smallest problems within reasonable time, and the true optimal fitness values for the two problems are both 1.0. For the remaining, larger problems, the EM cannot derive the optimal solution in practice because the MCGC problem is NP-hard and the computational time needed by the EM grows exponentially with the problem size. On the other hand, the optimal fitness values obtained by the EGA for the two smallest problems are very close to those obtained by the EM, indicating that the EGA is a very effective method. We observe that the computational time consumed by the EGA also increases with the problem size, but at a quite slow rate. Quality solutions can be derived by the EGA within only a few seconds, so the EGA is also computationally efficient.

Table 7. The comparative performance between EGA and EM
m | r | EGA (t, F) | EM (t, F)
15 | 3 | 0.031, 0.999 | 1820, 1
20 | 4 | 0.020, 0.953 | 12542, 1
50 | 10 | 0.121, 0.997 | N/A, N/A
100 | 20 | 0.309, 0.992 | N/A, N/A
300 | 60 | 2.043, 0.969 | N/A, N/A
N/A: not available
Experiment 2
In this experiment, we assess the comparative performance of the EGA, GM, and RM on the simulation dataset described in Table 6. The EGA and RM are stochastic methods, which means the result obtained by one run of the algorithm can differ from that of another run of the same algorithm. Thus, we report the results as the average of 10 runs for these two algorithms.

Table 8. Comparative performance between EGA, GM, and RM
m | EGA (t, F) | GM (t, F) | RM (t, F)
50 | 0.141, 0.962 | 0.031, 0.913 | 0.016, 0.810
100 | 0.358, 0.943 | 0.016, 0.879 | 0.016, 0.791
300 | 2.188, 0.897 | 0.031, 0.833 | 0.016, 0.773
500 | 6.328, 0.926 | 0.016, 0.838 | 0.016, 0.790
1000 | 22.714, 0.914 | 0.641, 0.857 | 0.016, 0.747
1500 | 48.417, 0.890 | 1.250, 0.812 | 0.016, 0.682
2000 | 87.538, 0.886 | 1.891, 0.799 | 0.031, 0.728
average | 23.955, 0.917 | 0.554, 0.847 | 0.018, 0.760
As there is no measurable impact on the results from the experiments with different numbers of concepts, we only report the results for the experiment with 50 concepts. Table 8 tabulates the computational time (t) and the optimal fitness value (F) obtained by the three algorithms. The computational times consumed by the EGA and GM grow with the number of students (m), while the computational time consumed by the RM is rarely affected by m. The computational times are all acceptable in practice; even for the largest problem, which involves 2,000 students and 50 concepts, the longest computational time needed is less than 1.5 minutes. As for the quality of the final solutions derived by the three algorithms, the optimal fitness found by the EGA is around 0.9, and it decreases at a slow rate as the number of participating students (m) increases. The GM reports an optimal fitness value between 0.8 and 0.9, while the result obtained by the RM is the worst; its optimal fitness is below 0.8 for most of the simulated MCGC problems, except for the case with 50 participating students, where the RM obtains an optimal fitness of 0.81. Fig. 5 shows the variations of the optimal fitness value derived by the three competing algorithms as the number of participating students increases. For all three methods, the optimal fitness decreases as the problem size (of complexity O(r^m)) becomes larger, but the optimal fitness obtained by the EGA decreases at the slowest rate. In summary, the EGA can obtain high-quality solutions within reasonable time (less than 1.5 minutes) for up to 2,000 participating students, which meets the requirements of most real-world applications of cooperative learning group composition. The GM is relatively fast, but the results it obtains deteriorate as the problem size increases. The RM is not an applicable approach because the optimal fitness it derives is less than 0.8 in most of the simulated MCGC problems.
[Figure 5 plots the fitness values (50 concepts) of EGA, GM, and RM against the number of students (50-2,000); y-axis: fitness value, 0.500-1.000.]
Figure 5. Variations of the optimal fitness value derived by the three competing algorithms as the number of participating students increases

5.3 Robustness Analysis
We analyze the robustness of the tested algorithms against repeated runs and different problem scenarios. First, for the two stochastic methods, the EGA and the RM, we run each method 10 times and compute the standard deviation of the optimal fitness derived over the multiple runs. Fig. 6 shows the variations of the standard deviation of the optimal fitness obtained using the EGA and the RM. We observe that the EGA is relatively stable against different numbers of students, while the standard deviation corresponding to the RM fluctuates between 0.06 and 0.14, indicating that the RM is less reliable.
Second, for all three algorithms, we test their robustness against different numbers of concepts. For a given number of concepts, we compute the standard deviation of the optimal fitness values over different numbers of students. Fig. 7 shows the variations of the standard deviation obtained using the three competing methods. We observe that the EGA produces the smallest standard deviation values and is the most reliable method against different numbers of concepts. The GM ranks in the middle, with standard deviation values around 0.04 across the different concept-number scenarios. The RM is the least reliable method, yielding the highest standard deviation values.
[Figure 6 plots the standard deviation (0-0.16) of the optimal fitness for EGA and RM against the number of students (50-2,000).]
Figure 6. Variations of standard deviation of the optimal fitness obtained using EGA and RM
[Figure 7 plots the standard deviation (0-0.1) of the optimal fitness for EGA, GM, and RM against the number of concepts in the subject unit (10-50).]
Figure 7. Variations of standard deviation of the optimal fitness obtained using EGA, GM, and RM
6. Pedagogical use of the EGA approach and future work
This paper presents an EGA approach for assigning small-group learning peers for cooperative learning. Cooperative learning is an important feature of web-based learning, as web technology allows more flexibility for peer interactions and cooperation (Chou & Tsai, 2002; Lou, Abrami, & d'Apollonia, 2001). In the web-based learning context, the instructor may face a large number of students, and it may be quite demanding for the instructor to assign peer learners to each group. Also, when assigning peer learners, the instructor may simultaneously consider various factors, such as achievement and individual differences toward learning. The EGA adopted in this study can meet multiple criteria set up by the instructor and can effectively assign group members on the instructor's behalf. This work is almost impossible to undertake manually when encountering a large sample of students or considering various factors at the same time. This paper presents an example of using students' conceptual understanding for heterogeneous grouping; however, a similar approach can be utilized to assign peer learners based upon their variation in other factors, such as their motivational orientations, social preferences, or learning characteristics, for either heterogeneous or homogeneous grouping. By using the EGA, researchers or instructors can draw on their own theoretical perspectives (such as supporting heterogeneous or homogeneous grouping, or highlighting the importance of motivational factors; Lou et al., 1996) to maximally enhance learning outcomes. In sum, the EGA can meet and adapt to the multiple criteria set by the course instructor and effectively assign appropriate peers. Future studies can use the EGA to model different combinations of peer learners for cooperative learning and then carefully evaluate their learning outcomes. From a theoretical perspective, researchers can examine the effects of different ways of assigning group learners, since how to better conduct cooperative learning has always been a major concern for researchers (e.g., Cohen, 1994; Saleh et al., 2005; Webb et al., 1998, 2002). In practice, the EGA can help teachers in any field, holding any pedagogical perspective, to properly implement cooperative learning either in classrooms or in web-based learning contexts. In other words, the EGA is not limited to one mode of cooperative learning, nor to one pedagogical perspective; rather, it can be used by educators with various backgrounds. Note that the EGA can be applied to the composition of various cooperative learning groups with different combinations of grouping objectives and criteria, based on the instructional requirements and strategies defined by the teachers. Currently, we are applying the novel approach to a variety of online courses with different grouping criteria. Furthermore, we also plan to investigate modeling and solving more complex cooperative learning problems by employing other meta-heuristic approaches.
7. Conclusions
In this paper, we formulated the multi-criteria group composition problem, which models the composition of a set of cooperative learning groups, from a large number of students taking the same online course, to meet multiple grouping criteria specified by the teacher. To cope with the problem, an enhanced genetic algorithm was proposed. A series of experiments has been conducted comparing the performance of the proposed algorithm with those of other approaches. Experimental results showed that the novel approach is able to efficiently and effectively compose cooperative learning groups that fit the tutoring criteria defined by the teachers. To assist teachers in defining the grouping criteria for online courses, a web-based interface has been implemented and integrated with a distance learning system. Our algorithm and system have been shown to be superior to several competing methods and robust against different experimental criteria; however, there are still some limitations: (1) our algorithm does not guarantee that the finally obtained grouping is optimal in terms of the objective value, owing to the limited computational time allowed in practical applications; and (2) although the grouping obtained by our algorithm is better than that obtained by the greedy method, our algorithm is somewhat slower than the latter.
Acknowledgements
This study was supported in part by the National Science Council of the Republic of China under contract numbers NSC 95-2524-S-024-002 and NSC 95-2520-S-024-003.
References
Balady, G. J. (2002). Survival of the fittest - more evidence. New England Journal of Medicine, 346 (11), 852-854.
Beane, W. E., & Lemke, E. A. (1971). Group variables influencing the transfer of conceptual behavior. Journal of Educational Psychology, 62, 215-218.
Chou, C., & Tsai, C.-C. (2002). Developing Web-based curricula: Issues and challenges. Journal of Curriculum Studies, 34, 623-636.
Cohen, E. G. (1994). Restructuring the classroom: Conditions for productive small groups. Review of Educational Research, 64, 1-35.
Dalton, D. W., Hannafin, M. J., & Hooper, S. (1989). Effects of individual and cooperative computer-assisted instruction on student performance and attitudes. Educational Technology Research and Development, 37 (2), 15-24.
Damidis, A. (1994). Review of parallel genetic algorithms bibliography. Technical Report, Aristotle University of Thessaloniki, Thessaloniki, Greece.
Deb, K., & Agrawal, R. B. (1995). Simulated binary crossover for continuous search space. Complex Systems, 9, 115-148.
Ghaith, G. M., & Yaghi, H. (1998). Effect of cooperative learning on the acquisition of second language rules and mechanics. System, 26, 223-234.
Ghaith, G. M. (2002). The relationship between cooperative learning, perception of social support, and academic achievement. System, 30, 263-273.
Goldberg, D. E. (1989). Genetic Algorithms: Search, Optimization and Machine Learning, Reading, MA: Addison-Wesley.
Holland, J. H. (1975). Adaptation in Natural and Artificial Systems, Ann Arbor, MI: University of Michigan Press.
Hernández-Leo, D., Villasclaras-Fernández, E. D., Asensio-Pérez, J. I., Dimitriadis, Y., Jorrín-Abellán, I. M., Ruiz-Requies, I., & Rubia-Avi, B. (2006). COLLAGE: A collaborative Learning Design editor based on patterns. Educational Technology & Society, 9 (1), 58-71.
Hiltz, S. R. (1994). The Virtual Classroom: Learning without Limits via Computer Networks, Norwood, NJ: Ablex.
Hooper, S., & Hannafin, M. J. (1988). Cooperative CBI: The effects of heterogeneous versus homogeneous grouping on the learning of progressively complex concepts. Journal of Educational Computing Research, 4 (4), 413-424.
Hooper, S. (1992). Cooperative learning and computer-based instruction. Educational Technology Research & Development, 40 (3), 21-38.
Hooper, S. (2003). The effects of persistence and small group interaction during computer-based instruction. Computers in Human Behavior, 19, 211-220.
Huber, G. L. (2003). Processes of decision-making in small learning groups. Learning and Instruction, 13, 255-269.
Hwang, G. J. (2003). A Concept Map Model for Developing Intelligent Tutoring Systems. Computers & Education, 40 (3), 217-235.
Johnson, D. W., & Johnson, R. T. (1990). Cooperative Learning and Achievement. In S. Sharan (Ed.), Cooperative Learning: Theory and Research, New York: Praeger, 23-37.
Johnson, D. W., Johnson, R. T., & Smith, K. A. (1991). Active Learning: Cooperation in the College Classroom, Edina, MN: Interaction Book Company.
Johnson, D. W., & Johnson, R. T. (1999). Making cooperative learning work. Theory into Practice, 38 (2), 67-73.
Johnson, S. D., Suriya, C., Yoon, S. W., Berrett, J. V., & Fleur, J. L. (2002). Team development and group processes of virtual learning teams. Computers & Education, 39, 379-393.
Kelley, T. L. (1939). The selection of upper and lower groups for the validation of test items. Journal of Educational Psychology, 30, 17-24.
Keyser, M. W. (2000). Active learning and cooperative learning: understanding the difference and using both styles effectively. Research Strategies, 17, 35-44.
Klein, J. D., & Schnackenberg, H. L. (2000). Effects of informal cooperative learning and the affiliation motive on achievement, attitude, and student interactions. Contemporary Educational Psychology, 25, 332-341.
Klingner, J. K., & Vaughn, S. (2000). The helping behaviors of fifth graders while using collaborative strategic reading during ESL content classes. TESOL Quarterly, 34, 69-98.
Kirschner, P. A. (2001). Using integrated electronic environments for collaborative teaching/learning. Research Dialogue in Learning and Instruction, 2, 1-9.
Lou, Y. P., Abrami, P. C., & d'Apollonia, S. (2001). Small group and individual learning with technology: A meta-analysis. Review of Educational Research, 71, 449-521.
Lou, Y. P., Abrami, P. C., Spence, J. C., Poulsen, C., Chambers, B., & d'Apollonia, S. (1996). Within-class grouping: A meta-analysis. Review of Educational Research, 66, 423-458.
Macdonald, J. (2003). Assessing online collaborative learning: process and product. Computers & Education, 40, 377-391.
Mevarech, Z. A. (1993). Who benefits from cooperative computer-assisted instruction? Journal of Educational Computing Research, 9 (4), 451-464.
Oz, E. (2002). Management Information Systems, Third Edition, Boston: Course Technology.
Porto, M. (2001). Cooperative writing response groups and self-evaluation. ELT Journal, 55 (1), 38-46.
Pragnell, M. V., Roselli, T., & Rossano, V. (2006). Can a Hypermedia Cooperative e-Learning Environment Stimulate Constructive Collaboration? Educational Technology & Society, 9 (2), 119-132.
Rachel, H. L., & Irit, B. N. (2002). Writing development of Arab and Jewish students using cooperative learning (CL) and computer-mediated communication (CMC). Computers & Education, 39, 19-36.
Ramsay, A., Hanlon, D., & Smith, D. (2000). The association between cognitive style and accounting students' preference for cooperative learning: an empirical investigation. Journal of Accounting Education, 18, 215-228.
Sheremetov, L., & Arenas, A. G. (2002). EVA: an interactive Web-based collaborative learning environment. Computers & Education, 39, 161-182.
Slavin, R. E. (1989). Research on cooperative learning: consensus and controversy. Educational Leadership, 47 (4), 52-54.
Smith, K. A. (1996). Cooperative learning: Making groupwork work. New Directions for Teaching and Learning, 67, 71-82.
Srinivas, M., & Deb, K. (1994). Multiobjective optimization using nondominated sorting in genetic algorithms. International Journal of Evolutionary Computation, 2, 221-248.
Srinivas, M., & Patnaik, L. M. (1994). Adaptive probabilities of crossover and mutation in genetic algorithms. IEEE Transactions on Systems, Man, and Cybernetics, 24 (4), 656-667.
Sun, C. T., & Chou, C. (1996). Experiencing CORAL: design and implementation of distant cooperative learning. IEEE Transactions on Education, 39 (3), 357-366.
Swain, M. (2001). Integrating language and content teaching through collaborative tasks. The Canadian Modern Language Review, 58, 44-63.
Saleh, M., Lazonder, A. W., & de Jong, T. (2005). Effects of within-class ability grouping on social interaction, achievement, and motivation. Instructional Science, 33, 105-119.
Tinto, V. (1993). Leaving College: Rethinking the Causes and Cures of Student Attrition, Second Edition, Chicago: University of Chicago Press.
Veenman, S., Benthum, N. V., Bootsma, D., Dieren, J. V., & Kemp, N. (2002). Cooperative learning and teacher education. Teaching and Teacher Education, 18, 87-103.
Webb, N. M., Nemer, K. M., Chizhik, A. W., & Sugrue, B. (1998). Equity issues in collaborative group assessment: Group composition and performance. American Educational Research Journal, 35, 607-651.
Webb, N. M., Nemer, K. M., & Zuniga, S. (2002). Short circuits or superconductors? Effects of group composition on high-achieving students' science assessment performance. American Educational Research Journal, 39, 943-989.
Zurita, G., Nussbaum, M., & Salinas, R. (2005). Dynamic Grouping in Collaborative Learning Supported by Wireless Handhelds. Educational Technology & Society, 8 (3), 149-161.