Building a Transcript of the Future

Benjamin P. Koester
University of Michigan Department of Physics
450 Church St., Ann Arbor, MI 48109
[email protected]

James Fogel
University of Michigan Department of Economics
611 Church St., Ann Arbor, MI 48109
[email protected]
ABSTRACT

The pathways and learning outcomes of university students are the culmination of numerous experiences inside and outside of the classroom, with faculty and with other students, in both formal and casual settings. These interactions are guided by the general education requirements of the university and by the learning goals of the student. The only official record and representation of each student's education is captured by their academic transcript: typically a list of courses described by name and number, grades recorded on an A-F scale and summarized by GPA, degrees awarded, and honors received. This limited approach reflects the technological affordances of a 20th century industrial age. In recent years, scholars have begun to imagine a transcript of the future, perhaps combining a richer record of the student experience along with a portfolio of authentic products of student work. In this paper, we concentrate on the first, and develop analytic methods for improving measures of both classroom performance and intellectual breadth. In each case, this is done by placing elements of individual transcripts in context using information about their peers. We frame the study by addressing basic questions. Were the courses taken by the student difficult on average? Did the individual stand out from their peers? Were the courses representative of a broad intellectual experience, or did the student delve into detail in the chosen field of study? And with whom did they take courses?

CCS Concepts

•Applied computing → Computer-assisted instruction; •General and reference → General conference proceedings; •Networks → Network economics; •Information systems → Data analytics;

This work is licensed under a Creative Commons Attribution International 4.0 License. Copyright is held by the owner/author(s).

LAK '17, March 13-17, 2017, Vancouver, BC, Canada
© 2017 ACM. ISBN 978-1-4503-4870-6/17/03. DOI: http://dx.doi.org/10.1145/3027385.3027418

Harvard University Department of Economics
1805 Cambridge Street, Cambridge, MA 02138
[email protected]

Timothy A. McKay
University of Michigan Department of Physics
450 Church St., Ann Arbor, MI 48109

Galina Grom
University of Michigan Department of Physics
450 Church St., Ann Arbor, MI 48109

William Murdock III
University of Michigan Department of Physics
450 Church St., Ann Arbor, MI 48109
[email protected]

1. INTRODUCTION

An academic transcript summarizes the career of a student. University registrars record and store the grades, degrees, courses, and other credentials that students seek when they enroll [10]. Students may then choose to share them with institutions where they continue their studies, or with potential employers. As such, the transcript serves first and foremost as an official validation that a student did indeed participate in academic activities at the university (took classes, earned a degree). It also forms a learning profile of the student; a representation of the education they have received. The courses, grades received in those courses, a degree, and some indication of the area of study provide our only record of the way each student met the educational goals of the institution (e.g. [19]).

Institutions use their collection of transcripts to better understand the progress of students through their campuses. Individuals charged with evaluating students - typically graduate or professional school admissions teams and potential employers - learn to read the genre of the transcript with care, attempting to glean deeper information about academic performance, intellectual breadth, disciplinary depth, and effort. Often this interpretation of the transcript becomes baroque, creating a narrative from crumbs of information. Students, too, react to the transcript, attending closely to things which we record (like GPA) while discounting those we don't (intellectual breadth or effort).

Clearly the traditional transcript paints an incomplete picture of the student experience both in and out of the classroom ([2]). Hence, many have advocated for an eportfolio [18, 20] combining information about courses and co-curricular activities [9] with content produced by the student, all with the official stamp that accompanies a transcript. For the student, a portfolio that grows over an academic career can engage the individual in reflection on progression toward future goals.
The process of metacognition is known to enhance learning [4]. As we work toward a rich transcript of the future, we begin by asking basic questions. Absent appropriate context, many elements of the traditional transcript are difficult to interpret. For instance, were the courses taken difficult? In what areas, and to what degree, did this individual stand out from their peers? Were the courses taken representative of a relatively broad intellectual experience, and how deeply did this student delve into detail in the chosen field of study? With whom did they take courses - were they isolated with a homogeneous group, or did they interact with a diverse cohort?

Even today, advances in the availability and processing of information allow each transcript to be placed in context, becoming a richer, more informative portrait of a student. Relatively simple calculations, based entirely on current administrative data, can provide substantially refined measures of relative student performance, along with more insight into the nature of the courses taken and diversity of classmates.

1.1 Use of Grades in the Transcript

Letter grades are the common currency of academic success in higher education [23]. While their apparent similarity suggests that all A's are equal, challenges to their effective exchange emerge at many levels. Differences among disciplines in content, evaluative style, grading practice, and strength of students make grades awarded in different classes difficult to compare [12, 14]. Reward systems in higher education use grades as incentives in ways which further distort their utility for both students and faculty [1], and periodic attempts to regularize grades across courses and disciplines have found little success (e.g. [22]). Some authors have been sufficiently troubled to declare the use of grades unethical: "inconsistent grading policies render the competitions for academic awards unfair, deprive people of positions they merit, and leave people, students, institutions, and societies less well off than they would otherwise be." [14]

Insofar as grades continue to be an integral part of the transcript, derivatives of them will continue to be used as summary statistics. The grade point average (GPA), typically a credit-hour weighted average, is the most widespread. On its face, GPA cannot be a measure of how much a student has learned: it does not grow when a student takes more courses. Indeed, it may fall as a student learns more. Imagine the dismay of the 4.0 GPA student receiving their first B+ in a challenging junior-level class. Despite learning more - perhaps a lot more - their GPA has fallen [16]. Instead, GPA serves the same dual purpose present in all grades. It is considered both a summary of a student's mastery of course material and a tool for ranking and locating each student among their peers. Both purposes are undermined by inconsistency of practice: mean grades given out in courses at the University of Michigan vary by more than 25% across the disciplines.

These variations provide false signals, suggesting to all students particular talent in courses with high average grades and a lack of ability in courses with low average grades. Ranking too is harmed. Just as in Meyer's time [17], a student may obtain the same high GPA through exceptional performance in courses giving low average grades or average performance in classes giving high average grades. Perhaps the simplest palliative for grading dispersion is to report a mean course grade alongside each student's grade on a transcript. This approach was adopted, for example, at Dartmouth in 1994 [6], but it does nothing to alter GPA or affect its primary uses.

In the outside world, GPA is also used in several ways. In some contexts, it is seen as a measure of tenacity. Those who achieve a high GPA have demonstrated a sustained ability to meet whatever requirements have been placed on them. In other contexts, GPA is used as a predictor of future performance. Those with high GPAs are expected to perform well in subsequent endeavors. Is GPA adequate for these purposes? Perhaps. Employers, graduate and professional admissions committees, and those administering University honors of various kinds have used GPA for a century, and found value throughout [13]. But there are real reasons for concern, and several alternatives to GPA have been considered, most of which acknowledge the considerable effort expended to assign grades, but suggest that the information is best applied in a relative sense [14].

1.2 Alternatives to GPA

Practical alternatives to the GPA begin with grades as reported by instructors, then utilize information contained in the distribution of grades to refine measures of ranking [25, 15, 5, 11, 3]. Larkey and colleagues [15] were among the first to use basic linear models in an effort to extract measures of student achievement. Caulkins et al. [5] presented several methods that recompute GPA after normalizing student grade distribution according to other students in the same course. In essence, each of these takes existing grading data as-is and calculates a metric that is less sensitive to the peculiarities of grading practices among courses and more indicative of student performance. Perhaps the most sophisticated of these efforts, a Bayesian latent trait formulation of a summary statistic to replace GPA-based measures [12], was considered for use by the faculty at Duke, but ultimately rejected [8]. So far, none of these schemes has entered widespread use. In what follows we consider several possible replacements for GPA.
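The course-normalization idea shared by these proposals can be sketched in a few lines. The following is a minimal illustration (not the implementation used by any of the cited authors) of a Caulkins-style recomputed average, in which each grade is first standardized against the grade distribution of its own course before averaging; the toy records and function names are invented for the example.

```python
from statistics import mean, pstdev

# Toy records: (student, course, grade, credit_hours) -- illustrative only.
records = [
    ("s1", "MATH115", 3.3, 4), ("s2", "MATH115", 2.7, 4),
    ("s3", "MATH115", 3.0, 4), ("s1", "ENG125", 3.7, 3),
    ("s2", "ENG125", 4.0, 3), ("s3", "ENG125", 3.7, 3),
]

# Per-course grade distributions.
by_course = {}
for _, course, grade, _ in records:
    by_course.setdefault(course, []).append(grade)
course_stats = {c: (mean(gs), pstdev(gs)) for c, gs in by_course.items()}

def normalized_average(student):
    """Credit-hour weighted average of within-course z-scores."""
    rows = [(grade, course_stats[c], cr)
            for s, c, grade, cr in records if s == student]
    zs = [((g - mu) / sd if sd > 0 else 0.0, cr) for g, (mu, sd), cr in rows]
    return sum(z * cr for z, cr in zs) / sum(cr for _, cr in zs)
```

In this toy data, a student slightly above the mean in the lower-graded math course can outrank one slightly below the mean in the higher-graded writing course, even when raw GPAs would order them the other way.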

1.3 Courses and Classmates

The transcript also provides insight into the breadth and depth of each student's course of study. Readers of a transcript place the list of courses a student takes into context using a combination of their experience and imagination. Institutions might provide much more precise insight, clarifying how students met their graduation requirements. Institutions might also place the intellectual breadth and disciplinary depth of courses taken by a student in local context, comparing each student's transcript to those of their peers, both within their discipline and across the institution. In an important sense, the list of courses taken by a student provides a measure of their success or failure to meet the goals of a liberal education, regardless of performance. It also encodes information about the student's creativity in meeting course distribution requirements, whether that means exposure to many new topics or detailed study in a few.

Institutions which have access to the transcripts of all their students can use this information to add important context to what they present for each student. They can refine their measures of student performance, perhaps augmenting or replacing GPA. They can also place course-taking and enrollment patterns in context, along with the diversity of individuals with whom the student took classes.

A student taking courses that expose them to peers from many different majors, backgrounds, and identities likely has a much different experience from one who primarily interacts with students like them in course of study, background, or identity. Advances in the use of information technology in education have revitalized the use of data in evidence-based advising and decision making [3]. A new field of Learning Analytics has emerged, working to use data about learners and their environments to understand and optimize learning [24]. Perhaps the time has come to seriously reconsider the ways in which we represent a student's performance in college. In this work, we explore a few small steps toward the transcript of the future. Keeping in mind the intent of the transcript - to accurately summarize and represent a student's education - we explore several alternatives to the GPA and their sensitivities, as well as measures of intellectual breadth and diversity of experience.

2. DATA

The University of Michigan Data Warehouse contains grades, student admissions information, and demography back to Fall 1974. In this work, we are chiefly concerned with grades, which are complete back to at least Fall 1998. Thus, we consider student course data from Fall 1998 through Winter 2014 on the Ann Arbor Campus. This data set includes 4,384,169 grades for 192,987 students, who received grades in 15,571 courses distributed among 318 subjects.

We only consider grades for students that took a course for credit and were still enrolled at the end of term. We ignore withdrawals, incompletes, audits, and pass/fail evaluated courses. For students that took a course multiple times, we consider only the last attempt, mirroring the U-M's standard practice used to compute a GPA. This gives a grand total of 132,223 course-terms over 16 years.

Where we are concerned only with student grades, our student sample is quite liberal. Students who transfer into and out of the University and across its many colleges are retained. Students who arrived on campus before Fall 1998, or who had not yet completed their degrees in Winter 2014, are also included. Some statistics require a student's complete undergraduate record for calculation. These include measures of academic performance at graduation and the diversity of subjects and classmates with whom the student interacted. For these statistics, we further limit our data set to include only students admitted as freshmen since Fall 2005, who took at least 30 courses on campus, and graduated. At Michigan, majors of students are only finalized upon graduation; they are not expected to declare an intended major before matriculation to campus, and usually declare an intent only later in their sophomore year.

As at most universities, instructors award letter grades which are then converted to grade points: an A = 4.0, A- = 3.7, B+ = 3.3, ..., E = 0.
We note that the grade points given for "plus" grades to Business School undergraduates are unusual: A+ = 4.4, B+ = 3.4, and so forth. In this analysis, these are normalized to the grade points given by the rest of the University. We also only consider undergraduate courses at the University of Michigan, which are numbered between 100 and 499. 100 and 200 level courses are mostly introductory lectures, discussions, laboratories, and seminars, and they contain among their ranks the highest-enrollment courses on campus. By contrast, 300 and 400 level courses are typically directed at majors and have smaller enrollments formatted as small lectures, laboratories, and research for credit. Course types and size vary dramatically, ranging in enrollment from 1 to more than 2000 in a term and in structure from apprenticeship to massive lecture. Course content spans the intellectual range of the university, and both instructional style and grading practice vary with discipline.
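As a concrete illustration of the conversion just described, here is a minimal sketch. The source specifies only A = 4.0, A- = 3.7, B+ = 3.3, ..., E = 0 and the Business School's A+ = 4.4, B+ = 3.4 pattern; the remaining table entries and the normalization targets are our assumptions.

```python
# Standard letter-grade -> grade-point table (full table is assumed;
# the paper lists only A = 4.0, A- = 3.7, B+ = 3.3, ..., E = 0).
GRADE_POINTS = {
    "A+": 4.0, "A": 4.0, "A-": 3.7,
    "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7,
    "E": 0.0,
}

# Business School 'plus' grade points mapped back onto the
# university-wide scale (target values are our assumption).
BUSINESS_PLUS = {4.4: 4.0, 3.4: 3.3, 2.4: 2.3, 1.4: 1.3}

def normalize_points(points):
    """Replace a Business School plus-grade value with its standard equivalent."""
    return BUSINESS_PLUS.get(points, points)
```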

2.1 Placing Student Performance in Context

A typical transcript reports a single student's list of courses taken and grades received without context. Little information is provided about what the courses were like, who took and taught them, how they were taught, evaluated, or graded. Lacking this context, the transcript is difficult to interpret. Given that registrars possess information adequate to provide substantial context, we argue that it should be put to use.

Consider the list of the courses a student took and the grades they received. Since courses are listed only by name and number, the reader must guess at content studied, work done, nature of the instructor or classmates, evaluative style, or grading practice. If we consider instead the context available from the full collection of student transcripts, substantially more information is available. It becomes possible to compare this student's performance to others in this class, even to relevant subsets of others (e.g. those who continued on to complete the same major). We can learn more about who takes this class; what they take before or concurrent with it, what they continue on to study, and what degrees they ultimately receive. We can see whether students in this class typically receive grades higher or lower than they get in other courses. Analysis of the full network of courses and student grades allows us to knit together performance comparisons even between students who were never coenrolled in a class. In this context, grades and courses taken may be viewed through a different lens.

Of course the quality of these relative performance measures depends on how deeply each student is embedded in the campus network of coenrollment. As a result, we begin with some descriptive exploration of how many classmates each student in our sample has encountered directly. Undergraduate degree-seeking students need 120 credits to graduate, which typically requires taking between 30 and 50 courses.
Usually, before declaring a major, they take an array of introductory and prerequisite courses which include classes both small (writing courses and seminars) and large (lecture classes with enrollments from 50 to 2000). Most direct comparisons among coenrolled students take place in these large lecture courses. To give some sense for the numbers of classmates encountered by each student during their career, Figure 1 shows the number of total classmates for sets of students that, at any point in their career, took one of a selection of introductory courses at Michigan. These courses are part of the Psychology, English, Chemistry, and Statistics majors, but are taken by large numbers of non-majors as well. Over their careers, the students in our sample accumulate anywhere from a dozen or so classmates to nearly 25,000, with typical values ranging from 7,500 to 12,500. Grades handed out in these head-to-head comparisons form the basis of the grading system and the grade point averages that characterize students' performance over their career.
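The classmate counts summarized in Figure 1 amount to a union of course rosters. A minimal sketch (with invented toy rosters) of how such counts can be tallied:

```python
# Toy course-term rosters: course-term id -> set of enrolled students.
rosters = {
    "PSYCH111-F10": {"ann", "bo", "cy", "di"},
    "ENGLISH125-F10": {"ann", "bo", "eve"},
    "CHEM130-W11": {"cy", "fay"},
}

def unique_classmates(student):
    """Every distinct peer who shared at least one class with `student`."""
    peers = set()
    for roster in rosters.values():
        if student in roster:
            peers |= roster - {student}
    return peers

print(len(unique_classmates("ann")))  # ann met 4 distinct classmates
```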

Figure 2: The distribution of course taking. For the top 40 courses by total enrollment, the Pearson correlation coefficients between taking pairs of courses at the University of Michigan. Lighter shades symbolize higher correlation, and hierarchical "complete linkage clustering" dendrograms arrange courses into groups that indicate similar patterns of course-taking.


Figure 1: Histogram of the total number of unique classmates a student had during their academic career, given that they took one of four selected large courses at the University of Michigan.

Patterns of course-taking also create clusters of students who take a series of large introductory courses together, allowing us to repeatedly compare the performance of these classmates at large nodes in the student network. Figure 2 shows the course correlation matrix for the 40 University of Michigan courses with the largest average enrollments. It is no surprise that courses cluster within subject (e.g. CHEM, PHYSICS), or that there is a general clustering of science courses. There are also courses (e.g. PSYCH 111) with little association to core science courses, and in general no particularly strong correlation with any other high-enrollment courses. This is due in part to the diversity of careers of our students, but also to a fundamentally different distribution of course enrollments: science majors take several core high-enrollment classes together with other science majors. Humanities majors experience much more flexible requirements, and hence are less strongly coupled to either science students or even one another, especially outside their major. As a result, more caution is warranted in the comparison of weakly-coupled students than for those who are richly connected.

Finally, the assignment of grades varies broadly among academic departments of the University. Figure 3 answers the question "what kinds of grades do students receive in the courses where most grades come from?" To construct this figure, we first identify all departments which teach at least 10 courses with average enrollments of at least 50 students. For those departments and their biggest courses, we record the total enrollment and mean grade, aligning the courses from highest to lowest average enrollment and indicating grade in a color scale ranging from red to white as mean grades vary from 2.65 to 3.85 on a four-point scale.

Grade trends depend strongly on department, level, and division. This reconfirms earlier work of [7], which concludes that the use of grades and grading depends strongly on academic discipline. Interestingly, this was less true in the time of [17], when diversity in the application of new grading standards apparently had more to do with individual faculty than disciplines. Today at Michigan, eight of the ten lowest-grading departments are in science, engineering, and math, along with Economics and the Romance Languages. The highest-grading departments are more mixed, with humanities departments making up five of the top ten, joined by others like Biomedical Engineering and the School of Education.
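The correlation matrix of Figure 2 can be reproduced in outline by treating each course as a 0/1 "ever took it" indicator vector over students and correlating pairs of vectors; a clustering routine (e.g. SciPy's complete-linkage hierarchy) would then be applied to the resulting matrix. The enrollment sets below are invented toy data.

```python
import math

# Toy data: course -> set of students who ever took it (illustrative only).
took = {
    "CHEM130": {"a", "b", "c", "d"},
    "CHEM210": {"a", "b", "c"},
    "ENGLISH125": {"d", "e", "f"},
}
students = sorted(set().union(*took.values()))

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

def course_correlation(c1, c2):
    """Correlation between the 0/1 course-taking indicators of two courses."""
    v1 = [1 if s in took[c1] else 0 for s in students]
    v2 = [1 if s in took[c2] else 0 for s in students]
    return pearson(v1, v2)
```

In this toy data the two chemistry courses correlate positively with each other and negatively with the English course, mirroring the subject-level clustering visible in Figure 2.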

2.2 Performance Measures Enriched by Context

The general system of grades described above has long been in place and is not likely to disappear soon. Despite its drawbacks, considerable effort is expended in the determination of grades, so that for most courses grades contain useful information both about characteristics of the course and the relative performance of the students who are enrolled. Given signs that grading standards vary significantly across courses and departments, we are concerned that students are punished for taking "difficult", low-graded courses or incentivized to pursue "easy" courses which tend to assign higher grades. Our challenge then is to leverage existing grades to better understand both grading practices in courses and student performance. To this end, we consider four different metrics of average student performance: traditional GPA, two previously suggested standardizations of the GPA, and a fixed effects model which we introduce for the first time here.

Figure 3: Grades at the University of Michigan. For departments offering at least 10 courses with typical enrollments of at least 50 students, the course catalog number and mean grade (red to white = 2.65 to 3.85 grade points) are shown. From left to right, the top 10 courses are ranked by enrollment, while from top to bottom, departments are ordered by their enrollment-weighted mean grades and color-coded according to division: red for Natural Science and Engineering, blue for Social Sciences, and green for Arts and Humanities.

• Grade Point Average (GPA) is the traditional weighted average of course grades, where the weights are credit-hours. Courses are otherwise assumed to be the same, and no information about course grade distributions is included.

• Grade Points Above Replacement 1 (GPAR1) is a modification of the prescription given in [5]: student grades are compared to the course mean. The original formulation given by Caulkins is modified here so that GPAR1 represents a student's career performance and, like GPA, is a credit-hour weighted average.

• Grade Points Above Replacement 2 (GPAR2), like GPAR1, corrects student grades to the course mean, but then standardizes this difference with the standard deviation of the course. Again, we modify the Caulkins formulation such that a student's final GPAR2 is a credit-hour weighted average of GPAR2s in all courses.

• Student Fixed-Effects (SFE) models every grade given to a student as a linear combination of student and course fixed effects, estimated across the full array of student-course-term records. Student i's grade in course c, term t is written as the sum of a course- and term-invariant student component StudentFE_i, a student-invariant course-term component ClassFE_ct, and an idiosyncratic error term ε_ict:

Grade_ict = StudentFE_i + ClassFE_ct + ε_ict

The coefficients in this model may be estimated by ordinary least-squares techniques.

Our use of the two Grade Points Above Replacement (GPAR) statistics is loosely inspired by Major League Baseball's Wins Above Replacement statistic, which asks "How many more games did your team win with you as a player than they would have with a plausible replacement player?" Here, this question is reformed: "How much higher were your grades than those which would have been received by a plausible replacement student?" In this case, the plausible replacement is the average student in every class you took. Unlike GPA, the two GPAR statistics leverage basic information about the grade distribution in each class. Both take into account the mean grade awarded in the class, ascribing this to the instructor rather than the students, and referencing every student's grade to this local average. GPAR2 also takes the dispersion in grades into account. The intent here is to further account for varying dispersion in grades, but this method may backfire, ascribing extraordinary discriminatory power to a course in which almost everyone receives the same grade.

Both GPAR statistics, like GPA, fail to account for the possibility that students in some courses may be, on average, much more successful students than those in other courses. Our final model jointly estimates student and course effects as parameters in a matrix that couples students to one another through courses taken together. The matrix still contains the full information about the grade distribution in every class, and a solution that returns coefficients encoding the student and course effects should in principle be a more accurate measure of student ability in college courses, at least if the assumptions underlying the fixed effects model are valid. This method does account for the possibility that students taking one course are on average substantially stronger than those taking another.

2.3 Intellectual Depth and Breadth

Collective analysis of course-taking patterns for all students can support a variety of measures of intellectual diversity. The distribution of individuals among a set of groups may be described by a class of diversity indices that take the general form (e.g. [21]):

^qD = 1 / ( Σ_{i=1}^{R} p_i p_i^{q-1} )^{1/(q-1)} = ( Σ_{i=1}^{R} p_i^{q} )^{1/(1-q)}    (1)

This is the inverse of the weighted generalized mean. R is the number of unique groups, and p_i is the proportion of members in group i. Setting q = 2 makes this the inverse of the weighted arithmetic mean; q = 1 corresponds to the weighted geometric mean and reduces to the exponential of the Shannon entropy. Setting q = 0 corresponds to the weighted harmonic mean, which reduces to the total number of groups, R. As q increases, the weight given to the most abundant groups increases. The variety of subjects of the courses found on a transcript is a proxy for the intellectual breadth and depth of the content that a student was exposed to over his or her career. Using the same sample built for the GPA statistics, we compute a subject diversity index with q = 2 for each individual. Here, each subject is considered a group. In much the same way as GPA summarizes grades, the statistic {}^{2}D reduces a complex aspect of the transcript to a single number. As with the simple GPA, this statistic makes no explicit account of the student-student network in which an individual is embedded. While a student may have taken a broad range of courses, he or she could have potentially taken them all with students from the same major. Indeed, the opportunity to interact with individuals from different intellectual backgrounds is a stated goal of many institutions. This compels the construction of measures that capture the diversity of classmates, that is, people with whom the student had the opportunity to interact. We use graduating major to classify each of a student's classmates, again setting q = 2; we call the resulting statistic major diversity. This final statistic involves the computation of a large student-student course coenrollment network (e.g. Figure 1), which is the subject of a forthcoming paper.
To make computation more manageable, we restrict this network to the course coenrollments of students who entered the College of Literature, Science, and the Arts (LSA) since Fall 2005 with any other student at the University.
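The diversity index introduced above is straightforward to compute from a list of group labels. The function below is a minimal sketch; the subject codes are invented for illustration.

```python
import math
from collections import Counter

def diversity_index(groups, q=2):
    """Diversity of order q (a Hill number) for a list of group labels."""
    counts = Counter(groups)
    n = sum(counts.values())
    p = [c / n for c in counts.values()]
    if q == 1:
        # q -> 1 limit: the exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

# A hypothetical transcript with 4 MATH, 3 PHYS, and 1 ENGL course:
subjects = ["MATH"] * 4 + ["PHYS"] * 3 + ["ENGL"]
# q=0 simply counts the subjects; q=2 down-weights the rare ENGL course,
# so the index falls below 3.
```

For subject diversity, `groups` would be the subject code of every course on a transcript; for major diversity, the graduating major of every classmate.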

3. RESULTS

The heterogeneity in grades is evident. These grades depend on an interplay of the term the course was taken, the subject, the strength of the students in the class, and the instructor, among other things. GPA is a ubiquitous statistic of a career, agnostic to everything about courses and students aside from the grades assigned. Given that we know mean grades vary by 25% among departments, it cannot be a perfect estimator of student success. GPAR1 attempts to correct for variations in average course grades by comparing student grades to the course means; it is a first-order correction to GPA. Courses with unusually high or low mean grades should skew this grade-based metric less. GPAR2 offers a second-order correction to this effect: deviation from the mean is measured in units of standard deviation, under the assumption that large deviations where the grade distribution was narrow implicitly contain more information than those where the distribution was broad.

In none of these first three metrics is there an explicit account of the strength of the other students in the course. In simpler terms, a grade received in a course with above- or below-average students should be interpreted differently than one with average students. This information is put to use in the fixed-effects model.

3.1 Comparing Single Metric Performance Measures: GPA, GPAR1/2, SFE

In Figure 4, ∼200,000 final GPAs and SFEs are plotted. Histograms show the distribution of students in each dimension, and individual points show means for courses offered in some departments. SFE, GPAR1, and GPA are all in units of grade points. Increasing GPA generally tracks with increasing SFE, but with considerable scatter. That the centroid of the distribution in SFE does not fall on 0 reflects the strong negative skew of the SFE distribution; the departmental mean SFEs more closely match the centroid of the SFE distribution, as they cluster around zero. The GPA distribution is strongly peaked, due in part to the truncation of the GPA scale at 4.0; SFE experiences no such truncation and is spread more broadly, which reflects this measure's better ability to distinguish individual students from one another. A student at the centroid has a GPA of ∼3.5 and an SFE of ∼0.2. At nearly fixed GPA, departments range from those with higher SFE (Math) to those with lower SFE (English, Psychology). At fixed SFE, GPAs in Math are considerably lower than those in Organizational Studies. While the structure of the two-dimensional distribution is smooth, its composition is rich.

[Figure 4 labeled departments: 1. Biomedical Engineering; 2. Chemistry Department; 3. Department of Afro-American and African Studies; 4. Earth and Environmental Sciences; 5. Electrical Engr & Computer Sci; 6. English Language & Literature Dept; 7. Mathematics Department; 8. Molecular, Cellular, and Developmental Biology; 9. Organizational Studies; 10. Physics Department; 11. Psychology Department]

Figures 5 and 6 show similar comparisons. GPA and GPAR1 track each other well, with considerably less scatter at the centroid than SFE vs. GPA; GPAR1 and GPA are more similar in construction than SFE, and hopefully this implies that SFE carries more information. GPAR1 vs. SFE has a scatter somewhere in between the first two figures. In this comparison, the two metrics track each other, and naively one expects symmetry about the line of equality in the distributions. Instead a tilt is apparent: in the lower-left quadrant, GPAR1 > SFE, and in the upper-right quadrant, GPAR1 < SFE.
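As a sketch of the first- and second-order corrections, the functions below compute a credit-hour-weighted GPAR1 (grade minus course mean) and GPAR2 (grade z-score within the course); the transcript and course grade lists are invented for illustration.

```python
from statistics import mean, pstdev

# Invented data: one student's (course, grade, credit hours), plus the full
# grade list of each course to supply its mean and dispersion.
transcript = [("MATH1", 3.7, 4), ("HIST1", 3.3, 3)]
course_grades = {
    "MATH1": [2.0, 2.7, 3.0, 3.3, 3.7],
    "HIST1": [3.0, 3.3, 3.3, 3.7, 4.0],
}

def gpar1(transcript, course_grades):
    """Credit-weighted average of (grade - course mean)."""
    num = sum(cr * (g - mean(course_grades[c])) for c, g, cr in transcript)
    return num / sum(cr for _, _, cr in transcript)

def gpar2(transcript, course_grades):
    """Credit-weighted average of each grade's z-score within its course."""
    num = sum(cr * (g - mean(course_grades[c])) / pstdev(course_grades[c])
              for c, g, cr in transcript)
    return num / sum(cr for _, _, cr in transcript)
```

Here the student beats the MATH1 mean by more than they miss the HIST1 mean, so both statistics come out positive; a course where nearly everyone receives the same grade would make `pstdev` tiny and inflate GPAR2, the backfire mode noted above.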


Figure 4: GPA vs. SFE for ∼200,000 undergraduate majors at the University of Michigan. Histograms represent total numbers of students in each GPA or SFE range. Points show the means of these quantities for different departments, with a selected few departments labeled.

3.2 Student and Course Effects

The sorting of students by some measure of achievement into different subjects has been commented on many times over the years. The situation with the SFE is no different (Figure 7) in its assessment of the typical students in different subjects. For the top courses in total enrollment, the lowest course-effect courses have students with higher average SFEs. These top courses are a mix of science (23), humanities (16), and social science (13). Introductory science classes cluster in the low mean course fixed-effect, high-SFE quadrant of the plot: CHEM 210, 215 (Organic Chemistry I & II); BIOLOGY 171 (Organismal and Population Biology); PHYSICS 140, 240 (Mechanics; Electricity and Magnetism); and MATH 115, 116, 215, 216 (Calculus I-IV), as well as ECON 101, 102, 401 (Micro, Macro, and Intermediate Macroeconomics) and ACC 271 (Accounting). Introductory science labs (BIOLOGY 173, CHEM 211, CHEM 216, PHYSICS 141, PHYSICS 241) also contain these high-SFE students, but have higher mean course effects, as already noted in [1]. In the low-SFE, high-course-effect quadrant are ENGLISH 125, 223, 225 and WOMENSTD 220. In general, the course effects are not unexpected given the grading patterns in Figure 3. However, we caution against interpreting the SFE as an independent, intrinsic quality of a student, or the course effect as a measure of the difficulty and rigor of a course. Student-course interaction is one source of confusion in this picture. Comparing CHEM 210 and CHEM 211, a naive expectation is that the students taking this lecture/lab combination are identical. They have very similar SFEs, but receive quite different grades.

Figure 5: GPAR1 vs. GPA for ∼200,000 undergraduate majors at the University of Michigan. Contours represent the density of students in each range, and labeled dots show the means of these quantities for various majors.

Figure 6: GPAR1 vs. SFE for ∼200,000 undergraduate majors at the University of Michigan. Contours represent the density of students in each range, and labeled dots show the means of these quantities for various majors.

Figure 7: The correlation between student effect and course effect for high-enrollment courses at the University of Michigan. For each course, the student and course effects averaged over all terms are plotted for Natural Sciences (red), Humanities (black), and Social Sciences (blue).

3.3 Subject and Major Indices

Figure 8 shows the subject diversity (the diversity of the subjects one studied) and major diversity (the diversity of the majors of one's classmates) indices for all LSA students. The mean of each index, its standard error, and the number of students are given. In Table 1 the indices are listed for a select set of majors. Several majors, especially Physics and Chemistry, sit below the mean subject diversity, while Business Administration stands out as particularly diverse; the sheer number of these students with high subject diversity pushes the overall distribution higher. Physics BS and Chemistry BSChem stand out as high in major diversity, while Psychology BA and Business Administration sit on the low end of the distribution.

4. DISCUSSION

Our purpose in this paper is to consider ways to use the full collection of student transcripts to add context to each. We have suggested new ways to estimate performance and an initial method for measuring the intellectual diversity of a student's experience. We conclude by considering a few practical questions that would emerge. We examine the re-ranking of students within and among departments that would occur as a consequence of alternative performance metrics, and interpret the observed variation in intellectual diversity that emerges college-wide.

Table 1: Subject and Major Indices by Major. Diversity indices are tabulated for selected University of Michigan majors that comprise the extremes of the distributions.

MAJOR            | Subject D (q=2) | Major D (q=2)  | N
Psychology BA    | 4.7 +/- 0.028   | 16.4 +/- 0.088 | 3169
Psychology BS    | 5.74 +/- 0.087  | 19.8 +/- 0.27  | 352
English BA       | 4.21 +/- 0.034  | 17.4 +/- 0.14  | 1979
Economics BS     | 5.68 +/- 0.083  | 19.2 +/- 0.25  | 452
Bus. Admin. BBA  | 12.6 +/- 0.066  | 13.9 +/- 0.091 | 3370
Physics BS       | 3.63 +/- 0.096  | 23.8 +/- 0.68  | 124
Mathematics BS   | 5.42 +/- 0.069  | 20.9 +/- 0.26  | 663
Chemistry BSChem | 3.5 +/- 0.074   | 25.3 +/- 0.49  | 161

Figure 8: Subject and Major Indices for graduating LSA majors. The mean and its standard error are given for each distribution (Major Index: 19.3 +/- 0.0334; Subject Index: 6.48 +/- 0.017), as well as the number of students (N = 37127).

4.1 Intra-Department Performance

For three departments in LSA (Figure 9) we consider graduates of those departments in Winter 2012. Each appears with an anonymous ID in both the left column (final cumulative GPA, left axis) and the right column (SFE, right axis). A single student is connected between the columns by a line. Crossing of lines indicates re-ranking within the department. Low Spearman rank correlations (bottom) indicate a greater amount of re-ranking. Within a department, drastic re-ranking rarely occurs, with most of the shuffling happening among close neighbors. At least one exception to this trend is student 248591 in Philosophy (middle panel), who fell from the top third of the class in GPA to nearly dead last in SFE. The transcript reveals that this individual earned only about half of their credits at Michigan, all in two years. Most of the courses were 300- and 400-level with relatively high mean grades (3.1 ≤ g ≤ 3.7). This student earned a relatively high GPA by receiving below-average grades in courses which awarded high mean grades.

4.2 GPA Error

Gross re-ranking within a department is in part a consequence of unusual transcripts, and we hypothesize that local re-ranking is mostly 'noise'. Ultimately, GPA and similar measures have an error associated with them that is part systematic and part statistical. The systematic component includes things like a student's major, which courses a student took, and when they were taken, while the statistical component is driven by how well sampled the student's career is; the latter should approach zero as the number of courses taken goes to infinity. Bootstrap resampling provides a simple insight into the magnitude of statistical uncertainty in GPA. For each student, we use bootstrap resampling of courses (N = 100) to compute a bootstrap mean GPA and error. The median standard error on the cumulative GPA for graduates is 0.058. Higher GPAs necessarily have lower standard errors. This means that in practice, GPAs which differ by less than 0.05 grade points are statistically indistinguishable. This reality is never acknowledged by our system of awards, which attends carefully to the meaningless third decimal place in GPA.
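A minimal sketch of this resampling follows, with an invented ten-course transcript; unlike the paper's credit-hour-weighted GPA, this version weights all courses equally.

```python
import random
from statistics import mean, stdev

def gpa_bootstrap_se(grades, n_boot=100, seed=0):
    """Bootstrap standard error of an (unweighted) GPA.

    Resample the course grades with replacement n_boot times and take the
    standard deviation of the resampled means.
    """
    rng = random.Random(seed)
    means = [mean(rng.choices(grades, k=len(grades))) for _ in range(n_boot)]
    return stdev(means)

# A hypothetical ten-course transcript on the A-F (4.0) scale:
grades = [4.0, 3.7, 3.3, 4.0, 3.0, 3.7, 2.7, 3.3, 4.0, 3.7]
se = gpa_bootstrap_se(grades)
```

For a short transcript like this one, the standard error is on the order of a tenth of a grade point, which is why differences in the second or third decimal place of a GPA carry no statistical meaning.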

4.3 College Honors

GPA forms the basis for traditional University or College Honors: students are ranked by GPA and selected accordingly. One concern with this system is that if students accumulate most of their GPA in departments that assign high grades, honors will be biased. Figure 3 suggests that in the highest-enrollment classes, which often reach the 400 level, there are grading trends among and within departments. In U-M's College of Literature, Science, and the Arts, academic honors (called "distinction") are awarded on the basis of GPA. If this ranking were done by other means, these honors would go to different students. Figure 9 hints at the ways distinction might change if we ranked by SFE instead of GPA. The color of the lines indicates how students in departments are re-ranked when LSA graduates are ranked by SFE instead of GPA: students in Physics generally receive higher rankings, those in History lower, and those in Philosophy a mix. The alternative ranking scheme does indeed reshuffle the assignment of distinction. At one extreme of the reordering is English, in which the number of normal, high, and highest distinction students goes from 30, 9, and 9 to 20, 8, and 0. At the other end is Mathematics, which goes from 13, 10, and 4 to 22, 11, and 15.
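The effect of swapping the ranking criterion can be sketched with invented scores and a hypothetical top-3 cutoff for distinction:

```python
# Hypothetical GPA and SFE scores for six students; all values invented.
gpa = {"a": 3.95, "b": 3.90, "c": 3.80, "d": 3.75, "e": 3.60, "f": 3.50}
sfe = {"a": 0.20, "b": 0.45, "c": 0.10, "d": 0.50, "e": 0.30, "f": -0.10}

def top_k(scores, k):
    """IDs of the k highest-scoring students."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

# Students who gain or lose a top-3 "distinction" under an SFE ranking:
gained = top_k(sfe, 3) - top_k(gpa, 3)   # {'d', 'e'}
lost = top_k(gpa, 3) - top_k(sfe, 3)     # {'a', 'c'}
```

Even with perfectly correlated central tendencies, students near the cutoff trade places, which is exactly the reshuffling seen between English and Mathematics above.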


Figure 9: The re-ranking of students by department and college: Physics, Philosophy, and History, Winter 2012. The final GPA (left) and SFE (right) for students who graduated with an undergraduate degree in Winter 2012 are shown. Anonymized ID numbers match particular students to the text. Lines connect a student between the two columns and aid in tracking re-ranking within a department. Blue lines indicate that a student was ranked higher in the College of LSA using SFE as a ranking criterion than GPA; red lines indicate that they fared worse under an SFE ranking. The Spearman rank correlation for intra-department GPA and SFE ranks is given at the bottom as well: lower rank correlations indicate a greater degree of intra-departmental re-ranking.

4.4 Subject and Major Diversity

As with grading, degree pathways and requirements vary between departments in LSA, and certainly in the University beyond. Graduation requirements exist in part to encourage intellectual diversity in study, and this places a constraint on how much diversity (or lack thereof) is present in a transcript. For instance, Business Administration BBA students have high subject diversity, but upon closer inspection it turns out that this is one of the few undergraduate programs for which multiple subjects exist within a single school: the subjects FIN (finance), STRATEGY, MKT (marketing), ACC (accounting), and several others are all exclusive to the business school. This is in contrast to, for instance, mathematics, where all courses are designated MATH. For this reason, a better measure of subject diversity might depend only on the department that 'owns' a course. Until then, the subject indices may be best compared only within departments, not across them.

4.4.1 Major Trends

Individually, the lowest subject diversity indices come from students graduating with degrees in Dance or Art and Design, likely after a transfer from LSA; in fact this describes the bottom 10 in the list. The lowest subject index was 1.40, for a student who took over 50 courses in DANCE, with the remaining 11 coming from ENGLISH, WOMENSTD, and an array of single courses in other subjects. Interestingly, this student's major diversity index was 23.3, about 4 points above the mean, which indicates that a broad array of students from other majors were classmates with this individual. Business Administration students were the highest in subject diversity, with one individual at 24.4. However, 20 of this student's courses come from 13 subjects that are owned by the School of Business. In these situations, subject diversity becomes a measure of intra-department breadth. This same student's major diversity index was 18.46, below the University-wide mean.

4.4.2 Outlier Careers

Within a department, these measures are more standardized and comparable. Overall, subject and major diversity show a mild (ρ = −0.14) anti-correlation with cumulative GPA and no correlation with SFE (not shown): our diversity indices provide new, nearly orthogonal information. What is it telling us? As an example (for its large number of students) we consider Psychology BA graduates. For the 10 highest-SFE students, the subject indices range from 3.4 to 7.4. The two students at these boundaries had major indices of 19.1 and 19.2. The student with the lower subject index took 37 courses from a total of 11 subjects: 18 PSYCH, 15 SW (Social Work), 7 WOMENSTD, and an array of others. The higher index comprises 31 courses in 19 different subjects: 10 in PSYCH, and the others spread across the academic spectrum. Neither double-majored. At the other end of the Psychology BA SFE spectrum (SFE < −0.75), students range from 4.8 to 8.8 on the subject diversity index, and from 16.7 to 24.8 on the major index. The individual at 4.8 took 33 courses in 15 subjects: 14 in PSYCH, 3 in FRENCH, and the others spread across other subjects with 1 or 2 instances.

This is in contrast to the 8.8 student, who took 37 courses across 19 subjects: 10 in PSYCH, 4 in ASIANLANG, 4 in POLSCI, and 3 in COMM. These two students had major diversity indices of 19.1 and 21.1, respectively.

4.5 Conclusions

This paper explores ways in which existing student transcript information might be placed in context using straightforward techniques. Rather than advocating for any of these measures in detail, we prefer to promote the idea of enriched transcripts, and to encourage the community to consider other ways in which we might use existing information to better represent the experience of students on college campuses.

5. ACKNOWLEDGMENTS

This work has been supported by the NSF WIDER grant DUE-1347697 for the REBUILD project, by NSF TUES grant DUE-1245127, and by the University of Michigan Provost's Learning Analytics Task Force through the Learning Analytics Fellows Program. We thank Kar Epker for preliminary work which inspired the diversity analysis. We also thank the U-M Registrar Paul Robinson, the Office of the Registrar, U-M CIO Laura Patterson, and all the staff at the U-M Information Technology Services division for both maintaining and supporting access to this remarkable data set. Finally, we acknowledge the important contributions of former U-M Provost Phil Hanlon and current U-M Provost Martha Pollack. Their strong advocacy of appropriate research using student record data has made learning analytics at Michigan possible. This research has been determined exempt from human subjects control under exemption #1 of 45 CFR 46.101(b) by the U-M Institutional Review Board (HUM00079609).

6. REFERENCES

[1] A. C. Achen and P. N. Courant. What are grades made of? The Journal of Economic Perspectives, 23(3):77, 2009.
[2] A. W. Astin. What matters in college: Four critical years revisited. Jossey-Bass, 1993.
[3] M. A. Bailey, J. S. Rosenthal, and A. H. Yoon. Grades and incentives: assessing competing grade point average measures and postgraduate outcomes. Studies in Higher Education, 41(9):1548–1562, 2016.
[4] J. D. Bransford, A. L. Brown, and R. R. Cocking, editors. How people learn: Brain, mind, experience, and school. National Academy Press, 1999.
[5] J. Caulkins, P. Larkey, and J. Wei. Adjusting GPA to reflect course difficulty. Working Paper, Carnegie Mellon University, The Heinz School of Public Policy and Management, 1996. Retrieved Jan. 17, 2017 from http://repository.cmu.edu/heinzworks/42/.
[6] Dartmouth Office of the Registrar. Median Grades for Undergraduate Courses, 2015. Retrieved Jan. 19, 2017 from http://www.dartmouth.edu/ reg/transcript/medians/.
[7] R. D. Goldman, D. E. Schmidt, B. N. Hewitt, and R. Fisher. Grading practices in different major fields. American Educational Research Journal, 11(4):343–357, 1974.
[8] B. Gose. Duke rejects plan to alter calculation of grade-point averages. The Chronicle of Higher Education, March 1997.
[9] J. Gutowski. Co-curricular transcripts: Documenting holistic higher education. The Bulletin, 34(5), 2006. Retrieved January 19, 2017 from http://www.acui.org/publications/bulletin/article.aspx?issue=306id=1900.
[10] J. Hope. Support campuswide educational goals with transcript enhancements. The Successful Registrar, 16(7):1–5, 1 Sept. 2016.
[11] V. E. Johnson. An alternative to traditional GPA for evaluating student performance. Statistical Science, pages 251–269, 1997.
[12] V. E. Johnson. Grade inflation: A crisis in college education. Springer Science & Business Media, 2006.
[13] E. R. Julian. Validity of the Medical College Admission Test for predicting medical school performance. Academic Medicine, 80(10):910–917, 2005.
[14] C. Knapp. Assessing grading. Public Affairs Quarterly, 21(3):275–294, 2007.
[15] P. D. Larkey and J. P. Caulkins. Incentives to fail. H. John Heinz III School of Public Policy and Management, Department of Social and Decision Sciences, Carnegie Mellon University, 1992.
[16] J. Lorkowski, O. Kosheleva, and V. Kreinovich. How to modify grade point average (GPA) to make it more adequate. In International Mathematical Forum, volume 9, pages 1363–1367, 2014.
[17] M. Meyer. The grading of students. Science, 28(712):243–250, 1908.
[18] R. Miller and W. Morgaine. The benefits of e-portfolios for students and faculty in their own words. Peer Review, 11(1):8, 2009.
[19] Association of American Colleges and Universities. The LEAP vision for learning: Outcomes, practices, impact, and employers' views, 2011. Retrieved Jan. 19, 2017 from https://www.aacu.org/publications-research/publications/leap-vision-learning-outcomes-practices-impact-and-employers.
[20] Association of American Colleges and Universities. E-portfolios. Retrieved Jan. 19, 2017 from https://www.aacu.org/eportfolios.
[21] S. E. Page. Diversity and Complexity. Primers in Complex Systems. Princeton University Press, 2010.
[22] Princeton Office of the Dean of the College. Grading at Princeton, 2015. Retrieved Jan. 19, 2017 from http://odoc.princeton.edu/faculty-staff/grading-princeton.
[23] J. Schinske and K. Tanner. Teaching more by grading less (or differently). CBE-Life Sciences Education, 13(2):159–166, 2014.
[24] G. Siemens and P. Long. Penetrating the fog: Analytics in learning and education. EDUCAUSE Review, 46(5):30, 2011.
[25] J. W. Young. Adjusting the cumulative GPA using item response theory. Journal of Educational Measurement, pages 175–186, 1990.