Professional Psychology: Research and Practice, No. 1, 42-51

Copyright 1987 by the American Psychological Association, Inc. 0735-7028/87/$00.75


Computerized Psychological Testing: Overview and Critique

Michael J. Burke
Department of Management and Organizational Behavior, New York University

Jacques Normand
Federal Reserve Board, Washington, D.C.

We present an overview and a critique of computerized psychological testing and assessment. Emphasis is placed on describing computer testing systems currently in place, discussing considerations (factors) in developing a computerized psychological testing system, examining the research on potential benefits and problems associated with computerized psychological testing, and discussing the need for the adoption of a set of guidelines, both scientific and ethical, for computerized psychological testing. We conclude that computerized psychological testing systems have the potential of being practical, cost-effective, and psychometrically sound means of assessing individuals. The potential of computerized psychological testing can be realized if proper considerations are made in designing, developing, and implementing these testing systems, and if professional standards (guidelines) are adhered to by computer test service providers and users. Before the adoption of computerized psychological testing becomes widespread, a number of serious issues deserve the attention of professionals.

Because of the range of issues relating to computerized psychological testing, an in-depth presentation of all the issues is simply beyond the scope of this article. We therefore attempted to provide numerous references on most issues and topics to benefit the reader who is interested in these areas. References to specific product lines and services offered by commercial vendors are omitted.

Over the past two decades, the use of computers in psychological assessment has steadily increased. During this time, computers have assisted in administering and scoring tests as well as in providing interpretive reports. Some of the early work related to the use of computers in psychological assessment, for a variety of different purposes, was reported by Gedye (1968), Kleinmuntz and McLean (1968), Lang (1969), and Stillman, Roth, Colby, and Rosenbaum (1969). Sampson (1983) reported that the use of computers in the testing field began when large time-sharing computers were used to optically scan, score, and profile standardized tests. This was followed by the addition of interpretive narrative reports to these systems. More recently, tests have been administered via computer terminals. Systems for administering tests via computer terminals also incorporate test scoring and interpretation components. A number of different types of computer systems for administering, scoring, and interpreting tests are commercially available. Our purpose is to provide an overview of existing computerized psychological testing systems, to describe some of the considerations in developing and implementing a computer testing system, to discuss the potential benefits and problems associated with computerized psychological testing, and to discuss the need for strict adherence to a set of standards for the administration and interpretation of computerized psychological tests.

Types of Computer Testing Systems

In this section, brief descriptions of computer testing (assessment) systems are provided. Our intent is not to describe all possible computer testing systems, but to expose the reader to some of the better known and more widely used computer testing systems. (For summaries of the early automated testing work involving the computerization of traditional tests, see Denner, 1977, and Thompson & Wilson, 1982.)

Optical Scanners

One of the oldest computer testing systems in psychology and education is one in which the mark-sensing optical scanner is used. In this system, light technology is used to detect carbon pencil marks at specific coordinates on special answer sheets. These carbon pencil marks are then translated to data codes (via a dossier language program), and the data are then stored on an output medium (e.g., hard disk, tape). All further data manipulations are usually performed by application programs (software). The cost associated with optical scoring and interpretation services varies with respect to the type of test being scored, the nature of the output desired (raw scores, profiles, or interpretive reports), and the type of mailing service used. Although it is a relatively economical option in comparison with hand-scoring and profile plotting by a skilled professional, a practical disadvantage is the considerable delay in processing time.
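To make the scoring step concrete, the following minimal sketch (Python) shows the kind of application program that might process scanner output into raw scores; the fixed-width record layout, field positions, and answer key are hypothetical rather than taken from any particular scanner or test.

```python
# Minimal sketch: scoring records produced by an optical scanner.
# The fixed-width record layout and the scoring key are hypothetical,
# not drawn from any particular scanner or test.

ANSWER_KEY = "BDACB"  # hypothetical five-item key

def parse_record(line: str) -> tuple[str, str]:
    """Split a fixed-width scanner record into an examinee ID and responses."""
    examinee_id = line[:6].strip()            # columns 1-6: examinee ID
    responses = line[6:6 + len(ANSWER_KEY)]   # following columns: marked options
    return examinee_id, responses

def raw_score(responses: str, key: str = ANSWER_KEY) -> int:
    """Count matches between marked responses and the answer key."""
    return sum(1 for marked, keyed in zip(responses, key) if marked == keyed)

if __name__ == "__main__":
    scanner_output = [
        "000123BDACB",   # all items correct
        "000124BDCCA",   # three items correct
    ]
    for record in scanner_output:
        examinee, resp = parse_record(record)
        print(examinee, raw_score(resp))
```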

MICHAEL J. BURKE received his PhD in psychology from the Illinois Institute of Technology in 1982. He is Assistant Professor of Management at New York University and is presently conducting research in the areas of utility analysis, validity generalization, and personnel selection. JACQUES NORMAND received his PhD in industrial/organizational psychology from the Illinois Institute of Technology in 1982. He is currently Senior Personnel Analyst at the Board of Governors of the Federal Reserve System in Washington, D.C. His research interests include personnel selection, psychometrics, and survey research. CORRESPONDENCE CONCERNING THIS ARTICLE should be addressed to Michael J. Burke, 600 Tisch Hall, Department of Management and Organizational Behavior, New York University, 40 West 4th Street, New York, New York 10003.


Microcomputer/Table-Top Scanner Combination

A recent innovation in scanning technology is the combination of a microcomputer-printer and a table-top scanner. The scanner uses transmitted light technology to read pencil marks made on a special scannable form, and the microcomputer works with the scanner to pick up these optical mark readings, process the information, summarize the results, and communicate them to the user via a printer or another auxiliary output medium. Currently, software packages are available for scoring and generating interpretive reports for some of the more commonly administered psychological tests. Overall, this type of stand-alone system provides the user with on-site capability for entering and processing test data. By combining a relatively inexpensive microcomputer with a modern version of the scanner, the measurement service provider can score large numbers of tests in a short time period and have a complete scoring system at its disposal. In an industrial application of such a system, American Telephone and Telegraph (AT&T) reduced the time required to process 500 data forms into a data base from 7 days to 1 hour (Nardoni, 1983).

Stand-Alone Microcomputer Testing Systems

These types of systems are self-contained, computerized psychological testing systems that usually administer tests, score tests, and provide complete printed evaluations (reports). In addition to DEC and IBM equipment, the personal computer product line includes a wide range of name-brand hardware such as Apple, COMPAQ, and KayPro, to name a few. Currently, testing software packages (programs) are available for these types of hardware. Furthermore, these systems are not limited to only psychological testing tasks. Initially, microcomputer systems were marketed as hardware-software bundled packages. Recently, various test publishers have made available test software packages that are compatible with a wide range of personal computers. Thus measurement (testing) service providers can now purchase their hardware and software separately and are no longer restricted to a single vendor when acquiring new software.

Multiuser Minicomputer Systems

In this type of system, a medium-range minicomputer-based system with multiuser and multitask capabilities is used. A multiuser minicomputer system can become a small networking system by allowing remote users to use telecommunication channels to obtain access to the minicomputer. Some of the commercially available systems are capable of being expanded to support up to 16 remote cathode-ray tube (CRT) terminals. This type of system is intended for those who seek to automate not only the testing process but additional office management functions such as appointment scheduling, client billing, word processing, and so on.

Computer Networks

The initial steps toward computer networking began with the introduction of time-sharing systems and the development of data communications technology in the late 1950s. Time-sharing networks (systems) involve a large central computer acting as a host to several remote terminals. As the demand for this service increased, the single central computer was replaced with multiple connected computers. In addition, as microcomputer technology developed, individual processors were programmed to communicate with neighbor processors and data base files. This resulted in easier software development and improved tolerance for systems failures (Digital Equipment Corporation, 1974). Commercially available applications of networking technology for psychological assessment services vary greatly, depending on the communication medium, the hardware (equipment), and the capability of the software (computer programs) available on the processing host computer. Different networking systems permit the user to input combinations of either total test scores or item responses, or allow the test taker to answer individual test items via a computer terminal keyboard. The associated network charges vary with regard to the type of telecommunication lines, length of time that the user is connected, specific test being inputted, and type of output desired (raw scores, profiles, or interpretation reports).

As indicated in this discussion, each of the reviewed computerized testing systems offers an array of possible arrangements. Each of these arrangements raises different issues for the test service provider, ranging from ethical to cost-effectiveness considerations. In the following discussion, we address some of the major factors that a test service provider may consider when assessing the feasibility, quality, and practicality of a computerized psychological testing system.

Considerations (Factors) in Developing a Computerized Testing System

Before implementing a computerized testing system, it is highly recommended that one develop a standard outline for documenting desired specifications. This standardized documentation process will facilitate the overall implementation of the system. Cole, Johnson, and Williams (1975) provided a good description of how the desired specifications were documented at the Salt Lake City Veterans Administration Hospital before that computerized testing system was implemented. We recommend that the documentation for specifications for computer testing be organized with respect to four design considerations: (a) systems specifications, (b) equipment specifications, (c) programming specifications, and (d) data record and procedure specifications.

The system specification considerations concern the nature of the jobs of those who could be affected by the system and reflect the needs for which the system will be implemented. It is important that these specifications focus on the assessment (testing) process. If possible, it may be advantageous to involve individuals in the measurement service office who play a role in processing clients. One can accomplish this by interviewing these individuals and having them participate in setting specifications for automating their work. For instance, specifications could be set for such job duties as collecting and recording demographic data, gathering interview information, scheduling testing sessions, administering tests, scoring tests, summarizing test results, reviewing test results, and providing orientation or feedback to clients.


Specifications for computerizing these duties or tasks would lead to several basic systems requirements and would serve as the basic documentation for system design. This information would provide answers to questions such as whether assessment instruments would have to be administered via CRT terminals, whether test data would have to be evaluated and interpreted on-line, how large the required data base capability to store all test data would be, and whether other office functions would have to be automated.

The equipment specifications for implementing a computerized testing system would be dictated by factors such as the flow and number of clients tested during a specific time period, the number of different testing sites, the complexity of item presentation (graphics capability), and the complexity of test scoring schemes. On the basis of such factors, one may find that the equipment specifications will have to incorporate a central processing unit, a large memory capability, communications equipment, and a particular type and number of computer terminals.

In regard to program specifications, an initial determination is whether one will make the software investment by buying off-the-shelf programs or by possibly spending substantial amounts of time and labor to write such programs. When one is designing or selecting software programs, it would be helpful to consider the compatibility of the hardware and software, the appropriateness of the software (e.g., psychometric soundness of algorithms), the ease of operation (e.g., ergonomics), the appearance of the system to the tester (e.g., user friendliness), and the stability of the system (e.g., system capabilities and freedom from failure). Beaumont (1981) provided insightful comments with respect to some of these considerations. In addition, if an organization is designing its own software programs, some thought will have to be given to what would be the most efficient language for the software, to whether the software should be modular to allow for expansion when new applications are needed, to what software design should be used, and possibly to whether test publishers are willing to engage, when appropriate, in agreements to computerize their tests.

Concerning the data record and procedure specifications, all system users should specify what individual client information is to be collected. The record length of individual records, nature of the data codes (e.g., alphanumeric or numeric), data manipulation procedures, type of data storage medium to be used, and the length of time for which the data are to be saved must be specified. System designers should also determine whether their data records are to be updated on a regular basis. Providing these specifications for data records and procedures will undoubtedly assist in the designing of an effective and efficient computer testing system.

In addition to these technical specifications, steps must be taken toward determining how psychologically ready the organization is to accept a computerized testing system. A possible means of determining this before the development of systems specifications is a needs-assessment survey completed by the testing staff. Byrnes and Johnson (1981) noted that along with evaluating an organization's readiness, computer system developers need to develop and implement a planned change strategy as well as to recognize the possibility of staff resistance. In essence, Byrnes and Johnson recommended a systematic change strategy to maximize staff acceptance and support for the computer testing system. Byrnes and Johnson's recommendations are consistent with the research literature on organizational change and development (cf. Daft, 1983; Katz, Kahn, & Adams, 1980).
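As one way of making the data record and procedure specifications discussed above concrete, the sketch below defines a hypothetical client test record with explicit field lengths, code sets, and a validation step; every field name, code, and length is illustrative only, not drawn from an actual system.

```python
# A hypothetical client test record, sketching the kind of data-record
# specification discussed above; every field name, code set, and length
# is illustrative rather than drawn from any actual system.
from dataclasses import dataclass, field
from datetime import date

VALID_TEST_CODES = {"MMPI", "WAIS", "INTEREST"}  # alphanumeric data codes (assumed)
RETENTION_YEARS = 7                              # how long records are kept (assumed policy)

@dataclass
class ClientTestRecord:
    client_id: str                 # fixed-length alphanumeric identifier
    test_code: str                 # which instrument was administered
    administration_date: date
    raw_responses: list[int] = field(default_factory=list)  # numeric item codes
    scored: bool = False

    def validate(self) -> None:
        if self.test_code not in VALID_TEST_CODES:
            raise ValueError(f"Unknown test code: {self.test_code}")
        if len(self.client_id) != 8:
            raise ValueError("Client ID must be exactly 8 characters")

# Example of the accompanying procedure specification: validate before storing.
record = ClientTestRecord("AB123456", "WAIS", date(1986, 5, 1), [3, 1, 4, 2])
record.validate()
```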

Potential Benefits

As noted, computerized assessment systems are rapidly becoming management tools for those involved with testing, counseling, and guidance programs. For all computer testing systems, with the exception of those in which only optical scanners are used, the computer is capable of administering and scoring tests by means of interactive terminals. A number of potential benefits to the test taker, the test practitioner, and the management official may result from the use of a computer, in comparison with the conventional (paper-and-pencil) testing system. These potential benefits or advantages of computerized testing are discussed in the sections that follow.

Acceptability to Clients

An argument against computerized psychological assessment, primarily in the clinical realm, is that it is depersonalizing to the client. The argument is that the client is an object of automated manipulations that interfere with the counseling process and increase the client's isolation. On the contrary, the research literature suggests that clients tend to react favorably to computerized testing or interviewing sessions (cf. Bresolin, 1984; Erdman, Klein, & Greist, 1985; Greist, Klein, Van Cura, & Erdman, 1975; Klingler, Johnson, & Williams, 1976; Klingler, Miller, Johnson, & Williams, 1977; Lucas, 1977; Lushene, O'Neil, & Dunn, 1974; Space, 1981). In addition, with respect to computer interviewing, several studies have suggested that as subject matter becomes more sensitive, clients report a greater appreciation for the computer interviewing session (Greist, Klein, & Van Cura, 1973; Greist, Van Cura, & Kneppreth, 1973; Slack & Slack, 1977). Bartram and Bayliss (1984) and Weizenbaum (1976) commented on the ease with which individuals establish rapport with the computer as well as the design of program software to achieve such a rapport.

Skinner and Allen (1983) investigated the hypothesis that individuals may provide more accurate information about sensitive areas to a computer than they would in a face-to-face interview or in a self-report questionnaire. Histories of alcohol, drug, and tobacco use were collected for 150 clients who had been randomly assigned to one of three conditions (a computerized interview, a face-to-face interview, or a self-report questionnaire session). No important differences were found in reliability across the three assessment formats. The computerized interview, however, was rated as less friendly but shorter, more relaxing, and more interesting than the face-to-face or self-report formats. Detailed analyses of client factors within each assessment revealed that the computerized interview was most acceptable to individuals with good visual-motor performance skills and least preferred by better educated and defensive individuals. Although the computerized interview was found to be more acceptable, these results do not necessarily indicate that Skinner and Allen obtained more accurate information.


As with the studies mentioned earlier, Skinner and Allen's study was conducted on a clinical patient population, and it is therefore difficult to generalize these findings to other counseling and guidance settings. More research or information on client acceptance of computerized testing in other populations is needed. For instance, a few researchers (Carr, Wilson, Ghosh, Ancil, & Woods, 1982; Volans & Levy, 1982) reported significant levels of anxiety among elderly clients. However, elderly clients who receive a minimum amount of training (i.e., 1 hr) are likely to perform significantly better than are those who do not receive such training (D. F. Johnson & White, 1980). This latter finding is consistent with current evidence that suggests that any anxiety caused by the computer is short-lived for most subjects if adequate practice is provided (Lushene et al., 1974), and that it is usually a result of poorly designed procedures (Hedl, O'Neil, & Hansen, 1973).

With some computerized testing systems and their accompanying software, there are some potentially serious technical problems that may interfere with client acceptance and test performance: Some systems do not allow the client to back up to check or change answers; some operate at a fixed pace and fail to consider differences in client response time; others lack clear instructions and implicitly assume client familiarity with computer terminals. Although some efforts have been made to overcome these obstacles (e.g., allowing test takers to back up one question), more are needed. More specifically, software packages should be designed in such a way as to allow test takers to go through a few practice items before starting the test, to skip over items when they experience some difficulty, especially for tests that have a strong speeded component, and to be able to back up to a desired item.

For those researchers or practitioners interested in assessing client attitudes toward computers, it may be helpful to refer to Loyd and Gressard's (1984; Gressard & Loyd, 1985) work concerning the development and validation of the Computer Attitudes Scale, and to Reece and Gable's (1982) study on the validation of a measure of general attitudes toward computers. In addition, Schmidt, Urry, and Gugel (1978) evaluated the opinions of examinees toward tailored (or adaptive) testing. In general, their results indicated that examinees have positive attitudes toward this form of testing. Also of note is Lawton and Gerschner's (1982) review of the literature on attitudes toward computers and computerized instruction.
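The software design points noted above (practice items, the ability to skip an item, and the ability to back up) can be illustrated with a minimal item-presentation loop such as the following sketch; the command keys and the items themselves are hypothetical.

```python
# Minimal sketch of an item-presentation loop that supports practice items,
# backing up to an earlier item, and skipping; the command keys and items
# are hypothetical. Run interactively.

PRACTICE_ITEMS = ["Practice: 2 + 2 = ?"]
TEST_ITEMS = ["Item 1: ...", "Item 2: ...", "Item 3: ..."]

def administer(items):
    """Present items one at a time; 'B' backs up one item, 'S' skips an item."""
    answers = [None] * len(items)
    i = 0
    while i < len(items):
        response = input(f"{items[i]}  (answer, B=back, S=skip): ").strip()
        if response.upper() == "B" and i > 0:
            i -= 1                      # allow the test taker to back up
        elif response.upper() == "S":
            i += 1                      # leave the item unanswered for now
        else:
            answers[i] = response
            i += 1
    return answers

if __name__ == "__main__":
    administer(PRACTICE_ITEMS)          # warm-up items are not scored
    responses = administer(TEST_ITEMS)
    print("Recorded responses:", responses)
```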

Adaptive (Tailored) Testing

Tailored or adaptive testing, in which a computer program adjusts the test difficulty to the ability of the individual being tested, is a particularly advantageous use of the computer. According to Niehaus (1979),

A computer-assisted or adaptive test uses a multi-stage process to estimate a person's ability several times during the course of testing, and the selection of successive test items based on those ability estimates. The person tested uses an interactive computer terminal to answer a test question. If the answer is correct, the next item will be more difficult; if not, an easier item follows. With each response, a revised and more reliable estimate is made of the person's ability. The test proceeds until the estimate reaches a specified level of reliability. Generally, the results both are more reliable and require fewer items than a paper and pencil test. (p. 222)

A few researchers have examined issues related to the validity of computer-administered adaptive tests (cf. Kingsbury & Weiss, 1981; Sympson, Weiss, & Ree, 1982; Weiss, 1982, 1985). For instance, Sympson et al. compared the Armed Services Vocational Aptitude Battery (ASVAB) Arithmetic Reasoning and Word Knowledge tests with computer-administered adaptive tests as predictors of performance in an Air Force mechanic training course. In addition to finding that validity coefficients obtained via adaptive tests were not significantly different from those for ASVAB subtests, they found that adaptive tests could provide levels of measurement precision obtainable only with much longer ASVAB tests. Also, they noted that computer-adaptive tests that were one third to one half the length of conventional ASVAB tests could approximate the criterion-related validity coefficients of these conventional tests. These are important findings when one considers that adaptive tests administer different items to different examinees. Overall, the Sympson et al. study supported the potential utility of computer adaptive ability testing in a military environment.

As Hulin, Drasgow, and Parsons (1983) pointed out, increased measurement accuracy is not the only benefit of adaptive testing. They indicated that testing time, fatigue, and boredom should all be reduced in an adaptive testing session. As with nonadaptive computerized testing, restrictions as to when a test is administered could also be reduced through the use of walk-in testing stations. In addition, Urry (1977) reported that an analysis at the U.S. Office of Personnel Management placed the cost of adaptive testing at less than that of paper-and-pencil tests. In a report prepared for the Canadian government, Budgell (1982) also indicated that even in consideration of the capital investment in computer hardware, computerized adaptive testing could show a savings over conventional paper-and-pencil testing in 1 year. (For more technical summaries and reviews of the adaptive testing literature, see Green, Bock, Humphreys, Linn, & Reckase, 1984; Hulin et al., 1983; McBride, 1982; Vale, 1981; Weiss, 1982; Weiss & Betz, 1973. For a review of issues related to the legal aspects of computerized adaptive testing, see Donlon, 1984.)
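The multi-stage select-and-reestimate procedure described in the Niehaus quotation can be sketched roughly as follows. This is a deliberately simplified illustration (a small hypothetical item bank, a Rasch response model, and a crude stochastic-approximation update), not the estimation machinery used in the studies cited above.

```python
# Simplified sketch of adaptive item selection: pick the unused item whose
# difficulty is closest to the current ability estimate, update the estimate
# after each response, and stop after a fixed number of items. The item bank,
# step-size rule, and stopping rule are illustrative only.
import math
import random

ITEM_BANK = {f"item{i}": d for i, d in enumerate([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])}

def prob_correct(theta: float, difficulty: float) -> float:
    """One-parameter logistic (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def adaptive_test(answer_item, max_items: int = 5) -> float:
    theta = 0.0                       # start at an average ability estimate
    used = set()
    for n in range(1, max_items + 1):
        # Select the unused item whose difficulty best matches the estimate.
        item = min((i for i in ITEM_BANK if i not in used),
                   key=lambda i: abs(ITEM_BANK[i] - theta))
        used.add(item)
        correct = answer_item(item)   # 1 if answered correctly, else 0
        # Stochastic-approximation update: smaller steps as items accumulate.
        theta += (2.0 / n) * (correct - prob_correct(theta, ITEM_BANK[item]))
    return theta

if __name__ == "__main__":
    true_theta = 0.8
    simulate = lambda item: int(random.random() < prob_correct(true_theta, ITEM_BANK[item]))
    print("Estimated ability:", round(adaptive_test(simulate), 2))
```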

Cost-Effectiveness and Efficiency

As noted earlier, some researchers have reported that adaptive testing may represent a cost savings in some cases over conventional paper-and-pencil testing. A number of investigators have also reported that nonadaptive computerized testing provides benefits of economy, speed, and reliability (cf. Greist & Klein, 1980; Niehaus, 1979; Space, 1981). With respect to improving the efficiency of measurement (testing) services, Space (1981) indicated that a computerized test battery, coupled with automated scoring, interpretation, and report writing, can literally reduce the turn-around time between completion of testing and return of the report from a typical time of 14 days to within 30 min or less. Byers (1981) also discussed the improved administration and scoring efficiency resulting from on-line computerized testing. In many instances, computerized administration of tests is conducted by a trained clerk rather than by a psychometrician or other trained professional. This permits the psychometrician to devote additional time to more complex tasks.

However, computerized psychological testing may not prove to be cost effective in all situations. We are aware of numerous situations in which computer test service providers have purchased costly, unnecessary hardware that has hindered their ability to operate profitably. A clear lack of planning and assessment of one's situation, coupled with a slick salesperson and grandiose ideas regarding the profitability of such testing systems, appear to be the primary reasons for the financial problems that these practitioners have encountered. We caution that before one buys a dream, one should seriously consider the previously raised points regarding the implementation of computerized testing systems, as well as evaluate the products and services of more than one vendor. It may also be wise to keep abreast of improvements in computer equipment because most computer testing software can be purchased separately from the hardware. We believe that improvements in computer equipment and testing software will make computerized testing a cost-effective tool for an even wider range of measurement service providers.
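A provider weighing the hardware-investment question raised above can frame it as a simple break-even calculation along the lines sketched below; every dollar figure and volume is a placeholder to be replaced with the provider's own estimates, not data from the studies cited.

```python
# Back-of-the-envelope break-even sketch for the hardware-investment question
# discussed above; all figures are assumed placeholders, not reported data.

hardware_cost = 10_000.0        # one-time capital investment (assumed)
cost_per_paper_test = 12.0      # scoring, materials, professional time (assumed)
cost_per_computer_test = 4.0    # per-administration cost after purchase (assumed)
tests_per_year = 1_500          # expected annual testing volume (assumed)

annual_savings = tests_per_year * (cost_per_paper_test - cost_per_computer_test)
break_even_years = hardware_cost / annual_savings
print(f"Annual savings: ${annual_savings:,.0f}")
print(f"Break-even point: {break_even_years:.1f} years")
```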

Reliability of Reports

It is commonly noted that computer reports (not necessarily test scores) are very reliable; that is, if the same responses are given and entered into the computer during two or more testing sessions, the computer will generate the same report (output) each time (reducing error variance resulting from the interpretation of test scores to zero, thus producing highly consistent interpretations), barring any system failures. On the other hand, especially with respect to clinical or subjective test reports, humans (e.g., clinicians) will generally not be as consistent in their interpretation of test results. Computers have the advantage of not having hangovers, family arguments before coming to work, or lapses of memory (with a few exceptions)!

Along with the reliability of the outputted information (i.e., test score interpretations), it has been demonstrated that ability and skill tests administered via the computer have acceptable to high levels of reliability (cf. Barrett, Alexander, Doverspike, Cellar, & Thomas, 1982; Myers, Schemmer, & Fleishman, 1983). Barrett et al. developed computerized information-processing and preference measures and administered them to a sample of college students. The internal-consistency reliabilities of the computerized information-processing measures were adequate; test-retest reliabilities were lower than desirable. In addition, Myers et al. developed computer interactive tests designed to measure abilities identified as underlying critical tasks in various helicopter missions. A major finding of their study was that most tests had high internal-consistency reliability estimates. Test-retest (2-week interval) reliability coefficients were also calculated. Overall, Myers et al. noted that the test-retest coefficients were at the moderate level for the computer-based tests (i.e., the average was .52). Because the Myers et al. study was part of a test development effort, the test-retest analysis revealed areas in which the tests could be improved.

Although the reliability of computer reports and test scores, in most of the current studies, is of an acceptable level, test service providers need to be aware of or have information available on the psychometric soundness of computerized psychological tests. The reason for this is that the computer administration of tests may influence the validity, the equivalence of computer and paper-and-pencil versions, and norms.
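For test service providers who want to compute the two kinds of reliability estimates discussed above for their own computerized measures, a minimal sketch follows (coefficient alpha from one administration's item scores and a test-retest correlation between two administrations); the toy data are illustrative only.

```python
# Sketch of two reliability estimates: coefficient alpha (internal consistency)
# and a test-retest Pearson correlation. The toy data are illustrative.
import statistics

def cronbach_alpha(item_scores):
    """item_scores: list of per-person lists, one score per item."""
    k = len(item_scores[0])
    item_vars = [statistics.pvariance([person[i] for person in item_scores])
                 for i in range(k)]
    total_var = statistics.pvariance([sum(person) for person in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def test_retest_r(time1, time2):
    return statistics.correlation(time1, time2)  # Pearson r (Python 3.10+)

scores = [[1, 1, 0, 1], [0, 1, 0, 0], [1, 1, 1, 1], [0, 0, 0, 1], [1, 0, 1, 1]]
print("alpha:", round(cronbach_alpha(scores), 2))
print("retest r:", round(test_retest_r([10, 7, 12, 5, 9], [11, 6, 12, 6, 8]), 2))
```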

Validity, Equivalence, and Norms

Bersoff (1983) indicated that there is no apparent reason to expect computerized administration to generally affect the validity of measures of most, if not all, psychological constructs. Factors specific to computer administration would have to be related to the external measure with which one is validating test scores in order for validity to be affected. For instance, it appears that such psychological constructs as "intelligence" would have little to do with factors specific to computerized test administration.

The very limited research on the equivalency of computer versus paper-and-pencil test results has provided partial support that the two versions are equivalent. Elwood and Griffin (1972) compared the automated version of the Wechsler Adult Intelligence Scale (WAIS) with the standard face-to-face administration and found respective correlations (i.e., reliability coefficients), on the basis of a test-retest design, for Verbal IQ, Performance IQ, and Full-Scale IQ to be almost identical. In addition, Bersoff (1983) reported that no differences have been found on the Slossen Intelligence Scale, on the Eysenck Personality Inventory, and on a measure of time orientation.

A high correlation between the paper-and-pencil test and the computerized version is not sufficient evidence for demonstrating equivalence and thus for using the old norms from the conventional test. Approximately identical frequency distributions of test scores in which no change in examinee rank is observed between the conventional and computerized versions would provide more sound evidence of their equivalence. (See Lord, 1980, Angoff, 1971, or Marco, 1981, for discussions concerning strategies for equating.) This is a very important issue, and more research on the equivalence of computer and conventional versions of the same test is clearly needed.

It is important that comparisons be made between the computerized test results and conventional test results for a particular test when theory or previous research suggests that the validity of the two versions may differ. For exemplary purposes, consider the psychological construct of "honesty." Furthermore, assume that an organization desires to convert a paper-and-pencil version of an honesty test to computer format. The organization is likely to find that examinees answer more honestly on a computer than on a conventional questionnaire when asked to respond to questions probing highly sensitive personal topics (cf. Evans & Miller, 1969; Koson, Kitchen, Kochen, & Stodolsky, 1970; O'Brien & Dugdale, 1978). This research suggests that the computer and conventional versions of a test for measuring the construct of honesty may differ in validity. Some researchers have also found that individuals admit more to engaging in socially undesirable behaviors when examined via computer than when asked about such behavior in an interview (Carr, Ghosh, & Ancil, 1983; Greist & Klein, 1980; Lucas, Mullins, Luna, & McInroy, 1977; Slack & Van Cura, 1968).
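One way to examine the kind of evidence suggested above, namely comparable score distributions and preserved examinee rank order across administration modes, is sketched below with illustrative scores for examinees tested under both modes.

```python
# Sketch of an equivalence check: compare the score distributions obtained
# under two administration modes and verify that examinee rank order is
# preserved for people tested under both. The scores are illustrative.
import statistics

def ranks(values):
    """Rank scores (1 = lowest); ties get the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

paper = [22, 30, 25, 28, 35, 27]      # same examinees, paper-and-pencil scores
computer = [23, 31, 24, 29, 34, 28]   # computer-administered scores

print("paper mean/sd:", round(statistics.mean(paper), 1), round(statistics.stdev(paper), 1))
print("computer mean/sd:", round(statistics.mean(computer), 1), round(statistics.stdev(computer), 1))
# Spearman rank-order correlation: Pearson correlation between the rank vectors.
print("rank-order r:", round(statistics.correlation(ranks(paper), ranks(computer)), 2))
```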


The organization may also find that effective test strategies for conventional paper-and-pencil tests may no longer be adequate for the test's computer versions. Consequently, the test's reliability and validity as well as the normative data may be affected. For instance, access to a test item bank for a conventional paper-and-pencil test allows the examinee to skip around and estimate what would be an appropriate response time for individual items. The type of responses that are required of the examinee on the computer version of certain tests would apparently prevent the generalization of the psychometric properties (i.e., norms, reliability, and validity) of the conventional version. In addition, scores on conventional tests such as verbal fluency and divergent thinking, which require the examinee to list in writing as many words as possible within a time limit, may depend heavily on the individual's typing skills when the test is administered via computer. Researchers (Greaud & Green, 1986) have also found large differences in scores on speed tests between the computer and paper-and-pencil modes of administration.

Test equivalence appears to be more straightforward for ability tests than it is for personality tests (Allred & Green, 1984). More specifically, a high degree of equivalence between different modes of presentation is expected for power (vs. speed) tests that are fixed in length, have little if any change in format, and require some form of multiple-choice response. Nonetheless, because computer test administration does appear to affect test-taking strategies, especially on interest and personality instruments on which sensitive material is solicited, it would be important to conduct differential validity studies in order to compare the computerized and conventional versions of these particular tests. This would help to ensure that valid inferences are drawn from the test scores for the two different administrative modes of the test.

If the computer administration of a test is equivalent to the conventional administration, then norms developed with the conventional test can be used to interpret scores obtained by computers. There are a few qualifications. First, it is important that the equating method be applied across all subpopulations for which the conventional subpopulation norms are to be applied. Second, the norms obtained with the conventional administration of a test should be appropriate for the computer-tested subpopulations to which they are to be applied. Hofer and Green (1985) provided valuable insights concerning the effect of different kinds of information on the equivalence of tests and the legitimacy of generalizing inferences from test scores when the tests are based on different modes of administration.
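A differential validity comparison of the sort recommended above can be carried out, in its simplest form, by testing whether two independent validity coefficients differ using Fisher's r-to-z transformation, as in the following sketch; the coefficients and sample sizes shown are illustrative.

```python
# Sketch of a differential-validity comparison: test whether the
# criterion-related validity of the computerized version differs from that of
# the conventional version using Fisher's r-to-z transformation for two
# independent correlations. The coefficients and sample sizes are illustrative.
import math

def fisher_z(r: float) -> float:
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_validities(r1: float, n1: int, r2: float, n2: int) -> float:
    """Return the z statistic for the difference between two independent rs."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se

z = compare_validities(r1=0.35, n1=120,   # conventional version (illustrative)
                       r2=0.22, n2=115)   # computerized version (illustrative)
print(f"z = {z:.2f}; |z| > 1.96 would suggest the validities differ at p < .05")
```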

Validity of Computerized Interpretations

In comparing clinical judgments with computerized interpretive reports, researchers have reported that computer-generated reports are of equal or superior validity (DeMita, Johnson, & Hansen, 1981; LaBeck, Johnson, & Harris, 1983; Space, 1981).

However, as Moreland (1985a) and Adams and Heaton (1985) pointed out, none of the computer programs in the neuropsychological area to date has produced results that are either satisfactory or equal in accuracy to those achieved by expert neuropsychologists (Adams, Kvale, & Keegan, 1984; Anthony, Heaton, & Lehman, 1980; Heaton, Grant, Anthony, & Lehman, 1981) or, in some cases, by neurologists (Kleinmuntz, 1968) and other physicians (Blois, 1980). Nonetheless, in the field of clinical psychology, computers can be programmed to interpret the Minnesota Multiphasic Personality Inventory (MMPI) at least as well as human experts. The main methodological difference between these two research areas appears to be the presence of objective external criteria in neuropsychological studies; the primary criterion in validity studies of computerized personality test interpretations is expert evaluation of the computerized interpretation. (See Moreland, 1985a, 1985b, for a critical review of the research literature on the validity of computer-based clinical interpretations of personality tests, which has been limited almost exclusively to the MMPI.)

Moreland (1985b) identified many methodological weaknesses associated with previous attempts to validate computer-based interpretations and provided 14 design recommendations for future attempts. A conclusion to be reached from Moreland's review and from Graham and Lilly's (1984) report is that it is difficult to generalize the validity of computer-based test interpretations from the present limited data. One should also be cautious of computer reports because they may be highly reliable in producing incorrect information; that is, computer test reports can be generated on the basis of faulty or partly correct information. It has been our experience that this is particularly true in the case of computer-generated normative reports and standardized score reports. In addition, human programming errors of correct information may result in the generation of inaccurate test reports. Quality control of computer-based test reports is a very serious issue that, to our knowledge, has so far been woefully deficient.
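Where an objective external criterion is available, the comparison described above can be reduced to a simple agreement check, as in this sketch; the classification labels and cases are illustrative.

```python
# Sketch of an external-criterion check: compare the classification accuracy
# of computer-generated and expert interpretations against an independent
# criterion (e.g., a confirmed diagnosis). All labels below are illustrative.
def accuracy(predictions, criterion):
    hits = sum(1 for p, c in zip(predictions, criterion) if p == c)
    return hits / len(criterion)

criterion = ["impaired", "normal", "impaired", "normal", "impaired", "normal"]
computer  = ["impaired", "normal", "normal",   "normal", "impaired", "impaired"]
expert    = ["impaired", "normal", "impaired", "normal", "impaired", "normal"]

print("computer-report accuracy:", round(accuracy(computer, criterion), 2))
print("expert accuracy:", round(accuracy(expert, criterion), 2))
```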

Other Potential Benefits

A number of other benefits have been noted by proponents of computer-based psychological testing. One is the opportunity for unique research projects. For example, Dunn, Lushene, and O'Neil (1972) were able to study MMPI response latencies via the computer medium. In addition, there exist vast opportunities for developing and researching new measures of psychological constructs (cf. Hofer & Green, 1985). Far too little attention has been given to using the unique capabilities of the computer in developing new types of items and tests. Another potential benefit is the assistance to individuals with visual, auditory, and physical limitations. Sampson (1983) reported that microcomputers, in conjunction with specialized data input and output devices, provide individuals with visual, auditory, and physical limitations opportunities to complete various tests with minimal assistance (cf. Wilson, Thompson, & Wylie, 1982). Although these and other benefits may result from a computer-based testing system, some potential problems in addition to those discussed earlier have been noted.

Potential Problems


Acceptability to Professionals

Although client acceptance of computer-based testing appears to be somewhat favorable, acceptability by professionals has been less clear. Space (1981) and Byrnes and Johnson (1981) noted that in clinical applications, professional acceptance is the weakest link to successful implementation of computer-based testing. Reports of acceptance have, however, been variable. Moreover, these reports have been primarily anecdotal and were not backed by statistical information. A few practitioners and researchers have offered suggestions for overcoming potential problems related to professional acceptance (Byrnes & Johnson, 1981; J. H. Johnson, Williams, Giannetti, Klingler, & Nakashima, 1978; J. H. Johnson, Williams, Klingler, & Giannetti, 1977; Klonoff & Clark, 1975). For instance, Klonoff and Clark (1975) reported that staff attitudes toward computers were favorable for those attending a 2.5-day seminar. Moreover, Byrnes and Johnson (1981) reviewed the computer implementation experiences of others and made suggestions for incorporating systematic changes into the implementation process. It is evident that, at minimum, some form of education and preparation is necessary for successful implementation of computer-based psychological testing systems.

Inadequate Provision for Human Factors

Tomeski and Lazarus (1975) remarked that "many organizations develop computer products and applications with great attention to technical and economic factors, but with minimal attention to the most important resource: people" (p. ix). This comment is particularly relevant to computerized testing because systems have not always been developed with serious attention to the test-taker-computer interaction. Some considerations that must be made include ensuring the clarity of instructions for taking the test, allowing practice before starting the test, removing any difficulties related to reading and understanding items, providing some system such as a back-up key for error checking and correction, allowing test takers to skip items in highly speeded tests, and ensuring that long delays between an individual's answer to an item and the computer's response do not occur. Much too little attention has been given to improving these latter problem areas.

Questions Concerning Test Procedures and Proper Feedback of Results

As discussed earlier, questions have been raised in regard to the reliability, validity, and norms of computerized tests. We recommend that test service providers be aware of whether there exists appropriate information concerning the development of test norms for computer-administered tests (cf. American Psychological Association [APA], 1977, 1981; APA et al., 1985; APA, Division 14, 1980).

In regard to confidentiality, the problems are important and present independently of whether computer procedures are used (Space, 1981). It has been argued, however, that threats to the confidentiality of testing are magnified when testing is conducted via the computer. One answer to this criticism is that strong ethical controls must be in place when the individual is providing information and when this person or others need access to the test results. It is our opinion that confidentiality of test results has less of a chance of being compromised when stored in a secured computer file than when stored in a locked file cabinet. Also, Ford (1976) discussed methods of limiting access to confidential information.

Another serious problem is inadequate feedback to clients, especially with respect to clinical and personality tests. Providing only a narrative report for a clinically based test, without the option of having a professional assist in interpreting the test results, is under no circumstances sufficient and proper feedback. Ironically, such a practice is clearly tolerated under the auspices of the APA at its convention booths.

Need for the Adoption of Standards for Computerized Testing

Over the past 5 years, there has been increased use of conventional psychological tests in the area of employment testing (cf. ASPA/BNA, 1983). Likewise, the use of ability, personality, and interest tests has recently escalated in the areas of counseling and guidance. The increased use of tests in this latter area can be partly attributed to recent developments in automated test administration and scoring systems as well as to the proliferation of organizations and consultants offering these testing services. This increase in computer-based test interpretations, coupled with the problems noted earlier, has led some writers (cf. Matarazzo, 1983) to call for limiting the availability of such products to only specially qualified individuals. Another source of concern, related to the rise of computerized testing, is the proliferation of unsatisfactory interpretive systems (Lanyon, 1984).

Concomitant with the growth of testing is the trend toward greater scrutiny of testing, especially in employment and educational settings. (See Bersoff, 1981, for a review of the literature on testing and the law.) On the basis of numerous and significant court decisions since Bersoff's (1981) review, it is clear that continued legal scrutiny of psychological testing is a reality. In order to ensure that computerized testing is capable of withstanding legal as well as professional scrutiny, there is an impending necessity for the testing profession to adopt a set of standards for computerized psychological testing. A few attempts have been directed toward the development of standards for computer-based psychological testing (Bersoff, 1983; Colorado Psychological Association, 1982; Green et al., 1984; Hofer & Bersoff, 1984; Sampson & Pyle, 1983). Although not yet official APA policy, significant work based on the earlier efforts of Hofer and Bersoff has been done by APA's Committee on Psychological Tests and Assessments and its Committee on Professional Standards toward the development of acceptable professional standards.


The work of these two APA committees represents significant progress toward the development of guidelines for ensuring that computer-based testing will be practiced at the highest scientific and ethical levels.

The future of computer-based testing and assessment in education, psychology, and business is bright. Computer-based testing systems have the potential of being practical, cost-effective, and psychometrically sound means of assessing individuals. Their potential can be realized if proper considerations are made in designing, developing, and implementing these testing systems and if professional standards are maintained by computer test service providers and users. Before the adoption of computer-based psychological testing becomes widespread, a number of significant psychometric issues as well as important practical and ethical considerations, noted earlier, must be addressed.

References

Adams, K. M., & Heaton, R. K. (1985). Automated interpretation of neuropsychological test data. Journal of Consulting and Clinical Psychology, 53, 790-802.
Adams, K. M., Kvale, V. I., & Keegan, J. F. (1984). Performance of three automated systems for neuropsychological interpretation based on two representative tasks. Journal of Clinical Neuropsychology, 6, 413-431.
Allred, L. J., & Green, B. F. (1984, January). Analysis of experimental CAT ASVAB test data. Unpublished manuscript, The Johns Hopkins University, Department of Psychology.
American Psychological Association (1977). Standards for providers of psychological services. Washington, DC: Author.
American Psychological Association (1981). Specialty guidelines for the delivery of services. American Psychologist, 36, 640-681.
American Psychological Association, American Educational Research Association, & National Council on Measurement in Education (1985). Standards for educational and psychological tests. Washington, DC: American Psychological Association.
American Psychological Association, Division 14 (1980). Principles for the use and validation of personnel selection procedures. Washington, DC: Author.
Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508-600). Washington, DC: American Council on Education.
Anthony, W. Z., Heaton, R. K., & Lehman, R. A. W. (1980). An attempt to cross-validate two actuarial systems for neuropsychological test interpretation. Journal of Consulting and Clinical Psychology, 48, 317-326.
American Society for Personnel Administration/Bureau of National Affairs (ASPA/BNA) (1983). ASPA/BNA Survey No. 45, employee selection procedures. Washington, DC: U.S. Bureau of National Affairs.
Barrett, G. V., Alexander, R. A., Doverspike, D., Cellar, D., & Thomas, J. C. (1982). The development and application of a computerized information processing test battery. Applied Psychological Measurement, 6, 13-29.
Bartram, D., & Bayliss, R. (1984). Automated testing: Past, present, and future. Journal of Occupational Psychology, 57, 221-237.
Beaumont, J. G. (1981). Microcomputer-aided assessment using standard psychometric procedures. Behavior Research Methods & Instrumentation, 13, 430-433.
Bersoff, D. N. (1981). Testing and the law. American Psychologist, 36, 1047-1056.


Bersoff, D. N. (1983). A rationale and proposal regarding standards for the administration and interpretation of computerized psychological testing. Report prepared for Psych Systems, Inc., Baltimore, MD.
Blois, M. S. (1980). Clinical judgment and computers. New England Journal of Medicine, 303, 192-197.
Bresolin, M. J., Jr. (1984). A comparative study of computer administration of the Minnesota Multiphasic Personality Inventory in an inpatient psychiatric setting. Unpublished doctoral dissertation, Loyola University of Chicago.
Budgell, G. R. (1982). Preliminary analysis of the feasibility of computerized adaptive testing and item banking in the public service. Unpublished report, Public Service Commission, Ottawa, Canada.
Byers, A. P. (1981). Psychological evaluation by means of an on-line computer. Behavior Research Methods & Instrumentation, 13, 585-587.
Byrnes, E., & Johnson, J. H. (1981). Change technology and the implementation of automation in mental health care settings. Behavior Research Methods & Instrumentation, 13, 573-580.
Carr, A. C., Ghosh, A., & Ancil, R. J. (1983). Can a computer take a psychiatric history? Psychological Medicine, 13, 151-158.
Carr, A. C., Wilson, S. L., Ghosh, A., Ancil, R. J., & Woods, R. T. (1982). Automated testing of geriatric patients using a microcomputer-based system. International Journal of Man-Machine Studies, 28, 297-300.
Cole, E. B., Johnson, J. H., & Williams, T. A. (1975). Design considerations for an on-line computer system for automated psychiatric assessment. Behavior Research Methods & Instrumentation, 7, 195-198.
Colorado Psychological Association (1982). Guidelines for the use of computerized testing services. Denver: Author.
Daft, R. L. (1983). Organization theory and design. St. Paul, MN: West Publishing.
DeMita, M. A., Johnson, J. H., & Hansen, K. E. (1981). The validity of a computerized visual searching task as an indicator of brain damage. Behavior Research Methods & Instrumentation, 13, 592-594.
Denner, S. (1977). Automated psychological testing: A review. British Journal of Social and Clinical Psychology, 16, 173-179.
Digital Equipment Corporation (1974). Introduction to minicomputer networks. Maynard, MA: Author.
Donlon, T. F. (1984). Legal aspects of computerized adaptive testing. Paper presented at the annual meeting of the American Educational Research Association, New Orleans.
Dunn, T. G., Lushene, R. E., & O'Neil, H. F. (1972). Complete automation of the MMPI and a study of its response in latencies. Journal of Consulting and Clinical Psychology, 39, 381-387.
Elwood, D. L., & Griffin, R. H. (1972). Individual intelligence testing without the examiner: Reliability of an automated method. Journal of Consulting and Clinical Psychology, 38, 9-14.
Erdman, H. P., Klein, M. H., & Greist, J. H. (1985). Direct patient computer interviewing. Journal of Consulting and Clinical Psychology, 53, 760-773.
Evans, W. M., & Miller, J. R. (1969). Differential effects on response bias of computer versus conventional administration of a social science questionnaire. Behavioral Science, 14, 216-227.
Ford, W. E. (1976). A client-coding system to maintain confidentiality in a computerized data system. Hospital and Community Psychiatry, 27, 624-625.
Gedye, J. L. (1968). The development of a general purpose psychological testing system. Bulletin of the British Psychological Society, 21, 101-102.
Graham, J. R., & Lilly, R. S. (1984). Psychological testing. Englewood Cliffs, NJ: Prentice-Hall.


Greaud, V. A., & Green, B. F. (1986). Equivalence of conventional and computer presentation of speed tests. Applied Psychological Measurement, 10, 23-34.
Green, B. F., Bock, R. D., Humphreys, L. G., Linn, R. L., & Reckase, M. D. (1984). Technical guidelines for assessing computerized adaptive tests. Journal of Educational Measurement, 21, 347-360.
Greist, J. H., & Klein, M. H. (1980). Computer programs for patients, clinicians, and researchers in psychiatry. In J. B. Sidowski, J. H. Johnson, & T. A. Williams (Eds.), Technology in mental health care delivery systems (pp. 161-181). Norwood, NJ: Ablex.
Greist, J. H., Klein, M. H., & Van Cura, L. J. (1973). A computer interview for psychiatric patient target symptoms. Archives of General Psychiatry, 31, 247-253.
Greist, J. H., Klein, M. H., Van Cura, L. J., & Erdman, H. P. (1975). Computer interview questionnaires for drug use/abuse. In D. J. Lettieri (Ed.), Predicting adolescent drug use: A review of issues, methods and correlates (pp. 147-164). Washington, DC: U.S. Government Printing Office.
Greist, J. H., Van Cura, L. J., & Kneppreth, N. P. (1973). A computer interview for emergency room patients. Computers in Biomedical Research, 6, 257-265.
Gressard, C. P., & Loyd, B. H. (1985). Validation studies of a new computer attitudes scale. Paper presented at the annual meeting of the American Educational Research Association, Chicago.
Heaton, R. K., Grant, I., Anthony, W. Z., & Lehman, R. A. W. (1981). A comparison of clinical and automated interpretation of the Halstead-Reitan Battery. Journal of Clinical Neuropsychology, 3, 121-141.
Hedl, J. J., O'Neil, H. H., & Hansen, D. N. (1973). Affective reactions toward computer-based intelligence testing. Journal of Consulting and Clinical Psychology, 40, 217-222.
Hofer, P. J., & Bersoff, D. N. (1984). Standards for the administration and interpretation of computerized psychological testing. Unpublished manuscript (available from D. N. Bersoff, Suite 511, 1200 Seventeenth Street, N.W., Washington, DC 20036).
Hofer, P. J., & Green, B. F. (1985). The challenge of competence and creativity in computerized psychological testing. Journal of Consulting and Clinical Psychology, 53, 826-838.
Hulin, C. L., Drasgow, F., & Parsons, C. K. (1983). Item response theory: Application to psychological measurement. Homewood, IL: Dorsey.
Johnson, D. F., & White, C. B. (1980). Effects of training on computerized test performance in the elderly. Journal of Applied Psychology, 65, 357-358.
Johnson, J. H., Williams, T. A., Giannetti, R. A., Klingler, D. E., & Nakashima, S. R. (1978). Organization preparedness for change: Staff acceptance of an on-line computer-assisted assessment system. Behavior Research Methods & Instrumentation, 10, 186-190.
Johnson, J. H., Williams, T. A., Klingler, D. E., & Giannetti, R. A. (1977). Interventional relevance and retrofit programming: Concepts for the improvement of clinician acceptance of computer-generated assessment reports. Behavior Research Methods & Instrumentation, 9, 123-132.
Katz, D., Kahn, R. L., & Adams, J. S. (1980). The study of organizations. San Francisco: Jossey-Bass.
Kingsbury, G. G., & Weiss, D. J. (1981). A validity comparison of adaptive and conventional strategies for mastery testing (Research Report 81-3). Minneapolis: University of Minnesota, Department of Psychology, Computerized Adaptive Testing Laboratory.
Kleinmuntz, B. (Ed.) (1968). Formal representation of human judgment. New York: Wiley.
Kleinmuntz, B., & McLean, R. S. (1968). Computers in behavioral science: Diagnostic interviewing by digital computer. Behavioral Science, 11, 75-80.
Klingler, D. E., Johnson, J. H., & Williams, T. A. (1976). Strategies in the evolution of an on-line computer-assisted unit for intake assessment of mental health patients. Behavior Research Methods & Instrumentation, 8, 95-100.

in the evolution of an on-line computer-assisted unit for intake assessment of mental health patients. Behavior Research Methods & Instrumentation, S, 95-100. KJingler, D. E., Miller, D.. Johnson, J. H., & Williams, T, A. (1977). Process evaluation of an on-line computer-assisted unit for intake assessment. Behavior Research Methods & Instrumentation, 9, 110-116. Klonoff. H., & Clark. C. V. (1975). Measuring staff attitudes toward computerization. Hospital and Community Psychiatry, 26, 823825. Koson, D., Kitchen, C., Kochen, M.. & Stodolsky, D. (1970). Psychological testing by computer: Effect on response bias. Educational and Psychological Measurement, SO, 803-810. LaBeck, L. J., Johnson, J. H., & Harris, W. G. (1983). Validity of a computerized on-line MMPI interpretive system. Journal of Clinical Psychology, 39, 412-416. Lang, P. J. (1969). The on-line computer in behavior therapy research. American Psychologist, 24, 236-239. Lanyon. R. I. (1984). Personality assessment. Annual Review of Psychology, 35, 667-701. Lawton, J., & Gerschner, V. T. (1982). A review of the literature on attitudes towards computers and computerized instruction. Journal of Research and Development in Education, 16, 50-55. Lord, F. M. (1980). Applications of Hem response theory to practical testing problems. Hillsdale, NJ: Erlbaum. Loyd, B. H., & Gressard. C. P. {1984). Reliability and factorial validity of computer attitude scales. Educational and Psychological Measurement, 44, 501-505. Lucas, R. W. (1977). A study of patients' attitudes to computer interrogation. International Journal of Man-Machine Studies, 9, 69-86. Lucas, R. W., Mullins. P. J., Luna, C. B., & Mclnroy, D. C. (1977). Psychiatrists and a computer as interrogators of patients with alcohol-related illnesses: A comparison. British Journal ofPsvchiatry, 131, 160-167. Lushene, R. E., O'Neil. H. H., & Dunn, T. (1974). Equivalent validity of a completely computerized MMPI. Journal of Personality Assessment, 38, 353-361. Marco. G. L. (1981). Equating tests in an era of test disclosure. In B. F. Green (Ed.), Issues in testing: Coaching, disclosure, and ethnic bias (pp. 105-122). San Francisco: Jossey-Bass. Matarazzo, J. M. (1983, July 22). Computerized psychological testing. Science, 221, 323. McBride, J. R. (1982). Adaptive mental testing: The state of the art. Catalogue of Selected Documents in Psychology, 12, 24. (Ms. No. 2455) Moreland, K. L. (1985a). Computer-assisted psychological assessment in 1985: A practical guide. Computers in Human Behavior, 1, 221233. Moreland, K. L. (I985b). Validation of computer-based test interpretations: Problems and prospects. Journal of Consulting and Clinical Psychology, 53, 816-825. Myers, D. C., Schemmer, F. M., & Fleishman, E. A. (1983). Analysis of computer interactive tests for assigning helicopter pilots to different missions {Research Report R-83-8). Bethesda, MD: Advanced Research Resources Organization. Nardoni. R. (1983). Screening optical scanners for personnel. Personnel Journal, 62, 806-811. N'iehaus, R. J. (1979). Computer-assisted human resources planning. New York: Wiley. O'Brien, T., & Dugdale, V. (1978). Questionnaire administration by computer. Journal of the Market Research Society. 20, 228-237. Reece, M. J., & Gable, R. K. (1982). The development and validation of a measure of general attitudes toward computers. Educational and Psychological Measurement, 42, 913-916.



Sampson, J. P. (1983). Computer-assisted testing and assessment: Current status and implications for the future. Measurement and Evaluation in Guidance, 5, 293-299.

Sampson, J. P., & Pyle, K. R. (1983). Ethical issues involved with the use of computer-assisted counseling, testing, and guidance systems. Personnel and Guidance Journal, 61, 283-287.

Schmidt, F. L., Urry, V. W., & Gugel, J. F. (1978). Computer assisted tailored testing: Examinee reactions and evaluations. Educational and Psychological Measurement, 38, 265-273.

Skinner, H. A., & Allen, B. A. (1983). Does the computer make a difference? Computerized versus face-to-face versus self-report assessment of alcohol, drug, and tobacco use. Journal of Consulting and Clinical Psychology, 51, 267-275.

Slack, W. V., & Slack, C. W. (1977). Talking to a computer about emotional problems: A comparative study. Psychotherapy: Theory, Research, and Practice, 14, 156-164.

Slack, W. V., & Van Cura, L. J. (1968). Patient reaction to computer-based medical interviewing. Computers and Biomedical Research, 1, 527-531.

Space, L. G. (1981). The computer as psychometrician. Behavior Research Methods & Instrumentation, 13, 595-606.

Stillman, R., Roth, W. T., Colby, K. M., & Rosenbaum, C. P. (1969). An on-line computer system for initial psychiatric inventory. American Journal of Psychiatry, 125, 8-11.

Sympson, J. B., Weiss, D. J., & Ree, M. J. (1982). Predictive validity of conventional and adaptive tests in an Air Force training environment (AFHRL TR 81-40). Brooks Air Force Base, TX: Manpower and Personnel Division, Air Force Human Resources Laboratory.

Thompson, J. A., & Wilson, S. L. (1982). Automated psychological testing. International Journal of Man-Machine Studies, 17, 279-290.

Tomeski, E. A., & Lazarus, H. (1975). People-oriented computer systems: The computer in crisis. New York: Van Nostrand Reinhold.

Urry, V. W. (1977). Tailored testing: A successful application of latent trait theory. Journal of Educational Measurement, 14, 181-196.

Vale, C. D. (1981). Design and implementation of a microcomputer-based adaptive testing system. Behavior Research Methods & Instrumentation, 13, 399-406.

Volans, P. J., & Levy, R. (1982). A re-evaluation of an automated tailored test of concept learning with elderly psychiatric patients. British Journal of Clinical Psychology, 21, 93-161.

Weiss, D. J. (1982). Improving measurement quality and efficiency with adaptive testing. Applied Psychological Measurement, 6, 473-492.

Weiss, D. J. (1985). Adaptive testing by computer. Journal of Consulting and Clinical Psychology, 53, 774-789.

Weiss, D. J., & Betz, N. E. (1973). Ability measurement: Conventional or adaptive? (Research Report 73-1). Minneapolis: University of Minnesota, Psychometric Methods Program.

Weizenbaum, J. (1976). Computer power and human reason. San Francisco: W. H. Freeman.

Wilson, S. L., Thompson, J. A., & Wylie, G. (1982). Automated psychological testing for the severely physically handicapped. International Journal of Man-Machine Studies, 17, 291-296.

Received June 17, 1985
Revision received October 30, 1985
