Clinical Radiology (2001) 56: 341–347 doi:10.1053/crad.2001.0678, available online at http://www.idealibrary.com on
Review

Measuring the Effects of Image Interpretation: An Evaluative Framework

STEPHEN BREALEY

Department of Health Sciences and Clinical Evaluation, Alcuin College, The University of York, Heslington, York, U.K.

Received: 26 June 2000; Revised: 30 August 2000; Accepted: 3 November 2000
The relaxing of restrictions on reporting films has resulted in radiographers and other health care professionals becoming increasingly involved in the interpretation of images in areas such as mammography, ultrasound and plain film radiography. However, errors and variation in the interpretation of images now represent the weakest area of clinical imaging. This has been highlighted by the difficulty of establishing standards against which to measure the film reading performance of radiographers as part of role extension initiatives. Despite a growing literature of studies that evaluate the film reading performance of different health care professionals, there is a paucity of evidence on the subsequent effects on the referring clinician's diagnosis, management plans and patient outcome. This paper proposes an evaluative framework that can be used to measure the chain of events from the initial technical assessment of observers' potential to interpret images using search behaviour techniques, through to the potential costs and benefits to society. Evaluating the wider implications of alternative or complementary reporting policies is essential for generating the evidence base to comprehensively underpin policy and practice and direct future research. Brealey, S. (2001). Clinical Radiology 56, 341–347. © 2001 The Royal College of Radiologists

Key words: role extension, evaluation, film reading performance, evidence base.
Health care technology is 'the methods used by health care professionals to promote health, prevent and treat disease and improve rehabilitation and long-term care' [1]. These methods include 'hardware' such as diagnostic technologies; 'software' such as diagnostic policies; and the skills of people working in the health services, such as image interpretation [2]. Health care evaluation can be described as the process of choosing between alternative health technologies by estimating the net value of each. This can be achieved by identifying, measuring and valuing the inputs and outputs generated by each technology with regard to clinical, economic and social effects [3]. This process encompasses the concepts of efficacy, effectiveness and efficiency. Efficacy describes the technical relationship between the technology and its effects under ideal conditions. Effectiveness concerns the extent to which a technology in routine circumstances brings about desired effects, such as changes in diagnosis, altered management plans and improvement in health.

Author for correspondence and guarantor of study: Stephen Brealey, Department of Health Sciences and Clinical Evaluation, Alcuin College, The University of York, Heslington, York YO1 5DD, U.K. Fax: 44 (0) 1904 434517; E-mail: [email protected]
Efficiency is concerned with whether acceptable efficacy and effectiveness are achieved with the most prudent or optimal mix of resources [4]. The debate on the evaluation of diagnostic imaging intensified with the adoption of computed tomography (CT) in the 1970s [5]. A framework was therefore proposed by Fineberg et al. [6] to measure the effects of different diagnostic technologies [7]. At that time, the role of the observer in image interpretation was a subsidiary issue, as this was mainly the domain of radiologists. With the exception of Swinburne, the controversial question of who should interpret images was not raised [8]. However, a growth in both the range and volume of imaging procedures [9], a shortage of radiologists [10] and the blurring of professional boundaries [11] resulted in the relaxing of restrictions on radiographic reporting during the 1990s [12]. Radiographers and other health care professionals now interpret images in areas such as mammography [13], ultrasound [14] and plain film radiography [15, 16].
With the proliferation of role extension initiatives there is a need to apply the same rigour advocated by Fineberg for evaluating diagnostic technologies to the evaluation of who, or what combination of people, are competent to report films. This is important because the interpretation of images produced by different modalities is a fundamental element of clinical imaging, yet it is considered the weakest area [9]. Furthermore, the effects of observer error and variation are rarely examined. An inaccurate report can lead to unnecessary investigations, affect the institution of effective therapy, and result in adverse events and hence extra costs, not just to the health care system but to the patient and society in general. There is therefore a need to examine the wider implications of observer error and variation by looking beyond the assessment of search behaviour and film reading performance to encompass a more global approach. The objective of this paper is to delineate a basic framework for evaluating the overall impact of film reporting when choosing between alternative health care professionals. The methodology for measuring the effects of imaging modalities will be discussed before these concepts are applied to image interpretation.
Fig. 1 – Donabedian's general classification for assessment of health care illustrated by specific questions to computed tomography (CT) [3].
EVALUATION OF DIAGNOSTIC IMAGING TECHNOLOGIES
The evaluation of an imaging technology is a complex process, as there is the conceptual hurdle of how one relates outcome to the effect of a diagnostic technique when other factors such as therapy are involved [17]. Donabedian described a general taxonomy for evaluating health care systems [18], as illustrated by specific questions to CT in Fig. 1 [3]. Fineberg et al. [6] applied these concepts to diagnostic imaging at four separate levels; the hierarchy was subsequently extended to five by the Institute of Medicine [19] and is applied to the assessment of magnetic resonance imaging (MRI) in Fig. 2 [4]. The proposed framework will use Fineberg's hierarchy to estimate the value of different reporting groups (e.g. radiographers versus radiologists) in terms of clinical, economic and social effects. The framework is also structured to permit the evaluation of radiographic reporting as an assessment of efficacy in controlled conditions and of effectiveness in clinical practice, and to relate these to costs in terms of efficiency. There are also methodological problems specific to the process of evaluating a cognitive task such as image interpretation, which will be addressed within the framework.

THE EVALUATIVE FRAMEWORK
The requirements for validating radiographers in a reporting role are best expressed by the concept of a chain of events. For radiographic reporting to be effective, it must first be efficacious, but the reverse is not true. For example, radiographers may provide accurate diagnoses that have no impact on therapy. Even if there is an improvement in therapy, this may not be commensurate with increased levels of diagnostic accuracy.
Fig. 2 – The five-stage evaluative hierarchy as applied to the assessment of the effects of magnetic resonance imaging (MRI) [4].
Within the context of different settings, the proposed framework aims to provide a comprehensive assessment of the effects of film reading performance both within and between professions. Fig. 3 illustrates the hierarchical approach and the multidisciplinary requirements for assessing the effects of reporting, together with the questions to be addressed at each level.
Fig. 3 – The evaluative hierarchy used to assess the effects of image interpretation as illustrated by specific questions to radiographic reporting.

Technical Competence

Technical capability, or level one of Fineberg's hierarchy, is concerned with the ability of a technology to perform reliably to specification in a laboratory setting [20], which includes assessing image quality [21]. The purpose of developing imaging systems that produce good quality images is to assist the observer's ability to diagnose disease accurately. Perceptual errors occur when an observer fails to identify an abnormality [22]. A cognitive error occurs when an observer incorrectly interprets what the abnormality is, owing to faulty reasoning [23]. At this stage of the proposed framework, it is important to acknowledge the role of research into the perceptual–cognitive processes that underlie the interpretation of medical images. Renfrew et al. [24] suggest that such research should investigate fixation clusters and dwell time, the availability of clinical history and comparative images, and the application of artificial intelligence. This can be achieved in both experimental [25,26] and clinical environments [27,28]. The issue of primary interest in the context of the overall framework is the potential of an observer to report films.

Nodine and Kundel emphasize the importance of training and experience in the detection of abnormalities, and that skilled search involves comprehensive, systematic sampling of clinically pertinent areas [29]. A study by Carr and Mugglestone [30] recorded the visual search behaviour of radiographers when viewing chest radiographs in experimental conditions. They found that radiographers had patterns of search strategy comparable with those of radiologists and achieved a high rate of agreement regarding the presence or absence of abnormalities. There is also evidence that selectively trained radiographers can effectively report radiographs in clinical practice to a level of accuracy equal to that of radiologists [31]. Therefore, radiographers' search patterns and ability to detect abnormalities when viewing radiographs are similar to those of radiologists, and training programmes have provided them with the clinical knowledge, skills and experience to interpret abnormal image appearances. This suggests that the search behaviour of radiographers should be assessed for a wide range of patients with diverse conditions and for images from various modalities. Similar findings would provide justification for investing in the extension of radiographers' reporting role to other areas of image interpretation. In conclusion, just as assessing image quality demonstrates the potential of the imaging system, analysis of observers' search patterns reflects their potential for reporting.
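The eye-movement measures referred to above, such as dwell time on a clinically pertinent area, are typically summarized from fixation-level eye-tracking output. The following is a minimal sketch of such a summary; the data structure, values and region of interest are hypothetical and purely illustrative, not taken from the studies cited.

```python
# Illustrative sketch only: summarizing eye-tracking fixations against a region of
# interest (ROI), e.g. the area containing an abnormality on a chest radiograph.
# The data structure and figures are hypothetical.
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float             # horizontal position on the image (pixels)
    y: float             # vertical position on the image (pixels)
    duration_ms: float   # length of the fixation (milliseconds)

def dwell_time_in_roi(fixations, roi):
    """Total fixation time falling inside a rectangular ROI (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = roi
    return sum(f.duration_ms for f in fixations
               if x0 <= f.x <= x1 and y0 <= f.y <= y1)

# Hypothetical reading of one radiograph by one observer.
fixations = [Fixation(210, 340, 280), Fixation(225, 355, 410),
             Fixation(600, 120, 190), Fixation(215, 348, 520)]
roi = (180, 320, 260, 380)  # area containing the abnormality

in_roi = dwell_time_in_roi(fixations, roi)
total = sum(f.duration_ms for f in fixations)
print(f"Dwell time on abnormality: {in_roi:.0f} ms "
      f"({100 * in_roi / total:.0f}% of total viewing time)")
```

Comparing such summaries between radiographers and radiologists is one way in which the similarity of search patterns reported by Carr and Mugglestone [30] could be quantified.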
Diagnostic Performance
The second level of Fineberg's hierarchy is diagnostic accuracy, and is concerned with whether the application of imaging allows health care workers to make a more accurate assessment of the presence and severity of a disease. The precision of a technology, in terms of both intra- and inter-observer reliability, should also be established [20]. It is important to emphasize that this concept is a joint function of the images produced and the observer's performance [21]. Indeed, the effect of observer variation when comparing different diagnostic methods is rarely satisfactorily taken into consideration. For example, observer performance changes depending on the way in which the image is displayed [32], and overall performance can be confounded by differences in experience between observers [33]. This level of the proposed framework is concerned with establishing radiographers' ability to interpret images accurately and reliably. These studies are usually performed as part of a training programme, which involves assessment both in controlled (i.e. examination) conditions using objective structured clinical examinations (OSCEs) and during clinical practice [34]. The environment in which radiographers are assessed will affect what one can measure. However, while it is necessary to establish efficacy, high levels of accuracy and reliability do not reflect effectiveness. Only the assessment of the effect of radiographic reporting on diagnostic, therapeutic and patient outcome can achieve this. Measuring radiographers' performance in controlled conditions in comparison with a 'reference standard' is an assessment of validity.
Because a retrospective sample of films is selected, it is possible to generate a double- or triple-blind consultant radiological report, which should produce valid results. Because of the possibility of adverse effects following reporting errors in clinical practice, high levels of accuracy are required, such as 95% [34]. Careful selection of films with a high prevalence and a broad spectrum of disease is necessary so that observers' performance is not falsely elevated [35]. Moreover, the appropriate sample size should be calculated depending upon how precise an estimate of sensitivity and specificity is needed [36]. To quantify the size of any shortfall of radiographers' performance from the standard, a confidence interval is appropriate [37]. When radiographers are assessed in clinical practice, it is often only logistically feasible to compare their performance with that of a single consultant radiologist [15, 31]. There is evidence, however, that the magnitude of inter-observer variation between consultant radiologists in plain film reporting is considerable [38]. Radiographers have also attained levels of 'accuracy' comparable with those of radiologists [15, 31]. Therefore, a single radiologist's report should not be classified as the reference standard unless it is that of a consultant radiologist with extensive experience in the specific speciality, whose reports are validated by appropriate clinical follow-up. Otherwise, it is necessary to assess the degree of agreement between the radiographers and the radiologist as a measure of inter-observer reliability using the kappa statistic [39]. In clinical practice there are also various factors that can affect performance, such as disturbances, different times of the day, time constraints, equivocal cases and the length of reporting sessions [40–42]. Because of the subjectivity involved in a cognitive task like film interpretation, and the heterogeneity of reporting conditions, it is important to demonstrate that observers can perform consistently. This can be achieved by reinterpreting a subsample of films as a measure of intra-observer reliability. Assessment in clinical practice also raises the issue of independence [43], as radiographers' performance can be dramatically influenced by prior knowledge of another report [44]. The method for eliminating this source of bias is to ensure that whoever is under evaluation is unaware of, or 'blind' to, any other relevant report or advice. The degree to which the assessment is intended to be pragmatic determines whether radiographers should have access to previous relevant reports, other colleagues' advice including that of the referring clinician, or even discussion with patients.
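To make the accuracy and agreement measures discussed above concrete, the following is a minimal sketch with hypothetical counts: sensitivity and specificity against a reference standard, each with a Wilson 95% confidence interval, and Cohen's kappa for radiographer–radiologist agreement.

```python
# Minimal sketch with hypothetical counts: accuracy of a reporting group against a
# reference standard, plus inter-observer agreement (Cohen's kappa) with a radiologist.
import math

def wilson_ci(successes, n, z=1.96):
    """Wilson 95% confidence interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def cohens_kappa(a, b, c, d):
    """Kappa from a 2x2 agreement table: a = both abnormal, b/c = disagreements, d = both normal."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical results: 200 films with the abnormality, 300 without (reference standard).
tp, fn = 186, 14     # radiographer calls on abnormal films
tn, fp = 279, 21     # radiographer calls on normal films
sens, spec = tp / (tp + fn), tn / (tn + fp)
sens_lo, sens_hi = wilson_ci(tp, tp + fn)
spec_lo, spec_hi = wilson_ci(tn, tn + fp)
print(f"Sensitivity {sens:.2f} (95% CI {sens_lo:.2f} to {sens_hi:.2f})")
print(f"Specificity {spec:.2f} (95% CI {spec_lo:.2f} to {spec_hi:.2f})")

# Hypothetical radiographer-versus-radiologist agreement table for the same 500 films.
print(f"Kappa = {cohens_kappa(a=180, b=20, c=25, d=275):.2f}")
```

The same confidence intervals indicate how far performance could plausibly fall short of a prespecified standard such as 95% [34, 37].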
Diagnostic Outcome

This level of Fineberg's hierarchy assumes that patient outcome cannot be affected by the imaging information unless it leads the clinician to do something different. The imaging information may change the clinician's diagnostic certainty or differential diagnosis, strengthen a competing diagnostic hypothesis, or simply reassure the clinician that no occult abnormality is present [45].
This level is also concerned with the extent to which the application of imaging displaces alternative diagnostic techniques, and thus the economic impact of the technology [4]. Studies have shown that the principal qualities of a report, as judged by clinicians, are timeliness, reliability, accuracy, clarity, brevity and clinical correlation [10, 46]. The clarity and certainty conveyed in a report are particularly important. Although a report may be completely accurate, if it is also complex and equivocal it may impede effective communication and confuse the referring clinician [47]. There is also evidence that the readability of a report varies among individual members of the same profession and when interpreting images from different modalities [47]. Therefore, variation in the readability, content and certainty with which members of different professions report could have an impact on the diagnostic process by affecting the referring clinician's diagnostic confidence. This can be assessed by asking the clinician to record their diagnosis and confidence before the reports from the different groups are available, and by repeating the process on receipt of the reports [48]. The most common method for quantifying diagnostic confidence is to use direct estimation techniques such as a visual analogue scale [39]. Although a positive change by no means indicates an improvement in patient management or outcome, clinicians often place great value on results that do nothing more than reassure them [21]. Studies that measure the effect of reports on clinicians' subjective diagnostic probabilities could demonstrate which group induced the most change in clinicians' thinking. This can help to reduce both clinician and patient anxiety if the clinician is more confident in their diagnoses. The second aim of this level is to evaluate the economic impact of a group of observers, with the emphasis on displacement of alternative groups. For example, radiographic reporting could replace radiological reporting if it is more accurate and improves clinicians' diagnostic thinking, or if it is equally accurate at significantly less cost. Although radiographers have been shown to report at a similar level of accuracy to radiologists, a robust economic evaluation has not yet been undertaken to assess the cost-effectiveness of radiographic plain film reporting [49]. The final aim of this level is to assess whether radiographic reporting is complementary to radiological reporting, rather than directly substitutive. This addresses the concept of dual or multiple reading, which requires the independent and combined assessment of two or more groups, such as radiographers and emergency nurse practitioners reporting accident and emergency (A&E) films [50]. To establish the most effective order of reporting, patients could be randomized to different predetermined access routes determining who interprets the image first. Dual reading of mammograms [51], barium enemas [52] and CT images [53] has been shown to be effective.
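For illustration, the pre- and post-report confidence assessment described above might be summarized as follows. The visual analogue scale values are hypothetical, and for brevity the sketch omits the comparison between reporting groups that a real study would make.

```python
# Illustrative sketch: change in referring clinicians' diagnostic confidence,
# recorded on a 0-100 visual analogue scale before and after receipt of the report.
# The values are hypothetical; in practice the change induced by each reporting
# group's reports would be compared.
before = [40, 55, 60, 35, 70, 50, 45, 65]   # confidence before the report
after  = [75, 80, 60, 55, 85, 70, 50, 90]   # confidence after the report

changes = [a - b for a, b in zip(after, before)]
mean_change = sum(changes) / len(changes)
increased = sum(1 for c in changes if c > 0)

print(f"Mean change in confidence: {mean_change:+.1f} VAS points")
print(f"Reports followed by increased confidence: {increased}/{len(changes)}")
```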
Therapeutic Outcome

A test result may influence a clinician's diagnostic thinking and still have no impact on patient management. A clinician may be unaware of the significance of a result, or be unfamiliar with the available treatment.
The change in probability of disease may be insufficient to alter therapy. There may be no therapy available, or the patient may already be receiving the best therapy [20]. Conversely, changes in the clinician's diagnostic confidence may lead to the initiation of new and more appropriate therapy, maintenance of the optimum treatment plan, or elimination of the need for therapy [21]. Some would also argue that a test result may not change management but could directly affect patient prognosis by excluding disease and reassuring the patient [54]. To assess the effect of different reporting groups on the planning and delivery of therapy, the clinician should record patient management before and after the reports are known. The most rigorous methodology for addressing this problem would be a trial involving the random assignment of images to alternative groups of professionals, with patient follow-up after an acceptable time to compare observed changes in management. Currently, there are only a few observational studies that have assessed the impact of different reporting groups on patient management [55, 56]. Rather than measuring changes in management as described above, these studies record whether film interpretation errors would have been clinically important. A combination of the two approaches would be most valuable, in that recording pre- and post-report plans allows quantification of management changes, which can then be judged qualitatively as being clinically significant or not.
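A rough, purely illustrative sketch of the sample size such a randomized comparison might require is given below, using the standard normal-approximation formula for comparing two proportions; the assumed proportions of changed management plans are hypothetical.

```python
# Rough sample-size sketch for a randomized comparison of two reporting groups,
# where the outcome is the proportion of patients whose management plan changes.
# The assumed proportions are hypothetical and purely illustrative.
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# e.g. management changed after 30% of reports in one arm versus 20% in the other.
print(n_per_group(0.30, 0.20))  # patients needed in each arm
```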
Patient Outcome

This level of Fineberg's hierarchy is concerned with the effect of an imaging technology on a patient's health. This is the first point at which the expected costs of a technology, such as radiation, monetary expense, pain and risk to life, can be directly weighed against expected benefits, such as improved life expectancy and quality of life [3]. In the context of image interpretation, the important question is whether the report produced by one professional group, in comparison with another, results in improved patient outcome. The cost of an incorrect report may include patients receiving further unnecessary radiation or invasive tests, which has both financial and health consequences. The benefit of a correct report could be estimated as reduced patient anxiety or improved patient outcome. The difficulty in attributing costs and benefits to different reporting groups is that patient outcome is mediated by the clinician's response to the report, in terms of diagnostic confidence and the appropriate and timely application of treatments. Even if radiographic reporting can be associated with benefits to the patient, it is difficult to measure such changes [4]. Nevertheless, quality of life questionnaires and crude outcome measures could be used to measure patient health status and the number of days off work, respectively [57]. A recent study assessed the effect of radiographic plain film reporting on patient outcome. The hypothesis was that if a significant bony injury was missed and caused persistent symptoms or disability, the patient would be expected to re-attend for further examination of the symptomatic area [58].
The study concluded that appropriately trained radiographers could undertake reporting of selected skeletal examinations for A&E patients. However, the use of re-attendance as a proxy for patient outcome has limited validity, as it does not properly account for false-positive reports or the morbidity of patients who suffer but do not re-attend. There are at present no studies that assess the effect of radiographic plain film reporting on patient outcome using the measures referred to above. In this context such an extensive evaluation may not be necessary. However, reduced patient mortality has been demonstrated for screening mammography [59], and this is partly determined by the sensitivity of mammography as a screening test. Therefore, patient outcome can be influenced by who, or what combination of observers, is involved in the reporting process. This level of evaluation would be more applicable to assessing radiographic reporting of ultrasound examinations and barium enemas.
Social Efficiency

This final level of the evaluative hierarchy has only been advocated more recently [21]. Studies at this level go beyond measuring the clinical effects of a technology on individuals to ask whether the cost of the technology is acceptable to society. For the policy-maker entrusted with making resource allocations, it is necessary to assess the extent to which reporting is an efficient use of resources to provide benefits to society [21]. A cost–benefit analysis (CBA) could be performed to demonstrate 'social efficiency'. This requires all the outputs of radiographic and radiological reporting to be valued in monetary terms, by directly asking individuals how much they would be willing to pay for the respective change in health. The analyses would include not only the direct costs to the NHS but also the personal costs borne by patients and their families, such as time and travelling expenses. The cost of back problems to society is well documented [60], as is the benefit to society of the widespread use of mammography for screening and detection of breast cancer [61]. Such costs and benefits could vary depending on who is involved in the reporting policy.
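A minimal sketch, with entirely hypothetical figures, of how a CBA of this kind aggregates monetary values: a willingness-to-pay valuation of the health change is set against the direct NHS costs and the personal costs borne by patients.

```python
# Purely illustrative cost-benefit sketch per 1000 reported examinations.
# All figures are hypothetical; a real CBA would derive them from willingness-to-pay
# surveys and detailed costing of the reporting service.
willingness_to_pay = 48_000   # societal valuation of the health gain (GBP)
nhs_costs = 22_000            # reporting time, training, repeat investigations (GBP)
patient_costs = 6_500         # travel, time off work and other patient-borne costs (GBP)

net_social_benefit = willingness_to_pay - (nhs_costs + patient_costs)
print(f"Net social benefit: GBP {net_social_benefit:,} per 1000 examinations")
# Comparing this figure under alternative reporting policies is the basis of the
# policy choice at this level of the hierarchy.
```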
DISCUSSION

This framework facilitates the process of differentiating between features inherent to image interpretation (technical competence and diagnostic performance) and those that influence patients' subsequent diagnosis, management and health. Levels 1 and 2 primarily relate to the assessment of efficacy, whereas levels 3 to 5 are concerned with effectiveness. With respect to efficiency, each level can be regarded as an effect or output and related to costs (i.e. cost per unit of output). For levels 2–4, it is possible to compute cost per correct diagnosis, cost per change in diagnostic thinking or cost per changed management plan. At these levels a cost-effectiveness analysis (CEA) would be performed, which assumes the output is in some sense 'worth having' [62]. Level 5 focuses on how reporting by different professional groups can affect the quality of life of a patient; this can be used in a cost–utility analysis, which is broader than a CEA.
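As an illustration of the cost-per-output comparisons implied at levels 2–4, the following sketch computes cost per correct diagnosis for two reporting policies and the incremental cost of the more expensive policy per additional correct diagnosis; all figures are hypothetical.

```python
# Illustrative cost-effectiveness sketch with hypothetical figures: cost per correct
# diagnosis under two reporting policies, and the incremental ratio between them.
def cost_per_correct(total_cost, correct_diagnoses):
    return total_cost / correct_diagnoses

# Hypothetical results for 1000 films reported under each policy.
radiographer = {"cost": 9_000.0, "correct": 930}
radiologist = {"cost": 14_000.0, "correct": 945}

for name, policy in (("Radiographer", radiographer), ("Radiologist", radiologist)):
    ratio = cost_per_correct(policy["cost"], policy["correct"])
    print(f"{name}: GBP {ratio:.2f} per correct diagnosis")

# Incremental cost per additional correct diagnosis of the more expensive policy.
icer = (radiologist["cost"] - radiographer["cost"]) / (radiologist["correct"] - radiographer["correct"])
print(f"Incremental cost per extra correct diagnosis: GBP {icer:.0f}")
```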
Finally, level 6 addresses the consequences to society of reporting by different health care professionals, in the form of a CBA. This framework provides the basis for estimating the net value of alternative reporting policies in terms of the clinical, economic and social concepts described in health care evaluation [3]. The method that would produce the most robust results would be a rigorously designed and conducted randomized controlled trial to evaluate the effectiveness (levels 3, 4 and 5) of different reporting groups (e.g. radiographers versus radiologists). It will never be possible to eradicate completely the heterogeneity associated with human interpretation of images, although the implementation of role extension initiatives in the form of selectively trained radiographers has been shown to improve performance [15, 31]. Furthermore, artificially intelligent devices are being developed that may replace human observers in tasks such as the detection of breast cancer [9]. However, by evaluating the wider implications of image interpretation using the proposed theoretical framework, underpinned by the rigorous and thoughtful application of research methodology, it will be possible to generate the knowledge to comprehensively inform policy and practice.

Acknowledgements. The author would like to acknowledge the advice of the two referees and Dr Roderick Mackenzie.
REFERENCES

1 Department of Health. Research for health: a research and development strategy for the NHS. London: Department of Health, 1991.
2 Health Technology Assessment Advisory Group. Assessing the effects of health technologies. London: Department of Health, 1992.
3 Russell I. The evaluation of computerised tomography: a review of research methods. In: Culyer AJ, Horisberger B, eds. Economic and medical evaluation of health care technologies. Berlin: Springer-Verlag, 1983, 298–316.
4 Mackenzie R, Dixon AK. Measuring the effects of imaging: an evaluative framework. Clin Radiol 1995;50:513–518.
5 Shapiro SH, Wyman SM. Cat fever. N Engl J Med 1976;294:954–956.
6 Fineberg HV, Bauman R, Sosman M. Computerized cranial tomography: effect on diagnostic and therapeutic plans. JAMA 1977;238:224–227.
7 Maisey MN, Hutton J. Guidelines for the evaluation of radiological technologies. London: British Institute of Radiology, 1991.
8 Swinburne K. Pattern recognition for radiographers. Lancet 1971;1:589–590.
9 Robinson PJA. Radiology's Achilles' heel: error and variation in the interpretation of the Röntgen image. BJR 1997;70:1085–1098.
10 Audit Commission. Improving your image: how to manage radiology services more effectively. London: HMSO, 1995.
11 The College of Radiographers. Role development in radiography. London: College of Radiographers, 1996.
12 The Royal College of Radiologists and The College of Radiographers. Inter-professional roles and responsibilities in a radiology service. London: The Royal College of Radiologists & The College of Radiographers, 1998.
13 Pauli R, Hammond S, Cooke J, Ansell J. Radiographers as film readers in screening mammography: an assessment of competence under test and screening conditions. BJR 1996;69:10–14.
14 Bates JA, Conlon RM, Irving HC. An audit of the role of the sonographer in non-obstetric ultrasound. Clin Radiol 1994;49:617–620.
15 Loughran CF. Reporting on fracture radiographs by radiographers: the impact of a training programme. BJR 1994;67:945–950.
16 Meek S, Kendall J, Porter J, Freij R. Can accident and emergency nurse practitioners interpret radiographs? A multicentre study. J Accid Emerg Med 1998;15:105–107.
17 Kelly S, Berry E, Roderick P, et al. The identification of bias in studies of the diagnostic performance of imaging modalities. BJR 1997;70:1028–1035.
18 Donabedian A. Evaluating the quality of medical care. Milbank Memorial Fund Quarterly 1966;44 suppl:166–206.
19 Institute of Medicine. Policy statement: computed tomographic scanning. Washington DC: National Academy of Sciences, 1977.
20 Guyatt GH, Tugwell PX, Feeny DH, Haynes RB, Drummond M. A framework for clinical evaluation of diagnostic technologies. Can Med Assoc J 1986;134:587–593.
21 Fryback DG, Thornbury JR. The efficacy of diagnostic imaging. Med Decis Making 1991;11:88–94.
22 Berlin L. Malpractice issues in radiology: perceptual errors. AJR 1996;167:587–590.
23 Berlin L. Malpractice issues in radiology: errors in judgement. AJR 1996;166:1259–1261.
24 Renfrew DL, Franken EA, Berbaum KS, Weigelt FH, Abu-Yousef MM. Errors in radiology: classification and lessons in 182 cases presented at a problem case conference. Radiology 1992;183:145–150.
25 Schreiber MH. The clinical history as a factor in roentgenogram interpretation. JAMA 1963;185:137–139.
26 Potchen EJ, Gard JW, Lazar P, Lahaie P, Andary M. The effect of clinical history data on chest film interpretation: direction or distraction. Invest Radiol 1979;14:404.
27 Berbaum KS, El-Khoury GY, Franken EA, Kathol M, Montgomery WJ, Hesson W. Impact of clinical history on fracture detection with radiography. Radiology 1988;168:507–511.
28 White K, Berbaum K, Smith WL. The role of previous radiographs and reports in the interpretation of current radiographs. Invest Radiol 1994;29:263–265.
29 Nodine CF, Kundel HL. The cognitive side of visual search in radiology. In: O'Regan JK, Levy-Schoen A, eds. Eye movements: from physiology to cognition. Amsterdam: Elsevier Science Publishers BV, 1987.
30 Carr D, Mugglestone M. Visual search strategies of radiographers: evidence for role extension. EJR 1997;7:179.
31 Robinson PJA. Short communication: plain film reporting by radiographers – a feasibility study. BJR 1996;69:1171–1174.
32 Krupinski EA, Weinstein RS, Rozek LS. Experience-related differences in diagnosis from medical images displayed on monitors. Telemed J 1996;2:101–108.
33 Kido S, Ikezoe J, Takeuchi N, et al. Interpretation of subtle interstitial lung abnormalities: conventional versus storage phosphor radiography. Radiology 1993;187:527–533.
34 Prime NJ, Paterson AM, Henderson PI. The development of a curriculum: a case study of six centres providing courses in radiographic reporting. Radiography 1999;5:63–70.
35 Ransohoff DF, Feinstein AR. Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. N Engl J Med 1978;299:926–930.
36 Freedman LS. Evaluating and comparing imaging techniques: a review and classification of study designs. BJR 1987;60:1071–1081.
37 Altman DG. Practical statistics for medical research. London: Chapman & Hall, 1991.
38 Robinson PJA, Wilson D, Coral A, Murphy A, Verow P. Variation between experienced observers in the interpretation of accident and emergency radiographs. BJR 1999;72:323–330.
39 Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use, 2nd ed. Oxford: Oxford University Press, 1995.
40 Gale AG, Murray D, Millar K, Worthington BS. Circadian variation in radiology. In: Gale AG, Johnson F, eds. Theoretical and applied aspects of eye movement research. Amsterdam: North Holland, 1984.
41 Bryan S, Weatherburn G, Roddie M, Keen J, Muris N, Buxton MJ. Explaining variation in radiologists' reporting times. BJR 1995;68:854–861.
42 Robinson PJA, Fletcher JM. Clinical coding in radiology. Imaging 1994;6:133–142.
43 Sackett DL, Haynes RB, Guyatt GH, Tugwell P. Clinical epidemiology: a basic science for clinical medicine, 2nd ed. London: Little, Brown and Company, 1991.
44 Aideyan UO, Berbaum K, Smith WL. Influence of prior radiological information on the interpretation of radiographic examinations. Acad Radiol 1995;2:205–208.
45 Thornbury JR. Clinical efficacy of diagnostic imaging: love it or leave it. AJR 1994;162:1–8.
46 Lafortune M, Breton G, Baudouin J-L. The radiological report: what is useful for the referring physician? J Can Assoc Radiol 1988;39:140–143.
47 Sierra AE, Bisesi MA, Rosenbaum TL, Potchen EJ. Readability of the radiologic report. Invest Radiol 1992;27:236–239.
48 Wittenberg J, Fineberg HV, Ferrucci JT Jr, et al. Clinical efficacy of computed body tomography II. AJR 1980;134:111–120.
49 The College of Radiographers. Reporting by radiographers: a vision paper. London: The College of Radiographers, 1997.
50 Remedios D, Ridley N, Taylor S, de Lacey G. Trauma radiology: extending the Red Dot system. Radiology 1998;60.
51 Anderson EDC, Muir BB, Walsh JS, Kirkpatrick AE. The efficacy of double reading mammograms in breast screening. Clin Radiol 1994;49:248–251.
52 Markus JB, Somers S, O'Malley BP, Stevenson GW. Double-contrast barium enema studies: effect of multiple reading on perception error. Radiology 1990;175:55–56.
53 Naik KS, Spencer JA, Craven C, McClellan K, Robinson PJ. Computed tomography in staging lymphoma: a comparison of contiguous with interval 10 mm slices. Boston: Proc American Roentgen Ray Society, 1997.
54 Kelsey Fry I. Who needs high technology? BJR 1984;57:765–772.
55 De Lacey G, Barker A, Harper J, Wignall B. An assessment of the clinical effects of reporting accident and emergency radiographs. BJR 1980;53:304–309.
56 Snow DA. Clinical significance of discrepancies in roentgenographic film interpretation in an acute walk-in area. J Gen Intern Med 1986;1:295–299.
57 Fletcher A, Gore S, Jones D, Fitzpatrick R, Spiegelhalter D, Cox D. Quality of life measures in health care. II: design, analysis, and interpretation. BMJ 1992;305:1145–1148.
58 Robinson PJA, Culpan G, Wiggins M. Interpretation of selected accident and emergency radiographic examinations by radiographers: a review of 11 000 cases. BJR 1999;72:1–6.
59 Shapiro S, Venet W, Strax P, Venet L, Roeser R. Ten- to fourteen-year effect of screening on breast cancer mortality. J Natl Cancer Inst 1982;69:349–355.
60 Clinical Standards Advisory Group. Epidemiology review: the epidemiology and cost of back pain. London: HMSO, 1994.
61 Eddy DM. Screening for breast cancer. Ann Intern Med 1989;111:389–399; 858–859.
62 Drummond MF, Stoddart GL, Torrance GW. Methods for the economic evaluation of health care programmes. Oxford: Oxford Medical Publications, 1987.