
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 39, NO. 2, MARCH 2009

The Design and Evaluation of a Computerized and Physical Simulator for Training Clinical Prostate Exams

Gregory J. Gerling, Member, IEEE, Sarah Rigsbee, Reba Moyer Childress, and Marcus L. Martin

Abstract—The most effective screening for prostate cancer combines the prostate specific antigen blood test with the digital rectal examination (DRE). In performing a DRE, two sequential tasks are completed: (task a) palpating the prostate to identify abnormalities and (task b) linking identified abnormalities to a disease diagnosis. At present, clinicians find too few abnormalities and have variable rates of detection, due in part to the inadequacy of training simulators. The Virginia Prostate Examination Simulator (VPES) was designed, built, and tested to address the inadequacies of current simulators by incorporating the design requirements of the basic elements of accurate anatomy, multiple and reconfigurable scenarios of graded difficulty, and technique and performance feedback. We compared the training effectiveness of the VPES with two commercial simulators in an experiment of 36 medical and nurse practitioner students. Results indicate that each type of training simulator improved abilities in general. Upon closer analysis, however, the following key patterns emerge: 1) Across all types of training, more deficiencies lie in skill-based rather than rule-based decision making, which improves only for VPES trainees; 2) only VPES training transfers both to other simulators and previously unencountered scenarios; 3) visual feedback may increase the number of abnormalities reported yet hinder the ability to discriminate; and 4) applied finger pressure did not correlate with the ability to identify abnormalities.

Index Terms—Evaluation, haptics, medical, nursing, simulation, simulator, tactile, training.

I. INTRODUCTION

Prostate cancer has the second highest incidence rate (one in six) among cancers in American men, with an estimated 218 890 new cases during 2007 [1]. When diagnosed at an early and less aggressive stage, the five-year survival rate approaches 100%, compared to 88% for the 33% of patients with late stage diagnoses [2]–[4]. To promote early detection, the American Cancer Society advises screening via two diagnostic tools, which are the digital rectal examination (DRE) and

Manuscript received December 21, 2007; revised June 9, 2008. First published December 22, 2008; current version published February 19, 2009. This work was supported by an Undergraduate Medical Education Research Grant from the Academy of Distinguished Educators, School of Medicine, University of Virginia. This paper was recommended by Associate Editor C. P. Nemeth. G. J. Gerling and S. Rigsbee are with the Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA 22904 USA (e-mail: [email protected]; [email protected]). R. M. Childress is with the School of Nursing, University of Virginia, Charlottesville, VA 22904 USA (e-mail: [email protected]). M. L. Martin is with the School of Medicine, University of Virginia, Charlottesville, VA 22908 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/TSMCA.2008.2009769

prostate specific antigen (PSA) blood test [5]. While the PSA blood test is the most effective detection method, it is nearly always accompanied by the DRE. The DRE is inexpensive and offers rapid diagnostic feedback. The DRE is important because the PSA blood test tends to overdiagnose (67% of findings are false positives) [6]–[8] and miss cancerous tumors (17% of tumors are detected by DRE alone) [6], [9]. In addition, the vast majority of cancerous tumors arise in the peripheral zone of the prostate gland, which coincidentally is accessible by palpation with the finger [10]. However, the DRE's positive predictive value (17%–34% [6]) is variable, and agreement between examiners on diagnosis is low (21%–40% [11], [12]). Both may be attributed to inconsistent, insufficient, or inadequate training of clinicians [13] and lack of continued practice thereafter [14].

Hands-on training is increasingly being emphasized in medical and nursing schools via the development of simulated mannequins [15], [16], part-task synthetic models [17]–[19], and virtual reality devices [20], [21]. The need is particularly acute, as the number of medical students is expected to increase by 30% over the next ten years [22] and expectations on performance continue to rise. One known impediment to learning is that the curriculum tends to heavily impart background information on basic science and clinical methods, with little hands-on training. Hands-on training, in the clinical breast exam, is known to increase tumor detections by 44%–66% [23], along with consistency [23]–[25] and reliability [26]. Typically, hands-on training utilizes either standardized patients or physical simulators. While exposure to standardized patients improves trainee knowledge of anatomy and interaction with patients [27]–[29], the volunteers are typically healthy. In contrast, while physical simulators simulate abnormal disease states (with rubber synthetics) and improve the number and sizes of abnormalities detected [30]–[32], current simulators often present a small range of disease cases and offer little performance feedback. To address these issues, Gerling and Thomas's breast cancer simulator was built with reconfigurable scenarios and augmented feedback via pulsing balloons [17], [18], and Pugh's pelvic simulator provides visual feedback on hand position [19]. Force feedback devices (e.g., SensAble Phantom) with virtual reality are another means of simulation, but most deliver forces to a pen rather than directly to the finger pad [33]–[35] and thus are not appropriate for palpation tasks.

To isolate learning issues specific to the DRE, we analyzed the curriculum for teaching this exam [36]–[38].


TABLE I TRAINEE NEEDS, SIMULATOR TRAINING ISSUES, AND DESIGN REQUIREMENTS. “LEVEL” IS THE DECISION MAKING LEVEL; “IDEAL TRAINING” IS THE ABILITIES THAT STUDENTS HAVE AT A LEVEL AND WHAT TRAINING THEY NEED TO IMPROVE

Using task analysis and Rasmussen’s skill, rule, and knowledge theory of decision making [39], [40], we found that clinicians must learn to complete two tasks sequentially: (task a) palpating the prostate to identify abnormalities and (task b) linking identified abnormalities to a disease diagnosis. The following two typical physical simulators for training the DRE were then analyzed: 1) the G300 life-size prostate model set (Anatomical Chart Company, Skokie, IL) referred to hereinafter as the static without torso (SWOT) and 2) the life/form prostate examination simulator (Nasco, Fort Atkinson, WI) referred to hereinafter as the static with torso (SWT). Three main issues were identified. The first issue is that one simulator unintentionally provides feedback via a visual cue tangential to the exam, while the second provides no feedback. The only feedback comes indirectly due to lack of a torso for the SWOT simulator, where cross sections of the prostate alone are palpated on the tabletop. While trainees can visually match surface landmarks to a disease, this inaccurate conceptual understanding is not extendable to unfamiliar disease cases. The second issue is that the simulators present so few scenarios (SWT = 3 and SWOT = 4) that the location and size of tumors and palpable landmarks on the prostate surface can be easily memorized. Trainees who memorize abnormality locations likely gain little from repetitive practice. The low number of scenarios also provides little opportunity for learners to relate combinations of symptoms to diagnosis over a range of disease cases. The third issue is that the simulators either misrepresent or exclude important contextual and tactile cues (e.g., the posterior section, rectal wall, prostate size and stiffness, and, most importantly, oncologically accurate disease cases). In general, a lack of contextual cues hinders one’s ability to connect classroom knowledge with hands-on skills [41]. The absence of the rectal wall (i.e., a physical barrier between finger and prostate) may alter tactile sensitivity [42] and finger mobility. The SWOT excludes both. As SWT prostates are overly stiff and abnormally large in size, they provide a false baseline. In addition to a limited number of simulated prostates, the two simulators focus on carcinoma [SWT = no prostatitis and no benign prostatic hypertrophy (BPH); SWOT = no prostatitis].

II. DESIGN REQUIREMENTS AND ASSESSMENT METRICS

In an attempt to design an improved simulator, requirements were developed to address the aforementioned issues (Table I). The three design requirements are technique and performance feedback, multiple and reconfigurable scenarios of graded difficulty, and basic elements of accurate anatomy. In addition, assessment metrics were developed to test the utility of the design requirements and trainee performance.

1) Technique and Performance Feedback: To address the lack of feedback, finger pressure can be captured and presented to the user in two forms: technique and performance. Technique feedback informs the trainee about his or her finger location and magnitude of pressure exerted, while performance feedback communicates whether a trainee has palpated an abnormality. Splitting the two may help to determine if good technique is directly tied to a level of performance. Moreover, feedback timing, in an immediate mode, helps adapt automatic behaviors (recognition of tactile stimuli), prevent palpation errors (misclassification of abnormalities), and increase learning efficiency (decrease training time) [43].

2) Multiple and Reconfigurable Scenarios of Graded Difficulty: To address the issue of memorization of a small number of scenarios, we introduce the concept of multiple and reconfigurable scenarios of graded difficulty. Multiple scenarios are configurations of more than one tumor location, stiffness, and size for a single disease or cancerous state. The ability to dynamically change tumor locations within a single prostate is the concept of reconfigurable scenarios. In addition to scenarios being multiple and reconfigurable, abnormalities could be presented at levels of graded difficulty, such that they become progressively harder to discriminate from surrounding tissue. Assessing skills over a range of test scenarios may help determine if underlying concepts learned with the simulator transfer to a range of real world patients. The transfer of skills to real patients is ultimately the most important outcome of training [43].

3) Basic Elements of Accurate Anatomy: To address the misrepresentation or exclusion of contextual and tactile cues, the essential elements of anatomy should be included. Some relevant elements include a posterior section, rectal wall, in


Fig. 2. VPES balloon inflation method. Labels are (A) flow of water from main source, (B) valves to each balloon, (C) pressure monitor per balloon, (D) water inflated balloon, (E) backing substrate where pressure sensors are embedded, and (F) prostate silicone elastomer.

Fig. 1. VPES apparatus including the (A) torso, (B) computer, (C and D) custom-built electronics for balloon inflation, (E) electronics for conditioning signals from pressure sensors, (F) multiple instrumented prostates, and (G) an internal track.

addition to accurate prostate stiffness and size, and disease states. Prostates of normal stiffness and size provide a baseline from which trainees can judge subsequent presentations of abnormalities. Disease states that are oncologically accurate may help trainees connect classroom knowledge to the hands-on exam with real patients.

4) Assessment Metrics and Test Scenarios: Current assessment metrics do not sufficiently assess trainee ability. Qualitative and rudimentary quantitative metrics [44] make it difficult to objectively evaluate skills [45], [46]. A typical quantitative metric, used, for example, with the clinical breast exam, relies upon counting the number of tumors found. Metrics were therefore needed to assess the abilities both to detect abnormalities other than carcinoma alone (task a) and to link identified abnormalities with one of the multiple diagnoses (task b).

III. METHODS

The design requirements are implemented in the Virginia Prostate Examination Simulator (VPES), which is a custom-built computerized and physical simulator to improve DRE prostate palpation abilities. The design of the VPES seeks to support the two tasks identified, as measured via the newly designed assessment metrics, which evaluate trainee performance improvement and transfer of training to other simulators. An experiment with 36 medical and nurse practitioner students analyzed the impact of the type of simulator training (SWT, SWOT, or VPES) on performance.

A. Apparatus: VPES

The design of the VPES combines the use of silicone-elastomer materials to simulate the feel of tissue and a computer to generate reconfigurable scenarios [17], [18] and utilizes balloons and sensors embedded within instrumented prostates. From Fig. 1, the main components of the apparatus include the (A) torso, (B) computer, (C and D) custom-built electronics for balloon inflation, (E) electronics for conditioning signals from pressure sensors, (F) multiple instrumented prostates, and

Fig. 3. (a) Examples of prostate diseases modeled by VPES. (b) Four reconfigurations of a carcinoma prostate. (c) Balloon layout for the instrumented prostates (left to right) A, B, and C. Balloon numbers 1–4 correspond to the small balloons. For prostate A, the inner circle denotes a medium balloon under the median sulcus (vertical line represents this groove), which simulates a (boggy) partially or (firm) totally inflamed median sulcus, as found in prostatitis. Also in prostate A, the two oval shapes denote large balloons, which, when inflated, individually simulate asymmetric inflammation (prostatitis) or, in tandem, simulate symmetric inflammation (BPH). Prostates B and C are carcinoma only prostates.

(G) an internal track. The computer displays visual feedback to trainees. 1) Basic Elements of Accurate Anatomy: To promote a realistic training environment, the torso and rectal wall were constructed. The torso’s exterior skin and rectal wall are made of a pigmented silicone elastomer (BJB Enterprises, Tustin, CA; TC-5015 A/B and TC-5005 A/B/C), and internal structure is made of PVC pipe, pourable foam (BJB Enterprises; TC276 A/B), and steel plate. The instrumented prostates, which are more accurate in size and stiffness, are attached to the internal track, upon which the instrumented prostate under test is rotated into position beneath the rectal wall [Fig. 1(G)]. The dimensions of the simulated prostates are 55 mm (transverse,


TABLE II SCENARIOS SIMULATED FOR EACH PROSTATE DISEASE FOR EACH SIMULATOR

width dimension) by 50 mm (longitudinal, length dimension). This length is 5% less than the SWT and comparable to the SWOT. The width is 30% smaller than the SWT but 13% larger than the SWOT. Although large variation in prostate size has been reported (26–80 mm transverse and 22–59 mm longitudinal [48], [49]), the chosen sizes well simulate the cases of disease, which lie toward the middle to upper end of the scale. Silicone-elastomer stiffness was chosen based on material measurements and a subjective study, both described in detail in Appendix B. Four to six polyethylene balloons, along with four pressure sensors, are embedded in each instrumented prostate. While the deflated balloons cannot be detected, those inflated with water simulate palpable abnormalities (hardness range of 0–35 Shore A durometers similar to the hardness of tumors at 0– 60 durometers) [26], Fig. 2. The evaluation of balloon hardness is described in Appendix B. There are four sizes of small balloons (5 mm × 5 mm, 7.5 mm × 6 mm, 10 mm × 7.5 mm, and 10 mm × 15 mm) in addition to two medium (30 mm × 15 mm) and one large (27.5 mm × 30 mm) size balloons. The small balloons simulate carcinoma nodules. Medium balloons simulate a partially or totally inflamed median sulcus [vertical groove marked by dotted line in third prostate in Fig. 3(a)] or produce a boggy feeling (prostatitis). Large balloons, located at either side of the prostate, simulate asymmetric (prostatitis) or symmetric (BPH) inflammation. In particular, tumors and inflammation can be differentiated by both size and stiffness (tumors in study utilized small balloons and Shore A durometer hardness of 25–30; inflammation utilized medium to large balloons and Shore A durometer hardness of 10–15). Using the embedded balloons, the VPES simulates all prostate disease states (normal, prostatitis, BPH, and carcinoma), in contrast to the other

simulators (SWT = no prostatitis and no BPH and SWOT = no prostatitis), Table II. Balloon location, size, and hardness span the occurrence of disease states, informed by the assessment of experienced clinicians and the literature [23], [26], [36]. 2) Multiple and Reconfigurable Scenarios: In addition to simulating all disease states, the VPES simulates each disease state more than once, which leads to a larger number of total possible scenarios (SWT = 4, SWOT = 6, and VPES = 121), Table II. Fig. 3(b) shows one cancerous prostate, where the four balloons embedded in the instrumented prostate are reconfigured in four ways. Of the six instrumented prostates, three simulate both BPH and prostatitis (with and without carcinoma), while the others simulate carcinoma only. In addition to balloon configuration, the surface texture and the exterior shape of the six prostates differ. Along with enabling a wealth of scenarios, systematic aids allow students to determine how palpable and contextual factors are combined to form a diagnosis. The decision tree teaches “if–then” relationships that are learned informally in the clinic. Decisions are set up in a hierarchical and consecutive format, whereby each “yes” or “no” decision leads ultimately to a specific disease diagnosis (Fig. 4). 3) Visual Display for Technique and Performance Feedback: Two types of user feedback were implemented: technique and performance. Per our previous definition, technique feedback informs a trainee about his or her finger position and magnitude of pressure exerted, both instantaneously and over the last 10 s. As this is a blind exam, the feedback helps trainees orient themselves and ensure that they have palpated the entire prostate’s surface area and depth. The graphical user interface (Labview, version 8.0; National Instruments, Austin, TX) displays the magnitude of finger pressure exerted, via the following: 1) colored tanks that correspond to pressure sensor locations


Fig. 4. Decision tree for rule-based decision making.
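The if–then structure of the decision tree maps naturally onto a few lines of code. The sketch below is illustrative only: the findings it branches on (nodule present, symmetric versus asymmetric enlargement, boggy or tender gland) follow the disease descriptions in Appendix A, but the exact branch order of the tree in Fig. 4 is not reproduced here.

```python
def diagnose(nodule_present: bool, enlarged: bool,
             symmetric: bool, boggy_or_tender: bool) -> str:
    """Illustrative if-then rules in the spirit of Fig. 4 (not the exact tree).

    Findings follow Appendix A: carcinoma presents as small, firm nodules;
    BPH as symmetric, non-tender enlargement; prostatitis as asymmetric
    inflammation or a boggy, tender gland.
    """
    if nodule_present:
        return "carcinoma"
    if enlarged and symmetric and not boggy_or_tender:
        return "BPH"
    if boggy_or_tender or (enlarged and not symmetric):
        return "prostatitis"
    return "normal"


# Example: symmetric, non-tender enlargement with no nodules -> "BPH".
print(diagnose(nodule_present=False, enlarged=True,
               symmetric=True, boggy_or_tender=False))
```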

Fig. 5. Visual feedback display.

(“Finger Pressure by Region” in Fig. 5) and 2) continuous analog plots (“Finger Pressure Display” in Fig. 5). Data to populate the displays were collected via four FlexiForce sensors (Tekscan Inc., South Boston, MA) embedded in the backing of the silicone-elastomer substrate. The 9.53-mm-diameter flat sensors respond to normal force in the 0–1-lb range. Using a force-to-voltage circuit, changes in the sensor’s resistance are converted to voltage and fed into an A/D converter (National Instruments).
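As a rough sketch of that signal path, the snippet below maps raw A/D counts to an approximate force reading. The converter resolution, voltage span, and linear response are assumptions for illustration; the actual calibration of the FlexiForce sensors and force-to-voltage circuit is not given here.

```python
# Minimal sketch of the sensor signal path (assumed values, not the
# calibration used in the VPES): A/D counts -> voltage -> force.

ADC_BITS = 12               # assumed A/D converter resolution
V_REF = 5.0                 # assumed A/D reference voltage (V)
FULL_SCALE_FORCE_N = 4.45   # FlexiForce 0-1 lb range expressed in newtons

def counts_to_force(counts: int,
                    v_at_zero: float = 0.5,
                    v_at_full: float = 4.5) -> float:
    """Convert raw A/D counts to force (N), assuming a linear response of
    the force-to-voltage circuit between v_at_zero and v_at_full volts."""
    volts = counts / (2 ** ADC_BITS - 1) * V_REF
    frac = (volts - v_at_zero) / (v_at_full - v_at_zero)
    frac = min(max(frac, 0.0), 1.0)   # clamp outside the calibrated span
    return frac * FULL_SCALE_FORCE_N

# A mid-scale reading maps to roughly half of the sensor's 1-lb range.
print(round(counts_to_force(2048), 2))
```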

Performance feedback was implemented to: 1) confirm that a trainee palpated an abnormality according to his or her verbal report to the proctor and 2) immediately display that an abnormality is being palpated in the practice session. Often, trainees believe they feel an abnormality but would like immediate confirmation. As implemented, when finger pressure is applied to a water-filled balloon, the change in water pressure is captured, recorded, and displayed to the trainee via continuous analog plots (“Balloon Pressure Display” in Fig. 5). Each inflated balloon is demarcated as a uniquely colored line. Another type of performance feedback was used only if trainees could not find tumors even with the aid of visual feedback. Nonnatural augmented feedback in the form of pulsating balloons is presented via the tactile modality. This type of feedback may help increase tumor detections and decrease false alarms [18] by focusing trainee awareness on specific stimulus dimensions, similar to visual search and detection tasks [50].

B. Experiment

A total of 36 medical (N1 = 33; from second to fourth year) and nurse practitioner (N2 = 3; from first to second year)

Fig. 6. Derived dependent variables.

students participated in a repeated measure experiment designed to test how the type of training impacts DRE performance. The study was approved by the Institutional Review Board at the University of Virginia. Among the participants were 15 males and 21 females, with ages 21–50 (average: 25.4). The following seven steps were performed over 1 h for each participant: 1) informed consent and demographics questionnaire; 2) pretest; 3) 10-min training; 4) 15-min break; 5) posttest-1; 6) posttest-2; and 7) posttest questionnaire. Under the repeated measure design, the type of training (between subjects) was the first independent variable with three levels (SWT, SWOT, and VPES), and 12 participants were randomly assigned to each group. In addition, the VPES type of training was broken into the following two subgroups: 1) no feedback and 2) feedback trainees who received the addition of visual and augmented feedback. A second independent variable (within subjects) was an assessment simulator. For each of pretest and posttest-1, trainees were assessed by all three simulators with four tests conducted per device (i.e., SWT trainees were assessed by four pretests with the SWOT, four pretests with the SWT, and four pretests with the VPES; with similar procedure for posttest-1). Presentation order was counterbalanced. Posttest-2 assessment was conducted with the VPES simulator alone. The assessments on all three simulators and in posttest-2 are intermediate measures of transfer of training (i.e., to other simulators and previously unencountered scenarios on the VPES). The transfer of training is ultimately determined by subsequent ability in the clinic, but this is difficult to capture without a longitudinal study. 1) Dependent Variables: Tumor and inflammation identifications were the two base dependent variables. Eight derived dependent variables were set up to test the design requirements, specifically to account for the abilities to palpate the prostate to identify abnormalities (five-skill-based variables) and to link identified abnormalities to a disease diagnosis (two-rule-based

variables), Fig. 6. The eighth variable (simulator diagnosis) falls into both categories. Simulator diagnosis evaluates both tasks a and b together. For simulator diagnosis, the examiner must first correctly identify the abnormalities presented (task a) and then correctly link those abnormalities to a disease state (task b). An example of correct simulator diagnosis is if the examiner correctly reports the location and characteristics of a tumor and then correctly diagnoses the finding as carcinoma. In contrast, reported diagnosis evaluates only task b. For example, the examiner might incorrectly report a tumor as asymmetric inflammation but then correctly diagnose the inflammation as prostatitis. In this case, the examiner is correct for the reported diagnosis but incorrect for simulator diagnosis. Reported diagnosis therefore evaluates only whether the trainee correctly links abnormalities thought to be palpated with a diagnosis, regardless of what is truly palpated. In this way, rule-based decision making (task b) is decoupled from skill-based palpation (task a). Note that, because the same number of abnormal prostates were not available for each simulator (SWT = 3, SWOT = 4, and VPES was set to four), whole number comparisons could not be performed for the pretest and posttest-1. Therefore, percentage metrics were calculated as follows:

% Correct Identification = (# Correctly Identified Abnormalities) / (# Abnormalities Reported)    (1)

% Unreported Abnormalities = (# Unreported Abnormalities) / (# Abnormalities Presented)    (2)

% False Positives = (# Falsely Identified Abnormalities) / (# Abnormalities Reported)    (3)


TABLE III SCENARIOS UTILIZED FOR PRETEST, POSTTEST-1, AND POSTTEST-2. ONE ABNORMALITY WAS INTRODUCED PER SCENARIO, UNLESS OTHERWISE NOTED IN PARENTHESIS

TABLE IV WITHIN-TRAINING TYPES, ASSESSMENT GROUPED, AND SIGNIFICANCE FROM PRETEST TO POSTTEST-1 (NO SHADED BACKGROUND: p < 0.05; LIGHT GRAY BACKGROUND: p < 0.10; AND DARK GRAY BACKGROUND: p > 0.10)

TABLE V BETWEEN-TRAINING TYPES; SIGNIFICANCE FOR POSTTEST-2

% Misclassified Abnormalities = (# Misclassified Abnormalities) / (# Abnormalities Reported)    (4)
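A minimal sketch of metrics (1)–(4), computed from per-test counts. The decomposition of "abnormalities reported" into correct identifications, false positives, and misclassifications is our reading of the definitions above, and the variable names are ours, not those of the original analysis.

```python
def percentage_metrics(correct: int, false_pos: int, misclassified: int,
                       unreported: int, presented: int) -> dict:
    """Compute metrics (1)-(4) for a single test from raw counts.

    Assumes everything the trainee called an abnormality is either a
    correct identification, a false positive, or a misclassification.
    """
    reported = correct + false_pos + misclassified
    return {
        "% correct identification": correct / reported if reported else 0.0,
        "% unreported abnormalities": unreported / presented if presented else 0.0,
        "% false positives": false_pos / reported if reported else 0.0,
        "% misclassified abnormalities": misclassified / reported if reported else 0.0,
    }

# Example: 3 correct, 1 false positive, 0 misclassified, and 1 of the 4
# presented abnormalities left unreported.
print(percentage_metrics(correct=3, false_pos=1, misclassified=0,
                         unreported=1, presented=4))
```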

A final dependent variable was the magnitude of finger pressure applied by the participants.

2) Test and Training Scenarios: The simulators were configured as in Table III, with four tests per simulator in each assessment session. For the VPES device, three instrumented prostates were used. One simulates (A) both BPH and prostatitis, while the other two simulate (B and C) carcinoma only [Fig. 3(c)]. Because the number and variety of scenarios could not be varied for the SWOT and SWT devices, these groups were trained with the prostates utilized in their device's pretest and posttest-1. However, for VPES trainees, abnormality sizes and locations were reconfigured from the test configuration in the training session. The decision tree was used in conjunction with only the VPES, as the SWT and SWOT simulators do not provide the range of disease states described therein. The decision tree was briefly explained during the training session, and trainees could subsequently refer to it throughout the

training session. Finally, given that trainees typically utilize less than 30 s to palpate a single prostate, 10 min was found to be a sufficient time frame for the training.

IV. RESULTS

Nonparametric statistics were used in the data analysis (Statistical Package for the Social Sciences, version 12.0). An overall summary of the results is presented in the following figures and tables. For Tables IV–VIII, the data are presented as follows (no shaded background: p < 0.05; light gray background: p < 0.10; and dark gray background: p > 0.10). Within-training type (Fig. 7 and Table IV) and between-training type (no significant results) performances from pretest to posttest-1 were compared. The Wilcoxon Signed Rank test analyzed paired comparisons of within-training types, and the Mann–Whitney U test was used in between-training types. Note that the two forms of VPES training were combined into a single VPES training group after no significant differences were found under a Mann–Whitney U test between the VPES feedback and no-feedback groups.
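For reference, the same nonparametric tests are available in SciPy; the arrays below are placeholder scores standing in for the study data, which are not reproduced here.

```python
import numpy as np
from scipy.stats import wilcoxon, mannwhitneyu, spearmanr

rng = np.random.default_rng(0)

# Paired within-training-type comparison (pretest vs. posttest-1 scores
# for one group of 12 trainees; placeholder values).
pretest = rng.integers(0, 5, size=12)
posttest1 = pretest + rng.integers(1, 3, size=12)
print(wilcoxon(pretest, posttest1))

# Unpaired between-training-type comparison (posttest-2 scores for two
# independent groups; placeholder values).
group_a = rng.integers(0, 5, size=12)
group_b = rng.integers(2, 6, size=12)
print(mannwhitneyu(group_a, group_b))

# Rank correlation of finger pressure with posttest-2 score (placeholder).
pressure = rng.normal(4.4, 2.4, size=32)
score = rng.integers(0, 5, size=32)
print(spearmanr(pressure, score))
```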


TABLE VI SWT TRAINING TYPE; SIGNIFICANCE FROM PRETEST TO POSTTEST-1

TABLE VII SWOT TRAINING TYPE; SIGNIFICANCE FROM PRETEST TO POSTTEST-1

TABLE VIII VPES TRAINING TYPE; SIGNIFICANCE FROM PRETEST TO POSTTEST-1

Fig. 7. Within-training types, assessment grouped, and performance on pretest and posttest-1.


Fig. 8. Between-training types; performance on posttest-2.

Training type performance on posttest-2 was compared (Fig. 8 and Table V). Training type was further analyzed, per assessment device (Figs. 9–11 and Tables VI–VIII). For use in the discussion, simulator and reported diagnoses are also reported for each training type for the pretest, posttest-1, and posttest-2 (Fig. 12). Finally, a Spearman's rho test was used to correlate finger pressure with performance in posttest-2. From the raw finger pressure data recorded in posttest-2 for each of the training groups, SWT (4.149 N; SD = 2.417), SWOT (4.29 N; SD = 4.251), and VPES (4.72 N; SD = 2.4083), no statistical correlation was found between finger pressure and simulator diagnosis (rho = −0.176, n = 32, p = 0.335, two-tailed) or between finger pressure and correct identifications (rho = −0.099, n = 32, p = 0.588, two-tailed).

V. DISCUSSION

The main findings are as follows: 1) Each type of training improved gross trainee performance from pretest to posttest-1. However, upon closer analysis, several key patterns emerge: 2) Across all types of training, more deficiencies in trainee abilities lie in skill-based rather than rule-based decision making, which only VPES training addresses; 3) the VPES type of training was the only type to transfer both to other simulators and new scenarios; and 4) while the addition of visual feedback may help VPES trainees to find more abnormalities, it may hinder the ability to differentiate abnormalities. Also note that

the magnitude of finger pressure did not affect performance in posttest-2. Given these results, the design requirements of multiple and reconfigurable scenarios of graded difficulty, in conjunction with the basic elements of accurate anatomy, appear to enable the positive outcomes for VPES trainees. The requirement for technique and performance feedback had little impact on the results.

A. More Deficiencies in Skill-Based Palpation Than Rule-Based Linking

We identified a disconnect between the two sequential tasks. While trainees were able to link the abnormalities palpated with a diagnosis [reported diagnosis (task b)], it was much more difficult for them to both palpate and diagnose abnormalities [simulator diagnosis (tasks a and b together)]. Difficulty in skill-based palpation was observed for all training types and in all tests. In each exam (pretest, posttest-1, and posttest-2), the mean for simulator diagnosis is significantly lower than the mean for reported diagnosis, across training types (for each comparison, p < 0.0005; Fig. 12). We believe that this is the first time that this gap has been identified for a palpation, or any medical, exam. The gap between the respective improvements for simulator and reported diagnoses is also observed from pretest to posttest-1 for each training type. However, only the VPES training type significantly improves reported diagnosis (p = 0.002), even while simulator diagnosis improves significantly


Fig. 9. SWT training type; performance on pretest and posttest-1.

Fig. 10. SWOT training type; performance on pretest and posttest-1.


Fig. 11. VPES training type; performance on pretest and posttest-1.

Fig. 12. Simulator and reported diagnoses per training type for the three tests.

over all training types [SWT p < 0.0005, SWOT p = 0.034, and VPES p < 0.0005; Fig. 7 (“simulator diagnosis”) and Table IV]. This may be evidence for the utility of the decision tree and the greater number and variety of practice scenarios. While the aforementioned comparisons group assessment devices, VPES trainees also improved reported diagnosis when assessed by the SWOT (p = 0.017) and VPES (p = 0.019) devices in isolation, although not with the SWT (which indicates that the SWT device confused VPES trainees—perhaps because prostate size and stiffness appear abnormal; see Appendix B, subjective evaluation; Table VIII). Comparing improvements between training types, VPES trainees improved more than

SWOT and SWT trainees, and to a higher overall score, but no significant differences were found between groups. In contrast to VPES trainee performance, SWOT trainees showed negative improvement for reported diagnosis when assessed by the VPES simulator (difference of −0.167 and SD = 1.1934 between means; Fig. 10), and moreover, the SWT trainees showed negative improvement when assessed on their own simulator on the same scenarios they had been presented multiple times (difference between means of −0.4167 and SD = 1.3114; Fig. 9). Comparing posttest-1 results alone (apart from the pretest subtraction), reported diagnosis was higher for VPES trainees compared to SWT trainees (p = 0.078) and


SWOT trainees (p = 0.015), with no difference found between SWT and SWOT trainees (p = 0.548). For posttest-2, however, no significant differences were found for reported diagnosis between the three training groups (Fig. 8 and Table V), although the VPES trainees were the only group to be correct in 100% of their diagnoses (four of four). In summary, VPES trainees appear to learn the “if–then” associations required to link the identified abnormalities to a disease diagnosis better than SWT and SWOT trainees. Greater improvement in reported diagnosis for VPES trainees may result from the decision tree for rule-based decision making, and the greater number and variety of scenarios available, both associated with the design element of multiple and reconfigurable scenarios. The significantly lower performance in skill-based palpation indicates a need for greater emphasis on that aspect of training.

B. Transfer of Training

Transfer of training was evaluated both with respect to other assessment simulators and the newly presented scenarios on the VPES. The main findings are that VPES training transfers to new scenarios and marginally to other simulators, due potentially to the lack of basic elements of accurate anatomy for those simulators. Conversely, the training of SWT trainees transfers to improved performance on other simulators, but not when new scenarios are introduced, which may indicate that SWT training encourages memorization. SWOT training confuses trainees.

1) Transfer of Training to Other Assessment Simulators: We assessed performance from pretest to posttest-1 on the simulators not trained with. SWOT trainees improved from pretest to posttest-1 but only when assessed by their own simulator (for each of simulator diagnosis, % correct identifications, and % false positives, p < 0.05; Table VII). However, when assessed by the other two simulators, SWOT trainees show zero or, worse, negative transfer of training on posttest-1 compared to pretest scores for each dependent variable. Three of the four instances of zero to negative improvement occurred when assessed by the VPES, where mean values were as follows: % correct identifications = −0.10, % false positives = 0.07, and % reported diagnosis = −0.17. This finding is surprising because the size and stiffness of the prostates and tumors in the SWOT and VPES simulators are comparable, and the tumor positions were equivalent in the pretest and posttest-1. Similarly poor performance was observed when assessed with the SWT simulator (simulator diagnosis = −0.5, % correct identifications = 0.0454, and % false positives = −0.06). The main difference between the SWOT simulator and the VPES and SWT simulators is the absence of an exterior torso and rectal wall. The inclusion of these design elements may be essential for transferring skills. The rectal wall may alter fingertip sensitivity [42] and restrict the mobility of the finger, so that techniques learned by SWOT trainees are uncharacteristic of actual practice. VPES trainees improved from pretest to posttest-1 when assessed by the SWT and SWOT devices, with marginal significance (p < 0.1). Assessed by the SWOT device, VPES trainees improved for reported diagnosis (p = 0.033) and simulator diagnosis (p = 0.095) while performing no worse for


% false positives and % correct identifications. Assessed by the SWT device, VPES trainees improved for simulator diagnosis only (p = 0.072). While VPES trainees did improve slightly or remain at the same level for the other metrics, the SWT is limited as an assessment device. Its disease states are not nearly as complex. For instance, the VPES group was trained for BPH and prostatitis disease states and varied sizes of carcinoma, while the SWT group was only trained with BPH and fixed-size carcinomas. The simulator also seems overly simplistic and utilizes inaccurately stiff prostates and tumors, points noted by trainees from each group. Contrary to other training types, SWT trainees improved from pretest to posttest-1 under VPES assessment, although not so clearly via SWOT assessment (Table VI). When assessed by the VPES device, SWT trainees significantly improved from pretest to posttest-1 (simulator diagnosis, % correct identifications, and % false positives; p < 0.05). Assessed by the SWOT, significant improvement was found for % correct identifications (p = 0.037), % false positives (p = 0.068), and reported diagnosis (p = 0.060) but not simulator diagnosis. The ability to transfer from pretest to posttest-1 (given abnormalities previously encountered) but not to new exam scenarios may be attributable to a tendency for SWT training to teach memorization.

2) Transfer of Training to New Exam Scenarios: Posttest-2 assessment was conducted via the VPES device only. Tumors and abnormalities of various sizes were reconfigured into new locations, and balloons were inflated to maximum hardness, as done previously. When training types were compared, the VPES trainees were significantly better than both the SWT and SWOT training groups (Fig. 8). In comparing the VPES and SWT training groups, simulator diagnosis and correct identifications were higher for VPES than for SWT (simulator diagnosis: 3.0, SD = 0.739 versus 1.75, SD = 1.055; correct identifications: 4.667, SD = 1.073 versus 3.25, SD = 0.754), and those means were significantly different (p = 0.003 and 0.002; Table V). Similarly, in comparing the VPES and SWOT training groups, simulator diagnosis, correct identifications, and reported diagnoses were higher for VPES than for SWOT (simulator diagnosis: 3.0, SD = 0.739 versus 1.333, SD = 1.155; correct identifications: 4.667, SD = 1.073 versus 2.583, SD = 1.165; reported diagnosis: 4.0, SD = 0 versus 3.417, SD = 0.996), and those means were different, below the p < 0.01 level (p < 0.0005) or the p < 0.10 level (p = 0.089). Only differences at the p < 0.10 level were found between the SWT and SWOT training groups. We believe that the SWT and SWOT simulators do not utilize a wide enough range of scenarios to train learners. As a result, these trainees may memorize physical landmarks rather than learn underlying palpation skills, thus making it harder for them to find new tumors. This could be why SWT trainees perform well in transferring training from pretest to posttest-1 only. Moreover, the lower number of correct identifications (mentioned earlier) and the high number of unreported abnormalities [VPES (0.583; SD = 0.669), SWOT (1.667; SD = 1.231), and SWT (1.417; SD = 0.996)] demonstrate that SWT and SWOT trainees have more difficulty detecting abnormalities not explicitly presented during training.


3) Summary for Transfer of Training: The design requirement of multiple and reconfigurable scenarios of graded difficulty aids training (it helps negate memorization of abnormality location rather than learning hands-on discrimination) and assessment (posttest-2 was possible only with the VPES). In addition, the design requirement of the basic elements of accurate anatomy inherent in both the VPES and SWT devices aids training (notably, not only the need for a rectal wall but also accurate prostate stiffness and size, and oncologically accurate disease states).

C. Visual Feedback and Performance

The two VPES subgroups (visual feedback and no feedback) were compared before being collapsed into a single training type. Statistical significance was not observed for any metric, due in part to the low number of participants per group (N_feedback = N_no-feedback = 6). Differences in the means are discussed, but clearly, further work must be done to provide adequate support. VPES feedback trainees reported more correct identifications than no-feedback trainees in both posttest-1 (all assessment devices grouped; 3.0, SD = 1.085 compared to 2.722, SD = 1.018) and posttest-2 (4.833, SD = 1.169 compared to 4.5, SD = 1.049). At the same time, however, VPES feedback trainees reported more false positives than no-feedback trainees in posttest-1 (1.667, SD = 1.455 compared to 1.222, SD = 1.06) and posttest-2 (1.333, SD = 0.667 compared to 0.667, SD = 0.816). The data suggest that visual feedback may prompt students to correctly identify a larger number of abnormalities yet hinder the ability to distinguish tumors from inflammation (misclassifications) and healthy tissue from abnormalities (false positives). This in turn may increase willingness to overreport.

D. Finger Pressure

Finger pressure had been monitored initially to ensure that trainees did not employ unreasonable finger pressure given that they were working with a simulator rather than a person. However, this was never an issue because all trainees fell within a similar range. Moreover, the average magnitude of finger pressure employed appeared to have no effect on trainee performance, neither simulator diagnosis nor correct identifications in posttest-2.

VI. CONCLUSION

Performing a DRE requires combining the skill-based ability to locate and correctly identify an abnormality and the rule-based ability to link the abnormality to a diagnosis. However, current training apparatuses do not adequately teach these abilities. Based on an analysis of the DRE, this paper developed design requirements and implemented those requirements in a new simulator—the VPES. Assessment metrics were developed to test the effectiveness of the proposed design requirements. In a study with 36 medical and nurse practitioner students enrolled, the VPES was compared with two other devices for training and assessing clinician ability. The results indicate that each type of training improved the abilities of the novice trainees in general. However, only VPES training improves skill-based decision making and transfers both to other simulators and previously unencountered scenarios. While further studies are needed to determine the validity of visual feedback in palpation tasks, the multiple and reconfigurable scenarios of graded difficulty, in conjunction with accurate anatomy, appear to be important elements of training. We hope that the design elements introduced in this paper will enhance other efforts to improve the training of clinical skills. Trainee improvement in the DRE, specifically, is essential for more consistent rates of detection, greater agreement between examiners, and higher numbers of early stage detections.

APPENDIX A
BACKGROUND ON THE PROSTATE GLAND AND PROSTATE DISEASE STATES

The prostate is a walnut-sized and walnut-shaped gland that produces fluid for semen and lies in front of the rectum. Of the forms of prostate disease, prostatitis is an acute or chronic inflammation of the prostate gland, where the gland may be swollen symmetrically or asymmetrically and typically is tender, firm, or possibly enlarged with a boggy feel. BPH is an enlargement of the prostate without cancerous nodules. Carcinoma of the prostate may present clinically as small and firm isolated nodule(s), or the entire gland may be enlarged and feel hard. All are linked to the VPES simulator in Fig. 3 and to the characteristics of all three simulators in Table II. As each differs in physical characteristics, such as the frequency of occurrence (BPH is about three times as common as carcinoma [51]) and treatment options (surgery versus medication), clinicians must learn to correctly detect and diagnose each type of abnormality in order to provide appropriate and timely management. The DRE and PSA blood test help detect the different forms of prostate disease (prostatitis and BPH) in addition to carcinoma [37].

APPENDIX B
TESTS OF SIMULATOR REALISM

The stiffness of the simulated material and balloons for the VPES was evaluated with material characterization, and a subjective experiment with physicians compared the three simulators.

A. Simulated Prostate Stiffness The stiffness of the silicone elastomer used to simulate prostate tissue can be systematically controlled by varying the percentage of cross-linker (BJB Enterprises TC-5005, part C). As such, compression testing was used to select a formulation appropriate for prostate tissue. The cylindrical sections of silicone elastomer were tested, as their percentage of crosslinker had been varied in five steps from 25% to 125%. Each section’s dimensions were 45-mm height and 41-mm diameter. We used a mechanical testing and simulation device to measure force-displacement characteristics, which were converted to stress–strain plots, and from those, Young’s modulus was


TABLE IX ABNORMALITY HARDNESS (SHORE A DUROMETERS) WITH VARIATION IN BALLOON WATER PRESSURE (IN POUND FORCE PER SQUARE INCH). HARDNESS RANGE, LINEAR FUNCTION, AND R² VALUE ALSO SHOWN. BALLOON NUMBERS 1–4 CORRESPOND TO THE NUMBERING IN FIG. 3(c)

Fig. 13. Averaged results of the subjective evaluation of each of the three simulators by physicians (prostate is abbreviated as “P-” and abnormality as “A-”).

Fig. 14. Averaged results of the subjective comparison of the three simulators to each other.

derived. A strain range of 30% was used to estimate the Young's modulus. The stress–strain relationship was linear over this range, which also corresponds with the magnitude of displacement of the finger in a typical palpation exam. Approximations in the 0%–30% strain range resulted in Young's modulus values between 118 439 Pa for 25% cross-linker and 3493 Pa for 125% cross-linker, with R² values of 0.993 and 0.989, respectively. To inform a more precise selection to match prostate tissue, the percentage of the cross-linker (%C) was related to Young's modulus (E) via the following regression equation:

E = 11.981 × (%C)² − 2941.2 × (%C) + 184 297,    (A.1)

where R² = 0.99. After a series of subjective comparisons, where two experienced urologists informally palpated both the prostates

of the SWOT simulator, and a series of simulated prostates in the 30%–60% cross-linker range, a 42% cross-linker was ultimately selected. Per equation (A.1), this value corresponds with a Young’s modulus of 81 901 Pa. This elastic modulus fits the 1–150 kPa range that is typical of soft tissues in compression. For comparison, typical modulus is given for thin cell gel samples (0.1–1 kPa) [52], liver and kidney organs, skin layers (1–150 kPa), cartilage (400–5000 kPa), and bone (1000–20 000 kPa) [53]–[57]. Although there are no means to conduct compression tests with the SWOT and SWT models to generate an elastic modulus, a durometer was used to quantify the hardness of each. The hardness for the VPES and SWOT devices was less than one (Shore A, durometers) compared to nearly ten for the SWT (hardness scale is 1–100).
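The regression in (A.1) can be checked numerically: evaluating it at 42% cross-linker reproduces the quoted 81 901 Pa, and a simple grid search inverts it to find the cross-linker percentage for a target modulus. This is only a sketch of that arithmetic, not the original analysis script.

```python
import numpy as np

def youngs_modulus(pct_crosslinker):
    """Regression (A.1): Young's modulus E (Pa) as a function of %C."""
    c = np.asarray(pct_crosslinker, dtype=float)
    return 11.981 * c**2 - 2941.2 * c + 184297.0

# Evaluating at the selected 42% cross-linker gives ~81 901 Pa, as reported.
print(round(float(youngs_modulus(42.0))))

# Invert the regression by scanning the fitted 25%-125% range for the
# cross-linker percentage whose predicted modulus is closest to a target.
target_pa = 81901.0
grid = np.linspace(25.0, 125.0, 100001)
best_pct = grid[np.argmin(np.abs(youngs_modulus(grid) - target_pa))]
print(round(float(best_pct), 1))  # ~42.0
```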


B. Simulated Tumor Stiffness As the hardness of a polyethylene balloon is related to its inflated water pressure, we performed hardness tests to determine the appropriate levels to use in the experiments that correspond with actual tumors. The hardness of actual tumors falls between 0 and 60 Shore A durometers [26]. The measurements of the balloons, using a Shore A durometer, revealed that hardness could be set between 0 and 35. The measurements also showed that balloon hardness varies almost linearly with water pressure, with separate equations according to balloon size (Table IX).
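Because Table IX reports a separate linear hardness–pressure function per balloon size, the inflation pressure needed to reach a target hardness can be obtained by inverting the fitted line. The calibration points below are placeholders (the Table IX values are not reproduced here), so the numbers serve only to illustrate the procedure.

```python
import numpy as np

# Placeholder calibration points for one balloon size: water pressure (psi)
# vs. measured hardness (Shore A). Not the actual Table IX data.
pressure_psi = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
hardness_shore_a = np.array([5.0, 12.0, 19.0, 27.0, 34.0])

# Fit hardness = slope * pressure + intercept, as in Table IX.
slope, intercept = np.polyfit(pressure_psi, hardness_shore_a, 1)

def pressure_for_hardness(target_shore_a: float) -> float:
    """Invert the fitted line to get the inflation pressure for a target hardness."""
    return (target_shore_a - intercept) / slope

# Pressure needed to simulate a tumor at ~25 Shore A
# (the study used 25-30 Shore A for tumors).
print(round(pressure_for_hardness(25.0), 2))
```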

C. Physician Subjective Evaluation of Simulator Realism

A small human-subject study utilized physicians to subjectively evaluate each simulator's degree of realism. The study's 12 participants (four males and eight females; mean age: 28) were at the end of their first to fourth years of residency and practiced in emergency or family medicine. All participants had received prior DRE instruction and had performed from 9 to 50 DREs in the clinic, with four having conducted over 50. This set of physicians is a desirable subset because younger clinicians in these particular departments, in addition to nurse practitioners and those in internal medicine, are the first line of cancer detectors and tend to conduct a large number of DREs. In an experiment of approximately 15 min, each physician palpated each of the three simulators, where each simulator was configured with cancerous tumors of equal size. After the end of each exam, the physician evaluated that simulator according to nine attributes, on a five-point Likert scale. The averaged results indicate that the VPES device was evaluated at “realistic” levels superior to the other simulators in nearly all categories (Fig. 13). It was judged much more realistic than the SWT simulator. For some categories, in particular abnormality hardness, the SWOT simulator was slightly preferred. In Fig. 13, the abbreviation “Similarity” refers to a question that asked if the simulator was similar enough to the real exam to provide valuable practice, with a scale ranging from “Definitely Not Similar Enough” to “Definitely Similar Enough.” After each simulator was individually examined, the simulators were comparatively ranked (i.e., for “Entire Simulator,” participants assigned the numbers 1 to 3 to the VPES, SWOT, or SWT). The results indicate that the VPES device received the highest average rankings, with a value of nearly three for several attributes (Fig. 14).

REFERENCES

[1] American Cancer Society, Cancer Facts & Figures 2007, 2007.
[2] C. Mettlin, G. W. Jones, and G. P. Murphy, "Trends in prostate cancer care in the United States, 1974–1990: Observations from the patient care evaluation studies of the American College of Surgeons Commission on Cancer," Cancer J. Clin., vol. 43, pp. 83–91, 1993.
[3] American Cancer Society, Cancer Facts and Figures, 2006.
[4] Malecare, Gleason Score: Two Articles on Understanding Gleason Grade and Score, vol. 2007, 2006.
[5] American Cancer Society, Prostate Cancer, vol. 2007, 2006.
[6] AMA, Screening and Early Detection of Prostate Cancer, vol. 2007, 2005.

[7] R. Etzioni, D. F. Penson, J. M. Legler, D. di Tommaso, R. Boer, P. H. Gann, and E. J. Feuer, "Overdiagnosis due to prostate-specific antigen screening: Lessons from U.S. prostate cancer incidence trends," J. Natl. Cancer Inst., vol. 94, no. 13, pp. 981–990, Jul. 2002.
[8] S. Weinmann, K. E. Richert-Boe, S. K. Van Den Eeden, S. M. Enger, B. A. Rybicki, J. A. Shapiro, and N. S. Weiss, "Screening by prostate-specific antigen and digital rectal examination in relation to prostate cancer mortality: A case-control study," Epidemiology, vol. 16, no. 3, pp. 367–376, May 2005.
[9] G. F. Carvalhal, D. S. Smith, D. E. Mager, C. Ramos, and W. J. Catalona, "Digital rectal examination for detecting prostate cancer at prostate specific antigen levels of 4 ng./ml. or less," J. Urol., vol. 161, no. 3, pp. 835–839, Mar. 1999.
[10] G. D. Grossfeld and P. R. Carroll, "Prostate cancer early detection: A clinical perspective," Epidemiol. Rev., vol. 23, no. 1, pp. 173–180, 2001.
[11] D. S. Smith and W. J. Catalona, "Interexaminer variability of digital rectal examination in detecting prostate cancer," Urology, vol. 45, no. 1, pp. 70–74, Jan. 1995.
[12] C. G. Roehrborn, S. Sech, J. Montoya, T. Rhodes, and C. J. Girman, "Interexaminer reliability and validity of a three-dimensional model to assess prostate volume by digital rectal examination," Urology, vol. 57, no. 6, pp. 1087–1092, Jun. 2001.
[13] T. W. Hennigan, P. J. Franks, D. B. Hocken, and T. G. Allenmersh, "Rectal examination in general practice," Brit. Med. J., vol. 301, no. 6750, pp. 478–480, Sep. 1990.
[14] P. S. Bunting, "Screening for prostate cancer with prostate-specific antigen: Beware the biases," Clin. Chim. Acta, vol. 315, no. 1/2, pp. 71–97, Jan. 2002.
[15] J. B. Cooper and V. R. Taqueti, "A brief history of the development of mannequin simulators for clinical education and training," Qual. Saf. Health Care, vol. 13, no. 1, pp. 11–18, Oct. 2004.
[16] D. M. Gaba, S. K. Howard, K. J. Fish, B. E. Smith, and Y. A. Sowb, "Simulation-based training in anesthesia crisis resource management (ACRM): A decade of experience," Simul. Gaming, vol. 32, no. 2, pp. 175–193, Jun. 2001.
[17] G. J. Gerling, A. M. Weissman, G. W. Thomas, and E. L. Dove, "Effectiveness of a dynamic breast examination training model to improve clinical breast examination (CBE) skills," Cancer Detec. Prev., vol. 27, no. 6, pp. 451–456, 2003.
[18] G. J. Gerling and G. W. Thomas, "Augmented, pulsating tactile feedback facilitates simulator training of clinical breast examinations," Hum. Factors, vol. 47, no. 3, pp. 670–681, 2005.
[19] C. M. Pugh, W. L. Heinrichs, P. Dev, S. Srivastava, and T. M. Krummel, "Use of a mechanical simulator to assess pelvic examination skills," J. Amer. Med. Assoc., vol. 286, no. 9, pp. 1021–1023, Sep. 2001.
[20] P. H. Cosman, P. C. Cregan, C. J. Martin, and J. A. Cartmill, "Virtual reality simulators: Current status in acquisition and assessment of surgical skills," ANZ J. Surg., vol. 72, no. 1, pp. 30–34, Jan. 2002.
[21] R. T. Shimotsu and C. G. L. Cao, "The effect of color-contrasting shadows on a dynamic 3-D laparoscopic surgical task," IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 37, no. 6, pp. 1047–1053, Nov. 2007.
[22] G. Garrison, D. Matthew, and R. F. Jones, Analysis in Brief: Future Medical School Applicants, Part I: Overall Trends, vol. 7. Washington, DC: Assoc. Amer. Med. Colleges, 2007.
[23] S. E. Bennett, R. S. Lawrence, D. F. Angiolillo, S. D. Bennett, S. Budman, G. M. Schneider, A. R. Assaf, and M. Feldstein, "Effectiveness of methods used to teach breast self-examination," Amer. J. Prev. Med., vol. 6, pp. 208–217, Jul./Aug. 1990.
[24] V. A. Clarke and S. A. Savage, "Breast self-examination training: A brief review," Cancer Nurs., vol. 22, no. 4, pp. 320–326, Aug. 1999.
[25] D. C. Hall, C. K. Adams, G. H. Stein, H. S. Stephenson, M. K. Goldstein, and H. S. Pennypacker, "Improved detection of human breast lesions following experimental training," Cancer, vol. 46, no. 2, pp. 408–414, Jul. 1980.
[26] H. S. Bloom, E. L. Criswell, H. S. Pennypacker, A. C. Catania, and C. K. Adams, "Major stimulus dimensions determining detection of simulated breast lesions," Percept. Psychophys., vol. 32, no. 3, pp. 251–260, Sep. 1982.
[27] H. S. Campbell, M. McBean, H. Mandin, and H. Bryant, "Teaching medical students how to perform a clinical breast examination," Acad. Med., vol. 69, no. 12, pp. 993–995, Dec. 1994.
[28] C. Popadiuk, M. Pottle, and V. Curran, "Teaching digital rectal examinations to medical students: An evaluation study of teaching methods," Acad. Med., vol. 77, no. 11, pp. 1140–1146, Nov. 2002.
[29] C. Pilgrim, C. Lannon, R. P. Harris, W. Cogburn, and S. W. Fletcher, "Improving clinical breast examination training in a medical school: A randomized controlled trial," J. Gen. Intern. Med., vol. 8, no. 12, pp. 685–688, Dec. 1993.

[30] A. K. Madan, S. Aliabadi-Wahle, A. M. Babbo, M. Posner, and D. J. Beech, "Education of medical students in clinical breast examination during surgical clerkship," Amer. J. Surg., vol. 184, no. 6, pp. 637–640, Dec. 2002.
[31] S. Aliabadi-Wahle, M. Ebersole, E. U. Choe, and D. J. Beech, "Training in clinical breast examination as part of a general surgery core curriculum," J. Cancer Educ., vol. 15, no. 1, pp. 10–13, 2000.
[32] H. Pennypacker, L. Naylor, A. A. Sander, and M. K. Goldstein, "Why can't we do better breast examinations?" Nurse Pract. Forum, vol. 10, no. 3, pp. 122–128, Sep. 1999.
[33] S. A. Wall and S. Brewster, "Sensory substitution using tactile pin arrays: Human factors, technology and applications," Signal Process., vol. 86, no. 12, pp. 3674–3695, Dec. 2006.
[34] G. Burdea, G. Patounakis, V. Popescu, and R. E. Weiss, "Virtual reality-based training for the diagnosis of prostate cancer," IEEE Trans. Biomed. Eng., vol. 46, no. 10, pp. 1253–1260, Oct. 1999.
[35] N. S. Raja, J. A. Schleser, W. P. Norman, C. D. Myzie, G. J. Gerling, and M. L. Martin, "Simulation framework for training chest tube insertion using virtual reality and force feedback," in Proc. IEEE Syst., Man, Cybern., Montreal, QC, Canada, 2007, pp. 2261–2266.
[36] L. S. Bickley and P. G. Szilagyi, Bates' Guide to Physical Examination and History Taking, 9th ed. Philadelphia, PA: Williams & Wilkins, 2007.
[37] C. Jarvis, Physical Examination & Health Assessment, 4th ed. St. Louis, MO: Saunders, 2004.
[38] H. M. Seidel, Mosby's Guide to Physical Examination, 5th ed. St. Louis, MO: Mosby, 2003.
[39] K. R. Hammond, R. M. Hamm, J. Grassia, and T. Person, "Direct comparison of the efficacy of intuitive and analytical cognition in expert judgement," IEEE Trans. Syst., Man, Cybern., vol. 17, no. 5, pp. 753–770, Sep./Oct. 1987.
[40] J. Rasmussen, "Deciding and doing: Decision making in natural contexts," in Decision Making in Action: Models and Methods, G. A. Klein, J. Orasanu, R. Calderwood, and C. E. Zsambok, Eds. Norwood, NJ: Ablex, 1993, pp. 158–171.
[41] L. Wagemann, "Analysis of the initial representations of the human automation interactions (HAI)," Trav. Hum., vol. 61, pp. 129–151, 1998.
[42] G. O. Gibson and J. C. Craig, "Tactile spatial sensitivity and anisotropy," Percept. Psychophys., vol. 67, no. 6, pp. 1061–1079, Aug. 2005.
[43] S. M. Alessi and S. R. Trollip, Multimedia for Learning: Methods and Development, 3rd ed. Boston, MA: Allyn & Bacon, 2001.
[44] J. Gaffan, J. Dacre, and A. Jones, "Educating undergraduate medical students about oncology: A literature review," J. Clin. Oncol., vol. 24, no. 12, pp. 1932–1939, Apr. 2006.
[45] J. Chalabian and G. Dunnington, "Do our current assessments assure competency in clinical breast evaluation skills?" Amer. J. Surg., vol. 176, no. 6, pp. 497–502, Jun. 1998.
[46] H. MacRae, G. Regehr, W. Leadbetter, and R. K. Reznick, "A comprehensive examination for senior surgical residents," Amer. J. Surg., vol. 179, no. 3, pp. 190–193, Mar. 2000.
[47] H. S. Pennypacker and C. A. Pilgrim, "Achieving competence in clinical breast examination," Nurse Pract. Forum, vol. 4, no. 2, pp. 85–90, Jun. 1993.
[48] E. Ozden, A. T. Turgut, H. Talas, O. Yaman, and O. Gogus, "Effect of dimensions and volume of the prostate on cancer detection rate of 12 core prostate biopsy," Int. Urol. Nephrol., vol. 39, no. 2, pp. 525–529, Jun. 2007.
[49] T. T. Marchie and V. C. Onuora, "Determination of normal range of ultrasonic sizes of prostate in our local environment," West Afr. J. Radiol., vol. 8, no. 1, pp. 54–64, 2001.
[50] C. D. Wickens, J. D. Lee, Y. Liu, and S. E. Gordon Becker, An Introduction to Human Factors Engineering, 2nd ed. Upper Saddle River, NJ: Pearson Educ., 2004.
[51] M. C. Hall, C. G. Roehrborn, and J. D. McConnell, "Is screening for prostate cancer necessary in men with symptoms of benign prostatic hyperplasia?" Semin. Urol. Oncol., vol. 14, no. 3, pp. 122–133, Aug. 1996.
[52] R. E. Mahaffy, C. K. Shih, F. C. MacKintosh, and J. Käs, "Scanning probe-based frequency-dependent microrheology of polymer gels and biological cells," Phys. Rev. Lett., vol. 85, no. 4, pp. 880–883, Jul. 2000.
[53] H. Yamada, Strength of Biological Materials. Baltimore, MD: Williams & Wilkins, 1970.
[54] F. J. Carter, T. G. Frank, P. J. Davies, D. McLean, and A. Cuschieri, "Measurements and modelling of the compliance of human and porcine organs," Med. Image Anal., vol. 5, no. 4, pp. 231–236, Dec. 2001.

[55] J.-M. Schwartz, M. Denninger, D. Rancourt, C. Moisan, and D. Laurendeau, "Modelling liver tissue properties using a non-linear visco-elastic model for surgery simulation," Med. Image Anal., vol. 9, no. 2, pp. 103–112, Apr. 2005.
[56] A. Nava, E. Mazza, F. Kleinermann, N. J. Avis, and J. McClure, Determination of the Mechanical Properties of Soft Tissues Through Aspiration Experiments, vol. 2878. Berlin, Germany: Springer-Verlag, 2003, pp. 222–229.
[57] F. A. Duck, Physical Properties of Tissue. San Diego, CA: Academic, 1990.

Gregory J. Gerling (S'03–M'05) received the Ph.D. degree from the Department of Mechanical and Industrial Engineering, The University of Iowa, Iowa City. He has been with the Department of Systems and Information Engineering, University of Virginia, Charlottesville, since the fall of 2005. Before returning to graduate school, he gained industry experience in software engineering at Motorola, NASA Ames Research Center, and Rockwell Collins. His major research interests include human factors/ergonomics, computational neuroscience, haptics, biomechanics, and human–computer interaction. In application, his research seeks to advance neural prosthetics/robotics, aid people whose sense of touch is deteriorating, and improve human–robot interfaces, particularly in medicine.

Sarah Rigsbee is from Chesterfield, VA. She received the B.S. degree in electrical engineering from Virginia Commonwealth University, Richmond, in 2005. She is currently working toward the M.S. degree in the Department of Systems and Information Engineering, University of Virginia, Charlottesville, while working full-time at the Applied Physics Laboratory, Johns Hopkins University, Laurel, MD.

Reba Moyer Childress received the B.S. and M.S. degrees in nursing from the School of Nursing, University of Virginia, Charlottesville, in 1979 and 1992, respectively, and completed the Family Nurse Practitioner certificate program there in 1991. She is an Assistant Professor of nursing with the School of Nursing, University of Virginia, where she is the Director of the Clinical Simulation Learning Center.

Marcus L. Martin was born in Covington, VA. He received the B.S. degree in pulp and paper technology and the B.S. degree in chemical engineering from North Carolina State University, Raleigh, in 1970 and 1971, respectively, and the M.D. degree from Eastern Virginia Medical School, Norfolk, in 1976, where he was a member of the charter class and the first African American graduate. He completed his emergency medicine residency training at the University of Cincinnati, Cincinnati, OH, in 1981. He is currently with the University of Virginia (U.Va.), Charlottesville, where he is a Professor and the immediate past Chair of the Department of Emergency Medicine, the Assistant Dean of the School of Medicine, and the Associate Vice President for Diversity and Equity. He established the Emergency Medicine Center for Education, Research and Technology and the Life Saving Technique course for medical students at U.Va. using computerized human patient simulation. He is working on several research projects with systems engineers to develop computerized patient simulators.
