MEASUREMENT IN PHYSICAL EDUCATION AND EXERCISE SCIENCE, 4(3), 157–173 Copyright © 2000, Lawrence Erlbaum Associates, Inc.
Reliability and Validity Evidence for the Testwell: Wellness Inventory—High School Edition (TWI[HS])

Judy L. Stewart
Department of Health and Physical Education
Motlow College

David A. Rowe
Scottish School of Sports Studies
University of Strathclyde

Richard E. LaLance
Department of Health, Physical Education, Recreation, and Safety
Middle Tennessee State University

This study was designed to obtain reliability and validity evidence for the Testwell: Wellness Inventory—High School Edition (TWI[HS]), a 100-item inventory divided into 10 subscales of 10 items. Participants for this research were 437 9th- and 10th-grade students attending 5 Tennessee public high schools, who were either enrolled in Lifetime Wellness Curriculum classes or had not yet taken the class. Four research questions were posed for this study, with regard to 9th- and 10th-grade boys and girls: (a) What is the internal consistency reliability of the TWI(HS)? (b) What is the 12-week test–retest reliability of the TWI(HS)? (c) Is the internal structure of the TWI(HS) comprised of 10 domains, or factors, as hypothesized by its authors? and (d) Does the TWI(HS) measure changes in wellness knowledge, attitudes, and behaviors? Conclusions from the research were as follows: (a) Internal consistency reliability of the TWI(HS) subscales is lower than would be expected for subscales with so many items, (b) test–retest reliability of the TWI(HS) subscales ranges from low to acceptable, (c) the 10-factor theoretical structure of the TWI(HS) is not empirically supportable, and (d) the TWI(HS) does not detect many changes in wellness following a structured 12-week Lifetime Wellness program.

Key words: wellness, lifetime wellness, assessment, questionnaire, validity, reliability

Requests for reprints should be sent to David A. Rowe, Scottish School of Sports Studies, University of Strathclyde, Jordanhill Campus, Glasgow G13 1PP, Scotland. E-mail:
[email protected]
STEWART, ROWE, LALANCE
Many states across the United States have made changes in high school health and physical education requirements during recent years. A common change is that previously separate courses in health and physical education have been combined and revised into new curricular offerings under the heading of “wellness.” These changes seem to be consistent with the beliefs of those involved in the wellness movement. Wellness advocates profess that many lifestyle-related illnesses (e.g., cardiovascular disease, obesity-related illnesses, cancer, stroke) and deaths resulting from these illnesses can be prevented through wellness education programs, which help individuals to learn to make healthy lifestyle choices and changes (Hatfield & Hatfield, 1992; Omizo, Omizo, & D’Andrea, 1992; Robbins, 1994; Robbins, Powers, & Rushton, 1992; Rosenstein, 1989). Although some studies have been conducted to investigate changes in knowledge, attitudes, and behaviors following wellness instruction at the college level (McClanahan, 1990; Murray, 1996; Palombi, 1987), there is a lack of information at the high school level. The availability of valid and reliable instruments with which to measure wellness knowledge, attitudes, and behaviors of high school students is extremely limited and this may be why few studies have been conducted to determine the effect of wellness instruction on the knowledge, attitudes, and behaviors of adolescents or young adults. The state of Tennessee is among those who have made changes in the high school health and physical education requirement for high school graduation. At the beginning of the 1994–1995 school year a new course entitled “Lifetime Wellness” was introduced (Lifetime Wellness Curriculum Framework, 1994). The Lifetime Wellness curriculum was based on the national health goals as listed in Healthy People 2000 (U.S. 
Department of Health and Human Services, Public Health Service, 1992) and focused on the following seven aspects of wellness: (a) personal fitness and related skills, (b) mental health, (c) disease prevention and control, (d) safety and first aid, (e) sexuality and family life, (f) substance use and abuse, and (g) nutrition (Lifetime Wellness Curriculum Framework, 1994). Since the implementation of the Lifetime Wellness curriculum, no study has been conducted to determine if students in Tennessee are developing knowledge, attitudes, and behaviors related to personal fitness and health for a lifetime of wellness. At present, no reliability and validity evidence exists for the only widely available instrument that might be used in such research at the high school level. In view of this, our study was designed to obtain two types of reliability evidence and two types of validity evidence for this instrument, the Testwell: Wellness Inventory—High School Edition (TWI[HS]; National Wellness Institute, 1994), and to determine if it is an appropriate instrument for measuring wellness knowledge, attitudes, and behaviors of high school students. The following four research questions were posed in this study to investigate the reliability and validity of the TWI(HS):
WELLNESS INVENTORY
1. What is the internal consistency reliability of the TWI(HS) in 9th- and 10th-grade students?
2. What is the 12-week test–retest reliability of the TWI(HS) in 9th- and 10th-grade students?
3. Is the internal structure of wellness attitudes comprised of 10 domains, or factors, as measured by the TWI(HS)?
4. Do scores on the TWI(HS) measure changes in wellness knowledge, attitudes, and behaviors?
METHOD Participants Participants for this study were 9th- and 10th-grade students attending five public high schools in Tennessee. These particular schools were selected because they used a block class-scheduling plan. Block scheduling is a method of scheduling classes to meet for 90 min each full school day per semester, resulting in four class periods each day. A school year consists of two semesters. Each participant completed the Lifetime Wellness program (Lifetime Wellness Curriculum Framework, 1994) in one semester. The Lifetime Wellness Curriculum classes used in this study were taught in their entirety during the spring, or second semester of the school year, which enabled the researchers to collect data over a single semester. A total of 674 students were enrolled in the selected classes. Of these, 463 students (69%) returned parental permission forms. Due to absenteeism on the first scheduled testing day, 437 of the 463 students with parental permission participated in the study. Of these, 327 participants were 9th- and 10th-grade students enrolled in the Lifetime Wellness classes and 110 were 9th-grade students who had not yet received the Lifetime Wellness curriculum. Of the students in the Lifetime Wellness classes, 152 were scheduled to be administered the wellness inventory only once, at the beginning of the semester, and 175 students were scheduled to be tested twice, before and after the Lifetime Wellness course. However, only 127 of these were posttested due to absenteeism on the scheduled posttesting day (the unusually high absentee rate was due mostly to students attending a school trip and a funeral). Participants were minors; therefore parental permission was obtained before testing. All procedures were approved by the Institutional Review Board of Middle Tennessee State University. Testing Instrument The TWI(HS) was the instrument investigated in this study (National Wellness Institute, 1994). 
The TWI(HS) is a 100-item inventory divided into 10 subscales of 10
items each. The National Wellness Institute explains wellness as comprising six dimensions, or domains (physical, emotional, social, intellectual, occupational, and spiritual). Using the six theoretical dimensions, the test developers of the TWI(HS) subdivided three domains (physical, social, and emotional) into subcategories for the questionnaire. The subscales Physical Fitness and Nutrition, Self-Care, and Safety and Lifestyle were considered to belong to the physical dimension. The subscales Environmental Wellness and Social Awareness were considered to be subcategories of the social dimension. Under the emotional dimension, the authors of the inventory placed the subscales Emotional Awareness and Sexuality and Emotional Management. Three dimensions (intellectual, occupational, and spiritual wellness) were not subdivided. The 10 subscales on the TWI(HS) are therefore (a) Physical Fitness and Nutrition, (b) Self-Care, (c) Safety and Lifestyle, (d) Environmental Wellness, (e) Social Awareness, (f) Emotional Awareness and Sexuality, (g) Emotional Management, (h) Intellectual Wellness, (i) Occupational Wellness, and (j) Spirituality and Values. Each item on the test is a statement to which the participant responds using a 5-point Likert scale ranging from 1 (almost never) to 5 (almost always). Subscale totals can range from a minimum of 10 (indicating the lowest level of wellness) to 50 (indicating the highest level of wellness). Instructions for using the TWI(HS) indicate that subscale totals and total score from the whole questionnaire may be used (National Wellness Institute, 1994). Total scores for the questionnaire thus may range from 100 to 500. In this study, a separate answer sheet was provided for the students to respond to the questionnaire items.
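The scoring rule described above can be illustrated with a short sketch. It is not taken from the TWI(HS) manual: the function name `score_twi_hs`, the subscale abbreviations, and the assumption that each subscale occupies ten consecutive items (Items 1–10, 11–20, and so on, as the item listing in Table 3 suggests) are illustrative. The sketch rejects incomplete answer sheets because the handling of omitted items is not specified here.

```python
from typing import Dict, List

# Subscale abbreviations in questionnaire order. ASSUMPTION: each subscale
# occupies ten consecutive items, as the item listing in Table 3 suggests.
SUBSCALES = ["PFN", "SC", "SL", "EW", "SA", "EAS", "EM", "IW", "OW", "SV"]
ITEMS_PER_SUBSCALE = 10


def score_twi_hs(responses: List[int]) -> Dict[str, int]:
    """Return the ten subscale totals and the total score for one respondent.

    `responses` holds the 100 item responses in item order, each coded
    1 (almost never) to 5 (almost always).
    """
    if len(responses) != len(SUBSCALES) * ITEMS_PER_SUBSCALE:
        raise ValueError("expected 100 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    totals: Dict[str, int] = {}
    for index, name in enumerate(SUBSCALES):
        start = index * ITEMS_PER_SUBSCALE
        # Each subscale total is the sum of its ten items (range 10 to 50).
        totals[name] = sum(responses[start:start + ITEMS_PER_SUBSCALE])
    # The total score is the sum of all 100 items (range 100 to 500).
    totals["TOTAL"] = sum(responses)
    return totals
```

A respondent who answers 1 to every item receives the minimum total of 100, and one who answers 5 to every item receives the maximum of 500, matching the ranges stated above.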
Statistical Analysis Pre- and posttest data were collected over a 12-week interval at the beginning and end of a semester, during which some of the participants received instruction in Lifetime Wellness (Lifetime Wellness Curriculum Framework, 1994). Administration of both the pretest and posttest was done by the primary author during regularly scheduled class times. Standardized verbal directions were provided by the primary author before each questionnaire administration. Statistical analysis for this study was conducted using the SPSS statistical analysis program. Each research question was analyzed as follows: 1. Internal consistency reliability was determined using a two-way (Participants × Time) analysis of variance (ANOVA) model, excluding systematic error between test administrations (i.e., mean differences) from the error estimate (Baumgartner & Jackson, 1999). Intraclass correlations were calculated for each subscale. Data for this analysis were the pretest scores of all participants (N = 437).
2. A one-way ANOVA model was used to calculate the intraclass correlation coefficient for 12-week test–retest reliability, because a mean difference between pre- and posttest scores would constitute measurement error. The one-way ANOVA model includes mean differences in the error estimate. Data for this analysis were the pre- and posttest scores of those participants who had not had exposure to the Lifetime Wellness curriculum (n = 110). 3. Exploratory factor analysis was used to determine if the TWI(HS) comprised 10 domains or subscales as suggested by the authors of the questionnaire. Confirmatory factor analysis was not used for the following three reasons: (a) No previous evidence supporting this structure has been provided by the questionnaire developers or other researchers, (b) the theoretical rationale for the structure of the questionnaire was not well-documented by the questionnaire developers, and (c) no thorough documentation of the item-development procedure was provided by the questionnaire developers. Data for this analysis were the pretest scores of all participants (N = 437). 4. Changes in wellness scores over time were analyzed by repeated measures t tests for each subscale and for the total questionnaire. Data for this analysis were the pre- and posttest scores of the participants who were in the Lifetime Wellness Curriculum classes (n = 127).
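As an illustration of the difference between the two ANOVA models in Points 1 and 2, the following sketch computes both intraclass coefficients from a participants-by-trials score table. It is a minimal sketch, not the SPSS procedure used in the study. The function names are hypothetical, and the formulas are the standard mean-square forms, R = (MS_people − MS_error) / MS_people, which estimate the reliability of the mean of the k trials. The Spearman–Brown function shows the usual adjustment of such a coefficient to a different number of trials.

```python
from typing import Sequence, Tuple


def _mean_squares(scores: Sequence[Sequence[float]]) -> Tuple[float, float, float]:
    """Return (MS between people, MS within people, MS residual).

    `scores` has one row per participant and one column per trial.
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of trials (administrations or items)
    if n < 2 or k < 2:
        raise ValueError("need at least two participants and two trials")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_people = k * sum((m - grand) ** 2 for m in row_means)
    ss_trials = n * sum((m - grand) ** 2 for m in col_means)
    ss_within = ss_total - ss_people      # trial effect plus residual
    ss_resid = ss_within - ss_trials      # residual (interaction) only

    ms_people = ss_people / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    ms_resid = ss_resid / ((n - 1) * (k - 1))
    return ms_people, ms_within, ms_resid


def icc_one_way(scores: Sequence[Sequence[float]]) -> float:
    """One-way model: mean differences between trials count as error."""
    ms_people, ms_within, _ = _mean_squares(scores)
    return (ms_people - ms_within) / ms_people


def icc_two_way(scores: Sequence[Sequence[float]]) -> float:
    """Two-way model: mean differences between trials are excluded from error."""
    ms_people, _, ms_resid = _mean_squares(scores)
    return (ms_people - ms_resid) / ms_people


def spearman_brown(r: float, factor: float) -> float:
    """Estimate reliability when test length changes by `factor`."""
    return factor * r / (1 + (factor - 1) * r)
```

If every participant scores exactly 2 points higher on the second administration, `icc_two_way` returns 1.0 because the shift is removed from the error term, whereas `icc_one_way` is lower because the shift counts as error. When the columns are items rather than administrations, the two-way coefficient is algebraically equal to Cronbach's alpha. Calling `spearman_brown(0.82, 0.5)` steps a two-administration coefficient of .82 down to roughly .69 for a single administration.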
RESULTS

Internal Consistency Reliability

Cronbach’s alphas of .67 to .89 were obtained for the 10 subscales. These results are presented with subscale means and standard deviations in Table 1. A criterion of α = .70 was chosen throughout this study to indicate a minimally acceptable level of reliability, based on the following recommendation of Nunnally (1982): “in basic research a good working rule is that the reliability coefficient should be at least .70” (p. 1600). Because internal consistency is usually expected to be higher than most other types of reliability, and because of the large number of items on each subscale, it was expected that the obtained internal consistency reliability coefficients for the TWI(HS) would exceed this minimal standard. Coefficients of .74, .76, and .79 were obtained for the Emotional Awareness and Sexuality, Physical Fitness and Nutrition, and Environmental Wellness subscales, respectively. Six other subscales had coefficients ranging from .81 to .89: Safety and Lifestyle, Social Awareness, Emotional Management, Spirituality and Values, Intellectual Wellness, and Occupational Wellness. One subscale (Self-Care) failed to meet the criterion, with an unacceptably low coefficient of .67. In light of the expectation that the subscales would comfortably exceed the .70 criterion, it is more appropriate to state that the internal consistency reliability for the Emotional Awareness and Sexuality, Physical Fitness and Nutrition, Environmental Wellness, and Self-Care subscales is either marginally acceptable or too low, considering the large number of items in each subscale.

TABLE 1
Alphas, Means, Standard Deviations, and Valid n for Subscales

Subscale                             Alpha     Mean      SD     n
Physical Fitness and Nutrition        0.76    30.74    7.99   426
Self-Care                             0.67    28.22    7.41   422
Safety and Lifestyle                  0.81    37.29    8.69   425
Environmental Wellness                0.79    33.04    7.87   427
Social Awareness                      0.84    34.07    7.84   427
Emotional Awareness and Sexuality     0.74    40.27    6.73   418
Emotional Management                  0.84    39.26    7.23   432
Intellectual Wellness                 0.89    33.61    9.14   431
Occupational Wellness                 0.89    40.08    7.68   432
Spirituality and Values               0.85    39.20    7.64   431
Whole Questionnaire                          357.06   58.25   361

Test–Retest Reliability

The data used for this analysis were the pre- and posttest scores of participants who had not been exposed to the Lifetime Wellness curriculum (n = 110). Results of a one-way repeated measures ANOVA indicated that there was no significant mean increase over 12 weeks in any of the subscales except Self-Care, which increased significantly (p < .05) from the pretest to the posttest. This mean difference was probably not clinically meaningful. An intraclass correlation coefficient was calculated using a one-way ANOVA model for each subscale, to include systematic error (mean changes) in the error component. Mean pre- and posttest scores and intraclass correlation coefficients are presented in Table 2. The Spearman–Brown prophecy formula was used to adjust the intraclass correlation coefficients to estimate reliability for a single trial (Baumgartner, 1968). When adjusted for a single trial, reliability for 4 of the 10 subscales fell below .70. These four subscales were Self-Care, Emotional Awareness and Sexuality, Intellectual Wellness, and Occupational Wellness. Reliability coefficients for the six subscales at or above the .70 level ranged from .70 to .82 (see Table 2). All coefficients are low considering that each subscale contains 10 items. Longer tests usually have higher reliability because they provide a more adequate sampling of the measured behavior (Gronlund, 1967). Mean scores therefore appear to be stable over time, showing no significant mean change from test to retest with the exception of one subscale (Self-Care). Test–retest
TABLE 2
Means, Standard Deviations, Difference Scores, Intraclass Correlations (ICC), and Intraclass Correlations Adjusted (ICCadj) for Participants Not in Wellness Classes (n = 110)

                                       Pretest           Posttest        Mean Difference
Subscale                              M       SD        M       SD     (Posttest–Pretest)    ICC    ICCadj
Physical Fitness and Nutrition      32.77    6.84     32.78    7.55          0.01            0.82    0.70
Self-Care                           29.32    7.17     30.82    7.79          1.51*           0.76    0.62
Safety and Lifestyle                38.96    8.08     39.33    7.92          0.38            0.90    0.81
Environmental Wellness              35.08    7.47     35.60    7.84          0.52            0.90    0.81
Social Awareness                    35.55    8.11     35.36    7.70         –0.18            0.90    0.82
Emotional Awareness and Sexuality   40.99    6.21     41.51    6.24          0.52            0.81    0.68
Emotional Management                40.68    7.08     40.47    6.73          0.21            0.87    0.77
Intellectual Wellness               34.13    8.58     35.40   10.98          1.27            0.77    0.63
Occupational Wellness               41.39    7.10     41.25    7.67         –0.13            0.81    0.68
Spirituality and Values             40.87    7.13     41.32    7.94          0.45            0.88    0.79
Total Questionnaire                366.97   52.34    371.65   51.11          4.68            0.93    0.87

*p < .05.
reliability adjusted for a single administration was either marginally acceptable or too low. These marginally acceptable and unacceptably low reliability coefficients indicate that, although mean scores may be stable, individual scores are not. This means that individual scores changed in different directions (i.e., some individual scores increased and some decreased). This Participant × Time interaction could be problematic if the TWI(HS) were used in repeated measures ANOVA designs in which changes over time are to be detected. The Participant × Time interaction indicated here by the reliability coefficients would be included in the error variance component (MSRESIDUAL or MSINTERACTION) of the omnibus F test of significance (Keppel, 1991), therefore decreasing the statistical power of such tests. Exploratory Factor Analysis of the TWI(HS) The 10-factor model, as proposed by the test developers of the TWI(HS), was tested with an exploratory factor analysis. Even though the National Wellness Institute considers wellness to be composed of six theoretical dimensions or domains of wellness, the TWI(HS) is composed of 10 wellness subscales for testing wellness knowledge, attitudes, and behaviors. The purpose of this statistical analysis was twofold: (a) to determine if there are 10 dimensions, or factors, of wellness, and (b) to determine if the 10 items of each subscale measure the same construct. If a simple factor structure were to exist, the rotated factor loading matrix would meet the following three criteria: (a) the 10 items of each proposed subscale would all load on the same factor, (b) none of the 10 items on a subscale would load on any other factor, and (c) no items from any other subscale would load on that factor. The principal-axis factoring extraction method in the SPSS factor analysis program was used to force a 10-factor structure on the data, and the varimax rotation method was used to rotate the factors. 
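The three criteria can be expressed as a simple check on a rotated loading matrix. The sketch below is illustrative only: `check_simple_structure` is a hypothetical helper, not part of SPSS, and it assumes that a loading counts as salient when its absolute value is at least .30, in line with the cutoff used for reporting loadings in this study.

```python
from typing import Dict, List, Optional, Sequence, Set


def check_simple_structure(
    loadings: Sequence[Sequence[float]],
    subscale_of_item: Sequence[str],
    threshold: float = 0.30,  # ASSUMPTION: salience cutoff
) -> Dict[str, Dict[str, object]]:
    """Check Criteria (a), (b), and (c) for every proposed subscale.

    `loadings[i][f]` is the rotated loading of item i on factor f, and
    `subscale_of_item[i]` is the subscale to which item i is assigned.
    """
    # For each item, the set of factors on which it loads saliently.
    salient: List[Set[int]] = [
        {f for f, value in enumerate(row) if abs(value) >= threshold}
        for row in loadings
    ]

    report: Dict[str, Dict[str, object]] = {}
    for name in sorted(set(subscale_of_item)):
        own = [i for i, s in enumerate(subscale_of_item) if s == name]
        others = [i for i, s in enumerate(subscale_of_item) if s != name]

        # Criterion (a): one factor on which every item of the subscale loads.
        shared = set.intersection(*(salient[i] for i in own))
        factor: Optional[int] = min(shared) if shared else None
        crit_a = factor is not None
        # Criterion (b): the subscale's items load on no other factor.
        crit_b = crit_a and all(salient[i] == {factor} for i in own)
        # Criterion (c): no item from another subscale loads on that factor.
        crit_c = crit_a and not any(factor in salient[i] for i in others)

        report[name] = {"factor": factor, "a": crit_a, "b": crit_b, "c": crit_c}
    return report
```

A subscale shows simple structure only when all three flags are true. If the items of a subscale share more than one salient factor, the sketch reports the lowest-numbered one, and Criterion (b) then fails by construction.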
Factor loadings above .3 from this analysis are presented in Table 3. Simple structure was not found, thus evidence was not obtained to support the 10-factor model for the TWI(HS). Only three of the 10 subscales (Intellectual Wellness, Occupational Wellness, and Spirituality and Values) showed a fairly clear factor-loading pattern fulfilling Criterion 1, but each failed to meet Criteria 2 and 3. Because there was not a simple pattern for a 10-factor structure for the TWI(HS), exploratory factor analysis was also used to force a six-factor structure on the data, corresponding to the six recognized dimensions of wellness. This analysis showed a similar lack of simple factor structure. External Validity Evidence If wellness knowledge, attitudes, and behaviors improved as a result of participation in the Lifetime Wellness classes, one would expect to see this detected by an increase in mean TWI(HS) scores from the pretest to the posttest. The scores for this
TABLE 3
Varimax Rotated Factor Loading Matrix

[Table body: loadings above .30 for Items 1–100 (rows) on Factors F1–F10 (columns). Items 1–10 = PFN, 11–20 = SC, 21–30 = SL, 31–40 = EW, 41–50 = SA, 51–60 = EAS, 61–70 = EM, 71–80 = IW, 81–90 = OW, 91–100 = SV. The assignment of individual loadings to factor columns is not recoverable here.]

Note. PFN = Physical Fitness and Nutrition; SC = Self-Care; SL = Safety and Lifestyle; EW = Environmental Wellness; SA = Social Awareness; EAS = Emotional Awareness and Sexuality; EM = Emotional Management; IW = Intellectual Wellness; OW = Occupational Wellness; SV = Spirituality and Values.
analysis were those of participants enrolled in Lifetime Wellness classes who completed pre- and posttest questionnaires (n = 127). Participants were pre- and posttested over a 12-week interval. Mean scores for each subscale, as well as for the total questionnaire, were analyzed using repeated measures t tests to determine if mean scores changed from the pretest to the posttest. These results are presented in Table 4. The ns vary because of nonresponse on some items. The posttest mean was significantly (p < .05) higher than the pretest mean for only three of the 10 subscales (Physical Fitness and Nutrition, Environmental Wellness, and Social Awareness). No significant mean change (p > .05) was found in the other seven subscales. Additionally, there was no significant mean change for the total questionnaire score, indicating that the TWI(HS) does not measure changes in wellness knowledge, attitudes, and behaviors.
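For reference, the repeated measures t statistic used here can be computed directly from paired scores. This is a minimal sketch with a hypothetical function name. It takes differences as pretest minus posttest, which is an inference from the negative t values that accompany posttest increases in Table 4, and it is meant to be run on made-up scores rather than the study's data.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(pre: Sequence[float], post: Sequence[float]) -> Tuple[float, int]:
    """Return (t, df) for a repeated measures t test.

    ASSUMPTION: differences are taken as pretest minus posttest, so a
    posttest increase produces a negative t.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two complete pre/post pairs")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1
```

Only participants with both a pretest and a posttest score on a subscale can enter the calculation, which is why the n for each row of Table 4 differs. The resulting t is compared with the critical value for n − 1 degrees of freedom at the chosen alpha level.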
DISCUSSION The discussion of the data gathered in this research will focus on how they relate to four completed studies that are similar in nature. The four previous research studies are Palombi (1987, 1992), McClanahan (1990), Murray (1996), and Papenfus and Beier (1984). A similarity of this study to those of Palombi, McClanahan, and Murray is the use of National Wellness Institute inventories as measurement instruments. The three instruments were the Lifestyle Assessment Questionnaire (LAQ;
TABLE 4
Pretest and Posttest Means, Standard Deviations, and t Values for Participants Enrolled in Lifetime Wellness Classes

                                              Pretest         Posttest
Subscale                              n       M      SD       M      SD     t Value
Physical Fitness and Nutrition       108    31.0    8.2     32.5    7.5     –2.62*
Self-Care                            109    28.7    7.4     29.7    8.4     –1.60
Safety and Lifestyle                 113    36.5    8.6     36.9    8.6     –0.82
Environmental Wellness               107    31.9    8.0     33.5    8.9     –2.31*
Social Awareness                     108    34.0    7.5     35.7    8.2     –2.61*
Emotional Awareness and Sexuality    103    40.1    7.0     40.3    8.3     –0.21
Emotional Management                 114    39.9    6.4     39.1    7.3      1.38
Intellectual Wellness                108    33.9    9.3     34.3    9.1     –0.02
Occupational Wellness                110    39.8    7.5     40.1    7.7     –0.52
Spirituality and Values              110    39.2    7.7     38.5    8.1      1.19
Total Questionnaire                   77   360.5   59.0    366.2   60.9     –1.64

*p < .05.
National Wellness Institute, 1983a, used by Palombi), the Testwell: A Self-Scoring Wellness Assessment Questionnaire (TASS; National Wellness Institute, 1983b, used by McClanahan), and the Testwell: Wellness Inventory—College Edition (TWI[CE]; National Wellness Institute, 1993, used by Murray). Each of these three studies included investigation of some aspect of reliability and/or validity of the research instruments. The study of Papenfus and Beier is similar to this study in that they investigated whether wellness inventory scores changed as a result of participation in a 10th-grade wellness class.

Only Palombi (1992) reported subscale reliability coefficients for an associated wellness inventory. She reported acceptable reliability for 8 of the 10 subscales on the LAQ (α = .74 and above), but internal consistency for the Physical Fitness and Self-Care subscales was low (α = .64 and .68, respectively). This finding is comparable to the reliability coefficients for the subscales of the TWI(HS) obtained in this study, in which only one of the 10 subscales (Self-Care) had a reliability coefficient below .70.

McClanahan (1990) reported that total questionnaire scores were stable over time for the population tested. However, an inappropriate statistic (the Pearson r interclass correlation coefficient) was used to determine stability (test–retest) reliability. To determine test–retest reliability, the pretest scores are correlated with the posttest scores to determine the degree of consistency between the two sets of scores. The pretest and posttest scores use the same measure of the same variable, thus the intraclass correlation is the appropriate statistical technique. The Pearson r is an interclass correlation and should be used when correlating two different variables (Thomas & Nelson, 1996). There were no significant mean differences in this study for the TWI(HS) total questionnaire or for 9 of the 10 subscales, indicating stability of group mean scores. However, when the intraclass coefficients were adjusted to estimate reliability for a single administration, 4 of the 10 subscales fell below .70, indicating instability of individuals’ scores. These subscales were Self-Care, Emotional Awareness and Sexuality, Intellectual Wellness, and Occupational Wellness. Because typical use of the TWI(HS) will usually require only one administration, the single-administration estimate is the relevant one. This result indicates that individual scores changed in different directions and that the stability reliability of the TWI(HS) for some subscales is questionable, particularly if the instrument is used to track individual (rather than group) changes over time. McClanahan (1990) did not report on test–retest reliability for any of the 10 subscales or intraclass correlation coefficients adjusted for a single administration, and so it must be assumed that this adjustment was not made in that study.

McClanahan (1990) used the total scores of the participants from the 10 subscales of the TASS as input data for an exploratory factor analysis and found the subscales to be explained by two higher order factors (labeled Physical and Nonphysical). Seven of the 10 subscales (Drugs and Driving, Social, Emotional Awareness, Emotional Control, Intelligence, Occupational, and Spiritual) were identified as being explained by the Nonphysical higher order factor. Three subscales (Physical Fitness, Nutrition, and Self-Care) were identified as being explained by the Physical higher order factor. McClanahan (1990) assumed a 10-factor first order structure.
Thus in her analysis, the assumption was made that the items on each subscale measured what each was purported to originally measure by the test developers (i.e., the Physical Fitness subscale items measured knowledge, attitudes, and behavior pertaining to physical fitness, etc.). This analysis was a very different analysis and addressed a different question than the question posed in the present study. McClanahan’s analysis addressed the question of a second-order structure. In the present study, a 10-factor exploratory analysis was forced on the data in this investigation corresponding to the TWI(HS) developers’ identification and division of the inventory into 10 subscales. This analysis would determine if there was a first-order structure of 10 factors. A 10-factor structure was not supported empirically. The National Wellness Institute recognizes six dimensions of wellness, but a six-factor structure was also not supported by the data in this study. Papenfus and Beier (1984) found that wellness instruction did elicit positive lifestyle changes in 10th-grade students. McClanahan (1990) also found the adoption of positive lifestyle changes in college students following wellness instruction. Significant pretest–posttest mean increases were found for the treatment groups. A greater total questionnaire mean increase was found for the participants in an activity-based course than for those in a cognitive-based course. No significant total questionnaire mean increases were found for a control group. Results for individual subscales were not reported. Murray (1996) also reported significant
mean increases in TWI(CE) scores of college students enrolled in a Lifetime Wellness class for the total questionnaire and for 7 of 10 subscales. No significant differences were found for the total questionnaire or for any subscale for scores of a control group. Unlike the results of McClanahan (1990) and Murray (1996), there was no difference found in this study between the pretest and posttest scores for the total questionnaire or for 7 of the 10 subscales of the TWI(HS). The three subscales that did detect mean increases were Physical Fitness and Nutrition, Environmental Wellness, and Social Awareness. Two possible explanations may be given for this result: (a) that wellness knowledge, attitudes, and behaviors did not change and, therefore, the scores did not change; or (b) that wellness knowledge, attitudes, and behaviors did change, but the TWI(HS) failed to measure or detect these changes. Because McClanahan (1990), Murray (1996), and Papenfus and Beier (1984) all reported evidence supporting the contention that instruction does have an influence on wellness knowledge, attitudes, and behaviors, the second explanation appears to be a more reasonable explanation for the results of this study. Although the curricula in these studies were designed for a different age group, the basic subject matter and class-based format were the same. However, in this study the researchers were unable to control the quality and content of the teaching and this should be borne in mind. During the course of this study, several practical problems were noticed when administering the TWI(HS). One problem was the lack of participant response to certain items on the questionnaire; repeatedly, responses to certain items were omitted. This nonresponse may have been due to a lack of understanding of the intended meaning of the item. 
For example, Item 12 asks about monthly examinations of the breasts or testes, Item 19 asks about maintaining a recommended blood pressure range, and Item 20 asks about maintaining a recommended blood cholesterol level. Although these are certainly important wellness considerations, they are mature adult-oriented health concerns and have limited relevance to many high school aged students. Several items pertain to operating a motor vehicle. Item 23 states, “I stay within 5 miles per hour of the speed limit,” Item 32 states, “I carpool or take as many riders as I safely can when I am driving a car (if you do not drive, answer “5”),” Item 33 states, “I drive a fuel efficient vehicle (if you do not drive, answer “5”),” and Item 35 states, “To reduce the amount of pollution, I drive a well maintained vehicle (if you do not drive, answer “5”).” These statements regarding operating a vehicle seem to assume several things. One assumption is that all potential users of the TWI(HS) are legal drivers and have access to a vehicle. This is clearly evident in Item 23, which offers no alternative response for those who are not yet legal drivers or do not have a vehicle to drive. This item also does not allow a response for those situations that would require driving well below the posted speed limit. Items 33 and 35 appear to assume that all users of the TWI(HS) have ownership or
access to the vehicle of their choice. These items do offer an alternative response for those who do not drive, but a response of 5 from a participant who does not drive is not sensible. Throughout the TWI(HS), an item score of 5 indicates the achievement of a high level of wellness. A response of 5 would therefore imply that a person who has not reached legal driving age, or who does not have access to a vehicle, has achieved a high level of environmental wellness, which is not logical. An assumption of Item 32 is that all parents or guardians are willing to accept the financial and legal responsibility for their high-school-aged driver and additional passengers. Parental or guardian approval of carpooling, as well as automobile insurance coverage restrictions, does not seem to have been considered. A response of 5 would then not be appropriate, and any other response (1, 2, 3, or 4) would indicate a lower level of wellness behavior even though the situation is not the participant’s choice.

Other items posed quite different practical problems. Item 53 states, “I have positive interactions with men in my life,” and Item 54 states, “I have positive interactions with women in my life.” These two statements confused some participants and prompted several requests for interpretation from the administrator of the TWI(HS) in this study. Some participants did not understand the word interactions and indicated to the researcher that they thought the item referred to a sexual relationship. Item 53 was the most frequently unanswered item on both the pretest and the posttest.

Item 59 states the following: “I do not engage in sexual intercourse (answer “5,” if true. Complete following if false.) If I choose to engage in sexual intercourse I take steps to prevent unwanted pregnancy.” Item 60 states the following: “I do not engage in sexual intercourse (answer “5,” if true. Complete following if false.) If I choose to engage in sexual intercourse, I use condoms to reduce the risk of disease.” The length of these items, and the additional instructions placed in the middle of each item, appeared to cause confusion. A review of the answer sheets revealed that some students answered 5 (almost always) to one of these items and gave a different response to the other. One would expect both items to be answered with a 5 (almost always), or both with a response other than 5.

Based on the empirical findings of this study and the participant observations just described, it is strongly recommended that the TWI(HS) in its present form be used in high schools only with great caution and with consideration for the findings of this research. The results for internal consistency and stability reliability indicate that the reliability of the TWI(HS) is at least questionable. The results of the exploratory factor analysis, and of the measurement of changes in wellness attitudes, indicate a lack of construct validity evidence for the TWI(HS). In summary, these results indicate that the TWI(HS) currently possesses neither adequate reliability nor adequate validity to be used widely as a measure of wellness in 9th- and 10th-grade children.
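The answer-sheet review described for Items 59 and 60 could be automated along the following lines. This is only an illustrative sketch, not the procedure used in this study: the function name, the data layout (a dictionary of item scores per respondent), and the sample values are all hypothetical. A flagged sheet is a prompt for closer inspection rather than proof of an invalid response.

```python
def flag_inconsistent_pairs(responses, item_a=59, item_b=60, abstain_code=5):
    """Return IDs of respondents who answered `abstain_code` on exactly one
    of the two items. `responses` maps a respondent ID to a dict of
    {item number: score}; this layout is an assumption for illustration."""
    flagged = []
    for respondent_id, answers in responses.items():
        a = answers.get(item_a)
        b = answers.get(item_b)
        if a is None or b is None:
            continue  # an unanswered item cannot be compared
        # Flag when one item is a 5 and the other is not.
        if (a == abstain_code) != (b == abstain_code):
            flagged.append(respondent_id)
    return flagged


# Hypothetical answer sheets, invented for this example only.
sample = {
    "S01": {59: 5, 60: 5},  # consistent: both 5
    "S02": {59: 5, 60: 3},  # flagged
    "S03": {59: 2, 60: 4},  # consistent: neither is 5
    "S04": {59: 4, 60: 5},  # flagged
    "S05": {59: 5},         # Item 60 unanswered, skipped
}
print(flag_inconsistent_pairs(sample))  # ['S02', 'S04']
```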
Improvement of the scale is needed. This should be achieved by returning to the early stages of test development. The lack of information in the test manual regarding the nature of the construct being measured indicates insufficient attention was paid to content validity. In particular, there was no clear rationale for the number and nature of the subdimensions of wellness, and no clear statement of whether the questionnaire was designed to measure knowledge, attitudes, or behaviors. The items on the test reflect a mixture of these three characteristics within the subscales. This study confirms empirically that the lack of a theoretically defensible process for developing the questionnaire has resulted in a test that lacks reliability and validity. Issues of content validity should be addressed initially, via consultation with content experts and the theoretical literature, and this should drive the development of the item bank. Following this, psychometrically sound procedures should be used to obtain data that are supportive of the reliability and validity of the developed questionnaire.
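As one example of the psychometric checks that a redeveloped questionnaire would need, internal consistency of a subscale is commonly summarized with coefficient alpha, alpha = [k / (k - 1)] × [1 - (sum of item variances) / (variance of total scores)], where k is the number of items. The sketch below computes this standard formula; it is not the analysis code from this study, and the function name and the small data set are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for one subscale.

    `scores` is a list of respondent rows, each a list of k item scores
    (an assumed layout for illustration)."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent needs a score on every item")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each respondent's total subscale score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 3-item subscale answered identically within each respondent,
# so the items are perfectly consistent and alpha is at its maximum.
perfect = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [5, 5, 5]]
print(round(cronbach_alpha(perfect), 3))
```

With real item-level data, the same function would be applied separately to each of the 10 subscales.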
REFERENCES
Baumgartner, T. A. (1968). The application of the Spearman–Brown prophecy formula when applied to physical performance tests. Research Quarterly for Exercise and Sport, 39, 847–856.
Baumgartner, T. A., & Jackson, A. S. (1999). Measurement for evaluation in physical education and exercise science (6th ed.). Boston: McGraw-Hill.
Gronlund, N. E. (1967). Measurement and evaluation in teaching. New York: Macmillan.
Hatfield, T., & Hatfield, S. R. (1992). As if your life depended on it: Promoting cognitive development to promote wellness. Journal of Counseling and Development, 71, 164–167.
Keppel, G. (1991). Design and analysis: A researcher’s handbook. Englewood Cliffs, NJ: Prentice Hall.
Lifetime Wellness Curriculum Framework. (1994). Nashville: Tennessee State Department of Education.
McClanahan, B. S. (1990). The influence of an undergraduate wellness course on lifestyle behaviors: A comparison of an activity-based course and a cognitive-based course (Doctoral dissertation, Memphis State University, 1990). Dissertation Abstracts International, 52, 433.
Murray, S. R. (1996). The efficacy of an introductory health/wellness course in positively changing wellness behaviors (Doctoral dissertation, Middle Tennessee State University, 1996). Dissertation Abstracts International, 57, 1509.
National Wellness Institute. (1983a). Life Assessment Questionnaire (2nd ed.). Stevens Point, WI: Author.
National Wellness Institute. (1983b). Testwell: A Self-scoring Wellness Assessment Questionnaire. Stevens Point, WI: Author.
National Wellness Institute. (1993). Testwell: Wellness Inventory—College Edition: User manual. Stevens Point, WI: Author.
National Wellness Institute. (1994). Testwell: Wellness Inventory—High School Edition: User manual. Stevens Point, WI: Author.
Nunnally, J. C. (1982). Reliability of measurement. In H. E. Mitzel (Ed.), Encyclopedia of educational research (5th ed.; pp. 1589–1601). New York: Macmillan.
Omizo, M. M., Omizo, S. A., & D’Andrea, M. J. (1992). Promoting wellness among elementary school children. Journal of Counseling and Development, 71, 194–198.
Palombi, B. J. (1987). Reliability and validity of wellness instruments: Users and non-users of counseling center services and their level of wellness (Doctoral dissertation, Michigan State University, 1987). Dissertation Abstracts International, 49, 2919–B.
Palombi, B. J. (1992). Psychometric properties of wellness instruments. Journal of Counseling and Development, 71, 221–225.
Papenfus, R., & Beier, B. J. (1984). Developing, implementing, and evaluating a wellness education program. Journal of School Health, 55(9), 360–362.
Robbins, G. (1994). Understanding and working with an emphasis on wellness. Thresholds in Education, 20(1), 25–29.
Robbins, G., Powers, D., & Rushton, J. (1992). A required fitness/wellness course that works. Journal of Physical Education, Recreation, and Dance, 63(2), 17–21.
Rosenstein, A. H. (1989). Health promotion and the cost of illness. College and University Personnel Association Journal, 40(4), 7–14.
Thomas, J. R., & Nelson, J. K. (1996). Introduction to research in health, physical education, recreation, and dance (3rd ed.). Champaign, IL: Human Kinetics.
U.S. Department of Health & Human Services, Public Health Service. (1992). Healthy people 2000: National health promotion and disease prevention objectives. Boston: Jones & Bartlett.