Scandinavian Journal of Psychology, 2009, 50, 183–189
Blackwell Publishing Ltd
DOI: 10.1111/j.1467-9450.2008.00701.x
Health and Disability
The Women’s Health Questionnaire (WHQ): A psychometric evaluation of the 36-item Norwegian version
EINAR KRISTIAN BORUD,1 MONICA MARTINUSSEN,2 ANNE ELISE EGGEN3 and SAMELINE GRIMSGAARD4
1 The National Research Center in Alternative and Complementary Medicine, University of Tromsø, Norway
2 Centre for Child and Adolescent Mental Health, University of Tromsø, Norway
3 Institute of Community Medicine, University of Tromsø, Norway
4 Clinical Research Center, University Hospital of North Norway, Tromsø, Norway
Borud, E. K., Martinussen, M., Eggen, A. E. & Grimsgaard, S. (2009). The Women’s Health Questionnaire (WHQ): A psychometric evaluation of the 36-item Norwegian version. Scandinavian Journal of Psychology, 50, 183–189.

The Women’s Health Questionnaire (WHQ) was designed specifically to study possible changes that occur during menopause. The purpose of this study was to perform a psychometric evaluation of the Norwegian version of the WHQ by examining the factor structure and construct validity of the instrument. Data used for the evaluation were collected at baseline of the ACUFLASH study, a randomized, controlled clinical trial that evaluated the effect of acupuncture treatment on menopausal symptoms. Altogether, 267 women with a very high frequency of hot flushes were included in the study. Some deficiencies in the WHQ questionnaire were observed when applied to this sample, including an unclear factor structure, low alpha values for some dimensions, and a strong floor effect in the vasomotor symptoms dimension. The total scale score appears reliable, but care should be taken when interpreting some of the subscales.

Key words: Menopause, quality of life, acupuncture.

Einar Kristian Borud, NAFKAM, University of Tromsø, N-9037 Tromsø, Norway. E-mail: [email protected]
INTRODUCTION
The term “quality of life” (QOL) refers to perceived physical and mental health over time (“Health-Related Quality of Life”). Quality of life instruments are widely used in research, and the term “health-related quality of life” (HR-QOL) is frequently used in the medical field. It is claimed that this approach takes qualitative aspects into account, such as the effect of subjective symptoms on day-to-day functioning and well-being (Wiklund, Karlberg, Lindgren, Sandin & Mattsson, 1993). Hence, HR-QOL instruments can be used to better understand the effect of short- and long-term disorders and symptoms in single patients and in different populations (“Health-Related Quality of Life”).

HR-QOL instruments are often divided into two subgroups: generic and specific. Generic HR-QOL instruments are designed to be applicable across a wide range of populations and interventions, while specific HR-QOL measures are designed to be relevant for particular interventions or in certain subpopulations (Coons, Rao, Keininger & Hays, 2000).

The Women’s Health Questionnaire (WHQ) is a specific HR-QOL measure. It is a self-administered questionnaire that measures the physical and mental health of women aged 40 to 65 years. It was developed in England, and designed specifically to study changes that may occur during menopause (Hunter, 1992, 2000). The Greene climacteric scale (Greene, 1998), the Utian Quality of Life Scale (Utian, Janata, Kingsberg, Schluchter & Hamilton, 2002), and the Menopause-specific Quality of Life Questionnaire (Hilditch, Lewis, Peter et al., 1996) are similar instruments.
The WHQ has demonstrated good internal consistency and test–retest reliability in several studies (Genazzani, Nicolucci, Campagnoli et al., 2002; Hunter, 2003; Silva Filho, Baracat, Conterno, Haidar & Ferraz, 2005). It has been translated into 27 languages, and validated in many countries, including Sweden, Italy and Brazil (Portuguese version) (Genazzani et al., 2002; Silva Filho et al., 2005; Wiklund et al., 1993). The WHQ was translated into Norwegian by the Mapi Research Institute, but a psychometric validation had not been performed until the current study. A high prevalence of menopausal vasomotor symptoms was a criterion for inclusion in the present study. The participants in this study reported more vasomotor symptoms than participants in prior studies of the WHQ. Hence, it was necessary to evaluate the psychometric properties of the instrument among women presenting a high degree of vasomotor complaints. The evaluation was performed by examining the factor structure of the Norwegian version, and by exploring the construct validity of the instrument by comparing the WHQ to instruments measuring related constructs, such as a measure of psychosomatic complaints (PSC) and a measure of positive health status (EQ-5D). We expected the WHQ to be negatively correlated with psychosomatic complaints and positively related to health status.
METHODS
Participants
A total of 267 women were included in the study. Mean age at inclusion was 53.8 (SD = 4.4) years, and mean age at menopause was 48.9
© 2008 The Authors. Journal compilation © 2008 The Scandinavian Psychological Associations. Published by Blackwell Publishing Ltd., 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA. ISSN 0036-5564.
Table 1. Baseline characteristics of the postmenopausal women (N = 266)

Characteristic                                 N     (%)
Years of education [four categories; labels lost in extraction apart from "17"]
                                             124    (47)
                                              25     (9)
                                              49    (19)
                                              67    (25)
Paid work                                    214    (81)
Number of children
  0                                           34    (13)
  1                                           48    (18)
  2                                          110    (41)
  3                                           58    (22)
  4                                           16     (6)
Living alone                                  48    (18)
Self-reported health
  Bad                                          5     (2)
  Not quite good                              68    (26)
  Good                                       152    (57)
  Excellent                                   38    (14)
  Missing                                      4     (1)
Insomnia
  Never/a few times a year                    65    (24)
  1–3 times/month                             41    (16)
  Once a week                                 26    (10)
  More than once a week                      132    (49)
  Missing                                      3     (1)
Insomnia that affects working ability        126    (47)
Hypertension                                  46    (17)
Hypothyreosis                                 30    (11)
(SD = 3.7) years (see Table 1 for further sample characteristics). Data used for evaluation were collected at baseline of the ACUFLASH study, a multi-center (Tromsø, Bergen and Oslo), randomized, controlled trial that evaluated whether acupuncture-care combined with self-care was more effective than self-care alone for relief of climacteric complaints (Borud, Alraek, White, Fonnebo & Grimsgaard, 2007). Study participants were recruited by newspaper advertisements and media coverage of the study by newspapers and television. The baseline questionnaire was answered by 266 of the 267 women who were included in the study. For practical reasons, the psychosomatic complaints checklist (PSC) was included in the baseline questionnaire three months after the study start; it was therefore administered to 181 participants in total. Postmenopausal women (at least one year since the last menstrual bleeding) were eligible if they documented at least an average of seven hot flushes per day over seven consecutive days during the two-week qualifying period. Exclusion criteria included surgical menopause, history of cancer within the past five years, use of anticoagulant medication, heart valve disease, poorly controlled hypertension, poorly controlled hypothyroidism, hyperthyroidism, poorly controlled diabetes mellitus, organ transplant, mental disease, overt drug or alcohol dependency, or inability to complete study forms. Use of systemic hormone therapy, selective serotonin reuptake inhibitors and serotonin-norepinephrine reuptake inhibitors required an eight-week wash-out period, and use of local prescription hormone therapy required a four-week wash-out period.
Measurements
The Women’s Health Questionnaire. The range of subscales included in the WHQ enables a detailed assessment of several dimensions of mental and physical health, including depression, anxiety, sleep problems, and somatic symptoms, along with subscales for menstrual problems and sexual difficulties (Hunter, 2003). The following domains are covered by the questionnaire: anxiety/fears (items 2, 4, 6, 9), attractiveness (items 21, 32), somatic symptoms (items 14–16, 18, 23, 30, 35), memory/concentration (items 20, 33, 36), vasomotor symptoms (items 19, 27), depressed mood (items 3, 5, 7, 8, 10, 12, 25), sleep problems (items 1, 11, 29), sexual behavior (items 24, 31, 34) and menstrual symptoms (items 17, 22, 26, 28). The WHQ is scored on a four-point Likert scale (1 = yes, definitely, 2 = yes, sometimes, 3 = no, not much, 4 = no, not at all). Items 7, 10 and 25 of the depressed mood category and items 21 and 32 of the attractiveness domain are reversed before scoring. The items are usually dichotomized before scoring, and within each domain an average score between 0 and 1 is calculated, where 0 is an indicator of “good health status” and 1 is an indicator of “poor health status”. A clinically significant change within each domain of the WHQ is a difference of approximately 0.10 to 0.20. Norms are available for different age groups, nationalities and menopausal status (Hunter, 2003).

EQ-5D. EQ-5D is a standardized generic quality of life instrument that is used as a measure of health outcome (“EQ-5D, an instrument to describe and value health”). The first part of the EQ-5D descriptive system consists of five dimensions: mobility, self-care, usual activity, pain/discomfort, and anxiety/depression. Each dimension has three levels, which are designated as no problem, some problem, or extreme problem. Subjects are asked to check the level most indicative of their current level of function or experience on each dimension. Five dimensions, each with three levels, yield 243 possible distinct health states that comprise the classification system (11111 to 33333).
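The count of 243 health states follows from three levels on each of five dimensions (3^5). A minimal sketch that enumerates the state codes is given here; the variable names are illustrative and not part of the EQ-5D documentation.

```python
from itertools import product

# The five EQ-5D dimensions named in the text, each rated on three levels.
DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activity",
    "pain/discomfort",
    "anxiety/depression",
)
LEVELS = (1, 2, 3)  # 1 = no problem, 2 = some problem, 3 = extreme problem

# Each health state is written as a five-digit code, one digit per dimension.
health_states = [
    "".join(str(level) for level in combo)
    for combo in product(LEVELS, repeat=len(DIMENSIONS))
]

print(len(health_states))                    # 243
print(health_states[0], health_states[-1])   # 11111 33333
```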
The classification system has been assigned several different standardized scores derived from population-based samples of respondents asked to assign values to subsets of the 243 states using the anchoring labels noted above. The values are in the interval −1 to 1, where the health state 33333 (worst possible health state) is given the value −1.000, and the health state 11111 (no problems at all) is given the value 1.000. The second part of the EQ-5D is a 20 cm visual analogue scale (VAS), which has end-points labelled “best imaginable health state” and “worst imaginable health state” anchored at 100 and 0, respectively. Respondents are asked to indicate how they rate their own health state by drawing a line from an anchor box to that point on the VAS which best represents their own health on that day. The first part of the EQ-5D produces a health index based on a descriptive system, and the second part is a self-rated assessment of health status based on the VAS (Coons et al., 2000). Cronbach’s alpha for the first part of the EQ-5D was 0.63 based on our data.

Psychosomatic Complaints (PSC). The PSC is a checklist of 19 physical symptoms (Quinn & Shepard, 1974). Subjects indicate how often they have experienced each of the 19 physical conditions (e.g., poor appetite, headaches, pain in the heart, sleep disturbances, backaches, restlessness) during the past year on a scale ranging from 1 = “never” to 4 = “often”. The score is calculated as the mean of the 19 items. Cronbach’s alpha for PSC, based on the current sample, was 0.79.

Construct validity was explored by correlating the WHQ total score and the scores of the separate dimensions of the WHQ with the scores of EQ-5D, EQ-5D VAS, and the PSC. The PSC was expected to show a medium to large negative correlation with WHQ total score.
The PSC was also expected to be related to the somatic symptoms dimension, and to the anxiety/fears and depressed mood dimensions, as previous studies have demonstrated a high correlation between PSC and burnout symptoms (Martinussen & Richardsen, 2006; Martinussen, Richardsen & Burke, 2007). A medium correlation was expected between the EQ-5D health index and the WHQ total score. Medium to large correlations were expected between the EQ-5D health index and the somatic symptoms, anxiety/fears and depressed mood dimensions, as two of the dimensions in the EQ-5D health index were pain/discomfort and anxiety/depression. The EQ-5D VAS “health status” was expected
to correlate more strongly with the WHQ somatic symptoms dimension than with the other dimensions.

Internal consistency was evaluated by calculating Cronbach’s alpha coefficient (Bland & Altman, 1997). The presence of floor or ceiling effects was evaluated by calculating the proportion of participants with the lowest or highest possible score. Floor or ceiling effects are considered to be present if more than 15% of the respondents achieved the lowest or highest possible score, respectively (Terwee, Bot, de Boer et al., 2007).
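The two checks described above can be sketched as follows. This is a minimal illustration using the standard formula for Cronbach's alpha and the 15% criterion cited in the text; the function names and the small example data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def floor_ceiling(scores, lowest, highest, threshold=0.15):
    """Share of respondents at each extreme, flagged against the 15% criterion."""
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical responses: four respondents, three items on a 1-4 scale.
example_rows = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 4]]
print(round(cronbach_alpha(example_rows), 2))

# Hypothetical subscale scores where most respondents have the lowest score.
example_scores = [1, 1, 1, 1, 2, 3]
print(floor_ceiling(example_scores, lowest=1, highest=4))
```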
Procedure
Women who wanted to participate phoned the study coordinator, received information about the study and were briefly screened by telephone for eligibility. Potential participants received a diary by mail and recorded the frequency and severity of hot flushes and the duration of sleep at night for a period of 14 days. Women who returned the diary and fulfilled the inclusion criteria received an informed consent form and the baseline questionnaires, which included the WHQ, the EQ-5D and the PSC, by mail. The questionnaires were completed at home, and returned during the enrollment visit with the local study coordinator. The coordinators double-checked the eligibility criteria and obtained written informed consent. The participants were stratified by center and thereafter block randomized (random block size of four, six or eight) to receive additional acupuncture or not receive additional acupuncture. Block randomization (organizing study participants into blocks and randomizing within each block) was used to ensure close balance of the numbers in each group at any time during the trial. The treatment group received 10 sessions of acupuncture-care and self-care, and the control group engaged in self-care only. The study was approved by the Norwegian Data Inspectorate, the Norwegian Biobank Registry and the Regional Committee for Medical Research Ethics.
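The allocation scheme described above (stratification by center, then blocks of random size four, six or eight with equal numbers per arm) can be sketched as follows. This is only an illustration of the general technique, not the trial's actual randomization procedure; the function names, the seed and the number of participants per center are hypothetical.

```python
import random

ARMS = ("acupuncture + self-care", "self-care only")


def make_block(size, rng):
    """One balanced block: half of the slots per arm, in random order."""
    block = [ARMS[0]] * (size // 2) + [ARMS[1]] * (size // 2)
    rng.shuffle(block)
    return block


def allocate(n_participants, rng, block_sizes=(4, 6, 8)):
    """Allocation list for one center, built from blocks of random size."""
    allocations = []
    while len(allocations) < n_participants:
        allocations.extend(make_block(rng.choice(block_sizes), rng))
    return allocations[:n_participants]


# Stratification: each center gets its own independent sequence of blocks.
rng = random.Random(2009)  # fixed seed only to make the illustration repeatable
schedule = {center: allocate(10, rng) for center in ("Tromsø", "Bergen", "Oslo")}

for center, arms in schedule.items():
    print(center, arms.count(ARMS[0]), arms.count(ARMS[1]))
```

Because every complete block is balanced, the difference between the arms within a center can never exceed half of the largest block size.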
Sample size
The sample size in the ACUFLASH study was calculated using data from previous trials of hormone therapy, herbs and acupuncture. We aimed to detect a 50% reduction in hot flush rate in the acupuncture group and a 20% difference between groups. Assuming a baseline hot flush rate of 7.0 ± 3.5 (M ± SD) for change in flush rate, and employing a two-sample t-test, 100 women were needed in each group to obtain 80% power (two-tailed test, and α-value of 0.05). A total of 267 women were recruited to compensate for study dropout and withdrawal. A “rule of thumb” for determining a priori sample size for exploratory factor analysis is a subject to item ratio of 10:1 or more (Costello & Osborne, 2005). In this study, the ratio was 9:1. Adequate sample size is partly determined by the nature of the data; the stronger the data, the smaller the sample need be for accurate analysis (MacCallum, Widaman, Zhang & Hong, 1999). Strong data result from several items loading strongly on each factor, and high communalities without cross-loading. These conditions may be rare in practice (Widaman, 1993).
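As a rough check on these figures, the sketch below uses the normal approximation to the two-sample t-test. It assumes, as an interpretation that the text does not state explicitly, that the 20% between-group difference corresponds to 1.4 flushes per day (20% of 7.0) with an SD of 3.5. The subject-to-item ratio is computed from the 266 respondents and the 29 items entered into the factor analysis, which is also an assumption about how the reported 9:1 was obtained. All names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(difference, sd, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sample comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = NormalDist().inv_cdf(power)
    effect_size = difference / sd                  # standardized difference
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


assumed_difference = 0.20 * 7.0  # assumption: 20% of the baseline rate of 7.0
print(n_per_group(assumed_difference, sd=3.5))     # 99

# Subject-to-item ratio under the assumed counts (266 respondents, 29 items).
print(round(266 / 29, 1))                          # 9.2
```

The approximation gives 99 per group, close to the 100 reported; a calculation based on the t distribution would be expected to give a slightly larger number than the normal approximation.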
Statistical analysis
SPSS software, version 14.0 (SPSS Inc, Chicago, IL, USA), was used for all statistical analyses, except for significance-testing (Hotelling’s T² test) of differences between correlation coefficients, for which the program Simple Interactive Statistical Analysis (SISA, Quantitative Skills, Consultancy for Research and Statistics, The Netherlands) was used. Scores in the different dimensions of the WHQ were calculated, after reversing the appropriate items, as mean scores based on the four-point Likert scale. The factor structure of the WHQ was evaluated through exploratory factor analysis using Principal Component Analysis (PCA) with Varimax rotation. The number of factors was determined by examining the Scree plot.
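The dimension scoring used in this study (reverse the appropriate items, then average the four-point responses within each dimension) can be sketched as follows. The item assignments and reversed items are those listed under Measurements; the function name, the 5 − x reversal for a 1 to 4 scale and the handling of missing answers (skipped when averaging) are assumptions of this sketch, not details given by the authors. The dichotomized 0 to 1 scoring is not shown because the text does not specify the cut-off.

```python
# Items reversed before scoring (depressed mood: 7, 10, 25; attractiveness: 21, 32).
REVERSED_ITEMS = {7, 10, 21, 25, 32}

# Dimensions retained in this study (menstrual and sexual items were excluded).
SUBSCALES = {
    "anxiety/fears": [2, 4, 6, 9],
    "attractiveness": [21, 32],
    "somatic symptoms": [14, 15, 16, 18, 23, 30, 35],
    "memory/concentration": [20, 33, 36],
    "vasomotor symptoms": [19, 27],
    "depressed mood": [3, 5, 7, 8, 10, 12, 25],
    "sleep problems": [1, 11, 29],
}


def score_whq(responses):
    """Mean score per dimension from a dict {item number: response 1-4}."""
    scores = {}
    for name, items in SUBSCALES.items():
        values = []
        for item in items:
            raw = responses.get(item)
            if raw is None:
                continue  # assumption: missing answers are left out of the mean
            values.append(5 - raw if item in REVERSED_ITEMS else raw)
        scores[name] = sum(values) / len(values) if values else None
    return scores


# Hypothetical respondent who answers "yes, definitely" (1) to every item.
example = {item: 1 for items in SUBSCALES.values() for item in items}
print(score_whq(example))
```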
The original developer of the WHQ, and the developers of a revised version of the instrument, used PCA with Varimax rotation to explore the factor structure (Girod, de la, Keininger & Hunter, 2006; Hunter, 2000). Varimax is an orthogonal rotation that minimizes the complexity of the components by increasing large loadings and decreasing small loadings within each component, and it is by far the most commonly used method (Costello & Osborne, 2005). Examination of the Scree plot is considered the best choice to enable investigators to determine the number of factors to retain (Costello & Osborne, 2005). Tabachnick and Fidell (2001) suggested 0.32 as a good cut-off point for minimum item factor loading. This equates to approximately 10% overlapping variance with the other items in the factor. Cross-loading items are items loading 0.32 or higher on two or more factors (Costello & Osborne, 2005). Exploratory factor analysis, rather than confirmatory factor analysis, was chosen in this study because analyses of the WHQ on a pooled international database had previously resulted in an unclear factor structure (Girod et al., 2006). Another reason was that the participants in this study reported far more disturbance due to vasomotor episodes compared to the participants used to evaluate the original instrument.
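The analysis steps described in this section can be illustrated with a short NumPy sketch: principal components of the item correlation matrix, eigenvalues for a scree plot, a plain varimax rotation, and the 0.32 cut-off for flagging cross-loading items. This is not the SPSS procedure used in the study (SPSS applies Kaiser normalization by default, which is omitted here), the data are synthetic, and all function names are illustrative.

```python
import numpy as np


def pca_loadings(data, n_factors):
    """Eigenvalues (for a scree plot) and unrotated component loadings."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return eigvals, loadings


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation without Kaiser normalization."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3 - rotated @ column_ss / n_items)
        )
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation


# Synthetic data only: 200 respondents, 6 items driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
weights = np.array([[0.8, 0.0], [0.7, 0.1], [0.6, 0.2],
                    [0.0, 0.8], [0.1, 0.7], [0.2, 0.6]])
data = latent @ weights.T + rng.normal(scale=0.6, size=(200, 6))

eigvals, loadings = pca_loadings(data, n_factors=2)
rotated = varimax(loadings)

# A loading of 0.32 corresponds to 0.32**2 = 0.1024, i.e. about 10% shared variance.
cross_loading = (np.abs(rotated) >= 0.32).sum(axis=1) >= 2
print(np.round(eigvals, 2))
print(np.round(rotated, 2))
print(cross_loading)
```

Because the rotation is orthogonal, each item's communality (the sum of its squared loadings) is the same before and after rotation; only the distribution of variance across components changes.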
RESULTS
Factor analysis
The menstrual symptoms dimension of the WHQ was excluded from the study because all participants were postmenopausal. Items related to the sexual behavior dimension were also excluded due to a preponderance of missing values. The initial PCA was performed on the remaining 29 items. Based on the Scree plot, five factors were identified. Descriptive statistics and principal component analysis results are presented in Table 2.

Four items from the original anxiety/fears dimension (items 2, 4, 6, 9) had the highest loadings on the first factor. However, items 11 and 29 of the sleep dimension and items 3, 5, 8 and 12 of the depressed mood dimension also loaded on this factor. Hence, factor one may represent an anxiety/depression dimension. The second factor included items that described feelings of attractiveness, well-being and liveliness; therefore, this factor may represent a “well-being” dimension. In addition to the items of the original attractiveness dimension, items of the original depressed mood, somatic symptoms and anxiety/fears dimensions cross-loaded or loaded on this factor. The high loading items on factor 3 described a range of somatic symptoms; hence this factor represents a somatic symptoms dimension. The factor includes all the items of the original somatic symptoms dimension, one item from the original anxiety/fears dimension and two items from the original sleep problems dimension. The highest loading items on factor 4 are the three items of the original memory/concentration dimension. In addition, items from the somatic symptoms and depressed mood dimensions cross-loaded on this factor. Factor 5 included items that reflected symptoms related to menopausal complaints. Here we found the two items of the original vasomotor symptoms dimension and item 1 of the original sleep problems dimension. This dimension may thus represent a “menopausal vasomotor symptoms/sleep problems” dimension.
Table 2. Descriptive statistics and principal component analysis results for the WHQ items (N = 266)

Item no.ᵃ  M (SD)       Item
2        3.54 (0.74)  I get very frightened or panic feelings for apparently no reason at all
4        3.82 (0.55)  I feel anxious when I go out of the house on my own
6        2.96 (0.96)  I get palpitations or a sensation of “butterflies” in my stomach or chest
9        2.66 (0.96)  I feel tense or “wound up”
21 (R)   2.95 (0.85)  I feel rather lively and excitable
32 (R)   2.74 (0.85)  I feel physically attractive
14       2.62 (0.96)  I have headache
15       2.00 (0.92)  I feel more tired than usual
16       3.12 (0.93)  I have dizzy spells
18       2.04 (1.00)  I suffer from backache or pain in my limbs
23       3.32 (0.87)  I feel sick or nauseous
30       2.75 (1.13)  I often notice pins and needles in my hands and feet
35       2.32 (1.10)  I need to pass urine more frequently than usual
20       2.90 (0.96)  I am more clumsy than usual
33       2.48 (0.92)  I have difficulty in concentrating
36       2.35 (0.86)  My memory is poor
1        1.96 (0.94)  I wake early and then sleep badly for the rest of the night
11       2.84 (0.98)  I am restless and can not keep still
29       2.33 (1.07)  I have difficulty in getting off to sleep
3        2.95 (0.91)  I feel miserable and sad
5        3.17 (0.93)  I have lost interest in things
7 (R)    3.71 (0.60)  I still enjoy the things I used to
8        3.61 (0.82)  I feel life is not worth living
10 (R)   3.80 (0.48)  I have good appetite
12       2.63 (0.96)  I am more irritable than usual
25 (R)   3.18 (0.82)  I have feelings of well-being
19       1.03 (0.18)  I have hot flushes
27       1.23 (0.52)  I suffer from night sweats
13       2.92 (0.91)  I worry about growing old

Variance explained (%): Factor 1 = 12; Factor 2 = 12; Factor 3 = 10; Factor 4 = 9; Factor 5 = 5

Factor loadingsᵇ (Factors 1 to 5) [the item and factor alignment of the loadings was lost in extraction; values are listed in their extracted order]:
0.70 0.67 0.54 0.56
0.36 0.38 0.77 0.69 0.48
0.33
0.62 0.47 0.56 0.67 0.49 0.63 0.39
0.37
0.39 0.70 0.71 0.77
0.37 0.52 0.39 0.48 0.41
0.55
0.46 0.46 0.54 0.58
0.40
0.38 0.42
0.36 0.71 0.34 0.83
0.36
0.34
Notes: Total variance explained (48%). Factor loadings