MEASUREMENT IN PHYSICAL EDUCATION AND EXERCISE SCIENCE, 9(2), 79–111 Copyright © 2005, Lawrence Erlbaum Associates, Inc.
Service Quality Assessment Scale (SQAS): An Instrument for Evaluating Service Quality of Health–Fitness Clubs Eddie T. C. Lam Department of Health, Physical Education, Recreation, and Dance Cleveland State University
James J. Zhang Department of Exercise and Sport Sciences University of Florida
Barbara E. Jensen Professor Emeritus, Physical Education Springfield College
Requests for reprints should be sent to Eddie T. C. Lam, Department of Health, Physical Education, Recreation, and Dance, Cleveland State University, 2121 Euclid Avenue, PE 218, Cleveland, OH 44115-2214. E-mail: [email protected]
This study was designed to develop the Service Quality Assessment Scale to evaluate the service quality of health–fitness clubs. Through a review of literature, field observations, interviews, modified application of the Delphi technique, and a pilot study, a preliminary scale with 46 items was formulated. The preliminary scale was administered to members of one health–fitness club. From exploratory factor analysis (EFA) of the pilot test data, 6 factors emerged. Next the revised scale (reduced to a 40-item scale) was administered to 10 health–fitness clubs (N = 1,202). The data set was split into halves: one for EFA and the other for confirmatory factor analysis (CFA). Six factors emerged in the EFA: Staff, Program, Locker Room, Physical Facility, Workout Facility, and Child Care. The fit indexes from the CFA indicated that the model was permissible (e.g., Root Mean Square Error of Approximation = .07, Standardized Root Mean Square Residual = .05, Comparative Fit Index = .87). All the factors had acceptable alpha and composite reliability coefficients. The model was then tested for invariance across gender; 9 items were eliminated due to a lack of invariance for factor loadings or tau coefficients. The 31-item scale with 6 factors displayed sound psychometric properties and invariance for factor loadings and tau coefficients, and can be utilized to evaluate service-quality issues in various health–fitness club settings.
Key words: exploratory factor analysis, confirmatory factor analysis
Americans are becoming more health conscious than ever. According to the Profiles of Success (International Health, Racquet and Sportsclub Association [IHRSA], 2004), in the year 2001 more than 50 million people in the United States participated in sport or fitness activities. Those classified as frequent health club attendees increased in number more than 200% between 1987 and 2003 to approximately 15.7 million, whereas the average attendance per member rose from 72 to 90 days per year (IHRSA). The health consciousness of Americans is also reflected by the increase in health–fitness club membership. In the last decade, memberships have doubled from 17.3 million in 1987 to more than 39.4 million in 2004; meanwhile, the number of commercial health–fitness clubs has exceeded 22,000. Based on the demand of the members, more than 95% of these health–fitness clubs provide free-weight and cardiovascular equipment, whereas 78% of them have an aerobic and group exercise area (IHRSA). Health–fitness clubs not only compete among themselves for the 36 million potential members, but they also face challenges from other organizations in this $8 billion industry. The number of health–fitness clubs in apartments, hotels, and resorts is growing to satisfy those people who expect to be able to work out at their on-site facilities or while traveling. Learning that wellness programs can reduce absenteeism due to illness as well as lower health insurance costs, corporations now invest more of their revenues than ever in corporate fitness centers. On the other hand, backed by medical expertise and the reputations of their parent hospitals, hospital-owned clubs have become popular within their communities (McDonald & Howland, 1998). Tax-exempt organizations such as the Young Men’s Christian Association and Young Women’s Christian Association, which benefit from the lower cost structures, often compete directly with commercial health–fitness clubs by offering similar services. 
Under such internal competition among themselves and external threats from nonprofit organizations, top management of health–fitness clubs need to develop sound operating and management processes to enhance service quality. Service quality has been recognized as one of the major elements that affect member retention and the long-term profitability of an organization (McDonald & Howland, 1998; Zeithaml, Berry, & Parasuraman, 1996). Marketing resources are better spent on keeping existing customers than attracting new ones (Fornell, 1992; Fornell & Wernerfelt, 1987; Sonnenberg, 1989). To satisfy customers, service quality has to meet or exceed the expectations of members (Stum & Thiry, 1991).
SERVICE QUALITY OF HEALTH–FITNESS CLUBS
Satisfied customers are most often repeat customers. High customer satisfaction results in a better reputation of the organization, lower costs of attracting new customers, fewer resources devoted to handling and managing complaints, and more customer referrals (Anderson & Sullivan, 1993; Fornell & Wernerfelt, 1987; Garvin, 1988; Hallowell, 1996; Parasuraman, Zeithaml, & Berry, 1988). Service quality also has a strong effect on purchase intentions and customer satisfaction. For example, satisfied customers are more likely to become involved more frequently, to take part in other services offered by the organization, to pay for the benefits that they receive, and to be more tolerant of price increases (McAlexander, Kaldenberg, & Koenig, 1994; Reichheld & Sasser, 1990). Considering membership dues are a major source of revenue for health–fitness clubs, member retention is critical to the financial status of the health–fitness clubs (Cannie & Caplin, 1991; Horovitz, 1990; Jones & Sasser, 1995; Reichheld & Sasser, 1990; Sawyer & Smith, 1999; Zeithaml et al., 1996). Because the loyalty of participants to their recreation or health–fitness service may be related to their perception and evaluation (Pritchard, Howard, & Havitz, 1992), an ongoing process to evaluate service quality is necessary to meet the needs and expectations of the customers.
Service-Quality Measuring Instruments
To frequently assess service quality of health–fitness clubs, an effective instrument to obtain valid data is essential. Only a very limited number of scales have been developed that are pertinent to the health–fitness setting. The SERVQUAL (Parasuraman et al., 1988) has been a popular scale used to measure service quality in various service industries. Parasuraman et al. (1988) claimed that the SERVQUAL is a generic scale that can be applied to a wide spectrum of settings.
The SERVQUAL was developed based on four samples, and the scale consists of general measurement questions in five dimensions supported by exploratory factor analysis (EFA): (a) Tangibles, (b) Reliability, (c) Responsiveness, (d) Assurance, and (e) Empathy. The Tangibles factor refers to the physical properties of the club whereas the other factors refer to the intangible service aspects. Later on, a number of other scales were generated based on the SERVQUAL model. Using the same five dimensions of the SERVQUAL, MacKay and Crompton (1990) developed a 25-item scale to measure recreation service quality. Their scale includes many of the same items as on the SERVQUAL. Using the items of the SERVQUAL and the scale developed by MacKay and Crompton (1990) as an item pool, Wright, Duray, and Goodale (1992) created a 30-item scale to assess recreation-center service quality. Wright et al. found that facility cleanliness had the highest ranking (most essential) based on the responses of recreation-center users. Similarly, Howat, Absher, Crilley, and Milne (1996) formulated the 15-item Center for Environmental and Recreation Management-Customer Service Quality
(CERM-CSQ) Scale. The CERM-CSQ has four dimensions supported by EFA: Core Services, Staff Quality, General Facility, and Secondary Services. All of the items of the CERM-CSQ can be classified under the five dimensions of the SERVQUAL. Later, Howat, Murray, and Crilley (1999) found that perceived service-quality attributes could be classified under three dimensions: Personnel, Core, and Peripheral. All of these researchers adopted or developed items related to the five dimensions of the SERVQUAL scale. Emphasis was directed toward evaluating the structural validity (Messick, 1995) of the model of service quality with EFA methodology. Other researchers have attempted to develop assessment scales specific to the health and fitness industry. Obtaining data from 436 fitness club members in a Canadian metropolitan area, Chelladurai, Scott, and Haywood-Farmer (1987) developed the Scale of Attributes of Fitness Services (SAFS) to measure service quality of fitness clubs. The SAFS includes 30 items in 5 dimensions: Professional, Consumer, Peripheral, Facilitating Goods, and Goods and Services. The first 4 dimensions of the SAFS relate to the primary services offered by fitness clubs whereas the last dimension, Goods and Services, is not directly related to fitness per se. The items of the SAFS were developed based on the input of “three professors of sport management, one university fitness instructor, and six staff members of a commercial fitness club” (p. 163, Chelladurai et al., 1987), and item retention was determined by item-total correlations. EFA was not employed to examine the factor structure of the SAFS. In another study, Kim and Kim (1995) developed the 33-item Quality Excellence of Sports Centers (QUESC) to assess service quality of sport centers after “a review of literature of service quality” and the interview of “a focus group” (p. 211). 
The QUESC was based on a Korean sample and has 11 dimensions supported by EFA: Ambience, Employee Attitude, Reliability, Information, Programming, Personal Consideration, Privileges, Price, Ease of Mind, Stimulation, and Convenience. Kim and Kim reported that the 11 factors were based on EFA; however, 3 factors of the QUESC (Price, Privileges, and Stimulation) have only 1 item each. The stability of a single-item factor would be questionable. Papadimitriou and Karteroliotis (2000), however, did not support the factor structure of the QUESC with EFA procedures when using respondents in the sport and fitness centers in Greece. Instead, they suggested a 4-factor model that included Instructor Quality, Facility Attraction and Operation, Program Availability and Delivery, and Other Services. These 4 dimensions accounted for 57.1% of the variance.
Although merits may be associated with the above scales, their content and substantive and structural validity are questionable (Messick, 1995). First, the SERVQUAL is a generic scale designed to measure service quality in various businesses and industries; however, it is too general and thus does not provide specific information for health–fitness club management to improve their operational practice. The uniqueness of the health–fitness industry is the programs it offers. This essential dimension is missing from the generic service-quality models (Brady & Cronin, 2001; MacKay & Crompton, 1988, 1990; McDougall & Levesque, 1994; Parasuraman et al., 1988; Rust & Oliver, 1994); thus, researchers need to develop an alternative instrument that could be used to evaluate specific service-quality aspects important in the health–fitness club setting.
A second observation was that all of the scales developed appear to need additional content and substantive validity information (Messick, 1995). For example, the 4 dimensions of the CERM-CSQ (Howat et al., 1996) were established by an EFA using 15 items; however, the content relevance of some of the items under the factors appears questionable. The General Facility dimension has such items as “safe parking” and “facility cleanliness,” but similar items like “facility comfort” and “quality equipment” are under the Core Services dimension. The Staff Quality dimension has such items as “staff presentation” and “officials,” but an item like “organization” is also included in the Core Services dimension. We found similar content-relevance issues in the scale developed by Howat et al. (1999). The item “The center should have facilities which are well maintained” is under the Personnel dimension. The item “The center’s equipment should be of a high quality” loaded .42 on one factor and –.44 on another and yet was retained. Often the researchers did not effectively use focus groups to propose dimensions or items related to the hypothesized models. The assessment of item content-relevance of the scales developed specifically for the health–fitness setting did not include newer procedures suggested by Dunn, Bouffard, and Rogers (1999).
A third issue is the generalizability validity (Messick, 1995) regarding the QUESC (Kim & Kim, 1995). The structure and operation of health–fitness clubs in other countries may be different from the current practices in the United States.
For example, the number of older health-club members (over the age of 55) in the United States has increased more than 350% since 1987, reaching 6.9 million in 2002 (IHRSA, 2003). On the contrary, Korean health–fitness centers may concentrate their resources and services more on young adults than on elderly people because of the latter’s comparatively low acceptance of the importance of physical activities (Cho, 2002). A systematic investigation is needed to generate a set of items that are both content-representative and content-relevant for the health–fitness setting.
The purpose of this study was twofold. First, the study was designed to develop items for the Service Quality Assessment Scale (SQAS) to assess the service quality of health–fitness clubs. The main focus of the developmental process was to identify the service-quality dimensions so that various aspects of service quality could be evaluated. Second, the items for the SQAS were evaluated by exploratory factor analysis, confirmatory factor analysis, and gender invariance testing, as well as internal consistency reliability measures following current scale-development procedures utilized in the measurement field.
Conceptual Service-Quality Model
In previous studies, service-quality models were either too general (i.e., a generic model that can be applied to various service industries) or too specific (i.e., a model that is designed solely for the industry under investigation). Considering that both approaches have merits and limitations (Brown, Churchill, & Peter, 1993; Murray & Howat, 2002; Parasuraman et al., 1988), the service-quality model proposed in this study was a synthesis of the models that were formulated for the different service industries as a whole (Brady & Cronin, 2001; McDougall & Levesque, 1994; Parasuraman et al.; Rust & Oliver, 1994) as well as models that were specifically designed for the sport and recreation entities (Chelladurai et al., 1987; Howat et al., 1996, 1999; Kim & Kim, 1995; Papadimitriou & Karteroliotis, 2000).
For the purpose of this study, a multidimensional service-quality model was formulated to identify the constructs that were used to determine perceived service quality of health–fitness clubs. In this hypothesized model, overall perceived service quality was determined by club members as a result of their service encounters in six dimensions: Staff, Program, Child Care, Locker Room, Physical Facility, and Workout Facility. These six dimensions, however, could be grouped under three major constructs (i.e., Personnel, Program, and Facility), which were derived from the tri-component model of service quality (Rust & Oliver, 1994), the three-factor model of service quality (Brady & Cronin, 2001), the SERVQUAL (Parasuraman et al., 1988), the three-factor service-quality attributes of Howat et al. (1999), and the four-factor service-quality expectations scale of Papadimitriou and Karteroliotis (2000). The constructs of these models and scales are summarized in Table 1.
The six dimensions within these three constructs were hypothesized to be determinants of service quality that could be used to capture consumer perceptions of service quality in various health–fitness club settings. The definitions of these dimensions are important for model clarification.
Personnel. Most services involve the interaction between the service provider and the customers (Zeithaml, Parasuraman, & Berry, 1985). This is particularly true in the health–fitness industry, where the interaction begins as soon as the customer walks in the door. The staff of a health–fitness club represents the organization and promotes the service directly to the customer (Shostack, 1977). The appearance, attitude, knowledge, and courtesy of the personnel have a direct influence on the customer’s perception of service quality (Bitner, Booms, & Tetreault, 1990; Brady & Cronin, 2001; Czepiel, Solomon, & Surprenant, 1985). Mudie and Cottam (1999) indicated that during a service encounter, the customer will interact with the animate objects (the service employees), and, therefore, such attributes as knowledge and courtesy of the service personnel are very important. In the SERVQUAL scale (Parasuraman et al., 1988), four of the five dimensions are related to how the service was handled by personnel (i.e., Reliability, Responsiveness, Assurance, and Empathy). Brady and Cronin (2001) asserted that customers aggregate their evaluations of attitude, behavior, and expertise to form their perceptions of an organization’s performance on the Interaction Quality dimension. In the models developed by Rust and Oliver (1994) and McDougall and Levesque (1994), this dimension is referred to as Service Delivery and Service Process, respectively. The emphasis is on how the service is delivered (i.e., the process), not what is delivered by the service (i.e., the end product). This is similar to what Grönroos (1982) called the Functional Quality. In the QUESC scale (Kim & Kim, 1995), almost one third of the items are grouped under this dimension. For example, the Employee Attitude and Employee Reliability dimensions of the QUESC include such items as willing to help, responsive to complaints, courteous, responsible, adequate knowledge and skills, and consistent services. This dimension was titled Staff in the six-factor model proposed for this study.
TABLE 1 Constructs of General and Specific Service Quality Models
Model | Factor | Construct: Personnel | Construct: Program | Construct: Facility
--- | --- | --- | --- | ---
Brady and Cronin, 2001 | 3 | Interaction quality (e.g., attitude, expertise) | — | Physical environment quality; Outcome quality (tangible)
Chelladurai, Scott, and Haywood-Farmer, 1987 (SAFS) | 6 | Primary core: Professional (e.g., knowledge and skill); Primary core: Consumer | Primary core: Professional (e.g., quality of programs) | Primary facilitating goods; Secondary facilitating goods
Howat, Absher, Crilley, and Milne, 1996 (CERM-CSQ) | 4 | Staff quality (e.g., knowledge, responsiveness) | Core services (e.g., activity ranges); Secondary services (e.g., child minding) | General facility (e.g., parking); Core services (e.g., facility comfort, quality equipment)
Howat, Murray, and Crilley, 1999 | 3 | Personnel (e.g., friendly, knowledgeable) | Peripheral (e.g., child minding) | Core (e.g., parking)
Kim and Kim, 1995 (QUESC) | 11 | Employee attitude; Employee reliability | Program offered | Ambience; Convenience
McDougall and Levesque, 1994 | 4 | Process (e.g., courteous); Outcome (e.g., do it right) | — | Enabling (e.g., location); Tangible (e.g., appealing facilities)
Papadimitriou and Karteroliotis, 2000 | 4 | Instructor quality | Program availability and delivery | Facility attraction and operation; Other services
Parasuraman, Zeithaml, and Berry, 1988 (SERVQUAL) | 5 | Reliability; Responsiveness; Assurance; Empathy | — | Tangibles
Rust and Oliver, 1994 | 3 | Service Delivery | — | Service environment; Service product
Service Quality Assessment Scale | 6 | Staff | Program; Child Care | Locker room; Physical facility; Workout facility
Note. Dashes indicate that the program element was not found in these models that were not designed for the sport and fitness industry. SAFS = Scale of Attributes of Fitness Services. CERM-CSQ = Center for Environmental and Recreation Management-Customer Service Quality Scale. QUESC = Quality Excellence of Sports Centers.
Program. The program factor was not found in those generic service-quality models that were not designed specifically for the sport or health–fitness industry (Brady & Cronin, 2001; McDougall & Levesque, 1994; Parasuraman et al., 1988; Rust & Oliver, 1994). These models cannot be adopted by researchers to assess service quality in recreation or health–fitness centers, where the program element is a major factor that affects customer satisfaction. The Program Offered dimension proposed by Kim and Kim (1995) is used to assess whether a variety of programs and activities are being offered (family programs, children’s programs, community activities, variety of sports, etc.). Though named Program Availability and Delivery and including such items as program innovation and promotion, the Program dimension proposed by Papadimitriou and Karteroliotis (2000) is similar to Kim and Kim’s Program Offered dimension. Howat et al. (1999) considered all the attributes related to the physical facility itself as Core and classified all the services and programs provided by the sports and leisure centers as Peripheral. This Peripheral dimension (i.e., services and programs) included such attributes as child minding, a variety of activities, on-time programs, up-to-date information on activities, and so on. The term Program was used for this dimension in this study using the six-factor model. Child Care was a separate dimension in our model.
Facility. All the service-quality models under investigation recognized the importance of physical facility in evaluating customers’ perceptions of service quality (Brady & Cronin, 2001; Chelladurai et al., 1987; Howat et al., 1996, 1999; Kim & Kim, 1995; McDougall & Levesque, 1994; Papadimitriou & Karteroliotis, 2000; Parasuraman et al., 1988; Rust & Oliver, 1994).
The Physical Facility dimension of the conceptual model in this study represents the physical environment of the facility, which refers to the “built environment” or physical surroundings as
opposed to the natural or social environment (Bitner, 1992). This dimension was represented by the Ambience factor of the QUESC, which includes both the facilities and environmental elements. Kim and Kim (1995) recognized that adequate space, brightness, modern facilities, locker room with a warm atmosphere, and so on, were the major components of the Ambience factor in Korean sport centers. Chelladurai et al. (1987) indicated that the “physical items” in a fitness club are in the form of facilitating goods and supporting facilities, which include fitness equipment (cleanliness, availability, and variety), locker room, and the fitness center itself (cleanliness, size, and hours of operation). Howat et al. (1999) identified three factors in their service-quality scale for sports and leisure centers: Personnel, Core, and Peripheral. The Core factor of their scale contains such physical facility attributes as safe and secure parking area, clean facilities, well-organized center, and so on. On the other hand, Papadimitriou and Karteroliotis (2000) emphasized the Ambience aspect of the physical facility. In their Facility Attraction and Operation dimension, they included such items as brightness, cleanliness, comfortable temperature, safety, and modern environment. We used the terms Locker Room, Physical Facility, and Workout Facility as three separate dimensions in our six-factor model. Consistent with the previous service-quality literature, the proposed six-dimension service-quality model in this current study was derived from the aforementioned three constructs: Personnel (Staff), Program (Program, Child Care), and Facility (Locker Room, Physical Facility, and Workout Facility). The separation of Child Care from Program (e.g., aerobic or fitness programs) was necessary because an increasing number of health–fitness facilities provide child care service, which is an area of major concern among customers (Mintel International Group Limited, 2001). 
For similar reasons, instead of using the general term facility to include all physical attributes, we used Locker Room, Physical Facility, and Workout Facility, which would allow club managers to pinpoint specific areas of improvement. The six-factor model as well as other related hypothesized models were examined to determine the model that best fits the health–fitness setting. The investigation included two distinct phases, and so the presentation was divided into Study 1, preliminary scale development, and Study 2, testing of the SQAS. The steps in scale development utilized are outlined in Figure 1.
STUDY 1
Study 1 included two distinct phases: (a) initial scale development and (b) pilot testing of the initial draft of the SQAS. These are the first two steps of the scale development process (see Figure 1).
FIGURE 1 A flow chart outlines the basic steps and elements in the scale-development process.
Method
Participants
Two groups of participants were included in Study 1: (a) the participants for the initial scale construction and (b) the participants for the pilot study of the initial scale. The participants included in the initial scale-construction phase were varied and represented various personnel in the health–fitness industry. First, field observations were made in 10 fitness clubs. In each club, a focus group (n = 7) was formed, including 2 health–fitness club administrators and 5 members with different club-membership types and different sociodemographic backgrounds, to gain additional information. Health–fitness club members (n = 15) from various clubs and various activities were interviewed individually. Finally, 15 top management people (e.g., owners, executive directors, general managers, program directors, etc.) who had been in the health–fitness industry for at least 5 years participated in a modified Delphi procedure to determine the content-relevance of the newly developed items.
The members of one health–fitness club in a large metropolitan area in the South were chosen for the pilot study of the initial scale. Through a mail-out request, all 1,500 members were invited to voluntarily participate in the survey. A total of 234 members (129 women and 105 men) agreed to participate, and a copy of the preliminary SQAS was mailed to them. Most of these members were between 26 and 50 years old (60%) and were married (79%).
Formulation of the Preliminary Scale
The process of formulating the preliminary scale included a literature review, field observations of health–fitness clubs, interviews with focus groups and members of health–fitness clubs, and a modified application of the Delphi technique. The literature review was thorough and included the findings from general service-quality studies and studies specifically related to sport and recreation centers (e.g., Brady & Cronin, 2001; Chelladurai et al., 1987; Howat et al., 1996, 1999; Kim & Kim, 1995; Lam, 1994; MacKay & Crompton, 1988, 1990; Papadimitriou & Karteroliotis, 2000; Parasuraman et al., 1988; Rust & Oliver, 1994; Wright et al., 1992).
Field observations were conducted by the researchers in 10 health–fitness clubs. During the observations, the researchers were accompanied by the health–fitness club general manager, who also explained the usage and functions of each facility. In addition, particular attention was directed to the overall ambience, facilities, equipment, programs, and staff-member interactions. On-site discussions were conducted by the researchers in each health–fitness club with focus groups of administrators and club members. The discussions were normally part of their general meeting agenda and were centered around issues of member retention and satisfaction, and the content of the preliminary items on the service-quality questionnaire. Additionally, 15 members from several health–fitness clubs who participated in different programs (e.g., aerobics, weight training) were interviewed individually. They were asked what they liked and disliked about the club, the equipment they usually used, the time and frequency they visited the club, and so on. All the interviews were conducted in an open format.
As a result of the aforementioned procedures, 56 service-quality attributes were identified and each attribute was phrased into a test item using a Likert-type scale ranging from 1 (poor) to 7 (excellent) to represent the quality of service. Through a modified application of the Delphi technique, content validity of the proposed list of 56 items under nine dimensions was evaluated in two rounds by a panel of experts from top management who had been in the health–fitness industry for at least 5 years. The Delphi technique is “basically a method of using expert opinion to help make decisions about practices, needs, and goals” (Thomas & Nelson, 2001, p. 272). The panel of experts was asked to determine whether the items were relevant, representative, and clear enough in determining service quality in health–fitness clubs. They were requested to determine the appropriate dimensional definition for each item to ascertain if the items could be classified appropriately under their respective proposed dimensions. They could also delete items that were deemed unimportant or irrelevant and add dimensions or items not on the list. Based on the decision of a majority of the expert panel members (80%), 46 items were retained under six dimensions. The retained items were consistent with those related to the Personnel, Program, and Facility dimensions of other service-quality models. Each of the 46 items was proposed to relate to one of the six factors of Staff, Program, Child Care, Locker Room, Physical Facility, or Workout Facility in the six-factor model for the initial draft of the SQAS.
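The panel-based retention step described above can be illustrated with a minimal sketch. It assumes that each panelist casts a yes/no vote per item and that an item is kept when at least 80% of the panel endorses it; this reading of the 80% criterion, the function name, the vote format, and the item labels are all hypothetical and are not taken from the study.

```python
def delphi_retain(votes, min_percent=80):
    """Return the items endorsed by at least `min_percent` of panelists.

    votes: dict mapping an item label to a list of booleans,
           one entry per panelist (True = keep the item).
    """
    retained = []
    for item, ballots in votes.items():
        # Integer comparison avoids floating-point rounding at the cutoff.
        if 100 * sum(ballots) >= min_percent * len(ballots):
            retained.append(item)
    return retained


# Hypothetical ballots from a 15-member panel.
example_votes = {
    "item_a": [True] * 12 + [False] * 3,   # 12 of 15 = 80%, kept
    "item_b": [True] * 11 + [False] * 4,   # 11 of 15 is below 80%, dropped
}
print(delphi_retain(example_votes))
```

With a 15-member panel, an 80% criterion corresponds to at least 12 endorsements per item.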
Pilot Study
Procedures. A pilot study was carried out to further examine the 46 items in the preliminary scale and to assess the testing procedures. The participants in one health–fitness club were asked to return the questionnaire using the envelope provided or to drop it in the box provided at the front desk of their health–fitness club.
Data analysis. The Data Reduction (factor) procedures from Version 10.0 of the SPSS for Windows were utilized for the statistical analyses. When conducting the EFA, we identified the number of factors by alpha extraction to maximize the generalizability of the factors (Kaiser & Caffrey, 1965) and promax rotation, which combines the merits of both orthogonal and oblique rotations (Hendrickson & White, 1964). Guttman’s (1954) eigenvalue-one rule and Cattell’s (1966) scree test were used as references when determining the number of factors.
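The eigenvalue-one rule mentioned above can be sketched in a few lines. This is an illustration only: it assumes NumPy is available, it counts eigenvalues of the plain item correlation matrix rather than reproducing the alpha extraction run in SPSS, and the data are simulated, not the study's responses. The function name is hypothetical.

```python
import numpy as np


def eigenvalue_one_rule(responses):
    """Count eigenvalues of the item correlation matrix that exceed 1.0.

    responses: 2-D array, rows = respondents, columns = items.
    Returns the count and the eigenvalues in descending order.
    """
    corr = np.corrcoef(responses, rowvar=False)
    # eigvalsh is meant for symmetric matrices and returns ascending order.
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    n_factors = int(np.sum(eigenvalues > 1.0))
    return n_factors, eigenvalues


# Simulated (not real) data: 6 items driven by 2 uncorrelated factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 2))
loadings = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
noise = rng.normal(scale=0.6, size=(500, 6))
simulated = factors @ loadings.T + noise

n_factors, eigenvalues = eigenvalue_one_rule(simulated)
print(n_factors, np.round(eigenvalues, 2))
```

Plotting the returned eigenvalues against their rank gives the scree plot that serves as the second reference for the number of factors.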
Results
The factor analysis of the pilot data from one health–fitness club was an important step in scale development and refinement. According to Comrey (1988) as well as Guadagnoli and Velicer (1988), an EFA requires 200 to 300 respondents. Thus, the sample size (N = 234) for the pilot study was appropriate for a factor analysis. Using the alpha extraction method, we identified a total of eight factors with eigenvalues larger than 1.00 (the results were also consistent with the scree plot). Based on the standard of a factor loading equal to or greater than .40 without double loading, 4 items were eliminated from the 46-item scale after examining the pattern matrix of the promax rotation. A thorough examination of the pattern matrix and the scale items further revealed that 2 other items (reasonable fee and value for the money) were redundant, and they were hence discarded. As a result of this refinement, 6 items were deleted from the scale.
Next, the same procedure in factor analysis was utilized to examine the 40-item scale. Without confining the number of factors, the alpha extraction identified eight factors. Attempts to reduce the number of factors demonstrated that the six-factor solution was most interpretable. The six factors were identified as Staff, Program, Locker Room, Physical Facility, Workout Facility, and Child Care. All 40 items distributed across the six factors with a coefficient of .40 or higher without double loading except for 1 item, adequacy of signs and directions, which had a coefficient close to .40 (i.e., .38).
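The item-retention standard used above (a loading of at least .40 on one factor, with no double loading) can be sketched as a simple screen of a pattern matrix. The sketch assumes that "double loading" means an absolute loading of .40 or more on two or more factors; the function name and the example loadings are hypothetical and are not values from the study.

```python
def retained_items(pattern_matrix, threshold=0.40):
    """Return row indices of items that load saliently on exactly one factor.

    pattern_matrix: list of rows, one per item, each a list of loadings.
    An item is dropped if no loading reaches the threshold or if two or
    more loadings do (double loading).
    """
    keep = []
    for index, loadings in enumerate(pattern_matrix):
        salient = sum(1 for value in loadings if abs(value) >= threshold)
        if salient == 1:
            keep.append(index)
    return keep


# Hypothetical two-factor pattern matrix with four items.
example_pattern = [
    [0.72, 0.05],   # clean loading on factor 1: kept
    [0.45, 0.41],   # double loading: dropped
    [0.31, 0.12],   # no salient loading: dropped
    [0.10, -0.66],  # clean (negative) loading on factor 2: kept
]
print(retained_items(example_pattern))
```

A mechanical screen of this kind is only a starting point; the retention of the signs-and-directions item at .38 shows that the cutoff was applied with judgment.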
STUDY 2
In Study 2, the revised SQAS was tested with a new sample. This sample was divided randomly so that another EFA could be conducted with one half and a confirmatory factor analysis (CFA) with the other half. In addition, the SQAS was tested for invariance across gender groups, and the internal consistency reliability of each dimension of the scale was examined.
Method
Participants The participants of Study 2 were 1,202 members (471 men and 731 women) from 10 health–fitness clubs in a major southern metropolitan area. Slightly more than 90% of the members were White and visited their clubs at least twice a week. Most members were married (82%) and traveled to their clubs in 20 min or less (88.9%). A majority of the members (78.1%) had either a bachelor’s or master’s
degree. Almost half of the members were between the ages of 36 and 50 years old (48.3%), had been a club member for 5 years or more (43.4%), and had a household income of more than $100,000 (47.1%). Attempts were also made to draw samples from various health–fitness clubs of similar size in different geographic regions of the metropolitan area. The purpose of doing so was to include diverse settings so that the results could be generalized to a larger population. To obtain a more valid evaluation of the perceived service quality of the health–fitness clubs, participants who had not been members for at least 3 months were excluded from the study.
Procedures A survey packet including an informed-consent cover letter explaining the purpose of the study and the 40-item SQAS was mailed out to all 8,000 members in the 10 participating health–fitness clubs. For their convenience, the respondents were asked to return the questionnaire in one of the following ways within 2 weeks: (a) by mailing it back using the business-reply envelope provided, (b) by faxing their responses to the researchers, (c) by dropping the questionnaire in the box provided at the front desk of their respective health–fitness club, or (d) by responding to the electronic version of the questionnaire on the Web site.
Data Analysis After collecting the data, we randomly split the entire data set into halves, one for EFA and the other for CFA. The Data Reduction (factor) and Scale (reliability analysis) procedures from Version 10.0 of the SPSS for Windows were utilized in the data analysis.
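A random split of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the analyses in the study were run in SPSS, the respondent IDs 1 to 1,202 stand in for the real records, and the function name split_in_halves and the seed value are hypothetical.

```python
import random

def split_in_halves(respondent_ids, seed=2005):
    """Shuffle respondent IDs and return two random halves."""
    ids = list(respondent_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    midpoint = len(ids) // 2
    return ids[:midpoint], ids[midpoint:]

# 1,202 usable questionnaires, numbered 1..1202 for illustration.
sample_a, sample_b = split_in_halves(range(1, 1203))
print(len(sample_a), len(sample_b))
```

One half would then be used for the EFA and the other for the CFA, so that the confirmatory model is not evaluated on the same cases that suggested it.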
EFA. Again, the number of factors was identified by alpha extraction (Kaiser & Caffrey, 1965) and promax rotation (Hendrickson & White, 1964). The criteria for retaining the factors were based on the eigenvalue equal to or greater than 1.0 (Guttman, 1954) while comparing it to the scree test (Cattell, 1966). CFA. In this stage, the second half of the data was analyzed using a CFA. The purpose of the CFA was to confirm the factor structure of the six-factor 40-item SQAS scale. The Windows LISREL 8.53 (du Toit, du Toit, Jöreskog, & Sörbom, 2002) computer program was used to analyze the data. The PRELIS 2.53 (du Toit et al.) program was utilized to test for the degree of skewness and kurtosis as well as multivariate normality. Because the item results were only slightly skewed, the maximum likelihood (ML) estimation method was used in conducting the CFA. Olsson, Foss, Troye, and Howell (2000) suggested that a sample size of 2,000 or more is necessary for the weighted least squares (WLS) estimation method; with
data similar to the SQAS data and less than 2,000 respondents, Olsson et al. recommended using the ML estimation method. As mentioned earlier, initial scales developed were derived from three factors; therefore, the proposed six-factor SQAS model was compared with a three-factor model: (a) the Staff factor was by itself, (b) the Program factor included both Fitness Program and Child Care, and (c) the Facility factor was comprised of Locker Room, Physical Facility, and Workout Facility combined. These six-factor and three-factor models were logical comparisons stemming from the literature on scales measuring service quality. Tests of correlated and uncorrelated factors for both the six-factor and three-factor models were made. Compared to these nested models, a one-factor model was also tested assuming this would be the model with the worst fit. All of these models are nested models so comparisons were made with the chi-square difference test for nested models (Bentler & Bonett, 1980). The following fit indexes were used to examine the fit of the models: the Root Mean Square Error of Approximation (RMSEA; Steiger & Lind, 1980), the Standardized Root Mean Square Residual (SRMR; Bentler, 1995), the Comparative Fit Index (CFI; Bentler, 1990), and the Incremental Fit Index (IFI; Bentler & Bonett, 1980). As pointed out by Steiger (1989) and Byrne (1998), values of the RMSEA less than .05 indicate a very good fit, and values up to .08 indicate reasonable errors of approximation in the population. MacCallum, Browne, and Sugawara (1996) further commented on these cutpoints by declaring that values of the RMSEA between .08 and .10 indicate mediocre fit, and those greater than .10 indicate poor fit. On the other hand, the SRMR ranges from 0 to 1.00 and “in a well-fitting model this value will be small—say, .05 or less” (Byrne, 1998, p. 115). 
Values of the CFI and IFI also range from 0 to 1.00, with values larger than .90 indicating an acceptable fit, and values greater than .95 indicating a good fit (Bentler, 1990, 1992; Hu & Bentler, 1999; Marsh, Balla, & McDonald, 1988; Steiger, 1990). West, Finch, and Curran (1995) further commented that the CFI has “only small downward bias (3% to 4%), even under severely nonnormal conditions” (p. 74). In addition, the Expected Cross-Validation Index (ECVI; Browne & Cudeck, 1989) and the Akaike Information Criterion (AIC; Akaike, 1974) were used to compare fit across models. The ECVI is used to assess, in a single sample, the likelihood that the model cross-validates across samples of similar size from the same population (Browne & Cudeck). When comparing different models, the ECVI is computed for each model, and the model with the smallest ECVI value has the greatest potential for replication. Because it can take on any value, the ECVI has no predetermined range of values (Byrne, 1998). Similar to the ECVI, the AIC is used to compare two or more models, with smaller values representing better model fit (Hu & Bentler, 1995).
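The RMSEA cutpoints and the "smallest value wins" rule for the ECVI and AIC can be summarized in a short Python sketch. The function names are hypothetical, the treatment of values falling exactly on a cutpoint is an assumption, and the ECVI values in the example are invented for illustration.

```python
def describe_rmsea(rmsea):
    """Map an RMSEA value onto the verbal labels quoted in the text."""
    if rmsea < 0.05:
        return "very good fit"
    if rmsea <= 0.08:
        return "reasonable fit"
    if rmsea <= 0.10:
        return "mediocre fit"
    return "poor fit"

def best_by_smallest(index_values):
    """Return the model name with the smallest ECVI (or AIC) value."""
    return min(index_values, key=index_values.get)

print(describe_rmsea(0.07))

# Hypothetical ECVI values for two competing models.
hypothetical_ecvi = {"model_x": 5.0, "model_y": 9.0}
print(best_by_smallest(hypothetical_ecvi))
```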
Variance extracted and composite reliability. The variance extracted (VE) by each dimension was computed for each factor following the procedures outlined
by Fornell and Larcker (1981). According to Fornell and Larcker, VE is the “amount of variance captured by the construct in relation to the amount of variance due to measurement error” (p. 45). For each of the six SQAS dimensions, alpha reliability coefficients (Cronbach, 1951) and composite reliability (CR) based on the weighted omega formula suggested by Bacon, Sauer, and Young (1995) were computed.
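As a rough illustration of these two quantities, the following Python sketch computes the variance extracted and a composite reliability from standardized loadings. Two assumptions should be noted: the loadings are hypothetical, and the composite reliability shown is the conventional Fornell and Larcker (1981) expression, not the weighted omega formula of Bacon et al. (1995) that was used in the study. Error variances are assumed to equal 1 minus the squared standardized loading.

```python
def variance_extracted(loadings):
    """VE = sum(lambda^2) / (sum(lambda^2) + sum(error variances))."""
    squared = [value ** 2 for value in loadings]
    errors = [1 - value for value in squared]
    return sum(squared) / (sum(squared) + sum(errors))

def composite_reliability(loadings):
    """CR = (sum(lambda))^2 / ((sum(lambda))^2 + sum(error variances))."""
    errors = [1 - value ** 2 for value in loadings]
    total = sum(loadings) ** 2
    return total / (total + sum(errors))

# Hypothetical standardized loadings for a three-item factor.
example_loadings = [0.7, 0.8, 0.9]
ve = variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
print(round(ve, 3), round(cr, 3))
```

Under these assumptions, a factor would meet the .50 standard for VE when its items share, on average, at least half of their variance with the construct.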
Gender invariance. To determine whether the model would be invariant across gender, the data set for the CFA analysis was analyzed following procedures outlined mathematically by Meredith (1993) and clarified by Widaman and Reise (1997) as well as Hofer, Horn, and Eber (1997). The data set was divided into male and female samples, and the analysis was conducted on the variance/covariance matrices for each gender using the ML estimation method. After the baseline model was tested, the order for analysis was first to test the invariance of the lambda X (Lx) values (to determine if item factor loadings were different for men and women), then the tau (Tx) coefficients (to determine if item means were different for men and women), then the phi matrix (to determine if the variances and covariances for the six factors were different for men and women), and then the errors in measurement. Items that were not invariant across gender were trimmed from the scale to establish at least “strong” factorial invariance (Meredith, 1993), which meant that at least the item factor loadings and Tx values were invariant.
Results The results are presented in the following four sections: (a) EFA, (b) CFA, (c) VE and CR, and (d) Gender Invariance Testing.
EFA
A total of 1,227 members voluntarily agreed to participate in the study and returned the survey forms. Of the returned packets, 25 were incomplete and were excluded from further analyses. Therefore, the final sample included 1,202 individuals. The entire data set was randomly split into halves: Sample A (n = 601) and Sample B (n = 601). An EFA was conducted on Sample A using alpha extraction and promax rotation. Six factors emerged with a total explained variance of 61.34%. All factors had low to moderate correlations with each other (r = .29 to .70). With the exception of 1 item, the 40 items loaded on their respective factors without double loadings: Staff (9 items), Program (7 items), Locker Room (5 items), Physical Facility (7 items), Workout Facility (6 items), and Child Care (6 items). The exception, temperature control, loaded on two factors (which might reflect capitalization on chance); nevertheless, the overall structure of the scale could be considered appropriate for further investigation (see Table 2 for the factor loadings and the wording of each item).
TABLE 2
Pattern Matrix of the Six-Factor Service Quality Assessment Scale

Factor columns: I, II, III, IV, V, VI

I. Staff: 1. Possession of required knowledge/skills; 2. Neatness and dress; 3. Willingness to help; 4. Patience; 5. Communication with members; 6. Responsiveness to complaints; 7. Courtesy; 8. Provision of individualized attention by instructors; 9. Provision of consistency of service
II. Program: 1. Variety of programs; 2. Availability of programs at appropriate level; 3. Convenience of program time/schedule; 4. Quality/content of programs; 5. Appropriateness of class size; 6. Background music (if any); 7. Adequacy of space
III. Locker Room: 1. Availability of lockers; 2. Overall maintenance; 3. Shower cleanliness; 4. Accessibility; 5. Safety
IV. Physical Facility: 1. Convenience of location; 2. Hours of operation; 3. Availability of parking; 4. Accessibility to building; 5. Parking lot safety; 6. Temperature control; 7. Lighting control
V. Workout Facility: 1. Pleasantness of environment; 2. Modern-looking equipment; 3. Adequacy of signs and directions; 4. Variety of equipment; 5. Availability of workout facility/equipment; 6. Overall maintenance
VI. Child Care: 1. Quality of staff; 2. Cleanliness of equipment; 3. Hours of operation; 4. Adequacy of space; 5. Safety of environment; 6. Diversity of experience provided

Factor coefficients:
0.712 0.553 0.975 0.906 0.942 0.626 0.775 0.584 0.766 0.725 0.884 0.797 0.706 0.708 0.387 0.570 0.762 0.867 0.818 0.869 0.811 0.573 0.433 0.808 0.856 0.747 0.359 0.555
0.390
0.480 0.677 0.368 0.902 0.770 0.714 0.793 0.872 0.659 0.787 0.876 0.804

Note. For clarity, only factor coefficients of .35 or higher are shown. N = 601.
CFA
As suggested by the EFA, a six-factor SQAS model was proposed in the CFA process. One of the basic assumptions of CFA is multivariate normality. In this regard, the data in Sample B were examined through the PRELIS 2.53 (du Toit et al., 2002) computer program. The basic assumption of multivariate normality was not met (i.e., χ² = 39,036, p < .00). The distributions of most items in the current sample were negatively skewed and leptokurtic. A common way to transform a nonnormal distribution is to use either logarithm or power functions (Box & Cox, 1964; Daniel & Wood, 1980; Emerson & Stoto, 1983; Hoyle, 1995; Rummel, 1970); however, a few attempts using logarithms (e.g., log Xj/1–Xj) suggested by Rummel (1970) or a power larger than 1 (e.g., power 2.5) suggested by Hoyle (1995) were unsuccessful. This could be due to the small data ratio (Emerson & Stoto), that is, the ratio of the largest data value (i.e., 7) to the smallest data value (i.e., 1) was small. Real psychological data are almost never normally distributed (Chou & Bentler, 1995). Extensive research on the robustness of the ML method has indicated that this method is almost always acceptable even when data are nonnormally distributed (Harlow, 1985; Hoyle & Panter, 1995; Muthén & Kaplan, 1985; Tanaka & Bentler, 1985; West et al., 1995). Because the data of this study were not seriously skewed (see Table 3), the ML estimation method was used in conducting the CFA.
Using the Windows LISREL 8.53 (du Toit et al., 2002) computer program, five nested models were tested based on the ML estimation method (see Table 4). The chi-square statistic for the six-factor, correlated SQAS model was significant, χ²(725) = 2,693, p < .001; however, the RMSEA was .07 (90% CI = .06–.07), the SRMR was .05, and both the CFI and IFI were .87. Because both the RMSEA and SRMR values were at the upper end of their acceptable ranges, the six-factor model, although not an excellent fit to the data, was admissible.
On the other hand, the CFI and IFI values (.87) of this study were slightly below the .90 standard (Hu & Bentler, 1999). When compared with the six-factor, uncorrelated model (the correlations of the six factors were set to zero in the phi matrix), the six-factor, correlated SQAS model had smaller RMSEA, SRMR, ECVI, and AIC, but larger CFI and IFI, values. In comparison to the three-factor correlated and uncorrelated models, the six-factor, correlated SQAS model was a better fit than were any of the three-factor models. The worst model fit was the one-factor model with all items loading on one factor. All this demonstrated that the six-factor, correlated SQAS model was superior to the other models. In addition, the chi-square change along with the change in degrees of freedom also confirmed that the six-factor, correlated SQAS model was significantly better, p < .001, than each of the four alternative models. The interfactor correlations, standardized factor structure coefficients, and errors of measurement of the SQAS model estimated by the CFA are presented in Figure 2. The parameter estimates between the indicators and latent variables
FIGURE 2 Factor structure coefficients, interfactor correlations, and errors of measurement of the six-factor Service Quality Assessment Scale model.
TABLE 3
Descriptive Statistics, Skewness, and Kurtosis of the 40-Item Service Quality Assessment Scale

Variable                M     SD    Skewness  z      p    Kurtosis  z      p
Staff 1                 5.42  1.10  –0.67     –3.41  .00  0.92      3.44   .00
Staff 2                 5.69  1.04  –0.59     –3.23  .00  –0.19     –0.93  .18
Staff 3                 5.58  1.24  –0.78     –3.59  .00  0.13      0.74   .23
Staff 4                 5.62  1.11  –0.93     –3.81  .00  1.07      3.81   .00
Staff 5                 5.12  1.36  –0.78     –3.59  .00  0.43      1.95   .03
Staff 6                 4.86  1.40  –0.87     –3.72  .00  0.81      3.14   .00
Staff 7                 5.74  1.19  –1.10     –4.01  .00  1.19      4.08   .00
Staff 8                 5.27  1.22  –0.90     –3.77  .00  1.31      4.35   .00
Staff 9                 5.32  1.25  –1.09     –4.00  .00  1.63      4.96   .00
Program 1               5.59  1.16  –0.85     –3.69  .00  0.89      3.36   .00
Program 2               5.31  1.23  –0.76     –3.56  .00  0.74      2.95   .00
Program 3               5.11  1.39  –0.77     –3.58  .00  0.51      2.25   .01
Program 4               5.49  1.06  –0.84     –3.67  .00  1.27      4.25   .00
Program 5               5.43  1.06  –0.98     –3.86  .00  1.86      5.36   .00
Program 6               5.05  1.22  –1.10     –4.01  .00  2.14      5.79   .00
Program 7               4.98  1.42  –0.79     –3.60  .00  0.45      2.02   .02
Locker Room 1           5.66  1.23  –1.21     –4.13  .00  1.82      5.30   .00
Locker Room 2           5.24  1.39  –0.88     –3.73  .00  0.53      2.29   .01
Locker Room 3           5.13  1.42  –0.94     –3.81  .00  0.76      2.99   .00
Locker Room 4           5.56  1.25  –1.09     –4.00  .00  1.28      4.27   .00
Locker Room 5           5.53  1.22  –1.12     –4.03  .00  1.62      4.94   .00
Physical Facilities 1   6.02  1.15  –1.21     –4.13  .00  1.38      4.48   .00
Physical Facilities 2   5.76  1.39  –1.29     –4.21  .00  1.39      4.52   .00
Physical Facilities 3   5.10  1.58  –0.76     –3.55  .00  0.00      0.13   .45
Physical Facilities 4   5.74  1.17  –1.10     –4.00  .00  1.47      4.66   .00
Physical Facilities 5   5.33  1.29  –0.69     –3.43  .00  0.15      0.86   .20
Physical Facilities 6   5.30  1.29  –0.70     –3.45  .00  0.25      1.30   .10
Physical Facilities 7   5.60  1.15  –0.84     –3.67  .00  0.74      2.94   .00
Workout Facilities 1    5.57  1.21  –1.12     –4.03  .00  1.58      4.87   .00
Workout Facilities 2    5.58  1.34  –1.25     –4.16  .00  1.55      4.82   .00
Workout Facilities 3    5.53  1.15  –0.83     –3.66  .00  0.89      3.37   .00
Workout Facilities 4    5.50  1.12  –1.22     –4.14  .00  2.60      6.42   .00
Workout Facilities 5    5.18  1.23  –1.02     –3.92  .00  1.72      5.12   .00
Workout Facilities 6    5.23  1.23  –1.12     –4.03  .00  1.82      5.29   .00
Child Care 1            5.62  0.71  –2.04     –4.77  .00  9.90      10.78  .00
Child Care 2            5.47  0.68  –2.08     –4.79  .00  12.07     11.38  .00
Child Care 3            5.44  0.71  –1.51     –4.40  .00  7.97      10.10  .00
Child Care 4            5.10  0.78  –1.47     –4.37  .00  9.20      10.55  .00
Child Care 5            5.53  0.67  –2.02     –4.76  .00  12.22     11.42  .00
Child Care 6            5.30  0.69  –1.26     –4.18  .00  8.04      10.13  .00
TABLE 4
A Comparison Between the Six-Factor Service Quality Assessment Scale Correlated and Uncorrelated Models, the Three-Factor Correlated and Uncorrelated Models, and a One-Factor Model

Model                       RMSEA  SRMR  CFI/IFI  ECVI   AIC    χ²    df   ∆χ²    ∆df
6-factor SQAS correlated    .07    .05   .87      5.26   3092   2693  725
6-factor SQAS uncorrelated  .10    .31   .77      9.32   5476   4341  740  1648*  15
3-factor SQAS correlated    .11    .08   .73      11.15  6548   4977  737  2284*  12
3-factor SQAS uncorrelated  .12    .27   .68      12.32  7233   5773  740  3080*  15
1-factor SQAS               .14    .09   .60      18.01  10573  7064  740  4371*  15

Note. RMSEA = Root Mean Square Error of Approximation; SRMR = Standardized Root Mean Square Residual; CFI/IFI = Comparative Fit Index/Incremental Fit Index; ECVI = Expected Cross-Validation Index; AIC = Akaike Information Criterion; SQAS = Service Quality Assessment Scale.
*p < .001.
ranged from .51 (between background music and Program) to .87 (between safety and Locker Room). The interfactor correlations ranged from .26 (between Physical Facility and Child Care) to .80 (between Physical Facility and Workout Facility). The errors of measurement extended from .23 (safety) to .74 (background music).
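The chi-square difference values in Table 4 follow directly from the reported chi-square and df values of each model. The short Python sketch below recomputes them as a check; the variable and function names are hypothetical, and the significance levels themselves are taken from the table rather than recomputed.

```python
# (chi-square, df) pairs as reported in Table 4.
six_factor_correlated = (2693, 725)
alternative_models = {
    "6-factor uncorrelated": (4341, 740),
    "3-factor correlated": (4977, 737),
    "3-factor uncorrelated": (5773, 740),
    "1-factor": (7064, 740),
}

def chi_square_difference(restricted, baseline):
    """Return (delta chi-square, delta df) for two nested models."""
    return restricted[0] - baseline[0], restricted[1] - baseline[1]

differences = {
    name: chi_square_difference(model, six_factor_correlated)
    for name, model in alternative_models.items()
}
for name, (delta_chi, delta_df) in differences.items():
    print(name, delta_chi, delta_df)
```

Each alternative model is more restricted than the six-factor, correlated model, so a large chi-square increase relative to the added degrees of freedom indicates that the restrictions worsen fit.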
VE and CR The VE of .61 (Staff ), .55 (Program), .68 (Locker Room), .50 (Physical Facility), .59 (Workout Facility), and .50 (Child Care) were all considered acceptable based on the .50 standard (Fornell & Larcker, 1981). The alpha reliability coefficients (Cronbach, 1951) for the six factors were: .94 (Staff ), .88 (Program), .93 (Locker Room), .84 (Physical Facilities), .91 (Workout Facility), and .92 (Child Care). The range of the alpha coefficients suggested that the items under each factor represented a unidimensional subconstruct. The CR of the SQAS was calculated based on the weighted omega formula suggested by Bacon et al. (1995): .93 (Staff), .89 (Program), .91 (Locker Room), .87 (Physical Facility), .90 (Workout Facility), and .86 (Child Care). These CRs were all above the .70 standard (Fornell & Larcker).
Gender Invariance Testing The six-factor, correlated SQAS model was first tested separately for the male and female data. Table 5 includes the results from these separate analyses as well
TABLE 5
Gender Invariance Testing for the Six-Factor Correlated Model

Model                                   χ²        df    ∆χ²       ∆df  RMSEA  SRMR  CFI
Men                                     1625.134  725                  .075   .058  .861
Women                                   2090.411  725                  .075   .059  .867
Baseline: 40 items                      3715.545  1450                 .075   .059  .864
Model 1 with Lx constrained             3785.938  1484  70.393a   34   .075   .063  .862
  Staff                                 3725.825  1458  10.280    8
  Program                               3726.164  1456  10.619    6
  Locker Room                           3717.912  1454  2.367     4
  Physical Facility                     3721.978  1456  6.433     6
  Workout Facility                      3721.299  1455  5.754     5
  Child Care                            3750.624  1455  35.079b   5
    C1                                  3715.550  1451  .005      1
    C2                                  3715.550  1451  .005      1
    C3                                  3733.539  1451  17.992c   1
    C4                                  3726.414  1451  10.869c   1
    C5                                  3720.519  1451  4.974c    1
    C6                                  3716.131  1451  .586      1
Baseline: 37 items                      3102.370  1228                 .074   .058  .877
Lx constrained                          3142.040  1259  76.360d   31   .073   .063  .877
  Staff 4                               3106.585  1229  4.215c    1
  Program 6                             3111.357  1229  8.987c    1
  Workout Facility 4                    3107.474  1229  5.104c    1
Lx and Tx constrained                   3198.870  1290  56.830d   31   .073   .063  .875
Lx, Tx, phi constrained                 3249.580  1311  50.710e   21   .073   .075  .873
Lx, Tx, phi constrained and TD
  constrained                           3554.284  1348  304.704f  37   .075   .079  .856

Note. RMSEA = Root Mean Square Error of Approximation; SRMR = Standardized Root Mean Square Residual; CFI = Comparative Fit Index; Lx = Lambda X; Tx = Tau coefficients; TD = Theta Delta.
aχ²(34) = 43.777, p < .05. bχ²(5) = 11.070, p < .05. cχ²(1) = 3.841, p < .05. dχ²(31) = 40.256, p < .05. eχ²(21) = 32.670, p < .05. fχ²(37) = 55.758, p < .05.
as the gender invariance testing for the six-factor, correlated model. The model fit statistics for the separate analyses and for the baseline model where no parameter estimates were constrained equal across groups represented adequate fit; this was essential to continue with the invariance tests (Bollen, 1989; Meredith, 1993). When holding the Lx values constrained across gender groups, we found a significant chi-square change, χ2 = 70.393, p < .05; Table χ2 = 43.777, df = 34. Tests were then made of the invariance of the Lx values separately for each of the six dimensions or factors. Because the Child Care dimension was the only factor with significant differences, each of the items in Child Care was then tested to determine which items had significantly different Lx values for men and women. The Lx values for Items C3 (hours of operation), C4 (adequacy of space), and C5 (safety of environment) were not invariant across the gender models. When each of these
items was tested separately, a significant difference in the chi-square value was found when the Lx value for each item was constrained to be equal across the two groups. For each of these 3 items, the variability of the scores for women was higher than that for men. The Lx values were all in the .90 range for women and in the .80 range for men. No real content-related differences could be hypothesized for these 3 items to explain the gender differences; whether the differences were more sample specific would need to be tested in further research. In any case, these 3 items were eliminated, and a test was made of the baseline model for the 37-item scale with only 3 items under Child Care (see Table 5). The 37-item baseline model represented adequate fit; however, the first test of constraining the Lx values was significantly different from the baseline model, χ2 = 76.360, p < .05; Table χ2 = 40.256, df = 31. The items that appeared to have quite different Lx values from the male and female samples were the 4th Staff item, the 6th Program item, and the 4th Workout Facilities item. Each of these 3 items was deleted from the scale. Tests were then made for constraining in a hierarchical fashion, next the Tx values, then the phi matrix, and lastly the theta delta (TD) values. Each of these tests produced a significant chi-square difference value; thus, the 37-item scale did not have even “weak” factorial invariance (Meredith, 1993). Mean differences in gender groups were found for the 1st and 5th items under Locker Room as well as between the means for the 4th Program item; thus, these 3 items were eliminated. 
The final scale tested was then a 31-item scale with 8 items under Staff (eliminate patience), 5 items under Program (eliminate quality/content of program and background music), 3 items under Locker Room (eliminate both availability of lockers as well as safety), 7 items under Physical Facilities, 5 items under Workout Facilities (eliminate variety of equipment), and 3 items under Child Care (eliminate hours of operation, adequacy of space, and safety of environment). The 31-item scale was tested for gender invariance (see Tables 6 and 7). The 31-item model represented adequate fit for the baseline model; furthermore, the scale represented “strong” factorial invariance (Meredith) with both the tests for Lx and Tx coefficients showing no significant differences. The model was also tested holding the Lx and Tx coefficients and phi matrix constrained; adding the phi matrix made a significant difference in chi-square values. Meredith suggested that tests of the phi matrix and TDs were not necessary unless one needed “strict” factorial invariance. In most samples these values are allowed to be sample specific and are allowed to differ across groups of participants. To summarize, to have “strong” gender invariance with the SQAS (Lx and Tx coefficients invariant), the 31-item trimmed model should be used.
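The decision rule behind these invariance tests can be sketched in Python using the chi-square and df values reported for the 31-item models in Table 6. This is a minimal illustration, not the LISREL analysis itself: the function names are hypothetical, and the tabled value of 37.652 for 25 degrees of freedom is copied from the note to Table 6 rather than computed.

```python
# (chi-square, df) for the nested 31-item models, as reported in Table 6.
baseline = (2098.965, 838)
lx_constrained = (2124.740, 863)
lx_tx_constrained = (2150.484, 888)

# Tabled chi-square value for 25 degrees of freedom at the .05 level,
# taken from the note to Table 6.
CRITICAL_DF25 = 37.652

def difference(restricted, less_restricted):
    """Return (delta chi-square, delta df) between two nested models."""
    delta_chi = round(restricted[0] - less_restricted[0], 3)
    delta_df = restricted[1] - less_restricted[1]
    return delta_chi, delta_df

def is_invariant(delta_chi_square, critical_value):
    """Constraints are tenable when the chi-square increase is below the tabled value."""
    return delta_chi_square < critical_value

loadings_step = difference(lx_constrained, baseline)
intercepts_step = difference(lx_tx_constrained, lx_constrained)
print(loadings_step, is_invariant(loadings_step[0], CRITICAL_DF25))
print(intercepts_step, is_invariant(intercepts_step[0], CRITICAL_DF25))
```

Both steps add 25 degrees of freedom, so the same tabled value applies to the test of the factor loadings and to the test of the tau coefficients.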
DISCUSSION The study was designed to develop an instrument to measure service quality of health–fitness clubs. Consistent with the literature, the SQAS supports the notion
TABLE 6
Gender Invariance Testing for the 31-Item Service Quality Assessment Scale Model

Model                        χ²        df   ∆χ²      ∆df  RMSEA  SRMR  CFI
Baseline: 31 items           2098.965  838                .072   .059  .893
Lx constrained               2124.740  863  25.775a  25   .071   .061  .893
Lx and Tx constrained        2150.484  888  25.744a  25   .069   .061  .893
Lx, Tx, and phi constrained  2195.441  909  96.476b  71   .069   .074  .817

Note. RMSEA = Root Mean Square Error of Approximation; SRMR = Standardized Root Mean Square Residual; CFI = Comparative Fit Index; Lx = Lambda X; Tx = Tau coefficients; TD = Theta Delta.
aχ²(25) = 37.652, p < .05. bχ²(25) = 37.652, p < .05. cχ²(71) = 90.531, p < .05.

TABLE 7
Gender Differences in Each Factor Mean Score of the 31-Item Service Quality Assessment Scale Model

Factor             M Difference  SE    t-ratio  p
Staff              .095          .075  1.265    >.050
Program            –.064         .085  –.751    >.050
Locker Room        .118          .115  1.024    >.050
Physical Facility  –.002         .056  –.040    >.050
Workout Facility   .078          .094  .831     >.050
Child Care         –.029         .050  –.576    >.050

Note. Gender coding: Male = 1 and female = 2; positive mean differences indicate that men have a higher mean than women. All mean differences were nonsignificant (p > .05).
that service quality is a multidimensional construct that requires multiple dimensions to evaluate the perceptions of patrons. The SQAS consists of 40 items under six dimensions: Staff (9 items), Program (7 items), Locker Room (5 items), Physical Facility (7 items), Workout Facility (6 items), and Child Care (6 items). If one needed a scale with all items invariant for factor loadings and for item mean differences across gender groups, a 31-item scale is proposed: Staff (8 items), Program (5 items), Locker Room (3 items), Physical Facilities (7 items), Workout Facilities (5 items), and Child Care (3 items). The results of this study substantiate the SQAS model that service quality can be determined by club members as a result of their service encounters. The first scales developed to measure service quality were generic and adaptable across a broad spectrum of services; using scales like the SERVQUAL requires modification and adaptation when applied to various organizational contexts (Parasuraman et al., 1988). Murray and Howat (2002) supported this notion of industry-specific dimensions of service quality. One reason for the examination
of industry-specific dimensions is due to the variability among industries in terms of the service environment. In the health and fitness industry, services are more closely associated with intangibles. On the other hand, the SQAS is developed to examine the perception of service quality by patrons. The SERVQUAL, however, defines perceived service quality as the difference between consumer expectations and perceptions. Perceived service quality is operationalized by subtracting consumers’ expectations scores from their perceptions-of-performance scores (Parasuraman et al., 1988). Although this approach of assessing consumer satisfaction has been widely adopted (Brown et al., 1993), more recently a number of researchers have found that performance-only measures are superior in terms of predictive validity and measurement reliability (Crompton & Love, 1995; Cronin & Taylor, 1992). Whipple and Thatch (1988) stated, in explaining the superiority of performance measures, “there is evidence that pre-purchase choice criteria and post-purchase evaluation criteria are not the same” (p. 17). This supports the notion that customers may not be able to accurately indicate the importance of attributes on purchase evaluation and repurchase intentions; therefore, the exclusion of the importance construct may be more practical. Though labeled differently (with names such as Personnel, Instructor Quality, Professional, Employee Attitude, Employee Reliability), the Staff dimension of the SQAS is included in other studies (e.g., Chelladurai et al., 1987; Howat et al., 1999; Kim & Kim, 1995; Papadimitriou & Karteroliotis, 2000). Previous researchers (e.g., Bitner et al., 1990; Brady & Cronin, 2001; Czepial et al., 1985) have indicated that knowledge, attitude, appearance, and courtesy are major attributes that affect customers’ perceived service quality. All these attributes are included in the Staff dimension of the SQAS. The SQAS has a number of merits over the SERVQUAL. 
First, the 9-item Staff dimension can replace all four dimensions (Reliability, Responsiveness, Assurance, and Empathy) of the SERVQUAL. Second, the SQAS includes both the physical (e.g., Tangibles) and human (e.g., Assurance) factors of the SERVQUAL; yet, the most important element of the SQAS (i.e., Program) is not found in the SERVQUAL.
During the course of the literature review, we found inconsistencies in the scale-development methodology used in the field of sport and leisure. First, EFA was not adopted in scale development in a number of publications (e.g., Chelladurai et al., 1987; Wright et al., 1992). Proper score interpretation of each of the dimensions on a scale is possible only with correctly applied factor analytic methodology (Messick, 1989). In some of the earlier publications using EFA, misconceptions regarding its use may be found. For example, Kim and Kim (1995) reported “coefficient alpha” (Table 1, p. 213) as the result of their factor analysis, in spite of the fact that coefficient alpha (a measure of internal consistency of items) has nothing to do with factor analysis. In other studies, inappropriate techniques or wrong judgments were made by the researchers. For example, in spite of the low correlation among the three factors (i.e., from .38 to .56), Howat et al. (1999) reported in their study that “the factor-correlation matrix indicated a high level of correlation between the factors, refuting the use of orthogonal rotation” (p. 50). The decision-making process in the determination of extraction and rotation methods, the number of factors, and so forth, is rather complicated in EFA; however, a very common practice by researchers (e.g., Howat et al., 1996) is to follow the default procedures of a statistical package (i.e., the utilization of the principal component extraction and the varimax rotation methods). When developing a new scale, researchers attempt to seek factors that have maximum correlation with corresponding factors in the universe of variables (i.e., to increase the generalizability of the scale). In this regard, the alpha extraction method is more appropriate. On the other hand, “varimax is inappropriate if the theoretical expectation suggests a general factor may occur” (Gorsuch, 1983, p. 185). 
Clearly, neither principal component extraction with varimax rotation nor Guttman’s (1954) “eigenvalue larger than 1” rule provides a single solution for all testing conditions. Judicious decisions in the selection of appropriate analytic techniques will avoid Type IV errors (i.e., wrong solutions to the wrong problems) in the service-quality scale-development process (see Gorsuch, as well as Rummel, 1970, for an excellent discussion of, and comparisons among, various extraction and rotation approaches).
Attempts were made to develop the SQAS by means of rigorous measurement procedures (see Figure 1). The initial process in constructing the scale was based on an extensive literature review and the actual observation and examination of various health–fitness clubs. The generation of constructs and the selection of items were compatible with the current health–fitness club setting.
The convincing content validity of the SQAS was established by a panel of experts, focus groups, and health–fitness club members. A pilot study was carried out before the actual study to examine the face validity of the items. In addition, the current scale was developed by a large database (i.e., involving 10 health–fitness clubs and more than 1,200 club members), which enhances the generalizability of the SQAS. On the other hand, the robustness of the SQAS has been confirmed by at least two exploratory factor analyses on two different occasions. All these enhance the psychometric properties of the SQAS in terms of validity and reliability. Finally, the SQAS is the only health–fitness service-quality scale so far to be tested by CFA; the correlated, six-factor structure of the model using the 31-item version of the SQAS displays “strong” (Meredith, 1993) factorial invariance across gender. Researchers in general have supported the notion that CFA should be accomplished on a data set independent of the EFA. This can be done by dividing the initial data pool into halves or by collecting an entirely new data set subsequent to the initial EFA (Froman, 2001). This study adopts the former method by splitting the data set into halves due to certain constraints (e.g., resources and time). Nevertheless, the more common approach of applying CFA to an entire new data set is encouraged.
Based on the goodness-of-fit indexes of the CFA, the SQAS six-factor, correlated model provides a reasonable fit to the data, but the fit is not perfect and there is room for improvement. Previous researchers have indicated that the fit of a model is affected by, among other things, its complexity and specification (Bollen & Long, 1993; Gerbing & Anderson, 1993; Kaplan, 2000). As pointed out by Gerbing and Anderson (1993), most studies using structural equation modeling have involved two to six latent variables, with about two to six indicators for each latent variable. Fan, Thompson, and Wang (1999) classified their four-latent-variable model (with three to four indicators per latent variable) as “moderate complexity” (p. 63). By this standard, the SQAS can be labeled as high complexity, which may hamper its model fit. On the other hand, using too few indicators per latent variable is inappropriate. In their Monte Carlo study, Anderson and Gerbing (1984) found a greater chance of nonconvergence and improper solutions with two indicators per factor, especially with small sample sizes (e.g., N < 150). MacCallum (1995) pointed out that models with low numbers of parameters relative to the number of measured variable variances/covariances were highly disconfirmable, and that “for such models, bad fit to observed data is entirely possible” (p. 30). Gerbing and Anderson (1985) also found that the structural parameters were unbiased when the models had three or more indicators per factor. In view of this, the SQAS maintained at least three indicators per latent variable during the entire scale-development process (Loehlin, 1998).

Although the current results provide preliminary psychometric support for the SQAS, it is important to recognize the scope of this study and to consider future research directions.
First, the invariance of the SQAS items across gender was examined, and only 31 of the 40 items had similar factor loadings and item mean scores for male and female respondents. Future researchers could add items that prove to be invariant across gender. Preliminary evidence was provided on the similarity of the mean responses of men and women to the six dimensions of the SQAS. Previous researchers have indicated that differences always exist between the two genders, for example, in their perceived service quality of locker rooms (Lam, 1994) and in their participation in and consumption of organized sports (Fischer & Gainer, 1994). Therefore, a major issue for service-quality studies in the health–fitness industry is the investigation of gender differences in preference, such as the type of equipment or programs preferred, the ambience of the facility, and other related issues (American Sports Data, Inc., 1997). Future studies are required to compare the perception of service quality between male and female customers or to confirm the invariance across gender.

Second, the causal relationship between benefit and member satisfaction was not examined in this study. The main purpose of carrying out surveys in health–fitness clubs is to provide owners with the necessary information so that appropriate marketing strategies can be formulated for the target groups. Previous researchers have indicated that benefit is a major factor for retaining members (Kazuo et al., 1998) and members are more willing to
pay if the benefits of participation are emphasized (McCarville & Garrow, 1993). Future researchers should examine this area when assessing the service quality of clubs.

When applying the SQAS in the health–fitness setting, top management can simply examine the mean service-quality score of each factor of the SQAS to determine the areas needing improvement. The mean score of each factor is determined by averaging the scores of all the items within the factor. For example, the Staff factor has 8 items on the 31-item SQAS; thus, its service-quality score is obtained by averaging the scores of those 8 items. To understand more about the practices of members and to use resources effectively, club managers may also incorporate the following questions in their surveys: What time of day do members prefer to exercise? What type of cardiovascular equipment do they use most frequently?

This study used a cross-sectional self-report measure; future research using more diverse methodologies, such as in-depth interviews or longitudinal studies, is necessary to better understand the motives and psychological mechanisms of customer satisfaction.

In conclusion, the SQAS has sound psychometric properties and can be used to assess the service quality of health–fitness clubs. According to Jöreskog and Sörbom (1993), a well-fitting model is not necessarily a correct or best model because there may be many equivalent models as judged by the fit indexes. This is particularly true for the SQAS model because it is still in its infancy. Future researchers should reexamine the SQAS with other samples to further study the factor structure and invariance across gender. In addition, further examination of the psychometric properties, such as the convergent and discriminant validity of the SQAS, is required.
For convergent validity, the SQAS could be compared with another scale, provided that scale was developed with the same degree of emphasis on scale construction and specificity for the health–fitness setting; furthermore, the scores might be compared qualitatively with the judgments of members obtained through interviews, so that service quality is assessed from both quantitative and qualitative perspectives. The discriminant validity of the SQAS might be estimated by determining whether scores on the dimensions of the scale are uncorrelated with scores from other scales designed to measure entirely different satisfaction elements.
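The factor-score computation recommended to club managers in the discussion (averaging the item scores within each factor, e.g., the 8 Staff items) can be sketched as follows. This is only an illustration under stated assumptions: the item keys, the 3-item Child Care entry, the ratings, and the function names are hypothetical and are not taken from the published SQAS; complete responses are also assumed.

```python
from statistics import mean

# Item keys are hypothetical labels, not the published SQAS item wording.
# The Staff factor has 8 items on the 31-item SQAS (per the text); the
# 3-item Child Care entry is an assumed count used only for illustration.
FACTOR_ITEMS = {
    "Staff": [f"staff_{i}" for i in range(1, 9)],
    "Child Care": [f"childcare_{i}" for i in range(1, 4)],
}

def member_factor_scores(response, factor_items=FACTOR_ITEMS):
    """Average one member's item ratings within each factor."""
    return {
        factor: mean(response[item] for item in items)
        for factor, items in factor_items.items()
    }

def club_factor_means(responses, factor_items=FACTOR_ITEMS):
    """Average the member-level factor scores across all members."""
    per_member = [member_factor_scores(r, factor_items) for r in responses]
    return {f: mean(s[f] for s in per_member) for f in factor_items}

# Two made-up members with complete responses (all ratings are invented).
member_a = {**{f"staff_{i}": 6 for i in range(1, 9)},
            "childcare_1": 3, "childcare_2": 4, "childcare_3": 5}
member_b = {**{f"staff_{i}": s
               for i, s in enumerate([5, 5, 5, 5, 4, 4, 4, 4], 1)},
            "childcare_1": 2, "childcare_2": 2, "childcare_3": 2}

club_means = club_factor_means([member_a, member_b])
```

Comparing the resulting factor means side by side would let a manager see which dimension is rated lowest; with incomplete questionnaires, a rule for handling missing items would have to be chosen first.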
REFERENCES
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723. American Sports Data, Inc. (1997). Health club trend report. Hartsdale, NY: Author. Anderson, E. W., & Sullivan, M. (1993). The antecedents and consequences of customer satisfaction for firms. Marketing Science, 12, 125–143.
Anderson, J. C., & Gerbing, D. W. (1984). The effect of sampling error on convergence, improper solutions, and goodness-of-fit indices for maximum likelihood confirmatory factor analysis. Psychometrika, 49, 155–173. Bacon, D. R., Sauer, P. L., & Young, M. (1995). Composite reliability in structural equations modeling. Educational and Psychological Measurement, 55, 394–406. Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246. Bentler, P. M. (1992). On the fit of models to covariances and methodology to the Bulletin. Psychological Bulletin, 112, 400–404. Bentler, P. M. (1995). EQS structural equations program manual. Encino, CA: Multivariate Software. Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structure. Psychological Bulletin, 88, 588–606. Bitner, M. J. (1992). Servicescapes: The impact of physical surroundings on customers and employees. Journal of Marketing, 56, 57–71. Bitner, M. J., Booms, B. H., & Tetreault, M. S. (1990). The service encounter: Diagnosing favorable and unfavorable incidents. Journal of Marketing, 54, 71–84. Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley. Bollen, K. A., & Long, J. S. (1993). Introduction. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 1–9). Newbury Park, CA: Sage. Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society, 26(Series B), 211–243. Brady, M. K., & Cronin, J., Jr. (2001). Some new thought on conceptualizing perceived service quality: A hierarchical approach. Journal of Marketing, 65(3), 33–49. Brown, T. J., Churchill, G. A., & Peter, J. P. (1993). Improving the measurement of service quality. Journal of Retailing, 69, 127–139. Browne, M. W., & Cudeck, R. (1989). Single sample cross-validation indices for covariance structures. Multivariate Behavioral Research, 24, 445–455. Byrne, B. M. 
(1998). Structural equation modeling with LISREL, PRELIS, and SIMPLIS: Basic concepts, applications, and programming. Mahwah, NJ: Lawrence Erlbaum Associates, Inc. Cannie, J., & Caplin, D. (1991). Keeping customers for life. New York: Amacom. Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245–276. Chelladurai, P., Scott, F. L., & Haywood-Farmer, J. (1987). Dimensions of fitness services: Development of a model. Journal of Sport Management, 1, 159–172. Cho, M. H. (2002). Health and physical activity among elderly people in South Korea. Journal of the International Council for Health, Physical Education, Recreation, Sport, and Dance, 38(4), 51–55. Chou, C. P., & Bentler, P. M. (1995). Estimates and tests in structural equation modeling. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 37–55). Thousand Oaks, CA: Sage. Comrey, A. L. (1988). Factor-analytic methods of scale development in personality and clinical psychology. Journal of Consulting and Clinical Psychology, 56, 754–761. Crompton, J. L., & Love, L. L. (1995). The predictive validity of alternative approaches to evaluation quality of a festival. Journal of Travel Research, 34, 11–24. Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334. Cronin, J. J., & Taylor, S. A. (1992). Measuring service quality: A re-examination and extension. Journal of Marketing, 56, 55–68. Czepial, J. A., Solomon, M. R., & Surprenant, C. F. (1985). The service encounter. Lexington, MA: Lexington.
Daniel, C., & Wood, F. S. (1980). Fitting equations to data: Computer analysis of multifactor data. New York: Wiley. Dunn, J. G. H., Bouffard, M., & Rogers, W. T. (1999). Assessing item content-relevance in sport psychology scale-construction research: Issues and recommendation. Measurement in Physical Education & Exercise Science, 3, 15–36. du Toit, S., du Toit, M., Jöreskog, K. G., & Sörbom, D. (2002). Interactive LISREL: User’s guide. Chicago: Scientific Software International. Emerson, J. D., & Stoto, M. A. (1983). Transforming data. In D. C. Hoaglin, F. Mosteller, & J. W. Tukey (Eds.), Understanding robust and exploratory data analysis (pp. 97–128). New York: Wiley. Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation methods, and model specification on structural equation modeling fit indexes. Structural Equation Modeling, 6, 56–83. Fischer, E., & Gainer, B. (1994). Masculinity and the consumption of organized sports. In J. A. Costa (Ed.), Gender issues and consumer behavior (pp. 84–103). Thousand Oaks, CA: Sage. Fornell, C. (1992). A national customer satisfaction barometer: The Swedish experience. Journal of Marketing, 56(1), 6–21. Fornell, C., & Larcker, D. (1981). Evaluating structural equation models with unobservable variables and measurement errors. Journal of Marketing Research, 18, 39–50. Fornell, C., & Wernerfelt, B. (1987). Defensive marketing strategy by customer complaint management: A theoretical analysis. Journal of Marketing Research, 24, 337–346. Froman, R. D. (2001). Elements to consider in planning the use of factor analysis. Southern Online Journal of Nursing Research, 5(2), 1–22. Retrieved January 20, 2004, from http://www.snrs.org/ members/SOJNR_articles/iss05vol02.pdf Garvin, D. A. (1988). Managing quality: The strategic and competitive edge. New York: Free Press. Gerbing, D. W., & Anderson, J. C. (1985). 
The effects of sampling error and model characteristics on parameter estimation for maximum likelihood confirmatory factor analysis. Multivariate Behavioral Research, 20, 255–272. Gerbing, D. W., & Anderson, J. C. (1993). Monte Carlo evaluations of goodness-of-fit indices for structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 40–65). Newbury Park, CA: Sage. Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. Grönroos, C. (1982). Strategic management and marketing in the service sector. Helsingfors, Sweden: Swedish School of Economics and Business Administration. Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103, 265–275. Guttman, L. (1954). Some necessary conditions for common factor analysis. Psychometrika, 18, 179–186. Hallowell, R. (1996). The relationships of customer satisfaction, customer loyalty, and profitability: An empirical study. International Journal of Service Industry Management, 7(4), 27–42. Harlow, L. L. (1985). Behavior of some elliptical theory estimators with nonnormal data in a covariance structures framework: A Monte Carlo study. Unpublished doctoral dissertation, University of California, Los Angeles. Hendrickson, A. E., & White, P. O. (1964). Promax: A quick method of rotation to oblique simple structure. British Journal of Statistical Psychology, 17, 65–70. Hofer, S. M., Horn, J. L., & Eber, H. W. (1997). A robust five-factor structure of the 16-PF: Evidence from independent rotation and confirmatory factorial invariance procedures. Personality and Individual Differences, 23, 247–269. Horovitz, J. (1990). How to win customers: Using customer service for a competitive edge. London: Pitman Publishing.
Howat, G., Absher, J., Crilley, G., & Milne, I. (1996). Measuring customer service quality in sports and leisure centers. Managing Leisure, 1, 77–89. Howat, G., Murray, D., & Crilley, G. (1999). The relationships between service problems and perceptions of service quality, satisfaction, and behavioral intentions of Australian public sports and leisure center customers. Journal of Park and Recreation Administration, 17(2), 42–64. Hoyle, R. H. (1995). The structural equation modeling approach: Basic concepts and fundamental issues. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 1–15). Thousand Oaks, CA: Sage. Hoyle, R. H., & Panter, A. T. (1995). Writing about structural equation modeling. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 158–176). Thousand Oaks, CA: Sage. Hu, L., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 76–99). Thousand Oaks, CA: Sage. Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55. International Health, Racquet and Sportsclub Association. (2004). Profiles of success. Boston: Author. Jones, T., & Sasser, W. (1995, November–December). Why satisfied customers defect. Harvard Business Review, 88–89. Jöreskog, K. G., & Sörbom, D. (1993). LISREL 8: User’s reference guide. Chicago: Scientific Software International. Kaiser, H. F., & Caffrey, J. (1965). Alpha factor analysis. Psychometrika, 30, 1–14. Kaplan, D. (2000). Structural equation modeling. Thousand Oaks, CA: Sage. Kazuo, Y., Tsutomu, Y., Kenichiro, N., Makoto, N., Itsuki, N., & Kazuhiko, K. (1998). The product structure and its influencing power on consumer satisfaction in fitness club. Bulletin of Institute of Health and Sport Sciences, 21, 87–98. Kim, D., & Kim, S. Y. (1995). 
QUESC: An instrument for assessing the service quality of sport centers in Korea. Journal of Sport Management, 9, 208–220. Lam, E. T. C. (1994). Member survey final report: A guide to serve 2,200 members (Tech. Rep.). Springfield, MA: Metropolitan Springfield YMCA. Loehlin, J. C. (1998). Latent variable models: An introduction to factor, path, and structural analysis (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates, Inc. MacCallum, R. C. (1995). Model specification: Procedures, strategies, and related issues. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 16–36). Thousand Oaks, CA: Sage. MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1, 130–149. MacKay, K. J., & Crompton, J. L. (1988). A conceptual model of consumer evaluation of recreation service quality. Leisure Studies, 7, 41–49. MacKay, K. J., & Crompton, J. L. (1990). Measuring the quality of recreation services. Journal of Park and Recreation Administration, 8(3), 47–56. Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness-of-fit indexes in confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103, 391–411. McAlexander, J. H., Kaldenberg, D. O., & Koenig, H. F. (1994). Service quality measurement. Journal of Health Care Marketing, 14(3), 34–40. McCarville, R. E., & Garrow, G. W. (1993). Name selection and response to a hypothetical recreation program. Journal of Park and Recreation Administration, 11(3), 15–27. McDonald, M. A., & Howland, W. (1998). Health and fitness industry. In L. P. Masteralexis, C. A. Barr, & M. A. Hums (Eds.), Principles and practice of sport management (pp. 431–451). Gaithersburg, MD: Aspen.
McDougall, G. H. G., & Levesque, T. J. (1994). A revised view of service quality dimensions: An empirical investigation. Journal of Professional Services Marketing, 11, 189–209. Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58, 525–543. Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: American Council on Education/Macmillan. Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741–749. Mintel International Group Limited. (2001). Health and fitness club market. Chicago: Author. Mudie, P., & Cottam, A. (1999). The management and marketing of services (2nd ed.). Oxford, England: Butterworth-Heinemann. Murray, D., & Howat, G. (2002). The relationships among service quality, value, satisfaction, and future intentions of customers at an Australian Sports and Leisure Center. Sport Management Review, 5, 25–43. Muthén, B., & Kaplan, D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38, 171–189. Olsson, U. H., Foss, T., Troye, S. V., & Howell, R. D. (2000). The performance of ML, GLS, and WLS estimation in structural equation modeling under conditions of misspecification and nonnormality. Structural Equation Modeling, 7, 557–595. Papadimitriou, D. A., & Karteroliotis, K. (2000). The service quality expectations in private sport and fitness centers: A reexamination of the factor structure. Sport Marketing Quarterly, 9, 157–164. Parasuraman, A., Zeithaml, V. A., & Berry, L. L. (1988). SERVQUAL: A multiple-item scale for measuring consumer perceptions of service quality. Journal of Retailing, 64, 12–36. Pritchard, M., Howard, D., & Havitz, M. E. (1992). 
Loyalty measurement: A critical examination and theoretical extension. Leisure Sciences, 14, 155–164. Reichheld, F. F., & Sasser, W. E. (1990, September). Zero defections: Quality comes to services. Harvard Business Review, 68, 105–111. Rummel, R. J. (1970). Applied factor analysis. Evanston, IL: Northwestern University. Rust, R. T., & Oliver, R. L. (1994). Service quality: Insights and managerial implications from the frontier. In R. T. Rust & R. L. Oliver (Eds.), Service quality: New directions in theory and practice (pp. 1–19). Thousand Oaks, CA: Sage. Sawyer, T. H., & Smith, O. (1999). The management of clubs, recreation and sport: Concepts and applications. Champaign, IL: Sagamore. Shostack, L. G. (1977). Breaking free from product marketing. Journal of Marketing, 41, 73–80. Sonnenberg, F. K. (1989). Service quality: Forethought, not afterthought. Journal of Business Strategy, 10(5), 54–57. Steiger, J. H. (1989). EzPATH: A supplementary module for SYSTAT and SYSGRAPH [Computer software]. Evanston, IL: SYSTAT, Inc. Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25, 173–180. Steiger, J. H., & Lind, J. C. (1980, May). Statistically-based tests for the number of common factors. Paper presented at the annual meeting of the Psychometric Society, Iowa City, IA. Stum, D. L., & Thiry, A. (1991). Building customer loyalty. Training and Development Journal, 45(4), 34–36. Tanaka, J. S., & Bentler, P. M. (1985). Quasi-likelihood estimation in asymptotically efficient covariance structure models. Proceedings of the American Statistical Association, pp. 658–662. Thomas, J. R., & Nelson, J. K. (2001). Research methods in physical activity (4th ed.). Champaign, IL: Human Kinetics.
West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables: Problems and remedies. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 56–75). Thousand Oaks, CA: Sage. Whipple, T. W., & Thatch, S. V. (1988). Group tour management: Does good service produce satisfied customers? Journal of Travel Research, 27(2), 16–21. Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of psychological instruments: Applications in the substance use domain. In K. J. Bryant, M. Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 281–324). Washington, DC: American Psychological Association. Wright, B. A., Duray, N., & Goodale, T. L. (1992). Assessing perceptions of recreation center service quality: An application of recent advancements in service quality research. Journal of Park and Recreation Administration, 10(3), 33–47. Zeithaml, V. A., Berry, L. L., & Parasuraman, A. (1996). The behavioral consequences of service quality. Journal of Marketing, 60(4), 31–46. Zeithaml, V. A., Parasuraman, A., & Berry, L. L. (1985). Problems and strategies in service marketing. Journal of Marketing, 49(2), 33–46.