Expert Systems with Applications 41 (2014) 3761–3768


Benchmarking of service quality with data envelopment analysis

Hakyeon Lee a, Chulhyun Kim b,*

a Department of Industrial and Systems Engineering, Seoul National University of Science and Technology, 172 Gongreung 2-dong, Nowon-gu, Seoul 139-746, Republic of Korea
b Department of Technology and Systems Management, Induk University, Choansan-ro 14, Nowon-gu, Seoul 139-050, Republic of Korea

Keywords: Benchmarking; Service quality; Data envelopment analysis (DEA); SERVQUAL; SERVPERF

Abstract: This paper proposes a data envelopment analysis (DEA) approach to the measurement and benchmarking of service quality. Treating the measurement of overall service quality of multiple units with SERVPERF as a multiple-criteria decision-making (MCDM) problem, the proposed approach utilizes DEA, in particular the pure output model without inputs. The five dimensions of SERVPERF are considered as outputs of the DEA model. A case study of auto repair services is provided for the purpose of illustration. The current practice of benchmarking service quality with SERVQUAL/SERVPERF is limited in that there is little guidance on whom to benchmark and to what degree service quality should be improved. This study contributes to the field of service quality benchmarking by overcoming these limitations, taking advantage of DEA's capability to handle MCDM problems and provide benchmarking guidelines.
© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

Service quality has consistently been at the core of research into service industries, since it is recognized as a critical determinant of business performance and a strategic tool for firms wishing to gain long-term viability (Gale, 1994). The prerequisite for achieving a high level of service quality is being able to measure it. During the past two decades, determining the best way to measure service quality has been a matter of concern for both practitioners and researchers. There is an extensive body of knowledge on measuring service quality, which has been a continued focus of research in terms of definition, typology, models, and operationalization (Seth, Deshmukh, & Vrat, 2005). Unquestionably, the most popular measure of service quality is SERVQUAL, developed by Parasuraman, Zeithaml, and Berry (1988). SERVQUAL is a multi-item instrument for measuring service quality based on the gap model, in which service quality is a function of the difference between perception and expectation (Parasuraman, Zeithaml, & Berry, 1985). SERVQUAL has enjoyed a number of applications in a variety of settings, but many researchers have also criticized its operationalization (Carrillat, Jaramillo, & Mulki, 2007). In an effort to address deficiencies in SERVQUAL, Cronin and Taylor (1992) developed the SERVPERF instrument, which uses customers' perceived performance as a direct measure of service quality. Regardless of whether SERVQUAL or SERVPERF is used, what is of particular interest in this study is the analysis of survey data in measuring service quality. The original instrument of both SERVQUAL

and SERVPERF (henceforth, SERVQUAL/SERVPERF) comprises five dimensions with 22 items (44 items in SERVQUAL, which measures both perception and expectation). Analysis of SERVQUAL/SERVPERF can take several forms, including item-by-item analysis, dimension-by-dimension analysis, or computation of a single measure of overall service quality (Buttle, 1996). The single measure can be obtained in various ways, such as a simple sum or average, a weighted sum, or a weighted average, with weights assigned to each dimension or item. One of the primary reasons for producing a single measure of overall service quality across dimensions is to enable benchmarking through comparison. One of the practical values of SERVQUAL/SERVPERF lies in its ability to establish best practices by comparing the overall quality scores of service units and then improving the performance of units that are falling behind (Camp, 1989; Kettinger & Lee, 1997). However, benchmarking based on a simple aggregated measure has the limitation that it gives little guidance on whom to benchmark and to what degree service quality should be improved. To address this limitation, this paper proposes a data envelopment analysis (DEA) approach to computing a single measure of overall service quality and benchmarking service quality on the five dimensions of SERVQUAL/SERVPERF. DEA is a linear programming model for measuring the relative efficiency of decision-making units (DMUs) with multiple inputs and outputs (Cooper, Seiford, & Tone, 2000). DEA has several advantages: it can handle multiple inputs and outputs, and it requires neither a prescribed functional form of production nor prescribed weights attached to each input and output. The greatest merit of DEA is that it provides benchmarking guidelines for inefficient DMUs (Paradi & Zhu, 2013). For each inefficient DMU, DEA identifies a set of


efficient units, called the reference set, which constitutes its benchmark, together with information on how much each input and output should be improved for the DMU to become efficient. Thus, producing a single measure of overall service quality with DEA automatically yields guidelines on how to conduct service quality benchmarking in terms of each dimension. DEA has been widely and successfully employed to measure performance across various service industries, mainly in banking, health care, transportation, and education (Liu, Lu, Lu, & Lin, 2013). Most early studies paid attention to the operational efficiency or profitability of service units rather than the quality aspect (Soteriou & Stavrinides, 1997). More recent studies have started to use quality measures as outputs, such as grades in education (Olesen & Petersen, 1995), the ratio of actual deaths to predicted deaths in healthcare (Morey, Fine, Loree, Retzlaff-Roberts, & Tsubakitani, 1992), and the number of satisfied customers in the airline industry (Adler & Berechman, 2001). Some researchers have attempted to measure both quality efficiency and operating efficiency using DEA (Soteriou & Stavrinides, 1997; Kamakura, Mittal, de Rosa, & Mazzon, 2002; Sherman & Zhu, 2006; Shimshak, Lenard, & Klimberg, 2009). However, the quality measures utilized in those studies are merely proxies, not direct indicators of service quality. This study adopts direct measures of service quality, the five dimensions of SERVQUAL/SERVPERF, as the output variables of DEA. Another problem of previous studies that consider service quality variables as outputs of DEA is their implicit assumption that inputs are transformed into quality output measures. However, it does not make sense that quality-type outputs, as opposed to quantitative physical outputs, increase with additional inputs (Salinas-Jimenez & Smith, 1996; Shimshak et al., 2009). To deal with this lack of connection between the amount of inputs and quality-type outputs, this study adopts the pure output DEA model proposed by Lovell and Pastor (1999). The pure output model includes only outputs, without any input variables. The remainder of this paper is organized as follows. Section 2 reviews the theoretical background of service quality measurement, with a focus on SERVQUAL/SERVPERF. DEA models are explained in Section 3. Section 4 shows how to apply DEA to the measurement and benchmarking of service quality with a case study of auto repair services. The paper ends with conclusions and directions for future research in Section 5.

2. Measuring service quality: SERVQUAL and SERVPERF

In the pioneering work by Parasuraman et al. (1985), ten dimensions of service quality were proposed together with the gap model, in which service quality is a function of the difference between perceptions and expectations of a service. Parasuraman et al. (1988) developed a scale composed of 22 items for measuring service quality, called SERVQUAL, in which the original ten dimensions of service quality are collapsed into five: tangibles, reliability, responsiveness, assurance, and empathy, as presented in Table 1. The SERVQUAL instrument includes 22 items for measuring expectations (E) and 22 corresponding items for measuring perceptions (P). For each item, based on the gap model, a quality score (Q) is obtained as the difference between the perception (P) and expectation (E) ratings; that is, Q = P − E. The development of SERVQUAL has spawned a considerable amount of related research on its practical applications as well as theoretical discussions. A number of applications of SERVQUAL have been reported in a variety of settings (Ladhari, 2009), but it has also been criticized on theoretical and operational grounds (Jain & Gupta, 2004). The biggest issue raised by many researchers is its operationalization, namely the use of the gap score (Calvo-Porral, Lévy-Mangin, & Novo-Corti, 2013). Contrary to the original work by Parasuraman et al. (1988), the convergent validity of SERVQUAL has often not been confirmed in subsequent studies. Many studies have found that service quality measured with SERVQUAL is not significantly related to that measured directly through a single-item scale (Babakus & Boller, 1992; Carman, 1990). Van Dyke, Kappelman, and Prybutok (1997) argued that separately measuring the expected and perceived levels of service quality and subtracting one score from the other is too simplistic to capture the complex cognitive process of perceiving service quality, because one's perception of service quality already entails the expectation of a service. The conceptualization of "expectation" has also been under attack because it is subject to multiple interpretations (Teas, 1993; Teas, 1994). Although Parasuraman, Zeithaml, and Berry (1994) provided thorough rebuttals to the critics of the use of gap scores, many researchers posit that a simple performance-based measure is a preferable means of measuring service quality (Babakus & Boller, 1992; Bolton & Drew, 1991; Cronin & Taylor, 1992; Cronin & Taylor, 1994). Raising fundamental criticisms against SERVQUAL, Cronin and Taylor (1992) proposed SERVPERF, which directly assesses customers' perceived performance. The SERVPERF instrument discards the expectation component and includes only the 22 items for measuring performance (P). SERVPERF assumes that higher perceived performance implies higher service quality; that is, Q = P. Obviously, the SERVPERF scale is more efficient than the SERVQUAL scale because it halves the number of items to be measured. Cronin and Taylor (1992) also empirically showed the theoretical superiority of the SERVPERF scale over the SERVQUAL scale. Since the advent of SERVPERF, a vigorous debate has taken place on whether SERVQUAL or SERVPERF should be used for measuring service quality. Numerous attempts have been made to compare the two scales on such criteria as reliability, content validity, predictive validity, convergent validity, and diagnostic power (Babakus & Boller, 1992; Brady, Cronin, & Brand, 2002; Brown, Churchill, & Peter, 1993; Carrillat et al., 2007; Cui, Lewis, & Park, 2003; Hudson, Hudson, & Miller, 2004; Jain & Gupta, 2004; Kettinger & Lee, 1997; Mukherje & Nath, 2005; Quester & Romaniuk, 1997; Zhou, 2004). However, the question remains controversial, and there is no general agreement about which is better. Most researchers have upheld the idea that SERVPERF is a better alternative than SERVQUAL in terms of validity and explanatory power (Babakus & Boller, 1992; Brady et al., 2002; Brown et al., 1993; Kettinger & Lee, 1997; Zhou, 2004).
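To make the two scoring rules concrete, the sketch below (illustrative only; the simulated responses, the 0–100 rating range, and the item-to-dimension mapping are assumptions, with the mapping following the 4–5–4–4–5 item split of Table 1) computes dimension-level SERVQUAL gap scores (Q = P − E) and SERVPERF performance scores (Q = P) from a matrix of survey ratings.

```python
# Illustrative sketch: dimension-level SERVQUAL gap scores (Q = P - E) and
# SERVPERF scores (Q = P). The data and the item-to-dimension mapping are
# hypothetical; the mapping follows the 4-5-4-4-5 item split of Table 1.
import numpy as np

rng = np.random.default_rng(0)
n_respondents, n_items = 30, 22
perceptions = rng.uniform(50, 100, size=(n_respondents, n_items))   # P ratings
expectations = rng.uniform(60, 100, size=(n_respondents, n_items))  # E ratings

dimensions = {
    "tangibles": list(range(0, 4)),
    "reliability": list(range(4, 9)),
    "responsiveness": list(range(9, 13)),
    "assurance": list(range(13, 17)),
    "empathy": list(range(17, 22)),
}

servqual = {d: float((perceptions[:, idx] - expectations[:, idx]).mean())
            for d, idx in dimensions.items()}
servperf = {d: float(perceptions[:, idx].mean()) for d, idx in dimensions.items()}
print("SERVQUAL gap scores:", servqual)
print("SERVPERF scores:    ", servperf)
```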

Table 1
Dimensions of SERVQUAL (Parasuraman et al., 1988).

Dimension        Definition                                                                              Number of items
Tangibles        Physical facilities, equipment, and appearance of personnel                            4
Reliability      Ability to perform the promised service dependably and accurately                      5
Responsiveness   Willingness to help customers and provide prompt service                               4
Assurance        Knowledge and courtesy of employees and their ability to inspire trust and confidence  4
Empathy          Caring, individualized attention the firm provides its customers                       5


Quester and Romaniuk (1997), in contrast, support the superiority of SERVQUAL in terms of convergent validity. The recent study by Carrillat et al. (2007) revealed that both scales are equally valid measures of service quality. When it comes to diagnostic power, however, researchers have reached a nearly universal consensus that SERVQUAL is superior to SERVPERF (Jain & Gupta, 2004; Kettinger & Lee, 1997; Pitt, Watson, & Kavan, 1997). Since each has its own advantage, Jain and Gupta (2004) suggested employing SERVPERF for assessing overall service quality and making comparisons across units, firms, and industries, thanks to its higher validity and explanatory power, whereas they preferred SERVQUAL as a tool for periodic diagnosis to identify areas of quality shortfall within specific service processes. This study does not intend to join that debate, but rather aims to show that the proposed DEA approach can be successfully used to produce an aggregated single measure of overall service quality and to support benchmarking. Consequently, for this research, whether SERVQUAL or SERVPERF is employed is not a critical issue. Given the additional advantage of measurement efficiency, this study adopts SERVPERF, since benchmarking here concerns the comparison of overall service quality across multiple units.

3. DEA

DEA is a non-parametric approach that requires neither assumptions about the functional form of the production function nor a priori information on the importance of inputs and outputs. The relative efficiency of a DMU is measured by estimating the ratio of weighted outputs to weighted inputs and comparing it with those of other DMUs. DEA allows each DMU to choose the input and output weights that maximize its efficiency. DMUs that achieve 100% efficiency are considered efficient, while DMUs with efficiency scores below 100% are inefficient. For every inefficient DMU, DEA identifies a reference set composed of corresponding efficient DMUs that can be used as benchmarks for improvement. DEA also allows calculation of the amount of improvement in the inefficient DMU's inputs and outputs required to make it efficient. The first DEA model, proposed by Charnes, Cooper, and Rhodes (1978), is the CCR model, which assumes constant returns to scale. Banker, Charnes, and Cooper (1984) extended the CCR model into the BCC model for cases of variable returns to scale. DEA models are also distinguished by objective: maximizing outputs (output-oriented) or minimizing inputs (input-oriented). The output-oriented BCC model employed in this study is formulated as

\[
\begin{aligned}
\max \quad & \eta \\
\text{s.t.} \quad & X\lambda \le x_o \\
& \eta y_o - Y\lambda \le 0 \\
& e\lambda = 1 \\
& \lambda \ge 0
\end{aligned}
\tag{1}
\]

where X is the matrix of input vectors, Y is the matrix of output vectors, (x_o, y_o) is the DMU being measured, η is the reciprocal of the efficiency score, and λ is the vector of intensity variables. The only difference between the CCR and BCC models is the presence of the convexity condition eλ = 1. While DEA was originally developed for measuring the efficiency of multiple units performing a transformation of several inputs into several outputs, it now also plays a broader role as a tool for multiple-criteria decision-making (MCDM) problems (Bouyssou, 1999). Although the traditional goals of DEA and MCDM differ, in that MCDM aims to prioritize a set of alternatives with conflicting criteria, many researchers have found similarities between the two (Ramanathan, 2006).


Scholars have recognized that the MCDM and DEA formulations coincide if inputs and outputs are viewed as criteria, with minimization of inputs and maximization of outputs (Belton & Vickers, 1993; Doyle & Green, 1993; Stewart, 1996). Such criteria can be divided into two types: costs or negative evaluation items (the smaller the value, the better) as inputs and benefits or positive items (the greater the value, the better) as outputs (Stewart, 1996). Efficiency scores of DMUs are considered as priority weights or performance scores in MCDM. When this is the case, it is not assumed that inputs are necessarily and directly transformed into outputs (Cook, Tone, & Zhu, 2014). In some MCDM problems, there is no negative (or positive) evaluation item. In other words, all criteria are preferred to be high (or low); thus, only outputs (or inputs) will exist when using DEA. To accommodate this kind of situation, Lovell and Pastor (1999) suggest the pure output (or input) model without inputs (or outputs). They proved that an output-oriented CCR model with a single constant input and an input-oriented CCR model with a single constant output coincide with the corresponding BCC models, but a CCR model without inputs (or outputs) is meaningless. The pure output model has successfully been employed in various problems, such as target setting for bank services (Lovell & Pastor, 1997), facility layout design (Yang & Kuo, 2003), identification of new business areas (Seol, Lee, & Kim, 2011), and service-process benchmarking (Seol, Choi, Park, & Park, 2007). Since all of the five dimensions of SERVPERF are positive items, this study also adopts the pure output model to aggregate their scores into a single measure of service quality.
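As a computational illustration (not the authors' own implementation), the output-oriented BCC model in (1) can be solved with an off-the-shelf linear programming routine; feeding it a single constant input for every DMU reproduces the pure output setting just described. The function name, toy data, and solver choice below are assumptions made for the sketch.

```python
# Sketch of the output-oriented BCC envelopment model (1) solved with scipy's
# linprog. Supplying a single constant input for every DMU gives the pure
# output model of Lovell and Pastor. Data and names here are illustrative.
import numpy as np
from scipy.optimize import linprog

def bcc_output_oriented(X, Y, o):
    """X: (m inputs x n DMUs), Y: (s outputs x n DMUs), o: index of evaluated DMU.
    Returns (efficiency score = 1/eta, lambda vector)."""
    m, n = X.shape
    s = Y.shape[0]
    # Decision variables: [eta, lambda_1, ..., lambda_n]; linprog minimizes, so use -eta.
    c = np.r_[-1.0, np.zeros(n)]
    A_in = np.hstack([np.zeros((m, 1)), X])        # X @ lam <= x_o
    A_out = np.hstack([Y[:, [o]], -Y])             # eta*y_o - Y @ lam <= 0
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[X[:, o], np.zeros(s)]
    A_eq = np.r_[0.0, np.ones(n)].reshape(1, -1)   # convexity condition: sum(lam) = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0.0, None)] * n, method="highs")
    eta = -res.fun
    return 1.0 / eta, res.x[1:]

# Pure output model: five quality dimensions as outputs, one constant input (toy values).
Y = np.array([[90.0, 75.0, 85.0, 95.0],
              [80.0, 88.0, 70.0, 92.0],
              [85.0, 60.0, 95.0, 88.0],
              [78.0, 90.0, 82.0, 91.0],
              [92.0, 70.0, 65.0, 89.0]])
X = np.full((1, Y.shape[1]), 10.0)                 # single constant input for every DMU
for j in range(Y.shape[1]):
    score, lam = bcc_output_oriented(X, Y, j)
    print(f"DMU {j}: DEA score = {score:.4f}")
```

Because the constant input is identical across DMUs, the input constraint is redundant with the convexity condition, which is exactly why this setup coincides with the pure output BCC model.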

4. DEA–SERVPERF approach to benchmarking of service quality

4.1. Motivation and framework

As mentioned earlier, one of the practical uses of SERVQUAL/SERVPERF is benchmarking of service quality across multiple units. Benchmarking can be defined as "a continuous, systematic process for evaluating the products, services, and work processes of organizations that are recognized as representing best practices for the purpose of organizational improvement" (Spendolini, 1992). In general, benchmarking is composed of three steps: (1) identifying the best performers; (2) setting benchmarking goals; and (3) implementation (Camp, 1989; Camp, 1998; Spendolini, 1992; Donthu et al., 2005). However, simply adopting SERVQUAL/SERVPERF cannot practically support any of the three steps. As to the first step, SERVQUAL/SERVPERF provides little guidance on the selection of benchmarks. The unit with the highest score is likely to be considered the best practice, but it does not make sense to urge all other units to follow that best practice, because the managerial context and environment each unit faces might be different. It is obvious that there is no single leader for all units. Depending on the managerial context and environment, each unit has a different set of leaders that could serve as role models (Donthu et al., 2005). A more rational approach is to assign different relevant benchmarks to different units, considering their similarities in terms of managerial and operational situations (Lee & Kim, 2012). DEA can solve this problem by assigning to each inefficient DMU, as role models, a different set of efficient units with a similar input and output structure. Once benchmarking targets are selected, the question of setting goals remains: how much should service quality be improved? Benchmarking goals should be measurable, attainable, and actionable (Spendolini, 1992). One may consider the current achievement levels of the best performer in overall service quality as goals, but this may not be actionable, because a unit with lower overall quality may still outperform the best performer in terms of several


of the five dimensions. A virtual benchmark whose value in each dimension is the highest observed among all units could be constructed, but narrowing the differences between a unit and such a virtual benchmark in all dimensions may not be attainable in practice. What is required is to determine the degree to which each of the five dimensions matters in achieving benchmarks, reflecting the current level and characteristics of service quality in each unit. This can also be achieved with DEA, because it not only identifies a reference set as a benchmark for improvement but also determines the amount of improvement required. A reference set for an inefficient DMU may include one or more efficient DMUs. Combining the values of the reference DMUs produces a hypothetical composite unit, which is considered to be the benchmark the inefficient DMU must reach to become efficient. The performance levels of the composite unit are the lambda-weighted averages of the performance levels of the reference DMUs, where the lambdas are the dual variables indicating the relative importance of each reference DMU (Post & Spronk, 1999). Measuring the overall quality of service units with SERVQUAL/SERVPERF can be viewed as an MCDM problem in which the five dimensions are employed as criteria to measure the performance of the units in terms of service quality. Several studies have adopted this approach to measuring service quality, using well-known MCDM techniques such as grey relational analysis (GRA) (Kuo & Liang, 2011), TOPSIS (Büyüközkan & Çifçi, 2012), the analytic hierarchy process (AHP) or fuzzy AHP (Büyüközkan, Çifçi, & Güleryüz, 2011; Chow & Luk, 2005; Lupo, 2013), the analytic network process (ANP) (Hsieh & Lin, 2008), and both the AHP and the ANP (Altuntas, Dereli, & Yilmaz, 2012). Further, as mentioned in Section 3, DEA can be used as a tool for MCDM by considering the input/output variables of DMUs as negative/positive criteria for the evaluation of alternatives. Thus, DEA is capable of aggregating the scores of the five dimensions of SERVPERF into a single measure of overall service quality. Fig. 1 shows the correspondence between SERVPERF and DEA. The DEA results not only contain the overall quality scores of the service units but also provide benchmarking guidelines for each inefficient DMU. Since the five dimensions of SERVPERF are positive items from the perspective of MCDM, this study adopts the pure output model explained in Section 3. The pure output-oriented BCC model is obtained by removing from (1) the first constraint, which corresponds to the inputs.

4.2. Measuring DEA-SQ for benchmarking

The proposed approach is illustrated with a case study of auto repair services. In the auto-repair service industry, the quality gap between service providers is particularly large, and service-provider switching occurs frequently; that is, the effect of service quality on customer retention is decisive (Bansal, Irving, & Taylor, 2004). For this reason, many studies have dealt with service quality in auto-repair services (Bansal et al., 2004; Bearden, 1983; Schneider, 2012). In addition, major automotive manufacturers


usually retain huge service networks composed of multiple homogeneous service units, which is exactly the context in which DEA can be effectively utilized. The proposed approach was applied to repair service units of the Korea division of company G. The company was originally established and owned by a Korean firm, but was fully acquired by an American corporation, one of the leading global automotive manufacturers. Its service network in Korea is composed of 479 units across the country. A questionnaire composed of the 22 SERVPERF items and a single item directly measuring overall service quality was used for the customer survey. Each item was rated on a scale from 0 to 100. The survey was conducted by e-mail among customers who visited any of the service units during January 2010 and agreed to participate. A total of 2705 respondents completed the questionnaire, giving a response rate of 29.7%. Excluding units with fewer than five respondents yielded a final data set of 254 units with 1949 respondents, or 7.67 responses per unit on average. DEA was then conducted using the pure output model, with the five dimensions considered as outputs. For operationalization, a single constant input value of 10 was assigned to every DMU. DEA efficiency scores as measures of service quality (DEA-SQ) were then obtained for the 254 DMUs. First, the construct validity of DEA-SQ is examined alongside two other measures: overall service quality directly obtained from the single item (Single Item-SQ), and the SERVPERF score computed as the sum of the scores of the five dimensions (SERVPERF-SQ). Table 2 presents the Pearson correlation coefficients among these three measures. Both SERVPERF-SQ and DEA-SQ are highly correlated with Single Item-SQ, which implies that both measures have high construct validity. As expected, DEA-SQ is found to have lower validity than SERVPERF-SQ. This finding may be attributed to the truncated distribution of DEA efficiency scores below 1, since basic DEA models do not discriminate among the performance of efficient DMUs. However, this study does not intend to claim that DEA-SQ is superior to SERVPERF-SQ. The validity test shows that DEA-SQ is also a valid measure of overall service quality, so it can be used for benchmarking of service quality. The DEA results reveal that eight of the 254 DMUs are efficient. Table 3 presents the eight efficient DMUs with their actual scores across the five dimensions. The second column gives the frequency with which each efficient DMU is included in the reference sets of the other 246 inefficient units. As mentioned before, DEA provides benchmarking guidelines for inefficient DMUs. In the present results, for DMU 87, whose efficiency score is 0.9062, for example, a single DMU, DMU 37, is assigned as the benchmark. Comparing the differences between DMUs 37 and 87 determines how much improvement is needed in each dimension of DMU 87, as shown in Table 4(a). For DMU 129, whose efficiency score is 0.9434, a reference set composed of two DMUs, 8 and 37, is identified. Combining the two reference DMUs yields the target values for DMU 129, as shown in Table 4(b).
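The computation just described can be sketched end to end as follows. The data here are simulated stand-ins for the survey, and the `bcc_output_oriented` function is the assumed sketch given after Section 3: each unit's five dimension averages are treated as outputs, every DMU receives the constant input of 10, and the resulting DEA-SQ scores are correlated with SERVPERF-SQ and a single-item overall rating.

```python
# Hypothetical end-to-end sketch of the DEA-SQ computation and validity check:
# five dimension averages per unit as outputs, constant input of 10, Pearson
# correlations against SERVPERF-SQ and a single-item overall rating.
# (Reuses bcc_output_oriented from the earlier sketch; data are simulated.)
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_units = 254
dim_scores = rng.uniform(60, 100, size=(5, n_units))                 # unit-level dimension averages
single_item = dim_scores.mean(axis=0) + rng.normal(0, 3, n_units)    # direct overall rating

const_input = np.full((1, n_units), 10.0)                            # single constant input of 10
dea_sq = np.array([bcc_output_oriented(const_input, dim_scores, j)[0]
                   for j in range(n_units)])
servperf_sq = dim_scores.sum(axis=0)                                 # sum of the five dimensions

for name, score in [("SERVPERF-SQ", servperf_sq), ("DEA-SQ", dea_sq)]:
    r, p = pearsonr(single_item, score)
    print(f"{name} vs Single Item-SQ: r = {r:.3f} (p = {p:.3g})")
```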

Fig. 1. SERVPERF-DEA correspondence.

SERVPERF                  MCDM                 DEA
Service units             Alternatives         DMUs
Five dimensions           Positive criteria    Outputs
Overall service quality   Priority weights     Efficiency score


Table 2
Correlations among measures of overall service quality.

                 Single Item-SQ   SERVPERF-SQ   DEA-SQ
Single Item-SQ   1                0.844*        0.718*
SERVPERF-SQ                       1             0.808*
DEA-SQ                                          1

* p < 0.01.

Table 6
Comparison across types.

Type   No. of units   Average efficiency score   Mean rank
A      22             0.9509                     118.84
B      50             0.9269                     94.29
C      182            0.9483                     137.67

χ² = 14.549, df = 2, p = 0.001.
Relative comparison using Mann–Whitney U test: C, A > B.

4.3. Benchmarking within groups

In the above analysis, every DMU is evaluated together under the underlying assumption that all service units are homogeneous. In fact, the units included in the repair-service network of company G can be divided into three types according to their characteristics, namely the type of management and the level of repair carried out. Some of the units are directly managed by the company, while others are managed by individuals and indirectly overseen by the company through contracts. The level of repair is dichotomized as heavy or light. Table 5 characterizes the three types of service unit and gives response statistics. Type C has the largest number of units, while Type A has the fewest. The average number of responses per unit is highest in Type A. To examine whether efficiency differences exist among the three types of service unit, a Kruskal–Wallis test is conducted, since the theoretical distribution of efficiency scores in DEA is unknown (Lee, Park, & Choi, 2009). The Kruskal–Wallis test is a nonparametric method that compares the medians of two or more samples to determine whether the samples come from different populations. The results of the test are presented in Table 6.
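A sketch of the statistical comparison reported in Table 6 is given below; the group scores are simulated, and only the group sizes mirror Table 5.

```python
# Sketch (simulated scores) of the nonparametric tests behind Table 6: a
# Kruskal-Wallis test across the three unit types, followed by pairwise
# Mann-Whitney U tests as post hoc comparisons.
import numpy as np
from scipy.stats import kruskal, mannwhitneyu

rng = np.random.default_rng(2)
dea_sq_by_type = {"A": rng.uniform(0.90, 1.00, 22),
                  "B": rng.uniform(0.85, 1.00, 50),
                  "C": rng.uniform(0.90, 1.00, 182)}   # illustrative DEA-SQ values

h, p = kruskal(*dea_sq_by_type.values())
print(f"Kruskal-Wallis: H = {h:.3f}, p = {p:.4f}")

for a, b in [("A", "B"), ("A", "C"), ("B", "C")]:
    u, p = mannwhitneyu(dea_sq_by_type[a], dea_sq_by_type[b], alternative="two-sided")
    print(f"Mann-Whitney U, {a} vs {b}: U = {u:.1f}, p = {p:.4f}")
```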

Statistically significant differences are found in efficiency scores among the three types of unit. A post hoc Mann–Whitney U test is also conducted for pairwise comparison. While there is no significant difference between Type C and Type A, their performance is found to be statistically superior to that of Type B. The differences between the efficiency scores of these three groups imply that the quality contexts of the service units may not be homogeneous, owing to their different managerial contexts and environments. Therefore, it may not be realistic to provide benchmarking guidelines across different types of DMU. DMUs 87 and 129, employed as examples above, belong to Types B and A, respectively, but their benchmarks, DMUs 37 and 8, are both Type C units. For benchmarking to be effective, benchmarks within the same group need to be assigned to inefficient DMUs. In within-group benchmarking, DEA is carried out independently for each type. Table 7 summarizes the within-group DEA results for the present data, including average efficiency scores and the number of efficient DMUs. It should be noted that these DEA results cannot be used for performance comparisons among groups, as the three groups are evaluated separately.

Table 3
List of efficient DMUs.

DMU   Frequency   Response   Tangibles   Assurance   Reliability   Empathy
37    132         100.00     97.92       100.00      83.33         100.00
188   16          81.25      73.44       65.63       95.83         62.50
164   83          100.00     77.34       89.06       93.75         93.75
20    30          91.67      75.00       87.50       94.44         91.67
8     74          100.00     89.58       66.67       94.44         66.67
149   73          100.00     90.63       93.75       91.67         93.75
207   15          95.00      86.25       85.00       93.33         92.50
201   11          100.00     87.50       82.50       93.33         85.00

Table 4
Example of benchmarking.

(a) Benchmarking of DMU 87 (efficiency score = 0.9062)
                  Response   Tangibles   Assurance   Reliability   Empathy
Target (DMU 37)   100.00     97.92       100.00      83.33         100.00
Actual            90.63      88.28       78.13       72.92         82.81
Difference        9.37       9.64        21.87       10.42         17.19

(b) Benchmarking of DMU 129 (efficiency score = 0.9434)
                      Response   Tangibles   Assurance   Reliability   Empathy
Target (DMU 8 & 37)   100.00     91.21       94.25       91.00         94.25
Actual                94.34      80.42       80.42       85.85         79.25
Difference            5.66       10.79       13.83       5.15          15.00
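The target rows in Table 4 are composite benchmarks of this kind. A small sketch of how such a report can be assembled from the DEA solution follows; it reuses the assumed `bcc_output_oriented` sketch above, and the tolerance and variable names are likewise assumptions.

```python
# Sketch: deriving a benchmarking report for an inefficient DMU. The reference
# set is read off the positive lambdas, the target is the lambda-weighted
# composite of the outputs, and the difference is the required improvement.
import numpy as np

def benchmark_report(const_input, dim_scores, o, tol=1e-6):
    score, lam = bcc_output_oriented(const_input, dim_scores, o)
    reference_set = np.flatnonzero(lam > tol)      # peers with lambda > 0
    target = dim_scores @ lam                      # composite benchmark outputs
    improvement = target - dim_scores[:, o]        # required gain per dimension
    return score, reference_set, target, improvement
```

When the reference set contains a single peer, as in Table 4(a), the convexity condition forces its lambda to 1, so the composite target simply equals that peer's dimension scores.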

Table 5
Types of repair units.

Group   Type of management   Level of repair   Number of units   Number of responses   Average responses per unit
A       Direct               Heavy             22                594                   27.0
B       Indirect             Heavy             50                477                   9.5
C       Indirect             Light             182               878                   4.8


Table 7
Within-group DEA results.

Group   No. of units   Average efficiency score   No. of efficient DMUs (%)
A       22             0.9814                     8 (36.66)
B       50             0.9537                     10 (20.00)
C       182            0.9538                     75 (41.21)

Within-group DEA yields new efficiency scores and reference sets for the inefficient DMUs. DMU 87 (Type B) is still inefficient, although its efficiency score has increased from 0.9062 to 0.9576. A new reference set composed of DMUs 134, 183, and 250, all of which belong to the same group (Type B), is assigned as its benchmark. Table 8 presents the new target values obtained by combining these three DMUs. The required improvement remains essentially unchanged for the response and reliability dimensions compared with the previous targets, while that for the other three dimensions has decreased. In contrast, DMU 129, which is not efficient in the previous analysis, is evaluated as efficient in the within-group analysis.
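A sketch of the within-group analysis (again reusing the earlier `bcc_output_oriented` function and constant-input setup; the group labels and data names are assumed) simply restricts each DEA run to units of the same type.

```python
# Sketch of within-group benchmarking: DEA is run separately for each type, so
# reference sets can only contain units of the same type. The constant input of
# 10 follows the setup described above; group labels are assumed inputs.
import numpy as np

def within_group_dea(dim_scores, groups):
    """dim_scores: (5 x n) dimension averages; groups: length-n array of type labels."""
    scores = np.empty(dim_scores.shape[1])
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)                  # units of this type only
        sub = dim_scores[:, idx]
        const_input = np.full((1, sub.shape[1]), 10.0)
        for k, j in enumerate(idx):
            scores[j] = bcc_output_oriented(const_input, sub, k)[0]
    return scores
```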

4.4. Implications and discussions

The above case study revealed that DEA-SQ is a valid measure of overall service quality and showed that the proposed DEA approach can be effectively used for benchmarking of service quality. There is no doubt that SERVPERF is a useful instrument for measuring service quality, but by itself it is not informative for benchmarking purposes. For benchmarking to be conducted successfully, SERVPERF should be employed in conjunction with DEA, because DEA can automatically support two steps of benchmarking: identifying the best performers and setting benchmarking goals. However, it should be noted that DEA is basically a diagnostic tool, so it cannot support the third step of benchmarking, implementation. DEA does not prescribe any reengineering strategies to make inefficient units efficient. Such improvement strategies need to be studied and implemented by understanding the operations of the benchmarks (Talluri, 2000). The proposed approach can be applied to service practices in various industries, and one important issue in formulating DEA is the selection of the DEA model. We employed only the pure output BCC model, but various DEA models are available. Selecting the most appropriate model may enhance the applicability and diagnostic power of the proposed approach. The following four issues need to be considered when selecting an appropriate DEA model. Firstly, the underlying assumption of employing the pure output model is that quality-type outputs do not increase with additional physical inputs; however, quality-relevant inputs, such as the effort devoted to training employees, may influence quality-type outputs. When these kinds of inputs are considered important in measuring performance, the basic output-oriented BCC model should be utilized. If input variables are included, one should consider the reduction effect that the output-oriented model has on inputs. When the reduction effect

exists, the latent variable model proposed by Bretholt and Pan (2013) can be an effective solution. Secondly, it was shown that the construct validity of DEA-SQ is lower than that of SERVPERF-SQ, which may be attributed to a characteristic of the DEA efficiency score, whose upper limit is 1. Although basic DEA models do not allow efficient DMUs to be ranked, techniques have been developed for full ranking, such as cross-efficiency (Sexton, Silkman, & Hogan, 1986) and super-efficiency (Andersen & Petersen, 1993). Utilizing these models can enhance the validity of DEA-SQ by improving its discriminant power. Thirdly, the relative importance of the five dimensions of SERVPERF is not mirrored in DEA-SQ. In DEA, the weights of the variables are chosen so as to assign the best set of weights to each DMU, where "best" means that the resulting efficiency score is maximized for the given data. However, if commonly accepted views on the importance of the variables are to be taken into account, the weight flexibility of DEA can lead to unrealistic efficiency scores (Allen, Athanassopoulos, Dyson, & Thanassoulis, 1997). When this is the case, weight restrictions need to be imposed (Dyson & Thanassoulis, 1988). Adopting weight-restriction methods, such as the assurance region (AR) model (Thompson, Langemeir, Lee, Lee, & Thrall, 1990), is expected to produce more realistic benchmarking guidelines, because the importance of the five dimensions of SERVPERF may differ across industries. For example, the reliability dimension is much more important than the other dimensions in transportation services, as is the assurance dimension in professional services. Incorporating such preferences by restricting the weights of the five dimensions can more effectively capture the unique characteristics of various service industries. Fourthly, the three types of repair unit were evaluated together in the case example. When DMUs can be categorized into different types, however, it would be more appropriate to adopt the categorical DEA model, in which a DMU is only evaluated in comparison to units in the same category (Banker & Morey, 1986). The categorical DEA model can provide more achievable benchmarking goals by improving the peer-group construction process.

5. Conclusions

This paper proposed a data envelopment analysis (DEA) approach to the measurement and benchmarking of service quality based on the five dimensions of SERVPERF. Treating the measurement of overall service quality of multiple units with SERVPERF as an MCDM problem, the proposed approach utilized DEA as an MCDM technique, in particular the pure output model without inputs. A case study of auto repair services was also provided as an illustration. The current practice of benchmarking service quality with SERVQUAL/SERVPERF is limited by the lack of clear guidelines for determining whom to benchmark and to what degree service quality should be improved. This study contributes to the field of service quality benchmarking by overcoming this limitation, taking advantage of DEA's capability to handle MCDM problems and provide benchmarking guidelines.

Table 8
Example of within-group benchmarking: DMU 87 (efficiency score = 0.9576).

                               Response   Tangibles   Assurance   Reliability   Empathy
Target (DMU 134 & 183 & 250)   100.00     92.19       87.50       83.33         90.62
Actual                         90.63      88.28       78.13       72.92         82.81
Difference                     9.37       3.91        9.37        10.41         7.81


Nevertheless, this study is subject to some limitations, which might serve as fruitful avenues for future research. First, although SERVQUAL/SERVPERF ratings usually take an ordinal form, this study adopts a ratio scale for the measurement of each item because standard DEA models require cardinal data. However, it is not easy for customers to answer questions on a 100-point scale, and response reliability is likely to be low compared to the Likert scale frequently employed in SERVQUAL/SERVPERF. In many DEA applications, ordinal inputs and outputs have been quantified to accommodate the DEA structure as a convenience, although this quantification might be superficial (Cook, Kress, & Seiford, 1996). There have been a few approaches to accommodating ordinal data, such as the imprecise model (Cooper, Seiford, & Tone, 2000) and the project model (Cook & Zhu, 2006). Employing these models with a five- or seven-point Likert scale is expected to capture service quality more accurately. Second, although this paper adopts the SERVPERF scale to measure the various aspects of service quality, SERVQUAL can still be used instead of SERVPERF. However, a problem may occur if SERVQUAL is used, since DEA cannot accommodate the negative values that may appear when using SERVQUAL. To accommodate the SERVQUAL scale, a scaling adjustment is required so that DEA will not yield infeasible solutions. Acknowledgement This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea government (NRF-2011-32A-B00050). A preliminary version of this research was presented at the 2012 International Conference on Asia Pacific Business Innovation and Technology Management and appeared in Lee and Kim (2012). References Adler, N., & Berechman, J. (2001). Measuring airport quality from the airlines' viewpoint: An application of data envelopment analysis. Transport Policy, 8(3), 171–181. Allen, R., Athanassopoulos, A., Dyson, R. G., & Thanassoulis, E. (1997). Weights restrictions and value judgments in data envelopment analysis: Evolution, development and future directions. Annals of Operations Research, 73(1), 13–34. Altuntas, S., Dereli, T., & Yilmaz, M. K. (2012). Multi-criteria decision making methods based weighted SERVQUAL scales to measure perceived service quality in hospitals: A case study from Turkey. Total Quality Management & Business Excellence, 23(11/12), 1379–1395. Andersen, P., & Petersen, N. C. (1993). A procedure for ranking efficient units in data envelopment analysis. Management Science, 39(10), 1261–1264. Babakus, E., & Boller, G. W. (1992). An empirical assessment of the Servqual scale. Journal of Business Research, 24(3), 253–268. Banker, R. D., Charnes, A., & Cooper, W. W. (1984). Some models for estimating technical and scale inefficiencies in data envelopment analysis. Management Science, 30(9), 1078–1092. Banker, R. D., & Morey, R. C. (1986). The use of categorical variables in data envelopment analysis. Management Science, 32(12), 1613–1627. Bansal, H. S., Irving, P. G., & Taylor, S. F. (2004). A three-component model of customer commitment to service providers. Journal of the Academy of Marketing Science, 32(3), 234–250. Bearden, W. O. (1983). Profiling consumers who register complaints against auto repair services. Journal of Consumer Affairs, 17(2), 315–335. Belton, V., & Vickers, S. P. (1993). Demystifying DEA: A visual interactive approach based on multi criteria analysis. Journal of the Operational Research Society, 44(9), 883–896. Bolton, R. N., & Drew, J. H. (1991). A multistage model of customers' assessments of service quality and value. Journal of Consumer Research, 17(4), 375–384.
Bouyssou, D. (1999). Using DEA as a tool for MCDM: Some remarks. Journal of the Operational Research Society, 50(9), 974–978. Brady, M. K., Cronin, J. J., Jr., & Brand, R. R. (2002). Performance-only measurement of service quality: A replication and extension. Journal of Business Research, 55(1), 17–31. Bretholt, A., & Pan, J. N. (2013). Evolving the latent variable model as an environment DEA technology. Omega, 41(2), 315–325. Brown, T. J., Churchill, G. A., Jr., & Peter, P. J. (1993). Improving the measurement of service quality. Journal of Retailing, 68(1), 127–139. Buttle, F. (1996). SERVQUAL: Review, critique, research agenda. European Journal of Marketing, 30(1), 8–32. Büyüközkan, G., & Çifçi, G. (2012). A combined fuzzy AHP and fuzzy TOPSIS based strategic analysis of electronic service quality in healthcare industry. Expert Systems with Applications, 39(3), 2341–2354.


Büyüközkan, G., Çifçi, G., & Güleryüz, S. (2011). Strategic analysis of healthcare service quality using fuzzy AHP methodology. Expert Systems with Applications, 38(8), 9407–9424. Calvo-Porral, C., Lévy-Mangin, J. P., & Novo-Corti, I. (2013). Perceived quality in higher education: An empirical study. Marketing Intelligence & Planning, 31(6), 601–619. Camp, R. C. (1989). Benchmarking: The search for industry best practices that lead to superior performance. Milwaukee: Quality Press. Camp, R. C. (1998). Global cases in benchmarking. Milwaukee: ASQ Quality Press. Carman, J. M. (1990). Consumer perceptions of service quality: An assessment of the SERVQUAL dimensions. Journal of Retailing, 66(1), 33–35. Carrillat, F. A., Jaramillo, F., & Mulki, J. P. (2007). The validity of the SERVQUAL and SERVPERF scales: A meta-analytic view of 17 years of research across five continents. International Journal of Service Industry Management, 18(5), 472–490. Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring efficiency of decision making units. European Journal of Operational Research, 2(6), 429–444. Chow, C. C., & Luk, P. (2005). A strategic service quality approach using analytic hierarchy process. Managing Service Quality, 15(3), 278–289. Cook, W. D., Kress, M., & Seiford, L. M. (1996). Data envelopment analysis in the presence of both quantitative and qualitative factors. Journal of the Operational Research Society, 47(7), 945–953. Cook, W. D., Tone, K., & Zhu, J. (2014). Data envelopment analysis: Prior to choosing a model. Omega, 44, 1–4. Cook, W. D., & Zhu, J. (2006). Rank order data in DEA: A general framework. European Journal of Operational Research, 174(2), 1021–1038. Cooper, W. W., Seiford, L. M., & Tone, K. (2000). Data envelopment analysis: Theory, methodology, and applications, references and DEA-solver software. Boston: Kluwer Academic Publishers. Cronin, J. J., Jr., & Taylor, A. S. (1992). Measuring service quality: A reexamination and an extension. Journal of Marketing, 56(3), 55–67. Cronin, J. J., Jr., & Taylor, A. S. (1994). SERVPERF versus SERVQUAL: Reconciling performance based and perception based – minus – expectation measurements of service quality. Journal of Marketing, 58(1), 125–131. Cui, C. C., Lewis, B. R., & Park, W. (2003). Service quality measurement in the banking sector Korea. International Journal of Bank Marketing, 21(4), 191–201. Donthu, N., Hershberger, E. K., & Osmonbekov, T. (2005). Benchmarking marketing productivity using data envelopment analysis. Journal of Business Research, 58(11), 1474–1482. Doyle, J., & Green, R. (1993). Data envelopment analysis and multiple criteria decision making. Omega, 21(6), 713–715. Dyson, R. G., & Thanassoulis, E. (1988). Reducing weight flexibility in date envelopment system. The Journal of Operational Research Society, 39(6), 563–576. Gale, B. T. (1994). Managing customer value: Creating quality and service that customers can see. New York: The Free Press. Hsieh, L. F., & Lin, Y. Y. (2008). A service quality measurement architecture for hot spring hotels in Taiwan. Tourism Management, 29(3), 429–438. Hudson, S., Hudson, P., & Miller, G. A. (2004). The measurement of service quality in the tour operating sector: A methodological comparison. Journal of Travel Research, 42(3), 305–312. Jain, S. K., & Gupta, G. (2004). Measuring service quality: SERVQUAL vs SERVPERF scales. Vikalpa, 29(2), 25–37. Kamakura, W. A., Mittal, V., de Rosa, F., & Mazzon, J. A. (2002). Assessing the service–profit chain. 
Marketing Science, 21(3), 294–317. Kettinger, W. J., & Lee, C. C. (1997). Pragmatic perspectives on the measurement of information systems service quality. MIS Quarterly, 21(2), 223–240. Kuo, M. S., & Liang, G. S. (2011). Combining VIKOR with GRA techniques to evaluate service quality of airports under fuzzy environment. Expert Systems with Applications, 38(3), 1304–1312. Ladhari, R. (2009). A review of twenty years of SERVQUAL research. International Journal of Quality and Service Sciences, 1(2), 172–198. Lee, H., & Kim, C. (2012). A DEA-SERVQUAL approach to measurement and benchmarking of service quality. Procedia - Social and Behavioral Sciences, 40, 756–762. Lee, H., Park, Y., & Choi, H. (2009). Comparative evaluation of performance of national R&D programs with heterogeneous objectives. European Journal of Operational Research, 196(3), 847–855. Liu, J. S., Lu, L. Y. Y., Lu, W.-M., & Lin, B. J. Y. (2013). A survey of DEA applications. Omega, 41(5), 893–902. Lovell, C. A. K., & Pastor, J. T. (1997). Target setting: An application to a bank branch network. European Journal of Operational Research, 98(2), 290–299. Lovell, C. A. K., & Pastor, J. T. (1999). Radial DEA models without inputs or without outputs. European Journal of Operational Research, 118(1), 46–51. Lupo, T. (2013). A fuzzy ServQual based method for reliable measurements of education quality in Italian higher education area. Expert Systems with Applications, 40(17), 7096–7110. Morey, R. C., Fine, D. J., Loree, S. W., Retzlaff-Roberts, D. L., & Tsubakitani, S. (1992). The trade-off between hospital cost and quality of care: An exploratory empirical analysis. Medical Care, 30(8), 677–698. Mukherje, A., & Nath, P. (2005). An empirical assessment of comparative approach to service quality measurement. Journal of Services Marketing, 19(3), 174–184. Olesen, O. B., & Petersen, N. C. (1995). Incorporating quality into data envelopment analysis: A stochastic dominance approach. International Journal of Production Economics, 39(1–2), 117–135. Paradi, J. C., & Zhu, H. (2013). A survey on bank branch efficiency and performance research with data envelopment analysis. Omega, 41(1), 61–79.


Parasuraman, A., Zeithaml, V. A., & Berry, L. L. (1985). A conceptual model of service quality and its implications for future research. Journal of Marketing, 49(3), 41–50. Parasuraman, A., Zeithaml, V. A., & Berry, L. L. (1988). SERVQUAL: A multiple item scale for measuring consumer perceptions of service quality. Journal of Retailing, 65(1), 12–40. Parasuraman, A., Zeithaml, V. A., & Berry, L. L. (1994). Reassessment of expectations as a comparison in measuring service quality: Implications for further research. Journal of Marketing, 58(1), 111–124. Pitt, L. F., Watson, R. T., & Kavan, C. B. (1997). Measuring information systems service quality: Concerns for a complete canvas. MIS Quarterly, 21(2), 209–222. Post, T., & Spronk, J. (1999). Performance benchmarking using interactive data envelopment analysis. European Journal of Operational Research, 115(3), 472–487. Quester, P. G., & Romaniuk, S. (1997). Service quality in the Australian advertising industry: A methodological study. Journal of Services Marketing, 11(3), 180–192. Ramanathan, R. (2006). Data envelopment analysis for weight derivation and aggregation in the analytic hierarchy process. Computers and Operations Research, 33(5), 1289–1307. Salinas-Jimenez, J., & Smith, P. (1996). Data envelopment analysis applied to quality in primary health care. Annals of Operations Research, 67(1), 141–161. Schneider, H. S. (2012). Agency problems and reputation in expert services: Evidence from auto repair. The Journal of Industrial Economics, 60(3), 406–433. Seol, H., Choi, J., Park, G., & Park, Y. (2007). A framework for benchmarking service process using data envelopment analysis and decision tree. Expert Systems with Applications, 32(2), 432–440. Seol, H., Lee, S., & Kim, C. (2011). Identifying new business areas using patent information: A DEA and text mining approach. Expert Systems with Applications, 38(4), 2933–2941. Seth, N., Deshmukh, S. G., & Vrat, P. (2005). Service quality models: A review. International Journal of Quality & Reliability Management, 22(9), 913–949. Sexton, T. R., Silkman, R. H., & Hogan, A. (1986). Data envelopment analysis: Critique and extensions. In R. H. Silkman (Ed.), Measuring efficiency: An assessment of data envelopment analysis. San Francisco: Jossey Bass.

Sherman, H. D., & Zhu, J. (2006). Benchmarking with quality-adjusted dea (q-dea) to seek lower-cost high-quality service: Evidence from a U.S. bank application. Annals of Operations Research, 145(1), 301–319. Shimshak, D. G., Lenard, M. L., & Klimberg, R. K. (2009). Incorporating quality into data envelopment analysis of nursing home performance: A case study. Omega, 37(3), 672–685. Soteriou, A. C., & Stavrinides, Y. (1997). An internal customer service quality data envelopment analysis model for bank branches. International Journal of Operations and Production Management, 17(8), 780–789. Spendolini, M. J. (1992). The benchmarking book. New York: American Management Association. Stewart, T. J. (1996). Relationships between data envelopment analysis and multicriteria decision analysis. The Journal of the Operational Research Society, 47(5), 654–665. Talluri, S. (2000). Data envelopment analysis: Models and extensions. Decision Line, 31(3), 8–11. Teas, K. R. (1993). Expectations, performance evaluation, and consumers’ perception of quality. Journal of Marketing, 57(4), 18–34. Teas, K. R. (1994). Expectations as a comparison standard in measuring service quality: An assessment of reassessment. Journal of Marketing, 58(1), 132–139. Thompson, R. G., Langemeir, L. N., Lee, C., Lee, E., & Thrall, R. M. (1990). The role of multiplier bounds in efficiency analysis with application to Kanssa farming. Journal of Econometrics, 1(2), 93–108. Van Dyke, T. P., Kappelman, L. A., & Prybutok, V. R. (1997). Measuring information systems service quality: Concerns on the use of the SERVQUAL questionnaire. MIS Quarterly, 21(2), 195–208. Yang, T., & Kuo, C. (2003). A hierarchical AHP/DEA methodology for the facilities layout design problem. European Journal of Operational Research, 147(1), 128–136. Zhou, L. (2004). A dimension-specific analysis of performance-only measurement of service quality and satisfaction in China’s retail banking. Journal of Services Marketing, 18(6/7), 534–546.
