Database Segmentation using Share of Customer

Edward C. Malthouse

Paul Wang

April 1, 1998

The authors thank Tom Collinger and Frank Mulhern for many helpful suggestions. A revised version of this paper will appear in the Journal of Database Marketing.


Abstract

We discuss why it is important for database marketers to track information on share of customer and introduce a new reason for tracking it, which we call silent attrition. We show that share of customer can be a powerful segmentation variable by developing a conceptual framework for using it as a basis of segmentation. We discuss strategic objectives for possible segments, how to gather information on share of customer, and analytical methods for using it to develop segmentations. An empirical analysis illustrates our new approach.

1 Share of Customer

Segmentation is one of the tenets of modern marketing, which is based on the fact that customers usually have varying needs and wants. When this is the case, a company should subdivide the market into homogeneous groups, or segments. It can then adopt specific strategies and develop marketing programs for each segment. A company that targets a specific market segment can offer a product or service that meets the needs of the segment more closely than a mass marketer can. The price, product, promotions, and distribution channels can all be tailored accordingly. This can develop a strong sense of loyalty to the product among members of the segment, making it more difficult for competitors to entice them to switch.

Segmentations have been defined based on many different variables including geographic, demographic, psychographic, and behavioral attributes (Kotler, 1997, Ch. 9). For example, one basis of behavioral segmentation is usage rate. Customers can be divided into groups of current nonusers, light users, and heavy users. Once these groups of customers have been identified, the marketer can design strategies and tactics for each. The company might wish to adopt a retention strategy with the heavy users. It wants to know why the customers are heavy users and then do things that will keep this group happy. It might also adopt an aggressive strategy to increase usage from light users, and a strategy to generate trial among the nonusers.

Database marketers have the ability to develop segmentations and implement programs with unprecedented sophistication. This is due to two advantages over traditional marketers. First, database marketers have rich and detailed information about each of their customers. Companies that maintain a database of their customers often record every transaction and contact with each customer. This vast information can give them a deep understanding of their customers and allow the company to define precise segments in their database. The second advantage is that companies that have a database can target their customers with great accuracy and efficiency. The same information that can be used for developing segmentations can also be used in implementing marketing programs.

One key piece of information that many database marketing companies can track is share of customer: of the total number of purchases that each customer makes within a given category, what percent is bought from our company? This is valuable information for a company to track for several reasons. First, many companies are shifting their objectives from high market share to high share of customer. These companies have realized that simply increasing market share may or may not positively affect their bottom line. In increasing market share, companies often acquire customers who will not be loyal customers and will switch to a competitor with little inducement. This frequently happens before the acquisition costs have been recovered. Even though the company has, at least temporarily, increased its market share, its bottom line does not necessarily improve (Reichheld, 1996). The solution to this problem is to attract customers who will remain loyal and then offer them products and services that satisfy their needs. When a company has earned the trust of its customers, the customers will want to do business with the company and make a high percent of their purchases in a given category from that company. From the customer's point of view, it is more convenient to work with a company that they trust than it is to start a new relationship with another company.

Another reason to track share of customer is what we call silent attrition. A company may be losing customers, but in a silent way. Silent attrition occurs when customers increase their purchases in a given category, without increasing the amount purchased from a particular company. In this case, the company does not lose the customer entirely; instead it slowly loses relevance. An example may help clarify this concept. Based on data examined in this paper, consider the category of casual fine-dining restaurants and suppose we operate one restaurant in this category. Suppose there is an individual who dines at our restaurant twice a month, and dines at casual fine-dining restaurants a total of four times per month, on average. Thus we have a fifty percent share of customer visits in this category. Now suppose this customer takes a different job requiring more entertaining, and she increases her number of visits to casual fine-dining restaurants to 15 times a month, but continues to visit our restaurant only twice a month. Our share of customer has dropped from 50 percent to about 13 percent (2 of 15 visits), even though she still dines with us twice a month. This type of attrition is not detected unless a share-of-customer approach is used. Many companies measure customer count: how much business do we get from each customer? How many customers do we have? But these figures would not detect the silent attrition. Our product has become less relevant to this customer and we are in danger of losing her entirely. It is important to detect silent attrition early and take marketing action to regain relevance.

Companies in many industries are recognizing the importance of share of customer, but they use different names. Various restaurant chains such as McDonald's refer to it as share of stomach. The car industry calls it share of garage. The fashion industry calls it share of closet. Financial service companies call it share of wallet. Package goods companies such as Procter & Gamble call it share of requirement. The media industry calls it share of eyeballs and also share of advertising dollars. The non-profit industry calls it share of heart. Companies that serve hobbyists call it share of mind. Consulting firms call it share of client. Thus, share of customer is a very important concept that is attracting interest in many industries.

Our goal is to develop ways to segment markets that are based on the share-of-customer approach. In our view, what is needed is a systematic approach for developing segmentation based on share of customer. In this paper we develop a conceptual framework for these segmentations; we discuss the database requirements to develop these segmentations and implement programs successfully; we present analytical methods; and we illustrate our approach with a business case.
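The silent-attrition example from this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names `share_of_customer` and `silent_attrition` are our own labels, not part of any database product, and the rule used to flag silent attrition (purchases from us did not fall, yet our share fell) is one reasonable reading of the definition given above.

```python
def share_of_customer(our_purchases: float, category_purchases: float) -> float:
    """Fraction of a customer's category purchases that are made with us."""
    if category_purchases <= 0:
        raise ValueError("category_purchases must be positive")
    return our_purchases / category_purchases


def silent_attrition(our_before, cat_before, our_after, cat_after) -> bool:
    """Flag the case where our volume did not fall but our share did (assumed rule)."""
    share_before = share_of_customer(our_before, cat_before)
    share_after = share_of_customer(our_after, cat_after)
    return our_after >= our_before and share_after < share_before


# The restaurant example: 2 of 4 monthly visits, then 2 of 15.
before = share_of_customer(2, 4)
after = share_of_customer(2, 15)
flagged = silent_attrition(2, 4, 2, 15)
```

A simple count of visits to our restaurant is 2 in both periods, so only the share calculation exposes the change.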

2 Conceptual Framework

In order to develop share-of-customer segmentations, database marketers need two key pieces of information: (1) each customer's consumption of the company's product or service; and (2) each customer's consumption of products or services from other companies in the category. Most companies that maintain a marketing database will track the first variable, but few currently track the second. If either of the two pieces of information is missing, it can be added by using a sample survey. The company could draw a sample of customers, ask them about their purchasing habits in the category, develop a predictive model that estimates the two variables from other variables in the database, and then apply the model to the remainder of the database. For example, using the sample data, the company could build a model predicting category purchases from other information in the database such as credit card usage, travel habits, number of earners in the family, and purchase history, and then score the remainder of the database with estimates of category purchases.

In some cases marketers already have information on both variables. One example is supermarket chains that have preferred-shopper programs, which record all the purchases at stores in the chains made by members of the program. These chains could use these databases to segment, for example, buyers of their private-label products. Airlines have begun to ask questions on these two variables on surveys of frequent-flyer customers. Some financial service companies make information on these two variables available in certain categories. For example, hotels can learn how much each of their customers use their American Express cards at other hotels.
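The survey-then-score procedure described above can be sketched as follows. This is a minimal illustration under strong assumptions: a single predictor (a hypothetical database field we call `card_spend`), an ordinary least-squares line as the predictive model, and made-up numbers. The paper does not specify the model form, and a real application would likely use several predictors and a holdout check.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


# Hypothetical surveyed sample: a database variable and the surveyed
# number of category purchases for the same customers.
sample_card_spend = [100, 200, 300, 400]
sample_category_purchases = [3, 5, 7, 9]
a, b = fit_line(sample_card_spend, sample_category_purchases)

# Score the unsurveyed remainder of the database with estimated category purchases.
remainder_card_spend = [150, 350]
estimated_category_purchases = [a + b * x for x in remainder_card_spend]
```

The estimated category purchases then play the role of the second variable for customers who were not surveyed.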

Using these two variables, we propose the conceptual grid displayed in Figure 1. Customers should be ranked based on their values of the two variables and classified into one of four segments. The first cell, "Heavy other, Heavy us (HH)," consists of customers who are heavy users of our product or service and heavy users of other companies' products or services in the category. The other three segments are "Light other, Heavy us (LH)," "Heavy other, Light us (HL)," and "Light other, Light us (LL)." The grid can easily be expanded to have more segments (for example, a nine-segment solution with all combinations of nonusers, light users, and heavy users), but for now we shall focus on this four-segment solution.

                              Usage of other products or services in category
                              Low                           High
Usage of our        High      Light other, Heavy us (LH)    Heavy other, Heavy us (HH)
products or         Low       Light other, Light us (LL)    Heavy other, Light us (HL)
services

Figure 1: Conceptual grid of four share-of-customer segments.

Having determined that there are four distinct groups of customers, the marketing manager has the opportunity to tailor marketing strategies and programs to specific segments. We discuss each segment separately:

1. Light other, heavy us (LH). The members of this segment are usually heavy users in the category and they like the product or service offered by our company. They are very loyal to us. It is likely that whenever they make a purchase from our category, they think of us first. This group will typically account for a large fraction of our revenues. The primary goal for this group should be retention. We must keep this group satisfied and returning. We should use market research to understand what we have done to attract these customers and what we need to do to retain them. We should not use marketing programs that offer price breaks to this group, unless we know they are particularly price sensitive. Instead, we should usually spend money on improving service and leverage our database to ensure our offer remains uniquely relevant to this group. We should communicate that we appreciate their business. For example, airlines give their best customers preferred seating, allow them to board before other passengers, and offer them other service perks such as showers after transatlantic flights. Some high-end department stores offer private showings of new fashions to their best customers as a way of showing their appreciation to this group and adding value without cutting price. Other department stores allow their best customers to visit their stores the evening before a big sale so that these customers will have the first pick of the reduced merchandise. It is usually difficult to increase the amount of revenue we get from this group because our share of customer is already high. Thus we would first have to increase their consumption in the category before we could increase their productivity. One exception is cross-selling them to product lines in other categories offered by our company.

2. Heavy other, heavy us (HH). The members of this segment are heavy users in the category and like the product or service offered by our company, but they buy from many different companies in the category. There are many possible explanations for this behavior and often there will be subsegments within this group that are switchers for different reasons. For example, they could be variety seekers by nature, or they could be switchers because of price, promotions, or distribution. Returning to the casual fine-dining example, perhaps we have a restaurant near an individual's work location, but none near the individual's home; therefore, it is inconvenient to visit our restaurant for non-work occasions. The HH group usually accounts for a substantial percentage of a company's revenues and therefore is an important group to retain. As with the LH group, our primary goal should be retention and we should use similar market research and programs with this group. A secondary objective should be to increase our share of customer with subsegments of this group by getting them to make more purchases from us instead of from our competitors. The share of customer may increase for one of the subsegments as the trust and relevance between a customer and us increases. We would like to migrate this subsegment into the LH segment. There may be other subsegments that are inherently variety seekers and will always buy from competitors as well as from us.

3. Heavy other, light us (HL). Members of this group are heavy purchasers in the category, but not from us. Because this group already purchases heavily in the category, it is usually our best opportunity for growth. Therefore we should consider adopting an offensive strategy to increase share of customer, at least for subsegments of this group. We should try to migrate these customers into the HH segment. Before we do this we need to understand the competitor's strengths and weaknesses, and the customer's needs and wants. Why hasn't this group bought from us in the past? Are there barriers that prevent them from buying from us more often? We do not necessarily want to adopt this strategy for the entire group because there are most likely subsegments that are not at all attractive. One subsegment is the "cherry pickers," who make their purchase decisions based on price and are not likely to become loyal customers. Spending marketing budget on this group will often be unprofitable because they are deal-prone, price-sensitive, and likely to attrite as soon as

a competitor makes a better offer. Another subsegment that we may not want to target is our competitor's best customers. In many cases our product will be positioned differently from a competitor's and may not be attractive to this group unless we change our positioning. We must also be prepared for a competitive response if we begin to target this subsegment.

4. Light other, light us (LL). This segment does not currently make many purchases from our category. Before we can increase our business from this group, we must first increase their purchases from the category. In most cases this is very difficult, but there can be subsegments in this group that are exceptions. For example, new customers will often first show up in the LL segment. These are people who, for one reason or another, have not made purchases in a particular category in the past, but might become heavy users in the future. The usual reason for this is a change in lifestyle or lifestage. For example, the set of products and services consumed by a person after he retires is usually different from the set he consumed before retirement. First-time parents have much different needs after the child is born than before. It is important to identify members of this subsegment early, offer them a product or service that meets their needs, gain their trust, and migrate them into the LH segment. There may be other attractive subsegments in the LL group. Companies should use market research to understand why various members of this segment do not currently make many purchases in the relevant category.
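The four-cell grid and the primary objective discussed for each segment can be expressed as a small lookup. The sketch below is hypothetical in its details: the cut points `us_cut` and `other_cut` are placeholders that a company would choose from its own data (the paper ranks customers but does not prescribe thresholds), and the one-line objectives are our condensed paraphrase of the discussion above.

```python
# Condensed paraphrase of the primary objective discussed for each segment.
OBJECTIVES = {
    "LH": "retain; add value through service rather than price",
    "HH": "retain; raise share of customer and migrate subsegments to LH",
    "HL": "selectively pursue; migrate attractive subsegments to HH",
    "LL": "identify emerging heavy users; grow category usage first",
}


def classify(us: float, other: float, us_cut: float, other_cut: float) -> str:
    """Return the two-letter segment code: first letter is usage of other
    companies (L or H), second letter is usage of us (L or H)."""
    other_level = "H" if other > other_cut else "L"
    us_level = "H" if us > us_cut else "L"
    return other_level + us_level


# Example with hypothetical cut points on a five-point scale.
segment = classify(us=5, other=1, us_cut=3, other_cut=3)
objective = OBJECTIVES[segment]
```

Extending the grid to nine cells would only require a third level (nonuser) for each variable.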

3 Empirical Analysis

We illustrate our new approach to segmentation with an example from the restaurant industry. We examine a national restaurant chain in the casual fine-dining category, which is an emerging segment of the restaurant market offering high-quality, flavorful food in a casual atmosphere. The average lunch check size per person for restaurants in this category is usually $10-15. The chain we examine has been building a database of its customers with customer satisfaction surveys and tear-off cards used to enroll in their birthday club. The main use of the database is for direct marketing programs such as sending announcements of new products. The surveys are also used to monitor customer satisfaction.

After dining at the restaurant, customers can complete a short survey with questions about their dining experience and general dining habits. The survey also asks for their name and address, and gives the respondent the option of not receiving mailings from the restaurant. The survey includes two questions that measure share of stomach: (1) "How many times have you visited [this restaurant chain]?" and (2) "How many times have you dined at any other casual dining restaurant in the past 30 days?" We will use these two questions to develop a share-of-stomach segmentation of the database.

The company gathered a total of 16,386 surveys during 1996, which we use to develop the segmentation. Because these customers were selected as a convenience sample, they may not be representative of the population of customers. If there are systematic differences in the dining habits between customers who enter the database with the satisfaction surveys and those who enter with the tear-off cards, then this group is not even representative of the corporate database. Despite these limitations, the results presented in this section make intuitive sense and seem useful. This is particularly encouraging because the databases collected by nontraditional direct marketing companies often have these limitations. These companies include retailers, companies that sell their products through retail channels, restaurants, and many service providers. Because corporate databases are often not representative of the overall customer base, database marketers should be aware of the possible biases and factor them into the final decision-making process.

The first question, number of times visited the chain, was measured on a five-point scale and the second question, number of times other, was measured as a count. Some of the analytical methods we will apply to the data require that these variables have commensurate units. Therefore we computed quintiles for the second question to put it on a five-point scale also. The relationship between the raw counts and quintiles is given in Table 1.

Table 1: The relationship between the counts and quintiles for the question "How many times have you dined at any other casual dining restaurant in the past 30 days?"

Quintile   Range          Percent
1          0-3 times      18.6%
2          4-5 times      21.8%
3          6-9 times      13.5%
4          10-14 times    23.3%
5          15+ times      22.9%
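The recoding in Table 1 can be written as a short lookup. The sketch below hard-codes the upper bounds from Table 1; the function name `other_visits_quintile` is ours, and in practice the bounds would be recomputed from the data rather than fixed.

```python
import bisect

# Inclusive upper bounds of quintiles 1 to 4 from Table 1; 15 or more falls in quintile 5.
UPPER_BOUNDS = [3, 5, 9, 14]


def other_visits_quintile(count: int) -> int:
    """Map a raw count of visits to other casual dining restaurants
    in the past 30 days onto the five-point scale of Table 1."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return bisect.bisect_left(UPPER_BOUNDS, count) + 1


recoded = [other_visits_quintile(c) for c in (0, 3, 4, 9, 10, 14, 15, 40)]
```

Because the recoding depends only on the rank order of the counts, a single extreme count such as 40 lands in the same category as 15.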

A plot of the joint distribution of the two variables is shown in Figure 2. Visual inspection of the plot reveals four local modes, indicating four concentrations of customers:

1. Light other, heavy us: eats at the chain very frequently, but not at other restaurants in this category very often.
2. Heavy other, heavy us: eats at both the chain and other restaurants in this category very often.
3. Heavy other, moderate us: eats at the chain moderately often, but other casual dining restaurants very often.
4. Light other, light us: does not eat at casual dining restaurants very often.

[Figure 2 plot: surface of Prob(x, y) over the axes "Number times us" and "Number times other," each on a 1-5 scale.]

Figure 2: The joint distribution of the number of times that someone has dined at the chain and at other casual dining restaurants during the past 30 days.

An alternative method of finding concentrations of customers is to cluster analyze the two variables. We suggest using k-means clustering, which selects clusters so that the sum of within-cluster variation is minimized. Equivalently, it maximizes the between-cluster variation, since the total variation in the data, which is fixed, can be decomposed into the sum of within- and between-cluster variations (Everitt, 1993, ch. 5). One property of a good segmentation is that the members of each segment be similar (i.e., small within-segment variation). Another is that the segments be different from each other (i.e., large between-segment variation). Therefore, the k-means objective function is a reasonable one to use for developing segmentations. Another virtue of k-means is that the computational effort required to compute a solution is proportional to the number of observations (O(n), where n is the sample size). This feature is particularly important to database marketers, who usually have a large number of observations.

We ran a k-means cluster analysis on the two variables, computing the 2-8 cluster solutions. The summary statistics for the cluster solutions are given in Table 2. The F statistics give the ratio of between- and within-cluster variances. If a cluster solution "fits" the data, the between-cluster variance should be large, the within-cluster variance should be small, and we should see a spike in the F statistic. The F statistics show a spike at the 4 cluster solution, which confirms the intuition developed with our visual inspection of the density estimate in Figure 2. The CCC column gives the cubic clustering criterion (Sarle, 1983) for each cluster solution. Spikes in the CCC also indicate that the cluster solution fits the data, and a value greater than 2-3 indicates a good clustering (Sarle, 1983, p. 49). The R² column gives the fraction of variation explained by the model. The R² values increase steadily until the four-cluster solution, and then increase more slowly. This also indicates that the four-cluster solution fits the data.

Table 2: Summary statistics for the 2-8 cluster solutions.

# Clusters   F          CCC      R²
2            11193.59   16.11    .4221
3            13901.11   -12.97   .6447
4            18772.70   27.09    .7861
5            17709.91   27.34    .8222
6            18920.78   18.83    .8606
7            18640.51   25.16    .8795
8            19803.71   32.89    .9005

The cluster means for the four-cluster solution are given in Table 3. The four clusters correspond to the modes we found from our visual inspection of Figure 2.

Table 3: Means from the four-cluster solution.

Segment   Count   Percent   Other     Us        Description
LH        3881    25.3%     1.68412   3.99197   Light other, Heavy us
HH        3908    25.5%     4.31504   4.74672   Heavy other, Heavy us
HM        4795    31.3%     4.17917   2.32942   Heavy other, Moderate us
LL        2741    17.9%     1.66160   1.43330   Light other, Light us

The objective function for the k-means method is

$$\min \sum_{i=1}^{n} \lVert x_i - \bar{x}_{\gamma_i} \rVert^2,$$

where $x_i$ is a $2 \times 1$ vector containing the two share-of-customer measurements for customer $i = 1, \ldots, n$, $\bar{x}_h$ is the cluster centroid for cluster $h = 1, \ldots, k$, and $\gamma_i$ is the cluster number to which individual $i$ is assigned ($\gamma_i \in \{1, \ldots, k\}$). By using an objective function that assigns equal weight to each variable, the k-means method implicitly assumes the variables have commensurate units. The importance of either share-of-customer variable, and hence the final cluster solution, can be changed by expressing the variable in different units. For example, suppose one of the variables is measured in centimeters. This variable's contribution to the objective function would be increased by simply re-expressing this variable in millimeters, and thus this variable would have greater influence on the final clusters. The usual solution to this problem is to scale the variables appropriately before the analysis whenever the original variables have different units (Venables and Ripley, 1994, p. 312). The most common approach to scaling is to standardize all variables to have mean zero and variance one. This approach, however, is very sensitive to outliers and does not work well with count data, as we have here. An approach that is often used with count data is to take logarithms, but this transformation is not robust to outliers either. Therefore we suggest using a form of ranking such as quintiles or deciles when the data contain outliers. The k-means cluster solutions are invariant under monotonic transformations of the variables with this scaling, and are thus robust to outliers.

It is important to use market research to learn more about each segment. This can help us refine our strategies and marketing communications. To illustrate this, we profiled the segments on other questions from the survey. One of the questions was "What one area would you like to see improved at [the chain]?" The possible responses to this question were "decor," "taste of food," "service," "new items," "value," and "other." We computed a cross tabulation and correspondence analysis map to understand the relationship between these two variables. The map is shown in Figure 3. Both approaches showed that the two segments containing people who dine at this chain frequently believe that it should continue to offer new items. Those who are heavy users of other casual restaurants, but moderate users of this one, are more inclined to suggest improving the taste of the food, the decor, or the value. The light users were most likely to check the value box. This is valuable information for developing programs or marketing communications for the groups.
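The k-means clustering step and two of the fit statistics used in this section (R² and the F statistic) can be sketched in plain Python. Several things here are assumptions rather than the authors' procedure: the iteration is the standard Lloyd algorithm, initial centroids are supplied by the caller so the run is deterministic, the F statistic is taken to be the usual pseudo-F, (between / (k - 1)) / (within / (n - k)), and the toy data are invented. The CCC is not reproduced.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(c) / n for c in zip(*pts))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda h: dist2(p, centroids[h]))
            for p in points]


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(float(v) for v in c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new = []
        for h in range(len(centroids)):
            members = [p for p, g in zip(points, labels) if g == h]
            # Keep the old centroid if a cluster happens to be empty.
            new.append(mean_point(members) if members else centroids[h])
        if new == centroids:
            break
        centroids = new
    return centroids, assign(points, centroids)


def fit_stats(points, centroids, labels):
    """R-squared and pseudo-F for a cluster solution (assumed definitions)."""
    n, k = len(points), len(centroids)
    grand = mean_point(points)
    total = sum(dist2(p, grand) for p in points)
    within = sum(dist2(p, centroids[g]) for p, g in zip(points, labels))
    r2 = 1 - within / total
    pseudo_f = ((total - within) / (k - 1)) / (within / (n - k))
    return r2, pseudo_f


# Toy (other, us) pairs on five-point scales, three customers near each mode.
points = [(1, 4), (1, 5), (2, 4), (5, 5), (4, 5), (5, 4),
          (4, 2), (5, 2), (4, 3), (1, 1), (2, 1), (1, 2)]
centroids, labels = kmeans(points, init_centroids=[(1, 4), (5, 5), (4, 2), (1, 1)])
r2, pseudo_f = fit_stats(points, centroids, labels)
```

As a rough consistency check on the assumed pseudo-F form, inserting the four-cluster R² from Table 2 (.7861) and n = 15,325 (the sum of the counts in Table 3) gives approximately 18,770, close to the reported 18772.70; the small gap is plausibly due to rounding of R². In practice one would repeat the run for k = 2 to 8 and from several starting points, since Lloyd's algorithm finds only a local minimum.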

[Figure 3 plot: points labeled LH, HH, HM, and LL (segments) and service, value, new items, decor, taste of food, and other (responses).]

Figure 3: Correspondence analysis map showing the relationship between segment membership and the one attribute the chain should improve.

As a second illustration of how market research can be used, we also compare the segments on their satisfaction with the restaurant chain. The survey included 13 questions concerning customer satisfaction on various attributes. We analyzed the 13 questions with factor analysis and found three factors. The first factor, food, included questions concerning the taste and freshness of the food, the portion size, and the value for the money. The second factor, service, included questions concerning the speed, attentiveness, attitude, knowledge, and helpfulness of the server. The third factor, atmosphere, included questions regarding the attractiveness of the decor, cleanliness, and comfort level. The values of coefficient alpha for the three factors were .8739, .9217, and .9177, respectively, indicating that the three scales give reliable measurements of the underlying dimensions. The means of the three dimensions of satisfaction are given in Table 4. Segments containing heavy users of the chain tend to be more satisfied with the food than the other segments. The differences between the segments on atmosphere and service are much smaller. Under the assumption that the database comprises a random sample of customers, we performed Tukey's studentized range test for the pair-wise comparisons of each variable. Using a type I error rate of .05, the two segments that are heavy users of this chain were not different from one another, but were different from the other two segments.

Table 4: Profiles of the four clusters on satisfaction with food, atmosphere, and service.

Segment   Food      Atmosphere   Service
LH        4.35203   4.14639      4.55684
HH        4.32284   4.10823      4.57054
HM        4.17394   4.05009      4.49519
LL        4.18314   4.12426      4.50848

Table 5 gives the values of three different measures of share for the four segments. The first measure of share is the "casual dining share," which is the fraction of casual dining experiences during the last 30 days that were at the chain: (Number times us) / (Number times us + Number times other). The second measure is "share of stomach." If we assume that a person has 60 opportunities to have a casual dining experience in 30 days, share of stomach is (Number times us)/60, or the fraction of meals over the past 30 days at the chain. The third measure gives the fraction of total visits to the chain accounted for by each segment. The fourth column gives the total number of times that each segment visited the chain. The fifth column gives the percent of visits to the chain for each segment. For example, 47% of the visits to the chain were from members of the HH segment.

Table 5: Three measures of share for the segments.

Segment   Casual Dining Share   Share of Stomach   Number of Visits to Chain   Percent of Visits to Chain
LH        38.4%                 4.6%               7417                        28.7%
HH        19.5%                 6.3%               12124                       47.0%
HM        13.2%                 3.5%               4748                        18.4%
LL        30.6%                 2.7%               1528                        5.9%
Total                                              25817                       100.0%

The strategic objectives for each of the segments should be as we outlined in the previous section. The restaurant should try to retain members of LH and HH. They should also try to migrate subsegments from the HH segment into the LH group by increasing their casual dining share. They should segment the HM segment further and develop programs to migrate selected subsegments into the HH group. The LL segment is unusual. They are very loyal to the chain: 30.6% of the times that this group dines casually are at the chain. But this group does not eat out very often and only accounts for 5.9% of the total visits to the chain. This is a group that does not go to casual dining restaurants very often, but when they do go they are likely to choose the chain. Their loyalty indicates they like the chain's concept, but do not find it to be a good value. The restaurant should investigate why this group does not visit casual dining restaurants very often. If it can succeed in getting this group to come more often, it is likely that this group would remain loyal to the restaurant.

4 Discussion

The share-of-customer concept has been getting much attention from the marketing community at large. Many companies are refocusing the objectives of their marketing efforts on achieving a high share of customer instead of market share alone. They have found that focusing on share of customer can have a positive impact on their bottom lines. We argue that share of customer can also be a powerful basis for segmentation, which is of particular interest to database marketers because many companies have, or can get access to, information on this subject. This paper proposes a conceptual framework for segmentations by share of customer, describing a priori what segments we expect to exist, and discussing strategic objectives for each segment. We give an example from the restaurant industry in support of our approach. By using a corporate database with our segmentation approach, companies can manage their relationships with their customers more effectively. They can adopt appropriate migration or retention strategies and develop more relevant creative strategies. We conclude that this will lead to more business and better profitability.

In the process of collecting customer-share information, marketers should be conscientious about privacy concerns. We exhort marketers to exercise sound judgement and to respect their customers' rights to privacy. The goal in collecting share-of-customer information is not to invade customers' privacy, but rather to build a better relationship with them so that their needs can be served better and, at the same time, the company can improve its profitability. Both sides must benefit from a company having this information to justify its practice. Ideally the company should have its customers' consent to do this.

References

DeSarbo, W. S. and Ramaswamy, V. (1994). CRISP: Customer response based interactive segmentation procedures for response modeling in direct marketing. Journal of Direct Marketing, 8(3):7-20.

Everitt, B. S. (1993). Cluster Analysis. Arnold, London, third edition.

Jones, T. O. and Sasser, W. E., Jr. (1995). Why satisfied customers defect. Harvard Business Review, 56:83-95.

Kotler, P. (1997). Marketing Management: Analysis, Planning, Implementation, and Control. Prentice Hall, Englewood Cliffs, New Jersey, ninth edition.

Levin, N. and Zahavi, J. (1996). Segmentation analysis with managerial judgment. Journal of Direct Marketing, 10(3):28-47.

Milne, G. R. and Gordon, M. E. (1994). A segmentation study of consumers' attitudes toward direct mail. Journal of Direct Marketing, 8(2):45-52.

Reichheld, F. F. (1996). The Loyalty Effect. Harvard Business School Press, Boston.

Sarle, W. S. (1983). SAS technical report A-108, cubic clustering criterion. Technical report, SAS Institute, Cary, NC.

Venables, W. N. and Ripley, B. D. (1994). Modern Applied Statistics with S-Plus. Springer, New York.

Wayland, R. E. and Cole, P. M. (1997). Customer Connections. Harvard Business School Press, Boston.