DAVIDE GIACALONE
DEPARTMENT OF FOOD SCIENCE PHD THESIS 2013 · ISBN 978-87-7611-641-5
university of copenhagen
DAVIDE GIACALONE Consumers’ perception of novel beers Sensory, affective, and cognitive-contextual aspects
Cover photo courtesy of David Arky Photography
Title: Consumers’ perception of novel beers: Sensory, affective, and cognitive-contextual aspects
Main supervisor: Assoc. Prof. Michael Bom Frøst, Sensory and Consumer Science Section, Dept. of Food Science, Faculty of Science, University of Copenhagen, Denmark
Co-supervisor: Prof. Wender Laurentius Petrus Bredie, Sensory and Consumer Science Section, Dept. of Food Science, Faculty of Science, University of Copenhagen, Denmark
Opponents:
Assoc. Prof. Derek Victor Byrne (chairman), Sensory and Consumer Science Section, Dept. of Food Science, Faculty of Science, University of Copenhagen, Denmark
Dr. John Prescott, TasteMatters Research and Consulting, Australia
Prof. Liisa Lähteenmäki, Centre for Research on Customer Relations in the Food Sector (MAPP), Dept. of Business Administration, School of Business and Social Sciences, Århus University, Denmark
Date of PhD defense: October 4th, 2013
Ph.D. Thesis 2013 - © Davide Giacalone ISBN 978-87-7611-641-5 Printed by SL Grafik, Frederiksberg, Denmark (www.slgrafik.dk)
Table of Contents
Preface
Abbreviations
Abstract
Resumé (Danish Abstract)
Riassunto (Italian Abstract)
Introduction and objectives
  Overview of studies
List of publications
1. Setting the scene
  1.1. Characteristics and trends in the Danish beer market
  1.2. Increasing competition and the need for consumer-oriented innovation
  1.3. References
2. Collecting sensory responses to beers: A consumer-centric approach
  2.1. Sensory analysis and product development
  2.2. Rapid descriptive methods and consumer involvements in sensory tests
  2.3. The use of consumers for profiling of beers
  2.4. Check-all-that-apply (CATA)
    2.4.1. Ballot design and descriptor selection in CATA profiles for beer
  2.5. Napping
  2.6. Assessment of sensory data from consumer panels
    2.6.1. Discrimination
    2.6.2. Reliability and validity
  2.7. Method differences and practical recommendations
  2.8. Future directions
  2.9. References
3. Hedonic responses to beers among Danish consumers
  3.1. From sensory perception to hedonic response
  3.2. The case for preference mapping
  3.3. Partial Least Squares Regression approaches to preference mapping
    3.3.1. Three-blocks extensions
  3.4. All-in-One Test (AI1)
    3.4.1. Generalizability
    3.4.2. Potential biases in hedonic judgments
  3.5. Determinants of acceptance of novel beers
    3.5.1. Most Advanced Yet Acceptable
    3.5.2. The collative motivational model: Predictive of consumers’ preferences for beers?
    3.5.3. Novelty value in beers: What consumers say
  3.6. Practical implications
  3.7. Future directions
  3.8. References
4. Contextual acceptance of novel and familiar beers: An investigation using the situational appropriateness framework
  4.1. Contextual variables and food-related consumer behavior: An overview
  4.2. Cognitive-contextual influences on beer consumption
  4.3. The situational appropriateness construct
  4.4. Product-context associations: novel and familiar beers
  4.5. Practical implications
  4.6. Future directions
  4.7. References
Conclusions
Appendix 1 (Papers)
  Paper I
  Paper II
  Paper III
  Paper IV
  Paper V
  Paper VI
  Paper VII
Appendix 2 (Activities)
  Summary of activities during the PhD
    Courses overview
    Supplemental short courses and workshops
    Conference contributions (international only)
    Teaching activities
    Honors and awards
Acknowledgements
Preface
This thesis is submitted in partial fulfillment of the requirements for the Ph.D. degree in Sensory Science at the Department of Food Science (FOOD), University of Copenhagen. It focuses on understanding consumers’ perception of novel beers from a sensory, affective, and cognitive-contextual perspective.
The work was funded by the Danish Agency for Science, Technology and Innovation, through the consortium Danish Microbrew – Product Innovation and Quality, and by the Danish Ministry of Economic and Business Affairs, through the projects Local Foods in Denmark. Additional support was provided by the Faculty of Science, University of Copenhagen.
The scientific project participants are affiliated with different institutions, including academic groups (FOOD-Sensory Science and FOOD-Microbiology), technical institutes (Agrotech – Institut for Jordbrugs- og FødevareInnovation, and DHI – Afdeling for By og Industri), Danish breweries (Nørrebro Bryghus, Indslev Bryggeri A/S, Thisted Bryghus A/S, Skands Bryggeri, Baldersbrønde Bryggeri, Herslev Bryghus and Grauballe/Iisgaard), and other companies operating within the brewing industry, both in Denmark and abroad (Weurmann Specialty Malts, Holvrieka Danmark A/S, Sinus Automation A/S, Nordic Seed A/S, Bactoforce A/S, Ferm Laboratorium, and Givaudan).
Most of the studies included in this thesis were conducted in the Sensory Science group, Faculty of Science, or at the sites of participating partners. The last four studies were carried out at the New Zealand Institute for Plant and Food Research in Auckland, New Zealand, during a research stay abroad between September 2012 and February 2013. All study details are given in the respective papers.
The thesis is structured as follows:
Abstract: in English, Danish, and Italian.
Introduction and aims: this section describes the overall research frame and the scientific aims addressed. It is followed by an overview of the studies that were carried out during the PhD and a list of the resulting publications.
Chapter 1: this is an introductory chapter that discusses relevant trends in the Danish beer market, the challenges faced by microbreweries, and the need for consumer-oriented innovation.
Chapters 2 to 4: these chapters are based on the experimental work conducted during the PhD project. Each of them is thematically linked to one aspect of consumers’ experience with beers (Chapter 2 deals with sensory perception, Chapter 3 with affective aspects, and Chapter 4 with relevant cognitions) and builds on one or more papers in the appendix (the relevant papers are indicated at the beginning of each chapter). Where relevant, unpublished data, or follow-up analyses of data presented in the papers, are discussed as well. The goal of these chapters is to integrate the results of the individual papers and to discuss their significance in relation to the thesis aims, as well as in a larger research perspective.
Conclusions: this section rounds off the thesis with a concise summary of the findings, with a twofold focus on scholarly contributions and practical implications for microbreweries.
Appendix 1: individual research papers written during the Ph.D. (in the form in which they were published or submitted, so layout may vary).
Appendix 2: overview of relevant activities carried out during the Ph.D.
Abbreviations
A-PLSR: ANOVA Partial Least Squares Regression
AL: Attribute Liking
CA: Correspondence Analysis
CATA: Check-All-That-Apply
CLT: Central Location Testing
DA: Descriptive Analysis
D-PLSR: Discriminant Partial Least Squares Regression
FOOD: Department of Food Science, University of Copenhagen
HUT: Home Use Test
IBU: International Bitterness Units
JAR: Just-About-Right scales
JK-PLSR: Jack-knifed Partial Least Squares Regression
KU: Københavns Universitet (University of Copenhagen)
MFA: Multiple Factor Analysis
OL: Overall Liking
PCA: Principal Component Analysis
PKI: Product Knowledge and Involvement
PLSR: Partial Least Squares Regression
QDA: Quantitative Descriptive Analysis
RTD: Ready-to-drink beverages
SME: Small and medium-sized enterprise
UFP: Ultra Flash Profiling
VARSEEK: Variety Seeking Scale
Abstract
As a result of the impressive resurgence of micro and craft breweries, product diversity in the Danish beer market has increased remarkably. Craft breweries are traditionally characterized by innovativeness, unique sensory experiences, and a focus on novel beer styles not previously known to many consumers. After a decade of growth, the Danish craft brewing segment is rapidly reaching maturity, and a higher degree of consumer orientation seems to be needed for continued success. The aim of this PhD project was to investigate some of the key aspects of consumers’ perception of novel beers, and ways in which these can be used to inform product development decisions.
Sensory insights into how consumers perceive a new beer are paramount. As craft breweries rarely have access to traditional sensory analysis (in the form of a trained panel), the first part of the project examined the suitability of consumer-oriented descriptive methodologies. Particular attention was given to projective mapping and check-all-that-apply questionnaires. The work showed both approaches to be feasible for rapid sensory characterization of beers, and potentially applicable by craft breweries, among other things, to gain a clearer understanding of the sensory outcome of experimental beers and to enable comparisons with competing products.
The second part of the project studied how different sensory characteristics of beers translate into patterns of acceptance among consumers. The results highlighted that consumers’ preferences for beers are highly heterogeneous, and segments based on preferences for specific sensory characteristics and beer styles were identified. It was further hypothesized that consumers’ preferences may be jointly determined by the presence of novel sensory elements and by how well these fit with consumers’ previous experiences with beer. Empirical evidence gathered during the work generally supported this hypothesis, indicating that consumers prefer beers with novel flavors that are not perceived as too novel or discontinuous with their sensory expectations.
The last part of the project investigated relevant cognitive aspects of consumers’ experience with novel beers. Particular attention was given to the issue of appropriateness in specific usage contexts. In a series of studies, consumers were found to differentiate strongly between beers on this basis, suggesting that consumers’ choices of beers in real-life settings may ultimately depend on the match between product-related characteristics and the requirements of specific situations.
In summary, this PhD project provides interdisciplinary insights applicable to product development in the craft beer industry. More generally, this work makes a number of original contributions to our understanding of determinants of consumers’ perception of novel food and beverages, as well as methodological advances in the use of consumers as subjects in sensory and consumer research.
Resumé (Danish Abstract)
As a result of the impressive growth of microbreweries, diversity in the Danish beer market has increased markedly. Microbreweries are typically characterized by innovation, unique sensory experiences, and a focus on innovative beer styles that are often unknown to many consumers. After a decade of strong growth, the Danish microbrew segment is approaching a saturation point, and consumer-oriented innovation is increasingly necessary for continued success. The aim of this PhD project was to explore key aspects of consumers’ perception of innovative beers, and to propose strategies for how these key aspects can be used to support product development.
It is first and foremost crucial to understand how consumers perceive innovative beers from a sensory point of view. As microbreweries rarely have the opportunity to use traditional sensory analysis (such as a trained tasting panel), the first part of the project examined the applicability of alternative sensory methods. Particular focus was placed on projective mapping and check-all-that-apply questionnaires. Both methods proved well suited to rapid sensory profiling of beers and can be used, among other things, to document the sensory properties of new and experimental beers, and how these are experienced compared with competing products.
The second part of the project examined how different sensory properties affect consumer acceptance of beer. The results underline that consumers’ preferences for beer are highly varied, and segments based on preferences for particular sensory properties could be identified. It was further hypothesized that consumers’ preferences depend on the joint effect of sensory novelty value and its relationship with consumers’ experience with beer. The empirical results collected during the project supported the hypothesis that consumers prefer beers with innovative flavor profiles as long as these are not perceived as too new or too divergent from their sensory expectations.
The last part of the project explored relevant cognitive aspects of consumers’ experience with innovative beers, especially with regard to the appropriateness of beers for particular contexts. This was shown to be an important aspect that consumers use to differentiate between beers. It suggests that consumers’ choice of beer in real consumption situations ultimately depends on the match between the product’s characteristics and the demands that different situations pose.
In summary, this PhD project provides interdisciplinary knowledge that can be applied to product development in the microbrewing industry. More broadly, this work contributes to our understanding of the factors that affect consumers’ perception of innovative food and beverages, as well as methodological advances in the use of consumers in sensory studies.
Riassunto (Italian Abstract)
Following the impressive growth of craft microbreweries, product variety in the Danish beer market has increased considerably. Microbrewery production has been characterized by a strong focus on innovation and on producing beer styles unknown to most consumers. After about a decade, the growth trend of craft beer is slowing, and greater attention to consumer preferences is becoming increasingly necessary to ensure continued success. The aim of this doctoral project was to study some key aspects (sensory, affective, and cognitive) of consumers’ perception of craft beers, and to set out how the information obtained can be used in the formulation and development of new products.
The sensory properties of beer are of fundamental importance in determining the consumer’s experience and preferences. Since microbreweries do not, as a rule, have access to conventional sensory analysis, the first part of the work explored the applicability of rapid and less costly methods, which can also be used in the absence of a trained panel for rapid characterization of beer. Particular attention was paid to two methods: projective mapping and check-all-that-apply questionnaires. The work demonstrated the substantial validity of both approaches, highlighting their potential use by microbreweries, for example to document the sensory profiles of experimental beers in pilot plants, as well as to compare a given product with those of competitors.
The second part of the thesis considers the influence of sensory properties on the acceptability of craft beers. The results indicate that, in general, consumers prefer beers with innovative flavors that are nonetheless not perceived as too discontinuous with their expectations, that is, flavors that can still be traced back to previous perceptual experiences of beer consumption. In addition, the data collected revealed a clear heterogeneity of preferences among Danish consumers, making it possible to identify several segments, each with its own orientation in expressing preferences for the sensory properties of the products.
The last part of the work focuses on the cognitive aspects of the experience of consuming craft beers, particularly with respect to the perceived use of beer in specific consumption contexts and situations. A series of studies clearly showed that consumers associate different beers with different types of use, suggesting that consumers’ choices ultimately depend as much on the characteristics of the product itself as on whether these meet the requirements of particular consumption situations.
Overall, this thesis offers an interdisciplinary approach to support innovation and development activities by microbreweries. In a broader context, this work contains a series of original contributions in the field of sensory and consumer science: from consumers’ perception of and preferences for innovative food products, to methodological aspects concerning the involvement of consumers as subjects in sensory studies.
Introduction and objectives
As the world’s most widely consumed alcoholic drink, beer is a familiar product category about which consumers often hold clear expectations regarding its sensory (e.g. taste, smell, visual and textural) characteristics. After a long period of dominance by large mass-market brewers and relatively standardized light lagers, the beer market is now undergoing important changes. In recent years Denmark, like most of the Western world, has witnessed an impressive resurgence of small craft breweries, characterized by a creative verve in brewing innovative beers and reinventing old beer styles.
Product innovation in the craft brewing industry, however, has mostly relied on individual brewmasters’ vision of taste and identity, and has often resulted from trial and error rather than from a strategic approach. While extreme and unique beers are certainly fascinating to brewmasters and beer connoisseurs, there is a risk of leaving the majority of consumers far behind. Product innovation is widely considered a major source of competitive advantage in the brewing industry, and understanding how consumers react to new and unexpected flavors in beers will be instrumental in increasing the success of craft beers.
The present thesis has precisely this overall objective: to assist consumer-oriented innovation among craft breweries by providing insights into key areas of consumers’ experience with novel beers. To this end, three relevant research issues are addressed.
First, the issue of how to collect sensory responses to novel beers is explored. Because microbreweries in general (and the industrial partners in particular) are SMEs without access to traditional sensory panels, this thesis focuses on alternative methods suitable for untrained subjects, and on their applicability to the sensory characterization of beers. Methods such as check-all-that-apply questions and Napping are applied to profile the sensory characteristics of a range of beers with selected ingredients and flavorings of interest. The goal of this part is to provide craft breweries with a toolbox for evaluating the sensory characteristics of their products, to be used for e.g. prototype evaluations, profiling of competing products, and evaluations of product quality from a consumer perspective.
The second aim is to investigate consumers’ affective responses to novel beers. From an applied perspective, the focus is on linking sensory characteristics to consumers’ hedonic responses to identify important ‘drivers of liking’ for beers. From a more basic perspective, this thesis seeks to deepen current understanding of what drives consumers’ appreciation of innovative products in a familiar product category such as beer. Principles from design theory and experimental psychology suggest that a thoughtful balance between innovation (novelty) and adherence to consumers’ expectations (typicality) could be the key to increasing hedonic response to beers. This thesis provides an empirical test of this assumption. Further, the role of a number of consumer characteristics, such as previous knowledge and experience with beer and personal variety-seeking tendency, in consumers’ affective responses to novel beers is investigated to uncover differences between consumer segments.
Finally, a full understanding of how consumers evaluate new beers must also take contextual factors into account. Among the many ways in which contextual influences can be explored, this thesis focuses on the appropriateness (item-by-use) technique to explore relationships between beers and usage contexts, with the goal of identifying situational aspects that may increase willingness to try, and possibly also acceptance of, novel beers.
A schematic overview of the research aims, sub-aims, and related papers is given below:
Aim 1. Sensory characterization of specialty beers (Papers I, II, & III)
Aim 1.1. Development and testing of rapid methods for sensory characterization of novel beers applicable with consumer panels
Aim 1.2. Characterization of sensory properties of beers with selected flavors/ingredients of interest
Aim 2. Affective responses to novel beers (Papers I, IV, & V)
Aim 2.1. Mapping of sensory preferences among beer consumers
Aim 2.2. Determinants of preferences for beer: effect of novelty and typicality
Aim 2.3. Segmentation based on relevant consumer characteristics
Aim 3. Contextual acceptance of novel beers (Paper VI)
Aim 3.1. Exploration of situational appropriateness of beers varying in degree of novelty
Aim 3.2. Identification of situational factors that may influence willingness to try novel beers
Overview of studies
Ten studies have been carried out during the PhD project (08/2010-07/2013). A chronological overview is given below. The studies have resulted in six research papers. A visual overview of the relationships between studies, papers, and research aims is provided on the following page.
# | Time and place | N | Study aim | Results
1 | 09/2010, Århus, Denmark | 160 | Study consumers’ perceptual and affective responses to a range of specialty beers; map consumer preferences from sensory descriptors and consumer characteristics of interest. | Paper I
2 | 12/2010, Indslev and Copenhagen, Denmark | 17 | Study sensory perception of specialty beers via the Napping methodology, and compare results from panels with different degrees of product expertise. | Paper II
3 | 02/2011, Copenhagen, Denmark | 135 | Compare three rapid descriptive methods for sensory characterization of specialty beers: CATA, CATA with intensity, and Napping. Investigate the effect of collative properties on flavor preferences for beers, and elucidate relationships with relevant consumer traits. | Paper III & Paper V
4 | 12/2011, Copenhagen, Denmark | 129 | Investigate bias in hedonic scores when co-eliciting a sensory product description using CATA questions. | Paper IV
5 | 02/2012, Copenhagen, Denmark | 122 | Investigate the effect of collative properties on flavor preferences for beers, and elucidate relationships with relevant consumer traits. | Paper V
6 | 03/2012, Copenhagen, Denmark | 153 | Investigate the effect of novelty on flavor preferences for beers. | Paper V
7 | 09/2012, Auckland, New Zealand | 76 | Study perceived situational appropriateness of beers with varying degrees of familiarity. | Paper VI
8 | 12/2012, Auckland, New Zealand | 97 | Study perceived situational appropriateness of beers with varying degrees of familiarity. Validate earlier results with a new stimulus set. | Paper VI
9 | 12/2012, Auckland, New Zealand | 93 | Study perceived situational appropriateness of beers with varying degrees of familiarity. Validate earlier results with a new stimulus set, accounting for extrinsic factors. | Paper VI
10 | 03/2013, Auckland, New Zealand | 145 | Study perceived situational appropriateness of beers with varying degrees of familiarity. Validate studies 7/8/9 with a new stimulus set and a different elicitation method. | Paper VI
Connections between studies, papers and research themes
List of publications
This thesis is based on the following research papers, which are referred to by the Roman numerals I, II, III, IV, V, VI, and VII.
(Papers in peer-reviewed journals)
I. Giacalone, D., Bredie, W. L. P., & Frøst, M. B. (2013). “All-in-one test” (AI1): A rapid and easily applicable approach to consumer product testing. Food Quality and Preference, 27, 108-119.
II. Giacalone, D., Machado, L., & Frøst, M. B. (2013). Consumer-based product profiling: Application of partial Napping® for sensory characterization of specialty beers by novices and experts. Journal of Food Products Marketing, 19, 201-218.
III. Reinbach, H. C., Giacalone, D., Machado, L., Bredie, W. L. P., & Frøst, M. B. (2013). Comparison of three sensory profiling methods: CATA, CATA with intensity and Napping to study consumer perception. Food Quality and Preference, doi:10.1016/j.foodqual.2013.02.004.
IV. Jaeger, S. R., Giacalone, D., Roigard, C. M., Pineau, B., Vidal, L., Gimenez, A., Frøst, M. B., & Ares, G. (2013). Investigation of bias of hedonic scores when co-eliciting product attribute information using CATA questions. Food Quality and Preference, 30, 242-249.
V. Giacalone, D., Duerlund, M., Bøegh-Petersen, J., Bredie, W. L. P., & Frøst, M. B. The effect of stimulus collative properties on consumers’ flavor preferences. Appetite (under review).
VI. Giacalone, D., Frøst, M. B., Bredie, W. L. P., & Jaeger, S. R. Situational appropriateness and consumers’ use patterns for beers: The moderating role of product familiarity. In preparation for submission to Food Quality and Preference.
(Papers in trade journals)
VII. Giacalone, D., Reinbach, H. C., & Frøst, M. B. (2011). A snapshot mapping of the Danish beer market. Scandinavian Brewers’ Review, 68, 12-20.
1. Setting the scene
The Danish beer market has changed profoundly in the last decade. A striking resurgence of craft breweries has marked a breakaway from product uniformity to an unprecedented availability of beers with unique flavor profiles. The craft brewing segment has grown exponentially in recent years, but is now reaching a maturity stage where supply begins to exceed demand. Future growth can only come from “crossing the chasm” from the core base of beer connoisseurs and enthusiasts to the majority of consumers who might be interested in exploring the world of craft breweries. A higher degree of consumer orientation may help craft breweries achieve that goal. This chapter is introductory and does not build on any of the papers. More details on the Danish beer industry are given in Paper VII.
1.1. Characteristics and trends in the Danish beer market
Beer in Denmark has a very long history, with the first evidence of beer consumption tracing back to 1370 BC. At the beginning of the 19th century, there were over 400 breweries in the country and about 140 in Copenhagen alone (Mejlholm & Martens, 2006). These breweries focused for the most part on brewing various types of top-fermenting ales, such as the traditional Danish hvidtøl (Eng. white beer), a dark (in spite of the name), low-ABV, extremely sweet ale (Nielsen, 2008). The situation changed rapidly in the course of the industrial revolution, particularly after the Danish physiologist Emil Christian Hansen, then working for the Carlsberg brewery, developed a method to obtain a pure strain of yeast, Saccharomyces carlsbergensis, which enabled breweries to produce and distribute a much larger and more consistent output than ever before. This yeast is used to produce bottom-fermented lager beers, which quickly became the dominant style after Carlsberg started selling them in 1847. The following century and a half saw a progressive concentration in the beer industry, culminating in Carlsberg rising to a quasi-monopolistic position after its merger with Tuborg in the 1970s. As a result, the number of breweries dropped sharply (from 400 to fewer than 20), and so did the number of available beer styles, with the market characterized by a high degree of product uniformity and variation limited to various lagers. Things have changed, rapidly and deeply, in the last decade. Although industry sales are still dominated by two brewing groups – Carlsberg and Royal Unibrew (a conglomerate of regional breweries formed in 2005) – a striking resurgence of small, independent breweries has occurred. According to the Danish Brewers’ Association, the number of breweries in the country
has grown from 19 to 120 in the last decade (Figure 1.1), while the market share of craft and microbrewed beers (Figure 1.2) has grown from 0.5% to 4.5% in the same period (Bryggeriforeningen, 2012).
Fig. 1.1. Number of Danish Breweries in the period 2000-2011. The column on the left reports that Danish breweries launched 616 new beers in the year 2011 (Bryggeriforeningen, 2012).
Fig. 1.2. Sales volume trend for craft beers (light green) and foreign beers (dark green), as % of total market size. The pie chart on the right shows the different distribution channels. Craft beers reach the end consumers largely through grocery stores (77%) and to a lesser degree via establishments such as restaurants and pubs (23%).
These craft breweries or microbreweries (the terms are used interchangeably in this thesis, although they are not exact synonyms – see Box 1.1) have strongly differentiated themselves from large breweries through a strong product focus in which flavor intensity, experimentation and local identity are key characteristics. As a result, product diversity in the Danish beer market has increased remarkably, and indeed the success of craft beer was also fuelled by consumers wanting to break away from the previous uniformity (Schnell & Reese, 2003).
Box 1.1. Glossary: micro or craft?
The terms “microbrewery” and “craft brewery” are often used interchangeably (in Danish, both are encompassed by the word mikrobryggeri). Technically, however, the concepts can be distinguished. According to the definition of the American Brewers’ Association, a craft brewery is a relatively small, local or regional brewery, producing less than 6 million barrels per year (1 US barrel = 117.348 liters), and with at least 50% of its volume in either all-malt beers or beers that use adjuncts to enhance rather than lighten flavor. A microbrewery is more strictly defined as a craft brewery whose production does not exceed 15,000 barrels per year, and that sells at least 75% of its beer off-site. The former aspect distinguishes it from a regional brewery, i.e. a craft brewery producing between 15,000 and 6,000,000 barrels a year. The latter aspect distinguishes it from a brewpub, where beer is brewed primarily for sale in the restaurant and bar, often dispensed directly from the brewery’s storage tanks. (Source: American Brewers’ Association).
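The size thresholds in Box 1.1 can be expressed as a simple decision rule. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the 75% criterion is read as the share of beer sold off-site, the 50% all-malt criterion is ignored, and any small brewery below the 75% threshold is treated as a brewpub, which is a simplification of the definitions quoted in the box.

```python
LITERS_PER_US_BARREL = 117.348  # conversion factor quoted in Box 1.1


def classify_brewery(barrels_per_year, share_sold_off_site):
    """Classify a brewery by size, following the thresholds in Box 1.1.

    share_sold_off_site is a fraction between 0 and 1 (assumed reading of
    the 75% criterion). The all-malt criterion is deliberately left out.
    """
    if barrels_per_year >= 6_000_000:
        return "not a craft brewery"
    if barrels_per_year > 15_000:
        return "regional brewery"
    if share_sold_off_site >= 0.75:
        return "microbrewery"
    return "brewpub"  # simplification: small and mostly sold on the premises


def barrels_to_hectoliters(barrels):
    """Convert US barrels to hectoliters (1 hl = 100 liters)."""
    return barrels * LITERS_PER_US_BARREL / 100.0
```

For instance, `classify_brewery(5000, 0.9)` falls in the microbrewery category, while the 15,000-barrel ceiling corresponds to roughly 17,600 hectoliters per year.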
Importantly, the growth of the craft brewing segment started, and has consistently continued, in a period where the overall beer market has contracted (Figure 1.3) and overall beer consumption has shrunk (Figure 1.4) as a consequence, among other things, of competition from wine, spirits and ready-to-drink (RTD) carbonated beverages. Generally, craft breweries have capitalized on larger trends, such as sustainable consumption, the buy-local movement, and the growing awareness and interest of consumers who expect choice and local flavors in food and drinks (Flack, 1997; Schnell & Reese, 2003). In Scandinavia, and particularly in Denmark, these issues have been amplified by the success of the restaurant Noma and the general interest in the so-called “New Nordic Cuisine”. A growing interest in beer has also been observed in selected consumer groups, such as the local consumer organization Dansk Ølentusiaster (DØ, Eng. Danish Beer Enthusiasts), which was formed in 1998 and has quickly established itself as one of the largest beer advocacy groups in Europe¹.
¹ With ≈ 10,000 members, DØ is the second largest association of this kind, outnumbered only by the British “Campaign for Real Ale” (CAMRA) association.
Fig. 1.3. Sales volume trend in the Danish beer market in the period 2000-2011 (Bryggeriforeningen, 2012).
Fig. 1.4. Per capita consumption of alcoholic drinks among Danes in the period 2000-2011 (Bryggeriforeningen, 2012).
Craft breweries have strongly differentiated themselves from mainstream breweries on a number of dimensions, such as a clear focus on brewing ales rather than lagers, on all-malt brewing² and high(er) gravity alcoholic beers, on substantially hoppier beers, and on the use of specialty malts and additional ingredients to create novel and unique flavor experiences (see Figure 1.5 for a snapshot of relevant brewing trends in the Danish beer market).
Fig. 1.5. Bi-plot of a Principal Component Analysis (PCA). The PCA scores (blue) are Danish breweries; the loadings (red) are a number of characteristics of their product portfolios (self-reported by each brewery in the year 2010). The plot shows that craft breweries are strongly associated with a tendency to brew ales (top/bottom ratio), with more alcoholic (ABV), darker (malt color) and more bitter (IBU) beers, and with experimentation with adjuncts and special ingredients. Details on all variables and further interpretations are given in Paper VII.
These recent developments have favored the emergence of a connoisseur beer culture, where – according to Paul Gatza, director of the American Brewers’ Association – “beer drinkers are starting to discover hops the same way the wine drinkers know grapes”³. More generally, this has caused a shift from beer being seen as a generic product to craft beer differentiated by flavor (Carroll & Swaminathan, 2000; Watne, 2012). Large commercial brewers have also joined the trend, establishing or acquiring regional labels and marketing them as craft beers⁴. In Denmark, the most notable example is the Carlsberg-owned microbrewery Jacobsen, which is housed in the original Carlsberg Brewery in Valby, Copenhagen, and focuses on upscale specialty beers.
² As opposed to the practice of large breweries of using large quantities of so-called “adjuncts”, i.e. lower-cost unmalted grains (usually corn, rice, or sorghum), as a supplement to malted barley as the main starch source. Adjuncts are used to lighten the beer body, and to lower production costs compared to an all-malt grain base. Note that microbreweries may also use adjuncts, but in that case the aim is usually to achieve specific additional features (such as using wheat to improve foam retention) or desired sensory properties.
³ Paul Gatza, director of the American Brewers’ Association. USA Today – Money Section, May 25th, 2012. www.money.usatoday.com
1.2. Increasing competition and the need for consumer-oriented innovation
Although craft brewing is still growing, the strong growth rate that craft breweries have experienced throughout the last decade is rapidly tailing off. Growth forecasts for the craft brewing segment are currently below 10% per annum (Euromonitor, 2010), a sign that the segment has reached a plateau where supply may be close to exceeding demand. Accordingly, industry commentators believe that many microbreweries will struggle to remain profitable⁵, and that the overall number of breweries in the market is not going to increase any further (Euromonitor, 2010). Part of the problem is that craft breweries have gained market share from mainstream lager drinkers, but their success has been primarily confined to a minority of highly involved consumers. Drawing a parallel with Rogers’ popular innovation adoption model (Rogers, 1983), one may say that craft breweries have successfully targeted the highly involved minority of beer connoisseurs and enthusiasts, and are now facing the “chasm” (Moore, 1991) that separates early adopters from the early majority of consumers (Figure 1.6). Further growth may be impeded, or even impossible, unless craft breweries attract the majority of consumers who may be interested but have yet to explore the craft beer world. The reasons why a large proportion of beer drinkers do not ordinarily drink craft beer are manifold, and include structural issues such as relatively limited distribution, and likely reference price effects (Monroe, 1973; Mazumdar, Raj, & Sinha, 2005) when compared to mass-produced beers. However, part of the story may be that consumers’ preferences change very slowly, and that consumers have a general aversion to food or beverages that are too novel or discontinuous compared to what they are expecting (Costa & Jongen, 2006; Van Trijp & Van Kleef, 2008), as may be the case with craft beers.
⁴ In the US, MillerCoors’ Blue Moon brand is possibly the most successful example.
⁵ As a result of the increasing competition, from 2000 onwards beer prices have grown consistently less than the average consumer price index (Bryggeriforeningen, 2012).
Fig. 1.6. Theoretical adoption life-cycle for innovative products. Adapted from Rogers (1983).
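The adoption curve in Figure 1.6 is conventionally drawn as a normal distribution of adoption times, with adopter categories delimited at whole standard deviations from the mean. The sketch below, which assumes exactly that convention, computes the share of each category; the function name is hypothetical, and Rogers' textbook figures (2.5%, 13.5%, 34%, 34%, 16%) are rounded versions of these values.

```python
from statistics import NormalDist


def adopter_shares():
    """Share of each adopter category under a standard normal adoption curve.

    Assumed cut points (in standard deviations from the mean adoption time):
    -2, -1, 0 and +1.
    """
    z = NormalDist()  # mean 0, standard deviation 1
    return {
        "innovators": z.cdf(-2),
        "early adopters": z.cdf(-1) - z.cdf(-2),
        "early majority": z.cdf(0) - z.cdf(-1),
        "late majority": z.cdf(1) - z.cdf(0),
        "laggards": 1 - z.cdf(1),
    }


shares = adopter_shares()
# Cumulative share of consumers on the near side of the "chasm"
before_chasm = shares["innovators"] + shares["early adopters"]
```

Under these assumptions, `before_chasm` is about 16% of the eventual adopters, which illustrates why growth confined to innovators and early adopters is bounded.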
In a familiar product category like beer, consumers often hold clear expectations about flavor. Previous research on the sensory qualities of pale lagers, the category accounting for approximately 85% of all beer drunk in Denmark (Bryggeriforeningen, 2012), found that the sensory descriptors commonly associated with this category are grainy, light-bodied, refreshing, hoppy, and sweet (Daems & Delvaux, 1997; Donadini & Fumi, 2010; Gains & Thomson, 1990; Mejlholm & Martens, 2006). Overall flavor intensity is generally low, and a recent comparison of the most popular European lagers reported the perhaps unflattering finding that the main sensory characteristic of Danish lagers was watery (ASAP, 2003). It is clear “at first sip” that craft beers are very different, since they often provide much more intense and unique flavor experiences. For many, this could be too big a deviation from their expectations. This is a problem because although consumers value novelty or newness in food and beverages (particularly in a category like craft beers), past research suggests that they tend to prefer products that deliver only a moderate level of novelty (Köster & Mojet, 2007; Van Trijp & Van Kleef, 2008). One of the world’s most renowned experts in beer matters, UC Davis emeritus Michael Lewis, has described the situation as follows: “The brewers in the craft segment are among the most adventurous and creative brewers on the planet, and many are attracted as a dog to bacon by the prospect of new approaches; the background of many as home brewers persuades them to follow their own noses rather than gauging at the needs of their customers. (…). While craft-brewed beers are fascinating avenues of brewing arts and science to explore, there is real danger of leaving the consumer far behind.” (Lewis, 2010)
Product development in the craft brewing industry is indeed mostly based on personal feelings, views and choices of the brewmaster or product developer. As such it is often a matter of trial and
error. As competition intensifies, breweries need to orient their product development and innovation more towards consumers' experience and preferences. Sensory and consumer science offers many tools for collecting consumer insights that can then be used in the product development process. It is also the most natural starting point, since there is little doubt that sensory properties are a major determinant of consumers’ acceptance of (craft) beers. An understanding of consumers’ sensory responses to beers can thus be used by breweries to better plan their product development activity, for example by understanding how certain beers compare with similar ones, what sensory dimensions are associated with consumers’ liking, and whether there are unfilled niches that may be promising opportunities. Innovation has been a main driver of the craft brewing movement, and will no doubt remain a major competitive parameter in this industry. Consumer-oriented innovation, in the sense of an approach to product development where an integrated analysis and understanding of consumers’ perceptions and preferences plays a key role (Grunert et al., 2008; Jaeger & MacFie, 2010), might be what will help craft breweries retain, and hopefully also increase, their market share in the coming years. This PhD thesis will, hopefully, be a first step in this direction.
1.3. References
ASAP GmbH, Association for Sensory Analysis and Product Development (2003). Mapping the taste of beer. http://www.esn-network.com/635.html (accessed June 2013).
Bogue, J., & Ritson, C. (2004). Understanding consumers’ perceptions of product quality for lighter dairy products through the integration of marketing and sensory information. Acta Agriculturae Scandinavica Section C: Food Economics, 1, 67-77.
Bryggeriforeningen (Danish Breweries’ Association) (2012). Tal fra Bryggeriforening 2012 (Eng. Figures from the Breweries’ Association 2012). Copenhagen, Denmark.
Carroll, G. R., & Swaminathan, A. (2000). Why the microbrewery movement? Organizational dynamics of resource partitioning in the U.S. brewing industry. American Journal of Sociology, 106, 715-762.
Costa, A. I. A., & Jongen, W. M. F. (2006). New insights into consumer-led food products development. Trends in Food Science & Technology, 17, 457-465.
Daems, V., & Delvaux, F. (1997). Multivariate analysis of descriptive sensory data on 40 commercial beers. Food Quality and Preference, 8, 373-380.
Donadini, G., & Fumi, M. D. (2010). Sensory mapping of beers on sale in the Italian market. Journal of Sensory Studies, 25, 19-49.
Earle, M., Earle, R., & Anderson, A. (2001). Food Product Development. Boca Raton, FL: Woodhead Publishing.
Euromonitor (2010). Beer – Denmark (Country sector briefing). Euromonitor International Global Market Information Database.
Flack, W. (1997). American microbreweries and neo-localism: “Ale-ing” for a sense of place. Journal of Cultural Geography, 16, 37-53.
Gains, N., & Thomson, D. M. H. (1990). Sensory profiling of canned lager beers using consumers in their own homes. Food Quality and Preference, 2, 39-47.
Grunert, K. G., Jensen, B. B., Sonne, A., Brunsø, K., Byrne, D. V., Clausen, C., Friis, A., Holm, L., Hyldig, G., Kristensen, N. H., Lettl, C., & Scholderer, J. (2008). User-oriented innovation in the food sector: Relevant streams of research and an agenda for future work. Trends in Food Science & Technology, 19, 590-602.
Jaeger, S. R., & MacFie, H. (2010). Consumer-driven Innovation in Food and Personal Care Products. Cambridge, UK: Woodhead Publishing.
Köster, E. P., & Mojet, J. (2007). Theories of food choice development. In L. J. Frewer & J. C. M. Van Trijp (Eds.), Understanding Consumers of Food Products (pp. 93-124). Cambridge, UK: CRC.
Lewis, M. J. (2010). Drinkability: Countering a dash to the extreme. Scandinavian Brewers’ Review, 67, 8-11.
Mazumdar, T., Raj, S. P., & Sinha, I. (2005). Reference price research: Review and propositions. Journal of Marketing, 69, 84-102.
Mejlholm, O., & Martens, M. (2006). Beer identity in Denmark. Food Quality and Preference, 17, 108-115.
Monroe, K. B. (1973). Buyers’ subjective perceptions of price. Journal of Marketing Research, 10, 70-80.
Moore, G. A. (1991). Crossing the Chasm. New York, NY: Harper Business Essentials.
Nielsen, R. (2008). The beer of the Danish golden age. Scandinavian Brewers’ Review, 65, 14-21.
Rogers, E. M. (1983). Diffusion of Innovations. New York, NY: Free Press.
Schnell, S. M., & Reese, J. F. (2003). Microbreweries as tools of local identity. Journal of Cultural Geography, 21, 45-69.
Van Trijp, H. C. M., & Van Kleef, E. (2008). Newness, value and new product performance. Trends in Food Science & Technology, 19, 562-573.
Watne, T. (2012). Agents of change: An investigation into how craft brewers educate their consumers. Proceedings of the Australia and New Zealand Marketing Academy (ANZMAC) Conference, 3-5 December 2012, Adelaide, Australia.
2. Collecting sensory responses to beers: A consumer-centric approach
Sensory information on how consumers perceive a new beer plays a key role in product development. A significant part of this PhD work has investigated the suitability of different methodologies for collecting consumer responses to beer. Particular attention was given to rapid and inexpensive methodologies that can be especially beneficial for craft breweries that, like most SMEs in the food & beverage industry, have limited or no access to conventional sensory analysis (in the form of a trained sensory panel). This chapter reviews the work carried out in this area and is organized as follows: first, a brief overview of current methodological trends in sensory science, particularly with regard to consumers’ involvement in sensory tests, is given. Then, the methods used in this PhD work – CATA and Napping – are introduced, and their applicability for sensory profiling of craft beers is reviewed based on the experimental results obtained. The chapter concludes by outlining the pros and cons of the methods, recommendations for industrial applicability in the craft brewing industry, and possible perspectives for future research. Relevant papers: I, II, and III.
2.1. Sensory analysis and product development
Sensory analysis is the scientific method used to evoke, measure, quantify, and interpret human responses to products (Lawless & Heymann, 2010). As the scope of sensory analysis is rather broad, it is customary to divide it into three main areas: discriminative, descriptive and affective. Discriminative sensory analysis is concerned with understanding whether a perceptible difference exists between two products. Discriminative methods are routinely employed among breweries¹ for quality control purposes, that is, to assess whether a specific product deviates from a specific standard, e.g. to assess whether an ingredient change has a perceptible effect, or to determine the shelf-life of a product. Common examples of discriminative tests include the triangle test (Bengtsson & Helm, 1946; Helm & Trolle, 1946), the duo-trio test (Peryam & Schwartz, 1950), the tetrad test (Lockhart, 1951; revisited by O’Mahony, Masuoka, & Ishii, 1994), and the n-alternative forced choice (Ennis, 1990), among others. While this area of testing is more related to product maintenance or optimization, the next two types of sensory analysis are more directly related to product development.
¹ In fact, some of them originated from within the beer industry. For example, the triangle test was developed by Erik Helm at Carlsberg Breweries, and the tetrad test was first adopted by the Kirin beer company (O’Mahony, Masuoka, & Ishii, 1994).
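To make the logic of a discriminative test concrete, the sketch below shows how the outcome of a triangle test is commonly evaluated. In a triangle test each assessor receives three samples, two of which are identical, and must pick the odd one out, so the chance of a correct answer by guessing alone is 1/3. Assuming independent assessors, the number of correct answers under the null hypothesis of no perceptible difference follows a binomial distribution, and a one-sided p-value can be computed exactly. The function name and the example numbers are illustrative, not taken from the studies in this thesis.

```python
from math import comb


def triangle_test_p_value(n_correct, n_trials):
    """One-sided exact binomial p-value for a triangle test.

    Probability of observing at least n_correct correct identifications
    out of n_trials if every assessor were guessing (p = 1/3).
    """
    p_guess = 1.0 / 3.0
    return sum(
        comb(n_trials, k) * p_guess**k * (1.0 - p_guess) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


# Hypothetical outcome: 15 of 30 assessors pick the odd sample correctly
p_value = triangle_test_p_value(15, 30)
```

A small p-value suggests that the two products are perceptibly different; with 30 assessors, 10 correct answers would be expected from guessing alone.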
Descriptive sensory analysis is concerned with the description and quantification of the sensory characteristics of food and beverages. Such knowledge can then be related to raw materials, process parameters, recipe formulation or other design factors, which makes descriptive analysis an important tool for development and quality control in the food industry. The third and last area of sensory analysis, affective (or hedonic) testing, is concerned with quantifying the degree of liking or disliking of a product in the consumer population of interest (Lawless & Heymann, 2010). Affective and descriptive information can be related to identify sensory characteristics that positively affect consumers’ liking (the so-called “drivers of liking”), thus providing guidance for product development purposes. This aspect will be dealt with extensively in the next chapter. The present chapter focuses on descriptive methodologies, and investigates the suitability of a range of methods for craft breweries. Descriptive analysis is a generic term that includes a variety of closely related methods – some of which are trademarked – such as the Flavour Profile (Cairncross & Sjostrom, 1950), the Texture Profile (Brandt, Skinner, & Coleman, 1963), Quantitative Descriptive Analysis™ (QDA™; Stone, Sidel, Oliver, & Woolsey, 1974), and the Spectrum™ Method (Munoz & Civille, 1992). The solution most often encountered among field practitioners is, however, a combination of the latter two methods (QDA & Spectrum), which some authors refer to as generic descriptive analysis (DA; Dijksterhuis & Byrne, 2005; Murray, Delahunty, & Baxter, 2001; Lawless & Heymann, 2010; Valentin, Chollet, Lelièvre, & Abdi, 2012).
DA is usually carried out by a small (8-12) group of panelists who undergo a training phase where they are exposed to a range of variations of the target product, then agree on a common set of sensory attributes (ideally associated with chemical or physical references), and finally learn to quantify attribute intensity on a (usually) 10-15 cm unstructured scale in a consistent way. Classical applications of DA in product development are:
- Prototype testing;
- Benchmarking against existing products and in-market competitors;
- Determining the effect of ingredient substitution and/or process changes;
- Providing insights into consumers’ evaluations.
2.2. Rapid descriptive methods and consumer involvement in sensory tests
DA is known to produce detailed, robust and repeatable results, as documented by numerous scientific publications (for a review on the topic, see Murray et al., 2001). However, it also has
certain drawbacks. First, it is a very slow method, particularly because of the extended training phase. Second, it is a very costly method. Maintaining a sensory panel is (usually) not affordable for SMEs in the food industry, and can be a significant expense for large companies too. Lastly, it is possible that trained assessors experience the product differently from the final consumers, and/or that they take into account sensory characteristics that may be irrelevant to consumers (Ares et al., 2010a), providing high-quality results but with low external validity. In order to address these drawbacks, a number of alternative descriptive methodologies have been proposed over the years, most of which require little or no training and are easily implemented with trained panelists or consumers alike. The idea that consumers (or generally untrained subjects) can be used for descriptive tasks – traditionally a highly controversial topic among sensory scholars (Moskowitz, Munoz, & Gacula, 2003) – is increasingly accepted due to three factors: 1. strong evidence that consumers can provide valid and meaningful sensory (descriptive) information (e.g. Bruzzone, Ares, & Gimenez, 2012; Worch, Lê, & Punter, 2010); 2. methodological developments that facilitate the collection and analysis of such responses; 3. a general consensus that deeper consumer involvement at an early stage in the product development life-cycle is beneficial to the success of food product development (Grunert et al., 2008; Stewart-Knox & Mitchell, 2003; van Kleef, van Trijp, & Luning, 2005; van Trijp & van Kleef, 2008). A useful way to classify rapid descriptive methodologies, according to recent reviews on the topic (Valentin et al., 2012; Varela & Ares, 2012), is the distinction between verbal-based methods and similarity-based methods. Verbal methods are based on monadic evaluations of a number of products on individual sensory descriptors.
Examples of this class of methodology are Free Choice Profiling (Williams & Langron, 1984), its Flash Profile variant (Dairou & Sieffermann, 2002), and check-all-that-apply (CATA) questionnaires (Adams, Williams, Lancaster, & Foley, 2007). Similarly to DA, these methods produce a descriptive sensory profile of the products, while bypassing the time-consuming steps of attribute and scale alignment that are a key aspect of DA (Valentin et al., 2012). In the second class of methods, similarity-based, assessors are presented with all products simultaneously, and give a global evaluation expressed as perceived inter-product differences. These methods are sometimes described as “holistic”, because they require the assessor to consider the product as a whole, unlike the “reductionist” verbal-based methods which require assessors to
decompose the stimulus into multiple attributes. In reality, some similarity-based methods may include a verbalization task, but this occurs only after the global evaluation, and the output is usually not used to build the perceptual space beyond simple correlational measures. The simplest (and probably best known) of these methods is the free sorting task (Lawless, 1989; Lawless, Sheng, & Knoops, 1995), which has been applied to the sensory evaluation of beers, and is reportedly applicable with trained assessors and consumers alike (Chollet, Lelièvre, Abdi, & Valentin, 2011). Another important similarity-based method is projective mapping (Risvik et al., 1994), the first method to introduce the idea of expressing product differences as Euclidean distances by means of projection onto a two-dimensional space. Various adaptations and modifications of the original projective mapping technique have been proposed, the best known of which is Napping® (Pagès, 2003; 2005).
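The idea of expressing product differences as Euclidean distances can be illustrated with a short sketch. It assumes that one assessor has positioned each product on a sheet and that the (x, y) coordinates have been read off in centimeters; the product names and coordinates below are invented for illustration, and the function name is hypothetical.

```python
from math import dist


def distance_matrix(coords):
    """Pairwise Euclidean distances between products placed on a 2D sheet.

    coords maps a product name to its (x, y) position for one assessor.
    Returns a dict keyed by (product_a, product_b).
    """
    names = sorted(coords)
    return {(a, b): dist(coords[a], coords[b]) for a in names for b in names}


# Hypothetical placements by a single assessor (coordinates in cm)
placements = {
    "pilsner": (10.0, 5.0),
    "pale ale": (13.0, 9.0),
    "stout": (50.0, 30.0),
}
distances = distance_matrix(placements)
```

Here the pilsner and the pale ale end up 5 cm apart, i.e. they are perceived as similar, while the stout lies far from both. In practice the coordinates from all assessors are analyzed jointly with multivariate methods rather than one distance table at a time.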
2.3. The use of consumers for profiling of beers
In comparing beers, brewers typically focus on chemical and physical properties such as the percentage of the original wort, the amount of alcohol, the pH level, the degree of bitterness, the color, and so on. While these parameters can be measured with routine instruments, comparing and describing beers from a sensory standpoint can be imprecise and idiosyncratic, unless a panel of some sort is brought in to clarify and “objectify” such descriptions. Sensory scientific approaches to the evaluation of beer flavors abound in the literature. A significant share of this work even originated from within the brewing industry, which has traditionally been very involved with sensory evaluation (e.g. Brown & Clapperton, 1978). Descriptive techniques have been used for a variety of purposes: to relate sensory characteristics of beer to its chemical composition (e.g. Meilgaard, 1982), ageing (Guyot-Declerck, François, Ritter, Govaerts, & Collin, 2005) or storage conditions (Pangborn, Lewis & Tanno, 1977), to define different brewing styles from a sensory standpoint (Daems & Delvaux, 1997), and to model consumer preferences (Guinard, Uotani, & Schlich, 2001). The majority of previous studies have used classical DA with a trained panel to produce a descriptive profile. However, a few alternative approaches using consumers have also been attempted successfully. Clapperton and Piggott (1979) already indicated, in a study where they compared panels with different degrees of expertise, that trained subjects and consumers are able to produce similar profiles in a DA-like task (though training was found to improve reproducibility and discrimination). Later, Gains and Thomson (1994) successfully
conducted a home use test (HUT) in which they used a consumer population for sensory profiling of a selected sample of lager beers. They reported that consumers can validly profile beers, provided that sensory differences are not extremely subtle, especially if they have at least some degree of familiarity with the product (Gains & Thomson, 1994). A similar conclusion was reached by Chollet & Valentin (2001), who used a combination of sorting and a fast descriptive task. As mentioned in the previous section, a number of descriptive techniques suitable for consumer profiling have been proposed. In the context of this research, it was considered of interest to explore the suitability of new approaches for the sensory profiling of beers. The focus was primarily on two approaches: Check-all-that-apply (CATA) questions (Adams, Williams, Lancaster, & Foley, 2007), and Napping® (Pagès, 2003, 2005). This choice was motivated by a number of factors:
- Both methods are fast and low-cost and, in other product categories, have been used successfully with trained subjects and consumers alike. These characteristics make them suitable for application in SMEs such as craft breweries;
- Neither method had previously been applied to the evaluation of beers;
- It was considered of interest to compare two fundamentally different approaches – CATA is a verbal-based method, whereas Napping is similarity-based – to see the different types of insights they can generate;
- Synergies with existing research projects during the PhD period (Machado, 2011; Dehlholm, 2012).
2.4. Check-all-that-apply (CATA)
Check-all-that-apply (CATA) questionnaires are an increasingly common technique for collecting analytical consumer evaluations of food products. The CATA or “pick-any” approach originated from the work of the mathematical psychologist Clyde Coombs (1964), and was first used in marketing survey research (Driesener & Romaniuk, 2006; Smyth, Dillman, Christian, & Stern, 2006). Since it was introduced to the sensory and consumer community (Adams, Williams, Lancaster, & Foley, 2007), the popularity of this method has increased remarkably. In its use for descriptive profiling, the CATA method consists of presenting consumers with a predefined descriptor list and having them “tick” all the descriptors they find appropriate to
35
describe a food product, or alternatively to indicate explicitly whether each descriptor applies or not2 (Fig. 2.1).
Fig. 2.1. An example of a check-all-that-apply questionnaire in the standard (left) and forced-choice (right) variants.
Several published studies have shown how to use CATA questionnaires to obtain sensory maps based on consumer perceptions across a number of product categories (Ares et al., 2010a; Ares, Varela, Rado, & Gimenez, 2011; Bruzzone, Ares, & Gimenez, 2012; Parente, Ares, & Manzoni, 2010; Sinopoli & Lawless, 2012), and to perform external preference mapping (Ares et al., 2010b; Dooley, Lee, & Meullenet, 2010; Parente, Manzoni, & Ares, 2011; Piqueras-Fiszman, Ares, Alcaide-Marzal, & Diego-Mas, 2011; Puyares, Ares, & Carrau, 2010; Plaehn, 2012). Proponents of the method commonly list three main advantages of the CATA technique. The first two are its rapidity and ease of use, as it is very intuitive. For example, Driesener and Romaniuk (2006), using brand names as test stimuli, demonstrated that CATA questionnaires are significantly quicker to administer than both ranking and rating tasks. The same finding was reported by Ares and colleagues, who added that CATA requires fewer explanations and that consumers find it easier than other methods (Ares et al., 2010a). The last advantage of CATA that is consistently mentioned is that, when used together with the collection of hedonic ratings, it might have a smaller effect on the latter than e.g. Just-About-Right and intensity ratings, a claim that appears to be supported by the results presented in Paper IV (I will return to this aspect in the next chapter).
The data produced by this method are treated as dichotomous responses (checked term = 1; unchecked term = 0) for each of the terms present in the CATA ballot, and arranged either in an unfolded assessor-by-attribute matrix, or in a cross-tabulation matrix containing the total frequency of mention for each term (Fig. 2.2).
² This approach is sometimes referred to as “applicability scoring” (Ennis & Ennis, 2013), to distinguish it from standard CATA.
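[Figure 2.2 schematic. Left (unfolded): one row per assessor-by-product combination (assessors 1, 2, …, i; products A, B, C, …, k), one column per attribute (Attr. 1, Attr. 2, …, Attr. j), cells coded 0/1. Right (cross-tabulation): one row per product (A, B, C, …, k), one column per attribute (Attr. 1, Attr. 2, …, Attr. j), cells containing counts.]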
Fig. 2.2. Possible set-ups for CATA data: unfolded (left) and cross-tabulation (right).
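As a minimal sketch of the two set-ups in Fig. 2.2, the following Python snippet codes a handful of made-up CATA responses as an unfolded 0/1 table and then sums it into a cross-tabulation of citation counts. The term list, product codes, responses, and function names (`unfold`, `cross_tabulate`) are illustrative assumptions, not data or code from the studies in this thesis.

```python
# Illustrative only: terms, products and responses below are invented.
TERMS = ["floral", "caramel", "bitter"]

# One record per assessor-by-product evaluation: (assessor, product, checked terms).
RESPONSES = [
    (1, "A", {"floral"}),
    (1, "B", {"caramel", "bitter"}),
    (2, "A", {"floral", "bitter"}),
    (2, "B", {"caramel"}),
]


def unfold(responses, terms):
    """Unfolded set-up: one row per assessor-by-product pair, cells coded 0/1."""
    return [
        (assessor, product, [1 if term in checked else 0 for term in terms])
        for assessor, product, checked in responses
    ]


def cross_tabulate(unfolded):
    """Cross-tabulation set-up: one row per product, cells are citation counts."""
    counts = {}
    for _assessor, product, row in unfolded:
        totals = counts.setdefault(product, [0] * len(row))
        for j, value in enumerate(row):
            totals[j] += value
    return counts


unfolded = unfold(RESPONSES, TERMS)
table = cross_tabulate(unfolded)
for product, totals in sorted(table.items()):
    print(product, dict(zip(TERMS, totals)))
```

The unfolded table keeps individual responses (needed for assessor-level analyses), whereas the cross-tabulation only retains the total frequency of mention per product and term.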
Different statistical techniques can be applied to CATA data, depending on whether one is working with the unfolded or the cross-tabulation matrix. Cross-tabulation matrices are mostly analyzed by exploratory multivariate techniques such as correspondence analysis (CA), a generalization of Principal Component Analysis (PCA) tailored for the analysis of frequency data (for an overview see Abdi & Williams, 2010). CA is a common analytical tool for CATA data, used mainly for graphical visualization of sample differences. Partial Least Squares Regression (PLSR) can also be applied to the data, with advantages in terms of interpretation and evaluation of the results, as discussed in § 2.6.1.
In this thesis work, CATA questions were applied for sensory characterization of beer in two main studies. In the first study (Paper I), CATA was used with a large consumer panel (N = 160) to derive a sensory profile of six beers covering a product space representative of the sensory diversity in the Danish craft beer market. The main sensory variation was given by a separation between two beers brewed with different ratios of roasted and caramel malt and the beers brewed with pale malt, but consumers also separated the beers within these two groups, providing a quite accurate sensory characterization that largely matched the commercial description provided by the producers and our prior knowledge of the samples. Furthermore, statistical analysis revealed that consumers discriminated between the samples for 17 out of 27 sensory descriptors (Paper I).
The same conclusion was obtained in the second study (Paper III), in which it was also tested whether adding an intensity dimension (i.e. for each descriptor they checked, consumers were also asked to rate its intensity on a category scale) would improve the accuracy of descriptive profiling and lead to better product differentiation. At an overall level, the two CATA variants provided largely congruent configurations: raw materials, and specifically grains, seemed to drive the main source of sensory variation, separating nutty/roasted beers from beers characterized by flowery-fruity estery flavors, but there were also clear differences within these two sub-classes. It was also found that adding an intensity dimension to the task did not affect the conclusions, and actually reduced the number of descriptors on which samples significantly differed, which could be explained by consumers’ lack of consistency in the use of scales (Paper III).
2.4.1. Ballot design and descriptor selection in CATA profiles for beer
The design of a CATA ballot has very important implications for the results one will obtain. Concerning the number of descriptors, very lengthy lists should be avoided because they may encourage a behavior called satisficing (Krosnick, 1991), which consists in choosing a few options that seem reasonable instead of carefully evaluating all descriptors. One solution is to use the forced-choice variant (Ennis & Ennis, 2013), but it is nevertheless advised not to exceed 20 to 30 descriptors, especially if naïve consumers are used as panelists³. Another issue to be considered when designing a CATA ballot is the so-called primacy bias: descriptors that appear at the top of the list will be checked more often than those that appear at the bottom (Ares & Jaeger, 2013). Again, the forced-choice variant should reduce this problem (though at the cost of a more effortful task), but it is nevertheless advised to randomize the order in which the terms appear on the ballot. Semantic grouping of the attributes by sensory modalities or by encompassing sensory descriptors is also known to facilitate the task and reduce both satisficing and the primacy bias. An example of this (from Paper III) is shown in Figure 2.3.
³ The longest ballot of this kind I am aware of contained 101 terms (Campo et al., 2008), but it was used with trained panelists who had been trained in the use of the ballot, and the task was actually restricted to selecting up to six terms that most applied to the sample.
Fig. 2.3. Semantic grouping in CATA ballots (excerpt from the ballot used for Paper III).
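The two recommendations above (semantic grouping and randomized term order) can be combined in practice. The sketch below is one possible way to do so, not the procedure used for Paper III: it shuffles the order of the groups and the order of terms within each group for every respondent. The group names and terms are borrowed from the Danish Beer Language purely for illustration, and `randomized_ballot` and its `seed` parameter are hypothetical names.

```python
import random

# Hypothetical semantic groups; the terms are used purely for illustration.
GROUPS = {
    "Fruits/berries": ["Cherry", "Apple", "Banana"],
    "Spices": ["Vanilla", "Cloves", "Ginger"],
    "Nuts": ["Hazelnut", "Almond", "Walnut"],
}


def randomized_ballot(groups, seed=None):
    """Return [(group, terms), ...] with group order and within-group term
    order shuffled, so that no term is systematically first on the ballot."""
    rng = random.Random(seed)  # a per-respondent seed keeps the ballot reproducible
    names = list(groups)
    rng.shuffle(names)
    ballot = []
    for name in names:
        terms = list(groups[name])  # copy, so the master list is left untouched
        rng.shuffle(terms)
        ballot.append((name, terms))
    return ballot


for respondent_id in (1, 2):
    print(respondent_id, randomized_ballot(GROUPS, seed=respondent_id))
```

Shuffling the group order as well as the terms is a design choice made here; keeping a fixed group order and randomizing only within groups would be an equally defensible variant.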
Another important issue to consider in the design of a CATA task is the type of descriptors to be included. In this work, selection of CATA descriptors was approached by a combination of pilot work and reliance on existing sensory vocabularies for beers. In particular, two widespread tools were used: the Beer Flavor Wheel (Meilgaard, Dalgliesh, & Clapperton, 1979) and the Danish Beer Language (Det Danske Øl Akademi, 2006). The first one, which is the industry’s standard sensory tool, stems from the seminal work of Morten Meilgaard in uncovering major flavor constituents of beer. The Beer Flavor Wheel is a tool for arranging common sensory characteristics of beer into a three-tier categorical system, whereby inner terms are more general and outer terms are more specific (Figure 2.4). The focus is restricted to the chemical senses. The Beer Flavor Wheel was created to link the chemist’s perspective to the sensory scientist’s perspective: all outer terms in the wheel can be related to identified aroma compounds with a specified purification method (Meilgaard, Dalgliesh, & Clapperton, 1979; Meilgaard, 1982), or at least to a physical reference standard. The Beer Flavor Wheel is rather technical and most suitable for a trained panel that can make use of references. The Danish Beer Language is a sensory vocabulary developed by the Danish Beer Academy, an organization active in raising awareness of beer in Denmark, together with industrial and academic partners (Figure 2.5 shows an English translation of it). This vocabulary focuses on everyday language, does not contain negative words (unlike the Beer Flavor Wheel, which devotes significant attention to flavor defects), and considers all sensory modalities.
Fig. 2.4. The original Beer Flavor Wheel. Chemical or physical references for the outer terms are given in the original paper by Meilgaard, Dalgliesh, & Clapperton (1979).
Appearance
  Color: Yellowish-white, Straw-colored, Golden, Reddish golden, Copper, Ruby red, Nut-brown, Chestnut, Coffee-like, Coal-black
  Livelihood: Still; Sparkling (small bubbles, medium sized bubbles, large bubbles)
  Turbidity: Clear; Hazy (sediments, floating)
  Foam: Color (White, Creamy, Café au Lait, Nut-brown); Texture (Airy, Dense, Creamy); Persistency (Short, Long)
Taste and mouthfeel
  Taste: Sour, Sweet, Salt, Bitter
  Aftertaste: short bitterness, long bitterness, short sweetness, long sweetness
  Mouthfeel: Full bodied, Bubbly, Refreshing, Warming, Astringent, Drying, Metallic, Coating
Aroma
  Hops and flowers: Hops, Rose, Geranium, Elderflower, Heather
  Corn and malt: Green grass, Straw, Bread (newly baked), Rye bread, Toasted bread, Caramel, Fresh butter, Malt candy, Burned coal, Coffee, Chocolate, Liquorice
  Spices: Vanilla, Cinnamon, Cloves, Allspice, Cardamom, Ginger, Coriander, Star anise, Bog myrtle, Juniper berry, Peppermint, Honey, Nutmeg, Thyme, Pepper, Cocoa
  Fruits/berries: Cherry, Strawberry, Raspberry, Blackcurrant, Apple, Pear, Banana, Melon, Pineapple, Peach, Apricot, Plums, Prunes, Raisin, Orange, Lemon, Grape, Lime
  Alcohol: Portwine, Sherry, Liquor, Wine-like, Grappa, Vinegar
  Woody/Earthy: Pine needle, Timber, Resin, Cellar, Earthy
  Nuts: Hazelnut, Almond, Walnut, Coconut
  Sulfurous: Newly ironed, Matches, Cooked corn, Celery, Onion, Leather, Rubber
Fig. 2.5. Descriptors included in the “Danish Beer Language” (Dansk Ølakademi)
Both are valid points of departure for selecting appropriate CATA descriptors, which should be done taking into account the specific study objective and the type of panelists available. The results of the studies in this thesis suggest that consumers will use simpler terms (i.e. understandable without the use of references), as previously observed for beer evaluation (Chollet & Valentin, 2007; Clapperton & Piggott, 1979) as well as for other food products (for a discussion, see van Trijp & Schifferstein, 1995). Consumers seemed to prefer integrated terms (viz., terms that combine several product attributes into one). However, consumers are expected to vary largely in their degree of expertise, and more experienced consumers may be able to use more precise descriptors. For instance, a subsequent analysis of the use of descriptors from the CATA dataset in Paper III, where consumers were divided into three groups based on their (self-reported) level of beer knowledge, showed that high- and low-knowledge consumers differ considerably in their use of terms (Fig. 2.6). In particular, less knowledgeable consumers used more overall flavor categories (nutty, woody, hoppy, etc.), whereas highly knowledgeable consumers were able to pick more specific flavor notes. The type of consumers available (e.g. a convenience sample vs. a group of beer enthusiasts) should thus be taken into account when selecting appropriate descriptors.
Fig. 2.6. Plots from a PLSR model, in which the background product knowledge of the consumers (X) was used to predict their use of CATA terms (Y). The circles represent 95% confidence ellipses of the regression coefficients estimated by jack-knife resampling. The X loadings plot (right) shows that, although not completely separated, the highly knowledgeable consumers differed from less experienced ones with regard to the terms they used. The Y loadings plot (left) shows which CATA terms in particular are used differently (Rinnan, Giacalone, & Frøst, In preparation).
2.5. Napping
The Napping® method was introduced by Pagès (2003, 2005), and can be considered a modified version (see Box 2.1) of the original projective mapping technique (Risvik et al., 1994). Both are based on the idea that perceived inter-product differences can be expressed as a Euclidean configuration in a single session. The method consists in presenting the samples simultaneously to each assessor, together with a large rectangular sheet of blank paper of a size similar to a standard A2 sheet (60 by 40 cm), which resembles a paper tablecloth (the word “napping” derives from “nappe”, the French word for “tablecloth”). Assessors are then instructed to evaluate the perceived similarities (or dissimilarities) between the samples by positioning them on the sheet in such a way that two samples are placed very close if they seem identical to them and distant from one another if they seem different. It is stressed that they have to do so according to their own criteria, and that there are no right or wrong solutions. At the end of the task, assessors usually write the sample code in the place it occupies on the sheet, or use post-it notes to that effect. A visual representation of a Napping set-up is given in Figure 2.7.
Box 2.1. Projective mapping versus Napping Projective mapping and Napping are often improperly used as synonyms in the literature. It is more correct to state Napping is a specific case of projective mapping. While Napping has a specified protocol (Pagès, 2005) with regards to materials, task instructions and data analysis, projective mapping is a more generic approach to sensory evaluations. The following is a schematic overview of differences between projective mapping and Napping:
                          Projective mapping                           Napping
Frame geometry and size   Rectangular (A4 or A3) or square (60 x 60)   Rectangular (60 x 40)
Frame look                Drawn axes, gridline, or blank               Blank
Data analyses             GPA, PCA, MDS-INDSCAL, STATIS                MFA on unscaled data
Although the issue of terminology may appear trivial, it is useful to bring this up because method users need to be aware that protocol modifications may alter the way assessors face the task and produce slightly different results. For example, as Dehlholm, Lê, & Bredie (submitted) show, the shape and size of the frame affect the projection strategies the assessors adopt (particularly the use of the first and second dimension). Further, data analyses other than MFA on unscaled data may yield results that do not reflect individual assessor’s use of the space (Morand & Pagès, 2006).
Fig. 2.7. Set-up of the Napping test (top), and detail of a participant (bottom). Pictures are from study 3.
These data are digitized into a data matrix with products as rows, and the X-coordinate and Y-coordinate on the sheet as columns. It is customary to place the origin of the coordinate system in the bottom left corner (though it can be placed anywhere). Finally, because Napping in itself is a kind of sorting task, it has become customary to instruct the assessors, once they have reached a final configuration, to add a list of sensory descriptors that they find appropriate to describe the samples, a procedure known as Ultra-flash Profiling (UFP, Perrin et al., 2008; Perrin & Pagès, 2009). The process and the resulting data structure are shown in Figure 2.8.
Fig. 2.8. Example of a completed Napping sheet for an individual assessor (above), and resulting data structure at panel level (below).
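To make the panel-level data structure concrete, the sketch below assembles individual sheets into one table with products as rows and an X and a Y column per assessor. The assessor labels, product codes, and positions are invented, the origin is assumed to be the bottom-left corner of a 60 x 40 cm sheet, and `panel_matrix` is a hypothetical helper name.

```python
# Illustrative only: assessor labels, product codes and positions (in cm,
# measured from the bottom-left corner of a 60 x 40 cm sheet) are invented.
SHEETS = {
    "assessor_1": {"A": (10.0, 30.0), "B": (12.0, 28.0), "C": (50.0, 5.0)},
    "assessor_2": {"A": (5.0, 5.0), "B": (8.0, 10.0), "C": (55.0, 35.0)},
}


def panel_matrix(sheets, width=60.0, height=40.0):
    """Return (column labels, {product: row}) with an X and a Y column per assessor."""
    assessors = sorted(sheets)
    products = sorted(sheets[assessors[0]])
    columns = [f"{axis}_{a}" for a in assessors for axis in ("X", "Y")]
    rows = {}
    for product in products:
        row = []
        for a in assessors:
            x, y = sheets[a][product]  # KeyError if an assessor skipped a product
            if not (0.0 <= x <= width and 0.0 <= y <= height):
                raise ValueError(f"{a}/{product} lies outside the sheet")
            row.extend([x, y])
        rows[product] = row
    return columns, rows


columns, rows = panel_matrix(SHEETS)
print(columns)
for product, row in rows.items():
    print(product, row)
```

Each pair of columns forms one assessor's block, which is the unit that the subsequent multivariate analysis treats separately.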
Napping data are usually analyzed by Multiple Factor Analysis (MFA), a multivariate data analytical technique that seeks the common structure between several blocks of variables (i.e. the individual assessors in a Napping task) describing the same observations (the samples). Figure 2.9 provides an exemplified overview of MFA applied to Napping data. As the figure shows, MFA can be thought of as a PCA in two steps. The main difference from PCA is that MFA takes individual differences into account, rather than averaging the data (Nestrud & Lawless, 2008).
Using the notation in Fig. 2.8, MFA starts by computing an initial PCA on each individual matrix $X_j$ (containing the sample coordinates for an individual assessor), which is subsequently transformed into a new matrix $\tilde{X}_j$ such that:

$$\tilde{X}_j = \frac{1}{\sqrt{\lambda_1^{j}}}\, X_j$$

where $\lambda_1^{j}$ represents the first eigenvalue of the initial PCA of matrix $X_j$. The quantity $\sqrt{\lambda_1^{j}}$, called the first singular value in MFA jargon, is basically a matrix equivalent of the standard deviation. This procedure corresponds to a normalization (i.e. the first eigenvalues of the transformed $\tilde{X}_j$ matrices are all equal to 1) that prevents the blocks with the largest variance from exerting an overwhelming influence: in Napping, this means accounting for individual panelists’ differences in the use of the projective space. After this step, the data blocks are concatenated in a global data table on which a new PCA is run, i.e. by singular value decomposition of the matrix $\tilde{X} = [\tilde{X}_1 \,|\, \tilde{X}_2 \,|\, \dots \,|\, \tilde{X}_J]$. The descriptive data from UFP are usually treated as supplementary variables to the MFA on the Napping coordinates. “Supplementary variables” means that UFP data are not used to construct the MFA model, but correlation coefficients of the UFP sensory descriptors are calculated and can be presented in the product space to aid the interpretation. It is important to remark that this solution provides a sensory configuration that is not necessarily driven by the sensory variables with the strongest structure, but by those that are relatively more important for the assessor⁴. Accordingly, some authors observed that this method can be thought of as producing both quantitative and qualitative sensory information (Chollet et al., 2011).
⁴ According to the inventor of the method, this is the main advantage of Napping over DA (Pagès, 2005). The latter produces (ultimately) a data matrix crossing products and descriptors. Such data, typically containing mean values over assessors, are then mean-centered column-wise and analyzed by Principal Component Analysis (PCA), either by giving identical weight to all variables after a normalization procedure so that each descriptor gets the same variance (usually dividing the data by the sample standard deviation), or by keeping the weight of each descriptor proportional to its variance (unscaled PCA). Whichever solution one chooses, Pagès (2005) observes, the weights given to the descriptors do not necessarily correspond to their actual importance for the subject.
Fig. 2.9. Schematic representation of MFA steps applied to a Napping dataset.
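The two MFA steps can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the software used in the thesis: each assessor's block is centered but not rescaled column-wise, divided by its first singular value, and the concatenated table is decomposed by SVD. The function name `mfa` and the simulated sheets are hypothetical. Note that, depending on whether the PCA eigenvalues are defined with or without division by the number of products, the weights differ only by a constant common to all blocks.

```python
import numpy as np


def mfa(blocks):
    """Sketch of MFA on unscaled Napping coordinates.

    blocks: list of (n_products x 2) arrays, one per assessor, with rows in
    the same product order. Returns the product scores, the share of variance
    per dimension, and the weighted blocks.
    """
    weighted = []
    for block in blocks:
        centered = np.asarray(block, dtype=float)
        centered = centered - centered.mean(axis=0)  # center, do not rescale columns
        first_sv = np.linalg.svd(centered, compute_uv=False)[0]
        if first_sv == 0:
            raise ValueError("an assessor placed all samples on the same spot")
        weighted.append(centered / first_sv)  # step 1: per-assessor normalization
    global_table = np.hstack(weighted)  # step 2: concatenate the blocks
    u, s, _vt = np.linalg.svd(global_table, full_matrices=False)
    scores = u * s  # consensus coordinates of the products
    explained = s**2 / np.sum(s**2)
    return scores, explained, weighted


rng = np.random.default_rng(0)
# Three hypothetical assessors using very different amounts of the sheet.
sheets = [rng.uniform(0, scale, size=(5, 2)) for scale in (60, 20, 5)]
scores, explained, weighted = mfa(sheets)
print(np.round(explained[:2], 3))
```

After the first step every block has a first singular value of 1, so an assessor who spreads the samples over the whole sheet does not dominate one who uses only a corner of it.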
The individual assessors’ maps (i.e. the initial PCAs) can be used as a source of additional information, for example for visually inspecting whether an assessor perceives the products differently from the consensus configuration. A more quantitative assessment of the latter can also be obtained by computing Rv coefficients (Robert & Escoufier, 1976) between the individual assessors’ configurations and the global one, as shown in Figure 2.10 below (the meaning and use of the Rv coefficient are discussed in Box 2.2).
Fig. 2.10. Comparison of two assessors’ individual sheets (green) and the consensus configuration (black). The assessor on the left is quite far from the consensus space (accordingly, the value of the Rv coefficient between his/her configuration and the consensus was 0.01), whereas the one on the right is very close (Rv = 0.89). The data are from two panelists that participated in Study 3.
In this thesis, partial Napping was applied to a set of commercially relevant beers on two occasions (Paper II and Paper III). Paper II reports results from an exploratory study conducted with project partners in which the suitability of the method for sensory profiling of craft beers was explored. Unlike the original Napping method, in which the evaluation is supposedly completely holistic, assessors were instructed to focus on taste and smell characteristics of the beer. This variant is sometimes referred to as Partial Napping (Dehlholm et al., 2012) or Napping by modality (Pfeiffer & Gilbert, 2008), and was chosen in part because there were some strong differences between samples with regard to color (which we were less interested in), and in part because this variant was found in a comparative study to produce results more similar to those of classical DA (Dehlholm et al., 2012; Pfeiffer & Gilbert, 2008). As discussed in Paper II, the method produced an interpretable representation of the beers and the UFP provided indications of the underlying sensory differences. However, a few of the beers could not be discriminated, due to a high inter-assessor variability in the way assessors built their maps. This is typical of a Napping task with untrained panelists, and is in agreement with previous claims that similarity-based methods (like Napping) are useful for generating a coarse product profile with summarized sensory information, but are generally less discriminative than verbal-based methods such as DA or CATA (Albert et al., 2011; Veinand et al., 2011). In that experiment, about half of the panelists were naïve consumers, while the rest were brewmasters. Analyses of the differences between the two groups revealed that the brewmasters had a higher agreement with regard to sample differences (Paper II), which suggests that the degree of product expertise may increase assessors’ reliability in a Napping task.
Box 2.2. The Rv coefficient
The Rv coefficient (Robert & Escoufier, 1976) is an index that has gained popularity in the sensory science field for comparing the outcome of descriptive methods, and for evaluating the performance of individual assessors in similarity-based methods. This measure has been used extensively in this thesis (most notably in Papers II and III), and thus it is useful to briefly elucidate its main properties. The Rv coefficient is a simple way of measuring the similarity of two groups of variables (in Napping: panelists’ coordinates vs. MFA loadings) measured on the same set of observations (the beer samples). Formally, the Rv coefficient is defined as follows:

$$R_V(X, Y) = \frac{\operatorname{tr}(XX'YY')}{\sqrt{\operatorname{tr}\left[(XX')^{2}\right]\,\operatorname{tr}\left[(YY')^{2}\right]}}$$

where X and Y are two mean-centered matrices that share the same row dimensionality. The Rv coefficient has several important properties that make it convenient for comparison of multivariate configurations. First, it takes values between 0 and 1 (0 = each variable of X is uncorrelated to each variable of Y; 1 = X and Y are identical). Second, it is a multivariate generalization of Pearson’s correlation coefficient (i.e., it corresponds to the correlation coefficient after rearranging the two matrix products, XX’ and YY’, into vectors), and thus can be interpreted similarly to the R² goodness of fit in ordinary regression analysis between two variables (Kherif et al., 2003; Meyners, 2001). Third, it is independent of rotation and scaling. Lastly, it is applicable to matrices with unequal numbers of variables (it only requires the rows to be the same), which makes it a very versatile tool for measuring inter-assessor variability and for global method comparisons (cf. Paper II for an application of the former, Paper III for the latter).
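The definition in Box 2.2 translates directly into a few lines of NumPy. The sketch below is illustrative only: the function name `rv_coefficient` and the coordinates are made up, and the example simply checks the stated invariance to rotation and scaling.

```python
import numpy as np


def rv_coefficient(x, y):
    """Rv coefficient between two configurations sharing the same rows."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must describe the same observations (rows)")
    x = x - x.mean(axis=0)  # mean-center column-wise
    y = y - y.mean(axis=0)
    sx = x @ x.T
    sy = y @ y.T
    return np.trace(sx @ sy) / np.sqrt(np.trace(sx @ sx) * np.trace(sy @ sy))


# Made-up coordinates for four products on one assessor's sheet.
sheet = np.array([[10.0, 30.0], [12.0, 28.0], [50.0, 5.0], [40.0, 35.0]])
theta = np.pi / 3
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
rotated_and_scaled = 2.5 * sheet @ rotation

print(round(rv_coefficient(sheet, rotated_and_scaled), 6))  # 1.0: rotation and scaling are ignored
```

Because only the row-by-row cross-product matrices enter the formula, the two inputs may have different numbers of columns, for instance a two-column individual sheet and a consensus configuration with more dimensions.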
In the second experiment (Paper III), Napping was used for sensory profiling of the same beers used for the CATA experiment mentioned beforehand, and the sensory profiles obtained by the two approaches were compared. The key indication was that the two methods produced a very similar product separation (Rv coefficients > 0.90), but differed somewhat in the descriptive output for the beers (Paper III). Basically, while CATA aided the description by increasing awareness of certain sensory characteristics, Napping facilitated the development of a larger and more diverse vocabulary, leading to a series of considerations with regard to method selection (discussed at the end of this chapter). In a follow-up study, the same beer samples used for Paper III were evaluated with the Napping methodology by a trained sensory panel (the in-house panel at Carlsberg Breweries). In a similar way as in Paper II, the results obtained were then compared with those from the consumer group, where we further distinguished between “novices” and “enthusiasts” based on their self-reported knowledge of and interest in beer. The results (unpublished except in the form of a conference presentation: Giacalone, Machado, & Frøst, 2012) showed no differences in mean Rv coefficients among the three groups (novices, enthusiasts and trained panelists). This does not concur with previous results (Paper II) and indicates that product knowledge does not necessarily yield higher consistency in product configuration by Napping. In contrast, however, the three groups differed markedly in the words they used during the UFP. The use of UFP descriptors was studied by ANOVA-Partial Least Squares Regression (Figure 2.11), which showed significant differences particularly between Experts and Novices. Experts used specific sensory attributes to a larger extent (e.g. malty, ester, astringent), whereas Novices tended to use more abstract and integrated terms (e.g. summer, heavy, youth).
Summarizing, the results obtained in this thesis suggest that consumers are capable of producing a sensory characterization of beers by Napping, and that the level of expertise only leads to a better (i.e. more specific) use of sensory descriptors. This supports and extends the findings by Clapperton and Piggott (1979), and by Chollet and Valentin (2001), who applied different approaches to beer evaluation (DA and a combination of sorting + UFP, respectively) and reached this very same conclusion. More generally, these results are largely in agreement with previous claims that the issue of expertise is more related to the quality of the sensory terminology than to perceptual abilities (e.g. Chollet, Valentin, & Abdi, 2005; Lawless, 1984; Solomon, 1990; Guerrero, Gou, & Arnau, 1997; Urdapilleta, Parr, Dacremont & Green, 2011).
Fig. 2.11. Correlation loadings plot from APLSR analysis showing the significant effect of product knowledge on the use of descriptors (Giacalone, Machado, & Frøst, 2012).
2.6. Assessment of sensory data from consumer panels
Rapid descriptive methodologies produce data that are qualitatively different from DA. The lack of replicates, the non-continuous nature of the data generated, the generally higher variability, and the foreseeable lack of consensus among panelists all require a non-standard approach to evaluating the quality of these data. This section briefly outlines some key issues that should be taken into account when evaluating data from CATA and Napping.
2.6.1. Discrimination
Discrimination has to do with answering the question: Are two or more products perceived as significantly different? For CATA data, until recently only exploratory analyses were applied (usually CA, as mentioned beforehand). In Paper I and Paper III, it was proposed to use Partial Least Squares Discriminant Analysis (PLS-DA) for statistical assessment of inter-sample differences. This procedure is a variation of ordinary PLSR in which one of the matrices is binary, and it uses the sub-models created during the cross-validation to estimate the reliability of the regression coefficients, a procedure known as jack-knife resampling⁵. The data set-up for this type of analysis with CATA data is shown in Figure 2.12. Depending on which matrix one uses as the regressor and which as the regressand, this analysis allows a rapid verification of both product differences (identity matrix as regressor) and significant descriptors (CATA matrix as regressor). Such analyses are known as, respectively, ANOVA-PLSR (A-PLSR) and Discriminant-PLSR (D-PLSR) (Martens & Martens, 2001).
Fig. 2.12. How to arrange CATA data into a form suitable for JK-PLSR. This figure was presented by the author at the workshop on alternative descriptive methodologies at the 9th Pangborn symposium.
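The jack-knife idea used in this section can be illustrated with a deliberately simplified stand-in. In the sketch below, ordinary least squares replaces PLSR (an assumption made to keep the example short; it is not the JK-PLSR model of Papers I and III): one centered 0/1 CATA term is regressed on centered product indicators, the model is refitted leaving one assessor out at a time, and the squared deviations of the sub-model coefficients from the overall ones give an uncertainty estimate. The `(m - 1) / m` scaling, the function name, and the data are likewise assumptions.

```python
import numpy as np


def jackknife_product_effects(products, assessors, checked):
    """Regress one centered 0/1 CATA term on centered product indicators and
    jack-knife the coefficients by leaving one assessor out at a time.

    NOTE: ordinary least squares stands in for PLSR here (an assumption made
    to keep the sketch short); it is not the thesis' JK-PLSR model.
    """
    products = np.asarray(products)
    assessors = np.asarray(assessors)
    checked = np.asarray(checked, dtype=float)
    levels = np.unique(products)

    def fit(keep):
        indicators = (products[keep][:, None] == levels[None, :]).astype(float)
        x = indicators - indicators.mean(axis=0)
        y = checked[keep] - checked[keep].mean()
        return np.linalg.pinv(x) @ y  # minimum-norm least-squares coefficients

    overall = fit(np.ones(len(checked), dtype=bool))
    ids = np.unique(assessors)
    sub_models = np.array([fit(assessors != a) for a in ids])
    m = len(ids)
    # Sum of squared deviations from the overall coefficients; the (m - 1) / m
    # factor is the usual jack-knife scaling and is an assumption here.
    variance = ((sub_models - overall) ** 2).sum(axis=0) * (m - 1) / m
    return levels, overall, np.sqrt(variance)


# Invented data: four assessors, two beers, one CATA term.
products = ["A", "B"] * 4
assessors = [1, 1, 2, 2, 3, 3, 4, 4]
checked = [1, 0, 1, 0, 1, 1, 0, 0]
levels, coef, se = jackknife_product_effects(products, assessors, checked)
print(levels, coef, se)
```

With a balanced design each coefficient is the deviation of a product's citation proportion from the overall proportion; dividing it by its jack-knife standard error gives the kind of t-type statistic on which the significance tests are based.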
As shown in Papers I and III, A-PLSR and D-PLSR have indeed been used for verification of differences between samples and identification of significant descriptors, from CATA profiles as well as from the descriptive output of Napping (UFP). An alternative approach that has meanwhile been proposed by other authors (e.g. Ares & Jaeger, 2013) is the application of Cochran’s Q test, a non-parametric test that can be carried out on a descriptor-by-descriptor basis, in order to identify significant differences between products for each of the terms included in the CATA ballot. Tables 2.1 and 2.2 present a comparison of the results obtained by this procedure with those obtained by JK-PLSR. As can be seen, the two approaches produce nearly identical estimates: this supports the validity of the approach proposed in Papers I and III, with PLSR being advantageous because it enables statistical testing and exploratory insights in a single analysis (in standard software packages this comes with graphic displays and an instant overview of product differences and descriptor correlations, which makes it suitable for less-experienced users without a firm statistical background).
⁵ Briefly, the procedure consists in computing the difference between the regression coefficient of the regressor variable in each sub-model (based on the samples left out during the cross-validation procedure) and the regression coefficient for the overall model. The sum of squares of the differences in all the sub-models is used as an estimate of the variance for that variable. The significance of the estimate for each variable is computed using a t-test, and thus it is possible to present the resulting regression coefficients with uncertainty limits for different levels of significance (Martens, Høy, Westad, Folkenberg, & Martens, 2001; Martens & Martens, 2000; 2001).
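For readers unfamiliar with Cochran’s Q, the statistic for a single CATA term can be computed as in the following sketch. The data are invented and `cochran_q` is a hypothetical helper; under the null hypothesis of no product differences, Q is compared against a chi-square distribution with (number of products − 1) degrees of freedom.

```python
def cochran_q(table):
    """Cochran's Q for one CATA term.

    table: one row per assessor, one 0/1 entry per product (every assessor
    evaluated every product). Returns (Q, degrees of freedom).
    """
    k = len(table[0])
    if any(len(row) != k for row in table):
        raise ValueError("every assessor must have one entry per product")
    column_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    denominator = k * n - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every assessor gives the same answer for all products")
    q = (k - 1) * (k * sum(c * c for c in column_totals) - n * n) / denominator
    return q, k - 1


# Invented data: six assessors (rows) judging whether "floral" applies to three beers.
floral = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
q, df = cochran_q(floral)
print(q, df)  # 6.5 2
```

Repeating the computation for each term on the ballot gives the descriptor-by-descriptor screening described above.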
Descriptors        D-PLSR 1   Cochran’s Q Test
Floral             ***        ***
Beans              ***        ***
Intense berries    n.s.       n.s.
Caramel            ***        ***
Nuts               ***        ***
Savory spices      **         **
Dessert Spices     **         *
Reg. Spices        n.s.       n.s.
Herbs              n.s.       n.s.
Citrus fruit       ***        ***
Berries            n.s.       n.s.
Fruit              n.s.       n.s.
Dried fruit        ***        ***
Liquor             ***        ***
Bitter             n.s.       n.s.
Sparkling          ***        ***
Refreshing         ***        ***
Fruity             ***        ***
Aromatic           ***        ***
Spicy              n.s.       n.s.
Still              n.s.       n.s.
Smoked             ***        ***
Foamy              n.s.       n.s.
Sour               ***        **
Sweet              ***        ***
Vinous             n.s.       n.s.
Warming            ***        ***
Significance levels: *** p < 0.001; ** p < 0.01; * p […]
[…] RV Novices and the null hypothesis as H0: RV Experts ≈ RV Novices
MATERIALS AND METHODS
The Beers
Nine special Danish beers (seven commercially available and two experimental) were chosen for this study. The selection was made in order to ensure enough variety between the samples and to illustrate current tendencies of beer making in Denmark. Moreover, each of them contained a special ingredient or flavor of interest to add more complexity to the discriminative task. The full list of the beer samples is given in Table 1.
Subjects
Subjects (N = 17) were recruited through the authors’ personal network. Roughly half of them (N = 8) were professional brewmasters or very knowledgeable beer consumers (named “experts” throughout this article), whereas the others (N = 9) were novice consumers with an interest in beer. None of them had ever undergone any sensory training in the description of beer flavors; however, it seemed sensible to assume a higher general experience with the product in the expert group than in the novice group.

TABLE 1 List and Details of the Beer Samples Used for the Study
Beer name | Brewery | Beer type∗ | Special flavor
Nutty | Ørbæk Bryggeri | Brown Ale | Walnuts
Fynsk Forår | Ørbæk Bryggeri | Pale Ale | Elderflower
Havre Stout | Bryggeri Skovlyst | Stout | Oat and rye
Classens Lise | Halsnæs Bryghus | Pale Ale | Chamomile and heather (Calluna vulgaris) honey
Enebær Stout | Grauballe Bryghus | Stout | Juniper berries
Bøgebryg | Bryggeri Skovlyst | Amber Ale | Beech twigs
Oak Aged Cranberry Bastard | Hornbeer | Fruit Beer | Cranberries
Rosehip Beer | Experimental | Pale Lager | Pilsner beer (“Grøn Tuborg”) with added rosehip powder (1)
Pine Beer | Experimental | Pale Lager | Pilsner beer (“Grøn Tuborg”) with added pine needles flavorant (2)
(1) “Hybenpulver Økologisk,” Coesam SA Laboratorios de Cosmetica. Concentration = 5% (5 g/95 g).
(2) “Pin Thyrol,” Firmenich SA. Concentration = 0.00625% (6.25 µl/100 ml).
∗ Self-reported by the producer.
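As a quick check on the two concentrations in the table notes, the arithmetic can be written out as follows. This is only a sketch: it assumes that the rosehip figure is a mass fraction of the final mixture (5 g of powder added to 95 g of beer) and that the pine flavorant figure is a volume fraction relative to 100 ml of beer, neither of which is stated explicitly in the source.

$$
c_{\text{rosehip}} = \frac{5\ \text{g}}{5\ \text{g} + 95\ \text{g}} = 0.05 = 5\%,
\qquad
c_{\text{pine}} = \frac{6.25\ \mu\text{l}}{100\ \text{ml}} = \frac{6.25 \times 10^{-3}\ \text{ml}}{100\ \text{ml}} = 6.25 \times 10^{-5} = 0.00625\%.
$$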
Consumer-Based Product Profiling
Experimental Procedure: Partial Napping
The study was conducted in two identical experimental sessions. All the beer samples were presented simultaneously to the subjects, blind-labeled with three-digit codes. They were served at a temperature of 8 °C in clear glasses, with approximately 5 cl of each beer given to each subject. For the Napping task, subjects were provided with a large sheet of blank paper: the tablecloth, or nappe. Following Pagès (2005), we used a 60 cm × 40 cm white sheet with no coordinates drawn (footnote 1). Subjects were instructed to smell and taste the samples one by one, then place them on the tablecloth in a way that reflected the perceived similarities and differences: similar samples should be placed close together, and different samples far apart (Pagès, 2005). It was also explained to the subjects that they were free to choose their own criteria for placing the samples on the sheet, and that there was no right or wrong way of doing it. We did introduce one restriction, however: subjects were instructed to focus solely on smell and taste characteristics and not to consider appearance and mouthfeel. This kind of Napping task, focused on a limited number of sensory dimensions, is known as “partial Napping” or “Napping by modality” (Pfeiffer & Gilbert, 2008) and was originally proposed by Pagès (2003). Compared with its holistic version, partial Napping allows the subjects to be more analytical (Pfeiffer & Gilbert, 2008). In the present study, the sample set had rather large visual differences that were irrelevant to the flavors, hence the restriction to smell and taste. Furthermore, in comparative studies, partial Napping has been found to give the results closest to those of conventional profiling (Dehlholm, Brockhoff, Meinert, Aaslyng, & Bredie, 2012), supporting the reliability of this method (Pfeiffer & Gilbert, 2008).
When the discriminative task was completed, subjects were instructed to write the code of each beer sample on the tablecloth at the position it occupied. At this point, the Napping was combined with an Ultra-Flash Profiling (UFP; Perrin et al., 2008) task: the subjects were asked to write down, directly on the sheet, any words they found appropriate to describe each sample. A completed nappe thus consists of marks indicating the position of each sample plus, next to each mark, the descriptors used for that sample. The combination of Napping and UFP has been used by several authors, and the two techniques have proved to complement each other well (Albert, Varela, Salvador, Hough, & Fiszman, 2011; Perrin et al., 2008; Pfeiffer & Gilbert, 2008). Through ad hoc multivariate statistical analysis, Napping and UFP together can provide a quick profile showing relationships between products and descriptors, similar to Principal Component Analysis (PCA) results from conventional profiling (Pfeiffer & Gilbert, 2008).

1 Unlike the original Projective Mapping technique, where an A4 sheet with two crossed axes was used (Risvik et al., 1994).
DATA TREATMENT
Data were digitized using a coordinate system with the origin in the bottom-left corner of the sheet. The outcome was a table with 9 rows (the samples) and 34 columns (the X and Y coordinates for each of the 17 subjects). The descriptors, i.e., the words elicited during the Ultra-Flash Profiling part, were entered separately and treated as a contingency table crossing products and descriptors. Because of the large number of terms generated during the descriptive task (no restrictions were placed on either the type or the number of words that could be used), synonyms and near synonyms were semantically clustered prior to the analysis. This was done qualitatively, with the help of two “institutional” tools: the “beer flavor wheel” (Meilgaard, Dalgliesh, & Clapperton, 1979) and the “Danish Beer Language” (Det Danske Ølakademi, 2006). Furthermore, words mentioned only once were not included in the analysis.
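The two data tables described above can be illustrated with a small sketch. It is not the authors' code (the analyses were run in R); the subject labels, sample codes, coordinates, and words below are hypothetical, and "mentioned only once" is assumed to mean once across the whole panel.

```python
from collections import Counter

# Hypothetical digitized sheets: subject -> {sample code: (x_cm, y_cm)},
# measured from the bottom-left corner of the 60 cm x 40 cm sheet.
sheets = {
    "S1": {"119": (5.0, 10.0), "144": (50.0, 30.0), "284": (30.0, 20.0)},
    "S2": {"119": (12.0, 35.0), "144": (40.0, 5.0), "284": (20.0, 18.0)},
}
# Hypothetical (sample, word) mentions, after synonyms were merged by hand.
words = [
    ("119", "sour"), ("119", "sour"), ("144", "pine"),
    ("144", "pine"), ("284", "sweet"), ("284", "pine"),
]

def coordinate_table(sheets):
    """One row per sample, two columns (X, Y) per subject."""
    subjects = sorted(sheets)
    samples = sorted(next(iter(sheets.values())))
    header = [f"{axis}_{s}" for s in subjects for axis in ("X", "Y")]
    rows = {
        sample: [sheets[s][sample][k] for s in subjects for k in (0, 1)]
        for sample in samples
    }
    return header, rows

def contingency_table(words, min_mentions=2):
    """Product x descriptor counts, dropping words mentioned only once overall."""
    totals = Counter(word for _, word in words)
    kept = sorted(w for w, n in totals.items() if n >= min_mentions)
    counts = Counter(words)
    products = sorted({p for p, _ in words})
    return kept, {p: [counts[(p, w)] for w in kept] for p in products}

header, rows = coordinate_table(sheets)
kept, table = contingency_table(words)
```

With 17 subjects and 9 samples, the same layout gives the 9 × 34 coordinate table described in the text.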
Statistical Analysis
All statistical analyses were performed in the R computing environment (R Development Core Team, 2005) using the R packages SensoMineR (Lê & Husson, 2008) and FactoMineR (Lê, Josse, & Husson, 2008).
Procrustes Multiple Factor Analysis
The data were analyzed by Procrustes Multiple Factor Analysis (PMFA) (Morand & Pagès, 2006). As the name suggests, this method combines Multiple Factor Analysis (Pagès & Husson, 2001) and Generalized Procrustes Analysis (Gower, 1975). It consists of creating an MFA consensus configuration from the subjects’ individual sheets (Morand & Pagès, 2006), which makes it particularly suitable for a Napping task, in which sensory differences between products are expressed as Euclidean distances. The analysis is performed in three steps. First, a separate PCA is performed on each individual configuration, in which each product sample has X and Y coordinates. These data are then normalized by dividing all the elements by the first eigenvalue obtained from this PCA. Second, the normalized individual data sets are merged into a global matrix, and a new PCA is performed on this matrix. The scores plot of this new PCA represents a consensus map of how the panel perceived the products on a global level. Finally, each individual configuration undergoes a Procrustean rotation and is superimposed on the global configuration, in order to allow
comparisons between the individual configurations and the global one. This can be done by computing the RV coefficient, a statistical measure of fit between two configurations (see next section). Thus, PMFA allows each individual configuration to be compared with the consensus configuration, which can be used to gather various insights. The descriptors (i.e., the words elicited during the Ultra-Flash Profiling task) are used as a set of supplementary variables, meaning that they do not influence the construction of the factors (i.e., the axes), but their correlation coefficient with each factor is calculated and represented visually, as in the loadings plot of a PCA model (Perrin et al., 2008).
The use of MFA for handling Napping data was one of the most important additions introduced by Pagès (2005) to the original projective mapping technique. The main advantage of MFA for this kind of data is that, unlike previously employed methods such as PCA, it treats each individual as a group of two unstandardized variables (the X and Y coordinates) (Pagès, 2005). This is important because individual differences in the use of the space (vertical vs. horizontal dimension) are respected, whereas in a standardized PCA this initial configuration would be deformed (Morand & Pagès, 2006); it is thus possible to see what is actually important for each person (Lawless & Heymann, 2010). Furthermore, MFA balances the various groups (i.e., the subjects) as explained earlier (footnote 2), ensuring that no group plays a predominant role in the first two dimensions of the consensus configuration.
After the PMFA model was calculated, Agglomerative Hierarchical Clustering (AHC) was performed on the data to obtain a tree of similarities and differences between the samples based on their scores on the MFA dimensions (Husson, Lê, & Pagès, 2011).
This clustering technique uses Ward’s agglomerative algorithm to progressively regroup elements in a Euclidean space (in this case, the consensus profile), using cluster inertia as a measure of variability. By adding more dimensions, and thus more inertia, the number of clusters decreases and within-cluster inertia increases, until all the individuals are in the same cluster (Husson et al., 2011). In our case, samples are defined by quantitative variables (their scores on the successive MFA dimensions); thus, clustering was performed with an equation similar to that of an n-way ANOVA (with clusters as objects and dimensions as treatments). The optimal number of clusters was obtained by minimizing the following criterion:

$$\min_{q_{\min} \le q \le q_{\max}} \frac{\Delta(q)}{\Delta(q+1)},$$

where $\Delta(q)$ is the between-clusters inertia increase when moving from $q - 1$ to $q$ clusters, and $q_{\min}$ and $q_{\max}$ are, respectively, the minimum and maximum number of clusters chosen by the user (Husson et al., 2011).

2 In the version used for this study, which is the one proposed by the method developers (Pagès & Husson, 2001; Pagès, 2005), the scaling factor is the first eigenvalue of a separate PCA performed on each individual configuration.
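The balancing of subjects in the MFA step of PMFA (each sheet scaled using the first eigenvalue of its own PCA) can be sketched as follows. This is an illustration under stated assumptions, not the SensoMineR implementation: it follows the common MFA convention of weighting each group's variables by the inverse of the first eigenvalue, which is equivalent to dividing the centered coordinates by its square root, and the two sheets are hypothetical.

```python
from math import sqrt

def center(points):
    """Subtract the column means from a list of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return [(x - mx, y - my) for x, y in points]

def first_eigenvalue(points):
    """Largest eigenvalue of the 2 x 2 covariance matrix of centered points."""
    n = len(points)
    a = sum(x * x for x, _ in points) / n  # variance of X
    c = sum(y * y for _, y in points) / n  # variance of Y
    b = sum(x * y for x, y in points) / n  # covariance of X and Y
    return (a + c) / 2 + sqrt(((a - c) / 2) ** 2 + b ** 2)

def mfa_weight(points):
    """Center one subject's sheet and rescale it so its first eigenvalue is 1."""
    centered = center(points)
    scale = sqrt(first_eigenvalue(centered))
    return [(x / scale, y / scale) for x, y in centered]

# Hypothetical sheets: one subject spreads the samples widely, the other barely.
wide_user = [(2.0, 18.0), (55.0, 22.0), (30.0, 20.0), (10.0, 25.0)]
small_user = [(28.0, 19.0), (32.0, 21.0), (30.0, 24.0), (29.0, 17.0)]
weighted = [mfa_weight(sheet) for sheet in (wide_user, small_user)]

# Input to the global PCA: one row per sample, columns X1, Y1, X2, Y2.
global_rows = [w1 + w2 for w1, w2 in zip(*weighted)]
```

After weighting, both sheets have a first eigenvalue of 1, so neither subject can dominate the first consensus dimension simply by using more of the paper, while the X-to-Y proportions within each sheet are left untouched.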
RV Coefficients
The RV coefficient (Robert & Escoufier, 1976) is a measure of similarity between two sets of points. Its value lies in the closed interval [0, 1]: the closer it is to 1, the more similar the two configurations. Computing RV coefficients in a Napping test can provide very useful indications. For instance, if the test is done with trained assessors, it can provide a measure of a panelist’s performance relative to the group. In the present context, we used the mean RV coefficients of beer experts and novices to test for differences between these two groups. RV coefficients were computed for the subjects in both groups, and a t-test on the group means was used to determine whether real differences existed, allowing us to test our hypothesis that experts would give more consistent profiles than novices. Figure 1 gives an example of the consensus configuration with an individual tablecloth superimposed.
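For readers who want to see the computation, the RV coefficient between two column-centered configurations X and Y is tr(XX'YY') / sqrt(tr(XX'XX') tr(YY'YY')), which can be evaluated from the small cross-product matrices X'Y, X'X, and Y'Y. The sketch below uses plain Python and hypothetical coordinates; the function names are illustrative and are not those of the R packages used in the study.

```python
from math import sqrt

def center(points):
    """Subtract the column means from a configuration (list of coordinate tuples)."""
    n = len(points)
    dims = range(len(points[0]))
    means = [sum(p[j] for p in points) / n for j in dims]
    return [[p[j] - means[j] for j in dims] for p in points]

def cross_product(a, b):
    """Return the matrix A^T B for two configurations with matching rows."""
    return [
        [sum(ra[i] * rb[j] for ra, rb in zip(a, b)) for j in range(len(b[0]))]
        for i in range(len(a[0]))
    ]

def frobenius_sq(matrix):
    """Sum of squared entries of a matrix."""
    return sum(v * v for row in matrix for v in row)

def rv_coefficient(x, y):
    """RV coefficient between two configurations of the same samples."""
    xc, yc = center(x), center(y)
    numerator = frobenius_sq(cross_product(xc, yc))
    denominator = sqrt(
        frobenius_sq(cross_product(xc, xc)) * frobenius_sq(cross_product(yc, yc))
    )
    return numerator / denominator

# Hypothetical consensus map and two individual sheets for four samples.
consensus = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 5.0)]
# Same layout rotated by 90 degrees, doubled in size, and shifted.
rotated = [(-2.0 * y + 7.0, 2.0 * x - 1.0) for x, y in consensus]
# Same positions, but assigned to different samples.
shuffled = [consensus[2], consensus[0], consensus[3], consensus[1]]

rv_same = rv_coefficient(consensus, rotated)
rv_diff = rv_coefficient(consensus, shuffled)
```

The rotated sheet illustrates why RV suits Napping data: it is unaffected by rotation, translation, and overall scaling of the tablecloth, so only the relative arrangement of the samples matters.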
RESULTS
The PMFA yielded five main representations of our data set: (1) a consensus profile of the 9 beers; (2) a representation of the X and Y dimensions of each tablecloth and their relations with the MFA dimensions; (3) a representation of the descriptors used by the subjects to describe the beers; (4) individual profiles of the 9 beers by each panelist; and (5) individual profiles superimposed on the consensus one (as exemplified in Figure 1). Each of them can provide different insights, as we show in the following sections.
Consensus Profile of the 9 Beers
The consensus product map (Figure 2) is obtained only from the Napping data. This plot shows how the subjects perceived the beers relative to each other on an overall level: the closer two beers are, the more similar they were perceived to be, and the further apart, the more different. The first two dimensions combined explain 55.3% of the total variance in the data set. Most of the beers are distributed along the first dimension, whereas the second dimension mostly describes the specificity of one particular sample (Oak Aged Cranberry Bastard) and, to a lesser extent, highlights further differences by spreading some of the other beers apart. The scree plot suggested that two components were optimal. However, adding a third dimension (for a total of 67.1% of variance explained) further explained the specificity of Classens Lise, already recognized as different in the first two dimensions. The Agglomerative Hierarchical Clustering showed that the nine samples were located in seven distinct clusters. Samples that are not significantly different from each other are grouped by ellipses in Figure 2 (Cluster 1: Havre Stout and Enebær Stout; Cluster 2: Bøgebryg and Nutty); that is, all the beers except the clustered ones were perceived as different by our panel (p < 0.05).

FIGURE 1 Consensus configuration (black) with an individual tablecloth superimposed (green, Subject Y2); axes: Dim 1 vs. Dim 2. RV between the mean representation and the representation of Y2: 0.3824. Note the Procrustean rotation (to maximize similarity). The computed RV coefficient at the bottom of the plot gives a measure of the similarity between the two configurations. For brevity, samples are labeled with 3-digit codes at this stage (color figure available online).

FIGURE 2 Consensus product map (Dim 1, 35.8%, vs. Dim 2, 19.5%) (color figure available online).

To elaborate on the consensus profile, the configuration is obtained only from the Napping data, i.e., from the X and Y coordinates of the individual configurations. This means that for each beer there are 17 partial data points (one for each panelist), and that each position in the consensus map is an average of the partial ones. This is exemplified visually in Figure 3, where, for clarity, we considered only three samples (chosen because their average points lie far from the origin in different directions). On a global level, these beers were perceived as different by the whole panel. However, superimposing the individual points shows remarkable individual differences: for example, assessor 1 did not clearly discriminate between Oak Aged Cranberry Bastard and Enebær Stout but opposed both of them to the Pine Beer. Conversely, assessor 8 placed the Pine Beer and Oak Aged Cranberry Bastard close to each other and opposed them to Enebær Stout. These kinds of individual differences are to be expected in a Napping task, where individuals are free to choose their own criteria for characterizing the samples. Thus, while the consensus profile provides a “compromise” configuration (Pagès, 2005), inspecting individual configurations can reveal which sensory characteristics mattered most to individual consumers.

FIGURE 3 Individual factor map: representation of three beers (Oak Aged Cranberry Bastard, Pine Beer, and Enebær Stout) obtained by averaging the 17 partial points. The partial positions for each panelist (As1 to As17) are shown in different colors and can be related to the average position (color figure available online).
Subjects’ and Dimensions’ Influence
Inter-individual differences are also evident in the subjects’ representation (Figure 4), a plot with the first two MFA dimensions as abscissa and ordinate. Each subject is plotted according to his or her weight on each dimension of the MFA model. Thus, the relative importance given by each subject to each dimension (and to the underlying sensory characteristics) can be observed. Assessors 4 and 7 assess the samples as most different from each other along the first underlying dimension, whereas assessors 14 and 15 use dimension 2 the most. However, the subjects differ mostly in their relation to the first dimension, while little variation can be observed in dimension 2; we can conclude that the sensory differences responsible for differences between samples are the ones described by dimension 1 (as was to be expected from the consensus profile).

FIGURE 4 Subjects representation (Dim 1, 35.8%, vs. Dim 2, 19.5%): representation of the 17 subjects. The coordinate of each subject along the two axes indicates the importance given to that dimension by the subject in his or her configuration (color figure available online).

In a Napping test, inter-product differences are given as Euclidean distances, meaning that each product has an X and a Y coordinate on each panelist’s sheet. These coordinates differ between assessors (as shown in Figure 3) and determine the formation of the MFA dimensions. Figure 5 shows how each individual panelist’s sample coordinates are correlated with the first two dimensions. Correlation is indicated by the angle formed between the dimension and the vector; contribution is indicated by the vector length. These coordinates are not normalized in the construction of the axes; in Figure 5, they appear as normalized to show their correlation with the MFA dimensions. Visual inspection indicates that the X vectors have higher correlations with the first dimension. However, the large number of panelists makes the plot hard to read. We computed some simple descriptive statistics to ease the interpretation: on average, X coordinates had both a larger range (X = 47.72; Y = 26.90) and a larger standard deviation (X = 17.50; Y = 9.35) than the Y coordinates. Greater use of the horizontal dimension than of the vertical one is a recurrent result in the Napping test, which is explained by the rectangular shape of the tablecloth, with its longer horizontal side. However, it is conjectured that this greater use of the horizontal dimension corresponds to a more significant sensory dimension for the subject; thus the X vectors will correspond to the most important sensory characteristic for the subject (Pagès, 2005), regardless of which MFA dimension they are related to.

FIGURE 5 Correlation circle (Dim 1, 35.8%, vs. Dim 2, 19.5%): representation of the individual axes of coordinates (X1 to X17 and Y1 to Y17) and their correlation with the first two MFA dimensions (color figure available online).

FIGURE 6 Representation of sensory descriptors and their correlation with the MFA dimensions (indicated by the corresponding vector). Terms in large fonts were used more frequently (uppercase: >30 times; lowercase: >20 times). Descriptors used only once and/or of hedonic nature have been omitted (color figure available online).
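The descriptive statistics on the use of the sheet can be reproduced along the following lines. This is a minimal sketch with hypothetical coordinates; it assumes that the range and the sample standard deviation were computed per subject and then averaged over subjects, which the text does not specify.

```python
from statistics import mean, stdev

def spread_by_axis(sheets):
    """Average, over subjects, the range and sample SD of the X and Y coordinates."""
    summary = {}
    for axis, k in (("X", 0), ("Y", 1)):
        ranges, sds = [], []
        for points in sheets.values():
            values = [p[k] for p in points]
            ranges.append(max(values) - min(values))
            sds.append(stdev(values))
        summary[axis] = {"range": mean(ranges), "sd": mean(sds)}
    return summary

# Hypothetical sheets (cm from the bottom-left corner), three samples each.
sheets = {
    "S1": [(5.0, 10.0), (50.0, 30.0), (30.0, 20.0)],
    "S2": [(10.0, 12.0), (40.0, 22.0), (25.0, 17.0)],
}
summary = spread_by_axis(sheets)
```

A larger average range and standard deviation for X than for Y, as in this toy data, is the pattern reported for the 60 cm × 40 cm tablecloth.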
Descriptors Representation and Interpretation
The analysis conducted thus far has been concerned with how the subjects discriminated the products. The present section discusses the underlying sensory differences. Once a consensus profile has been computed, it can be interpreted according to pre-existing knowledge of the sample set. However, since we combined Napping with a descriptive task (Ultra-Flash Profiling), sensory directions could be drawn directly from the subjects’ elicited descriptions. Figure 6 shows the representation of the sensory descriptors and their correlation with the MFA dimensions (though they are not used for constructing the axes). The plot shows that the first dimension opposes bitter and fruity to “woody” notes (“pine,” “wood,” “forest,” etc.) and flowery notes. Sourness loads strongly and positively on dimension 2, in opposition to sweetness. It is now possible to interpret the properties underlying our consensus profile. Inspecting Figure 2 and Figure 6 together reveals the sensory characteristics responsible for the product differences.
Along the first MFA dimension, in the lower left quadrant, we find Cluster 1 and Cluster 2, which together contain four beers: Bøgebryg, Havre Stout, Enebær Stout, and Nutty. These beers are correlated with sensory descriptors such as “nutty,” “licorice,” “malty,” “coffee,” and “smoked,” and with the beer type descriptors “stout” and “brown ale.” These two clusters essentially group all the dark beers together (two stouts and two dark ales), in opposition to the others. The sensory descriptors correlated with these clusters are consistent with the beer styles they belong to. Within-cluster differences are less clear; however, the two stouts (Cluster 1) were perceived as more similar to each other than the two ales were. In the top left quadrant, we find Oak Aged Cranberry Bastard, with the highest score on dimension 2 and opposed to the rest of the beers in the first two dimensions.
Oak Aged Cranberry Bastard is a fruit beer brewed with cranberries and produced by spontaneous fermentation, as are some Belgian beers. This latter characteristic was perceived by some subjects, and the descriptors “wild fermentation” and “sour wild yeast” were positively correlated with this beer. No panelist could recognize the taste of cranberry as such, but the sensory descriptors most positively associated with this beer are “sour” and “light bitterness,” plausible characteristics of the taste of cranberry. Classens Lise, a pale ale with added honey and chamomile, is located in the lower right quadrant, moderately correlated with the first dimension. Its position was close to the origin, indicating that neither of the first two dimensions modeled this beer well; however, one of its ingredients is honey, and indeed it correlates positively with the descriptor “sweet” and negatively with “sour.” A third dimension (not included in this analysis) distinguished Classens Lise from the remainder of the beers. In the same lower right quadrant is the Pine Beer, an experimental beer: a pilsner with added pine needle flavoring. This beer loads positively on dimension 1, where the woody characteristics are strong, and is highly correlated with descriptors such as “pine,” “woody,” “forest,” “astringent,” and “dry,” showing that this pine-woody dimension was clearly perceived by most panelists. In the top right quadrant we find Fynsk Forår, a pale ale with added elderflower. It loads positively on both dimensions and is correlated with descriptors such as “flowery” and “elderflower.” Finally, the other experimental beer, a pilsner with added rosehip powder, is also positively correlated with dimension 1 and somewhat correlated with flowery notes. The descriptor “pilsner” also loads highly on dimension 1 and is correlated with Pine Beer and Rosehip Beer (the two pilsner beers in the sample set).
Individual vs. Average: Superimposed Representations and Inter-Group Differences
This section addresses the second aim of the study: comparing the ability of expert and novice consumers to perform a Napping task. In a Napping test, the agreement of one subject with the overall product map can be evaluated by superimposing his or her own configuration on the average one, as in the example shown in Figure 1. This feature is especially advantageous, whether working with consumers or trained panelists, since it allows the researcher to inspect visually whether a subject perceived the products similarly to or differently from the consensus configuration. As explained earlier, a measure of fit between the two configurations can be obtained by computing the RV coefficient: the higher it is, the closer the individual configuration is to the consensus configuration. One of the tested hypotheses was that people with greater knowledge of beer would profile the samples more consistently, i.e., have larger RV values. To test this, the panelists were divided into two groups: experts and novices. The mean RV coefficient for the expert group was higher than that of the novice group, and the t-test revealed a significant difference between the two group means (p = 0.0125). The results (summarized in Table 2) support our hypothesis.
TABLE 2 Mean and Standard Deviations Computed for the Two Groups of Assessors
RV coefficients | Mean | S.D.
Experts (n = 8) | 0.618 | 0.167
Novices (n = 9) | 0.405 | 0.169
Unpaired t-test showed significant difference between the two groups (p = 0.01252)
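As an illustration of how the group comparison works, the unpaired t statistic can be computed from the rounded summary values in Table 2. The sketch below assumes a pooled-variance (Student) test; the source says only "unpaired t-test", so the variant is an assumption, and the function name is illustrative.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's unpaired t statistic and degrees of freedom from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Rounded values from Table 2: experts (n = 8) vs. novices (n = 9).
t_stat, df = pooled_t(0.618, 0.167, 8, 0.405, 0.169, 9)
```

The p-value is not recomputed here, since it depends on the unrounded individual RV values and on the exact test variant used in the original analysis.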
DISCUSSION AND CONCLUSIONS
The first and foremost aim of this study was to test the applicability of Napping, performed by untrained panelists, to discriminate among a sample of Danish special beers. In that respect, the experiment was successful. The samples were discriminated, and the method provided interpretable results regarding the sensory dimensions responsible for differences between beers. Most of our subjects positioned the samples according to either the beer style or the specific flavor added. Overall, the first dimension could be seen as separating lagers and pale ales from the four dark ales (footnote 3). The influence of the special ingredient was especially important for the two experimental beers and for the beer most correlated with the second MFA dimension (Oak Aged Cranberry Bastard). Furthermore, Napping allowed us to look at the differences between the subjects and their level of agreement with the general consensus via the RV coefficient. We found that, on average, experts had configurations more consistent with the consensus profile. This suggests that precision is higher with product experts. This conclusion must be treated with caution: our results, although significant, refer to a small sample. Moreover, the choice of the type of panelists may depend more on the specific aim of the test. Larger and more systematic studies are needed to better understand the effect of product knowledge on a subject’s performance in a Napping test.
From a business-oriented perspective, it is important to stress that (partial) Napping is a very fast technique. Each experimental session took no more than 30 minutes (plus 30 minutes of preparation), with no prior training required. Moreover, the task was very well received by the subjects, who generally enjoyed the experience as a sort of tasting game.
In our view, the combination of Napping and Ultra-Flash Profiling applied in this study is especially interesting because it relies directly on the subjects’ perceptions, letting them decide autonomously which sensory dimensions are the most important. In this sense, the technique can be said to provide both quantitative and qualitative information, as has been observed for other rapid methods (Chollet et al., 2011). This characteristic is particularly important in studies that aim to understand what really matters to consumers or, in other words, where external validity is a key issue (Garber, Hyatt, & Starr, 2003). Thus, partial Napping could also be employed to select relevant descriptors for subsequent use in, for example, a conventional profile, and/or to select a subset of products for subsequent larger-scale consumer tests. These characteristics make Napping a versatile method, useful in a variety of settings, either in combination with other sensory profiling techniques or as a stand-alone method. Furthermore, the speed and low cost of this method make it a valuable opportunity, especially for SMEs in the food industry, which seldom have the resources for, or access to, conventional sensory panels.

3 With regard to the beers in clusters 1 and 2, it should be acknowledged that the subjects might have been influenced by the color (though they were instructed to concentrate only on smell and taste), since clear glasses were used during the Napping task. The color of a product is known to affect the perception of other sensory characteristics (Lawless & Heymann, 2010). The sensory descriptors elicited, however, match our previous knowledge of the samples and the commercial descriptions by the producer.
REFERENCES
Abdi, H., Valentin, D., Chollet, S., & Chrea, C. (2007). Analyzing assessors and products in sorting tasks: DISTATIS, theory and applications. Food Quality and Preference, 18(4), 627–640.
Albert, A., Varela, P., Salvador, A., Hough, G., & Fiszman, S. (2011). Overcoming the issues in the sensory description of hot served food with a complex texture: Application of QDA®, flash profiling and projective mapping using panels with different degrees of training. Food Quality and Preference, 22(5), 463–473.
Ares, G., Deliza, R., Barreiro, C., Giménez, A., & Gámbaro, A. (2010). Comparison of two sensory profiling techniques based on consumer perception. Food Quality and Preference, 21(4), 417–426.
Chollet, S., Lelièvre, M., Abdi, H., & Valentin, D. (2011). Sort and beer: Everything you wanted to know about the sorting task but did not dare to ask. Food Quality and Preference, 22(6), 507–520.
Dairou, J., & Sieffermann, J. (2002). A comparison of 14 jams characterized by conventional profile and a quick original method, the flash profile. Journal of Food Science, 67(2), 826–834.
Dehlholm, C., Brockhoff, P. B., Meinert, L., Aaslyng, M. D., & Bredie, W. L. P. (2012). Rapid sensory methods: Comparison of free multiple sorting, partial napping, napping, flash profiling and conventional profiling. Food Quality and Preference, 26(2), 267–277.
Det Danske Ølakademi. (2006). Det danske ølsprog [The Danish beer language]. Copenhagen, Denmark: Danish Brewers Union.
Garber, L. L., Hyatt, E. M., & Starr, R. G. (2003). Measuring consumer response to food products. Food Quality and Preference, 14(1), 3–15.
Gower, J. C. (1975). Generalized procrustes analysis. Psychometrika, 40(1), 33–51.
Husson, F., Lê, S., & Pagès, J. (2011). Exploratory multivariate analysis by example using R. Boca Raton, FL: CRC Press.
Jack, F. R., & Piggott, J. R. (1992). Free choice profiling in consumer research. Food Quality and Preference, 3(3), 129–134.
Lawless, H. T., & Heymann, H. (2010). Sensory evaluation of food: Principles and practices (2nd ed.). New York, NY: Springer.
Lê, S., & Husson, F. (2008). SensoMineR: A package for sensory data analysis. Journal of Sensory Studies, 23(1), 14–25.
Lê, S., Josse, J., & Husson, F. (2008). FactoMineR: An R package for multivariate analysis. Journal of Statistical Software, 25(1), 1–18.
Meilgaard, M. C., Dalgliesh, C. E., & Clapperton, J. F. (1979). Beer flavor terminology. Journal of the Institute of Brewing, 85(1), 38–42.
Morand, E., & Pagès, J. (2006). Procrustes multiple factor analysis to analyse the overall perception of food products. Food Quality and Preference, 17(1–2), 36–42.
Pagès, J., & Husson, F. (2001). Inter-laboratory comparison of sensory profiles: Methodology and results. Food Quality and Preference, 12(5–7), 297–309.
Pagès, J. (2003). Recueil direct de distances sensorielles: Application à l’évaluation de dix vins blancs du val-de-loire [Direct collection of sensory distances: Application to the evaluation of ten white wines from the Loire Valley]. Sciences des Aliments, 23, 679–688.
Pagès, J. (2005). Collection and analysis of perceived product inter-distances using multiple factor analysis: Application to the study of 10 white wines from the Loire valley. Food Quality and Preference, 16(7), 642–649.
Perrin, L., Symoneaux, R., Maître, I., Asselin, C., Jourjon, F., & Pagès, J. (2008). Comparison of three sensory methods for use with the Napping® procedure: Case of ten wines from Loire valley. Food Quality and Preference, 19(1), 1–11.
Pfeiffer, J. C., & Gilbert, C. C. (2008). Napping by modality: A happy medium between analytic and holistic approaches. Proceedings of the 9th Sensometrics Meeting, July 20–23, 2008, St. Catharines, Ontario, Canada.
Risvik, E., McEwan, J. A., Colwill, J. S., Rogers, R., & Lyon, D. H. (1994). Projective mapping: A tool for sensory analysis and consumer research. Food Quality and Preference, 5(4), 263–269.
Robert, P., & Escoufier, Y. (1976). A unifying tool for linear multivariate statistical methods: The RV-coefficient. Journal of the Royal Statistical Society, Series C (Applied Statistics), 25(3), 257–265.
Schutz, H. G. (1999). Consumer data-sense and nonsense. Food Quality and Preference, 10, 245–251.
Williams, A. A., & Langron, S. P. (1984). The use of free-choice profiling for the evaluation of commercial ports. Journal of the Science of Food and Agriculture, 35(5), 558–568.
Paper III
Reinbach, H. C., Giacalone, D., Machado Ribeiro, L., Bredie, W. L. P., Frøst, M. B. (2013). Comparison of three sensory profiling methods based on consumer perception: CATA, CATA with intensity and Napping®. Food Quality and Preference, In press at doi:10.1016/j.foodqual.2013.02.004.
Food Quality and Preference xxx (2013) xxx–xxx
Food Quality and Preference journal homepage: www.elsevier.com/locate/foodqual
Comparison of three sensory profiling methods based on consumer perception: CATA, CATA with intensity and Napping
Helene C. Reinbach 1, Davide Giacalone 1, Leticia Machado Ribeiro, Wender L.P. Bredie, Michael Bom Frøst ⇑
Department of Food Science, Faculty of Science, University of Copenhagen, Rolighedsvej 30, 1958 Frederiksberg C, Denmark
Article info
Article history: Received 26 October 2012; Received in revised form 22 February 2013; Accepted 25 February 2013; Available online xxxx
Keywords: Fast sensory methods; Napping; Check-all-that-apply; Descriptive profiling; Multivariate statistics; Consumer perception
Abstract
The present study compares three profiling methods based on consumer perceptions in their ability to discriminate and describe eight beers. Consumers (n = 135) evaluated eight different beers using Check-All-That-Apply (CATA) methodology in two variations, with (n = 62) and without (n = 73) rating the intensity of the checked descriptors. With CATA, consumers rated 38 descriptors grouped in seven overall categories (berries, floral, hoppy, nutty, roasted, spicy/herbal and woody). Additionally, 40 of the consumers evaluated the same samples by partial Napping followed by Ultra Flash Profiling (UFP). ANOVA- and Discriminant-Partial Least Squares Regression (A-PLSR, D-PLSR) were used to evaluate the discriminative ability of the methods and descriptors. A-PLSR results showed that all samples were perceived as different in all three methods, whereas D-PLSR showed that all three methods had similar numbers of discriminating descriptors: 29 and 24 descriptors were significant for CATA without and with intensity rating, respectively, and 26 for Napping/UFP. Multiple Factor Analysis (MFA) was used to derive an overall product map and to compare it to product configurations from individual methods. Both qualitative and quantitative analysis (comparison of RV coefficients of the MFA configurations) revealed a very high agreement of the three methods in terms of perceived product differences. For all comparisons the RV coefficients varied between 0.90 and 0.97, indicating a very high similarity between all three methods. These results show that the precision and reproducibility of sensory information obtained from consumers by CATA is comparable to that of Napping. The choice of methodology for consumer descriptive methods should then be based on whether it is desired to have consumers articulate their own perception of descriptors, or whether it is sufficient to present them with an existing vocabulary. Napping is slower and more laborious, and better suited to explorative studies with a smaller number of consumers, whereas CATA is faster, less labor-intensive and thus more suitable for larger groups of consumers.
© 2013 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. Tel.: +45 35 33 32 07; fax: +45 35 33 3509. E-mail address: [email protected] (M.B. Frøst).
1 These authors contributed equally to this work.
0950-3293/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.foodqual.2013.02.004
Please cite this article in press as: Reinbach, H. C., et al. Comparison of three sensory profiling methods based on consumer perception: CATA, CATA with intensity and Napping. Food Quality and Preference (2013), http://dx.doi.org/10.1016/j.foodqual.2013.02.004

1. Introduction
Descriptive sensory profiling is important for the food industry as it can guide product development and reformulation of products as well as identify key sensory drivers essential for consumer acceptance and marketing of products. Conventional descriptive profiling is performed with a trained panel to obtain an objective description of the food products investigated (Lawless & Heymann, 2010). The need for less time-consuming and more economical descriptive methods in the food industry has supported the development and use of more dynamic and fast descriptive sensory profiling methods assessed by panelists, food experts and consumers (Ares, Deliza, Barreiro, Gimenez, & Gambaro, 2010; Dehlholm, Brockhoff, Meinert, Aaslyng, & Bredie, 2012; Giacalone, Machado Ribeiro, & Frøst, forthcoming; Nestrud & Lawless, 2010). The fast methods include projective mapping (Risvik, McEwan, & Rødbotten, 1997) and Napping (Pagès, 2003, 2005), Flash Profiling (Dairou & Sieffermann, 2002) based on Free-Choice Profiling (Williams & Langron, 1984), and different sorting techniques such as free (Lawless, Sheng, & Knoops, 1995), single (Rosenberg & Kim, 1975) and multiple sorting (Dehlholm et al., 2012). Napping is a method in which food samples are projected on a two-dimensional space based on similarities, and is often combined with Ultra Flash Profiling (Perrin & Pagès, 2009) to add a semantic description to the product differences. Napping can be performed as a ‘‘global’’ Napping, including all sensory aspects, or as a ‘‘partial’’ Napping focusing on specific sensory modalities (e.g. appearance, taste or mouthfeel) (Dehlholm et al., 2012; Pagès, 2005). Other consumer-friendly methods, such as just-about-right (JAR) scales, attribute liking, emotional questionnaires and check-all-that-apply (CATA), are increasingly used to capture consumer perception of food products. In particular the CATA method, in
which a product is described by selecting appropriate words from a given list, is a simple and valid approach to gather information about sensory and non-sensory perception, and is believed to have a smaller effect on liking and consumer perception of the product than similar methods (e.g. JAR) (Adams, Williams, Lancaster, & Foley, 2007; Ares et al., 2010; Giacalone, Bredie, & Frøst, 2013; Lado, Vicente, Manzoni, & Ares, 2010). Consumer-elicited CATA profiles have shown good agreement with traditional panel-developed sensory profiles (Dooley, Lee, & Meullenet, 2010; Ares et al., 2010), suggesting that CATA could be a valuable alternative for understanding the perception of product sensory attributes. The various methodologies to capture consumer perceptions are generally easier to perform and less time-consuming than traditional descriptive analysis with a trained sensory panel. Some methods are reductionist and based on a predefined list of descriptors (e.g. CATA), while other methods are more holistic and explorative (e.g. Napping). One of the suggested drawbacks of CATA is that it produces relatively impoverished dichotomized data (1/0), which allegedly would mask relative differences between specific attributes. Including intensity scaling of attributes in the CATA method may therefore improve the accuracy of descriptive profiling and lead to better product differentiation. This hypothesis could be tested by comparing CATA with CATA combined with intensity scaling. However, consumer intensity ratings generally show large variability, so it is not clear whether the scaling element would actually improve the CATA descriptions made by consumers. Additionally, it would be of interest to compare how reductionist methods, with and without scale elements (CATA and CATA with intensity ratings), would fare compared to a more holistic and explorative one, such as Napping.
The aim of the present study was to compare the effectiveness of three profiling methods (CATA, Napping, and a novel method combining CATA with intensity scaling) in studying consumer perception of a sample of eight beers. Three comparative criteria were considered in this study: (1) Discriminative ability: the method’s ability to successfully discriminate between the samples; (2) Descriptive ability: the degree to which the three profiling methods would agree on the sensory characterization; (3) Configurational congruence: the degree to which the sample spaces obtained by the different methods would be closely related to one another.

2. Materials and methods
2.1. Consumers
One hundred and thirty-five consumers between 18 and 65 years were recruited in and around the University of Copenhagen (UCPH) through advertisements on websites, social networks, beer magazines and flyers. Approximately half of the consumers (n = 73, 46 males and 27 females) described the flavor of the beers using a CATA questionnaire. The other half (n = 62, 46 males, 16 females) completed a modified version of the CATA questionnaire in which we introduced the possibility of scaling the intensity of checked attributes. Additionally, some of the consumers (n = 40, 23 males, 17 females) returned after approximately 10 days for a second session to perform a partial Napping focusing on the smell and taste attributes of the eight beers. After the testing, consumers received a token incentive for their participation (a bottle of craft beer, value 6 €).

2.2. Samples
Eight beers were chosen for the study (Table 1): five represented the flavor diversity of the Danish beer market (e.g. fruity, floral, woody, nutty or spicy), two were developed for the study to represent novel ingredients (sea buckthorn and pine), and finally a standard pilsner was included to represent the most consumed beer type in Denmark. 40 ml of beer was served at approximately 10 °C in 24 cl beer glasses covered with watch glasses and coded with three-digit random numbers. Serving orders were randomized to balance first-order and carry-over effects (MacFie, Bratchell, Greenhoff, & Vallis, 1989).

2.3. CATA variants
Sensory perception of the eight beers was evaluated by CATA and by CATA combined with a 15-point intensity scale. On the CATA ballot seven overall flavor categories were presented (Table 2). For each flavor category consumers were asked to check yes if the flavor was present, and no if the flavor was not present. This formulation (see footnote 2) differs from the classical ‘‘check-all-that-apply’’, and was adopted in order to enhance the likelihood that consumers actually read through the whole list, reducing the behavior known as satisficing (Krosnick, 1991; Rasinski, Mingay, & Bradburn, 1994). Briefly, satisficing is a theory in behavioral decision making maintaining that when most people examine alternatives sequentially, they tend to choose the first alternative that seems reasonable, as opposed to the optimal situation in which they would evaluate all alternatives comprehensively before taking a decision (Simon, 1955). Further, some overall flavor descriptors were supplemented with sub-descriptors to enable consumers to specify the exact flavor they perceived (Table 2). The list of flavor attributes was developed with inspiration from the ‘‘Danish beer language’’ (Det Danske Ølakademi [Eng. The Danish Beer Academy], 2006), and the ballot was pre-tested informally to assess the appropriateness of the attribute list.
On the CATA ballot with intensity scaling, the seven overall flavor attributes were presented with the yes/no checkboxes, the flavor sub-descriptors and one horizontally oriented 15-point intensity scale per flavor category, anchored with ‘very weak’ and ‘very strong’ at the ends, to enable consumers to rate the intensity of the appropriate beer flavors. The choice of including only flavor terms, which differs from earlier CATA applications where more holistic terms (e.g. emotions, usage attributes, conceptual attributes, etc.) are often included, was motivated by our aim to restrict the focus to the applicability of CATA for descriptive profiling.
2 A very similar formulation has recently been tested by Ennis and Ennis (2011), who termed their approach ‘‘applicability scores’’. Although we were unaware of this contribution at the time of designing this experiment, it is interesting to notice that we came to very similar conclusions regarding the need to account for unchecked items in CATA questionnaires.

2.4. Partial Napping
Napping was performed as a partial Napping focusing on the smell and taste of the eight beers. Each consumer was provided with a 60 × 40 cm blank paper (the Napping sheet), a pen, post-its, a tray with eight beer samples and a spittoon. The sample order on the individual trays was randomized to counteract first-order and carry-over effects, even though the Napping methodology allows and requires subjects to go back and forth between samples. Consumers were instructed to evaluate the beer samples according to similarities or dissimilarities in smell and taste attributes by placing similar samples close to each other and more dissimilar samples further apart on the Napping sheet. After they had reached a final configuration, consumers noted down appropriate descriptors for the smells and tastes of the beers on the post-its, which were moved around the Napping sheet when needed. This procedure is known as Ultra-Flash profiling and is commonly used to add a descriptive dimension to a Napping task (Perrin et al., 2008). When
Table 1
Description of the eight beers.

Beer name            Main sensory characteristics   Main flavor ingredient   Beer style         Brewery
Sea buckthorn beer   Berries                        Sea buckthorn            Flavored pilsner   Indslev/Univ. of Copenhagen (UCPH)
Fynsk forår          Floral                         Elderflower              Pale Ale           Ørbæk
Bøgebryg             Woody                          Beech tree extract       Amber Ale          Skovlyst
Pine beer            Woody                          Pine (Pin-Thyrol)        Flavored pilsner   Indslev/UCPH
Valnød Hertug        Nutty                          Walnut                   Dark Ale           Rise
Stjernebryg          Spicy                          Anise                    Strong Ale         Herslev
Enebær Stout         Spicy                          Juniper berries          Stout              Grauballe
Thy Pilsner          Neutral                        Hops                     Pilsner            Thy
Table 2
Overview of descriptors for the three methods and estimated jack-knife significance from D-PLSR.

Common descriptors        CATA (n = 73)   CATA Int. (n = 62)
BERRIES                   166**           140**
  Blueberry               16*             9*
  Cranberry               n.s.            31*
  Sea buckthorn           n.s.            n.s.
  Rose hip                58**            n.s.
  Other berries           n.s.            21**
FLORAL                    242***          215***
  Elderflower             93***           77***
  Chamomile               50**            38**
  Lavender                61**            n.s.
  Rose                    36**            n.s.
  Other floral            27**            n.s.
HOPPY                     345***          389***
NUTTY                     200***          149***
  Hazelnut                72***           47*
  Almond                  47***           33*
  Walnut                  71***           48***
  Other nutty             n.s.            7 n.s.
ROASTED                   271***          243
  Roasted bread           60*             n.s.
  Caramel                 133***          111***
  Coffee                  64***           64***
  Chocolate               36***           47***
  Other roasted           n.s.            17*
SPICY/HERBAL              295***          274***
  Juniper berries         52*             50*
  Bog myrtle              n.s.            70*
  Anise                   56**            64***
  Rosemary                47***           35***
  Cloves                  44*             n.s.
  Laurel                  n.s.            n.s.
  Other spicy/herbal      n.s.            n.s.
WOODY                     232**           205***
  Piney                   70**            71***
  Birch                   60*             n.s.
  Beech                   33*             n.s.
  Maple                   36*             n.s.
  Other woody             n.s.            n.s.

                          CATA (n = 73)   CATA Int. (n = 62)   Napping (n = 40)
Significant descriptors   29              24                   26
Optimal PC                4               4                    5
Validated Y variance      16%             11%                  22%

Napping (n = 40), values for the common descriptors in the order printed (row assignment not recoverable): 10*, 22***, 45***, 18*, 9 n.s., *** (count missing), 33***, 8*, 8*, 19*, 8, 12 n.s., 19**.

Other Napping descriptors (by # of occurrences): Sweet*** (96), Bitter** (74), Sour*** (48), Citrus*** (24), Fresh** (22), Fruity n.s. (20), Strong*** (20), Light* (19), Liquorice** (19), Yeasty** (20), Full-bodied* (18), Alcoholic** (15), Pilsner** (15), Summer* (14), Thin* (13), Burnt* (12), Grainy n.s. (12), Watery n.s. (11), Heavy n.s. (9), Soapy* (8), Neutral n.s. (8), Spring n.s. (7), Regular* (6), Autumn n.s. (6), Apple n.s. (5), Low bitterness n.s. (5).

p values for b coefficients. Significance levels: n.s., non-significant; * p < 0.05; ** p < 0.01; *** p < 0.001.
all the samples had been placed on the paper, the assessors substituted the post-its with an X and noted the sample codes and the beer characteristics next to the X. Each beer sample was tasted in the given order and swallowed at least once to get the full perception of the beers. Re-tastings and spitting out the beer were allowed. Water and crackers were used as palate cleansers.

2.5. Data analysis
The analysis conducted was divided into two parts: an analysis of the descriptive outputs of the three methods (using unfolded data matrices and Partial Least Squares Regression), and an analysis of the sample spaces (using crosstab matrices and Multiple Factor Analysis). The descriptive profiles of the beers obtained from CATA, CATA with intensity and Napping were analyzed for product differences by ANOVA-Partial Least Squares Regression (A-PLSR; Martens & Martens, 2001) using the Unscrambler software (version 10.1, CAMO, Oslo, Norway). The unfolded matrices thus had one row per evaluation, i.e. the number of beers (8) times the number of consumers in each group. For the A-PLSR analysis, the X-matrix consisted of the eight experimental beers (X = 1/0 design variables) while the Y-matrix consisted of the beer flavor descriptors for CATA (Y = 1/0) and CATA with intensity (Y = intensity/0). The Y-matrix for the Napping
included the taste and flavor descriptors of the beers (Y = 1/0) elicited during the Ultra-Flash profiling task. Relations between the flavor/taste descriptors (X = 1/0 design variables for CATA and Napping, X = intensity/0 for CATA with intensity) and the experimental beers (Y = 1/0) were studied by Discriminant-PLSR (D-PLSR; Martens & Martens, 2001) with all variables standardized. Cross-validation was performed by splitting the datasets into consecutive segments of eight rows (corresponding to one consumer), thus leaving out one consumer at a time. Martens’ uncertainty test, a jack-knife based extension of cross-validation, was performed to estimate the significance of the model parameters at the optimal number of components, taken at the minimum predicted root mean square error (Martens & Martens, 2000).
Finally, data were organized into three separate matrices to allow comparison of the sample configurations obtained by the individual methods. First, the frequencies of the CATA descriptors were calculated for each of the eight beers and the counts organized in a table crossing beers and descriptors. For CATA with intensity ratings, a similar table was constructed using averaged attribute ratings. For each Napping sheet, the X and Y coordinates of each sample were determined, using the bottom left corner of the sheet as the origin of the coordinate system. Multiple Factor Analysis (MFA) was used to analyze the data, using the three initial data matrices as individual MFA groups. MFA is a multivariate statistical technique which aims at integrating different groups of variables describing the same observations. MFA can be regarded as an enriched PCA and is performed in three steps. First, an initial PCA is computed on each separate group (in our case, CATA, CATA with intensity, and Napping). Second, the square root of the first eigenvalue of each individual PCA is extracted and used as scaling factor for the respective data matrix. Finally, the data are re-merged into a global matrix, and a new PCA is performed on these ‘‘scaled’’ data. The scores plot represents a consensus map of how the samples were perceived on a global level. Additionally, MFA has the important characteristic that it allows a rapid comparison of the overall configuration with the individual group configurations (i.e. the initial PCAs), both qualitatively, by visual inspection of the partial points via the projection matrix, and quantitatively, by computing regressor vector (RV) coefficients (Robert & Escoufier, 1976) to measure configurational congruence. In this study, RV coefficients were calculated for all possible combinations of methodologies for two dimensions and were used to compare sample configurations for each of the three descriptive methods. Mathematically, it can be shown that the RV coefficient corresponds to Pearson’s correlation coefficient after rearranging the original matrices into vectors (Meyners, 2001). Thus, in the context of the present research, high RV coefficient values indicate that the methods yield similar information. The MFA was carried out using the FactoMineR package (Lê, Josse, & Husson, 2008) in the statistical software R 2.14.1.
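As a rough illustration of the table-building step described in Section 2.5 (this is not the authors' code; the study itself used the Unscrambler and R), the following Python sketch counts CATA checks per beer and descriptor and averages intensity ratings into a beers-by-descriptors table. The record layout, the function names and the toy responses are assumptions made for the example. In particular, treating unchecked descriptors as intensity 0 when averaging follows the "intensity/0" coding mentioned for the PLSR matrices and may differ from the averaging actually used.

```python
def cata_frequency_table(records, beers, descriptors):
    """Count, per beer and descriptor, how many consumers checked it."""
    table = {beer: {d: 0 for d in descriptors} for beer in beers}
    for consumer, beer, descriptor, value in records:
        if value > 0:  # checked (1) or rated with a positive intensity
            table[beer][descriptor] += 1
    return table


def mean_intensity_table(records, beers, descriptors, n_consumers):
    """Average intensity per beer and descriptor (assumption: unchecked = 0)."""
    table = {beer: {d: 0.0 for d in descriptors} for beer in beers}
    for consumer, beer, descriptor, value in records:
        table[beer][descriptor] += value / n_consumers
    return table


# Hypothetical toy data: (consumer, beer, descriptor, intensity on a 1-15 scale).
# Only checked descriptors are listed; everything else is implicitly 0.
beers = ["ThyPilsner", "EnebaerStout"]
descriptors = ["hoppy", "roasted"]
records = [
    ("c1", "ThyPilsner", "hoppy", 9),
    ("c2", "ThyPilsner", "hoppy", 12),
    ("c1", "EnebaerStout", "roasted", 14),
    ("c2", "EnebaerStout", "roasted", 10),
    ("c3", "EnebaerStout", "hoppy", 3),
]

counts = cata_frequency_table(records, beers, descriptors)
means = mean_intensity_table(records, beers, descriptors, n_consumers=3)
print(counts["ThyPilsner"])
print(means["EnebaerStout"])
```

Each resulting table has one row per beer, so the tables from the different methods describe the same eight observations and can be passed to an MFA as separate groups of variables.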
3. Results
3.1. Discriminative ability (A-PLSR)
Data from CATA (n = 73), CATA with intensity (n = 62) and Napping (n = 40) were analyzed by A-PLSR to compare the discriminative ability of the three models. The optimal number of components derived from the cross-validation was two for CATA, three for CATA with intensity, and two for the partial Napping. These components accounted respectively for 8%, 11%, and 4% of the total variance in flavors/tastes of the eight beers (Fig. 1). Jack-knife based estimation of the regression coefficients (of the predictor variables, i.e. the samples, since they correspond to columns of a design matrix) and visual inspection of the perturbed sub-models revealed that in all three methods all beers were perceived and described differently (p < 0.05). In agreement, all three methods tended to group the beers in two clusters (Fig. 1a–c): beers with floral notes (Fynsk Forår, Sea-buckthorn, Thy Pilsner, Pine beer) versus beers with roasted and caramel notes (Stjernebryg, Enebær Stout, Bøgebryg, Valnød Hertug). The configurations differed somewhat between the three methods, particularly with regard to the samples correlated with the second PLSR component.

3.2. Sensory characterization (D-PLSR)
Data from CATA (n = 73), CATA with intensity (n = 62) and Napping (n = 40) were analyzed by D-PLSR to compare the descriptive ability of the three models in providing a sensory characterization of the samples. The D-PLSR conducted with the two CATA datasets included 38 flavor descriptors, of which 29 (CATA) and 24 (CATA with intensity) were found to have a significant effect on the descriptive profile of the eight beers (Table 2). The most frequently used flavor descriptors for CATA and CATA with intensity were hoppy, spicy/herbal, roasted, floral, woody, nutty, berries, caramel and elderflower, words which covered all seven flavor categories. For Napping, basic taste descriptors were the most frequently used (e.g. sweet, bitter, sour), followed by the flavor descriptors hoppy, caramel, fruity, citrus and spicy. The optimal number of components was four for both CATA and CATA with intensity, accounting for 11% and 16% of the validated Y variance respectively. The first PLSR component described the difference in roasted and nutty flavors versus the floral and berry notes, whereas the second component explained differences in hoppy versus spicy/herbal and woody flavors as well as floral/berries (only for CATA). The third component explained variation due to woody and spicy versus other characteristics (e.g. hoppy, berries, floral, roasted) for CATA and hoppy versus floral for CATA with intensity (not shown in figure). For the Napping/UFP method, consumers used a total of approximately 250 words to describe the flavors and tastes of the eight beers. Words mentioned five times or fewer were kept out of the analysis and synonyms were semantically grouped, following the guidelines given by Perrin et al. (2009). A total of 37 words were retained, of which 26 were found to have a significant effect on the description of the eight beers (Table 2). Thirteen Napping descriptors were in common with CATA/CATA with intensity. Ten of these descriptors were significant for describing the beers and covered all seven flavor categories, whereas the remaining 24 descriptors were unique to the Napping method. The Napping data were optimally described by five PLSR components accounting for 22% of the validated variance (Table 2). The first PLSR component accounted for variation in sweet, liquorice, alcoholic, caramel, full-bodied and strong flavors/tastes versus more sour, fresh, floral, fruity and light notes. In accordance with CATA and CATA with intensity, roasted and nutty notes tended to be opposed to floral along the first component of the D-PLSR for the Napping data. The second component described variation due to bitterness versus low bitterness, fruity, floral and sweet characteristics. The third component described variation due to hoppy and malty versus piney flavor notes (figure not shown). Clear common trends in consumer perception of the beers were observed across all three methods (Fig. 2). For all three descriptive methods the major variance in the beers was caused by differences in roasted, nutty and caramel notes as opposed to floral flavors along the first component. The second component described variation due to hoppy versus spicy/herbal (CATA and CATA with
intensity) or floral and fruity flavors for Napping. Interestingly, the D-PLSR analysis on the Napping dataset showed that the basic tastes sweet and sour accounted for much of the variation in the first component, whereas bitter versus fruity flavors explained variation in the second component. These findings could indicate that additional information could have been gained with CATA/CATA with intensity if basic tastes and the fruity descriptor had been included in the listed descriptors. This also highlights the specific advantage of free descriptor elicitation (such as in Napping/UFP) for uncovering main dimensions relevant for consumers which may be overlooked in methods where subjects are presented with a predefined list of descriptors. The ability to identify the main sensory characteristics of each of the eight beers (given in Table 1) varied between the three sensory profiling methods. The sea-buckthorn flavored pilsner was described as having sea buckthorn, floral (especially chamomile) and berry flavors using CATA and CATA with intensity, whereas Napping revealed that sour was the most descriptive word for the beer (Fig. 2). The pale ale with elderflower (Fynsk Forår) was correctly identified as having distinct floral and elderflower flavors by all descriptive methods. In contrast, the prominent flavors of the ale with beech twigs (Bøgebryg) and the dark ale with walnut (Valnød Hertug) were less clearly identified in their descriptive profiles. Bøgebryg and Valnød Hertug tended to be described as having caramel, chocolate, walnut and roasted notes by all methods, together with nutty and woody notes (Napping, CATA), almond (CATA, CATA with intensity), hoppy (CATA with intensity), piney (CATA), juniper berry (CATA) and other berry flavors (CATA with intensity). The pine flavored pilsner was perceived to have piney or piney-like flavors (e.g. woody, birch, rosemary) with all three descriptive methods. However, this beer was best differentiated from the other beers with the CATA with intensity method. The anise and liquorice flavors were correctly identified in the strong ale with anise (Stjernebryg) by all descriptive methods, and the beer was further characterized as having nutty flavors (all methods), notes of caramel, cloves, walnut and hazelnut (CATA and CATA with intensity), spicy/herbal flavors (CATA) and a sweet taste (Napping). None of the descriptive methods identified the juniper berries in the stout with juniper berries (Enebær Stout); however, this beer was perceived as having distinct coffee and chocolate flavors by all methods. Finally, the Thy pilsner was perceived to be hoppy with a hint of floral when using the CATA and CATA with intensity methods, whereas it was described as a pilsner-type beer with bitter and sour tastes when assessed by Napping.

Fig. 1. Scores plots from A-PLSR models for CATA (a), CATA with intensity (b), and Napping (c).
Fig. 2. Correlation loadings plot of the D-PLSR models including descriptors for CATA (A), CATA with intensity (B), and Napping (C). Circled descriptors have a significant effect on the sample variation (p < 0.05). The inner and outer ellipses represent R2 = 50% and R2 = 100% respectively.

3.3. Configurational congruence (MFA)
The last step of our analysis aimed at evaluating the configurational similarity of the sample spaces obtained by the three methods. This was assessed by MFA performed on three cross-tabulation matrices containing the frequency of occurrence of each descriptor (CATA), average ratings over samples (CATA with intensity), and the assessors’ coordinates (Napping). Fig. 3 shows the first two dimensions of the consensus MFA sample map (70.5% of the explained variance). The partial configurations obtained by the individual methods are superimposed on the consensus points. Visual inspection of Fig. 3 shows that all methods provided very similar sample maps. The only partial exception was the Pine beer, where CATA and CATA with intensity seem to disagree, but only with regard to the variation described by the second MFA component. Accordingly, the three methods contributed similarly to the first MFA component (CATA = 33.7%, CATA with intensity = 33.4%, Napping = 32.9%; percentages refer to the contribution of the individual groups of variables to the MFA component), and differed slightly with regard to the second component (17.5%, 48.9%, and 33.6% respectively). RV coefficients were used to compare sample configurations obtained in the three descriptive methods. For all method comparisons the RV coefficients varied between 0.90 and 0.97, indicating a very high similarity between all three methods. The highest similarity was found between the two CATA methods (RV = 0.97)
followed by Napping and CATA with intensity (RV = 0.93), whereas the similarity between Napping and the CATA method was slightly lower (RV = 0.90).

Fig. 3. Consensus MFA sample space (first two components; Dim 1: 54.79%, Dim 2: 15.69%) with superimposed partial points from the individual methods (CATA, CATA_INT, Napping).
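For readers who want to see how an RV coefficient between two sample configurations can be obtained, the following is a minimal Python sketch of the standard definition by Robert and Escoufier (1976): RV(X, Y) = tr(XX'YY') / sqrt(tr((XX')^2) tr((YY')^2)), computed on column-centered matrices whose rows are the same samples. It is an illustration only; the study computed its coefficients with FactoMineR in R. The function names and the toy coordinates are hypothetical and are not data from the study.

```python
import math


def center_columns(X):
    """Subtract the column means so each variable has mean zero."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def cross_product(X):
    """Return the n x n matrix X X' (inner products between samples)."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in X] for ri in X]


def trace_of_product(A, B):
    """tr(A B) for symmetric A and B, i.e. the sum of elementwise products."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def rv_coefficient(X, Y):
    """RV coefficient between two configurations of the same samples."""
    Sx = cross_product(center_columns(X))
    Sy = cross_product(center_columns(Y))
    denom = math.sqrt(trace_of_product(Sx, Sx) * trace_of_product(Sy, Sy))
    if denom == 0:
        raise ValueError("a configuration has no variance")
    return trace_of_product(Sx, Sy) / denom


# Hypothetical coordinates for four samples (not data from the study).
X = [[0, 0], [1, 0], [0, 1], [1, 1]]
Y = [[0, 0], [0, 2], [-2, 0], [-2, 2]]  # X rotated by 90 degrees and doubled
Z = [[0], [1], [2], [3]]                # a one-dimensional ordering

rv_xy = rv_coefficient(X, Y)
rv_xz = rv_coefficient(X, Z)
print(rv_xy, rv_xz)
```

Because the coefficient depends only on the inner products between samples, it is unchanged by rotation and uniform rescaling of a configuration, which is why it suits comparisons between maps from different methods whose axes and units are not directly comparable.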
4. Discussion
In the present study, three consumer-based descriptive methods were compared. Two of the methods were verbal-based (CATA, and a CATA variant with the possibility to rate the intensity of checked attributes), while one was based on direct sample comparisons (Napping). Despite their different nature, comparative MFA revealed that the overall product map based on data from the three descriptive methods was very similar to the individual beer sample configurations for each method. The high similarity between methods was confirmed by the RV coefficients, ranging from 0.90 to 0.97. These results suggest that the precision and reproducibility of sensory information obtained from consumers with CATA is comparable to that of Napping. This is in concordance with findings by Ares et al. (2010) that projective mapping and CATA questions provide very similar sensory profiles of eight milk desserts, supporting high validity of both sensory methods. The high similarity found between the two CATA methods in the present study indicates that quantitative scaling might not improve the accuracy of the CATA profile, but also that a quantitative dimension can be added to CATA without complicating the task for the consumer. The latter point is suggested by the equal time to completion (average of 45 min for both CATA versions), the similar number of checked terms (5.5 on average for the straight CATA and 5.7 for the scaled version), and, indirectly, by the fact that in a separate analysis no significant differences were found in liking ratings between the two CATA groups. A previous comparison between CATA and intensity scaling confirmed a high agreement between the two methods (Parente, Ares, & Manzoni, 2010). Recently Ares, Varela, Rado, and Gimenez (2011a) and Ares, Varela, and Rado (2011b) compared intensity scaling with CATA and projective mapping of orange flavored powdered drinks in one study, and added sorting as a fourth method in a second study.
Comparing the data by MFA, the authors also found the methods to provide similar information regarding the sensory profiles, suggesting that stable product configurations can be obtained across different methodologies and different groups of consumers.
The presented work expanded this approach by including A-PLSR and D-PLSR analyses to obtain additional information. This approach turned out to be very useful in comparing the three descriptive methods. First, the discriminative ability of the three methods was analyzed by A-PLSR, and comparable overall trends were obtained for the three methods, revealing that the eight beers tended to group into clusters of floral beers versus beers with roasted and caramel flavors. Importantly, all three methods were able to discriminate between the beers. CATA, with its clearly defined word descriptors, might be faster and make it easier for the consumer to discriminate between and describe the beers, and might therefore improve discriminative ability compared to Napping. Conversely, CATA has previously been criticized for using relatively impoverished dichotomous data, which would supposedly yield lower discriminative power. These claims were, however, not confirmed in the present study. Next, the descriptive ability of the three methods was compared using D-PLSR, which revealed that the methods tended to vary in both the number and the type of words used. Comparable overall trends for the three methods were confirmed with D-PLSR, revealing that the eight beers spanned a product space that ranged from beers with roasted, nutty and caramel notes (Stjernebryg, Enebær Stout, Valnød Hertug, Bøgebryg) to beers with floral, berry and piney flavors (Sea buckthorn beer, Fynsk Forår, Thy Pilsner, Pine beer) in one dimension, and from hoppy (Thy Pilsner) to spicy/herbal, woody (Pine beer), fruity or floral in the second dimension. The fact that sweet, sour and bitter tastes explained much of the variation in the Napping data highlights the importance of the basic tastes in consumer perception of the eight beers and provides additional information that was not captured by the predefined CATA words. 
The ability to identify the main sensory characteristics for each of the eight beers was also similar for the three sensory profiling methods. Three ingredients, elderflower in the pale ale (Fynsk Forår), pine in one of the flavoured pilsners, and anise/liquorice in the strong ale (Stjernebryg) were correctly identified by all descriptive methods, as was the Thy pilsner, which was hoppy and described as a pilsner type beer with bitter taste. The dark ale with walnut (Valnød Hertug) tended to be described as having walnut flavors by all methods and was perceived similar to the amber ale with beech twigs (Bøgebryg). The beech flavor in the amber ale was identified with the CATA method whereas the ale only tended to be woody with the Napping method. The sea-buckthorn in the flavored sea buckthorn pilsner was only identified with CATA and CATA with intensity suggesting that novel ingredients might be easier to capture if consumers are influenced by predefined CATA words raising awareness of the novel ingredients. The juniper berries in the Stout (Enebær Stout) were however not identified by any of the descriptive methods despite the help from the CATA words suggesting that some flavors (e.g. novel, unfamiliar, unexpected) might be difficult to capture, and might not be identified unless proper training is conducted. Even though the ability to identify and describe key sensory attributes in the beers did not vary much between the three methods, small differences were observed. Napping/UFP naturally tended to facilitate the development of a larger and more diverse vocabulary comprised of both sensory attributes and holistic terms whereas CATA is limited to the listed sensory attributes. Conversely, the predefined CATA words aided the description and identification of certain attributes (e.g. woody and sea buckthorn). These findings suggest that the choice of method should be based on the type of descriptive profile required and the resources available. 
Napping might be beneficial when unique, intuitive or creative descriptors are needed (e.g. explorative studies) with a smaller number of consumers, as Napping is slower and more laborious than CATA. CATA might be useful in raising awareness of certain
Please cite this article in press as: Reinbach, H. C., et al. Comparison of three sensory profiling methods based on consumer perception: CATA, CATA with intensity and Napping. Food Quality and Preference (2013), http://dx.doi.org/10.1016/j.foodqual.2013.02.004
attributes and thereby aid the consumer in performing the descriptive profile. CATA is fast and easy and thus suitable for larger groups of consumers. The CATA with intensity variation investigated here did not show significant advantages compared to the “unscaled” version, providing very similar information. However, it should be acknowledged that the samples used in this work were characterized by rather large sensory differences. Scope exists for future research to assess the discriminative power of CATA with less heterogeneous sample sets. For the present sample set we conclude, in agreement with other authors (Bruzzone, Ares, & Gimenez, 2012), that the number of times consumers checked the presence of an attribute already provides a good estimate of its intensity. Importantly, the CATA with intensity method yielded fewer significant descriptors than the CATA version, possibly due to consumers’ inconsistency in the use of scales. This latter conclusion should be considered tentative because the sample size in the CATA with intensity condition was slightly lower than in the CATA condition. However, a separate analysis treating the ratings in the CATA with intensity data as normal 1/0 CATA responses revealed results extremely similar to the scaled version, and returned a significant effect of two variables (lavender and rose) which were found to be non-significant when intensity ratings were used in the analysis (cf. Table 2). This corroborates the idea that potential advantages of scaling might be outweighed by the higher probability of obtaining a noisier dataset.
5. Conclusion
In summary, the combination of MFA and PLSR for the analysis of CATA, CATA with intensity and Napping data revealed very good agreement in terms of perceived inter-product differences. The high inter-method reliability shows that consumer data are highly repeatable, and supports the validity of the three methods employed in the study. 
Additionally, these findings confirm that rapid descriptive methods are suitable for capturing differences among products and that they can be a useful tool for capturing and understanding consumer perceptions. The choice of methodology should be based on practical considerations, such as ease of use, and on whether consumers should articulate their own descriptors or whether it is sufficient to present them with an existing vocabulary.
Acknowledgments
Support for this work was provided by the Danish Agency for Science, Technology and Innovation and by the Faculty of Science, University of Copenhagen. The organizing committee of the 5th Eurosense Conference is thanked for the bursary award granted to Davide Giacalone.
References
Adams, J., Williams, S., Lancaster, B., & Foley, M. (2007). Advantages and uses of check-all-that-apply response compared to traditional scaling of attributes for salty snacks. In 7th Pangborn sensory science symposium. Minneapolis, MN, USA: Hyatt Regency.
Ares, G., Deliza, R., Barreiro, C., Gimenez, A., & Gambaro, A. (2010). Comparison of two sensory profiling techniques based on consumer perception. Food Quality and Preference, 21, 417–426.
Ares, G., Varela, P., & Rado, G. (2011b). Identifying ideal products using three different consumer profiling methodologies. Comparison with external preference mapping. Food Quality and Preference, 22, 581–591.
Ares, G., Varela, P., Rado, G., & Gimenez, A. (2011a). Are consumer profiling techniques equivalent for some product categories? The case of orange-flavored powdered drinks. International Journal of Food Science and Technology, 46, 1600–1608.
Bruzzone, F., Ares, G., & Gimenez, A. (2012). Consumers’ texture perception of milk desserts. II – Comparison with trained assessors’ data. Journal of Texture Studies, 43, 214–226.
Dairou, V., & Sieffermann, J. M. (2002). A comparison of 14 jams characterized by conventional profile and a quick original method, the flash profile. Journal of Food Science, 67, 826–834.
Dehlholm, C., Brockhoff, P. B., Meinert, L., Aaslyng, M. D., & Bredie, W. L. P. (2012). Rapid descriptive sensory methods – Comparison of Free Multiple Sorting, Partial Napping, Napping, Flash Profiling and conventional profiling. Food Quality and Preference, 26, 267–277.
Dooley, L., Lee, Y., & Meullenet, J. (2010). The application of check-all-that-apply (CATA) consumer profiling to preference mapping of vanilla ice cream and its comparison to classical external preference mapping. Food Quality and Preference, 21, 394–401.
Ennis, D. M., & Ennis, J. M. (2011). Interpreting applicability scores. IFP Press, 14(4), 3–4.
Giacalone, D., Bredie, W. L. P., & Frøst, M. B. (2013). “All-in-one test” (AI1): A rapid and easily applicable approach to consumer product testing. Food Quality and Preference, 27, 108–111.
Giacalone, D., Machado Ribeiro, L., & Frøst, M. B. (forthcoming). Consumer-based product profiling: Application of partial Napping for sensory characterization of specialty beers by novices and experts. Journal of International Food and Agribusiness Marketing (accepted manuscript).
Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213–236.
Lado, J., Vicente, E., Manzoni, A., & Ares, G. (2010). Application of a check-all-that-apply question for the evaluation of strawberry cultivars from a breeding program. Journal of the Science of Food and Agriculture, 90, 2268–2275.
Lawless, H. T., & Heymann, H. (2010). Sensory evaluation of food: Principles and practices (2nd ed.). New York, NY: Springer.
Lawless, H. T., Sheng, N., & Knoops, S. S. C. P. (1995). Multidimensional-scaling of sorting data applied to cheese perception. Food Quality and Preference, 6, 91–98.
Lê, S., Josse, J., & Husson, F. (2008). FactoMineR: an R package for multivariate analysis. Journal of Statistical Software, 25, 1–18.
MacFie, H. J., Bratchell, N., Greenhoff, N. K., & Vallis, I. V. (1989). Designs to balance the effect of order of presentation and first-order carry-over effects in hall tests. Journal of Sensory Studies, 4, 129–148.
Martens, H., & Martens, M. (2000). Modified jack-knife estimation of parameter uncertainty in bilinear modelling by partial least squares regression (PLSR). Food Quality and Preference, 11, 5–16.
Martens, H., & Martens, M. (2001). Multivariate analysis of quality: An introduction (1st ed.). Chichester, West Sussex, England: John Wiley & Sons Ltd.
Meyners, M. (2001). Permutation tests: Are there differences in product liking? Food Quality and Preference, 12, 345–351.
Nestrud, M. A., & Lawless, H. T. (2010). Perceptual mapping of apples and cheeses using projective mapping and sorting. Journal of Sensory Studies, 25, 390–405.
Pagès, J. (2003). Direct collection of sensory distances: Application to the evaluation of ten white wines of the Loire Valley. Sciences des Aliments, 23, 679–688.
Pagès, J. (2005). Collection and analysis of perceived product inter-distances using multiple factor analysis: Application to the study of 10 white wines from the Loire Valley. Food Quality and Preference, 16, 642–649.
Parente, M. E., Ares, G., & Manzoni, A. V. (2010). Application of two consumer profiling techniques to cosmetic emulsions. Journal of Sensory Studies, 25, 685–705.
Perrin, L., & Pages, J. (2009). Construction of a product space from the Ultra-Flash profiling method: Application to 10 red wines from the Loire Valley. Journal of Sensory Studies, 24, 372–395.
Rasinski, K. A., Mingay, D., & Bradburn, N. M. (1994). Do respondents really “mark all that apply” on self-administered questions? Public Opinion Quarterly, 58, 400–408.
Risvik, E., McEwan, J. A., & Rødbotten, M. (1997). Evaluation of sensory profiling and projective mapping data. Food Quality and Preference, 8, 63–71.
Robert, P., & Escofier, Y. (1976). Unifying tool for linear multivariate statistical methods – RV-coefficient. Journal of the Royal Statistical Society Series C – Applied Statistics, 25, 257–265.
Rosenberg, S., & Kim, M. P. (1975). Method of sorting as a data-gathering procedure in multivariate research. Multivariate Behavioral Research, 10, 489–502.
Simon, H. A. (1955). A behavioral model of rational choice. Quarterly Journal of Economics, 69, 99–118.
Williams, A. A., & Langron, S. P. (1984). The use of free-choice profiling for the evaluation of commercial ports. Journal of the Science of Food and Agriculture, 35, 558–568.
Paper IV
Jaeger, S. R., Giacalone, D., Roigard, C., Pineau, B., Vidal, L., Gimenez, A., Frøst, M. B., & Ares, G. (2013). Investigation of bias of hedonic scores when co-eliciting product attribute information using CATA questions. Food Quality and Preference, Vol. 30, Issue 2, pp. 242–249.
Food Quality and Preference 30 (2013) 242–249
Investigation of bias of hedonic scores when co-eliciting product attribute information using CATA questions
Sara R. Jaeger a,⇑, Davide Giacalone c, Christina M. Roigard a, Benedicte Pineau a, Leticia Vidal b, Ana Giménez b, Michael B. Frøst c, Gaston Ares b
a The New Zealand Institute for Plant and Food Research Ltd., Mt Albert Research Centre, Private Bag 92169, Victoria Street West, Auckland, New Zealand
b Departamento de Ciencia y Tecnología de Alimentos, Facultad de Química, Universidad de la República, Gral. Flores 2124, CP 11800 Montevideo, Uruguay
c Department of Food Science, Faculty of Science, University of Copenhagen, Rolighedsvej 30, DK-1858 Frederiksberg C, Denmark
Article info
Article history: Received 12 April 2013; Received in revised form 7 June 2013; Accepted 7 June 2013; Available online 14 June 2013
Keywords: Hedonic scaling; CATA; JAR; Attribute scaling; Research methodology; Consumer research
Abstract
Sensory and consumer scientists disagree on the practice of concurrently obtaining sensory information in hedonic tests. This is in part due to different mindsets about what consumers are able to do, and in part due to evidence that such co-elicitation may bias hedonic scores. Check-all-that-apply (CATA) questions have been claimed to have a smaller effect on hedonic scores than other attribute-based question types such as just-about-right or intensity scales. In this research, nine studies using consumers as participants examined effects on hedonic product scores when sensory attribute information was co-elicited using CATA questions. The use of CATA concurrently with hedonic scaling was benchmarked against concurrent attribute liking scores, attribute intensity scores and just-about-right scaling. Across a range of product categories (beer, fresh fruit, tea, flavoured water, crackers, savoury dips), only weak and transient evidence of bias of hedonic scores when concurrently using CATA questions was established. This finding was independent of whether samples were, on average, moderately liked or moderately disliked, and was replicated whether samples were assessed partially, by the sense of smell only, or via full product assessment (appearance, aroma, flavour, taste, aftertaste, mouthfeel). The present research suggests that co-elicitation of hedonic scores and product attribute information using CATA questions may bias the hedonic scores, but will not necessarily do so. This needs to be recognised, leading to more widespread acceptance that co-elicitation has merit. Investigators should decide whether or not to co-elicit product attribute information using CATA questions on a case-by-case basis, acknowledging that bias may occur. Further research is needed to understand when bias is and is not likely to occur.
© 2013 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. Tel.: +64 9 925 7000. E-mail address: [email protected] (S.R. Jaeger).
1. Introduction
Quantitative consumer research is often aimed at determining consumers’ hedonic reaction to the sensory characteristics of products (Lawless & Heymann, 2010). Consumers are asked to sample a set of products and to indicate how much they like them using a hedonic scale (Lim, 2011). In some instances, hedonic information is supplemented with questions on specific sensory characteristics with the aim of understanding consumer preferences and identifying recommendations for product reformulation (Stone & Sidel, 2004). One of the major concerns about including questions on specific sensory characteristics is that they can be a source of bias on hedonic scores (Stone & Sidel, 2004). According to Prescott, Lee, and Kim (2011), asking consumers to complete analytical tasks can hinder the utilisation of hedonic information. Based on previous research (Prescott, 1999, 2004; Small & Prescott, 2005), these authors argue that directing consumer attention to multiple attributes may inhibit a cognitive representation of synthetic characteristics, such as overall liking. In addition, survey research has shown that previous questions have the potential to alter a person’s perception of a product by making certain aspects more salient and relevant (Strack, 1992). Studies on the influence of analytical tasks on hedonic ratings have shown contradictory results and scholars are divided on the topic (Moskowitz, Munoz, & Gacula, 2003). Asking consumers to evaluate sensory attributes using intensity scales, attribute liking questions or just-about-right (JAR) scales has been reported to significantly affect hedonic scores and to affect conclusions regarding consumers’ preference patterns (Earthy, MacFie, & Hedderley, 1997; Popper, Rosenstock, Schraidt, & Kroll, 2004; Prescott et al., 2011). However, some authors have reported no effect (Gacula, Mohan, Faller, Pollack, & Moskowitz, 2008; Mela, 1989; Vickers, Christensen, Fahrenholtz, & Gengler, 1993). In summary, published studies suggest that the influence of analytical tasks on hedonic
0950-3293/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.foodqual.2013.06.001
scores is methodology- and sample-dependent, and that it does not always occur.
Check-all-that-apply (CATA) questions are an increasingly popular technique for collecting analytical consumer evaluations of food products (Adams, Williams, Lancaster, & Foley, 2007; Ares, Barreiro, Deliza, Giménez, & Gámbaro, 2010; Ares & Jaeger, 2013; Dooley, Lee, & Meullenet, 2010; Plaehn, 2012). The CATA method consists of presenting consumers with a predefined list of terms from which they tick all those they find appropriate to describe a focal product. This methodology has been reported to be quick and easy for participants to use (Ares et al., 2010; Driesener & Romaniuk, 2006), and to provide results similar to those of trained assessor panels using Quantitative Descriptive Analysis (Ares et al., 2010; Bruzzone, Ares, & Giménez, 2012; Dooley et al., 2010). Furthermore, it has been proposed that, when used together with hedonic scales, CATA questions have a smaller effect on hedonic scores than other attribute-based question types such as just-about-right or intensity scales (Adams et al., 2007). Tentatively, this may be explained by the fact that CATA questions are not thought to be cognitively demanding and do not encourage respondents to engage in deep processing (Krosnick, 1999; Sudman & Bradburn, 1992). Nevertheless, no published study has yet provided empirical evidence to support the claim by Adams et al. (2007). In view of the growing practice in academia and industry of using consumers for concurrent attribute assessment (Varela & Ares, 2012), there is an increasing need for studies investigating possible biases of CATA questions on consumer hedonic ratings.
The present paper explores the influence of CATA questions on hedonic ratings through nine consumer studies. First, it benchmarks the use of CATA concurrently with hedonic scaling against concurrently obtained attribute liking scores, attribute intensity scores and just-about-right scaling (Studies 1–3). 
In the second part of the manuscript, the degree of biasing effect under different conditions of concurrent hedonic and CATA elicitation is examined (Studies 4–8). The studies were conducted in multiple countries using multiple product categories. Between-subjects designs were used in all studies (see Tables 1 and 2 for an overview), and participants took part in a single research session only.
2. Methodology
2.1. Study 1: liking only vs. liking + intensity ratings vs. liking + CATA (flavoured water)
Study 1 compared the concurrent use of CATA questions and of attribute intensity ratings with regard to producing a bias in liking ratings.
2.1.1. Participants
Consumers (N = 190, 63% female) completed Study 1 as part of a larger study on sensory odour acuity and food preferences. They were aged 18–60 years and self-identified as Caucasian. All participants lived in Auckland, New Zealand. Participants gave voluntary written informed consent and were compensated in cash. According to chi-square tests, consumers were balanced across the three experimental treatments (A, B and C) with respect to age (χ² = 3.86, p = 0.42) and gender (χ² = 5.97, p = 0.051), so that possible differences could be ascribed to the experimental treatment.
2.1.2. Samples
Consumer responses to the aroma of a non-cyclic sulphur-containing carbon-based compound (99% purity) were collected. Two samples (2431 and 694 ppb) were prepared by diluting a stock solution of the compound in water. Consumers received 10 ml of each sample in a wine glass covered by a watch glass and labelled with a 3-digit code. Samples were poured 1–2 h prior to being presented to consumers.
2.1.3. Experimental design, CATA lexicon and data collection
Participants attended a single research session, and a between-subjects design was used. In Treatment A, consumers were asked to smell the flavoured solutions and rate overall liking for the two samples using a labelled 9-pt hedonic scale where the endpoint anchors were 1 = ‘dislike extremely’ and 9 = ‘like extremely’. In Treatment B, consumers first rated overall liking, then rated the intensity of the following four attributes: mussel, seafood, canned/cooked vegetable, seaside/marine. Ratings were collected on a category scale with anchors 0 = ‘I cannot smell anything’, 1 = ‘Extremely weak’ and 9 = ‘Extremely strong’. In Treatment C, consumers evaluated liking first, and then completed a CATA task with the following 15 terms: cabbage, cooked vegetables, canned asparagus, heavy, light, mussel, salty, seafood, seaweed, sickening, sulphuric, seaside, sweet vomit, smoked fish, steamed fish. These terms were developed in pilot work with Plant & Food Research staff. Sample presentation order was counter-balanced. Data collection took place in standard sensory booths under controlled temperature and airflow conditions. Artificial white lighting was used. 
The participants in this study also completed Studies 2 and 6–8. Participants were allocated to experimental treatments such that a person always completed Treatment A, Treatment B or Treatment C (see Tables 1 and 2). Treatment A was always the ‘Hedonic only’ treatment, and the participants in this group were at no time presented with a task where attribute information was elicited concurrently with hedonic scores. Conversely, the participants who completed Treatment B or C always provided both hedonic and attribute-specific information about the focal samples. We used this allocation of participants to experimental treatments to retain participants in ‘stable mindsets’.
2.2. Study 2: liking only vs. liking + attribute liking vs. liking + CATA (salmon dip)
Study 2 compared the use of CATA questions to the use of attribute liking questions concurrent with hedonic scaling.
Table 1. Overview of studies in Part 1 and summary of results.
Study ID | Samples summary | Experimental treatment | Results summary from mixed linear model
Study 1 | Flavoured water; 2 samples; aroma only assessment | (A) Hedonic only; (B) Hedonic + attribute intensity; (C) Hedonic + attribute CATA | F_Exp.Tr. = 0.64 (p = 0.20); F_Sample = 0.95 (p = 0.33); F_Exp.Tr.×Sample = 0.01 (p = 0.99)
Study 2 | Salmon dip; 1 sample; aroma only assessment | (A) Hedonic only; (B) Hedonic + attribute liking; (C) Hedonic + attribute CATA | F_Exp.Tr. = 0.07 (p = 0.93); F_Sample is N/A; F_Exp.Tr.×Sample is N/A
Study 3 | Strawberries; 6 samples; full sample assessment | (A) Hedonic + attribute JAR; (B) Hedonic + attribute CATA | F_Exp.G. = 0.25 (p = 0.61); F_Sample = 14.04 (p < 0.0001); F_Exp.Tr.×Sample = 0.51 (p = 0.46)
Table 2. Overview of studies in Part 2 and summary of results.
Study ID | Samples summary | Experimental treatment | Results summary of linear mixed model
Study 4 | Beer; 4 samples; full sample assessment | (A) Hedonic only; (B) Hedonic + CATA; (C) CATA + Hedonic | F_Exp.Tr. = 3.58 (p = 0.02); F_Sample = 3.28 (p = 0.02); F_Exp.Tr.×Sample = 0.87 (p = 0.51)
Study 5 | Rice crackers (flavoured); 6 samples; full sample assessment | (A) Hedonic only; (B) Hedonic + CATA term order 1; (C) Hedonic + CATA term order 2 | F_Exp.Tr. = 3.91 (p = 0.02); F_Sample = 11.41 (p < 0.0001); F_Exp.Tr.×Sample = 1.44 (p = 0.16)
Study 6 | Green tea (unsweetened); 2 samples; full sample assessment | (A) Hedonic only; (B) Hedonic + delayed CATA; (C) Hedonic + CATA | F_Exp.Tr. = 0.60 (p = 0.56); F_Sample = 10.24 (p = 0.0015); F_Exp.Tr.×Sample = 1.30 (p = 0.27)
Study 7 | Seaweed rice cracker; 1 sample; full sample assessment | (A) Hedonic only; (B) Hedonic + CATA (28 terms); (C) Hedonic + CATA (7 terms) | F_Exp.Gr. = 0.28 (p = 0.76); F_Sample is N/A; F_Exp.Tr.×Sample is N/A
Study 8 | Seafood dip; 1 sample; full sample assessment | (A) Hedonic only; (B) Hedonic + CATA (4Q); (C) Hedonic + CATA (1Q) | F_Exp.Tr. = 0.84 (p = 0.44); F_Sample is N/A; F_Exp.Tr.×Sample is N/A
Study 9 | Crackers (plain); 3 samples; full sample assessment | (A) Hedonic + CATA (1Q); (B) Hedonic + CATA (3QM); (C) Hedonic + CATA (3QS) | F_Exp.Gr. = 0.84 (p = 0.44); F_Sample = 2.74 (p = 0.07); F_Exp.Tr.×Sample = 0.51 (p = 0.73)
Notes: in study 9 experimental treatment B used 3 CATA questions of mixed modality (3QM), whereas experimental treatment C used 3 CATA questions of separate sensory modality (3QS). The terms were unchanged and identical to the 21 terms used in experimental treatment A.
2.2.1. Participants
Participants were the same group of individuals who took part in Study 1 of this research.
2.2.2. Samples
A teaspoon of a commercially available salmon dip (Turkish Kitchen (Auckland, New Zealand), “Flavour it dip”, Smoked salmon & dill with cashew™), served in 30 ml custard cups and labelled with a 3-digit code, was used as the test stimulus.
2.2.3. Experimental design, CATA lexicon and data collection
Participants attended a single research session, and a between-subjects design was used. Treatment A served as the control group, in which consumers were instructed to smell the sample and subsequently rate overall liking using a 9-pt hedonic scale (see Study 1). In Treatment B, consumers first rated overall liking, then liking for four attributes (two related to appearance and two to aroma): colour intensity, visual amount of dill, salmon aroma and smoky aroma, using a 9-pt hedonic scale identical to that in Study 1. In Treatment C, consumers evaluated liking first, and then completed the CATA task in the same ballot. The CATA terms were organised in two columns: Aroma (containing the terms cooked fish, dill/herb, salmon, smoky, sweet, and tangy) and Appearance (artificial, oily, salmon colour: strong, salmon colour: weak, small lumps, smooth). These terms were developed through pilot work with Plant & Food Research staff.
2.3. Study 3: liking + JAR vs. liking + CATA (strawberries)
Study 3 compared the use of CATA questions to just-about-right (JAR) scales when implemented concurrently with hedonic scaling.
2.3.1. Participants
Consumers were randomly recruited among shoppers in a supermarket in Montevideo (Uruguay) based on their stated strawberry consumption (at least once during the week before the study) and their interest in participating (cash incentives were not used). The 120 consumers who took part (58% female, 42% male; aged 18–75 years) were randomly divided into two groups of 60 people: Treatment A and Treatment B. Consumers in both treatments completed the study in the same session. Consumers did not differ in key demographics across treatments (Gender: χ² = 1.98, p = 0.16; Age: χ² = 0.80, p = 0.37).
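The balance check reported above can be illustrated with a Pearson chi-square test on a treatment-by-category table. The sketch below is only an illustration: the counts are hypothetical (the paper reports the statistics, not the underlying tables), the function names are ours, and the p-value helper covers only the one-degree-of-freedom case that applies to a 2 × 2 table.

```python
from math import erfc, sqrt


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(stat / 2.0))


# Hypothetical counts: rows are Treatments A and B (60 consumers each),
# columns are female and male. These are NOT the study's actual counts.
gender_by_treatment = [[38, 22], [32, 28]]

stat = chi_square_statistic(gender_by_treatment)
print(round(stat, 2), round(p_value_df1(stat), 2))

# The reported gender statistic (chi-square = 1.98) maps onto the reported p-value.
print(round(p_value_df1(1.98), 2))
```

Under the one-degree-of-freedom assumption, the reported statistics of 1.98 and 0.80 correspond to p-values of about 0.16 and 0.37, in line with the values given in the text.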
2.3.2. Samples
Six strawberry cultivars were used, of which three were commercially available in Uruguay (Yurí, Yvahé and Guenoa) and three were promising new cultivars developed by Instituto Nacional de Investigación Agropecuaria (INIA, Uruguay) (L20.1, L.53.3 and K50.5). All cultivars were grown under the same environmental and management conditions. Fruits were hand harvested from plants growing in a commercial greenhouse in Salto (Uruguay). Only mature, fully coloured, unblemished fruit that felt firm to the touch were picked and then immediately packed into plastic, vented, lidded containers and stored under air at 2 °C for 1 day prior to the consumer study. For each cultivar, two strawberries were presented to the consumers at room temperature in closed plastic containers labelled with 3-digit random numbers. Samples were presented monadically according to a balanced design (Latin square design), and mineral water was available for rinsing between samples.
2.3.3. Experimental design, CATA lexicon and data collection
A between-subjects design was used. In Treatment A, consumers had to try the strawberries, rate their overall liking using a 9-pt hedonic scale (same anchors as Study 1) and then evaluate six sensory attributes using 5-pt just-about-right scales (1 = ‘not enough’, 3 = ‘just about right’, 5 = ‘too much’): red colour, sweetness, sourness, strawberry odour, strawberry flavour, and hardness. In Treatment B, consumers evaluated liking first, and then completed the CATA task with the following 18 terms: not much red colour, moderate red colour, intense red colour, not very sweet, sweet, too sweet, not very sour, sour, very sour, no strawberry odour, moderate strawberry odour, intense strawberry odour, no strawberry flavour, moderate strawberry flavour, intense strawberry flavour, soft, firm, and hard. These terms were based on published data (Péneau, Brockhoff, Escher, & Nuessli, 2007) and previous studies with trained assessors and consumers (Ares, Barrios, Lareo, & Lema, 2009; Lado, Vicente, Manzoni, & Ares, 2010), and were known from a previous study to be the main attributes responsible for differences in the sensory characteristics of strawberry cultivars in Uruguay (Lado et al., 2010). Although the terms ‘too much strawberry odour’ and ‘too much strawberry flavour’ are not common in JAR scales, they were included in the study because some consumers tended to use them in previous studies. Data collection took place in standard sensory booths under controlled temperature and airflow conditions.
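The balanced (Latin square) presentation order mentioned for the strawberry samples can be sketched as follows. This is an assumption-laden illustration rather than the design actually used: it builds a simple cyclic Latin square, which balances serving position only, and the function names are hypothetical. Designs that also balance first-order carry-over effects require a different construction.

```python
def cyclic_latin_square(samples):
    """Build a cyclic Latin square: each sample appears once per row and per position."""
    n = len(samples)
    return [[samples[(row + pos) % n] for pos in range(n)] for row in range(n)]


def serving_order(square, participant_index):
    """Assign participants to rows of the square in rotation."""
    return square[participant_index % len(square)]


# Cultivar names as listed in the text; the assignment scheme itself is illustrative.
cultivars = ["Yurí", "Yvahé", "Guenoa", "L20.1", "L.53.3", "K50.5"]
square = cyclic_latin_square(cultivars)

for participant in range(3):
    print(participant, serving_order(square, participant))
```

With six samples, every block of six consecutive participants sees each cultivar exactly once in each serving position, so position effects are averaged out across the group.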
S.R. Jaeger et al. / Food Quality and Preference 30 (2013) 242–249
2.4. Study 4: liking only vs. liking + CATA yes/no vs. CATA yes/no + liking (beer) Study 4 tested whether the presence of CATA items on the ballot affects hedonic scores, and specifically whether eliciting CATA responses before rather than after the hedonic question makes a difference. 2.4.1. Participants Participants (N = 129, 44% female) were a group of university students (all aged 18–30) at the Faculty of Science, University of Copenhagen, Denmark. Participants did not receive an incentive for participation. 2.4.2. Samples Four commercially available beers, all brewed by the same commercial brewery, were used as stimuli. They were selected to provide a sufficient span with regard to style and sensory characteristics (Harboe Pilsner, Harboe Classic, Harboe Juleøl, and Harboe Bjørnebryg (Skælskør, Denmark) – indicated respectively as samples A, B, C, D). The beers were stored at room temperature prior to serving. Evaluations took place in a central hallway outside a canteen of the Faculty of Science. Each consumer was monadically served about 5 cl of each beer sample in a plastic cup. Samples were labelled with 3-digit numbers, and served in a randomised order using a Latin square design. 2.4.3. Experimental design, CATA lexicon and data collection A between-subjects design was used. In Treatment A, consumers (n = 42) only gave hedonic ratings. In Treatment B (n = 46), consumers gave hedonic ratings for the same beers, and concurrently completed a CATA questionnaire (on a beer-by-beer basis). Consumers in Treatment C (n = 41) performed the same task as consumers in Treatment B, but with an inverted question order, such that on their ballot the CATA questions appeared before the hedonic rating. For all groups, hedonic responses were elicited with the question “How much do you like this beer?”, and rated on a 9-pt hedonic scale with anchors 1 = ‘Not at all’, 5 = ‘Neutral’ and 9 = ‘Like very much’.
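The Latin square serving orders mentioned above can be sketched in code. The example below is only an illustration under stated assumptions, not the authors' procedure: the text does not say which Latin square was used, so the sketch builds a Williams-type square (a Latin square that is also balanced for first-order carry-over) for an even number of samples, such as the four beers A to D. The names `williams_square` and `serving_order`, and the cyclic assignment of consumers to rows, are hypothetical.

```python
def williams_square(n):
    """Return an n x n Williams-type Latin square (even n) as a list of rows.

    Each row is one serving order, given as sample indices 0..n-1.
    """
    if n < 2 or n % 2 != 0:
        raise ValueError("this simple construction needs an even n >= 2")
    # First row interleaves low and high indices: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Remaining rows are cyclic shifts of the first row.
    return [[(x + shift) % n for x in first] for shift in range(n)]


def serving_order(consumer_index, samples):
    """Assign a consumer to a row of the square (assumed cyclic assignment)."""
    square = williams_square(len(samples))
    row = square[consumer_index % len(samples)]
    return [samples[i] for i in row]


beers = ["A", "B", "C", "D"]
orders = [serving_order(k, beers) for k in range(4)]
for order in orders:
    print(" ".join(order))
```

In this construction every sample appears once in each serving position, and every sample is immediately preceded by every other sample exactly once across the four rows.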
The list of CATA descriptors comprised 10 terms: transparent, sparkling, yellow, rye-bread, flowery, sweet, malt, acidic, alcohol, and caramel. These were developed in pilot work with students and staff at the Department of Food Science at the University of Copenhagen. The CATA question was administered with the wording “Do you find this descriptor in the sample?” and answered with a dichotomous “yes/no” response. This approach differs slightly from the classical formulation (check-all-that-apply), and is known to increase the likelihood that respondents go through the whole list (Rasinski, Mingay, & Bradburn, 1994), thereby reducing satisficing response strategies (Krosnick, 1991). 2.5. Study 5: liking only vs. liking + CATA term order 1 vs. liking + CATA term order 2 (flavoured crackers) Study 5 tested whether the order in which CATA items were listed on the ballot affects hedonic scores, and compared two fixed CATA term orders to hedonic-only elicitation. 2.5.1. Participants Consumers (N = 145, 57% female) completed Study 5 as part of a larger study on sensory odour acuity and food preferences. The recruitment criteria were identical to Study 1, and with the exception of two people, participants did not take part in Study 1. Across the three experimental treatments (A, B, and C), consumers were balanced with respect to age (χ²(2) = 1.16, p = 0.56) and gender (χ²(4) = 1.59, p = 0.81), so that possible differences could be ascribed to the experimental design.
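As a cross-check on the balance tests just reported, the quoted p-values can be recomputed from the quoted statistics. The sketch below is an illustration rather than the authors' analysis (which was run in R): it assumes the statistics are chi-square values on 2 and 4 degrees of freedom, and uses the closed-form upper-tail probability that exists for 1 and for even degrees of freedom. The function name `chi2_sf` is hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Closed forms only: df == 1, or any even df. Other df are not covered
    by this sketch.
    """
    if x < 0 or df < 1:
        raise ValueError("need x >= 0 and df >= 1")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 != 0:
        raise ValueError("odd df > 1 is not covered by this sketch")
    half = x / 2.0
    # exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


print(round(chi2_sf(1.16, 2), 2))  # age balance check
print(round(chi2_sf(1.59, 4), 2))  # gender balance check
```

Under these assumptions the recomputed values agree with the reported p = 0.56 and p = 0.81 to two decimals.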
2.5.2. Samples Six commercially available flavoured rice crackers were used (Pams Original, Fantastic Barbecue, Fantastic Chicken, Peckish™ Thins Salt & Vinegar, Trident Seaweed, and Sakata Wholegrain Smokey Barbecue). Two crackers were placed in a 60 mL lidded plastic cup and labelled with a 3-digit code. 2.5.3. Experimental design, CATA lexicon, and data collection A between-subjects design was used. In Treatment A, consumers (n = 56) only rated overall liking (9-pt labelled scale where 1 = ‘dislike extremely’ and 9 = ‘like extremely’). In Treatment B, consumers (n = 43) first rated overall liking, and then completed a CATA task with 14 items: vinegar, hard, smoky, salty, visible flavouring, sweet, bland, spicy, savoury, crisp, seeds/grains, seaweed, soy sauce and uneven surface. Consumers in Treatment C (n = 46) completed the Treatment B task but used a CATA ballot where items were listed in a different order: uneven surface, hard, savoury, seeds/grains, crisp, spicy, bland, salty, visible flavouring, smoky, soy sauce, seaweed, sweet and vinegar. The two presentation orders of the terms were randomly selected. Sample presentation order was balanced according to a Latin square design. Data collection took place in standard sensory booths under controlled temperature and airflow conditions. Artificial white lighting was used. Water was available for rinsing between samples. 2.6. Study 6: hedonic vs. concurrent hedonic/analytic evaluation vs. delayed analytic (green tea) Study 6 tested whether the presence of CATA items on the ballot has an effect on liking ratings, and also whether the mere expectation of having to perform an analytical task is a sufficient condition to produce a bias of hedonic scores. 2.6.1. Participants Participants were the same group of individuals who took part in Study 1 of this research. 2.6.2.
Samples Two types of commercially available green teas (Chanui Fine Leaf Tea™, and Dilmah All Natural Green Tea – Pure Green™) were used as stimuli. The samples looked identical and were served (20 ml, at room temperature) in 60 ml plastic cups with lids, labelled with a 3-digit code. 2.6.3. Experimental design, CATA lexicon and data collection A between-subjects design was used to compare three experimental treatments. In Treatment A, consumers tasted the green teas and were only asked to rate overall liking for the two samples using a 9-pt hedonic scale. In Treatment B, hedonic and analytical evaluations were split in time. Consumers started by rating overall liking for the two samples in a monadic order. Then, they received the same samples one more time and were asked to evaluate them on a separate ballot with nine CATA terms. Prior to the hedonic assessment, participants were told that they would receive each sample twice and that the second time they would be asked to evaluate them for a set of attributes. These were then read aloud and a CATA-only ballot was shown. In Treatment C, consumers evaluated overall liking first, and then completed the CATA task on the same ballot. The nine sensory terms, identical for Treatments B and C, were selected on the basis of bench-top testing with sensory staff at Plant & Food Research. The terms were: dry, floral, metallic, hay-like, off-flavour, green tea, sweet, bitter, and grassy/vegetable. Sample presentation order was balanced according to an experimental design. Data collection took place in standard sensory booths under controlled temperature and airflow conditions. Artificial white lighting was used. Water was available for rinsing between samples. 2.7. Study 7: liking only vs. liking + short CATA vs. liking + long CATA (seaweed cracker) Study 7 sought to investigate whether potential bias of hedonic scores during co-elicitation with CATA questions was dependent on the length of the CATA question. 2.7.1. Participants Participants were the same group of individuals who took part in Study 1 of this research. 2.7.2. Samples The sample was a commercially available seaweed cracker (Fantastic Snacks’ Seaweed Flavor Rice Cracker™), served in 130 ml custard cups and labelled with a 3-digit code. 2.7.3. Experimental design, CATA lexicon and data collection A between-subjects design was used. Consumers in Treatment A served as the control group, who tasted the sample and rated liking using a 9-pt hedonic scale (same anchors as Study 1). In Treatment B, consumers were asked to taste the sample, rate overall liking and then complete a CATA task with 28 items: soy sauce, fish: weak, rice, salty: low, sticky, greasy, seaweed: weak, thick, dry, brittle, crunchy, sweet: high, rancid, smooth, aftertaste, toasted colour, salty: high, flavorsome, tasteless, oiled/fried flavor, toasted flavor, sweet: low, crisp, fish: strong, off-flavor, seaweed: strong, thin, hard. Consumers in Treatment C tasted the sample, rated overall liking and completed a CATA task with 7 items: grassy, seaweed, crunchy, toasted colour, flavorsome, off-flavor, hard. The CATA terms were generated in pilot work with Plant & Food Research staff. 2.8. Study 8: liking only vs. liking + CATA by modality vs. liking + single CATA (seafood dip) Study 8 explored whether possible bias of hedonic scores would be linked to the implementation of the CATA question, and specifically whether a different effect would be obtained when CATA items are organised by sensory modality as opposed to appearing in a single item list. 2.8.1.
Participants Participants were the same group of individuals who took part in Study 1 of this research. 2.8.2. Samples One sample was used: a seafood dip (Country Goodness’ Seafood Fiesta Flavored sour cream dip™). Consumers received a teaspoon of the dip in a 130 ml custard cup, labelled with a 3-digit code, together with a water cracker (Arnott’s Original Water Cracker™). 2.8.3. Experimental design, CATA lexicon and data collection A between-subjects design was used. Consumers in Treatment A tasted the sample and gave overall liking ratings. In Treatment B, consumers tasted the sample, rated liking and then completed a CATA task in which items were organised under four separate headings: Appearance (white colour, smooth, lumpy), Aroma (garlic, tangy, cheese), Taste (off-flavour, seafood/shellfish, chives/parsley), Texture/Mouthfeel (grainy, soft, creamy). Consumers in Treatment C tasted the sample and completed a single-question CATA with 12 items: garlic, creamy, tangy, parsley/chives, off-flavour, savoury, shellfish, flavoursome, seafood, oily/fatty, mussel, cheese. The CATA terms were generated in pilot work with Plant & Food Research staff.
2.9. Study 9: hedonic + 1 CATA question vs. hedonic + 3 mixed sensory modality CATA questions vs. hedonic + 3 separate sensory modality CATA questions (Crackers) Study 9 explored whether CATA questions of similar length that featured CATA terms from only one or from multiple sensory modalities would lead to different bias on hedonic scores. 2.9.1. Participants Consumers (N = 120), who completed the study as part of a larger study on bakery products, were recruited from the consumer database of Departamento de Ciencia y Tecnología de Alimentos (Montevideo, Uruguay) based on their consumption of bakery products and crackers, as well as their interest and availability to participate in the study. They were aged 18–60 years (64% female). Participants gave written informed consent and were compensated with a small gift. Consumers were randomly divided into three experimental groups of 40 participants, which completed the study in the same session. Key consumer demographics did not differ across groups (Gender: χ² = 0.89, p = 0.35; Age: χ² = 0.14, p = 0.71). 2.9.2. Samples Three commercially available samples of plain crackers were evaluated. The samples were purchased from local supermarkets. One cracker of each sample was served on a plastic plate labelled with a 3-digit code. 2.9.3. Experimental design, CATA lexicon and data collection A between-subjects design was used. The experimental design defined three treatments. One group of participants (Treatment A, n = 40) rated their overall liking using a 9-pt hedonic scale, and then received a single CATA question featuring 21 attributes: hard, toasted colour, greasy, salty, big, adhesive, dry, toasted flavour, thin, heterogeneous colour, crunchy, sour, tasteless, homogeneous colour, soft, off-flavour, thick, small, aftertaste, brittle, and oily flavour.
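The random division of the 120 consumers into three groups of 40 described in 2.9.1 can be sketched as follows. This is a minimal illustration only; the source does not describe how the randomisation was carried out, and the function name `allocate`, the participant numbering and the seed are assumptions.

```python
import random


def allocate(participants, n_groups, seed=None):
    """Shuffle participants and deal them into n_groups groups in turn."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Taking every n_groups-th person gives groups whose sizes differ by at most one.
    return [shuffled[g::n_groups] for g in range(n_groups)]


# Hypothetical participant IDs 1..120, allocated to Treatments A, B and C.
groups = allocate(range(1, 121), 3, seed=2013)
treatments = dict(zip("ABC", groups))
print({name: len(members) for name, members in treatments.items()})
```

Shuffling first and then dealing participants out in turn guarantees equal group sizes, which simple per-person random assignment would not.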
The second group of participants (Treatment B, n = 40) rated their overall liking and answered three CATA questions comprising 7 of the 21 terms, each featuring multiple sensory modalities. The third group of participants (Treatment C, n = 40) rated their overall liking and answered three CATA questions by modality, as follows: “Check all the terms you consider appropriate to describe the appearance of this cracker”, comprising the attributes: toasted colour, big, thin, heterogeneous colour, homogeneous colour, thick and small; and the same question for texture (hard, greasy, adhesive, dry, crunchy, soft, brittle) and flavour (salty, toasted flavour, sour, tasteless, off-flavour, aftertaste and oily flavour). These terms were generated using available literature (Vázquez, Curia, & Hough, 2009) and previous qualitative consumer studies. Samples were assessed monadically according to a balanced random design (Williams’ design). Samples could be tasted more than once. The test took place in standard sensory booths, under white lighting, controlled temperature (23 °C) and airflow conditions. 2.10. Data analysis For each study, linear mixed modelling was performed to uncover significant differences in hedonic ratings across experimental treatments. For those studies with only one sample (Studies 2, 7 and 8) the linear mixed model included experimental treatment as fixed effect and consumer (within experimental treatment) as random effect. For the other studies (Studies 1, 3–6 and 9), in which more than one sample was considered, the linear mixed model included experimental treatment, sample and their interaction as fixed effects, and consumer (within experimental
treatment) as random effect. A 5% significance level was used in the analyses. When effects were significant, honestly significant differences were calculated using Tukey’s test. All analyses were carried out in R, Version 2.11.1. 3. Results¹ 3.1. Influence of CATA question on hedonic scores and comparison with other attribute scaling methodologies Table 1 reveals that overall liking scores elicited concurrently with CATA questions did not significantly differ from those elicited concurrently with attribute intensity scales (Study 1), attribute liking questions (Study 2) or just-about-right scales (Study 3). In Studies 1 and 2 the inclusion of attribute intensity scales, attribute liking questions or CATA questions did not lead to a systematic shift in overall liking scores (p > 0.07). Moreover, in Study 1 CATA questions and attribute liking did not cause a significant change in the rank order of samples, as seen by the non-significant interaction effect (p = 0.99). For completeness, the average liking scores by experimental treatment and product are shown in Table 3. 3.2. Degree of biasing effect of different conditions of concurrent hedonic and CATA elicitation Part 2 of this research included Studies 4–9, and the results regarding the influence of concurrent use of CATA questions on hedonic scores revealed evidence of bias in one instance. Specifically, significant main effects of sample and experimental treatment on hedonic scores were established in Study 5. Pair-wise comparisons revealed that the mean hedonic score across the six rice cracker samples was significantly different between Treatments A and C (M(Hedonic only) = 6.4, M(Hedonic + CATA order 2) = 5.9, p = 0.0009), but not between Treatments A and B (M(Hedonic only) = 6.4, M(Hedonic + CATA order 1) = 6.3, p = 0.44). The effect on hedonic scores was transient in the sense that hedonic bias was observed with CATA terms listed in one order, whereas bias did not occur when CATA terms were listed in a different order.
The interaction between sample and experimental treatment was not significant (Table 2), suggesting that the influence of the position of the CATA ballot did not lead to changes in consumers’ preference patterns. In Study 4 a significant effect on hedonic scores due to experimental treatment was also established. However, the difference was found between the two treatments where CATA was elicited concurrently with overall liking, with the Hedonic After treatment giving significantly lower ratings than the Hedonic First treatment (M(Hedonic After) = 4.47 < M(Hedonic First) = 5.11, adj. p = 0.017). No significant differences were found between overall liking scores in the Hedonic Only treatment (M(Hedonic Only) = 4.74) and either of the two treatments with concurrent rating of CATA, and this outcome did not change when the two latter groups were combined (t(514) = 0.33, p = 0.74). Although participants in the Hedonic After treatment did produce lower ratings than those in the Hedonic Only treatment, this difference did not reach statistical significance (M(Hedonic After) = 4.47, M(Hedonic Only) = 4.74, p = 0.24). As in Study 5, the interaction between sample and experimental treatment was not significant (Table 2), suggesting that the influence of the position of the CATA ballot did not lead to changes in consumers’ preference patterns. Taken together, the results from Part 2 of this research revealed only weak and transient evidence that CATA questions bias co-elicited hedonic scores. The number of CATA questions, the length of the CATA questionnaire and the use of mixed-modality or single-modality CATA questions did not seem to bias hedonic scores. Moreover, asking consumers to complete a hedonic task for all samples and then answer the CATA question yielded the same results as asking them to complete one task after another for each sample. The one feature that distinguished Study 5 from the other studies in Part 2 was the inclusion of a larger number of samples (6 vs. 1–4). Product category or degree of liking of the products did not appear to influence the results.
¹ It is beyond the scope of this paper to report the product-specific attribute information generated in the focal studies. Interested readers may contact the authors for further details.
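The treatment means quoted in the Results for Studies 4 and 5 can be approximately recovered from the per-sample means listed in Table 3. The sketch below is a plausibility check only, under two assumptions that are not stated in the source: samples are weighted equally, and the rounded one-decimal values from Table 3 are accurate enough for the comparison. The dictionary layout and the function name `treatment_means` are hypothetical.

```python
from statistics import mean

# Per-sample mean liking copied from Table 3 (Treatments A, B, C).
table3 = {
    "Study 4 (beer)": {
        "A": [4.7, 5.1, 5.0, 4.1],  # Hedonic Only
        "B": [5.4, 5.3, 5.0, 4.6],  # Hedonic First (liking, then CATA)
        "C": [4.0, 5.0, 4.5, 4.3],  # Hedonic After (CATA, then liking)
    },
    "Study 5 (rice crackers)": {
        "A": [7.1, 6.7, 6.8, 5.9, 6.0, 6.0],  # Hedonic only
        "B": [7.3, 6.0, 6.0, 6.1, 6.2, 6.0],  # Hedonic + CATA order 1
        "C": [6.7, 6.6, 5.7, 5.5, 5.4, 5.2],  # Hedonic + CATA order 2
    },
}


def treatment_means(study):
    """Unweighted mean of the per-sample means for each treatment."""
    return {t: mean(values) for t, values in table3[study].items()}


for study in table3:
    print(study, {t: round(m, 2) for t, m in treatment_means(study).items()})
```

The recomputed values sit within a few hundredths of the reported means (4.74, 5.11 and 4.47 for Study 4; 6.4, 6.3 and 5.9 for Study 5); small gaps are expected because Table 3 is rounded to one decimal.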
4. Discussion and conclusions In Part 1 of this research three studies were conducted to examine whether concurrent use of attribute intensity scales, attribute liking questions or just-about-right scales influenced hedonic scores. We found no evidence of hedonic bias, and these results support those reported by Mela (1989), Vickers et al. (1993) and Gacula et al. (2008) when working with intensity scales, attribute liking or just-about-right scales, respectively. Six studies were conducted in Part 2 to examine whether concurrent use of CATA questions to obtain sensory product characterizations influenced hedonic scores. The evidence of bias was weak and transient. Bias was established in one of six studies and only when the 14 CATA terms used to characterise the products were listed in one of the two tested orders (Study 5). In Study 4 hedonic bias was not established per se, but differences were observed between overall liking scores depending on whether CATA questions were asked before or after the hedonic question. Previously, Ares and Jaeger (2013) conducted three consumer studies and found no evidence of concurrent use of sensory CATA questions resulting in bias of hedonic scores. In their seminal work on CATA questions for sensory product characterization, Adams et al. (2007) stated that CATA questions do not produce a large bias on liking scores. Our results largely support this claim and the suitability of CATA questions for concurrent elicitation of consumers’ sensory and hedonic responses to food products. Reasons why bias of hedonic scores is unlikely to occur when CATA questions are used concurrently with the hedonic question may be linked to the characteristics of CATA questions. When completing a CATA question consumers have to check all the terms they consider appropriate for describing the product from a list that contains terms that are both applicable and not applicable to describe it.
Therefore, consumers do not need to strongly focus their attention on each of the terms, which could minimise the “priming effect” and the activation of information that can become more accessible, even without consumers’ awareness, when rating overall liking (Strack, 1992). For this reason, the extent to which consumer attention is directed towards specific attributes when evaluating their overall liking may be minimised. However, we cannot ignore that some evidence of bias was established. The work by Prescott et al. (2011) supports this and, moreover, implies that bias will always be observed if co-elicitation of attribute information is performed. These authors suggested that asking analytical questions prevents consumers from constructing a synthetic representation of the product, which affects overall liking scores. Overall, our results suggest that concurrent sensory product characterization by CATA questions has the potential to bias hedonic scores. However, it should be taken into account that the observed influence of CATA questions on overall liking scores could have occurred by chance due to the numerous statistical tests performed. Further, we acknowledge possible bias of the results due to learning effects arising from the same group of participants completing five of the reported studies.
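The caveat about chance findings across numerous tests can be made concrete with a short calculation. The sketch below is an idealised illustration, not part of the authors' analysis: it assumes one treatment test per study (nine in total) and treats the tests as independent, which is not strictly true here because five studies shared the same participants. The function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


n_studies = 9  # assumption: one treatment main-effect test per study
print(round(familywise_error_rate(0.05, n_studies), 2))
print(round(bonferroni_threshold(0.05, n_studies), 4))
```

Under these assumptions, roughly a one-in-three chance of at least one spurious significant treatment effect would be expected even if no bias existed in any study.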
Table 3. Means and standard deviations (mean ± SD) for liking ratings obtained across the studies. Ratings were collected on a 9-pt hedonic scale (1 = ‘dislike extremely’ and 9 = ‘like extremely’). Treatment A is the control group (liking only) for all studies except Study 9.

Study ID                     Sample   Treatment A   Treatment B   Treatment C
Study 1 (Flavoured water)    1        4.3 ± 2.2     4.7 ± 2.0     4.4 ± 2.0
Study 1 (Flavoured water)    2        4.6 ± 2.0     4.8 ± 1.7     4.7 ± 1.7
Study 2 (Salmon dip)         1        7.0 ± 1.9     7.1 ± 1.8     7.0 ± 1.3
Study 3 (Strawberries)       1        6.0 ± 2.5     6.3 ± 2.5     N/A
Study 3 (Strawberries)       2        5.8 ± 2.5     5.3 ± 2.6     N/A
Study 3 (Strawberries)       3        5.9 ± 2.3     5.9 ± 2.5     N/A
Study 3 (Strawberries)       4        4.8 ± 2.6     4.5 ± 2.7     N/A
Study 3 (Strawberries)       5        5.8 ± 2.4     5.6 ± 2.5     N/A
Study 3 (Strawberries)       6        5.9 ± 2.2     5.9 ± 2.4     N/A
Study 4 (Beer)               1        4.7 ± 2.1     5.4 ± 1.8     4.0 ± 1.8
Study 4 (Beer)               2        5.1 ± 2.2     5.3 ± 1.8     5.0 ± 1.7
Study 4 (Beer)               3        5.0 ± 2.0     5.0 ± 2.0     4.5 ± 2.1
Study 4 (Beer)               4        4.1 ± 2.7     4.6 ± 2.5     4.3 ± 2.8
Study 5 (Rice crackers)      1        7.1 ± 1.8     7.3 ± 1.0     6.7 ± 1.5
Study 5 (Rice crackers)      2        6.7 ± 1.7     6.0 ± 2.0     6.6 ± 1.8
Study 5 (Rice crackers)      3        6.8 ± 1.6     6.0 ± 1.8     5.7 ± 2.0
Study 5 (Rice crackers)      4        5.9 ± 2.0     6.1 ± 2.1     5.5 ± 1.8
Study 5 (Rice crackers)      5        6.0 ± 2.2     6.2 ± 1.8     5.4 ± 2.2
Study 5 (Rice crackers)      6        6.0 ± 2.3     6.0 ± 2.1     5.2 ± 2.1
Study 6 (Green tea)          1        4.0 ± 2.1     3.8 ± 2.0     4.3 ± 1.8
Study 6 (Green tea)          2        3.2 ± 1.8     3.6 ± 1.9     3.6 ± 1.5
Study 7 (Seaweed cracker)    1        7.2 ± 1.2     7.1 ± 1.4     7.0 ± 1.4
Study 8 (Seafood dip)        1        6.6 ± 2.2     7.0 ± 1.4     7.1 ± 1.6
Study 9 (Plain crackers)     1        6.6 ± 1.8     6.9 ± 1.4     6.5 ± 1.6
Study 9 (Plain crackers)     2        6.3 ± 2.0     6.5 ± 2.0     6.5 ± 1.9
Study 9 (Plain crackers)     3        5.9 ± 2.0     5.8 ± 1.7     5.9 ± 1.8

The reasons why bias occurred in some studies and not in others are unclear at present, but may be linked to the differences between the studies. Gacula (in Moskowitz et al., 2003) mentioned that a lack of ability to generalise the results of single studies has also been observed when comparing intensity scaling by trained panellists and consumers, and he noted that this may be due to effects linked to product categories. In summary, further research is warranted into the conditions where bias occurs, and until such time we advocate a pragmatic perspective whereby investigators make informed decisions about whether or not to co-elicit CATA attribute information and, if choosing to do so, acknowledge that hedonic scores may be biased. In a broader perspective, it seems that concurrent elicitation of attribute information in hedonic testing (using intensity scales, attribute liking questions, just-about-right scales or CATA questions) has the potential to bias hedonic scores. In Part 1, although we did not establish evidence of hedonic bias when using intensity scales, attribute liking questions or just-about-right scales, others have reported this. The inconsistency among available research findings thus suggests that the effect of analytical questions may be owing to multiple factors, which can be classified as relating to the product, the test ballot/testing conditions, and/or the test participants. While most previous research, including the present work, has focused on the first two sources of bias, future research could fruitfully investigate whether and how inter-individual differences play a role. The bias of analytical questions, as well as other biases, has been explained within the paradigm of effort avoidance (Krosnick, 1991; Kool, McGuire, Rosen, & Botvinick, 2010), which posits that anticipated cognitive demands have a disruptive effect on the hedonic experience.
However, research in cognitive psychology shows that individuals vary greatly in the way they respond to effortful cognitive activities. Thus, the issue might be productively explored by using existing psychographic scales to account for inter-individual variation in cognitive effort avoidance, such as the need for cognition scale (Cacioppo & Petty, 1982; Cacioppo, Petty, & Kao, 1984) and the BIS/BAS scale (Carver & White, 1994). It would also be of interest to test other motivation-related factors that can have a moderating
factor on effort aversion, both externally – such as the presence of a monetary incentive – and inter-individually – such as product usage and involvement. In conclusion, existing results demonstrate that the mere presence of analytical questions does not consistently bias liking ratings, and thus the notion that co-elicitation, in isolation, is sufficient to modify hedonic response should be refuted. Tentatively, a bias is more likely to arise from the interaction effect of two or more co-occurring factors. Research into the experimental conditions that are/are not associated with bias is needed. Another relevant issue to consider is how the inclusion of non-sensory terms in CATA questions affects hedonic scores. Author contributions SRJ, GA and DG jointly conceived the research, analysed the data and wrote the paper. All other authors contributed to data collection. Acknowledgements Members of the Sensory & Consumer Science team at Plant & Food Research, particularly SokLeang Chheang, David Jin and Denise Hunter, are thanked for help in planning and collection of data in Studies 1–2 and 5–8. Financial support for Studies 1–2 and 5–8 was received from The New Zealand Ministry for Business, Innovation & Employment and Plant & Food Research. Study 4 was supported by the Danish Agency for Science, Technology and Innovation (through the consortium Danish Microbrew – Product innovation and quality) and by the Danish Ministry of Economic and Business Affairs (through the project Local Foods in Denmark). Additional support for Study 4 was provided by the Faculty of Science, University of Copenhagen, Denmark. The help of Maria Tougaard Andersen, Signe Maria Udengaard Christensen, Marlene Schou Grønbeck, Marie Wolsing Laugesen, and Jonas Astrup Pedersen with the data collection is thankfully acknowledged. Instituto Nacional de Investigación Agropecuaria (Uruguay) is thanked for providing the strawberry samples and carrying out
data collection in Study 3. Comisión Sectorial de Investigación Científica (Universidad de la República, Uruguay) is thanked for financial support for Study 9. References Adams, J., Williams, A., Lancaster, B., & Foley, M. (2007). Advantages and uses of check-all-that-apply response compared to traditional scaling of attributes for salty snacks. Poster presented at the 7th Pangborn Sensory Science Symposium. Minneapolis, MN, USA (12–16 August). Delegate Manual. Ares, G., Barreiro, C., Deliza, R., Giménez, A., & Gámbaro, A. (2010). Application of a check-all-that-apply question to the development of chocolate milk desserts. Journal of Sensory Studies, 25, 67–86. Ares, G., Barrios, S., Lareo, C., & Lema, P. (2009). Development of a sensory quality index for strawberries based on correlation between sensory data and consumer perception. Postharvest Biology and Technology, 52, 97–102. Ares, G., & Jaeger, S. R. (2013). Check-all-that-apply questions: Influence of attribute order on sensory product characterization. Food Quality and Preference, 28, 141–153. Bruzzone, F., Ares, G., & Giménez, A. (2012). Consumers’ texture perception of milk desserts. II – Comparison with trained assessors’ data. Journal of Texture Studies, 43, 214–226. Cacioppo, J. T., & Petty, R. E. (1982). The need for cognition. Journal of Personality and Social Psychology, 42, 116–131. Cacioppo, J. T., Petty, R. E., & Kao, C. F. (1984). The efficient assessment of need for cognition. Journal of Personality Assessment, 48, 306–307. Carver, C. S., & White, T. L. (1994). Behavioral inhibition, behavioral activation, and affective responses to impending reward and punishment: The BIS/BAS scales. Journal of Personality and Social Psychology, 67, 319–333. Dooley, L., Lee, Y., & Meullenet, J. (2010). The application of check-all-that-apply (CATA) consumer profiling to preference mapping of vanilla ice cream and its comparison to classical external preference mapping. Food Quality and Preference, 21, 394–401.
Driesener, C., & Romaniuk, J. (2006). Comparing methods of brand image measurement. International Journal of Market Research, 48, 681–698. Earthy, P. J., MacFie, H. J. H., & Hedderley, D. (1997). Effect of question order on sensory perception and preference in central location trials. Journal of Sensory Studies, 12, 215–237. Gacula, M., Jr., Mohan, P., Faller, J., Pollack, L., & Moskowitz, H. R. (2008). Questionnaire practice. What happens when the JAR scale is placed between two ‘‘overall’’ acceptance scales? Journal of Sensory Studies, 23, 136–147. Kool, W., McGuire, J. T., Rosen, Z. B., & Botvinich, M. M. (2010). Decision making and the avoidance of cognitive demand. Journal of Experimental Psychology: General, 139, 665–682. Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213–236. Krosnick, J. A. (1999). Survey research. Annual Review of Psychology, 50, 537–567.
Lado, J., Vicente, E., Manzzioni, A., & Ares, G. (2010). Application of a check-all-that-apply question for the evaluation of strawberry cultivars from a breeding program. Journal of the Science of Food and Agriculture, 90, 2268–2275. Lawless, H. T., & Heymann, H. (2010). Sensory evaluation of food. Principles and practices. New York, NY: Springer. Lim, J. (2011). Hedonic scaling: A review of methods and theory. Food Quality and Preference, 22, 733–747. Mela, D. J. (1989). A comparison of single and concurrent evaluation of sensory and hedonic attributes. Journal of Food Science, 54, 1098–1100. Moskowitz, H. R., Muñoz, A. M., & Gacula, M. C. (2003). Viewpoints and controversies in sensory science and consumer product testing. Trumbull, Connecticut: Blackwell Publishing. Péneau, S., Brockhoff, P. B., Escher, F., & Nuessli, J. (2007). A comprehensive approach to evaluate the freshness of strawberries and carrots. Postharvest Biology and Technology, 45, 20–29. Plaehn, D. (2012). CATA penalty/reward. Food Quality and Preference, 24, 141–152. Popper, R., Rosenstock, W., Schraidt, M., & Kroll, B. J. (2004). The effect of attribute questions on overall liking ratings. Food Quality and Preference, 15, 853–858. Prescott, J. (1999). Flavour as a psychological construct: Implications for perceiving and measuring the sensory qualities of foods. Food Quality and Preference, 10, 349–356. Prescott, J., Lee, S. M., & Kim, K. (2011). Analytic approaches to evaluation modify hedonic responses. Food Quality and Preference, 22, 391–393. Prescott, J. (2004). Psychological processes in flavour perception. In A. J. Taylor & D. Roberts (Eds.), Flavour perception (pp. 256–277). London: Blackwell Publishing. Rasinski, K. A., Mingay, D., & Bradburn, N. M. (1994). Do respondents really “mark all that apply” on self-administered questions? Public Opinion Quarterly, 58, 400–408. Small, D. M., & Prescott, J. (2005). Odor/taste integration and the perception of flavour.
Experimental Brain Research, 166, 345–357. Stone, H., & Sidel, J. L. (2004). Sensory evaluation practices. San Diego, USA: Elsevier, Academic Press. Strack, F. (1992). ‘‘Order effects’’ in survey research: activation and information functions of preceding questions. In N. Schwarz & S. Sudman (Eds.), Context effects in social and psychological research (pp. 23–34). New York, NY: SpringerVerlag. Sudman, S., & Bradburn, N. M. (1992). Asking questions. San Francisco, CA: JosseyBass. Varela, P., & Ares, G. (2012). Sensory profiling, the blurred line between sensory and consumer science. A review of novel methods for product characterization. Food Research International, 48, 893–908. Vázquez, M. B., Curia, A., & Hough, G. (2009). Sensory descriptive analysis, sensory acceptability and expectation studies on biscuits with reduced added salt and increased fibre. Journal of Sensory Studies, 24, 498–511. Vickers, Z. M., Christensen, C. M., Fahrenholtz, S. K., & Gengler, I. M. (1993). Effect of the questionnaire design and the number of samples taste on hedonic ratings. Journal of Sensory Studies, 8, 189–200.
Paper V
Giacalone, D., Duerlund, M., Bøegh-Petersen, J., Bredie, W. L. P., Frøst, M. B. (Under review). The effect of stimulus collative properties on consumers’ flavor preferences.
The effect of stimulus collative properties on consumers’ flavor preferences
Davide Giacalone¹, Mette Duerlund, Jannie Bøegh-Petersen, Wender L. P. Bredie & Michael Bom Frøst
Department of Food Science, Faculty of Science University of Copenhagen
Abstract. The present work investigated consumers’ hedonic response to flavors in light of Berlyne’s (1967) collative-motivational model of aesthetic preferences. According to this paradigm, sensory preferences are a function of a stimulus’ arousal potential, which is determined by its collative properties. The relationship between overall arousal potential and hedonic response takes the shape of an inverted “U”, reaching an optimum at a certain level of arousal potential. In three independent studies, using different sets of novel beers as stimuli, consumers reported their hedonic response and rated three collative properties: novelty, familiarity and complexity. Relationships between the selected collative properties and hedonic ratings revealed patterns clearly in line with Berlyne’s predictions (curvilinear effect) with regard to stimulus novelty, whereas mixed results were obtained for familiarity and complexity. Further, the moderating role of specific consumer characteristics, such as product knowledge, food neophobia and variety seeking tendency, was investigated and found to significantly affect all response variables to various extents.
Keywords: Arousal; Complexity; Novelty; Familiarity; Hedonic response; Flavor perception
Introduction
Product innovations can be a source of appeal to consumers, and are generally considered an important source of competitive advantage (Calantone & Cooper, 1979). Looking at the factors behind new product success, a recurring theme in the literature is that consumers hold a dualistic attitude towards product innovativeness (van Trijp & van Kleef, 2008), and can often respond reluctantly towards innovative or new products. In the field of consumer psychology, this reluctance is explained as resulting from the lack of understanding of the new products’ value,
1 Corresponding author: Tel.: +45 3533 1018, Fax: +45 3533 3509, e-mail: [email protected]
Working paper
plus the learning costs associated with effectively dealing with new products (Mugge & Dahl, 2011). In the food and beverage industry, product innovation is a key factor in determining a company’s success, yet the area is fraught with risk: failure rates in excess of 90% are common in the commercial food sector (Stewart-Knox & Mitchell, 2003; Traill & Grunert, 1997). In part, this is due to the low rate of “true” innovation, since over 75% of new products being launched are in fact “copycats” or “me-too” products (Stewart-Knox & Mitchell, 2003). A more fundamental reason has to do with the lack of a theoretical basis for explaining consumer acceptance of novel foods, which is further complicated by the peculiar nature of our species’ eating behavior. Humans’ ambivalent behavior towards new foods, oscillating between approach and avoidance, has received considerable attention in the literature (see van Trijp & van Kleef, 2008, for a review). This dualistic behavior has been labeled the “generalist paradox” (Rozin, 1976) or “omnivore’s dilemma” (Fischler, 1990), and is phylogenetically determined by the concurrent need to eat a varied diet to sustain physical growth and maintenance (need to experiment), while being cautious enough to avoid ingestion of inedible or potentially harmful food (need for conservatism). In this paper, we seek to further our understanding of consumers’ acceptance of flavor stimuli by testing a theoretical framework, Berlyne’s collative-motivational model (Berlyne, 1966; 1967; 1970), for explaining consumers’ hedonic responses to a realistic beverage.
Collative properties and consumer preferences
Individuals are both variety seeking and novelty seeking, at both a conscious and a non-conscious level. Variety seeking is especially pronounced for products with aesthetic value (Fishbach, Ratner, & Zhang, 2011), that is, a product’s capacity to delight one or more of our sensory modalities (Desmet & Hekkert, 2007).
In apparent contradiction to this is the concurrent positive effect of familiarity on aesthetic appraisal, which stems from the successful preservation of existing knowledge and the ability of the cognitive apparatus to recognize and categorize a stimulus (Mandler, 1982; Veryzer & Hutchinson, 1998). These two apparently contradictory tendencies (we like what we know vs. we sometimes appreciate the new) are combined in a popular principle of industrial design, the MAYA principle (Loewy, 1951), MAYA being an acronym for “Most advanced, yet acceptable”. It prescribes that designers need to find a balance between being as innovative as possible and preserving an object’s typicality, here intended as the degree to which an object is representative of a category. A theoretical foundation of the MAYA principle is found in Daniel Berlyne’s basic work on aesthetics, in which it is proposed that human preferences are a function of a stimulus’ arousal potential which, in turn, is determined by the stimulus’ psychophysical properties (related to the quality and quantity of the physicochemical characteristics of the stimulus), ecological properties (emotional
D. Giacalone et al.- Stimulus collative properties and consumer’s flavor preferences
associations with the stimulus), and, most relevant for the present research, collative properties (Berlyne, 1967). These collative properties are defined by Berlyne as depending on the comparison (or collation) of a stimulus’ elements or attributes, and on the degree to which these elements coexist or conflict (Berlyne, 1966; 1967). Arousal is defined as the psychobiological state of alertness or excitation of a person (Berlyne, 1967). Collative properties known to raise arousal include, inter alia, novelty, complexity, familiarity, surprisingness and incongruity (Berlyne, 1963; 1967). According to Berlyne, the relationship between arousal potential and hedonic response follows an inverted U-shaped pattern (Fig. 1), whereby stimuli that are either very familiar or very novel are disliked (too little or too much arousal potential, respectively), whereas stimuli with an overall moderate arousal potential are preferred. This theory is in concordance with research in affective psychology showing that individuals derive positive affect from the successful interpretation of perceived ambiguity (Mandler, 1982), implying that both very familiar and very novel stimuli will frustrate (though for opposite reasons) the satisfaction derived from decoding a stimulus.
Fig. 1 – Relationship between arousal potential and hedonic response (adapted from Berlyne, 1970).
Berlyne’s paradigm has been widely applied to explain sensory preferences, in most cases using visual stimuli (Veryzer & Hutchinson, 1998; Whitfield, 1983; Blijlevens, Carbon, Mugge, & Schoormans, 2012; Cox & Cox, 1994; Mielby, Kildegaard, Gabrielsen, Edelenbos, & Thybo, 2012; Hekkert, Snelders, & Van Wieringen, 2003; Blijlevens, Gemser, & Mugge, 2012), and less frequently auditory stimuli (Martindale & Moore, 1989; North & Hargreaves, 1997). Little is known about whether the theory can also explain underlying preference structures in the so-called “lower senses” (Korsmeyer, 1999), taste and smell. Acquiring this knowledge would be of both scientific and practical importance, since the flavor of a food is a major component of consumers’ experienced quality, and a major driver of repeated purchase (Bruhn, 2008;
Cardello, Schutz, & Lesher, 2007; Moskowitz & Krieger, 1995). The need for research addressing how a product’s success is related to its perceived novelty has also been acknowledged as being of specific importance for the food and beverage domain (van Trijp & van Kleef, 2008). The present research specifically focuses on testing the collative-motivational model on the hedonic appraisal of flavors. For clarity, the term “flavor” is here intended as defined by ISO: the complex combination of the olfactory, gustatory and trigeminal² sensations perceived during tasting (ISO 5492:2008 – Sensory Analysis: Vocabulary). The three modalities are collectively known as the chemical senses, as they require direct contact between chemical compounds and receptors. Among the arousal-inducing collative properties discussed by Berlyne, the most important ones are novelty and complexity (Berlyne, 1967; 1971). They are also the properties primarily examined in the present research, together with perceived familiarity, which we treat as a separate dimension. Novelty is a collative property related to the distance between expectation and perception (Berlyne, 1950; Berlyne, 1966; Berlyne, 1970). As an arousal-stimulating property, novelty is related to both positive hedonic responses (curiosity and exploratory behavior) and negative ones (fear and withdrawal), and its relationship with liking should therefore follow an inverted U shape (Berlyne, 1950). In particular, a positive appraisal is given when novelty refers to some unexpected feature in familiar material, that is, to something that is in some degree similar and in some degree dissimilar to what is well known to an individual (Berlyne, 1950). Familiarity refers to whether the stimulus has been encountered before by an individual. For consumer products, familiarity is often operationalized as typicality, i.e.
the degree to which an object is regarded to be representative of a category (Veryzer & Hutchinson, 1998; Blijlevens et al., 2012; Hekkert et al., 2003; Hekkert & Leder, 2008). Hence, familiarity can be thought to measure how well a sensory stimulus from a new product fits previously encountered products in the category. This relies on the sensory memories that each individual has stored. If the fit is close, the categorization will be very fast and the product will be perceived as familiar. Both novelty and familiarity have to do with an individual’s expectations and past experience with a product. However, they are not two extremes of a single dimension. A novel stimulus is one that has some surprising elements, not necessarily one that has not been encountered before. The less than perfect correlation between novelty and familiarity has been observed empirically in previous studies (Hekkert et al., 2003; Lévy, MacRae, & Köster, 2006). From a methodological perspective, this implies that they likely reflect slightly different perceptual dimensions, and thus can and should be measured separately. This
2 Trigeminal refers to the nervus trigeminus, the fifth cranial nerve, a branch of which innervates the oral cavity and carries information from the general chemical sense, chemesthesis. Common chemesthetic sensations are the burn from alcohol or chili, the cooling from menthol and the sting from carbonation.
view is consistent with recent neuroscientific evidence suggesting that separate neural processes underlie perception of familiarity and novelty (in the posterior parahippocampal gyrus and the anterior half of the hippocampus, respectively), both of which contribute independently to stimulus recognition and memory performance (Daselaar, Fleck, Prince, & Cabeza, 2006). Complexity is another arousal-stimulating collative property, which is related to the number of discernible elements within a stimulus, and to the degree to which several elements are responded to as a coherent unit (Berlyne, 1960). Complexity is thus very close to the perceived ambiguity of a stimulus, and to the cognitive effort necessary for its interpretation. Like novelty, complexity is an arousal-inducing property that can lead to either positive or negative affect. In flavor research, perceived complexity has similarly been defined as the number of separate sensory attributes that make up the total impression a person has of a stimulus (Jellinek & Köster, 1979; 1983; Moskowitz & Barbe, 1977)³. Previous research has demonstrated that odor and taste complexity is a concept which is meaningful and directly measurable with untrained subjects (Jellinek & Köster, 1979; 1983; Lévy et al., 2006; Moskowitz & Barbe, 1977; Sulmont-Rossé, Chabanet, Issanchou, & Köster, 2008). Following Berlyne’s work, the working assumption is that perceived complexity, familiarity, and novelty jointly determine the arousal potential of a flavor stimulus. Thus, we tested the following theory-based hypothesis:
H1: The arousal potential of a flavor can be determined by its combined degree of novelty, familiarity and complexity, and will be in an inverted U-shaped relationship with hedonic response, such that a flavor with a moderate arousal potential (defined in terms of novelty, familiarity and complexity) will be preferred over a flavor with low or high arousal potential.
Relevant individual traits
As with any complex behavior, aesthetic appreciation is likely to involve a number of factors. In the present studies, we chose to focus on three personality traits relevant for food-related consumer behavior, and thus relevant to test within the collative-motivational model. These consumer traits are: product knowledge, variety seeking tendency and food neophobia.
3 Although early work on the topic (Jellinek & Köster, 1979; 1983) suggested that perceived complexity is related to chemical complexity (the number of different compounds actually present in a stimulus), subsequent research has determined that this association is not straightforward. An important corpus of work has since documented that perceptual processing of odors in humans is mostly associative in nature (i.e., we tend to perceive complex object odors as unique stimuli, rather than as a muddle of components). Accordingly, the relationship between the chemical complexity and perceived complexity of a flavor is rapidly lost with increasing number of flavor components (Livermore & Laing, 1996, 1998; Marshall, Laing, Jinks, & Hutchinson, 2006).
Product knowledge
In addition to a stimulus’ intrinsic qualities, the perception of collative properties also depends on the individual’s ability to decode the stimulus. This ability is in turn dependent on the individual’s experience and, for consumer products, can be thought of as product knowledge. Product knowledge has two main components: familiarity, i.e. the number of experiences accumulated with the products, and expertise, i.e. the ability to perform product-related tasks (Alba & Hutchinson, 1987). The importance of product knowledge for consumers’ judgments has been well established (Alba & Hutchinson, 1987; Park, Mothersbaugh, & Feick, 1994; Sujan, 1985; Selnes & Howell, 1999), and is a very relevant factor to explore in the light of the collative-motivational model. Knowledge, through exposure, reduces the cognitive effort needed to process a stimulus (Alba & Hutchinson, 1987; Latour & Latour, 2010; Sujan, 1985), allows for a better categorization (Medin & Smith, 1984; Mervis & Rosch, 1984; Rosch & Mervis, 1975; Sujan, 1985), and increases product-related memory (Alba & Hutchinson, 1987). Knowledge and expertise are also associated with enhanced memory and higher efficiency in processing sensory information in the so-called lower senses (Valentin, Dacremont, & Cayeux, 2010). Generally speaking, product knowledge increases with every type of exposure to a product (not only direct consumption). However, with regard to flavor preference, it is actual tasting experience that will be most relevant. Research on the acquisition of food preferences has shown that repeated exposure is a sufficient condition to modify food likes (Pliner, 1982). The change is usually in the direction of an increase in liking, consistent with Zajonc’s mere exposure theory (Zajonc, 1968; 2001). However, recent studies have provided suggestive evidence that the direction of the effect depends on the initial arousal potential of the stimulus, viz.
appreciation of complex stimuli increases over exposure, while it decreases for simple stimuli (Lévy et al., 2006; Sulmont-Rossé et al., 2008). This is consistent with the so-called Pacer theory (Dember & Earl, 1957), illustrated in Fig. 2, which states that exposure to a stimulus leads to a sensory priming that shifts the consumer’s arousal curve and brings him/her to gradually appreciate more complex products.
Fig. 2 – The shift of the original curve and the individual optimum of arousal potential as effect of exposure to a product (adapted from Lévy et al., 2006).
Importantly, this happens largely at a non-conscious level (i.e., independent of cognitive appraisal). This aspect should be emphasized because past research has shown that perceptual and conceptual knowledge do not develop symmetrically as a result of product exposure (Latour & Latour, 2010). In other words, an individual’s taste acquisition progresses regardless of whether or not s/he meanwhile becomes more aware of specific sensory attributes (Hoeffler & Ariely, 1999) and/or acquires a vocabulary associated with expertise (Latour & Latour, 2010). Including product knowledge in the experimental design was considered relevant in the present work. Previous consumer research has largely documented the role of product knowledge in structuring both preferences and cognitive evaluation for consumer products. Additionally, knowledge correlates highly with other constructs such as product interest and involvement (Sujan, 1985), which have also been shown to influence e.g. how much complexity and novelty are sought after in food products (Charters & Pettigrew, 2007; 2008). In the present research, we subscribe to the view that knowledge assessment has to do more with an individual’s memory of experience with a given product than with objectively defined product class information (Park et al., 1994), and operationalize product knowledge as having three main components: self-assessed product knowledge, product involvement, and degree of exposure. Based on the preceding discussion, we set out to test the following two hypotheses:
H2a: Product knowledge will yield a lower perception of complexity and novelty, and lead to a higher perceived familiarity;
H2b: Hedonic ratings for complex beers will be higher in highly knowledgeable consumers.
Variety-seeking and food neophobia
Aside from the stimulus’ characteristics and contextual influences, it has also been suggested that an individual’s optimum level of arousal will be affected by specific personality traits (see e.g. Raju, 1980). Among them, of immediate relevance to this research is an individual’s propensity to seek or avoid novel food, which is considered a relatively stable personality trait and has been widely studied in food-oriented consumer research (e.g. Pliner, Lähteenmäki, & Tuorila, 1998; van Trijp, Lähteenmäki, & Tuorila, 1992). In our experimental design, we operationalized this personality trait using two well-known psychometric tools relevant for measuring food choice behavior. The first is the variety seeking scale (VARSEEK; Van Trijp & Steenkamp, 1992), a measure developed specifically for measuring consumers’ variety seeking tendency with respect to food. The scale consists of eight items, administered as Likert statements. The second is the food neophobia scale (FNS; Pliner & Hobden, 1992). Food neophobia is a personality trait describing an individual’s stable propensity to avoid novel food, which is assumed to have an adaptive value
as it prevents human beings from ingesting potentially poisonous substances. The FNS is a validated paper-and-pencil measure of this trait, and consists of 10 Likert items from which a mean score is calculated (for some of the items the score is reversed). In the context of the present work, both of these constructs are relevant because they might underlie the degree to which an individual tends to resolve the omnivore’s dilemma in favor of approach or avoidance, and thus we expect them to significantly affect an individual’s optimal arousal level. Hence, we expected them to affect the hedonic response in the following two directions:
H3a: Highly neophobic consumers will give lower hedonic ratings than neophilic ones;
H3b: High variety seeking consumers will give higher hedonic ratings than low variety seekers.
Study 1
Materials and methods
Stimuli
Eight Danish beers were used as test stimuli. They were selected to span systematically across perceived novelty, as assessed by pilot work, as well as to represent the flavor diversity in the beer market (Table 1). All beers had special, uncharacteristic flavors, except for Thy Pilsner, which was chosen to represent the “prototypical” lager beer. Two of the beers were brewed specifically for this study to represent ingredients very novel to beer (sea-buckthorn and pine). For the commercial beers, care was taken to ensure that all samples came from the same batch, in order to minimize inter-sample differences.
Beer name | Producer | Beer style | ABV | Main flavor ingredient | Sensory characteristic
Bøgebryg | Bryggeri Skovlyst | Amber Ale | 5.2 % | Beech twigs | Woody
Fynsk Forår | Ørbæk Bryggeri | Pale Ale | 5.0 % | Elderflower | Floral
Enebær Stout | Grauballe Bryghus | Stout | 6.0 % | Juniper berry | Berry, coffee
Seabuckthorn | (Experimental) | Pale lager | 4.5 % | Sea-buckthorn juice (2.5 ml/100 ml) | Berry, sour
Pine | (Experimental) | Pale lager | 4.5 % | Pine flavor† (6.25 µl/100 ml) | Woody
Stjernebryg | Herslev Bryghus | Trappist | 8.0 % | Anise, coriander, liquorice | Spiced, sweet
Thy Pilsner | Thisted Bryghus | Pale lager | 4.6 % | | Neutral, hoppy
Valnød Hertug | Rise Bryggeri | Brown Ale | 7.0 % | Walnuts | Nutty
† “Pin Thyrol”, Firmenich International S. A., Geneva, Switzerland.
Tab. 1 – Overview of the beers used as test stimuli in study 1.
Subjects
A total of 135 consumers (93 men and 42 women; mean age = 39.6 ± 13.5 years) took part in the experiment. Prior to the experiment, all participants filled out a questionnaire where they provided information about their demographics and the psychographic traits of interest. Product knowledge was measured as the cumulative score of nine different measures: two Likert items about self-perceived knowledge of beer (I know a lot about beer, I can easily name and recognize several beer types) and four Likert items related to product involvement (I am very interested in beer, I often try new beers, I like to take part in beer-related events, I would like to know more about beer), all rated on a 9-point agree-disagree scale, plus three measures of consumption frequency (drinking frequency, intake at a typical drinking event, number of different beers consumed per month), given as multiple choice questions. The degree to which the chosen set of items could be used as a unidimensional construct for product knowledge was assessed by computing Cronbach’s alpha, which indicated very high reliability (α = .904). Variety seeking with respect to food was measured using the tools already introduced, namely the VARSEEK scale (Van Trijp & Steenkamp, 1992) and the food neophobia scale (Pliner & Hobden, 1992). Subjects signed a declaration that they would not drive right after the test. At the end of the tasting, they were provided with tickets to return home by public transportation. Each of them also received a bottle of premium beer as a reward for participating (retail value ≈ 6 €). As a further incentive, subjects’ names were placed in a lottery to win a gift card (retail value ≈ 150 €).
4 Calculated on available data from all consumers that expressed interest in participating in the experiment (N = 305), not only on those that were later invited to participate.
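As an illustration of the reliability check described above, the following minimal sketch computes Cronbach’s alpha from item scores. The function name `cronbach_alpha` and the data layout (one list of scores per item) are assumptions made for this example; how the nine measures were scaled before being combined is not specified in the text, so the items are taken as given.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    `items` is a list of k lists; each inner list holds one item's scores,
    ordered by respondent (hypothetical layout chosen for this sketch).
    """
    k = len(items)
    n = len(items[0])
    if k < 2 or any(len(col) != n for col in items):
        raise ValueError("need at least two items with one score per respondent")
    # Total (summed) score of each respondent across the k items.
    totals = [sum(col[i] for col in items) for i in range(n)]
    # Sum of the item variances versus the variance of the total score.
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance(totals)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

Alpha approaches 1 when the items rise and fall together across respondents, and 0 when they are unrelated, which is why a high value is read as support for summing the items into a single score.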
Experimental procedures
The beer tasting was carried out at a central location test facility, where approximately 15 consumers per session tasted and evaluated the eight samples one at a time (monadic presentation). The beer samples were served in 28 cl clear glasses, blind labeled with a randomized 3-digit number. Serving order was randomized to balance for first-order and presentation biases (Macfie, Bratchell, Greenhoff, & Vallis, 1989). The serving temperature was 10 ˚C. Water and soda crackers were served as palate cleansers between samples. Each consumer was served approximately 50 ml of each sample, and was instructed to smell, taste and swallow the sample at least once to get the full perception of the beer. Each beer was evaluated on a separate evaluation sheet, and participants were instructed not to look at their previous scores. Consumers rated liking on a 15-point hedonic scale with the following semantic anchors: 1 = Dislike extremely, 8 = Neither like nor dislike, 15 = Like extremely. Ratings of novelty, familiarity and complexity were elicited via Likert items (I think the taste of this beer is familiar/novel/complex) and also scored on 15-point scales (1 = Completely disagree, 8 = Neutral, 15 = Completely agree).
Results and discussion
A preliminary one-way analysis of variance (ANOVA) was carried out to detect significant differences between the products with regard to the perceived level of all rated properties, using the samples as fixed variation factor. Tukey’s pairwise multiple comparison tests (α = .05) were carried out following ANOVA to determine significantly different product pairs. Differences among the beers (p < .001) were found for all rated properties, as shown in Table 2.
Sample | Liking | Familiarity | Novelty | Complexity
Stjernebryg | 10.6 a (±3.8) | 8.3 c,d (±4.0) | 9.7 a (±3.2) | 11.2 a (±2.2)
Enebær Stout | 10.1 a (±3.4) | 9.8 a,b (±3.6) | 7.4 c (±3.3) | 9.6 b (±3.1)
Bøgebryg | 9.9 a (±3.6) | 9.1 b,c (±3.9) | 8.0 b,c (±3.2) | 9.3 b (±2.7)
Valnød Hertug | 9.8 a (±3.3) | 7.7 c,d (±4.0) | 8.7 a,b,c (±3.8) | 9.0 b (±2.9)
Fynsk Forår | 9.3 a (±3.5) | 7.1 d (±3.9) | 9.0 a,b (±3.3) | 7.7 c (±3.1)
Thy Pilsner | 6.6 b (±3.3) | 10.6 a (±4.0) | 4.8 d (±3.6) | 5.5 d (±3.2)
Pine | 6.4 b (±4.0) | 5.1 e (±3.3) | 9.8 a (±4.4) | 7.8 b (±3.9)
Havtorn | 6.1 b (±3.3) | 4.1 e (±3.6) | 8.6 a,b,c (±3.7) | 7.1 c (±3.2)
Tab. 2 – Mean overall ratings and standard deviations (in parentheses) for liking and collative properties. Samples are sorted in descending order of liking (Study 1). Within a column, different letters indicate significant differences (Tukey p < .05).
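The F test behind each column of Table 2 can be sketched as follows. This is a minimal illustration of a one-way ANOVA with product as the only factor, as described in the text; the function name `one_way_anova` and the input layout are hypothetical, the consumer is not modeled as a blocking factor, and Tukey’s pairwise comparisons are not included.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of ratings per product
    (hypothetical layout chosen for this sketch).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the product means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual ratings around their own product mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within
```

Under these assumptions, eight beers rated once by each of 135 consumers would give 7 and 1072 degrees of freedom for the between-product and within-product terms.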
Relationships between rated variables were preliminarily assessed by computing Pearson product-moment correlation coefficients (N = 1080 for each coefficient). A weak to moderate positive linear dependence was observed between liking and each of the other response variables: familiarity (r = .28, p < .001), novelty (r = .25, p < .001) and complexity (r = .52, p < .001). Novelty and familiarity were, as expected, only moderately negatively correlated (r = -.5, p < .001), leaving a substantial amount of unshared variance. Complexity correlated positively with novelty (r = .49, p < .001), and was uncorrelated with familiarity (r = -.04, p = .19).
Regression analyses
A multiple linear regression model was fitted to analyze the effects of the three collative properties on the hedonic response (liking). The regression equation is given by eq. 1:
$$L = \alpha + \beta_1 N + \beta_2 F + \beta_3 C + e \qquad (1)$$
where L is the response variable (liking), α is the intercept, β1,…,β3 are the regression parameters, N, F, C are the three explanatory variables (Novelty, Familiarity, Complexity), and e is the error. The regression model (Adj. R² = .4, F(3, 1076) = 239.66, p < .001) revealed that all properties significantly and positively predicted liking, with complexity (b = .47, t(1076) = 14.67, p < .001) and familiarity (b = .39, t(1076) = 15.06, p < .001) being the strongest regressors, followed by novelty (b = .27, t(1076) = 8.23, p < .001). The fact that all predictors were highly significant is in itself an indication that the best model is obtained when information about all of these properties is included. This indication was also confirmed by the much higher predictive power of this model when compared to separate regression models using the three collative properties as single explanatory variables (novelty: R² = .06, F(1, 1078) = 74.51, p < .001; familiarity: R² = .07, F(1, 1078) = 89.34, p < .001; complexity: R² = .27, F(1, 1078) = , p < .001).
These results are in line with the expected joint effect of novelty, familiarity and complexity in driving hedonic response. Nonlinear relationships were assessed by fitting the data to a second-degree polynomial regression, i.e. by adding the quadratic terms of the original monomials in eq. 1:
$$L = \alpha + \beta_1 N + \beta_2 F + \beta_3 C + \beta_4 N^2 + \beta_5 F^2 + \beta_6 C^2 + e \qquad (2)$$
In the polynomial regression model obtained (Adj. R² = .4, F(6, 1073) = 122.6, p < .001), the familiarity and complexity regressors remained unaltered, and their quadratic terms were non-significant. In contrast, the size of the regression coefficient for novelty increased substantially compared to the one obtained from the multiple linear regression (.594 > .270), as did its individual R² (.147 > .068), showing that by including the quadratic term in the equation, novelty increased
both its effect and its predictive power with regard to consumers’ rated liking. Importantly, the N² regressor had a negative coefficient, marking a reversal in the directionality of the effect (b = -.02, t(1073) = -2.74, p < .001). This indicates a curvilinear (inverted U-shaped) association between novelty and hedonic response, consistent with our main hypothesis (H1). As a supplement to this analysis, each individual response variable was plotted against liking for visual assessment. To ease the interpretation of their relationships, we computed smoothed points using robust locally weighted regression (LOWESS; Cleveland, 1979; Cleveland & Devlin, 1988), a smoothing procedure which accommodates data for which $y_i = g(x_i) + e_i$, where g is a smooth function and the $e_i$ are random variables with mean 0 and constant scale, so that the fitted value $\hat{y}_i$ is an estimate of $g(x_i)$. The procedure is “robust” in the sense that it guards against deviant points distorting the smoothed points. This analysis was carried out in the R language for statistical computing (R Development Core Team, 2011) using the routine “lowess”, whose complete algorithm is described in Cleveland (1979). The smoothed scatterplots (Fig. 3) confirmed the indications of the regression analyses: hedonic response appears to be a linear, monotonic function of familiarity and complexity, whereas an inverted U-shaped relationship characterizes its relationship with novelty, in further (partial) support of H1.
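The idea behind the smoother can be sketched in plain Python. This is a simplified illustration of robust locally weighted regression (local linear fits with tricube weights, followed by bisquare robustness passes), not the R routine used for the analysis. The function name `lowess_smooth` and its parameters are assumptions of this sketch; the defaults (a span of 2/3 and three robustness passes) are chosen to mirror the documented defaults of R’s `lowess`.

```python
import math
from statistics import median


def _tricube(u):
    """Tricube kernel: weight falls smoothly from 1 (u = 0) to 0 (u >= 1)."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def lowess_smooth(x, y, frac=2 / 3, iterations=3):
    """Return one smoothed value per (x, y) point, in the input order."""
    n = len(x)
    r = min(n, max(2, math.ceil(frac * n)))    # points used in each local fit
    robust = [1.0] * n                           # robustness weights, updated per pass
    fitted = list(y)
    y_range = (max(y) - min(y)) or 1.0
    for _ in range(iterations + 1):              # one initial fit plus robust passes
        for i in range(n):
            dist = [abs(xj - x[i]) for xj in x]
            h = sorted(dist)[r - 1]              # distance to the r-th nearest point
            if h > 0:
                w = [robust[j] * _tricube(dist[j] / h) for j in range(n)]
            else:                                # heavily tied x values: use the ties only
                w = [robust[j] if dist[j] == 0 else 0.0 for j in range(n)]
            sw = sum(w)
            if sw == 0:
                continue                         # keep the previous fitted value
            # Weighted least-squares line through the neighbourhood of x[i].
            mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
            my = sum(wj * yj for wj, yj in zip(w, y)) / sw
            sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
            sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
            slope = sxy / sxx if sxx > 0 else 0.0
            fitted[i] = my + slope * (x[i] - mx)
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        s = median([abs(e) for e in resid])
        if s <= 1e-12 * y_range:
            break                                # residuals are effectively zero
        # Bisquare weights: points with large residuals count less in the next pass.
        robust = [(1 - (e / (6 * s)) ** 2) ** 2 if abs(e) < 6 * s else 0.0
                  for e in resid]
    return fitted
```

Plotting the returned values against the rated property (for example novelty on the horizontal axis and smoothed liking on the vertical axis) gives a curve of the kind used for visual assessment, without imposing a linear or quadratic form in advance.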
Fig. 3 – Robust smoothed values of novelty (a), familiarity (b) and complexity (c) against liking (Study 1). Values at neutral points are also visible (dashed line = neutral point in the hedonic scale; solid line = neutral point in Likert scales for collative properties).
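For readers unfamiliar with the smoothing procedure described above, the following is a minimal Python sketch of robust locally weighted regression in the spirit of Cleveland (1979). It is an illustration only and not the R “lowess” routine used for the analyses: the function names, the synthetic data and the outlier are assumptions made for the example, the default settings (a span of 2/3 and 3 robustness iterations) are chosen to mirror R's documented defaults, and the interpolation shortcut of the R routine is omitted.

```python
import math
from statistics import median


def _local_line(xs, ys, ws, x0):
    """Weighted least-squares straight line; returns its value at x0."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    if sxx <= 1e-12:  # (nearly) all weight sits on a single x value
        return my
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    return my + sxy / sxx * (x0 - mx)


def lowess(x, y, frac=2 / 3, iterations=3):
    """Smoothed values at each x[i] (hypothetical helper, not R's lowess)."""
    n = len(x)
    k = max(2, min(n, math.ceil(frac * n)))  # neighbours used per local fit
    robust = [1.0] * n                       # robustness weights, all 1 at first
    fitted = [0.0] * n
    for step in range(iterations + 1):
        for i in range(n):
            dist = [abs(xj - x[i]) for xj in x]
            h = sorted(dist)[k - 1]          # distance to the k-th nearest point
            if h > 0:
                tri = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dist]
            else:                            # many ties, e.g. integer ratings
                tri = [1.0 if d == 0 else 0.0 for d in dist]
            ws = [t * r for t, r in zip(tri, robust)]
            if sum(ws) <= 0:                 # fall back to tricube weights only
                ws = tri
            fitted[i] = _local_line(x, y, ws, x[i])
        if step == iterations:
            break
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        s = median(abs(r) for r in resid)
        if s == 0:
            break
        # Bisquare weights: points with large residuals are down-weighted.
        robust = [(1 - (r / (6 * s)) ** 2) ** 2 if abs(r) < 6 * s else 0.0
                  for r in resid]
    return fitted


# Synthetic illustration: a noisy straight line with one deviant point.
x = list(range(20))
y = [2 + 0.5 * v + 0.5 * math.sin(7 * v) for v in x]
y[10] = 50.0                                  # outlier; the underlying value is 7.0
plain = lowess(x, y, frac=0.8, iterations=0)  # no robustness iterations
robust_fit = lowess(x, y, frac=0.8, iterations=3)
print(round(plain[10], 2), round(robust_fit[10], 2))
```

In this toy example the non-robust fit is pulled towards the deviant point, whereas the robust iterations bring the smoothed value back close to the underlying trend, which is the property the text refers to as guarding against deviant points.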
Consumer variables
The last analysis was carried out to test our hypotheses concerning the effect of product knowledge (H2a and H2b), variety seeking (H3a) and neophobia (H3b). For that purpose, consumers were divided into subgroups according to their level in each of these consumer characteristics. Using the median as the dividing point, each of the three consumer variables was transformed into a categorical variable with two possible levels (Low and High). The ANOVA equation that represents this situation is:
As hypothesized, consumers with higher product knowledge rated the beers as significantly more familiar (M (High_Knowledge) = 8.1 > M (Low_Knowledge) = 7.47, p = .02) and less novel (M (High_Knowledge) = 7.96 < M (Low_Knowledge) = 8.52, p = .03). In addition, consumers with high product knowledge showed a tendency to rate the products as less complex (M (High_Knowledge) = 8.14 < M (Low_Knowledge) = 8.56, p = .07). Nevertheless, no main effect of product knowledge on liking was found, which was against our expectations (H2b). A post-hoc model including interactions revealed the beer by product knowledge interaction term to be marginally significant (F(14, 1056) = 1.62, p = .06). High variety seeking consumers gave significantly higher overall liking ratings (M (High_VARSEEK) = 8.89 > M (Low_VARSEEK) = 8.37, p = .04), thus supporting H3a. No significant effect of degree of neophobia was observed.
Study 2
One of the highlights of study 1 was that perceived novelty was found to be in an inverse curvilinear relationship with liking, as predicted by Berlyne’s theory. Nevertheless, the inverted U-shape was less pronounced than expected: for a given change in novelty, liking decreased less after the optimum than it increased before the optimum. Furthermore, the other variables expressing arousal potential, familiarity and complexity, showed a linear relation with the hedonic response, which is contrary to Berlyne’s theory. We interpreted this as an indication that the stimuli used in study 1 did not sufficiently cover the arousal spectrum, and tried to replicate and expand the results with a new set of beers. Additionally, the second study expanded our understanding of the meaning of collative properties by measuring other relevant food-related emotions (see Lyman, 1982, for an overview) selected among those used in previous food studies (Desmet & Schifferstein, 2008; King & Meiselman, 2010).
Materials and methods
Stimuli
Unlike the first study, for this study the beers were experimentally developed by adding different flavor extracts to two commercially available base beers: Thy Pilsner (the prototypical lager used in study 1), and a darker, more bitter lager (Thy Classic). After identifying optimal flavor concentrations, eight beers were selected as test stimuli because they had very distinct sensory profiles (Table 3).
| Beer name | Beer style | Flavoring added (Product Identifier) | Concentration | Sensory characteristic of the pure flavor |
| --- | --- | --- | --- | --- |
| Thy Pilsner | Pale lager | (None – Reference 1) | - | - |
| Perilla | Flavored pale lager | Aromatic water obtained by steam distillation of Perilla frutescens leaves | 200 μl/100 ml | Fresh, pungent, bitter, chemesthetic, green, grassy, herbal, apple, acidic |
| Lemon Lime | Flavored pale lager | Lemon/lime † (QL34701) | 2 μl/100 ml | Fresh, acidic, sharp |
| Star Anise | Flavored pale lager | Star Anise † (MS-027-152-5) | 14 μl/100 ml | Fennel, liquorice, tarragon |
| Thy Classic | Dark lager | (None – Reference 2) | - | - |
| Juniper | Flavored dark lager | Juniper berry oil † (UJ-740-714-0) | 6 μl/100 ml | Spicy, gin, pine, green, fresh, sharp, citrus |
| Wormwood | Flavored dark lager | Wormwood oil † (XR-936-574-1) | 15 μl/100 ml | Tart, bitter, wry, acidic, herbal |
| Rum Cocktail | Flavored dark lager | Rum flavor † (NN07414); Blackberry † (059141); Sage † (L-059142) | 10 μl/100 ml; 2 μl/100 ml; 20 μl/100 ml | Alcoholic, harsh; Sweet, tart, bitter; Herbal, earthy, savory |

† Givaudan S. A., Vernier, Switzerland.
Tab. 3 – Beers used as test stimuli in study 2. ABV was 4.6% in all samples.
Subjects
A total of 122 consumers (76 men and 46 women, M_age = 42 ± 15.9) took part in the experiment. Recruiting procedures were in all aspects similar to the first experiment. Product knowledge and variety seeking tendency were collected with the same measures used in study 1. Since no effect of neophobia was observed in study 1, the food neophobia scale was not used further.
Experimental procedures
Serving size, order, temperature and general testing procedures mirrored those used in the main study. The ballot differed from the one in the main study in that this time consumers were asked to give hedonic ratings on a 15-point hedonic scale and subsequently to rate 9 product attributes (novelty, familiarity, complexity, surprisingness, typicality, traditionality, stimulatingness, confusingness, and drinkability). The latter were given as Likert items (e.g. “This beer tastes familiar”) and rated on a 15-point scale with labeled anchors (Completely disagree – Neutral – Completely agree). The order in which the attributes appeared on the ballot was randomized across consumers.
Results and discussion
The same stepwise analytical approach was applied to the second dataset. A preliminary analysis of variance revealed significant between-samples differences (p < .001) for all response variables. Post-hoc testing was carried out to elucidate pairwise differences (Table 4).

| Sample | Liking | Familiarity | Novelty | Complexity |
| --- | --- | --- | --- | --- |
| Lemon Lime | 9.0 a (±4.0) | 7.7 c,d (±4.0) | 8.2 b (±4.5) | 8.3 b,c (±3.8) |
| Juniper | 8.9 a (±3.4) | 7.3 c (±3.9) | 7.8 b,c (±3.9) | 8.8 a,b (±3.4) |
| Thy Pilsner | 8.8 a (±3.0) | 10.9 a (±3.4) | 4.1 e (±3.2) | 5.2 e (±3.2) |
| Rum Cocktail | 8.6 a,b (±4.0) | 5.3 e (±3.9) | 9.6 a (±4.3) | 9.3 a (±3.5) |
| Thy Classic | 8.4 a,b (±3.6) | 8.6 b (±3.7) | 6.2 d (±3.7) | 7.8 c (±3.5) |
| Star Anise | 7.8 b,c (±3.8) | 5.3 e (±4.1) | 9.4 a (±4.8) | 8.6 a,b (±3.8) |
| Wormwood | 7.6 c (±4.0) | 6.6 d,e (±4.2) | 8.1 b (±4.5) | 8.7 a,b (±3.6) |
| Perilla | 7.6 c (±3.7) | 7.0 c,d (±4.4) | 7.1 c (±4.6) | 7.0 d (±3.9) |
Tab. 4 – Mean overall ratings and standard deviations for liking and collative properties. Samples are sorted in descending order of liking (Study 2). Different superscript letters indicate significant differences (Tukey p < .05). S.D. = Standard deviation.
Regression analyses
A multiple linear regression model (Adj. R² = .25, F(3, 972) = 107.9, p < .001), analogous to the one described in eq. (1), again showed a significant positive linear relationship between liking and perceived complexity (b = .3, t(972) = 8.9, p < .001) and familiarity (b = .46, t(972) = 14.8, p < .001), and also with novelty, which again was the weakest regressor (b = .19, t(972) = 5.7, p < .001). Again, the model including all three predictor variables had a much higher explanatory power than the single-predictor models (R²Novelty = .005, F(1, 974) = , p < .001; R²Familiarity = .07, F(1, 974) = 79.22, p < .001; R²Complexity = .07, F(1, 974) = , p < .001), indicating that the three properties to a certain degree have an independent effect on consumer preferences, and that the best prediction is obtained when all are considered. These results show a comforting degree of agreement with those of study 1, and provide more general support for hypothesis H1. Fitting the data to a second-order polynomial equation (cf. eq. 3) revealed the expected pattern: an increase in overall predictive power (Adj. R² = .26, F(6, 969) = 57.74, p < .001), an increase in coefficient size for the novelty predictor (b = .40, t(969) = 3.245, p = .001) compared to the linear regression model, and the emergence of a marginally significant quadratic effect (b = -.01, t(969) = -1.8, p = .07). Importantly, a significant negative effect was found for the familiarity quadratic regressor (b = -.02, t(969) = -2.177, p = .03), suggesting a similar pattern as for novelty. However, the size of the quadratic regressor clearly shows a lesser importance than the linear
regressor. Lastly, the direction of the complexity effect did not change (b = .3, t(969) = 2.344, p = .02), and its quadratic term was not significant. As in study 1, the liking ratings were plotted against the collative properties after smoothing. Visual inspection of the curves (Fig. 4) gives some important indications, particularly in the light of the study 1 results (Fig. 3). With regard to novelty, the inverted U-shaped trajectory, reaching an optimum at around 10 points, is in good agreement with the finding of study 1. Familiarity and complexity, however, did not show the linear relationship seen in the first study, but rather saturation curves. These do not exclude the possibility that a quadratic relationship in Berlyne’s terms might have been observed, but suggest that, unlike for novelty, a higher “rejection threshold” must be reached for these two collative properties.
Fig. 4 – Robust smoothed values of novelty (a), familiarity (b) and complexity (c) against liking (Study 2). Values at neutral points are also visible (dashed line = neutral point in the hedonic scale; solid line = neutral point in Likert scales for collative properties).
Relationships between response variables
Pearson’s correlation coefficients were computed to elucidate the relationships between response variables (N = 976 in each correlation coefficient). Liking showed a positive correlation with all rated variables, except for a moderate negative correlation with the attribute confusing (r = -.3, p < .001). Novelty correlated well with surprisingness (r = .73, p < .001), and moderately with complexity (r = .6, p < .001) and confusingness (r = .44, p < .001). It is important to notice here that novelty correlated with both positive and negative properties (surprisingness vs. confusingness), which confirms the twofold nature of this collative property, providing a semantic substantiation of the curvilinear effect uncovered by the regression analyses. A very high correlation was found between familiarity and traditionality (r = .79, p < .001), and between familiarity and typicality (r = .78, p < .001), which is consistent with findings from visual stimuli (Blijlevens et al., 2012; Hekkert et al., 2003). As expected, the latter three properties were all inversely correlated with novelty, but this relationship was not as strong (r = -.64, p < .001, with familiarity; r = -.57, p < .001, with typicality; r = -.6, p < .001, with traditionality), reinforcing the notion that familiarity and novelty are not perfect antonyms. A complete account of these results is given in Table 5.
| | Drinkability | Stimulating | Traditional | Confusing | Surprising | Familiarity | Typicality | Novelty | Complexity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Liking | .70*** | .70*** | .22*** | -.30*** | -.02 n.s. | .27*** | .27*** | .08* | .27*** |
| Drinkability | | .51*** | .40*** | -.44*** | -.23*** | .44*** | -.45*** | -.14*** | -.01 n.s. |
| Stimulating | | | .01 n.s. | -.08* | .25*** | .05 n.s. | .06 n.s. | .35*** | .51*** |
| Traditional | | | | -.49*** | -.71*** | .79*** | .84*** | -.60*** | -.41*** |
| Confusing | | | | | .52*** | -.55*** | -.49*** | .44*** | .30*** |
| Surprising | | | | | | -.70*** | -.68*** | .73*** | .56*** |
| Familiarity | | | | | | | .77*** | -.64*** | -.35*** |
| Typicality | | | | | | | | -.57*** | -.37*** |
| Novelty | | | | | | | | | .60*** |
*** p < .001, ** p < .01, * p < .05, n.s. non-significant Tab. 5 – Pearson’s correlation coefficients between response variables in study 2 (N = 976 in each correlation coefficient).
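As an aid to reading the table, the sketch below shows how a Pearson coefficient is computed and why, with N = 976, even a coefficient as small as r = .08 can reach significance. It is a minimal illustration: the function names and the toy vectors are assumptions made for the example, and the t statistic assumes independent observations, which a design with repeated ratings from the same consumers may not strictly satisfy.

```python
import math


def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def t_statistic(r, n):
    """t value (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Toy data (hypothetical): perfectly aligned and perfectly opposed ratings.
r_pos = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
r_neg = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])

# With N = 976, r = .08 gives t of about 2.5 and r = .02 gives t of about 0.6.
t_small = t_statistic(0.08, 976)
t_tiny = t_statistic(0.02, 976)
print(round(r_pos, 3), round(r_neg, 3), round(t_small, 2), round(t_tiny, 2))
```

The computation makes explicit that, at this sample size, statistical significance says little about the strength of an association, so the size of each coefficient in the table is the more informative quantity.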
Consumer variables
With regard to the effect of product knowledge, ANOVA on consumer background information showed results in full accordance with the findings of the first study: knowledgeable consumers perceived the beers as significantly less novel (M (High_Knowledge) = 6.7 < M (Low_Knowledge) = 8.13, p < .001) and less complex (M (High_Knowledge) = 7.19 < M (Low_Knowledge) = 8.3, p < .001), and as more familiar (M (High_Knowledge) = 7.68 > M (Low_Knowledge) = 6.98, p = .08), although for familiarity the effect was only marginally significant. These results support our hypothesis H2a. Just as in study 1, no main effect of product knowledge could be found to support H2b, although again a significant beer by knowledge interaction was observed (p = .006).
The effect of variety seeking on liking observed in study 1 could not be replicated here. High variety seekers in this study perceived the samples as significantly more familiar (M (High VARSEEK) = 7.51 > M (Low VARSEEK) = 8.13, p = .03). The same result had been observed in study 1, though in that case the effect was not significant.
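The Low/High grouping behind these comparisons (a median split of each consumer variable, as described for study 1) can be sketched as follows. This is a minimal illustration under stated assumptions: the consumer identifiers, scores and ratings are invented for the example, and assigning scores equal to the median to the Low group is an arbitrary choice, since the handling of ties is not specified in the text.

```python
from statistics import mean, median


def median_split(scores):
    """Map each consumer to 'High' (score above the median) or 'Low'."""
    cut = median(scores.values())
    return {cid: ("High" if s > cut else "Low") for cid, s in scores.items()}


def group_means(groups, ratings):
    """Mean rating per group, pooling all ratings given by its members."""
    pooled = {"Low": [], "High": []}
    for cid, values in ratings.items():
        pooled[groups[cid]].extend(values)
    return {level: mean(vals) for level, vals in pooled.items() if vals}


# Hypothetical variety seeking scores and liking ratings for four consumers.
varseek = {"c1": 2.0, "c2": 3.5, "c3": 4.0, "c4": 5.5}
liking = {
    "c1": [8.0, 9.0],
    "c2": [7.0, 8.0],
    "c3": [9.0, 10.0],
    "c4": [8.0, 9.0],
}

groups = median_split(varseek)
means = group_means(groups, liking)
print(groups, means)
```

The resulting two-level factor is what would then enter the analysis of variance; the sketch stops at the group means and does not reproduce the ANOVA itself.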
Study 3
A last study was carried out to confirm the quadratic effect of novelty on hedonic response.
Materials and methods
Stimuli
Eight beers were used as test stimuli. As in study 2, the beers were experimentally developed by adding different flavors to two commercially available pale lagers, which served as base beers (Table 6).

| Beer name | Beer style | Flavoring added (Product Identifier) | Concentration | Sensory characteristic of the pure flavor |
| --- | --- | --- | --- | --- |
| Thy Pilsner | Pale lager | (None – Reference) | - | - |
| Cherry | Flavored pale lager | Cherry flavor † (10133-33) | 4 μl/100 ml | Fruity, sweet |
| Lemon Lime | Flavored pale lager | Lemon/lime † (QL34701) | 200 μl/100 ml | Fresh, acidic, sharp |
| Hop Golding | Flavored pale lager | Hop Golding † (L-032083) | 14 μl/100 ml | Mild, classic English hop aroma |
| Yumberry/Sake | Flavored pale lager | Sake † (L-58162); Yumberry † (073728) | 6 μl/100 ml; 6 μl/100 ml | Sake (sweet, acidic); Yumberry (berry, grapefruit) |
| Abbey | Flavored pale lager | Abbey flavor † (L-035410) | 30 μl/100 ml | (Multicomponent mixture) Fruity, peppery aroma |
| Stout | Flavored pale lager | Stout flavor † (L-032095) | 200 μl/100 ml | (Multicomponent mixture) Roasted malt, coffee |
| Wheat | Flavored pale lager | Wheat flavor † (L020658) | 300 μl/100 ml | (Multicomponent mixture) Phenolic, yeasty |
† Givaudan S. A., Vernier, Switzerland.
Tab. 6 – Beers used as test stimuli in study 3. ABV was 4.6% in all samples.
Subjects
A total of 103 consumers (76 men and 27 women, M_age = 39.9 ± 14.6) took part in the experiment. Recruiting procedures were in all aspects similar to the previous studies.
Experimental procedures
Serving size, order, temperature and tasting procedures were the same as in the previous two studies. The ballot design differed from the previous ones in that this time consumers were divided into three groups and asked to evaluate novelty in three different ways (see footnote 5), but all of them started by rating their liking and perceived novelty (on the same 15-point scales used in the previous two studies). In the context of the present work, ratings of liking and novelty across groups (no significant between-group differences on either novelty or liking were found) were used to confirm the results obtained in the previous two studies with regard to the relationship between perceived novelty and hedonic response.
Results and discussion
In order to study the effect of perceived novelty, data were first fitted to a second-order polynomial regression of the following shape (eq. 4):

$$L_i = \alpha + \beta_1 N_i + \beta_2 N_i^2 + e_i \qquad (4)$$

where $L_i$ is the response variable (liking), $\alpha$ is the intercept, $\beta_1$ and $\beta_2$ are the regression parameters, $N_i$ and $N_i^2$ are the explanatory variables (perceived novelty, linear and quadratic), and $e_i$ is the error. In accordance with the previous results, the model (Adj. R² = .02, F(2, 821) = 8.96, p < .001) showed that novelty significantly predicted liking (b = .46, t(821) = 3.90, p < .001), whereas its quadratic term (N²) corresponded to a negative slope in the regression line (b = -.03, t(821) = -3.34, p < .001). Again, regression results were supplemented with a visual assessment of the scatterplot smoothed by locally weighted regression. Fig. 5 below shows the regression line fitted through the smoothed points, whose curvilinear shape confirms quite decidedly the findings of the two earlier studies.
5 One group rated only liking and novelty, and did so first individually in the aforementioned order. One group rated liking, novelty plus a set of six other elicited emotions and collative properties. The last group rated liking, novelty plus a set of 19 other variables (elicited emotions and collative properties). Additionally all subjects provided a qualitative description of their interpretation of the “novelty” concept. These results are not discussed here, as they are part of a separate ongoing effort aimed at developing a method for consumers’ assessment of novelty in food products.
Fig. 5 – Robust smoothed values of novelty against liking (Study 3). Values at neutral points are also visible (dashed line = neutral point in the hedonic scale; solid line = neutral point in Likert scale for Novelty).
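To make the logic of the second-order polynomial model concrete, the sketch below fits liking on novelty and squared novelty by ordinary least squares and locates the novelty level at which predicted liking peaks, i.e. -β1 / (2β2) when β2 is negative. It is an illustration under stated assumptions: the data are synthetic and noise-free, the intercept of 5 is invented (no intercept is reported in the text), the function names are hypothetical, and because the reported coefficients (.46 and -.03) are rounded, the implied optimum of roughly 7.7 scale points is only indicative.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(novelty, liking):
    """Least-squares estimates (alpha, beta1, beta2) via the normal equations."""
    cols = [[1.0] * len(novelty), list(novelty), [v * v for v in novelty]]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, liking)) for ci in cols]
    return solve(xtx, xty)


def optimum(beta1, beta2):
    """Novelty level at which an inverted-U curve peaks."""
    if beta2 >= 0:
        raise ValueError("no interior maximum unless the quadratic term is negative")
    return -beta1 / (2 * beta2)


# Synthetic, noise-free ratings on the 15-point scale (hypothetical intercept).
novelty = [float(v) for v in range(1, 16)]
liking = [5.0 + 0.46 * v - 0.03 * v * v for v in novelty]

alpha, b1, b2 = fit_quadratic(novelty, liking)
peak = optimum(b1, b2)
print(round(alpha, 3), round(b1, 3), round(b2, 3), round(peak, 2))
```

A negative quadratic coefficient together with a positive linear coefficient is what produces the inverted U; the position of the peak, rather than the sign pattern alone, indicates whether the optimum lies within the range of novelty actually spanned by the samples.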
General discussion
Collative-motivational model predictive ability of consumer flavor preferences
The present paper attempted to test the collative-motivational model as a framework for predicting consumers’ flavor preferences. Our working hypothesis stated that preference would arise from the combined effect of novelty, familiarity, and complexity (H1), and that the relationship between arousal potential and hedonic response would take the shape of an inverted U. The results of studies 1 and 2 revealed that all measured properties – novelty, familiarity and complexity – had a significant positive effect on hedonic response, but that the effect of perceived novelty was best described by a quadratic function. Furthermore, weak correlations were found between variables, suggesting that they exert an independent effect on hedonic response, consistent with previous findings obtained with visual stimuli (Blijlevens et al., 2012; Hekkert et al., 2003). Lastly, regression analyses also showed that the most predictive model includes all variables, confirming the theoretical assumption that aesthetic preferences are driven by a concurrent need for consistency (preference for familiar products) and a need for stimulation (preference for complex products). These results indicate that consumers will prefer food products that deliver a moderate amount of arousal, through novel and complex elements, rather than a very low or very high amount of novelty. These results also support H1, but suggest that the arousal potential of flavor stimuli is triggered by perceived novelty rather than complexity, which is contrary to previous claims (Levy, MacRae, & Köster, 2006). Accordingly, we were successful in providing empirical evidence for a curvilinear effect of novelty on hedonic response. This was consistently revealed in all three studies by looking at changes in effect direction when a quadratic term was included in the respective regression models, as well as a significant increase in
novelty’s R², explained by the significant effect of novelty’s quadratic term on hedonic ratings. The reasons why the same curvilinear relationship was not observed for the other two collative properties – complexity and familiarity – need further examination. The most immediate explanation is that for these two properties we failed to span the spectrum widely enough to activate the minimum rejection threshold. This is consistent with the fact that the primary aversion system has a much higher activation threshold than the primary reward system, so that only extremely high arousal levels will decrease aesthetic appraisal (Berlyne, 1967). However, alternative explanations could be brought forth. Complexity, as discussed in the introduction, is related to the perception of many independent components in a stimulus (Berlyne, 1960, 1963, 1967). Its status as an arousal-inducing property, however, might depend not only on the detectability of these components, but also on whether they co-exist or conflict, viz. their degree of harmony or congruity. For example, jazz musicians John Coltrane and Miles Davis are both renowned for having composed some of the most complex progressions in music history, but most jazz connoisseurs would probably find Coltrane more arousing and Davis more harmonic. In our studies, even though large inter-sample differences were observed for perceived complexity, it is possible that the samples were still perceived as not incongruent or disharmonic enough to drive hedonic response down to the uncomfortable range where one would reject the stimulus. The fact that a realistic beverage was employed, instead of an artificial stimulus, might have had a mitigating effect on these less desirable aspects of complexity. An alternative explanation, of a different nature, is that a context effect might have been responsible for a shift in complexity tolerance.
In a recent article, Pocheptsova, Labroo, & Dhar (2010) found that in the context of special occasion products, task difficulty increases the attractiveness of a product, masking the normal aversion for difficulty. It seems plausible that such a masking effect could have been at play in our studies, as participants were informed that they would taste and evaluate “experimental beers”. This might have had an impact especially on novice consumers, e.g. causing them to suppress their initial hedonic appraisal (e.g. a preference for simple and familiar flavors) and adopt a more analytical evaluation mode. For familiarity, the possibility of a limited sample range could be more plausible because it is very hard to create extremely unfamiliar samples for a very familiar product category, such as beer. This is consistent with the idea that the evaluation of stimulus collative properties is mediated by categorical processing of the product being evaluated, confirming that the group of items to which a stimulus is related plays an important role in aesthetic evaluation. As previous studies have found, product category can mask the full spectrum, causing familiarity in everyday objects to be linearly related to liking (Whitfield, 1983; Blijlevens et al., 2012; Veryzer & Hutchinson, 1998; Hekkert et al., 2003), whereas Berlyne’s predictions are more readily observed when abstract stimuli are evaluated. Finding a curvilinear relationship clearly depends on the range of the perceived inter-product differences; thus, a study spanning more samples or even product categories
might have revealed such a curvilinear effect of familiarity more clearly, as suggested by study 2, where a weak but significant quadratic effect was indeed observed. In general, research specifically addressing the masking effect of a familiar product category would be an important continuation of the present work. Before moving on to the next part of the discussion, we would like to bring forth a few reflections on the practical implications of these findings, particularly with regard to product innovation in the food and beverage industry (but also in other product categories where perception in the chemical senses is important, like perfumery), which may help reduce the high rate of product failure previously mentioned. This research shows that there is an optimal level of “newness” in a flavor stimulus, and that hedonic appraisal is maximized in products that deliver moderate novelty (inverted U-shaped relationship). This is in general agreement with previous claims that consumers prefer novel flavors, provided that they are able to relate them to something familiar (Mielby & Frøst, 2010; Tuorila, Meiselman, Cardello, & Lesher, 1998). From a product development perspective, it is important to observe that the imperfect correlation between novelty and familiarity leaves a proportion of unshared variance within which product manipulation can occur (Hekkert, Snelders, & van Wieringen, 2003). For example, an immediately applicable strategy would be to relate perceived novelty to the physico-chemical composition of the product, and/or to data from a sensory panel. Knowledge of such relationships could then be used to guide flavor optimization towards the identified novelty optimum, thus reducing the risk of consumers’ rejection.
Effect of consumers’ individual traits
Two of the studies conducted tested the effect of specific personality traits – product knowledge, variety seeking and neophobia – on hedonic response and collative properties. Based on the mere exposure effect, we hypothesized that knowledgeable consumers would report lower perceived novelty and complexity, and higher familiarity (H2a). Substantial evidence in support of this hypothesis was obtained in studies 1 and 2. In terms of flavor preference development, these results are concordant with previous experiments where product knowledge was directly manipulated through exposure (Lévy et al., 2006; Sulmont-Rossé et al., 2008), and in line with the predictions of the pacer theory (Dember & Earl, 1957). The results are also in accordance with the expectation that higher product expertise reduces the cognitive effort needed for decoding the stimulus, and allows for easier categorization (Alba & Hutchinson, 1987; Sujan, 1985). Our second hypothesis stated that consumers with high product knowledge would have a higher optimum stimulation level and thus exhibit higher liking for the beers (H2b). In the presented studies, product knowledge was not found to have an effect on preference, thus no support for H2b could be found. However, significant interactions were observed in both studies where product knowledge was a design factor, suggesting that alternative explanations for the null results
cannot be ruled out. These include limits of the experimental design (e.g. the product range), but also a number of masking factors, both physiological and psychological, that could have been at play to alter preference formation. For example, sensory acuity of the tasters was found in a previous study to be a more important factor than knowledge alone in explaining flavor preferences (Frøst & Noble, 2002). Additionally, psychological variables such as thinking styles (e.g. holistic vs. analytical), which are also known to affect novelty acceptance independently of other factors (Chinchanachokchai & Noel, 2011), might account for the lack of difference in liking ratings between knowledgeable and novice consumers. The last two hypotheses related to whether an individual’s intrinsic desire for variety would influence the optimal level of arousal and consequently affect hedonic ratings. A significant effect of the variety seeking level in the predicted direction (H3a) was found in study 1, but could not be replicated in study 2, although the tendency was seen. Level of neophobia was used as a design factor in the first study only, revealing no effect on hedonic response, contrary to our hypothesis (H3b). To summarize, these results suggest that product knowledge has implications for flavor perception, whereas variety seeking might have an effect, although the evidence is not conclusive. In general, it seems that the variety seeking and neophobia constructs may be more relevant for other aspects of the product experience, e.g. food choice.
Limitations and future research
This study is not without limitations, the first of which is that all the presented studies were conducted with the same product, beer, as test stimulus; cross-product studies are therefore warranted to ensure that the results can indeed be generalized across food product categories. Further, in this work we have limited our focus to consumers’ responses to flavor. A few observations are therefore due concerning the degree to which the results can be translated to overt behavior (e.g. actual product choice). The flavor of a food is a major component of the experienced quality, and is considered the most important driver of repeated purchasing (see e.g. Bruhn, 2008; Cardello et al., 2007; Moskowitz & Krieger, 1995). Nevertheless, since consumers do not (ordinarily) have the possibility to taste food products before purchase, it follows that in pre-purchase stages the product appearance will be the key factor upon which consumers form their expectations, including evaluations about perceived novelty and prototypicality (Mugge & Schoormans, 2012). For food-specific studies, see e.g. Puyares, Ares, & Carrau (2010) for an example of how visual cues about wine bottles affect the perceived quality of wine, and Garber, Hyatt, & Starr (2000) for an example of how the color of food packaging serves as a powerful cue for conferring novelty to food products. Previous research has shown that the relative importance of different sensory modalities changes across stages of product usage (Fenko, Schifferstein, & Hekkert, 2010), and thus the significance of
the results cannot be generalized, beyond actual product tasting, to other aspects of product-user interactions. Further, the experimental design limited the interactions between individuals, to maximize subjects’ focus on their own taste perception. This obviously differs from a natural consumption situation, particularly one with social interactions. It is important to stress this aspect, since it is known that social interactions (implicit and explicit) influence both variety seeking and preference, as demonstrated (incidentally, in a study also using beer as stimuli) by Ariely & Levav (2000). Finally, it should be noted that all three studies were conducted at the same testing facility and with nearly identical experimental procedures. Contextual factors have long been known to have important implications for consumer behavior (Belk, 1975; Meiselman, 2008). The relevance for the present discussion is that situational factors have been suggested to exert a moderating effect on the perceived typicality and novelty of a product (Bloch, 1995), and recent results appear to support that claim (Blijlevens et al., 2012). Analogous dynamics have been proposed for the lower senses as well (Frøst & Mortensen, 2011), though empirical evidence still needs to be provided. Systematic investigations of contextual aspects would constitute a promising continuation of the present research and contribute to a deeper integration of the collative-motivational model with general (consumer) behavior theory.
Conclusions
To conclude, the main contribution of this research has been to test a theory of aesthetic preferences, Berlyne’s collative-motivational model, for predicting consumers’ hedonic response to flavors in a realistic beverage. Empirical evidence for a curvilinear effect of perceived stimulus novelty on hedonic response, in line with the theory’s prediction, was gathered. The other two collative properties examined, familiarity and complexity, were more linearly related to liking, although one study suggested that a curvilinear effect might have been observed with increasing levels of familiarity and complexity. Taken collectively, these results indicate that arousal potential in flavor stimuli is mostly related to perceived novelty, but suggest that at very high levels familiarity and complexity might also exhibit a similar behavior. A number of moderating factors have been discussed as potentially explaining the much higher rejection threshold for these two properties, and future research on the topic is warranted. Lastly, the moderating role of three relevant consumer traits – product knowledge, variety seeking and food neophobia – was investigated. A significant effect of product knowledge on collative properties was established, but not on hedonic response, possibly indicating that sensory characteristics in this case are more important. No conclusive evidence concerning the effect of variety seeking and neophobia was obtained. The results suggest that these traits may affect flavor perception, but the possibility of a spurious effect should be ruled out by future research more explicitly addressing the issue.
D. Giacalone et al.- Stimulus collative properties and consumer’s flavor preferences
References
Alba, J. W., & Hutchinson, J. W. (1987). Dimensions of consumer expertise. Journal of Consumer Research, 13, 411-454. Ariely, D., & Levav, J. (2000). Sequential choice in group settings: Taking the road less traveled and less enjoyed. Journal of Consumer Research, 27, 279-290. Belk, R. W. (1975). Situational variables and consumer behavior. Journal of Consumer Research, 2, 157-164. Berlyne, D. E. (1950). Novelty and curiosity as determinants of exploratory behaviour. British Journal of Psychology, 41, 68-80. Berlyne, D. E. (1960). Conflict, arousal, and curiosity. New York: McGraw-Hill Book Company. Berlyne, D. E. (1963). Complexity and incongruity variables as determinants of exploratory choices and evaluative ratings. Canadian Journal of Psychology, 17, 274-290. Berlyne, D. E. (1966). Curiosity and exploration. Science, 153, 25-33. Berlyne, D. E. (1967). Arousal and reinforcement. In D. Levine (Ed.), Nebraska symposium on motivation (pp. 1-110). Lincoln, Nebraska: University of Nebraska Press. Berlyne, D. E. (1970). Novelty, complexity and hedonic value. Perception and Psychophysics, 8, 279-286. Blijlevens, J., Carbon, C., Mugge, R., & Schoormans, J. P. L. (2012). Aesthetic appraisal of product designs: Independent effects of typicality and arousal. British Journal of Psychology, 103, 57-57. Blijlevens, J., Gemser, G., & Mugge, R. (2012). The importance of being ‘well-placed’: The influence of context on perceived typicality and esthetic appraisal of product appearance. Acta Psychologica, 139, 178-186. Bloch, P. H. (1995). Seeking the ideal form - product design and consumer response. Journal of Marketing, 59, 16-29. Bruhn, C. M. (2008). Consumer acceptance of food innovations. Innovation: Management, Policy & Practice, 10, 91-95. Calantone, R. J., & Cooper, R. G. (1979). A discriminant model for identifying scenarios of industrial new product failure. Journal of the Academy of Marketing Science, 7, 163-183. Cardello, A. V., Schutz, H. G., & Lesher, L. L. 
(2007). Consumer perceptions of foods processed by innovative and emerging technologies: A conjoint analytic study. Innovative Food Science & Emerging Technologies, 8, 73-83. Charters, S., & Pettigrew, S. (2007). The dimensions of wine quality. Food Quality and Preference, 18, 997-1007.
Working paper
Charters, S., & Pettigrew, S. (2008). Why do people drink wine? A consumer-focused exploration. Journal of Food Products Marketing, 14, 13-31. Chinchanachokchai, S., & Noel, H. (2011). Does it sound familiar? The effects of thinking style on judgment of new products. Proceedings of the Society for Consumer Psychology 2011 Winter Conference, Atlanta, Georgia, USA. 30-30. Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association, 74, 829-836. Cleveland, W. S., & Devlin, S. J. (1988). Locally weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association, 83, 596-610. Cox, D., & Cox, A. (1994). The effect of arousal seeking tendency on consumer preferences for complex product design. Advances in Consumer Research, 21, 554-558. Daselaar, S. M., Fleck, M. S., & Cabeza, R. (2006). Triple dissociation in the medial temporal lobe: Recollection, familiarity, and novelty. Journal of Neurophysiology, 96, 1902-1911. Dember, W. N., & Earl, R. W. (1957). Analysis of exploratory, manipulatory and curiosity behaviors. Psychological Review, 64, 91-96. Desmet, P., & Hekkert, P. (2007). Framework of product experience. International Journal of Design, 1, 57-66. Desmet, P., & Schifferstein, H. N. J. (2008). Sources of positive and negative emotions in food experience. Appetite, 50, 290-301. Fenko, A., Schifferstein, H. N. J., & Hekkert, P. (2010). Shifts in sensory dominance between various stages of user–product interactions. Applied Ergonomics, 41, 34-40. Fischler, C. (1990). L'homme omnivore. Paris: Editions Odile Jacob. Fishbach, A., Ratner, R. K., & Zhang, Y. (2011). Inherently loyal or easily bored? Nonconscious activation of consistency versus variety-seeking behavior. Journal of Consumer Psychology, 21(1), 38-48. Frøst, M. B., & Mortensen, L. M. (2011). 
Collative properties, elicited emotions and their relationships to liking in high-end restaurant dishes. 9th Rose Mary Pangborn Sensory Science Symposium, Toronto, Canada. Frøst, M. B., & Noble, A. C. (2002). Preliminary study of the effect of knowledge and sensory expertise on liking for red wines. American Journal of Enology and Viticulture, 53, 275-284. Garber, L. L., Hyatt, E. M., & Starr, R. G. (2000). The effect of food color on perceived flavour. Journal of Marketing Theory and Practice, 8, 59-70. Hekkert, P., & Leder, H. (2008). Product aesthetics. In H. N. J. Schifferstein & P. Hekkert (Eds.), Product Experience. Palo Alto, CA: Elsevier Science. Hekkert, P., Snelders, D., & Van Wieringen, P. C. W. (2003). 'Most advanced, yet acceptable': Typicality and novelty as joint predictors of aesthetic preference in industrial design. British Journal of Psychology, 94, 111-124.
Hoeffler, S., & Ariely, D. (1999). Constructing stable preferences: A look into dimensions of experience and their impact on preference stability. Journal of Consumer Psychology, 8, 113-139. Jellinek, J. S., & Köster, E. P. (1979). Perceived fragrance complexity and its relation to familiarity and pleasantness. Journal of the Society of Cosmetic Chemists, 30, 253-262. Jellinek, J. S., & Köster, E. P. (1983). Perceived fragrance complexity and its relation to familiarity and pleasantness II. Journal of the Society of Cosmetic Chemists, 34, 83-97. King, S. C., & Meiselman, H. L. (2010). Development of a method to measure consumer emotions associated with foods. Food Quality and Preference, 21(2), 168-177. Korsmeyer, C. (1999). Making sense of taste: Food and philosophy. Ithaca: Cornell University Press. Latour, K. A., & Latour, M. S. (2010). Bridging aficionados' perceptual and conceptual knowledge to enhance how they learn from experience. Journal of Consumer Research, 37, 688-697. Lévy, C. M., MacRae, A., & Köster, E. P. (2006). Perceived stimulus complexity and food preference development. Acta Psychologica, 123, 394-413. Livermore, A., & Laing, G. (1996). Influence of training and experience on the perception of multicomponent odor mixtures. Journal of Experimental Psychology: Human Perception and Performance, 22, 267-277. Livermore, A., & Laing, G. (1998). The influence of chemical complexity on the perception of multicomponent odor mixtures. Perception & Psychophysics, 60, 650-661. Loewy, R. (1951). Never leave well enough alone. New York: Simon and Schuster. Lyman, B. (1982). A Psychology of Food, More Than a Matter of Taste. New York: Van Nostrand Reinhold. Macfie, H. J. H., Bratchell, N., Greenhoff, K., & Vallis, L. V. (1989). Designs to balance the effect of order of presentation and first-order carry-over effects in hall tests. Journal of Sensory Studies, 4, 129-148. Mandler, G. (1982). The structure of value: Accounting for taste. In M. S. Clark, & T. S. 
Fiske (Eds.), Affect and cognition: The 17th Annual Carnegie Symposium (pp. 203-230). Hillsdale, NJ: Lawrence Erlbaum Associates. Marshall, K., Laing, D. G., Jinks, A. L., & Hutchinson, I. (2006). The capacity of humans to identify components in complex odor-taste mixtures. Chemical Senses, 31, 539-545. Martindale, C., & Moore, K. (1989). Relationship of musical preference to collative, ecological, and psychophysical variables. Music Perception: An Interdisciplinary Journal, 6, 431-445. Medin, D. L., & Smith, E. E. (1984). Concept and concept formation. Annual Review of Psychology, 35, 113-138. Meiselman, H. L. (2008). Experiencing food products within a physical and social context. In H. N. J. Schifferstein & P. Hekkert (Eds.), Product Experience. Palo Alto, CA: Elsevier Science.
Mervis, C. B., & Rosch, E. (1984). Categorization of natural objects. Annual Review of Psychology, 32, 89-115. Mielby, L. H., & Frøst, M. B. (2010). Expectations and surprise in a molecular gastronomic meal. Food Quality and Preference, 21, 213-224. Mielby, L. H., Kildegaard, H., Gabrielsen, G., Edelenbos, M., & Thybo, A. K. (2012). Adolescent and adult visual preferences for pictures of fruit and vegetable mixes – effect of complexity. Food Quality and Preference, 26, 188-195. Morin-Audebriand, L., Mojet, J., Chabanet, C., Issanchou, S., Møller, P., Köster, E. P., & Sulmont-Rossé, C. (2012). The role of novelty detection in food memory. Acta Psychologica, 139, 233-238. Moskowitz, H. R., & Barbe, C. D. (1977). Profiling of odor components and their mixtures. Sensory Processes, 1, 212-226. Moskowitz, H. R., & Krieger, B. (1995). The contribution of sensory liking to overall liking: An analysis of six food categories. Food Quality and Preference, 6, 83-90. Mugge, R., & Dahl, D. (2011). The influence of design newness on new products evaluation. Proceedings of the Society for Consumer Psychology 2011 Winter Conference, Atlanta, Georgia, USA. 45-46. Mugge, R., & Schoormans, J. P. L. (2012). Product design and apparent usability. The influence of novelty in product appearance. Applied Ergonomics, 43, 1081-1088. North, A. C., & Hargreaves, D. J. (1997). Liking, arousal potential and the emotions expressed by music. Scandinavian Journal of Psychology, 38, 45-53. Park, C. W., Mothersbaugh, D. L., & Feick, L. (1994). Consumer knowledge assessment. Journal of Consumer Research, 21, 71-82. Pliner, P. (1982). The effects of mere exposure on liking for edible substances. Appetite, 3, 283-290. Pliner, P., & Hobden, K. (1992). Development of a scale to measure the trait of food neophobia in humans. Appetite, 19, 105-120. Pliner, P., Lahteenmaki, L., & Tuorila, H. (1998). Correlates of human food neophobia. Appetite, 30, 93. Pocheptsova, A., Labroo, A. A., & Dhar, R. (2010). 
Making products feel special: When metacognitive difficulty enhances evaluation. Journal of Marketing Research, 47, 1059-1069. Puyares, V., Ares, G., & Carrau, F. (2010). Searching a specific bottle for tannat wine using a check-all-that-apply question and conjoint analysis. Food Quality and Preference, 21, 684-691. Rosch, E., & Mervis, C. B. (1975). Family resemblance: Studies in the internal structures of categories. Cognitive Psychology, 7, 573-605. Rozin, P. (1976). The selection of foods by rats, humans, and other animals. In D. Lehrman, R. A. Hinde & E. Shaw (Eds.), Advances in the Study of Behavior. New York: Academic Press. Selnes, F., & Howell, R. (1999). The effect of product expertise on decision making and search for written and sensory information. Advances in Consumer Research, 26, 80-89.
Stewart-Knox, B., & Mitchell, P. (2003). What separates the winners from the losers in new food product development? Trends in Food Science & Technology, 14, 58-64. Sujan, M. (1985). Consumer knowledge: Effects on evaluation strategies mediating consumer judgments. Journal of Consumer Research, 12, 31-45. Sulmont-Rossé, C., Chabanet, C., Issanchou, S., & Köster, E. P. (2008). Impact of the arousal potential of uncommon drinks on the repeated exposure effect. Food Quality and Preference, 19, 412-420. Traill, B., & Grunert, K. G. (Eds.). (1997). Product and process innovation in the food industry (1st Ed.). London, UK: Chapman & Hall. Tuorila, H. M., Meiselman, H. L., Cardello, A. V., & Lesher, L. L. (1998). Effect of expectations and the definition of product category on the acceptance of unfamiliar foods. Food Quality and Preference, 9, 421-430. Valentin, D., Dacremont, C., & Cayeux, I. (2010). Does short-term odor memory increase with expertise? An experimental study with perfumers, flavorists, trained panelists and novices. Flavour and Fragrance Journal, 26, 408-415. van Trijp, H. C. M., & Steenkamp, J. E. M. (1992). Consumers' variety seeking tendency with respect to foods: Measurement and managerial implications. European Review of Agricultural Economics, 19, 181-195. van Trijp, H. C. M., Lähteenmäki, L., & Tuorila, H. (1992). Variety seeking in the consumption of spread and cheese. Appetite, 18, 155-164. van Trijp, H. C. M., & van Kleef, E. (2008). Newness, value and new product performance. Trends in Food Science & Technology, 19, 562-573. Veryzer, R. W., & Hutchinson, J. W. (1998). The influence of unity and prototypicality on aesthetic responses to new products designs. Journal of Consumer Research, 24, 374-394. Whitfield, T. W. A. (1983). Predicting preference for familiar, everyday objects: An experimental confrontation between two theories of aesthetic behavior. Journal of Environmental Psychology, 3, 221-237. Zajonc, R. B. (1968). 
Attitudinal effects of mere exposure. Journal of Personality and Social Psychology, Monograph Supplement, 9, 1-27. Zajonc, R. B. (1980). Feeling and thinking: Preferences need no inferences. American Psychologist, 35, 151-175. Zajonc, R. B. (2001). Mere exposure: A gateway to the subliminal. Current Directions in Psychological Science, 10, 224-228.
Paper VI
Giacalone, D., Frøst, M. B., Bredie, W. L. P., Jaeger, S. R. (In preparation). Situational appropriateness and consumers’ use patterns for beers: The moderating role of product familiarity.
Situational appropriateness and consumers’ use patterns for beers: The moderating role of product familiarity
Davide Giacalone1, Wender L. P. Bredie1, Michael Bom Frøst1, & Sara R. Jaeger2
1 Department of Food Science, Faculty of Science, University of Copenhagen
2 The New Zealand Institute for Plant and Food Research Ltd.
Abstract. Consumer research has long maintained that an explicit account of contextual variables can enhance the ability to understand and predict behavioral acts. In this domain, one aspect that has received little attention is whether context equally affects familiar and unfamiliar food products. The topic is investigated in four consumer studies (N = 76, N = 97, N = 93, and N = 145), using beer images as test stimuli. Using the situational appropriateness framework, we derived a quantitative characterization of product–context associations, revealing major differences between products. The results were robust across studies, suggesting that patterns of situational appropriateness are substantive, and that situation-based segmentation can be a valuable addition to traditional user-based approaches. The level of product familiarity was strongly correlated with usage versatility and affected the perceived appropriateness for many of the specific usage contexts, possibly acting as a cue to infer product quality and performance.
Keywords: context, familiarity, appropriateness, beer, consumer research
1. Introduction
1.1. Situational appropriateness and food-related consumer behavior
Consumer research has long maintained that an explicit account of contextual variables can enhance the ability to understand and predict behavior (Belk, 1974, 1975; Sandell, 1968). The topic is of considerable importance with regard to food-oriented consumer research (Meiselman, 2008), as many authors have demonstrated that both preference and choice of food and beverages are affected by a variety of
Corresponding author: Tel.: +64 9 925 7035. Fax: +64 9 815 4201. E-mail:
[email protected]
contextual influences, including social influences (Ariely & Levav, 2000; de Castro, 1991), environmental factors (Bell, Meiselman, Pierson & Reeve, 1994; Bell & Meiselman, 1995; de Graaf, Cardello, Kramer, Lesher, Meiselman & Schutz, 2005; Edwards, Meiselman, Edwards & Lesher, 2003; Meiselman, Johnson, Reeve, & Crouch, 2000), temporal aspects (Kramer, Rock, & Engell, 1992; Rozin & Tuorila, 1993), and accompanying items (Eindhoven & Peryam, 1959; Hersleth, Mevik, Næs, & Guinard, 2003; Moskowitz & Klarman, 1977; Turner & Collison, 1988). Awareness that consumers behave differently in different situations has prompted the development of a number of methodological approaches based on contextual segmentation, i.e. on the identification of perceived product benefits across different situations (Dubow, 1992; Jaeger, Bava, Worch, Dawson & Marshall, 2011; Jaeger, Marshall & Dawson, 2009; Köster & Mojet, 2006). Of particular relevance within this stream of research is the approach based on judgments of situational appropriateness proposed by Schutz (Schutz, 1988; 1994; Cardello & Schutz, 1996), who adapted a basic anthropological technique (Stefflre, 1971) for application in food studies. In such an approach, consumers evaluate products and usage situations simultaneously, essentially being asked how well a product (or a set of products) would fit each of the given usage contexts (varying in e.g. time of day, location, presence of others, etc.). The appropriateness framework has been utilized over the years with a variety of product categories, to study the effect of different intrinsic and extrinsic aspects of foods – such as different sensory intensity levels, nutritional and label information, packaging and processing (e.g. 
Bruhn & Schutz, 1986; Jack, Piggott, & Paterson, 1994; Lähteenmäki & Tuorila, 1997; Resurreccion, 1986; Schutz, Cardello, & Winterhalter, 2005) – and established itself as a simple methodology to investigate the instrumental roles of food products as defined by usage contexts.
1.2. The role of product familiarity
One aspect that has received little attention to date is how judgments of situational appropriateness are influenced by consumers’ degree of familiarity with food products, in spite of suggestive evidence in this direction (Jaeger, Rossiter, & Lau, 2005). Product familiarity can be defined as the evaluative judgment that a consumer makes regarding his/her subjective knowledge about a product (“subjective” meaning here that it is unrelated to objective product knowledge). Familiarity is related to the amount of previous exposure to the target stimulus or product, and has been found to be strongly related to product typicality, i.e. the degree to which a product is representative of its overall category concept (Schwanenflugel & Rey, 1986). In general, consumers tend to be somewhat reluctant to try very new and unfamiliar products. This reluctance stems from a lack of understanding of the product’s value and potential usage, and from aversion to the learning costs associated with effectively using a new product (Shugan, 1980; Mukherjee & Hoyer, 2001). While for familiar products a consumer can easily retrieve relevant characteristics and determine whether that product is appropriate for an intended use, more or less irrespective of context and external elements (Goodman, Broniarczyk, Griffin, & McAlister, 2013), the same task is more
D. Giacalone et al.- Situational appropriateness and consumers’ use patterns for beers
difficult for unfamiliar products. For unfamiliar products, contextual elements can provide a frame of reference for understanding new products by e.g. orienting consumers towards particular features that may be salient in relation to a given usage context (Herr, 1989; Hoeffler, 2003; Ratneshwar & Shocker, 1991; Veryzer, 1998; Warlop & Ratneshwar, 1993). Accordingly, extant literature in consumer research suggests that contextual influences might be more relevant for consumers’ choice of novel/unfamiliar products, particularly because contexts have been shown to facilitate consumers’ cognitive categorization of unfamiliar items. Evidence for this argument has also emerged in the field of food choice and acceptance. For example, Tuorila, Meiselman, Cardello and Lesher (1994), and Mielby and Frøst (2010) demonstrated that providing verbal information (as part of a context manipulation) increased the acceptability of unfamiliar food dishes. Other authors have suggested that acceptance and choice of familiar and well-liked foods might be relatively less influenced by specific consumption contexts (King, Weber, Meiselman, & Lv, 2004; King, Meiselman, Hottenstein, Work, & Cronk, 2007).
1.3. Aims of the research
There is hitherto little understanding of the role that product familiarity plays in consumers’ judgments of situational appropriateness of food and beverages. Gaining such knowledge would be beneficial both from a research standpoint – i.e. furthering the concept of situational appropriateness by including another product-relevant dimension – and from a practical standpoint, because it relates to how consumers form consideration sets and thus could aid food companies with the positioning of new food products. The present research starts to fill this gap by focusing on a case study: situational appropriateness of beers varying in degree of familiarity, evaluated by New Zealand consumers. Beer lends itself very well as a case study; past research has established that a number of contextual influences have an impact on hedonic appreciation and choice of this popular drink (Allison & Uhl, 1964; Caporale & Monteleone, 2004; Coquillat et al., 2009; Hajdu, Major, & Lakner, 2007; Lee, Frederick, & Ariely, 2006; Mohr et al., 2001; Sester et al., 2013). Further, beer is a very traditional product in New Zealand (it accounts for 63% of all available alcohol for sale; Carroll, 2011), and as such is particularly suited for exploring consumers’ associations (Caporale & Monteleone, 2004; Sester, Dacremont, Deroy, & Valentin, 2013). Finally, the need for further studies investigating the role of familiarity in shaping consumers’ mental representation of beer has been recently highlighted (Sester et al., 2013). The specific aims and research questions were:
RQ1: To investigate the role of usage contexts in shaping consumers’ evaluation of beers; it is anticipated that different beers will be associated with different usage contexts, and that the nature of this association will be related to extrinsic (e.g. the product name) and intrinsic (e.g. the beer style) product
characteristics. Further, we expect that these associations can be harnessed by the situational appropriateness concept in a reliable (i.e. repeatable across studies) way.
RQ2: To investigate the relationships between the degree of familiarity with the beers and their perceived situational appropriateness. Because familiar products are encountered more easily by consumers, we expect familiarity to be linked to versatility, which can be defined as the total number of usage contexts for which a given beer will be perceived as appropriate (Ratneshwar & Shocker, 1991). On the other hand, evaluating unfamiliar products in relation to a context may facilitate the appraisal of the former by focusing the consumer’s attention on specific context-relevant features.
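As an illustration of the link between familiarity and versatility anticipated in RQ2, the following minimal Python sketch computes a versatility score per beer and correlates it with mean familiarity. It is not the analysis used in the studies: the beer names, ticked contexts and familiarity values are invented, and versatility is operationalized here, as an assumption, as the mean number of contexts ticked per respondent.

```python
from math import sqrt

# Hypothetical data: for each beer, one set of ticked usage contexts per respondent.
responses = {
    "Beer A": [{"BBQ", "Party", "TV"}, {"BBQ", "Party", "Rugby", "TV"}],
    "Beer B": [{"Fine dining"}, {"Gift", "Fine dining"}],
    "Beer C": [{"BBQ", "Party"}, {"Party", "Treat", "Guests"}],
}
# Hypothetical mean familiarity ratings (1 = Not at all familiar, 5 = Extremely familiar).
familiarity = {"Beer A": 4.5, "Beer B": 1.5, "Beer C": 3.5}


def versatility(ticked_sets):
    """Mean number of contexts ticked per respondent (assumed operationalization)."""
    return sum(len(ticked) for ticked in ticked_sets) / len(ticked_sets)


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


beers = sorted(responses)
versatility_scores = [versatility(responses[b]) for b in beers]
familiarity_scores = [familiarity[b] for b in beers]
r = pearson(familiarity_scores, versatility_scores)
print(f"r(familiarity, versatility) = {r:.2f}")
```

With only a handful of products, a rank-based coefficient might be a more cautious choice than Pearson's r; the sketch uses the latter only because it is the simplest to write out in full.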
2. Materials and Methods
These questions are investigated across four studies, in which we apply the situational appropriateness framework to reveal associations between different beers and the contexts in which they are consumed. All four studies follow a consistent structure. In order to facilitate a quantitative characterization of product–context associations, a questionnaire was designed using beer images and names as stimuli. This choice was motivated by the argument that vision is the most important sensory modality at the moment of purchase, suggesting that product appearance is a significant cue for assessing the perceived usage appropriateness of products (Creusen & Schoormans, 2005; Fenko, Schifferstein, & Hekkert, 2010; Mugge & Schoormans, 2012a). Further, this stimulus format has been employed effectively in extant research on the same topic (e.g. Jaeger, Rossiter, & Lau, 2005; Raats & Shepherd, 1992; Sester et al., 2013), and is known to enhance external validity of consumers’ evaluations in product categories that depend strongly on visual inspection (Jaeger, Hedderley, & MacFie, 2001; Vriens, Loosschilder, Rosbergen, & Wittink, 1998). For each beer image shown, consumers evaluated appropriateness for different contexts and perceived familiarity, and the information was related via correlational measures. Relevant contextual attributes were developed according to existing classifications of contextual variables affecting food choice and acceptance (Blake et al., 2007; Bisogni et al., 2007; Meiselman, 2008): locations, including the general location (e.g. at home, at a restaurant, etc.) and the specific place within the location (e.g. at home in front of the TV); occasions (e.g. at a rugby match, at a concert); social surroundings (e.g. for guests, to impress someone); physiological processes (e.g. as a thirst-quencher); and mental processes (e.g. as a treat for myself).
2.1. Study 1
2.1.1. Participants
Participants were a convenience sample of consumers from the general population in Auckland (N = 76, 38 men and 38 women, aged 18 – 60), recruited
based on their availability and willingness to participate, who completed the questionnaire on a voluntary basis. A summary of key demographic characteristics for the four studies is given in Table 1.
Percentage of the sample
Variable | Level | Study 1 (N = 76) | Study 2A (N = 97) | Study 2B (N = 93) | Study 3 (N = 145)
Gender | Male | 50 | 34 | 39.8 | 43.8
Gender | Female | 50 | 66 | 60.2 | 56.2
Age | 18 – 30 | 27.6 | 52.6 | 35.5 | 19.4
Age | 31 – 45 | 46.1 | 27.8 | 36.5 | 40.3
Age | 46 – 60 | 26.3 | 19.6 | 28 | 40.3
Frequency of beer consumption | Never | 9.2 | 7.2 | 7.5 | 3.5
Frequency of beer consumption | Rarely (1-2 times a year or less) | 7.9 | 15.5 | 16.1 | 10.6
Frequency of beer consumption | Sometimes (3-12 times a year) | 15.8 | 7.2 | 15.1 | 12.1
Frequency of beer consumption | Often (1-3 times a month) | 26.3 | 28.9 | 22.6 | 26.2
Frequency of beer consumption | Regularly (once a week or more) | 40.8 | 41.2 | 38.7 | 47.5
Consideration set size with respect to beer | I never drink beer | 11.8 | 8.2 | 8.6 | N/A
Consideration set size with respect to beer | Generally, I tend to drink the same beers and beer styles I am familiar with | 9.2 | 24.7 | 20.4 | N/A
Consideration set size with respect to beer | I occasionally like to try different beers and beer styles, but I don't go out of my way to do it | 44.7 | 50.5 | 57 | N/A
Consideration set size with respect to beer | I usually drink a variety of beers and beer styles, and I actively seek out to try new ones | 34.2 | 15.5 | 14 | N/A
Tab. 1 – Key demographics for the four studies (percentages may not sum up to 100 due to rounding).
2.1.2. Choice of stimuli
Images of nine commercially available NZ beers were selected as stimuli (Fig. 1). Stimulus elements included a picture of the beer (11x4 cm high-quality color photograph), the name of the product and the name of the producer. The nine beers were chosen to represent three different levels of familiarity – low, medium and high – pre-assessed based on pilot test results. All selected beers were produced in New Zealand, in order to control for possible biases due to different meaning associations related to the country of origin (Donadini & Fumi, 2010; Luomala, 2007).
Fig. 1 – Product images used as stimuli in Study 1. Product codes used throughout the paper are indicated in brackets.
2.1.3. Data collection
Relevant usage contexts were identified based on previous studies (Belk, 1975; Dubow, 1992; Hajdu, Major & Lakner, 2007; Marshall & Bell, 2003; Nantachai, Petty & Scriven, 1991–1992; Schutz, 1988), and used either verbatim or with some modifications aimed at better capturing a local (i.e. New Zealand) context. Pre-testing (N = 12) of the questionnaire was carried out to assess the layout and the wording of the context items. Subjects who helped with piloting the ballot were also asked to state whether they felt some relevant contextual attributes were missing and whether some of the existing ones were redundant or never appropriate for beer, and their suggestions helped shape the final ballot. The following 15 usage contexts were included in Study 1: As a gift for someone (Gift)1, As a treat for myself (Treat), At a BBQ with friends (BBQ), As a thirst-quencher (Thirst-quench.), At a fine-dining restaurant (Fine dining), At a music concert (Concert), At a party (Party), At a public house (bars, pub, etc.) (Pub House), At a rugby match (Rugby), At work for Friday drinks (Work), On a camping or fishing trip (Camping/Fishing), To celebrate an achievement (Achievement), To serve to guests (Guests), Watching TV at home (TV), With a snack (Snack).
1 Abbreviation used in tables and figures.
Appropriateness evaluations were elicited with a multiple-choice task in which consumers were asked to look at a beer image and tick all the contextual attributes for which they perceived that beer to be appropriate. Note that this format differs from earlier applications of the item-by-use appropriateness method, in which consumers rate the perceived appropriateness on category scales (usually 7-point scales with anchors labeled “Never appropriate” and “Always appropriate”). The check-all-that-apply (CATA) format was adopted because 1) CATA questions are quicker to administer and less cognitively burdensome for respondents (Krosnick, 1991; 1999; Sudman & Bradburn, 1992), and 2) recent methodological research suggests that this method produces (at an aggregate level) results that are to a high extent equivalent to those of rating techniques (Bruzzone, Ares, & Gimenez, 2012; Reinbach, Giacalone, Ribeiro, Bredie, & Frøst, 2013). Presentation order of the beers was randomized across participants. The order of the contextual attributes was also randomized, in order to minimize primacy and order biases in CATA responses (Ares & Jaeger, 2013). For practical reasons, attribute randomization was carried out only between (but not within) subjects. Additionally, respondents rated the perceived product familiarity of each beer on a 5-point scale with end-point anchors (1 = Not at all familiar, 5 = Extremely familiar, as in Raju, 1977), their frequency of consumption for that beer, and their overall familiarity with that beer brand (on an analogous 5-point scale). At the end of the ballot, respondents provided some background information, namely gender, age, frequency of beer consumption and consideration set size for beer (details in Tab. 3).
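To make the aggregate-level use of CATA responses concrete, the sketch below tabulates ticked contexts into a product-by-context table of citation proportions (the share of respondents who ticked each context for each beer). It is a minimal illustration under stated assumptions, not the procedure used in the studies: the record layout, the function name `cata_table`, and all data values are hypothetical.

```python
from collections import Counter

# Hypothetical raw CATA records: (respondent id, beer, contexts ticked as appropriate).
records = [
    (1, "Beer A", ["BBQ", "Party"]),
    (2, "Beer A", ["BBQ"]),
    (1, "Beer B", ["Fine dining", "Gift"]),
    (2, "Beer B", ["Gift"]),
]


def cata_table(records):
    """Return {beer: {context: proportion of respondents who ticked it}}."""
    counts = {}       # beer -> Counter of ticked contexts
    respondents = {}  # beer -> set of respondents who evaluated that beer
    for respondent, beer, ticked in records:
        # set() guards against a context being listed twice in one record
        counts.setdefault(beer, Counter()).update(set(ticked))
        respondents.setdefault(beer, set()).add(respondent)
    return {
        beer: {ctx: n / len(respondents[beer]) for ctx, n in ctx_counts.items()}
        for beer, ctx_counts in counts.items()
    }


table = cata_table(records)
for beer, proportions in sorted(table.items()):
    print(beer, dict(sorted(proportions.items())))
```

Contexts that nobody ticked for a given beer are simply absent from that beer's row, so a lookup such as `table["Beer A"].get("Gift", 0.0)` treats them as zero.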
2.2. Studies 2A and 2B
In order to add robustness to the experimental plan, a second set of two experiments was designed to replicate the results of Study 1 and increase their independence from the experimental conditions. The goal of these two studies, both of which maintained the same methodological approach used in Study 1, was to verify whether the same patterns of product-context association could be observed with a different product set and a different wording of the contextual attributes (Study 2A). Additionally, in Study 2B, we restricted the stimulus set to beers with homogeneous profiles (only pale lager-type beers), in order to better separate the “true” familiarity effect from potentially distracting elements such as expectations about price, taste and availability, as well as personal preferences for certain beer styles.
2.2.1. Participants
Participants (N2A = 97; N2B = 93) were recruited by a marketing research provider and received cash compensation for their time. All lived in Auckland, New Zealand. They were aged 18 – 60 years and self-identified as Caucasian.
They completed the questionnaire as an add-on to a large-scale study that investigated consumers’ responses to a range of foods and beverages.
2.2.2. Choice of stimuli
As in Study 1, pilot work was conducted to ensure that the selected stimuli sufficiently spanned the familiarity spectrum. The sets of beer images used in Studies 2A and 2B are shown below (Fig. 2A and Fig. 2B). In Study 2A, beers varied in style and included both craft beers and standard lagers. In contrast, beers included in Study 2B had a more homogeneous profile: all were pale lager-type beers, with an alcohol by volume between 4 and 5% and a price ranging from $9.99 to $14.99 (see footnote 2). All were brewed by the three main large commercial breweries in New Zealand.
Fig. 2A – The stimuli used in the ballot for Study 2A. Product codes used throughout the paper are indicated in brackets.
² Prices are in New Zealand dollars (1 NZD ≈ 0.8 USD / 0.6 EUR) and refer to a six-pack of 330 ml bottles.
Fig. 2B – The stimuli used in the ballot for Study 2B. Product codes used throughout the paper are indicated in brackets.
2.2.3. Data collection
The data were collected in a sensory laboratory facility designed in accordance with ISO 8589 (1988). The questionnaire format was identical to Study 1. The selected contexts were: As an alternative to wine (Alt. Wine), As a drink for women (Women), Anytime (Anytime), At a casual dining restaurant (Casual dining), At a pub (Pub), At a sport event (Sport event), At home (Home), At parties (Parties), For a special occasion (Special occasion), To drink alone (Alone), To impress someone (Impress), When I want something different (Different), When I want something refreshing (Refreshing), When I want to relax (Relax).
2.3. Study 3
Although the variation in stimulus sets and context wording across the studies added robustness to the findings, all formats used verbal descriptions to convey contexts to the participants. As highlighted by Jaeger et al. (2001), the extent to which substantive conclusions can be drawn from consumers’ judgments may depend on presentation format and, in particular, on whether verbal descriptions or images are presented during the task. While the former are generally more prone to be interpreted differently by different consumers, the latter are expected to increase the realism of the task and the ecological validity of the elicited judgments. Accordingly, it was considered prudent to conduct a further study using pictorial images to evoke contexts, to explore whether the results would
differ from those obtained with the verbal description of contextual attributes (Jaeger et al., 2001; Vriens et al., 1998).
2.3.1. Participants
Participants (N3 = 145) were recruited by a marketing research provider and received cash compensation for their time. All of them lived in Auckland, New Zealand, self-identified as Caucasian, and were aged 18 – 60. As in Studies 2A and 2B, they completed the questionnaire as an add-on to a large-scale sensory study focused on sensory acuity and food preferences.
2.3.2. Choice of stimuli
In order to facilitate comparisons with the previous studies, a subset of previously used beer images was used as stimuli (H1, H6 and H5 for the familiar product class, M1, M2 and M8 for the moderately familiar class, and L1, L6 and L8 for the unfamiliar product class).
2.3.3. Data collection
Data collection procedures were identical to the previous studies. Nine situations were depicted: A BBQ in the summer (Summer BBQ), As a drink for women (Women), As an alternative to wine for dinner (Alt. Wine), Having a drink at a pub (Pub), Having dinner at a casual restaurant (Casual dining), On a camping trip (Camping trip), Rugby fans having a beer before the game (Rugby fans), Watching a rugby game on TV at home (Rugby on TV), While relaxing on a hammock (Hammock).
Fig. 3 – Sample images used for visual elicitation of contexts in Study 3.
2.4. Data Analysis
The same set of data analytical procedures was carried out for each of the three studies. All analyses were performed within the R environment for statistical computing (R Development Core Team, 2010).
2.4.1. Manipulation checks for variations in product familiarity of beer stimuli
To assess whether the intended variation in familiarity was achieved, an analysis of variance (ANOVA) on mean product and brand familiarity ratings was conducted using the following model:
$$Y_{in} = \mu + P_i + \alpha_n + \varepsilon_{in}$$
where $Y_{in}$ is the (in)th observation of the familiarity rating, $\mu$ is the general mean, $P_i$ is the main product effect (i = 1,…,9), $\alpha_n$ is the random effect of consumers, and $\varepsilon_{in}$ is the error term. Pairwise mean comparisons were performed following ANOVA by Tukey’s HSD test, to uncover which pairs of products differed from each other. Results showed that the beers were clearly differentiated in familiarity (Study 1: F (8, 600) = 101.04; p < 0.001; Study 2A: F (8, 768) = 230.4; p < 0.001; Study 2B: F (8, 735) = 106.3; p < 0.001; Study 3: F (8, 1296) = 135.8; p < 0.001). As intended, three groups of products with different degrees of familiarity were identified (Table 1): a high familiarity group (H1,…,H6), a medium familiarity group (M1,…,M8), and a low familiarity group (L1,…,L8). An identical analysis on brand familiarity ratings (Study 1: F (8, 600) = 131.2; p < 0.001; Study 2A: F (8, 768) = 299.2; p < 0.001; Study 2B: F (8, 735) = 247.3; p < 0.001; Study 3: F (8, 1296) = 314.4; p < 0.001) revealed a slightly different group separation (Table 2), suggesting that product and brand familiarity were conceptually differentiated by consumers³,⁴.

| Study 1 (N=76) | | | Study 2A (N=97) | | | Study 2B (N=93) | | | Study 3 (N=145) | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Beer | Prod. Fam. | Brand Fam. | Beer | Prod. Fam. | Brand Fam. | Beer | Prod. Fam. | Brand Fam. | Beer | Prod. Fam. | Brand Fam. |
| H1 a | 4.3±0.9 | 4.5±0.9 | H4 a | 4.2±1 | 4.4±0.9 | H1 a | 3.8±1.2 | 4.2±1 | H1 a | 4.4±0.9 | 4.6±0.7 |
| H3 a | 4.3±1 | 4.3±1 | H5 a | 4.1±1.1 | 4.4±1.1 | H6 a | 3.7±1.3 | 3.9±1.1 | H6 a | 4.2±1.0 | 4.1±1.0 |
| H2 a | 4.1±1.2 | 4.2±1.1 | H1 a | 4.0±1 | 4.4±0.9 | H4 a | 3.1±1.5 | 3.3±1.4 | H5 a | 4.0±1.3 | 4.4±0.8 |
| M1 b | 3.2±1.5 | 4±1.3 | M5 b | 2.8±1.3 | 2.9±1.3 | M8 b | 2.6±1.2 | 3.7±1.2 | M8 b | 2.9±1.4 | 3.8±1.2 |
| M2 b | 3±1.5 | 3.9±1.3 | M1 b | 2.8±1.2 | 3.9±1.3 | M6 b,c | 2.3±1.3 | 2.5±1.3 | M1 b,c | 2.8±1.4 | 4.2±1.0 |
| M3 b | 2.7±1.6 | 2.7±1.6 | M4 c | 2.2±1.2 | 2.2±1.2 | M7 b,c | 2.3±1.3 | 2.2±1.2 | M2 b,c | 2.7±1.4 | 4.1±1.0 |
| L3 c | 1.7±1.3 | 1.7±1.3 | L1 d | 1.6±1.3 | 1.6±1.3 | L6 c | 1.9±1.2 | 4.1±1 | L6 c | 2.5±1.5 | 4.4±0.9 |
| L2 c | 1.7±1.3 | 1.6±1.3 | L5 d | 1.2±0.6 | 1.1±0.6 | L7 d | 1.3±0.7 | 1.2±0.5 | L8 d | 1.3±0.8 | 1.3±0.7 |
| L1 c | 1.5±1.1 | 1.5±1.1 | L4 d | 1.1±0.5 | 1.1±0.6 | L8 d | 1.1±0.5 | 1.1±0.3 | L1 d | 1.1±0.6 | 1.1±0.6 |
Tab. 2 – Means and standard deviations of products’ and brands’ familiarity ratings (1 = Not at all familiar, 5 = Extremely familiar) across all studies. Products appear in descending order of product familiarity. Different superscript letters indicate statistically significant differences (Tukey p ≤ 0.05).
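The manipulation check above can be sketched as follows, assuming complete data (every consumer rated every product) and treating consumers as blocks, so that the product effect is tested against the residual mean square. The function name and data layout are hypothetical, the analysis in the paper was run in R, and the Tukey HSD step is not covered here.

```python
def block_anova_f(ratings):
    """F test for the product effect in a consumers x products layout.

    `ratings[n][i]` is the familiarity rating of consumer n for product i.
    Returns (F, df_product, df_error).
    """
    n_cons = len(ratings)
    n_prod = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n_cons * n_prod)
    prod_means = [sum(row[i] for row in ratings) / n_cons for i in range(n_prod)]
    cons_means = [sum(row) / n_prod for row in ratings]

    # Sums of squares for products, consumers (blocks) and the residual.
    ss_prod = n_cons * sum((m - grand) ** 2 for m in prod_means)
    ss_cons = n_prod * sum((m - grand) ** 2 for m in cons_means)
    ss_total = sum((y - grand) ** 2 for row in ratings for y in row)
    ss_error = ss_total - ss_prod - ss_cons

    df_prod = n_prod - 1
    df_error = (n_prod - 1) * (n_cons - 1)
    f_value = (ss_prod / df_prod) / (ss_error / df_error)
    return f_value, df_prod, df_error
```

Under this layout, 9 products and 76 consumers give (9 − 1) = 8 and (9 − 1)(76 − 1) = 600 degrees of freedom, and 97 consumers give 8 and 768, which agrees with the values reported for Studies 1 and 2A.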
³ A similar indication was obtained by assessing the internal consistency of these two constructs: although the overall agreement was high (Study 1: r = 0.87; Study 2A: r = 0.89; Study 2B: r = 0.71; Study 3: r = 0.68), lower values were obtained for individual beers, particularly in the medium and low familiarity groups (e.g. for products M8 and L6 the correlation coefficients were 0.52 and 0.24, respectively).
⁴ Further, the correlation between product familiarity and consumption frequency was strong (Study 1: r = 0.64; Study 2A: r = 0.71; Study 2B: r = 0.72; Study 3: r = 0.70) albeit not perfect, in line with the expectation that familiarity has more components than exposure alone.
2.4.2. Analyses pertaining to RQ1: Product-contexts associations
The first set of analyses aimed at investigating how consumers classify beers according to perceived appropriateness in given usage contexts (RQ1). To this end, consumers’ responses to the contextual attributes were coded as binary (1 = presence, 0 = absence) and organized in a matrix crossing consumers, products and attributes. Frequency of use of each contextual attribute was determined by counting the number of consumers that checked that attribute as appropriate to describe each product. The data were then rendered into cross-tabulation matrices displaying the frequency distribution of each beer over the contextual attributes present in the ballot. Correspondence analysis (CA, Greenacre, 1993) was applied to this matrix to extract and visualize the main data structure in a lower-dimensional subspace. Mean ratings of product familiarity were used as a supplementary variable in the model. This analysis allowed us to visualize the main patterns of product-contexts association on the correspondence map.
CA is an exploratory technique not suitable for statistical testing. Thus, Cochran’s Q test (Manoukian, 1986) was carried out separately on an unfolded matrix crossing consumers and contexts, in order to identify significant differences between products for each of the contextual attributes included. Briefly, Cochran’s Q test is a non-parametric statistical test used when the response variable can take only two possible outcomes (0/1), in order to verify whether k treatments have identical effects. Its test statistic is defined as follows:
$$Q = k(k-1)\,\frac{\sum_{j=1}^{k}\left(X_{\cdot j} - \frac{N}{k}\right)^{2}}{\sum_{i=1}^{b} X_{i\cdot}\,(k - X_{i\cdot})}$$
where k is the number of products, $X_{\cdot j}$ is the column total for the jth product, b is the number of consumers, $X_{i\cdot}$ is the row total for the ith consumer, and N is the grand total. This test was carried out on an attribute-by-attribute basis. Where significant differences were found, pairwise multiple comparisons were made by applying Cochran’s Q test to all possible product combinations.
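The Cochran's Q statistic described above can be sketched in a few lines for a single contextual attribute. The function name and the consumers-by-products 0/1 layout are assumptions made for illustration; the significance of Q would be judged against a chi-square distribution with k − 1 degrees of freedom, which is not computed here.

```python
def cochran_q(responses):
    """Cochran's Q for one contextual attribute.

    `responses[i][j]` is 1 if consumer i ticked the attribute for product j,
    else 0. Returns (Q, degrees of freedom).
    """
    k = len(responses[0])                                    # number of products
    col_totals = [sum(row[j] for row in responses) for j in range(k)]
    row_totals = [sum(row) for row in responses]             # ticks per consumer
    grand_total = sum(row_totals)                            # N in the formula

    numerator = k * (k - 1) * sum((c - grand_total / k) ** 2 for c in col_totals)
    denominator = sum(r * (k - r) for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined: every consumer gave a constant row")
    return numerator / denominator, k - 1
```

Consumers who ticked the attribute for all products or for none contribute nothing to the denominator, which is why the statistic is undefined when every row is constant.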
2.4.3. Analyses pertaining to RQ2: Relationships between familiarity and appropriateness
The second data-analytical step concerned more specifically the relationships between familiarity and appropriateness (RQ2). To investigate the expected link between familiarity and versatility, we wanted to model the number of appropriate contexts (rendered as an additional column with a count of how many attributes a person had checked for each beer) given the product familiarity rating. Due to both the nature of the response variable (counts) and evidence of a multi-modal distribution in the data, ordinary fitting with the least squares approach was
unviable. One common way to deal with this distributional problem is to use Poisson regression instead:
$$P(Y_i = y_i \mid x_i) = \frac{e^{-\lambda_i}\,\lambda_i^{y_i}}{y_i!}$$
where $Y_i$ is the number of appropriate contexts, n is the number of consumers, and the parameter $\lambda_i$ is modeled as $\ln(\lambda_i) = x_i\beta$, where $x_i$ is the explanatory variable (product familiarity ratings) affecting the probability for $Y_i$. Such a model can then be estimated by maximum likelihood, based on the log-likelihood function:
$$\ln L(\beta) = \sum_{i=1}^{n}\left[-\lambda_i + y_i x_i \beta - \ln(y_i!)\right]$$
Unfortunately, ordinary Poisson regression was not appropriate either, since our response variable included a large fraction of 0s, which caused overdispersion in the dataset (i.e., the variance was greater than the mean). Therefore, negative binomial regression was used as an alternative. Briefly, negative binomial regression is a generalization of Poisson regression that has an extra parameter to model overdispersion, defined as $\ln(\eta_i) = \ln(\lambda_i) + \varepsilon_i$, which accounts for the large variation observed. The distribution of $Y_i$ conditional on $\eta_i$ is again Poisson (Greene, 2003), and thus the model can again be straightforwardly estimated by maximum likelihood. This is the approach adopted in this work.
Finally, we were interested in relating the degree of familiarity to the situational appropriateness of specific contextual attributes. As is customary when modeling binary dependent variables, we used logistic regression for each of the attributes:
$$\operatorname{logit}(p_i) = a + b\,x_i, \quad i = 1, \dots, n$$
where logit p (which is equal to log[p/(1−p)]) is the link function that transforms the linear regression output into a form suitable for probability estimation, a is the intercept, b is the regression coefficient, $x_i$ is the familiarity rating, and n is the number of consumers.
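To make the count models discussed above more concrete, the sketch below evaluates a Poisson log-likelihood and runs a simple marginal check for overdispersion. Two assumptions are made for illustration only: the linear predictor is written with an explicit intercept, and the dispersion estimate uses the common NB2 variance function Var(Y) = μ + αμ² with a method-of-moments estimate that ignores covariates. Neither function reproduces the R fit used in the paper.

```python
import math


def poisson_loglik(beta, x, y):
    """Poisson log-likelihood with ln(lambda_i) = beta[0] + beta[1] * x_i."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = beta[0] + beta[1] * xi        # linear predictor
        lam = math.exp(eta)                 # expected count
        total += -lam + yi * eta - math.lgamma(yi + 1)   # ln(y!) = lgamma(y + 1)
    return total


def dispersion_summary(counts):
    """Return (mean, sample variance, alpha) for a count variable.

    alpha is a method-of-moments estimate of the NB2 dispersion parameter;
    it is 0 when the variance does not exceed the mean (Poisson-like data).
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    alpha = max(0.0, (var - mean) / mean ** 2) if mean > 0 else 0.0
    return mean, var, alpha
```

A variance well above the mean, as in counts with many zeros and a long right tail, is the pattern that motivates moving from the Poisson to the negative binomial model.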
Logistic regression was used to estimate the probability of an attribute being checked given the product familiarity, by looking at the odds ratios (i.e. the exponential function of the b coefficient of the respective logistic regression model). Odds ratios, in the context of this research, can be interpreted as follows:
O.R. = 1: Familiarity is unrelated to the odds of a given context being selected
O.R. > 1: Familiarity is associated with higher odds of a given context being selected
O.R. < 1: Familiarity is associated with lower odds of a given context being selected
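The logistic model and the odds ratio interpretation listed above can be sketched with a small Newton-Raphson fit for one predictor. This is a minimal illustration under stated assumptions: the function name is hypothetical, the data must not be perfectly separable, and no standard errors or significance tests are computed.

```python
import math


def fit_logistic(x, y, n_iter=25):
    """Fit logit(p) = a + b*x by Newton-Raphson; return (a, b, odds_ratio)."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        # Gradient (g_*) and information matrix (h_*) of the log-likelihood.
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g_a += yi - p
            g_b += (yi - p) * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b, math.exp(b)
```

Here `x` would hold the familiarity ratings and `y` the 0/1 codes for one contextual attribute; the returned odds ratio is simply the exponential of the slope b.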
2.4.4. Supplementary analyses
For all studies, the effect of selected consumer background variables (gender, age, consideration set size and consumption frequency of beer) on familiarity ratings and usage of contextual attributes was examined by ANOVA and chi-square tests respectively.
3. Results
3.1. Product-context associations (RQ1)
Correspondence analysis applied to the cross-tabulation matrices (containing situational appropriateness judgments) was used to address the first aim of the study, i.e. to illustrate underlying cognitive associations between beers and usage contexts (Figures 4A-B-C-D).
Fig. 4A – First two dimensions of the CA plot (Study 1).
Fig. 4B – First two dimensions of the CA plot (Study 2A).
Fig. 4C – First two dimensions of the CA plot (Study 2B).
Fig. 4D – First two dimensions of the CA plot (Study 3).
Scree plots of the eigenvalues suggested that a two-dimensional solution was optimal in all cases. All models achieved a very high cumulative retention of the original variance within the first two CA dimensions (Study 1: 93%, Study 2A: 92.7%, Study 2B: 83%, Study 3: 93.5%). This indicates a strong underlying variance structure, that is, clear associations of different beers with different contexts. Accordingly, Cochran’s Q test uncovered significant differences between the beers for all the contexts (p < 0.05), with the exception of one single attribute in Study 2B (detailed results in the appendix). Visual inspection of the CA plots showed that, in all studies, the beers were quite clearly ranked by familiarity along the first CA dimension, with the high familiarity beer cluster (the “H” beers) in the left quadrants, the moderate familiarity cluster (the “M” beers) towards the center of the plot, and the low familiarity cluster (the “L” beers) in the right quadrants. This is confirmed by the vector direction of the supplementary variable Product familiarity. Familiar beers (left quadrants) tended to be very strongly associated with sport events, possibly as a result of the respective breweries’ active involvement in major New Zealand sports teams (e.g. Steinlager’s long-standing sponsorship of the All Blacks Rugby Union team). Familiar beers were also considered highly appropriate for usages such as at parties, at concerts, while watching TV and for camping trips. The context anytime, included in two studies, was also associated with this cluster (cf. Fig. 4B and 4C). In the right quadrants, unfamiliar beers tended to be associated with contexts such as to impress someone, for special occasions, as an alternative to wine, for restaurant dining, and for women. Results across studies are summarized in Table 3.
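The share of variance retained by the first CA dimensions, as reported above, can be illustrated with a dependency-free sketch. It assumes a beers-by-contexts frequency table with no empty rows or columns and at least some association, and it finds the leading principal inertias by power iteration with deflation, which is a simple stand-in for the singular value decomposition normally used. The function name is hypothetical, and supplementary variables are not handled.

```python
import math


def ca_inertia_shares(table, n_dims=2, n_iter=500):
    """Return (total inertia, shares of the first `n_dims` CA dimensions).

    `table[i][j]` is the number of consumers who ticked context j for beer i.
    Total inertia equals the chi-square statistic divided by the grand total.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table)
    r = [sum(row) / grand for row in table]                      # row masses
    c = [sum(table[i][j] for i in range(n_rows)) / grand for j in range(n_cols)]
    # Standardized residuals S and the cross-product matrix M = S^T S.
    s = [[(table[i][j] / grand - r[i] * c[j]) / math.sqrt(r[i] * c[j])
          for j in range(n_cols)] for i in range(n_rows)]
    m = [[sum(s[i][a] * s[i][b] for i in range(n_rows)) for b in range(n_cols)]
         for a in range(n_cols)]
    total = sum(m[a][a] for a in range(n_cols))
    shares = []
    for _dim in range(n_dims):
        v = [1.0 / (a + 1) for a in range(n_cols)]               # start vector
        lam = 0.0
        for _step in range(n_iter):
            w = [sum(m[a][b] * v[b] for b in range(n_cols)) for a in range(n_cols)]
            norm = math.sqrt(sum(val * val for val in w))
            if norm == 0.0:
                lam = 0.0
                break
            v = [val / norm for val in w]
            lam = norm                                           # eigenvalue estimate
        shares.append(lam / total)
        # Deflate: remove the dimension just found before the next pass.
        m = [[m[a][b] - lam * v[a] * v[b] for b in range(n_cols)]
             for a in range(n_cols)]
    return total, shares
```

Summing the first two shares gives the kind of cumulative percentage quoted for each study; power iteration can converge slowly when two principal inertias are nearly equal, in which case a proper SVD routine is preferable.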
| Study | Dimension (explained variance) | Positive | Negative | Prod. familiarity |
|---|---|---|---|---|
| Study 1 | Dimension 1 (85.6%) | As a gift (0.77); Fine dining (0.75); Treat (0.57); Achievement (0.56); Guests (0.33) | Rugby match (-0.53); Thirst-quenching (-0.32); Camping/fishing (-0.31); TV (-0.23); Concert (-0.15) | -0.18 |
| Study 1 | Dimension 2 (7.4%) | Rugby match (0.27); As a gift (0.17); Fine dining (0.14); Achievement (0.08); Camping/fishing (0.06) | Work (-0.13); BBQ (-0.06); With a snack (-0.05); Treat (-0.03); Party (-0.03) | 0.12 |
| Study 2A | Dimension 1 (75.4%) | Impress (0.63); Different (0.58); Special occasion (0.54); Altern. to wine (0.27); Women (0.34) | Sport event (-0.56); Pub (-0.28); At parties (-0.16); Alone (-0.14); Anytime (-0.14) | -0.27 |
| Study 2A | Dimension 2 (17.3%) | Refreshing (0.24); Casual dining (0.22); Anytime (0.21); Women (0.15); Dinner (0.15) | Different (-0.36); Alone (-0.11); Impress (-0.17); Pub (-0.09); Sport event (-0.09) | 0.05 |
| Study 2B | Dimension 1 (63.4%) | Impress (0.77); Different (0.52); Special occasion (0.51); Women (0.23); Altern. to wine (0.09) | Sport event (-0.22); Anytime (-0.17); At home (-0.08); Pub (-0.1); Dinner (-0.07) | -0.27 |
| Study 2B | Dimension 2 (19.9%) | Women (0.61); Refreshing (0.17); Different (0.06); Altern. to wine (0.04); Casual dining (0.03) | Impress (-0.34); Special occasion (-0.19); Alone (-0.08); Pub (-0.07); Relax (-0.03) | -0.09 |
| Study 3 | Dimension 1 (68.7%) | Altern. to wine (0.25); Casual dining (0.22); Women (0.16); Pub (0.13) | Rugby fans (-0.30); Camping trip (-0.15); Rugby on TV (-0.14); On a hammock (-0.01) | -0.21 |
| Study 3 | Dimension 2 (24.8%) | Women (0.25); Summer BBQ (0.08); Camping trip (0.05); On a hammock (0.01) | Pub (-0.16); Rugby on TV (-0.04); Rugby fans (-0.04); Casual dining (-0.04) | -0.16 |
Tab. 3 – Rank order and factor scores for the five highest weighted contexts across all studies. The correlation of the product familiarity vector with the CA dimension is also reported.
Elements of consistency across the studies were evident and suggested an inherent robustness of these product-context associations. Inspection of Table 3, for example, suggests a high degree of similarity in the factorial structure among all studies, as can be quite readily assessed by comparison of Studies 2A and 2B. Further, in both of these studies, two products (H1 and H4) were purposefully
included to enable a diagnostic check. As the exact same ballot was used, Fisher’s exact test was carried out as a straightforward quantitative assessment of whether the frequency of use for each term differed across ballot versions. The results strongly suggested the robustness of these judgments: out of 30 comparisons, the only significant difference between the two populations was observed for product H4 and the attribute As an alternative to wine (p = 0.034). Some notable differences across studies emerged as well, mostly pertaining to the relative contribution of the second CA dimension. While the first dimension roughly described variation in familiarity, the second dimension singled out specific product-context associations that related to various extrinsic factors. In two studies (2B and 3), the second dimension highlighted the usage for women. In Study 2B the direction was driven by product L7, whose design resembles a soft drink. In Study 3 this was due to product M2, a dark lager called “Black beer”, being rated as significantly less appropriate for women than the rest, possibly because of culturally determined associations of black with masculinity (e.g. Ellis & Ficek, 2001). Further, different studies showed different degrees of within-cluster (H, M, L) heterogeneity. For instance, in Study 1 product L2, which indeed had a critical influence in the model⁵, significantly differed from the rest of the novel cluster (L1 and L3) for 11 out of 15 contextual attributes, including two out of the four attributes that significantly correlated with the first dimension (Gift and Fine Dining, p < 0.05). In Study 2A, H1 and H4 were quite different compared to H5.
Cochran’s Q test revealed that the former two differed significantly (p < 0.05) from each other in only four of the 15 attributes, whereas both differed from H5 in at least eight attributes, pointing to the importance of brand (H1 and H4 were the two Steinlager® beers) in situational appropriateness judgments. The usage context thirst-quenching, generically associated with the familiar lager beers in Study 1, was found in the following studies to be more closely related to the color of the beer bottle: in particular, all studies showed that beers in green bottles were perceived as more refreshing/thirst-quenching than beers in brown bottles within their respective clusters (cf. H1 vs. H3 and H3 in Study 1, H1 and H4 versus H5 in Study 2A, H1 and H4 vs. H6 in Study 2B, and so on. A complete overview of product comparisons is available in the appendix). This is in line with the general association of green color with the expected refreshing character of beverages (Zellner & Durlach, 2003), as well as previous beer-specific findings indicating that the more brown a beer is, the less thirst-quenching it is perceived to be (Guinard, Souchard, Picot, Rogeaux, & Sieffermann, 1998). However, additional cues contributed: for example, samples H4 and M6 (both of which had the word “pure” in the name) scored significantly higher than all other beers on the attribute refreshing.
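The Fisher's exact test used for the ballot-version check described above can be sketched for a single 2×2 table, for instance ballot version (Study 2A vs. 2B) by ticked / not ticked for one product and one attribute. The function name is hypothetical, and the two-sided p-value follows the usual convention of summing all tables that are no more probable than the observed one.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
```

With 30 such comparisons at the 0.05 level, one or two nominally significant results would be expected by chance alone, which is consistent with reading the single significant comparison as evidence of robustness.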
⁵ Nevertheless, recalculating the model omitting L2 showed a much clearer contrast between products L1, M2 and L3 and the familiar cluster, but without much change in the factor structure.
3.2. Relationships between situational appropriateness and product familiarity (RQ2)
The second part of this research looked more specifically at the relationships between familiarity and appropriateness. First, we considered the link between familiarity and versatility, operationalized as the total number of checked contexts per beer. On average, consumers checked 5.3 (±4) usage contexts in Study 1, 4.9 (±3.6) in Study 2A, 4.9 (±3.5) in Study 2B, and 5.4 (±2.3) in Study 3. Across all studies, negative binomial regression revealed that product familiarity significantly predicted the number of contexts perceived as appropriate (Study 1: b = 0.18, z (683) = 10.6, p < 0.001; Study 2A: b = 0.17, z (872) = 10.6, p < 0.001; Study 2B: b = 0.12, z (835) = 7.4, p < 0.001; Study 3: b = 0.06, z (1304) = 8.8, p < 0.001). These results support the hypothesis that familiarity is related to usage versatility for beers. However, it should be noted that familiarity only explained a limited proportion of the variance in the number of appropriate contexts (Maximum Likelihood Pseudo R2 (Study 1) = 0.14; R2 (Study 2A) = 0.11; R2 (Study 2B) = 0.07; R2 (Study 3) = 0.06). Relationships between a product’s familiarity and its appropriateness in specific situations were studied by logistic regression; results from all studies are reported below (Table 4). Recall that this analysis essentially models the probability that a consumer would select any given context as appropriate for each value of product familiarity. Inspection of Table 4 confirms the patterns observed so far, showing that familiar beers are more likely to be perceived as appropriate in many situations (11 out of 15 contexts in Study 1, 12 out of 15 in Studies 2A and 2B, and 8 out of 9 in Study 3).
Conversely, unfamiliar beers are (slightly) more likely to be perceived as appropriate in a few situations (At a fine dining restaurant, As a Gift, To impress someone, When I want something different, and For a special occasion), possibly indicating that those situations might trigger variety-seeking behavior and a desire for novelty.
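Two quantities used in this section can be sketched briefly: the versatility count (number of contexts ticked per consumer and beer) and a likelihood-based pseudo R². The second function assumes that "Maximum Likelihood Pseudo R2" refers to the measure often attributed to Cox and Snell (also called Maddala's R²), computed from the log-likelihoods of the null and fitted models; the paper does not spell out the formula, so this is an assumption, and the function names are hypothetical.

```python
import math


def versatility(cata_row):
    """Number of contexts ticked by one consumer for one beer (0/1 codes)."""
    return sum(cata_row)


def ml_pseudo_r2(loglik_null, loglik_model, n_obs):
    """Maximum likelihood pseudo R-squared: 1 - exp(-2 * (LL1 - LL0) / n)."""
    return 1.0 - math.exp(-2.0 * (loglik_model - loglik_null) / n_obs)
```

The measure is 0 when the fitted model does not improve on the intercept-only model and grows with the per-observation gain in log-likelihood, which fits the reading that familiarity is a significant but modest predictor.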
Effect of product familiarity on perceived appropriateness for each context: odds ratios (O.R.) from logistic regression

| Study 1 (N = 76): outcome variable | O.R. | Studies 2A and 2B: outcome variable | O.R. Study 2A (N = 97) | O.R. Study 2B (N = 93) | Study 3 (N = 145): outcome variable | O.R. |
|---|---|---|---|---|---|---|
| Thirst-Quench. | 1.63 | Pub | 1.65 | 1.32 | Women | 1.10 |
| Snack | 1.51 | Impress | 0.86 | 0.79 | Camping Trip | 1.33 |
| Fine Dining | 0.98 | Sport Event | 1.94 | 1.60 | Rugby on TV | 1.45 |
| Public House | 1.22 | Refreshing | 1.40 | 1.18 | Rugby Fans | 1.55 |
| Guests | 1.16 | Women | 1.20 | 1.17 | Altern. to wine | 1.10 |
| Party | 1.40 | Dinner | 1.27 | 1.36 | Hammock | 1.18 |
| Work | 1.44 | Parties | 1.61 | 1.40 | Summer BBQ | 1.18 |
| Treat | 1.08 | Casual Dining | 1.32 | 1.16 | Pub | 1.10 |
| BBQ | 1.40 | Alone | 1.24 | 1.39 | Casual dining | 1.00 |
| Achievement | 1.04 | Different | 0.78 | 0.77 | | |
| Gift | 0.94 | Anytime | 1.66 | 1.42 | | |
| Rugby | 1.72 | Home | 1.46 | 1.36 | | |
| TV | 1.51 | Relax | 1.35 | 1.37 | | |
| Camping / fish. | 1.63 | Altern. to Wine | 1.17 | 1.10 | | |
| Concert | 1.49 | Special occasion | 0.99 | 0.94 | | |
Tab. 4 – Logistic regression results (all studies) for individual contextual attributes, reported in terms of odds ratios (O.R. are obtained by raising e to the power of the logistic regression coefficient). For example, the value of O.R. for thirst-quenching in Study 1 can be interpreted as “for every one-point increase in familiarity, the odds of that product being appropriate as a thirst quencher are multiplied by e^0.49 = 1.63”.
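The arithmetic behind the caption's example can be made explicit. The sketch below converts a logistic regression coefficient into an odds ratio and shows how a baseline probability shifts when the odds are multiplied by it; the coefficient 0.49 is the value quoted in the caption, while the baseline probability of 0.50 is a hypothetical number chosen only for illustration.

```python
import math


def odds_ratio(b, delta=1.0):
    """Multiplicative change in odds for a `delta`-point change in familiarity."""
    return math.exp(b * delta)


def shift_probability(p, b, delta=1.0):
    """Probability obtained after multiplying the odds of p by the odds ratio."""
    odds = p / (1.0 - p) * odds_ratio(b, delta)
    return odds / (1.0 + odds)


# Coefficient quoted in the caption for thirst-quenching (Study 1).
print(round(odds_ratio(0.49), 2))               # 1.63
# Hypothetical baseline: a context ticked with probability 0.50.
print(round(shift_probability(0.50, 0.49), 2))  # 0.62
```

Because odds ratios act multiplicatively on the odds rather than additively on the probability, the same O.R. implies a smaller change in probability when the baseline is already close to 0 or 1.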
4. General discussion
4.1. Product context-associations and consumers’ evaluation of beers
The first aim of this work was the application of the situational appropriateness framework to explore consumers’ perceived uses of beers. Results of all studies indicate that consumers perceived the beers as significantly different in perceived appropriateness across different usage contexts, indicating that beer choice and consumption are highly context-dependent. Interestingly, similar beers were not always clustered together. Results indicated that extrinsic features (including a beer’s name, brand and design) may alter the situational appropriateness of beers, by enhancing the salience of specific context-based features (such as the thirst-quenching character), and provide opportunities for product differentiation. An interesting result was that very different beers have the potential to deliver on the same situational contexts, as demonstrated by the fact that patterns of beer-contexts associations in Study 2B were very similar to the other studies.
Importantly, these patterns appeared to be stable and robust (viz. repeatable across studies). This result is important for two reasons. The first one is methodological: it indicates that the situational appropriateness framework is a workable approach for eliciting substantive product-context associations from consumers. Importantly, inter-product differences in appropriateness were often remarkably large (cf. effect sizes in Cochran’s Q Test), confirming that usage contexts are at least as important as intrinsic product factors (e.g. sensory quality) in determining people’s choice of food and beverages (Hersleth, Mevik, Næs, & Guinard, 2003). The second reason is practical: it means that these associations are quite widely held by consumers (recall that a between-subject design was employed) and thus highly relevant from a marketer’s perspective, as they can be expected to simplify consumers’ information processing and direct decision-making activities in the marketplace. Situational appropriateness is indeed a predictor of food choice (Kramer, Rock, and Engell, 1992; Hersleth et al., 2003) and, accordingly, understanding the patterns of perceived appropriateness between product variants is important to producers as it relates to how consumers form consideration sets. A few methodological points should be raised. For example, it would be interesting to explore whether the same results would have been obtained with a different research paradigm. In this respect, it is relevant to observe that, unlike similar studies which emphasize the idiographic nature of context by deriving consumer-based context categories (usually by means of qualitative methods, as in e.g. Blake et al., 2007; Bisogni et al., 2007; Jaeger et al., 2009; Sester et al., 2013), the situational appropriateness technique emphasizes the nomothetic approach (Raats & Shepherd, 1992; Schutz, 1994), viz. a constant set of attributes evaluated by all consumers.
The obvious trade-off involves a heightened risk of overlooking potentially relevant usage contexts in the ballot design phase (although this risk seems limited in the present research, due to pre-testing of the ballot items). Furthermore, the nature of the situational appropriateness framework facilitates a stimulus-driven (bottom-up) judgment. A context-driven (top-down) approach (such as in Hein et al., 2010) may have increased the salience of particular contextual features, perhaps leading to a different product separation. Finally, the situational appropriateness framework is a tool for context-based segmentation (Dubow, 1992). However, appropriateness judgments arise from the interaction between people, products, and context. Accordingly, relevant consumer-related characteristics (demographic, psychographic and behavioral) are likely to exert independent effects on appropriateness judgments⁶. Ultimately, all levels of analysis are necessary for an accurate segmentation.
4.2. The role of product familiarity on perceived appropriateness for beers
The main focus of this research was the importance of familiarity in structuring product-context associations. An important opposition between familiar
⁶ Significant differences between consumer sub-groups (for the factors listed in Tab. 1) in perceived appropriateness for specific factors were observed across all studies. The data are not shown here, as subgroup analysis is outside the scope of this paper.
and novel beers arose in all studies, highlighting both quantitative and qualitative differences between these two groups. “Quantitative differences” refer to the small but highly significant effect of familiarity on usage versatility that was consistently observed across all studies. Consumers perceived familiar beers to be appropriate in most usage contexts and probably less context-dependent overall (see e.g. the use of the attribute “anytime” in Studies 2A and 2B), while unfamiliar beers were tied to fewer, more specific usages. This suggests that consumers may prefer familiar beers in most situations, or when they make “generic” food provisioning decisions (e.g. stocking up on beers with no specific usage occasion in mind). Conversely, they may choose an unfamiliar beer as a response to the constraints of a more specific usage (e.g. to make a gift, as an alternative to wine for dinner, etc.). In both cases, the final judgment about a beer’s situational appropriateness depends on the degree to which that beer seems to fulfill the goals associated with particular usage contexts. This view is consistent with most motivational perspectives on consumer decision making (see e.g. Gutman, 1982; Olson, 1989). The reasons behind the observed pattern (viz., familiar beers are versatile, unfamiliar beers are associated with specific situations) remain unclear. A possible explanation is that the cognitive processes underlying appropriateness judgments are slightly different for familiar and unfamiliar products. For a familiar product, appropriateness judgments may be more likely driven by a controlled search of memory for past experiences with the product (or with others perceived to be substitutes), followed by retrieval of situations associated with the consumption of this product, arguably also overlapping with those appropriate for the overall category.
Conversely, for unfamiliar products it is the stimulus directly (i.e., not mediated by a controlled search of memory) that acts as a cue towards potential usages. In this case, the features that catch most attention are the ones that will be most relevant for the perceived situational appropriateness (e.g. it resembles a soft drink = it’s a beer for women, etc.). “Qualitative differences” were also observed pertaining to the beers-contexts associations. While familiar beers were associated mostly with serving physiological needs (e.g. as a thirst-quencher), specific locations (e.g. at a rugby match, at a concert) and occasions (e.g. at a party, at a camping trip), usages associated with unfamiliar beers referred to feelings and mental processes (e.g. to celebrate an achievement, as a treat for myself), or to signaling social status (e.g. at a fine dining restaurant, to serve to guests). Tentatively, consumers associated an atypical appearance with prestige and exclusiveness. These results would fit well with past research into other real-life domains showing that consumers associate a novel/unfamiliar design of beers with superior product quality (Creusen & Schoormans, 2005; Mugge & Schoormans, 2012a, 2012b). An unfamiliar appearance may also be advisable when a beer must be differentiated from other products in a category with many competing alternatives (for example, commercial lagers), as the case of product L8 demonstrated. Nevertheless, the results also advise caution. In particular, the limited size of the odds ratios suggests that familiarity is not in itself (that is, without interacting with other product features) a strong predictor of situational appropriateness for beers.
Working paper
5. Conclusions
This research has investigated situational appropriateness for beers with varying degrees of familiarity. Taken together, the results of four studies present consistent, converging evidence that the perceived appropriateness of beers differs widely across usage situations, and that these associations can be reliably explored within the situational appropriateness framework. In this respect, this research can offer guidance to both practitioners and marketers about which situations may lead consumers to opt for different beers and, methodologically, about how such relationships can be elicited and analyzed. Further, the results indicated that familiarity stands in a positive relation to usage versatility and affects perceived appropriateness in specific contexts, but at the same time suggest that familiarity per se is but one factor – and likely not one of the most salient ones – in determining the appropriateness of beers.
Acknowledgements Support for this work was provided by the Danish Agency for Science, Technology and Innovation (through the consortium “Dansk Mikrobryg – Produkt Innovation og Kvalitet”) and by the Faculty of Science, University of Copenhagen. The staff at the New Zealand Institute for Plant and Food Research is warmly thanked for help with piloting of the ballot and data collection. Yilin Jia, also from Plant and Food Research, is thanked for valuable discussions and inputs to the data analysis.
Appendix A. Supplementary data. Frequency tables showing the occurrence of each contextual attribute for each of the products in all studies. The last two rows report the column totals and the test statistics for Cochran’s Q test (CQT). Different superscript letters indicate significant differences between beers (p ≤ 0.05).
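As an illustration of the statistic reported in the last row of each table, the following is a minimal sketch of Cochran's Q for one contextual attribute, assuming the raw data are available as a consumers × beers table of 0/1 check marks. The function name `cochran_q` and the small `checks` table are hypothetical and are not taken from the studies; the sketch is not the authors' analysis script.

```python
def cochran_q(responses):
    """Cochran's Q for a consumers x products table of 0/1 check marks.

    Returns the statistic and its degrees of freedom (products - 1).
    """
    n_products = len(responses[0])
    # How many consumers checked the attribute for each product.
    column_totals = [sum(row[j] for row in responses) for j in range(n_products)]
    # How many products each consumer checked the attribute for.
    row_totals = [sum(row) for row in responses]
    grand_total = sum(row_totals)

    numerator = (n_products - 1) * (
        n_products * sum(c * c for c in column_totals) - grand_total ** 2
    )
    denominator = n_products * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        # Happens when every consumer checked all products or none of them.
        raise ValueError("Cochran's Q is undefined for this table")
    return numerator / denominator, n_products - 1


# Hypothetical data: 6 consumers (rows) x 3 beers (columns), 1 = attribute checked.
checks = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 0],
]
q_value, degrees_of_freedom = cochran_q(checks)
print(q_value, degrees_of_freedom)
```

Under the null hypothesis that the attribute is checked equally often for all products, Q is compared with a chi-square distribution with (number of products − 1) degrees of freedom.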
D. Giacalone et al.- Situational appropriateness and consumers’ use patterns for beers
Camping/ fishing
Concert
Row Total
53 a
47 a
15 b,c
57 a
11 b,c
12 c
55 a
43 a
51 a
31 a
542
H2
36 b
32 a
6d
52 a
22 b
53 a
39 a
9 c,d
60 a
9 c,d
5d
59 a
40 a
57 a
29 a
508
H3
29 b
19 b
2e
40 b
6c
45a,b
29 b
5d
41 b,c
4d
4d
57 a
29 b
45 b
19 b
374
M1
33 b
23 a,b
10 d
50 a
33 a
51 a
44 a
21 b
57 a
10 b,c
16 b,c
24 b
30 b
37 b,c
24 a,b
463
M2
12
c
a,b
b
a
a
b
b
b
c
a,b
b,c
b
b,c
371
M3
27 b
26 a
18 b,c
53 a
33 a
43a,b
37 a
27 a
51 a,b
14 a,b
13 c
17 b
22b,c,d
32 c
17 b,c
430
L1
15 b
18 b
19 b,c
36 b
29 a
32 b
25 b
27 a
34 c
18 a,b
23 b
10 c
18 c,d
19 d
16 c
339
L2
4c
10 c
33 a
33 b
31 a
16 c
9c
32 a
21 d
22 a
37 a
4d
7e
10 e
7d
276
L3
13 b
13 a
14 b,c,d
38 b
29 a
33 a
24 b
25 a,b
35 c
17 a,b
24 a
15 b
15 d
18 d,e
15 c
328
Column Total CQT T value
216
194
135
401
248
359
279
180
393
123
153
257
229
289
175
116.5
41.8
65.7
42
43.3
79
76.3
54.6
87.2
28.7
75.4
237
84.9
137
39.8
22
21
57
30
25
19
BBQ
Treat
Party 33
37
18
19
16
TV
35 a
Rugby
42 b
Gift
12 c,d,e
Achiev.
31 a
Work
Public House
47 a
Guests
Fine Dining
H1
Thirstquench
Snack
Frequency of use of contextual attributes (counts) for Study 1 (N = 76)
25
b,c
20
d
17
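Frequency of use of contextual attributes (counts) for Study 2A (N = 97)
Beer | Pub | Impress | Sport event | Refresh. | Women | Dinner | Parties | Casual dining | Alone | Different | Anytime | Home | Relax | Alt. Wine | Special occasion | Row Total
H1 | 72 | 10 | 74 | 40 | 13 | 45 | 74 | 56 | 28 | 15 | 40 | 60 | 39 | 25 | 20 | 611
H4 | 77 | 18 | 65 | 57 | 25 | 48 | 72 | 63 | 26 | 17 | 49 | 63 | 45 | 42 | 27 | 694
H5 | 82 | 1 | 81 | 25 | 8 | 24 | 77 | 28 | 31 | 9 | 28 | 59 | 43 | 19 | 3 | 518
M1 | 67 | 15 | 28 | 42 | 22 | 38 | 57 | 52 | 25 | 48 | 27 | 51 | 42 | 33 | 21 | 568
M4 | 50 | 21 | 26 | 26 | 19 | 34 | 45 | 43 | 20 | 45 | 20 | 40 | 28 | 33 | 32 | 482
M5 | 75 | 1 | 75 | 6 | 3 | 10 | 48 | 5 | 23 | 8 | 15 | 40 | 18 | 8 | 1 | 336
L1 | 32 | 21 | 15 | 23 | 8 | 26 | 34 | 28 | 16 | 45 | 13 | 26 | 23 | 26 | 23 | 359
L4 | 36 | 15 | 28 | 14 | 5 | 13 | 42 | 16 | 15 | 40 | 12 | 28 | 18 | 16 | 15 | 313
L5 | 25 | 23 | 8 | 17 | 21 | 28 | 29 | 32 | 16 | 45 | 10 | 29 | 22 | 28 | 27 | 360
Column Total | 516 | 125 | 400 | 250 | 124 | 266 | 478 | 323 | 200 | 272 | 214 | 396 | 278 | 230 | 169 |
CQT T value | | | | | | | | | | | | | | | |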
Impress
Sport event
Refresh.
Women
Dinner
Parties
Casual dining
Alone
Different
Anytime
Home
Relax
Alt. Wine
Special occasion
Row Total
H1
Pub
Frequency of use of contextual attributes (counts) for Study 2B (N = 93)
68 a
4c
62 a,b
36 b,c
6 c,d
31 b
68 a
45a,b
24
10 d
39 a
55 a,b
37 a,b
23 a,b
8d
516
a,b
11
b
a
a
26
25
c
H6
29 e
27
7d
31a,b,c
57 a,b
33 b
14 d,e
3e
449
M6
48 c,d
4c
43 c
44 b
7 c,d
27 b,c
56 b,c
43a,b,c
21
24 c
27b,c,d
45 b,c
32 b
21 a,b
10 c,d
452
M7
42 d,e
8 b,c
35 c,d
17 e
2e
18 d,e
49 c,d
30 d,e
20
30 c
17 e
41 c
21 c
10 e
18 a,b,c
358
M8
57 b,c
4c
55 b
35b,c,d
6 c,d,e
26b,c,d
51b,c,d
34c,d,e
23
25 c
22 d,e
52 b
28 b,c
13 d,e
6 d,e
437
L6
51 c,d
10b,c
44 c
31 c,d
12 b,c
25b,c,d
54 b,c
40b,c,d
19
34 b,c
26 c,d
40 c
23 c
15 c,d
18 a,b,c
442
c,d
d,e
a,b
c
c
b,c,d
b,c,d
387
37
22
16
47
30
16
38
22
26
620
61 a,b
e
43
20 a,b
25b,c,d
31
64
a
5 d,e
6
36
a
24 d,e
e
53
a
69 a
a
70
a,b
0d
b,c
43
a
64 a,b
d,e
16
a,b
66
b,c
59
a
H4
e
62
a,b
L7
34
16
43
L8
44 d,e
24 a
25 e
27 d
5d
20c,d,e
42 d
29 e
21
47 a
16 e
37 c
28 b,c
18
22a,b,c
11
26 a
Column Total CQT T value
474
71
426
310
81
231
498
333
197
245
230
429
267
162
120
76.4
66
114.9
85.4
63.5
46
48.4
41
13N.S.
91.9
50.3
51.8
34.4
31.4
54.1
At a pub
Casual dining
Row Total
68 a,b
121 a
132 a
126 b
76 a
125 a
133 a
122 b
104 a
1007
H5
39 d
124 a
127 a
130 a,b
28 c
109 b
123 a
98 c,d
52 d
830
H6
48 c,d
124 a
133 a
137 a
31 c
107 b
132 a
104 c
57 d
d,e
b
b,c
a
b
a
74 c
55 d
50 b
70 e
70 c
135 a
83 b,c
856
M8
61 a,b
101 b
114 b
97 c
54 b
100b,c
128 a
95 c,d
83 b,c
713
e
b
b,c
786
55
83
108
b
124
90
d
93
65
L6
72 a
93 b,c
105 b
91 c
57 b
107 b
126 a
108 c
97 a
600
L8
59 b,c
78 d
76 c
53 d
52 b
92 c,d
124 a
100c,d
79 c
833
CQT T value
42
d
125
L1
Column Total
61
98
Summer BBQ
Alt. Wine
M2
48 f
66
57
653
15 e
d
52
873
a,b
74
e
81
c
M1
a,b
82
c,d
Hammock
Rugby on TV
H1
Women
Camping trip
Rugby fans
Frequency of use of contextual attributes (counts) for Study 3 (N = 145)
83
501
837
903
783
460
891
1069
976
731
102.7
225
242.2
385.2
67
84.1
163.3
86.3
85
242
413
References
Allison, R. I., & Uhl, K. P. (1964). Influence of beer brand identification on taste perception. Journal of Marketing Research, 1, 36-39. Ares, G., & Jaeger, S. R. (2013). Check-all-that-apply questions: Influence of attribute order on sensory product characterization. Food Quality and Preference, 28, 141-153. Ariely, D., & Levav, J. (2000). Sequential choice in group settings: Taking the road less traveled and less enjoyed. Journal of Consumer Research, 27, 279-290. Belk, R. W. (1974). An exploratory assessment of situational effects in buyer behavior. Journal of Marketing Research, 11, 156-163. Belk, R. W. (1975). Situational variables and consumer behavior. Journal of Consumer Research, 2, 157-164. Bell, R., Meiselman, H. L., Pierson, B. J., & Reeve, W. G. (1994). Effects of adding an Italian theme to a restaurant on the perceived ethnicity, acceptability and selection of food. Appetite, 22, 11-24. Bell, R., & Meiselman, H. L. (1995). The role of eating environment in determining food choice. In D. Marshall (Ed.), Food choice and the consumer. Pp. 292-310. Glasgow: Blackie Academic and Professional. Bisogni, C. A., Winter Falk, L., Madore, E., Blake, C. E., Jastran, M., Sobal, J., Devine, C. M. (2007). Dimensions of everyday eating and drinking episodes. Appetite, 48, 218-231. Blake, C. E, Bisogni, C. A., Sobal, J., Devine, C. M., Jastran, M. (2007). Classifying foods in contexts: How adults categorize foods for different eating settings. Appetite, 49, 500-510. Bruhn, C. M., & Schutz, H. G. (1986). Consumer perceptions of dairy and related-use foods. Food Technology, 40, 33-51. Bruzzone, F., Ares, G., & Gimenez, A. (2012). Consumers' texture perception of milk desserts. II comparison with trained assessors data. Journal of Texture Studies, 43, 214-226. Caporale, G., & Monteleone, E. (2004). Influence of information about manufacturing process on beer acceptability. Food Quality and Preference, 15, 271-278. Cardello, A. V., & Schutz, H. G. (1996). 
Food appropriateness measures as an adjunct to consumer preference/acceptability evaluation. Food Quality and Preference, 7, 239-249. Carroll, Joanne (2011). Beer hops off buyers' lists. The New Zealand Herald, March 20th 2011. Retrieved October 8th, 2012 at www.nzherald.co.nz/nz/news/article.cfm?c_id=1&objectid=10713655 Coquillat, C., Galia, F., Sirot, B., Sonier, P., Sutan, A., & Valentin, D. (2009). Drinking beer in consonant and dissonant contexts: An experimental investigation. Paper presented at Beeronomics the Economics of Beer and Brewing, Leuven, Belgium, May 27-30, 2009.
Creusen, M. E. H., & Schoormans, J. P. L. (2005). The different roles of product appearance in consumer choice. Journal of Product Innovation Management, 22, 63-81. de Castro, J. M. (1991). Social facilitation of the spontaneous meal size of humans occurs on both weekdays and weekends. Physiology & Behavior, 49, 1289-1291. de Graaf, C., Cardello, A. V., Kramer, F. M., Lesher, L. L., Meiselman, H. L., & Schutz, H. G. (2005). A comparison between liking ratings obtained under laboratory and field conditions: The role of choice. Appetite, 44, 15-22. Donadini, G., & Fumi, M. D. (2010). Sensory mapping of beers on sale in the Italian market. Journal of Sensory Studies, 25, 19-49. Dubow, J. S. (1992). Occasion-based vs. user-based benefit segmentation: A case study. Journal of Advertising Research, 32, 11-18. Edwards, J. S. A., Meiselman, H. L., Edwards, A., & Lesher, L. L. (2003). The influence of eating location on the acceptability of identically prepared foods. Food Quality and Preference, 14, 647-652. Eindhoven, J., & Peryam, D. R. (1959). Measurement of preferences for food combinations. Food Technology, 13, 379-382. Ellis, L., & Ficek, C. (2001). Color preferences according to gender and sexual orientation. Personality and Individual Differences, 31, 1375-1379. Fenko, A., Schifferstein, H. N. J., & Hekkert, P. (2010). Shifts in sensory dominance between various stages of user-product interactions. Applied Ergonomics, 41, 34-40. Goodman, J. K., Broniarczyk, S. M., Griffin, J. G., & McAlister, L. (2013). Help or hinder? When recommendation signage expands consideration sets and heightens decision difficulty. Journal of Consumer Psychology, 23, 165-174. Greene, W. H. (2003). Econometric Analysis (4th Edition). New York: Prentice Hall. Greenacre, M. (1993). Correspondence Analysis in Practice. London: Academic Press. Guinard, J. X., Souchard, A., Picot, M., Rogeaux, M., & Sieffermann, J. M. (1998). Sensory determinants of the thirst-quenching character of beer. 
Appetite, 31, 101-115. Hajdu, I., Major, A., & Lakner, Z. (2007). Consumer behavior in the Hungarian beer market. Studies in Agricultural Economics, 106, 89-104. Hein, K., Hamid, N., Jaeger, S. R., & Delahunty, C. (2010). Application of a written scenario to evoke a consumption context in a laboratory setting: Effects on hedonic ratings. Food Quality and Preference, 21, 410-416. Herr, P. M. (1989). Priming price: Prior knowledge and context effects. Journal of Consumer Research, 16, 67-75. Hersleth, M., Mevik, B., Næs, T., & Guinard, J. (2003). Effect of contextual factors on liking for wine—use of robust design methodology. Food Quality and Preference, 14, 615-622.
Hoeffler, S. (2003). Measuring preferences for really new products. Journal of Marketing Research, 40, 406-420. Hoeffler, S., & Ariely, D. (1999). Constructing stable preferences. A look into dimensions of experience and their impact on preference stability. Journal of Consumer Psychology, 8, 113-149. Jack, F. R., Piggott, J. R., Paterson, A. (1994). Use and appropriateness in cheese choice, and an evaluation of attributes influencing appropriateness. Food Quality and Preference, 5, 281-290. Jaeger, S. R., Bava, C. M., Worch, T., Dawson, J., & Marshall, D. W. (2011). The Food Choice Kaleidoscope. A framework for structured description of product, place, and person as sources of variation in food choice. Appetite, 56, 412-423. Jaeger, S. R., Hedderley, D., & MacFie, H. J. H. (2001). Methodological issues in conjoint analysis: A case study. European Journal of Marketing, 35, 1217-1239. Jaeger, S. R., Marshall, D. W, & Dawson, J. (2009). A quantitative characterization of meals and their contexts in a sample of 25 to 49-year-old Spanish people. Appetite, 52, 318-327. Jaeger, S. R., Rossiter, K. L., & Lau, K. (2005). Consumer perceptions of novel fruit and familiar fruit: A repertory grid application. Journal of the Science of Food and Agriculture, 85, 480-488. King, S. C., Meiselman, H. L., Hottenstein, A. W., Work, T. M., & Cronk, V. (2007). The effects of contextual variables on food acceptability: A confirmatory study. Food Quality and Preference, 18, 58-65. King, S. C., Weber, A. J., Meiselman, H. L., & Lv, N. (2004). The effect of meal situation, social interaction, physical environment and choice on food acceptability. Food Quality and Preference, 15, 645-653. Köster, E. P., & Mojet, J. (2006). Theories of food choice development. In L. Frewer & H. C. M. van Trijp (Eds.), Understanding Consumers of Food Products (pp. 93-124), Cambridge: Woodhead Publishing. Kramer, F. M., Rock, K., & Engell, D. (1992). 
Effects of time of day and appropriateness on food intake and hedonic ratings at morning and midday. Appetite, 18, 1-13. Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213-236. Krosnick, J. A. (1999). Survey research. Annual Review of Psychology, 50, 537-567. Lähteenmäki, L., & Tuorila, H. L. (1997). Item-by-use appropriateness of drinks varying in sweetener and fat content. Food Quality and Preference, 8, 85-90. Lee, L., Frederick, S., & Ariely, D. (2006). Try it, you'll like it. The influence of expectation, consumption and revelation on preferences for beer. Psychological Science, 17, 1054-1058. Luomala, H. T. (2007). Exploring the role of food origin as a source of meanings for consumers and as a determinant of consumers’ actual food choice. Journal of Business Research, 60, 122-129. Manoukian, E. B. (1986). Mathematical Nonparametric Statistics. New York, NY: Gordon & Breach.
Marshall, D., & Bell, R. (2003). Meal construction: Exploring the relationship between eating occasion and location. Food Quality and Preference, 14, 53-64. Meiselman, H. L., Johnson, J. L., Reeve, W., & Crouch, J. E. (2000). Demonstrations of the influence of the eating environment on food acceptance. Appetite, 35, 231-237. Meiselman, H. L. (2008). Experiencing food products within a physical and social context. In Paul Hekkert (Ed.), Product Experience, Elsevier. Mielby, L. H., & Frøst, M. B. (2010). Expectations and surprise in a molecular gastronomic meal. Food Quality and Preference, 21, 213-224. Mohr, C. D., Armeli, S., Tennen, H., Carney, M. A., Affleck, G., & Hromi, A. (2001). Daily interpersonal experiences, context, and alcohol consumption: Crying in your beer and toasting good times. Journal of Personality and Social Psychology, 80, 489-500. Moskowitz, H. R. (1977). Food compatibilities and menu planning. Journal of the Canadian Institute of Food Science and Technology, 10, 257-264. Mukherjee, A., & Hoyer, W. D. (2001). The effect of novel attributes on product evaluation. Journal of Consumer Research, 28, 462-472. Mugge, R., & Schoormans, J. P. L. (2012a). Product design and apparent usability. The influence of novelty in product appearance. Applied Ergonomics, 43, 1081-1088. Mugge, R., & Schoormans, J. P. L. (2012b). Newer is better! The influence of a novel appearance on the perceived performance quality of products. Journal of Engineering Design, 23, 469-484. Nantachai, K., Petty, M. F., & Scriven, F. M. (1991–1992). An application of contextual evaluation to allow simultaneous food product development for domestic and export markets. Food Quality and Preference, 3, 13-22. Olson, J. C. (1989). The importance of cognitive processes and existing knowledge structures for understanding food acceptance. In J. Solms & R. C. Holms (eds.), Criteria of Food Acceptance (pp. 69-81), Zurich: Foster Verlag. R Development Core Team (2010). 
R: A language and environment for statistical computing. ISBN 3-900051-07-0. Raats, M. M., & Shepherd, R. (1992). An evaluation of the use and perceived appropriateness of milk using the repertory grid method and the “item by use” appropriateness method. Food Quality and Preference, 3, 89-100. Raju, P. S. (1977). Product familiarity, brand name, and price influences on product evaluation. Advances in Consumer Research, 4, 64-71. Ratneshwar, S., & Shocker, A. D. (1991). Substitution in use and the role of usage context in product category structures. Journal of Marketing Research, 28, 281-295. Reinbach, H. C., Giacalone, D., Ribeiro, L. M., Bredie, W. L. P., & Frøst, M. B. (2013). Comparison of three sensory profiling methods based on consumer perception: CATA, CATA with intensity and Napping®. (In press). Food Quality and Preference.
Resurreccion, A. V. A. (1986). Consumer use patterns for fresh and processed vegetable products. Journal of Consumer Studies and Home Economics, 10, 317-332. Rozin, P., & Tuorila, H. (1993). Simultaneous and temporal contextual influences on food acceptance. Food Quality and Preference, 4, 11-20. Sandell, R. G. (1968). Effects of Attitudinal and Situational Factors on Reported Choice Behavior. Journal of Marketing Research, 5, 405-408. Schutz, H. G. (1994). Appropriateness as a measure of the cognitive-contextual aspects of food acceptance. In H. J. H. MacFie & D. M. H. Thomson (Eds.), Measurement of Food Preferences (pp. 25-50), London: Blackie Academic. Schutz, H. G. (1988). Beyond preference: Appropriateness as a measure of contextual acceptance of food. In D. M. H. Thomson (Ed.), Food Acceptability (pp. 115-134), London: Elsevier. Schutz, H. G., Cardello, A. V., & Winterhalter, C. (2005). Perceptions of fiber and fabric uses and the factors contributing to military clothing comfort and satisfaction. Textile Research Journal, 75, 223-232. Schwanenflugel, P. J., & Rey, M. (1986). The relationship between category typicality and concept familiarity: Evidence from Spanish- and English-speaking monolinguals. Memory and Cognition, 14, 150-163. Sester, C., Dacremont, C., Deroy, O., & Valentin, D. (2013). Investigating consumers’ representations of beers through a free association task: A comparison between packaging and blind conditions. Food Quality and Preference, 28, 475-483. Sester, C., Deroy, O., Sutan, A., Galia, F., Desmarchelier, J., Valentin, D., & Dacremont, C (2013).“Having a drink in a bar”: An immersive approach to explore the effects of context on food choice. Food Quality and Preference, 28, 23-31. Shugan, S. (1980). The cost of thinking. Journal of Consumer Research, 7, 99-111. Smythe, J. E., O'Mahony, M. A., & Bamforth, C. W. (2002). The impact of appearance of beer on its perception. Journal of the Institute of Brewing, 108, 37-42. Stefflre, V. 
(1971). Some eliciting and computational procedures for descriptive semantics. In P. Kay, P. Reich, & M. McClaran (eds.). Exploration in Mathematical Anthropology (pp. 211-248), Cambridge, MA: MIT Press. Sudman, S., & Bradburn, N. M. (1992). Asking Questions. San Francisco, CA: Jossey-Bass. Tuorila, H. M., Meiselman, H. L., Cardello, A. V., & Lesher, L. L. (1994). Effect of expectations and the definition of product category on the acceptance of unfamiliar foods. Food Quality and Preference 9, 421-430. Turner, M., & Collison, R. (1988). Consumer acceptance of meals and meal components. Food Quality and Preference, 1, 21-24. Veryzer, R. W. (1998). Key factors affecting customer evaluation of discontinuous new products. Journal of Product Innovation Management, 15, 136-150.
Vriens, M., Loosschilder, G. H., Rosbergen, E., and Wittink, D. R. (1998). Verbal versus realistic pictorial representations in conjoint analysis with design attributes. Journal of Product Innovation Management, 15, 455-467. Warlop, L., & Ratneshwar, S. (1993). The role of usage contexts in consumer choice: A problem solving perspective. Advances in Consumer Research, 20, 377-382. Zellner, D. A., & Durlach, P. (2003). Effect of color on expected and experienced refreshment, intensity, and liking of beverages. The American Journal of Psychology, 116, 633-647.
Paper VII
Giacalone, D., Reinbach, H. C., & Frøst, M. B. (2011). A snapshot mapping of the Danish beer market. Scandinavian Brewers’ Review, Vol. 68, Issue 1, pp. 12-20.
A Snapshot Mapping of the Danish Beer Market
Davide Giacalone, e-mail: dgi@life.ku.dk; Helene Christine Reinbach, e-mail: hcre@life.ku.dk and Michael Bom Frøst, e-mail: mbf@life.ku.dk. Department of Food Science, Faculty of Life Sciences, University of Copenhagen
This article, written by scientists and researchers at KU Life, the new partner of the Scandinavian School of Brewing, takes a fascinating, inspiring and innovative look at the myriad of Danish beers, and, through multivariate statistics, successfully finds very interesting relations between them.
Want to know more about multivariate statistics?
The research group of Quality and Technology at KU Life is our leading unit in developing and applying multivariate analytical models to solve food-related problems. If you want to know more about the PCA and other multivariate methods, we encourage you to have a look at the links below, where you can also find a series of video tutorials that show in a graphical and intuitive way the mathematical procedure for extracting principal components.
Q&T group homepage: www.models.life.ku.dk/
Tutorials: www.youtube.com/user/QualityAndTechnology

‘The micro-brewing industry … has traditionally sought to distinguish itself from the macro-domestic industry by typically making ales more than lager and inter alia by including substantial levels of hop bitterness, by the adventurous use of specialty malts and roasted materials, by all-malt brewing and high gravity alcoholic beers …’.

This opening quote comes from the inspiring article published by Prof. Lewis in a recent issue of Scandinavian Brewers’ Review, where he brings forth the vivid image of an American beer industry characterised by a ‘dash to the extreme’ where micro breweries specialise in brewing increasingly heavy ales, much to counter the opposite tendency by large breweries to brew mostly light lagers.

We asked ourselves whether the Danish beer industry reflected that description, and we resolved to answer this question by making a little statistical experiment: first, we collected information on about 300 Danish beers from the majority of Danish breweries, using self-reported information on each beer provided by the brewery. For each beer, we used the following information: 1) producer’s size, 2) producer’s emphasis on local identity, 3) type of fermentation, 4) bitterness (IBU) by beer style, 5) alcohol by volume, 6) type(s) of malt, 7) type(s) of hops, and 8) usage of special flavouring ingredients.

Statistics requires numbers, so we had to transform all the information about the characteristics of beers into data suitable for statistical analysis. Table 1, below, gives an account of how we actually did it.

What resulted after this was a data matrix with exactly 297 beers. They represent beers from nearly all main breweries in Denmark and they were only chosen if sufficient information for classification was provided either on the company’s website or on the beer label. We also used ratebeer.com with regards to beer style in case this information was not listed by the producer. Because of the way the data were gathered, some breweries are more represented than others, and so are certain types of beers. Nevertheless, we think it represents what you can find on the Danish beer market. Table 2, overleaf, lists all the beers we used for the analysis and their corresponding number in the plots.

Figure 1, on page 15, shows the frequency of different beer styles (according to average IBU) in our sample database.

Methods, aka Multivariate statistics in a nutshell
Beers can vary widely in many things: alcohol content, colour, style, producer, etc. When you want to compare a few beers, you can just collect all your info in a simple table and look at them side by side. A table with hundreds of beers, however, would be quite problematic to look at and, most importantly, it would be almost impossible to get an overview. That’s where multivariate statistics can be of great assistance.
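To make the idea of "getting an overview" more concrete, here is a minimal sketch of how Principal Component Analysis (PCA), one such multivariate method, condenses several variables into one dimension. It assumes a purely numeric table with non-constant columns and extracts only the first component by power iteration on the correlation matrix. The function names and the five example rows (alcohol by volume, IBU) are hypothetical and are not the article's data or software.

```python
import math


def standardize(rows):
    """Centre each column and scale it to unit standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(p)
    ]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def first_principal_component(rows, iterations=200):
    """Return loadings, scores and explained variance share of the first PC."""
    z = standardize(rows)
    n, p = len(z), len(z[0])
    # Correlation matrix of the original variables.
    corr = [
        [sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)]
        for a in range(p)
    ]
    # Power iteration; the uneven start vector is an arbitrary choice that
    # avoids starting exactly orthogonal to the leading direction in most cases.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * corr[a][b] * v[b] for a in range(p) for b in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in z]
    return v, scores, eigenvalue / p


# Hypothetical beers: [alcohol by volume (%), bitterness (IBU)].
beers = [[4.6, 20.0], [5.0, 25.0], [6.5, 40.0], [8.0, 60.0], [10.0, 75.0]]
loadings, scores, explained = first_principal_component(beers)
print(loadings, scores, explained)
```

The loadings say how much each original variable contributes to the new dimension, and the scores give each beer's position along it, which is what a score plot displays.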
TABLE 1
Variable | Data type | Explanation
Brewery size | Semi-continuous | # of employees (Source: Danmarks Statistik)
Local identity | Category | ‘Yes’ if active use of their origin in their communication and/or presence of beer pub/restaurant
Fermentation | Category | Bottom, Top, or both
Bitterness – IBU | Continuous | Average beer style IBU according to the Brewers Association 2010 Beer Style Guide, or listed by the producer.
Malt type (malt colour, Lovibond) | Continuous | For obvious reasons, exact fractions of individual malt types are rarely listed by producer. As an approximation, we summed degrees Lovibond for individual listed malts for each beer.
Alcohol By Volume – ABV | Continuous | No explanation needed
Hops | Category | Aroma or Bitter. Here, we adopted a restrictive definition of ‘aroma hops’, including in the category only those varieties with an alpha acid percentage up to 6% (such as Saaz, Styrian Golding, Crystal, etc.), whereas intermediate hops varieties (e.g. Centennial) are still listed as bitter hops.
Special flavouring ingredients | Category | Increasingly, other ingredients than malts, hops, yeast and water are used in beer. In some of the beers, up to five other ingredients are used. Further elaboration on this topic is given later.
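The coding scheme in Table 1 could be turned into one numeric row per beer along the following lines. This is only a sketch under stated assumptions: the dictionary keys, the use of 0/1 indicator columns for the categorical variables, and the choice to count special ingredients are illustrative guesses, not the authors' actual coding. Only the summing of degrees Lovibond and the 6% alpha acid limit for aroma hops come from the table.

```python
AROMA_ALPHA_ACID_LIMIT = 6.0  # percent alpha acid; limit stated in Table 1


def encode_beer(beer):
    """Turn one beer description into a row of numbers (hypothetical layout)."""
    fermentation = beer["fermentation"]  # "bottom", "top" or "both"
    hop_alphas = beer["hop_alpha_acids"]  # alpha acid % of each listed hop
    return [
        float(beer["brewery_employees"]),                      # brewery size
        1.0 if beer["local_identity"] else 0.0,                # local identity
        1.0 if fermentation in ("bottom", "both") else 0.0,    # bottom-fermented
        1.0 if fermentation in ("top", "both") else 0.0,       # top-fermented
        float(beer["ibu"]),                                    # bitterness
        float(sum(beer["malt_lovibond"])),                     # summed malt colour
        float(beer["abv"]),                                    # alcohol by volume
        1.0 if any(a <= AROMA_ALPHA_ACID_LIMIT for a in hop_alphas) else 0.0,
        1.0 if any(a > AROMA_ALPHA_ACID_LIMIT for a in hop_alphas) else 0.0,
        float(len(beer["special_ingredients"])),               # assumed: a count
    ]


# Hypothetical beer, invented for illustration only.
example_beer = {
    "brewery_employees": 12,
    "local_identity": True,
    "fermentation": "top",
    "ibu": 40.0,
    "abv": 6.5,
    "malt_lovibond": [2.0, 60.0],
    "hop_alpha_acids": [3.5, 10.0],
    "special_ingredients": ["juniper"],
}
print(encode_beer(example_beer))
```

Stacking one such row per beer would give a beers × variables data matrix of the kind the analysis starts from.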
ta B l e 2 BReWeRY
NUMBeR iN PLOTS AND CORReSPONDiNG BeeR
Thisted
(1) Limfjords Porter, (2) Triple Pilsner, (3) Thy Ale, (4) Porse Bock, (5) Brown Ale, (6) Thy Bock, (7) Thy Porter
Grauballe
(8) enebær Stout, (9) Mørk Mosebryg, (10) River Beer, (11) Honey Gold, (12) Orange Blossom, (13) iPA Nørrebryg;
Thor
(14) Thor Pilsner, (15) Thor Classic, (16) Blå Thor;
Tuborg
(17) Tuborg Lime,(18) Grøn Tuborg, (19) Guld Tuborg, (20) Tuborg Classic, (21) Tuborg Rød, (22) Tuborg Julebryg, (23) fine festival, (24) Tuborg Super Light;
Carlsberg
(25) Abbey Ale, (26) Semper Ardens Christmas Ale, (27) Carls Hvede, (28) Carlsberg LiTe, (29) Carls Lager, (30) Carls Porter, (31) Carls Ale, (32) Carls Special, (33) Carlsberg 47, (34) Carslberg elephant, (35) Carlsberg Light Pilsner, (36) Carlsberg Pilsner, (37) Semper Ardens Summer Dubbel, (38) Semper Ardens Blonde Bier, (39) Semper Ardens Keller Pilz;
Albani
(40) Odense Classic, (41) Odense Pilsner, (42) Odense Rød Classic;
Refsvindinge
(43) Pilsner, (44) Prima Landøl, (45) Ale nr. 16, (46) Mors Stout, (47) enkens Anden Lys, (48) Skibsøl, (49) HP Bock, (50) Den Sorte enke, (51) Humlepilsner, (52) Solbær Ale, (53) Bedstemors Stout, (54) Ævleøl, (55) Hvid Guld;
Harboe
(56) Harboe Pilsner, (57) Harboe Pilsner Light, (58) Harboe Classic;
Skands
(59) Humlefryd, (60) New Stout, (61) Nicks Ale, (62) Porter, (63) elmegade iPA, (64) Brüssel Wit, (65) Økofryd;
Nørrebro
(66) Bombay Pale Ale, (67) Ceske Böhmer, (68) King’s County Brown Ale, (69) furesø framboise, (70) La Granja Stout, (71) Little Korkny Ale, (72) New York Lager, (73) Nørrebros Julebryg, (74) Pacific Summer Ale, (75) Påske Bock, (76) Ravnsborg Rød, (77) Skärgaards Porter, (78) Sorterdam Sauvage, (79) S:t Jørgen Stout, (80) fanø Lyng, (81) Currant Practise, (82) NoPale NoAle, (83) Brugge Blonde, (84) Balders Blid, (85) Peblinge Pêche, (86) Montceau Ginger, (87) Oak Wise, (88) Tärnö imperial Stout, (89) Rood Wit 32°, (90) Stuykman Wit;
Svaneke
(91) Staerk Preben, (92) Classic, (93) Stout, (94) Gul Påske, (95) Sejler Øl, (96) Choko Stout, (97) Weisse, (98) Gammeldags Pilsner, (99) Julebryg, (100) Rød Jul, (101) Den eneste ene, (102) Porter, (103) Aurum, (104) Salmecyke, (105) Pivo;
indslev
(106) Sort Hvede, (107) Hvede Bock, (108) Julehvede, (109) Påskehvede, (110) Hvede i.P.A.;
Jacobsen
(111) Saaz Blonde, (112) Sommer Wit, (113) Dark Lager, (114) Golden Naked Christmas Ale, (115) Brown Ale, (116) extra Pilsner, (117) forårsbryg, (118) Bramley Wit;
Mikkeller
(119) Big Worse Barrel Aged edition, (120) Big Worse, (121) Beer Geek Breakfast Bourbon Barrel Aged edition, (122) Single Hop Chinook iPA, (123) Single Hop Centennial iPA, (124) east Kent Golding Single Hop iPA, (125) Single Hop Amarillo iPA, (126) Single Hop Tomahawk iPA, (127) Beer Geek Breakfast, (128) Drink in the Sun, (129) Nugget Single Hop iPA, (130) Tjekket Pilsner, (131) Drikkeriget DiPA, (132) USAlive!, (133) Not Just Another Wit, (134) from To, (135) it's Alive!, (136) All Others Pale, (137) Green Gold;
Royal Unibrew
(138) Royal Classic, (139) Royal Pilsner, (140) Royal Stout, (141) Royal export;
Herslev
(142) Pale Ale;
Ceres
(143) Ceres 2 Pilsner;
BrewPub
(144) Amarillo Red Ale, (145) Cole Porter, (146) PJ Harvey, (147) Ralf Hutter, (148) Schlager, (149) Smokin’, (150) Stevie Ray, (151) James Brown Ale;
Amager Bryghus
(152) Bryggens Blonde, (153) Christianshavn Pale Ale, (154) Sundby Stout, (155) Dragørs Tripel, (156) iPA, (157) Amager fælled, (158) Ryeking, (159) Dicentra Cucullaira, (160) Double Black iPA, (161) Galanthus Nivalis, (162) Hr. frederiksen, (163) Rugporter, (164) Red Nitro, (165) Nitro, (166) Black Nitro, (167) imperial Stout, (168) Høstbryg, (169) Summer fusion, (170) imperial Brown Ale, (171) Honning Porter;
Vejle Bryghus
(172) Vejle forårsbryg, (173) Vejle Porter, (174) Vejle Bryghus Pilsner, (175) Vejle Belgian Strong Ale, (176) Vejle GrassHopper iPA, (177) Vejle Bryghus Klassik, (178) Vejle Bryghus Silent Night, (179) Vejle Bryghus Golden Ale, (180) Normaler Weisse, (181) Vejle Oktoberfest, (182) Vejle Brown Ale, (183) Vejle Holy Night;
Skovlyst
(184) Skovmærkebryg, (185) JuleBryg, (186) BirkeBryg, (187) Havre Stout, (188) india Pale Ale, (189) AhornBryg, (190) egeBryg,(191) Bøgebryg;
Rise
(192) Premium Jule Ale, (193) Grolle Pilsner, (194) Marstal’s iPA, (195) Ærøskøbing’s Dark Ale, (196) Søby Stout, (197) No. 5 Valnød Hertug Hans, (198) Premium Dark Ale, (199) Premium india Pale Ale;
Midtfyns Bryghus
(200) Midtfyns Wheat, (201) Midtfyns Sommer Wit, (202) Midtfyns Ale, (203) Gunners Ale, (204) india Pale Ale, (205) Double india Pale Ale, (206) Chili Tripel, (207) Stout, (208) imperial Stout, (209) Jule Ale, (210) Jule Stout;
Ølfabrikken
(211) Ølfabrikken Pale Ale, (212) Ølfabrikken Pilsner;
Raasted Bryghus
(213) Columbus Ale, (214) Black Gold Coffee Stout, (215) Raasted Rug iPA, (216) Raasted Cascade iPA, (217) Raasted imperial Stout, (218) Raasted Trippel, (219) Raasted Dunkel, (220) Raasted Pilsner, (221) Raasted Juleøl;
DagmarBryggeriet
(222) Bengerds forår, (223) Byens Øl, (224) Broder Gregers iPA, (225) Skt. Bendt Porter;
Søgaards Bryghus
(226) Jomfruhumle, (227) Munkens Ale, (228) Brown Ale, (229) US Pale Ale, (230) Utzon Blond, (231) Utzon Dark;
Beer Here
(232) Jule iPA, (233) Påske, (234) fat Cat Red Ale, (235) Tia Loca, (236) Dark Hops, (237) Mørke – Pumpernickel Porter, (238) Black Cat;
Aarhus Bryghus
(239) Aarhus extra Pilsner, (240) Stout 2010, (241) Klosterbryg, (242) Sommer Hvede, (243) fregatten Jylland, (244) Julebryg, (245) Classic Pale Ale;
Hornbeer
(246) Brown Ale, (247) Kiss the frog, (248) Røgøl, (249) Blonde, (250) iPA, (251) Vårøl, (252) Hornbock, (253) imperial iPA, (254) Oak Aged Cranberry Bastard, (255) Russian imperial Stout, (256) funky Monk, (257) Caribbean Rum Stout, (258) Winter Porter, (259) Helge;
Bryggeriet Vestfyen
(260) Schwarzbier, (261) Pale Ale, (262) Pilsner, (263) Light Pilsner, (264) Willemoes Strong Lager, (265) Willemoes Stout, (266) Willem. Porter, (267) Willem. Påske Ale, (268) Willemoes Julebryg, (269) Willemoes 200 år, (270) Willemoes Classic;
Brøckhouse
(271) Classic Lager, (272) Premium Pilsner, (273) Premium Julebryg;
Ørbæk Bryggeri
(274) Fynsk Forår, (275) Blueberry Hill Ale;
Hancock
(276) Saaz Brew, (277) Høker Bajer, (278) Black Lager, (279) Old Gambrinus Light, (280) Old Gambrinus Dark;
Fur Bryghus
(281) Fur Frokost, (282) Fur Ale, (283) Fur Bock, (284) Fur Hvede, (285) Fur Lager, (286) Fur Porter, (287) Fur Renæssance, (288) Fur Steam Beer, (289) Fur Julebryg, (290) Fur Påskebryg, (291) Fur Barley Wine;
Faxe
(292) Faxe Premium, (293) Faxe 10%, (294) Faxe Royal Strong, (295) Faxe Amber, (296) Faxe Royal Export, (297) Faxe Festbock.
SCANDINAVIAN BREWERS’ REVIEW . VOL.68 NO.1 2011
A Snapshot Mapping of the Danish Beer Market
Generally speaking, multivariate statistics comprises a series of tools designed to deal with large data sets containing many different variables. In this article, we used one of the simplest multivariate methods: Principal Component Analysis (PCA), which can be looked at as a transformation in which many original variables are transformed into a few important dimensions. The usefulness of PCA (or any multivariate method) is that it allows the observer to look at many different variables simultaneously in a graphical, easy-to-interpret way. In practice, we transformed the variables in Table 1 into fewer latent or underlying variables called principal components, and these are the horizontal and vertical axes in the following plots. Thus, we were able to map approximately 300 Danish beers in a few dimensions, according to the criteria mentioned before (brewery size, ABV, malt type, etc.), ordered by the criteria’s ability to explain differences between the beers.
The results of our PCA model, represented graphically in the following figures, are organised into ‘score plots’ and ‘loading plots’. Every point in the score plot represents a beer in our database. The loading plot represents how the original variables (Table 1) are correlated to each other. In other words, the score plots show which beers are most similar or different from each other and the loading plots explain why they have a certain position on the score plot. The results, in short, show graphically how our Danish beers differ from one another and why.
Mapping out Danish beers
The horizontal and vertical axes are our principal components (underlying dimensions of difference instead of our original variables). Looking only at the score plot (the beers), we can see that there is a cluster of beers in the lower left quadrant which contains many beers from the large breweries (Carlsberg, including Semper Ardens, and Royal Unibrew), whereas other breweries are distributed in all quadrants. Why is it so? The answer comes from looking at the loading plot, which shows the position of the variables. In the left quadrants, we find among others the variables Bottom fermentation and Brewery size. On the right side of the loading plot, we find the variables Top fermentation, IBU and Malt colour. This means that beers located in that direction in the score plot are top fermented beers, high in alcohol and brewed on darker malt types. The loading plot shows that the size of a brewery is negatively correlated to top fermentation, which means that Professor Lewis’s argument is substantially confirmed by our statistical analysis of the Danish beer market: large breweries
Figure 1. Histogram plot showing the distribution of our samples by beer style (measured by IBU); horizontal axis: IBU, vertical axis: frequency. The beers are almost normally distributed, with a peak between 28 and 35 IBU (mostly Pilsners and light ales) and between 18 and 22 IBU (in correspondence to e.g. wheat beers and brown ales) and a presence of high IBU beers (IPA, Imperial Stout, etc.). Interestingly, the highest peaks correspond to the beer categories that Michael Lewis indicates as the two consumer favourites in the US.
Figure 2a. Scores, PC1 vs PC2 (horizontal axis: PC-1, 25%; vertical axis: PC-2, 16%). Score plot for the first two principal components. Every point in the figure represents a beer (see Table 2). Their distribution across the spatial distance indicates how different or similar they are to each other.
Figure 2b. Loadings, PC1 vs PC2 (horizontal axis: PC-1, 25%; vertical axis: PC-2, 16%). Loading plots for the first two principal components.
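As a rough illustration of how score and loading plots such as those in figure 2 come about, the following Python sketch runs a PCA on a small table of beers. It is not the authors' actual procedure: the data are hypothetical toy values, the column choice is only loosely modelled on the variables named in the text, the function name `pca` is made up for this example, and autoscaling the variables before the decomposition is an assumption (the article does not say how the variables were preprocessed).

```python
import numpy as np

def pca(X, n_components=2):
    """PCA of a samples-by-variables array via singular value decomposition.

    Returns (scores, loadings, explained), where scores has one row per
    sample and loadings has one row per original variable.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: autoscale each variable (centre, then divide by its
    # standard deviation) so that units such as % ABV and IBU are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # positions of the beers
    loadings = Vt[:n_components].T                     # directions of the variables
    explained = (s ** 2) / np.sum(s ** 2)              # share of variance per component
    return scores, loadings, explained[:n_components]

# Hypothetical toy data (NOT taken from the article's database).
# Columns: brewery size (1-5), ABV, IBU, malt colour, top fermentation (0/1).
beers = np.array([
    [5, 4.6, 20,  8, 0],
    [5, 5.0, 25, 10, 0],
    [4, 4.8, 22,  9, 0],
    [1, 7.5, 60, 40, 1],
    [1, 9.0, 80, 90, 1],
    [2, 6.5, 45, 30, 1],
])

scores, loadings, explained = pca(beers)
```

Plotting the two columns of `scores` against each other would give a score plot (one point per beer), and plotting the two columns of `loadings` a loading plot (one point per variable). In this toy table, brewery size and top fermentation end up on opposite sides of the first component, which mirrors the kind of negative correlation the article reads off its own loading plot.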
Figure 3a. Scores, PC1 vs PC3 (horizontal axis: PC-1, 25%; vertical axis: PC-3, 13%).
Figure 3b. Scores, PC1 vs PC3 (horizontal axis: PC-1, 25%; vertical axis: PC-3, 13%).
Figure 4. Biplot of brewery average data (horizontal axis: PC-1; vertical axis: PC-2; blue = scores, red = loadings). The variable ‘Top/bottom ratio’ gives a measure of the internal (i.e. for each brewery) prevalence of either ales or lagers.
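The brewery-level averaging behind a biplot like figure 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records are hypothetical, the function name `brewery_averages` is made up, and the article does not define how ‘Top/bottom ratio’ was computed, so the share of top-fermented beers per brewery is used here as an assumed stand-in. The only rule taken from the text is that breweries with a single beer in the database are excluded.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records (NOT from the article): (brewery, ABV, IBU, top_fermented)
beers = [
    ("Brewery A", 4.6, 20, False),
    ("Brewery A", 5.0, 25, False),
    ("Brewery A", 6.0, 35, True),
    ("Brewery B", 7.5, 60, True),
    ("Brewery B", 9.0, 80, True),
    ("Brewery C", 5.5, 30, True),   # only one beer, so it is dropped below
]

def brewery_averages(records, min_beers=2):
    """Average the beer-level variables per brewery.

    Breweries with fewer than `min_beers` beers are excluded, following
    the exclusion rule described in the text.
    """
    grouped = defaultdict(list)
    for brewery, abv, ibu, top in records:
        grouped[brewery].append((abv, ibu, top))

    result = {}
    for brewery, rows in grouped.items():
        if len(rows) < min_beers:
            continue
        result[brewery] = {
            "abv": mean(r[0] for r in rows),
            "ibu": mean(r[1] for r in rows),
            # Assumed stand-in for 'Top/bottom ratio': share of ales in the range.
            "top_share": sum(r[2] for r in rows) / len(rows),
        }
    return result

averages = brewery_averages(beers)
```

The resulting per-brewery rows could then be fed into a PCA in place of the individual beers. Using a share rather than a literal top-to-bottom quotient avoids division by zero for breweries that brew no lagers at all, which is a design choice of this sketch and not something the source states.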
Figure 5. Bar chart of special flavouring ingredients (count axis from 0 to 14). Ingredients shown: Grapes, Rosemary, Chili, Chocolate, Sea-buckthorn, Peach, Ginger, Yarrow, Grains of paradise, Walnuts, Chamomile, Fennel, Bog myrtle, Elderflower, Lemon, Apricot, Plums, Thyme, Apple, Dried fruits***, Lime, Bitter orange, Juniper, Star anise, Coffee, Plants and trees**, Oat, Liquorice, Berries*, Rock candy, Honey, Orange, Coriander.
* Blackcurrant (2), Raspberry (3), Elderberry (1), Cranberry (1), Blueberry (2)
** Oak bark extract (1), Sycamore (1), Woodruff (2), Birch (1), Angelica (1)
*** Raisins (1), dried plums (1), dried figs (1), dates (1)
specialise in the lager department, whereas smaller breweries seem to focus more on brewing ales.
Looking more closely at the first loading plot, we can see several other things. First of all, the variable ‘brewery size’ is inversely correlated to the variable ‘local identity’, i.e. small breweries stress the local aspect more in their communication efforts (e.g. Svaneke’s Bornholmish pride) and usually make active use of their brewery as a pub/restaurant (e.g. Nørrebro Bryghus). Secondly, we can see that ABV, bitterness (= IBU) and malt colour are very close, which indicates that more alcoholic beers are usually higher in bitterness and generally use darker types of malt. Thirdly, beers with these characteristics (high ABV, IBU, malt colour) are mostly ales and to some degree also contain more special ingredients, as indicated by the relative closeness of ‘Top fermentation’ and ‘Special ingredients’. Surely, none of these considerations comes as a big surprise, but it is always a good sign when common sense and statistics match. Plus, you are now more familiar with how to read these kinds of graphical plots.
Our analysis indicated that a third underlying variable also described important systematic differences. This is shown in figure 3a and b, where the interpretation can be focused on the vertical axis, as the horizontal axis is the same as in figure 2. The third dimension is more or less explained by use of special flavouring ingredients. Beers in the lower part of the figure are all flavoured with special ingredients, exemplified by Søgaard’s Utzon Blond (nr. 230 in plot 3a; it contains five ingredients: saffron, lime, ginger, lemon myrtle, and orange blossom honey), Beer Here’s Tia Loca (nr. 235: unmalted wheat, oat meal, orange peel, and coriander), and Skovlyst’s Julebryg (nr. 185: nuts, flavoured syrups, fresh pine, cinnamon and a number of soft brown sugars, among them clayed sugar).
Smaller breweries, also according to this plot, experiment more with special ingredients than large breweries, as brewery size is somewhat negatively correlated to special ingredients. Also, the special flavouring ingredients are more often found in ales than in lagers (the variables ‘Top fermentation’ and ‘Special ingredients’ are fairly well correlated).
If you look at the score plots we have seen so far, it can be seen that although some clear patterns can be observed, smaller breweries’ beers are often scattered over quite a large area, meaning that large internal differences in their beers exist.
Thus, finally we tried to take an average of each brewery instead of treating each beer separately, in order to see whether there are some consistent patterns. For this analysis, we excluded breweries with only one beer present in the database. Therefore, the results are hardly representative, but it was quite interesting to do, especially because the resulting plot is highly interpretable. Figure 4 gives an account of the result, now as a biplot, with both scores and loadings in the same figure.
This plot is definitely more straightforward to interpret given that there are far fewer elements. However, interpretations must be taken with a grain of salt, as the generalisation from single beers to an average for a brewery is a rough approximation. First of all, nearly all microbreweries are found in the right quadrants, ‘pulled’ by the variable Top fermentation, i.e. on average they produce more heavy ales than light lagers, and large breweries quite the opposite, as we had seen previously. Mikkeller and Amager Bryghus are the most representative examples of this cluster and seem to pull along a fairly crowded group of breweries (Rise, Raasted, Midtfyns, etc.) which share the same characteristics. Furthermore, some small breweries (upper part towards the right side) are characterised by use of special ingredients (Beer Here, Midtfyns), some by more focus on local identity (Skovlyst, Nørrebro), and some also by more use of bitter hops (Ørbæk, Refsvindinge). In the lower right part, we find breweries characterised by producing beers in the higher end of alcohol content and with a much larger use of aroma hops. And in case you are wondering just which special flavouring ingredients are used, figure 5 summarises our findings with regards to which ingredients are most often added
to beers by Danish breweries (the counts do not account for how the ingredients are used, e.g. raw materials, dehydrated products, alcohol extracts, flavour extracts, or when, i.e. during the wort cooking or after. This was mainly because breweries are seldom upfront about that kind of information).
Conclusions
We could sum up our analysis by saying that most of the experimentation goes on in the ales department and is provided by craft breweries; this fundamental association has provided the Danish market with a rich variety of beers. However, the elephant in the room of this discussion is obviously market share, which is overwhelmingly dependent on the lager department. We did not include sales volume; it is enough to say that the official data (Danmarks Statistik) are only available for lager beers. Lager beer is, after all, the drink that conquered the world, accounting for more than 95 per cent of the consumption worldwide. But, to follow up Lewis’ conclusion, it appears that also in Denmark there is plenty of room for reinventing more characteristic lagers, and we encourage brewers to do so, in particular craft brewers, as it has the potential to increase their sales dramatically.
About the authors
Davide, Helene and Michael work, respectively, as PhD fellow, Assistant Professor and Associate Professor within the Sensory Science group at the Department of Food Science, University of Copenhagen. As partners of the consortium Dansk mikrobryg – produktinnovation og kvalitet, their current focus is on developing and applying several sensory methods to understand consumer preferences with regards to innovative beers.
Project home: www.danishmicrobrew.com/Kvalitetsbryg.htm
Sensory Science Group: www.en.ifv.life.ku.dk/faggrupper/sens.aspx
References
Bruning, T. 2007, The microbrewers’ handbook, Navigator Guides.
Bryggeriforeningen (ed) 2004, Guide til det Danske Øl-Univers.
Hampson, T. (ed) 2008, The Beer Book, Dorling Kindersley.
Lewis, M.J. 2010, ‘Drinkability: countering a dash to the extreme’, Scandinavian Brewers’ Review, vol. 67, no. 5, pp. 8-11.
Mosher, R. 2004, Radical brewing: recipes, tales and world-altering meditations in a glass, Brewers Publications.
Appendix 2 (Activities)
Summary of activities during the PhD
Courses overview
Course title | Institution | Credits (ECTS)
PhD Introduction course | KU | 2
Exploratory Data analysis | KU | 7.5
Food Choice and Acceptance | KU | 7.5
Sensory Science | KU | 9
Food, Medicine and Philosophy in East and West | KU/FOOD | 6
Total | | 32
Supplemental short courses and workshops:
Analysis of Descriptive sensory data by PanelCheck (KU/Nofima, Copenhagen, Denmark); ESOMAR Best of New Zealand 2012 – Dimensions of excellence in social and market research (ESOMAR – World association for market, social and opinion research, Auckland, New Zealand); Food and Health Entrepreneurship Academy (University of California, Davis, USA); MAPP Conference 2011: Understanding and creating the future markets for food: Challenges for research on customer relations in the food sector (MAPP, Middelfart, Denmark); Project Management Ph.D course (KU); SensNet Symposium 2010 (Danish Sensory Science Society, Svendborg, Denmark); What to consider when applying for external funding (KU, Copenhagen, Denmark); How can science best communicate with business? (Science Communicators Association of New Zealand, Auckland, New Zealand).
Conference contributions (international only)
Oral presentations
- “Patterns of product-context associations among beer consumers”. 10th Pangborn Sensory Science Symposium, 11-15 August 2013, Rio de Janeiro, Brazil.
- “Do CATA questions bias overall liking ratings?” 10th Pangborn Sensory Science Symposium, 11-15 August 2013, Rio de Janeiro, Brazil.
- “Comparison of three fast sensory profiling methods, Check-All-That-Apply (CATA), CATA with intensity ratings and Napping® to study consumer perception of eight beers”. 5th European Conference on Sensory and Consumer Research, 9-12 September 2012, Bern, Switzerland.
- “All-In-One test” (AI1): A rapid and versatile consumer test method for food product research. FOOD Denmark PhD Congress, 22-23 November 2011, Frederiksberg, Denmark.
- “CATA with consumers” – Workshop on alternative sensory methodologies. 9th Pangborn Sensory Science Symposium, 4-8 September 2011, Toronto, Canada.
- “Consumer-based product profiling: Issues and perspectives on Napping”. “The multidimensional Consumer!”, Sensory and Consumer Seminar at SIK – The Swedish Institute for Food and Biotechnology, 9 June 2011, Gothenburg, Sweden.
- “Consumer-based product profiling: application of partial Napping for sensory characterization of Danish beers”. LMC Congress FoodInFront, 23-24 May 2011, Odense, Denmark.
Posters
- Giacalone, D., Duerlund, M., Bøegh-Petersen, J., Bredie, W. L. P., & Frøst, M. B. (2013). The effect of stimulus collative properties on consumers' flavor preferences. 10th Pangborn Sensory Science Symposium, 11-15 August 2013, Rio de Janeiro, Brazil.
- Giacalone, D., Machado Ribeiro, L. & Frøst, M. B. (2012). Perception and description of premium beers by panels with different degrees of product expertise. 5th European Conference on Sensory and Consumer Research, 9-12 September 2012, Bern, Switzerland.
- Giacalone, D., Bredie, W. L. P. & Frøst, M. B. (2012). Stimulus collative properties in food products and their importance for consumer liking: a case study with novel beers. 5th European Conference on Sensory and Consumer Research, 9-12 September 2012, Bern, Switzerland.
- Giacalone, D., Bredie, W. L. P. & Frøst, M. B. (2011). All in One test (AI1) – Expanding the Boundaries of Consumer Studies. 9th Pangborn Sensory Science Symposium, 4-8 September 2011, Toronto, Canada.
- Frøst, M. B., & Giacalone, D. (2011). Using PLS-Regression for verification of product differences and important variables in a consumer sensory profile obtained by a check-all-that-apply (CATA) technique. 9th Pangborn Sensory Science Symposium, 4-8 September 2011, Toronto, Canada.
- Giacalone, D., Machado Ribeiro, L., Dehlholm, C., Bredie, W.L.P. & Frøst M.B. (2011). Consumer-based product profiling: application of partial Napping for sensory characterization of Danish beers. LMC Congress 2011 "Food In Front", 23-24 May 2011, Odense, Denmark.
Teaching activities
Ad-hoc lecturer and supervision of short group projects in the following courses:
- Sensory and Consumer Science (Bsc./Msc. level)
- Advanced Sensory Methods and Sensometrics (Msc.)
- Thematic course: Gastronomy and Health (Msc.)
- Fundamentals of beer brewing and wine making (Msc.)
Theses supervision:
- M.Sc. student Leticia Machado Ribeiro. Thesis title: “Perception and description of premium beers by beer experts, novices and enthusiasts”. (October 2010 – June 2011)
- M.Sc. student Jannie Bøegh-Petersen. Thesis title: “Comparison of methods for assessing novelty and related properties – a case study with beers”. (October 2011 – April 2012)
- M.Sc. student Mette Duerlund Hansen. Thesis title: “Collative properties and hedonic responses to specialty beers”. (November 2011 – May 2012)
- B.Sc. student Jon Ditlev Gregers Pold Christensen. Thesis title: “Consumer preferences associated with context and perception of collative properties of Danish Beer: A randomized trial”. (April 2012 – June 2012)
Honors and awards
- Awarded the Rick Bell Memorial Travel Scholarship to the 9th Pangborn Sensory Science Symposium (3.000 USD), ‘on the basis of evidence of interdisciplinary approach and innovation in sensory and consumer research’.
- Awarded doctoral student grant for participation to the 5th European Conference on Sensory and Consumer Research (400 EUR).
- Awarded Elsevier Student Award to the 10th Pangborn Sensory Science Symposium (450 USD).
Acknowledgements
The journey to obtaining a PhD is a wonderful experience. I enjoyed every minute of it and am deeply grateful to all the people who made it happen the way it did.
First and foremost, I want to thank my principal supervisor, associate professor Michael Bom Frøst, who has been a stellar mentor and friend all along. His supervision style was the perfect mix of support, encouragement, guidance and freedom to pursue my ideas. I also thank Michael for serving as an excellent role model to me as a junior member of the research world. Likewise, I thank my co-supervisor, Prof. Wender L. P. Bredie, for providing me with additional support and constructive criticism that helped put my work into a larger perspective and grow my skills as a scientist.
I would also like to thank the members of the Dansk Mikrobryg consortium for inspiring meetings and tons of insights into the fascinating world of craft brewing. A special thought goes in particular to Anders Kissmeyer, Stefan Stadler Nielsen, William Frank, Bodil Pallesen, Mathias Andersen, Anders Iversen, Henrik Siegumfeldt and Sine Haxgart, who provided invaluable help with some of the studies and have been a driving force behind the project.
Warm and sincere thanks go to Dr. Sara R. Jaeger at the New Zealand Institute for Plant and Food Research, who welcomed me in her team and supervised me during my stay abroad. Working with Sara was a real privilege and an amazing learning experience. Thanks are extended also to the great people at the Sensory and Consumer team – Mei, Christina, Michelle, Amy, Rozenn, Denise, David, Sok, Benedicte and Roger – for great collaborations and for making my time down under awesome from all points of view.
I am of course grateful to all current and former colleagues at the sensory science group at University of Copenhagen, who have been and will continue to be a source of friendship and collaboration. I especially thank my officemates in the PhD office – Sandra, Sara, Christian and Emily – for making my days fun and cozy. Christian is also thanked for sharing with me his vast knowledge on projective mapping, and for the many inspiring conversations we had around this and other topics. Among former colleagues, I thank Helene C. Reinbach for substantial help in the first part of my PhD, and positive collaborations (fueled by awesome beer tastings) afterwards.
I am also very grateful to four students – Leticia, Mette, Jannie, and Jon – who have written their theses as part of the beer project. Not only has it been a pleasure working alongside them, they are also thanked for important contributions to my PhD.
In addition to the people above, I also thank the sensory research community at large. The world of sensory science is an exciting place to be. I am happy to have conducted my PhD in this field, and I look forward to being part of it in the years to come.
On the personal side, I thank my family and my friends in Denmark, Italy, and many other places, for being there for me and supporting me through all my endeavors. The last thank you goes to my soon-to-be wife Slavka for her love, support and for constantly reminding me what life, besides getting a PhD, is really all about.
Copenhagen, August 5th 2013 Davide Giacalone