opening the black box

Page 1 – CONGRESS 2012

Copyright © ESOMAR 2012

OPENING THE BLACK BOX
AN ACADEMIC EVALUATION OF THE ABILITY OF ELECTROENCEPHALOGRAPHY (EEG) AND EYE TRACKING (ET) TO PREDICT ADVERTISING EFFECTIVENESS

Gemma Calvert • Cristina de Balanzó • Steve Watkins

INTRODUCTION

Following on from our last article published at ESOMAR (“Predicting brand decisions through emotional engagement”, de Balanzó, Ohme, Eising, 2011), we started to build a methodology to address one of the most important issues in communication testing: predicting advertising effectiveness more accurately. We suggested that biometric and brain analysis can significantly enrich traditional copy testing, but that further research was needed. As the next step in this journey, the authors, in collaboration with neuroscientists at the University of Reading, set out to learn more about this field of research.

In this paper, we describe the results of an academic evaluation of the ability of electroencephalography (EEG) and eye tracking (ET) to predict advertising effectiveness. Effectiveness was defined in three ways: by objective sales uplift figures; by subjective expert assessment of the advertisements’ creative treatment; or by post-exposure behavioural measures (whether the advertisements were recalled two weeks after exposure and the extent to which they were rated as highly likable).

The primary purpose was to try to unlock the proverbial “black box” of the brain by asking three questions. Are there consistent and detectable brain responses that correlate with some measure of advertising effectiveness? Can these be used as an aid to pre-testing? And what is the nature of the EEG or ET signals that purportedly predict effectiveness, and that are often alluded to by commercial practitioners of neuromarketing but not necessarily described in transparent detail?

It is generally agreed that advertising research in general, and pre-testing in particular, remains one of the most contentious areas of consumer research.
Existing approaches for predicting advertising effectiveness that rely on explicit consumer feedback have long been questioned over their ability to accurately measure and explain subsequent consumer behaviour (de Balanzó, Ohme, Eising, 2011). Central to the debate is whether these widely accepted industry approaches truly reflect the way the human brain processes and responds to advertising campaigns. While the research industry talks about ‘emotion’ or ‘motivational valence’, and has begun to explore the role of neuroscience in predicting advertising effectiveness, the continued reliance on measures such as ‘awareness’ and ‘persuasion’ suggests that the underlying assumptions remain unchanged.

In recent years neuroscience and biometrics have become the ‘much talked about’ tools in the marketing industry, particularly given the ability of these methods to measure human emotions, which are now recognised as a key determinant of consumer choice (Ariely and Berns, 2010). Of the tools available, EEG has emerged as one of the most widely employed methods to measure consumer responses to advertising campaigns and animatics. This is due primarily to its superior temporal resolution, which makes it possible to follow neural responses to advertisements on a millisecond basis (i.e. near frame by frame), and to its relative ease of acquisition (i.e. the use of electrode caps that can be placed on respondents’ heads while they watch advertisements in normal viewing environments), rather than to any ability to accurately record brain signals from the emotional centres of the brain that reside deep inside the skull. To measure emotions from these deep reward structures (e.g. the amygdala and nucleus accumbens), the only tool capable of capturing these signals non-invasively and with any resolution is functional magnetic resonance imaging (fMRI).
However, the requirement for respondents to lie still within a large MRI scanner, often situated in an academic or medical institution, has meant that practitioners of neuromarketing have largely opted for the more superficial but slightly more practical methodology of EEG to measure consumer responses. While the analysis of fMRI data is well standardised and the data analysis tools are publicly accessible, the analysis of commercial EEG data remains largely under-described in the interests of retaining intellectual property. Furthermore, there seems to be a large range in the number of EEG channels
used (from 2 to 32/64), in the nature of the signals purportedly being recorded (ERPs or frequency band changes in time or space), and in the insight offered into how these signals relate to explicit behaviours. A review of the relevant academic literature suggests that EEG may be capable of picking up brain responses that reflect cognitive processes such as attention and memory (Ambler, 2000; Ambler & Burne, 1999) and, potentially, emotional valence (withdrawal and approach behaviours) (Davidson, 1979; Davidson et al., 1990). However, there is inconsistency in the responses detected during the study of emotional processes using EEG (Coan and Allen, 2004).

Given the increasingly widespread use of EEG in commercial studies, and in view of the considerable scepticism from the academic community (and EEG specialists in particular) about the ability of EEG to detect unequivocally emotional brain responses (e.g. responses to effective advertising campaigns that may later predict a desired consumer response such as purchasing), there is clearly a need to submit the methodology to further scrutiny.

We scanned a group of primary household shoppers while they viewed a 60-minute documentary interspersed with five advertisement breaks of 5.5 minutes each. The advertisements consisted of a set of campaigns for the same brand that were pre-selected by Leo Burnett based on varying sales uplift figures (low, medium and high), another group of advertisements that were identified in consultation with Institute of Practitioners in Advertising (IPA) Award data as being “emotional” or “functional”, and advertisements that were very “poor” in terms of both creative treatment and message.

METHODS

Procedure

Thirty-five right-handed individuals (males and females) aged 21-45 years took part in the study. They were pre-screened to ensure that they were familiar with the brands being evaluated and were self-reported primary household shoppers.
All participants were recruited from the Reading area and asked to come to the University of Reading to participate in a study on television viewing using EEG. On arrival at the testing site, participants were informed that they would be asked to watch a nature documentary while wearing an EEG cap and while their eye movements were tracked. They were told that, to simulate the natural TV viewing environment, the documentary would be interspersed with advertisements. They were asked simply to watch the programme as they would at home and were told that they might be asked about some of the content after the scan. We also asked them to rate how hungry they were feeling just prior to scanning, on a scale from 1 = Not at all hungry to 10 = Extremely hungry. This was recorded to ensure there were no outliers in the EEG responses to food advertisements.

During viewing, participants’ brain responses were recorded using state-of-the-art EEG equipment: 64-electrode BrainCaps, amplified with a BrainAmp MR amplifier. Conductance between the cap electrodes and the scalp was facilitated by first swabbing the area under each electrode with a clinical alcohol solution and then applying a high chloride gel. This is an important and standard procedure in academic EEG studies that is often omitted in commercial tests because of the requirement to wash the respondent’s hair after testing. The testing was conducted in an electrically shielded room to minimise external noise (a factor that poses considerable challenges for commercial EEG studies conducted outside the laboratory setting, due to confounding electrical signals inherent in the environment).

Eye movements were recorded from a subset of the participants using a Tobii eye tracker to obtain additional data on which aspects of the content of these advertisements were driving changes in the brain responses at different time points. For full details of the data acquisition methods, please see the technical appendices.
After the scan, participants were taken to a quiet testing room and asked to complete a short brand usage questionnaire. Twenty-four hours later, they were asked to complete an online questionnaire that measured their free recall of the products and brands shown in the study, their prompted recall of the same brands and their liking of the advertisements to which they were exposed. A similar online questionnaire was administered two weeks later.

Materials

Documentary: A David Attenborough nature documentary was selected as the neutral but interesting programme into which we embedded our five ad breaks.

Advertisements: There were 39 advertisements selected for testing in this study. They were initially classified into three different conditions:


• based on their known sales uplift figures (low, medium or high performing);
• based on both subjective creative evaluation and achieved post-testing results (poor);
• based on subjective expert assessment of them as “functional” or “emotional”.

The main difference between the last two types is that “functional” advertisements are based on product features and derive their power from the rational information given, while “emotional” ones derive their strength from emotional content or creativity. In the sales uplift categories, the advertisements were matched for brand (e.g. a low and a high performing ad for brand X and, where possible/accessible, a medium performing ad).

In order to gain maximum statistical power, if the objective low/medium/high classification (fine-grained comparison) did not yield statistical differences in the EEG data, we proposed to run a further analysis in which we collapsed all objectively “low” and subjectively assessed “poor” advertisements into a group referred to as “bad”, and all the medium or high performing advertisements from sales uplift and the subjectively classified “emotional” advertisements into a group referred to as “good”. This resulted in four conditions pre-assigned for data analysis:

1. By sales uplift figures: Low (6), Medium (6), High (6)
2. By creative treatment defined as: “Functional” (5) or “Emotional” (7)
3. “Good” advertisements (19): Combines medium and high sales uplift, and emotional advertisements
4. “Bad” advertisements (12): Combines low sales uplift and those rated of a “poor” quality
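The pooling rule described above can be written down as a small lookup. This is only an illustrative sketch: the dictionary and function names are hypothetical, the counts are those quoted in the list above, and the number of “poor” advertisements is left out because the text does not state it separately.

```python
# Counts per original category, as quoted in the text. The "poor" count is
# not stated separately in the text, so it is deliberately omitted here.
COUNTS = {"low": 6, "medium": 6, "high": 6, "functional": 5, "emotional": 7}

# Hypothetical lookup expressing the pooling rule: which original categories
# feed the pooled "good" and "bad" conditions.
POOLED = {
    "low": "bad",
    "poor": "bad",
    "medium": "good",
    "high": "good",
    "emotional": "good",
}


def pooled_condition(label):
    """Return 'good' or 'bad' for a category, or None if it is not pooled."""
    return POOLED.get(label)


# Summing the stated counts for the categories pooled as "good".
good_total = sum(n for label, n in COUNTS.items()
                 if pooled_condition(label) == "good")
print(good_total)
```

The total for the pooled “good” group agrees with the figure of 19 given in the list; “functional” advertisements are not assigned to either pooled group.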

Finally, to test the relationship between our initial classification of the advertisements (based on objective sales data or subjective industry experience) and actual behavioural data (explicit post-scan ratings of the advertisements as liked or not liked, on a 10-point scale, and as recalled or not recalled), we additionally re-organised the data into two further groupings and re-ran the EEG frequency band analysis on these subjective post-scan rating categorisations:

1. Recalled versus not recalled
2. Liked versus not liked

Data analysis

For this project, we selected three different analytic approaches by which to analyse the EEG data in order to maximise our chances of picking up statistical differences between the experimental conditions and to investigate whether the results would differ depending on the method of analysis. These were:

EEG Frequency band analysis (alpha, beta and gamma)

This is probably the most common approach used by commercial EEG companies. It attempts to detect evidence of increased or decreased brain activity at different frequencies in response to a stimulus. Conventional offline data analysis was carried out using BrainVision Analyzer 2 software (see technical appendices). We first measured the group-averaged response to each ad across three frequency bands: Alpha (8 – 13Hz), Beta (13 – 30Hz) and Gamma (30 – 80Hz). A brief description follows of the cognitive and emotive processes that are thought to be reflected in these frequency bands.

Alpha band (8 – 13Hz) activity is associated with relaxation (e.g. feeling sleepy or at rest), and higher alpha is also observed when the eyes are closed. Relatively lower alpha activity in the left hemisphere than in the right is related to an ‘approach’ behavioural tendency (connected mostly with positive emotional states such as happiness, but also with anger). Relatively lower alpha activity in the right hemisphere than in the left is related to ‘withdrawal’ behavioural tendencies (connected with negative emotional states, e.g. fear, disgust).

Beta band (13 – 30Hz) activity is associated with concentration and engagement but also with active movements (motor activity).

Gamma band (30 – 100Hz) activity is associated with cognitive processing such as memory encoding and integration with existing knowledge.
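To make the band and asymmetry ideas concrete, the sketch below estimates power in a frequency band with a plain discrete Fourier transform and derives a simple left/right alpha asymmetry score. It is not the BrainVision Analyzer procedure used in the study. The function names, the log-ratio form of the asymmetry score and the synthetic 10Hz test signals are all assumptions made for illustration, and the band edges are the ones quoted for the analysis.

```python
import cmath
import math

# Band edges in Hz, as quoted for the analysis in the text.
BANDS = {"alpha": (8.0, 13.0), "beta": (13.0, 30.0), "gamma": (30.0, 80.0)}


def band_power(signal, fs, band):
    """Mean squared DFT magnitude over the frequency bins inside `band`."""
    n = len(signal)
    low, high = band
    powers = []
    for k in range(1, n // 2):
        freq = k * fs / n  # centre frequency of DFT bin k
        if low <= freq < high:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            powers.append(abs(coeff) ** 2 / n ** 2)
    if not powers:
        raise ValueError("signal too short to resolve this band")
    return sum(powers) / len(powers)


def alpha_asymmetry(left, right, fs):
    """Assumed score: ln(right alpha) - ln(left alpha).

    Positive values mean relatively lower alpha on the left, which the
    text relates to an 'approach' tendency.
    """
    return (math.log(band_power(right, fs, BANDS["alpha"]))
            - math.log(band_power(left, fs, BANDS["alpha"])))


# Synthetic one-second example: a 10Hz rhythm that is twice as large
# on the right channel as on the left.
fs = 256
times = [i / fs for i in range(fs)]
left = [1.0 * math.sin(2 * math.pi * 10 * t) for t in times]
right = [2.0 * math.sin(2 * math.pi * 10 * t) for t in times]
score = alpha_asymmetry(left, right, fs)
print(round(score, 3))
```

Because the right channel has twice the amplitude, its alpha power is four times larger, so the score comes out close to ln 4. Real recordings would need artefact rejection, windowing and averaging across electrodes within each region before any such score is meaningful.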


For this analysis, four regions of interest in the brain were defined in line with previous studies: Left Frontal (LF), Right Frontal (RF), Left Parietal (LP), and Right Parietal (RP). Analyses were carried out for each power band and each of the three different conditions by running an analysis of variance (ANOVA) with Condition, Hemisphere and Region as within-subject factors.

EEG Frequency analysis over time

In order to see whether some of these changes in alpha, beta or gamma were being driven by certain creative components (e.g. well-known faces, face close-ups, point of branding, etc.), some companies cut the frequency data into chunks (e.g. every 3s) and relate any statistically significant increases or decreases to specific parts of the footage. Consequently, we also applied this analysis to several of our advertisements that showed the largest changes in different frequency bands over the entire ad. However, it should be noted that this approach often fails to assign a change in the frequency domain accurately to a specific point in the advert. Cutting the frequency data into 3s chunks inherently makes the matching imprecise: many things may occur in an ad within a few frames, yet this approach has a resolution of only about 3s if the analysis is to remain statistically valid.
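The resolution limit described above can be illustrated with a short sketch. The helper names, the 100Hz sampling rate and the event times are hypothetical, and mean squared amplitude stands in for a proper band-power estimate.

```python
def split_into_windows(samples, fs, window_s=3.0):
    """Cut a recording into consecutive, non-overlapping windows."""
    size = int(round(window_s * fs))
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


def window_index(event_time_s, window_s=3.0):
    """Index of the window that contains a given moment of the advert."""
    return int(event_time_s // window_s)


def mean_power(window):
    """Mean squared amplitude, a stand-in for band power in one window."""
    return sum(x * x for x in window) / len(window)


# Hypothetical 30-second advert sampled at 100Hz (a flat dummy signal).
fs = 100
samples = [1.0] * (30 * fs)
windows = split_into_windows(samples, fs)
powers = [mean_power(w) for w in windows]

# Two creative events 2.7s apart (say, a face close-up at 3.2s and the
# brand logo at 5.9s) land in the same window, so a change in that
# window cannot be attributed to either one specifically.
same_window = window_index(3.2) == window_index(5.9)
print(len(windows), same_window)
```

Any power change measured in a window is shared by every creative event that falls inside it, which is the attribution problem the paragraph describes.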

EEG Empirical Mode Decomposition (EMD)

In order to circumvent the inherent problems with timeline frequency analysis described above, we also employed a newer approach which analyses the data continuously (as opposed to in chunks) and is therefore less vulnerable to errors and more likely to pick up changes in high frequency bands. This method, Empirical Mode Decomposition (EMD), allows us to investigate phase synchronization behaviour in the gamma band, which is known to correlate with higher cognitive functions including memory and evaluation. EMD is a data-driven method that is flexible in extracting amplitude and frequency information from complex signals such as EEG. Because of this flexibility, EMD is suitable for application to non-stationary (i.e. time-varying) signals, such as those recorded continuously in response to the fast-changing information included in advertisements. For further details, please see the technical appendices. We used EMD to examine “good” versus “bad” advertisements and “functional” versus “emotional” advertisements, with particular attention to high gamma band synchronization between brain regions, both across and within hemispheres.

Eye Tracking analysis

There are around ten different types of eye movement; the most important ones are saccades, fixations and smooth pursuit. A fixation occurs when the eye stops to focus. These stops last 100-600 milliseconds (usually 350ms when viewing a scene), and during a stop the brain begins to process visual information (Rayner & Sereno, 1994). Eye tracking data obtained while respondents viewed advertisements was filtered to remove looks away from the screen, poorly tracked looks (identified using eye movement validity estimates generated by the eye tracker) and duplicated items due to fixations.
Analysis focussed on the location of participants’ looks for each individual advert. The coordinates of each eye movement event (a saccade or fixation) were transformed into a quadrant by dividing the screen into sixteen areas (see figure 1).

FIGURE 1, LOCATION OF THE SIXTEEN SCREEN QUADRANTS

1    5    9    13
2    6    10   14
3    7    11   15
4    8    12   16

For each advert, the number of changes of look between quadrants was calculated. To control for the differing length of each advert, the number of quadrants that were fixated upon was divided by the total length of the advert (in seconds). This gave a measure of the number of quadrants fixated upon per second: the lower this value, the more focussed participants’ attention was on specific regions. The calculation was carried out using a combination of Visual Basic and R.
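The quadrant measure can be sketched as follows. This is a Python illustration only, not the authors’ Visual Basic and R code. It assumes screen coordinates with the origin at the top left, a hypothetical 1920 x 1080 display, and that the measure counts changes of quadrant between successive looks; the function names and fixation coordinates are made up.

```python
def quadrant(x, y, width, height, grid=4):
    """Map a screen coordinate to a quadrant numbered as in Figure 1.

    Numbering runs down each column first (1-4 in the left-hand column,
    13-16 in the right-hand column). Origin is assumed to be top left.
    """
    col = min(int(x / width * grid), grid - 1)
    row = min(int(y / height * grid), grid - 1)
    return col * grid + row + 1


def quadrant_changes_per_second(looks, width, height, duration_s):
    """Changes of quadrant between successive looks, per second of advert."""
    quads = [quadrant(x, y, width, height) for x, y in looks]
    changes = sum(1 for a, b in zip(quads, quads[1:]) if a != b)
    return changes / duration_s


# Hypothetical looks (x, y) in pixels during a 30-second advert.
looks = [(100, 100), (120, 110), (1000, 600), (1800, 1000)]
rate = quadrant_changes_per_second(looks, 1920, 1080, 30.0)
print([quadrant(x, y, 1920, 1080) for x, y in looks], rate)
```

Lower values indicate that looks stayed within fewer regions for longer, matching the interpretation given in the text.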


RESULTS

Frequency Bands by Ad Category Type

For all advertisements (regardless of their classification) we found a significant decrease in alpha power across the whole brain (which has been associated with increased attention) compared to a resting baseline obtained before the documentary started. This suggests that the advertisements were being attended to during the experiment. No differences between baseline and ad viewing were found for beta or gamma. All further analysis was then conducted on the advertisements only.

a. Low/Medium/High Effectiveness Advertisements based on sales uplift

Based on a large body of literature relating relative left frontal activation (i.e. low alpha power) to approach motivation and relative right frontal activation to avoidance/withdrawal motivations (see e.g. Harmon-Jones et al., 2010, for a recent review), we anticipated that frontal asymmetry would be associated with advertisements that elicit clear approach motivations in the observer. In this fine-grained discrimination based on sales uplift figures, we failed to find any statistical differences between the three conditions in terms of alpha, beta or gamma frequency band changes. Additionally, there were no significant differences in frontal alpha asymmetry between conditions. Accordingly, we then combined the low performing advertisements with the subjectively assessed “poor” advertisements (the “bad” condition), and the medium and high performing advertisements with the “emotional” advertisements (the “good” condition), and conducted all further analyses on these conditions rather than simply by sales uplift figures. These new categories were created because high sales (and similarly low sales) might be connected with many factors other than the emotions or tendencies evoked by advertisements.

b. “Good” versus “Bad” advertisements

When the sales uplift categories were pooled with the subjectively classified advertisements into “Good” and “Bad” conditions, we found a significant Condition x Hemisphere x Region interaction (F(2,50)=9.84, p=0.002). Specifically, left hemisphere alpha was significantly lower for “Good” advertisements than “Bad” advertisements, indicating potentially stronger “approach” motivations towards the “Good” advertisements (t(26)=3.24, p=0.003). Alpha suppression related to “Bad” advertisements was marginally significantly higher in the right than left frontal cortex (t(26)=2.00, p=0.057), which in turn potentially indicates more “withdrawal”. We found no significant effects for beta or gamma band activity.

c. Functional versus emotional advertisements based on prior expert assignment

There were significant interactions of Condition x Region (F(2,50)=7.74, p=0.006), with frontal activity being lower than parietal, and of Condition x Hemisphere x Region (F(2,50)=8.96, p=0.001). Right hemisphere alpha was significantly higher for Functional advertisements than Emotional advertisements, which potentially indicates more of a “withdrawal” response. We found no significant effects for beta or gamma band activity.

d. Results by explicit post-scan recall and liking measures

There were no significant differences in alpha amplitude or asymmetry between advertisements that were recalled and those that were not, nor any differences in the gamma band. However, there was a significant increase in beta across the whole brain for recalled compared with not-recalled advertisements (t(14)=-2.22, p=0.044). We did not find any differences depending on whether advertisements were classified as liked or disliked. A subsequent correlational analysis found that recall and liking were both significantly positively correlated with gamma (r=0.43, p=0.042; r=0.71, p=0.001) and also with each other (r=0.824, p