Defining and Visualizing Data-Driven Personas
Amy Wang, Daniel Rowland, Karina Harada & Kexiang Xu
Abstract
Traditional (non-data-driven) personas often fail to reflect real user populations (Chapman). In this paper we put forward a data-driven persona analysis process in which we categorize each persona based on a set of defining metrics and then characterize it by comparing its attributes against those of other personas. We also introduce a set of visualizations for presenting data-driven personas that convey detailed persona attributes, much like traditional personas, but with data to back up our findings. The visualization not only introduces the personas but also tells a story about the dataset that may draw interest to this new area of research.
Introduction
Chapman et al. (Chapman) pointed out an important weakness of persona models: as persona descriptions take on more attributes, they match a far smaller share of the real user population than user researchers predict (we discuss their result in detail in Related Works). We noted that the reason behind such findings was the lack of data-driven analysis of the actual user population when user researchers created detailed personas about their audience. We hoped to address this problem with our persona analysis process, in which we categorized each persona based on a set of defining metrics that set its members apart from other users, and characterized it by comparing its attributes against those of other personas. Using the Yelp Academic Dataset, our goal was to identify and define different personality groups within the data through data analysis and metric development. We wanted to create an
information visualization that allows users to more easily explore the composition of a business' aggregate Yelp rating. In doing so, we hoped to give Yelp users insight that helps them make better choices between Yelp search results, and to help businesses better understand their reviewers. As defined by usability.gov, effective personas:
● Represent a major user group for a website
● Express and focus on the major needs and expectations of the most important user groups
● Give a clear picture of the user's expectations and how they are likely to use the site
● Aid in uncovering universal features and functionality
● Describe real people with backgrounds, goals, and values ("Personas").
One of the most important aspects of our visualization is that it tells a story. Defining data-driven personas is a relatively new idea, which means the visualization does not inform the user about a current issue. That said, it does the next best thing, which is to "raise awareness and create interest in a topic a reader may not otherwise have been aware of" (Kosara and Mackinlay 5). Even if that interest is small now, data-driven personas have the potential to assist in gauging the credibility of business reviews. Imagine a time when filtering out fake Yelp profiles could be as easy as clicking a button.
Background
Related Works
Chapman et al. (Chapman) pointed out an important weakness of persona models. In their paper they proposed a formal model for describing personas, which they called "persona-like descriptions," and compared the prevalence of each persona group in a user dataset as each persona took on more attributes. According to their research, each persona-like description lost prevalence as it became more fleshed-out (taking on more attributes), and as a result represented fewer real users than the user researchers predicted. Their studies showed that the Pearson's r value of observed vs. predicted persona prevalence ranged between 0.160 and 0.636 at the 99th percentile (Chapman), which indicated a very weak relationship between a persona's real prevalence and the designer's perceived prevalence. We noted that the reason behind such findings was the lack of data-driven analysis of the actual user population when user researchers created detailed personas about their audience. We hoped to address this problem with our persona analysis process, where we categorized
each persona based on a set of defining metrics that set its members apart from other users, and characterized it by comparing its attributes against those of other personas. Our research on Google Scholar concluded that there were no previous publications on creating personas based on the Yelp Academic Dataset. We found two papers on large-scale classification of Twitter users based on attributes mined from their activity streams using machine-learning approaches. Wagner et al., in their paper "Religious Politicians and Creative Photographers: Automatic User Categorization in Twitter," described a model for extracting latent attributes of Twitter users using Random Forest classifiers and demonstrated how to efficiently categorize Twitter users based on two of these attributes: personality and profession (Wagner). Pennacchiotti et al., in their paper "Democrats, Republicans and Starbucks Afficionados," presented an architecture for classifying Twitter users by a set of attributes, such as political affiliation, ethnicity, and whether a user was a fan of Starbucks, using Gradient Boosted Decision Trees and social network analysis (Pennacchiotti). However, unlike Wagner's paper, it did not describe how to utilize these discrete attributes to categorize users into cohesive groups (e.g., African-American Democrats who are Starbucks fans). Both papers focused on extracting features from each user's activity stream, which predominantly consisted of textual data, as well as from each user's friend network. Our approach differed from the aforementioned papers in two important ways. First, we focused on an area they overlooked: a fine-grained categorization and analysis of user groups (personas) based on each user's activity attributes. We went beyond individual user attributes and combined discrete user attributes to categorize users into cohesive groups, providing insights usable by users and Yelp businesses alike. Second, we focused less on devising a set of user attributes and instead relied heavily on the user attributes given in the dataset. Our system could have benefited from using machine-learning models to generate user attributes better suited for persona analysis, but that would have distracted from building the core system. Instead, we considered the user attributes given in the dataset sufficient for the scope of this project. We did generate a number of user attributes by aggregating review activity, such as average review length and word frequencies. We considered using machine-learning models for categorizing users into personas, but did not foresee enough benefit from a machine-learning-based classifier given the time constraints. There were also a number of applications that facilitate the creation of personas. For example, UserForge and Personapp were online applications which help user researchers create personas with full names, detailed attributes, and contextual imagery (Farley and Engler; "Personapp"). We thought that personas created with these applications would suffer from the problem described in Chapman et al.'s paper: they would map poorly to the real user population. This is the kind of problem that we set out to address with our paper.
User Research
During the exploration stage, we conducted a series of interviews with users who were familiar with Yelp. We interviewed a Yelp Elite user (as defined by Yelp), a Yelp employee, and someone who used Yelp for food recommendations. We talked to a diverse group of users during the user research process to understand which features people cared about most. Because the interview with the Yelp employee did not match the goals of our final visualization, we chose to exclude his interview results.
Yelp Elite User
A Yelp Elite user, as defined by Yelp, is a stellar Yelp community member and a role model to both new and old Yelpers. These people are either nominated by their peers or self-nominated. This user was an active food business reviewer and enjoyed communicating with other Yelp users online and in person. She enjoyed the community features of Yelp and how it was food-focused, something that set it apart from other social media platforms. When we asked her what made a quality review, she compared it to a critique that included details such as the restaurant's environment, service, and food. She especially emphasized how much service mattered in her reviews and mentioned that she never went to restaurants that had fewer than four stars. We also asked her about her friends on Yelp and whether she met them in person or online. She said that about 40% were her Facebook friends and 60% were her Yelp friends. The benefit of having Yelp friends was that they populated the newsfeed on her homepage, and she enjoyed reading quality reviews from them. After exploring the Yelp dataset more, we started to see potentially different kinds of personalities that populated Yelp. We then asked our Yelp Elite user what groups of people stood out to her on Yelp, and she mentioned the following: lurkers, humor-driven reviewers, people who look at everything in a restaurant from service to food, people who hype up restaurants, and complainers. It bears mentioning that, upon later reflection, we realized the personas we pulled from the data turned out to be very similar to these.
Yelp User
We also conducted an interview with a user who used Yelp at least once a week to find reviews of restaurants and to explore restaurants in a new area. She occasionally wrote reviews if she felt the food or service was fantastic or horrible. When asked how confident she was in Yelp ratings, she said her confidence increased with the number of reviews a business had. However, she found the ratings for Vietnamese restaurants less reliable, since she grew up eating Vietnamese food and perceived it differently than people who only ate the restaurant variety. One of the most important insights gained from this interview was that different people have different tastes, which could affect their reviews of a restaurant's food.
Methods
Data Analysis
Text and Friend Network Analysis
For our analysis, we primarily relied on the existing user attributes in the Yelp Academic Dataset, which included review count, average star rating given in reviews, and the number of funny, useful, and cool votes each user received. We also generated a number of metadata fields based on each user's review activity and composition, as well as their friend network. Individual user reviews were used to compute three aggregated values based on the size of the review in terms of the number of characters, words, and sentences. For the word count and sentence count, we computed these values using the Natural Language Toolkit (NLTK) Treebank tokenizer, a well-accepted tokenizer in the Natural Language Processing (NLP) community (Bird, Loper, and Klein). We then computed three metrics for character count, word count, and sentence count: average, standard deviation, and the 70th percentile. The Yelp dataset came with an extensive social network graph, which we unfortunately did not have the capability to analyze in Tableau. As such, we merely computed each user's friend count and used it in our analysis. We also came up with an attribute pertaining to how long each user had been on Yelp. In the raw dataset, this information was given in the form of each user's registration year and month, which was not easy to analyze in Tableau. As such, we computed the number of days
the user had been on Yelp by subtracting the user's registration date from our time of analysis. Since the Yelp dataset was collected in early 2015, every user in the dataset had a "Yelp Age" of at least 300 days. However, since we were only interested in the relative differences between users' "Yelp Ages", this did not affect our analysis. After we came up with a set of personas, we also performed word frequency analysis on each persona. This was done by taking all the reviews from users belonging to each persona and computing a bag-of-words representation over that set of documents. We tokenized each word using NLTK's Treebank tokenizer and also eliminated stop words. Since the dataset contained 1.6 million reviews at 1.5 GB, finishing the computation in a time-efficient manner was tricky. Our solution was to use the excellent sorted-set implementation of the Redis key-value database to keep count of word frequencies. Once we had the bag-of-words for each persona, we took the top 500 words of each persona and eliminated the words which co-occurred in the top 400 words of other personas; using the slightly lower cutoff of 400 for the other personas ensured that a relatively common word which one persona used much more frequently did not get eliminated. The remaining words of each persona were then used in the analysis.
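A minimal sketch of these aggregation steps in Python appears below. It follows the pipeline described above (NLTK tokenization, a Redis sorted set per persona), but the function names, the Redis key scheme, and the exact analysis date are our illustrative assumptions, not the original code.

```python
import datetime

import nltk
import numpy as np
import redis

nltk.download("punkt", quiet=True)       # Punkt model backing sent_tokenize
nltk.download("stopwords", quiet=True)

STOP_WORDS = set(nltk.corpus.stopwords.words("english"))

def length_metrics(review_texts):
    """Per-user character, word, and sentence counts, reduced to the
    mean, standard deviation, and 70th percentile used in our analysis."""
    chars = [len(t) for t in review_texts]
    words = [len(nltk.word_tokenize(t)) for t in review_texts]  # Treebank-style tokens
    sents = [len(nltk.sent_tokenize(t)) for t in review_texts]
    return {
        name: (np.mean(vals), np.std(vals), np.percentile(vals, 70))
        for name, vals in (("chars", chars), ("words", words), ("sents", sents))
    }

def yelp_age_days(yelping_since, analysis_date=datetime.date(2015, 3, 1)):
    """Days on Yelp; registration is given as 'YYYY-MM' in the raw data.
    The exact analysis date is an assumption (dataset collected early 2015)."""
    year, month = map(int, yelping_since.split("-"))
    return (analysis_date - datetime.date(year, month, 1)).days

r = redis.Redis()  # local Redis instance holding one sorted set per persona

def add_review_words(persona, review_text):
    """Accumulate the persona's bag-of-words in a Redis sorted set."""
    for token in nltk.word_tokenize(review_text.lower()):
        if token.isalpha() and token not in STOP_WORDS:
            r.zincrby(f"words:{persona}", 1, token)  # redis-py >= 3.0 signature

def distinctive_words(persona, all_personas):
    """Top 500 words of a persona, minus any word that also appears in
    another persona's top 400."""
    mine = [w.decode() for w in r.zrevrange(f"words:{persona}", 0, 499)]
    common = set()
    for other in all_personas:
        if other != persona:
            common.update(w.decode() for w in r.zrevrange(f"words:{other}", 0, 399))
    return [w for w in mine if w not in common]
```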
Persona Discovery
During our preliminary analysis, we started to find interesting user groups while exploring what defined Yelp Elite users. The scatterplot below compares each user's friend count to their average star rating. In looking at this visualization we noticed odd groupings of users and began a segmented analysis of each. This led to our first three postulated personas: trolls, goody goodies, and hyper socials.
We also noted the existence of a sizable cluster of users with a disproportionately high ratio of funny votes to useful votes. On the scatterplot below, each dot represents a user. The horizontal axis shows the number of useful votes, which indicates the number of community members who have voted the user's reviews as "useful." The vertical axis in turn shows the number of funny votes, and the color represents the ratio of funny votes to useful votes. In general, both of these metrics indicate popularity, as a user who receives a high number of useful votes also gets a lot of funny votes. However, as the graph shows, a number of users have disproportionately more funny votes than useful votes. We suspected that these users were humor-driven reviewers and considered them a potential Comedian persona. In similar ways, we noticed a group of users who had very high review output and a high number of useful votes. We considered them to potentially be our Professional Critic persona.
A challenge we faced in finding personas was the strictness of the defining metrics. Overly strict defining metrics resulted in small user groups that did not represent a large part of the user base. Loose metrics, on the other hand, weakened the distinguishing characteristics of each persona. Our solution was to use strict defining metrics to narrow down to a small subset of core users within each persona and determine their characteristics, and then gradually loosen the defining metrics to include more users in each group while ensuring minimal impact on each persona's characteristics. For example, we initially defined Professional Critics as users with over 100 reviews written and 2635 useful votes, which encompassed a group of 868 users. We then loosened the criteria to the 90th percentile for the number of reviews written and useful votes, which widened the user group to around 10% of the total user population. Another challenge during persona discovery was the categorization of the last third of the user base. These users, whom we could not categorize into any persona, had about the same median value as the whole population for every attribute we considered, which seemed to indicate that they were a group of featureless average users. However, after looking at the distribution, there seemed to be a number of outliers in friend count, which indicated that there could be a user group who used Yelp primarily for social networking rather than for reading business reviews. We were not able to isolate a substantial user group for further analysis, though. In the end, we decided to separate out the uncategorized users with fewer than 13 friends (below 1.5 IQR above the upper quartile) as an Average Joe persona of more featureless users, and left the remaining 4,598 uncategorized users for future analysis.
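The sketch below shows how these percentile- and IQR-based cutoffs could be computed with pandas; the DataFrame and its column names (review_count, useful_votes, friends) are hypothetical stand-ins for the dataset fields.

```python
import pandas as pd

def upper_fence(series: pd.Series) -> float:
    """Tukey-style fence (upper quartile + 1.5 * IQR), the rule behind
    the friend-count and funny-vote cutoffs."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    return q3 + 1.5 * (q3 - q1)

def professional_critics(users: pd.DataFrame) -> pd.DataFrame:
    """Loosened definition: 90th percentile on both defining metrics."""
    return users[(users["review_count"] >= users["review_count"].quantile(0.90)) &
                 (users["useful_votes"] >= users["useful_votes"].quantile(0.90))]

def average_joes(leftover: pd.DataFrame) -> pd.DataFrame:
    """Users not captured by the other personas, with friend counts below
    the fence (about 13 friends in our data)."""
    return leftover[leftover["friends"] < upper_fence(leftover["friends"])]
```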
Persona Characterization
The dataset had three direct measures of how well-received a user was: users were able to vote other reviewers' contributions as "Funny," "Cool," and "Useful." We calculated the average number of votes a review would receive to get an idea of what was "normal". From there, we viewed the distribution of users with respect to these averages and began to identify quantitative attributes from which we could define personas. We extended this approach to other user profile data, including friend count, review count, review length (in both words and characters), and word choice. After we categorized users into personas, we characterized them based on their collective attributes. We gained a lot of insight into what these personas were like, which helped us flesh out the personas without sacrificing their prevalence in the real user population. For the persona characterization, we decided to analyze each persona based on the following attributes: average star rating, review count, length of review, number of useful votes, years active, friend count, and the ratio of funny votes to useful votes (a minimal sketch of this aggregation appears after the figure below). During the characterization, our perception of some personas changed dramatically. The most significant example is the persona called Average Reviewer, which we renamed to BASIC Characters (Balanced Active Social Involved Characters) to reflect the change. We initially defined this group as the average, middle-of-the-road users who partook in social and business review activities but failed to stand out in any way. As we looked into the Unfriendly (now named Average Joe) and contrasted the two personas, we realized that the Unfriendly, who possessed very few friends, wrote roughly one review, and received roughly one vote, were the true mass of average users, and the Average Reviewers were active by comparison. Thus we renamed the Unfriendly to Average Joe and gave the Average Reviewers the acronym BASIC (Balanced Active Social Involved Characters) to reflect our belief that they were active Yelp users who simply did not stand out in comparison to the other personas.
Figure: evolution of the personas and their designated names.
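Once each user carries a persona label, the characterization itself reduces to a per-group aggregate. A minimal sketch, continuing the hypothetical users DataFrame from the previous sketch (the attribute column names are again illustrative):

```python
import pandas as pd

# Attributes compared across personas in our characterization step.
ATTRS = ["avg_stars", "review_count", "review_length", "useful_votes",
         "yelp_age_days", "friends", "funny_to_useful"]

def characterize(users: pd.DataFrame) -> pd.DataFrame:
    """Median of each characterization attribute, per persona label."""
    return users.groupby("persona")[ATTRS].median().round(2)
```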
Design Process
Usability Studies
After cycling through many prototypes and solidifying a working dashboard, we brought our design in front of two Yelp users to see how they would respond. We received a lot of great feedback, including compliments and suggestions for improvement. Below is the prototype we put in front of our users; it is prototype version 12.2.
Initial Impressions
Our users really liked how the area chart introduced the personas by giving an overview of the persona and business data. For example, one user stated that she "first noticed the [area graph] because of all of the different colors" and that she liked how "the color coded areas are easy to see", while another user listed the area graph as one of the most easily understandable parts of our visualization. This feedback helped us understand both the effectiveness of our visualization's overview of the data and how the area graph assisted users in understanding the personas present in the Yelp community. For this reason, we kept the area graph as the user's main entry point to our visualization.
Exploration of the Visualization
Our users varied in how they began to explore the visualization, but there was a tendency to hover over sections instead of clicking on them. This proved helpful for our users because we provided information about the persona names in the tooltips of the area chart, parallel coordinates chart, and pie charts. We did this to support details-on-demand, so that users could automatically obtain details from the visualization. One thing we noticed is that our users made few interactions with our Top 10 Businesses filter. The top 10 businesses were actually listed twice: one list controlled the overview area chart, while the other controlled the table with more detailed information. We took this into consideration when making changes to our dashboard, both to eliminate redundancy and to make the filters seem more interactive. Although the intention of the pie charts was to show part-to-whole comparisons between a highlighted persona and the whole Yelp community, one user also
used them as a way to find out which personas occupied the big chunks of each category by hovering over each pie slice. This was an interesting finding because we only intended them to be passive elements, but the pie charts also became a way of exploring the data in further detail.
Suggestions
We got quite a few suggestions on how to improve our visualization. The following were suggestions that came up multiple times during our usability assessments, were seen as ways to conform our labels to Yelp conventions, or were agreed-upon changes that could improve the overall experience of the visualization.
Location of Persona Information
When the user highlights a persona by either clicking a shaded area of the area graph or clicking a line in the parallel coordinates graph, the persona name, description, and corresponding pie slices in the pie chart area are highlighted as well. This was to support zoom and filter tasks, since these tasks allow the user to further explore personas in unique ways. However, when one of our users clicked part of the area graph, she saw no change in the surrounding information because the persona information was out of view from the overview area graph, something we did not predict because we assumed the user would naturally scroll down to view more about the persona. Afterwards, she elaborated on this experience by saying that she preferred to "put the Persona Information in the upper part [of the dashboard] because it's related to the [area] graph more than the information about the top 10 businesses." She also mentioned that she would put the Persona Information above the Parallel Coordinates graph as well so that she "could easily see the change in information from the [area graph filters]."
Appearance and Explicitness of the Dashboard Directions
Following the same experience as the previous section, the user did not interact with the area graph until we pointed out the directions at the top of the dashboard. This shows how making the dashboard directions more apparent could help guide the user's exploration of the persona information. After reading the directions and attempting to highlight a persona in the area graph, she thought she had done something wrong and said, "Oh, I think I just broke it. This is probably not what you wanted me to do, is it?" As mentioned in the previous section, this is in part because the persona information was located too low in the dashboard to see any changes while looking at the overview of the data. However, when we repeated the same directions but added "and then scroll down to learn more about [the personas]", she instantly scrolled down the dashboard and said, "Oh! That makes more sense now." As a result, we added this
small instruction to our dashboard directions and also made the directions more prominent at the top of our dashboard.
Visual Separation between Different Categories on the Parallel Coordinates Chart
When looking at the parallel coordinates chart, one user was confused about why the different points were connected when each category was distinctly different. She said, "I don't see a meaningful connection between the different categories on the x-axis" and explained her rationale by saying that "when people see lines, they think of change either chronologically or [with] some kind of trend. So if [the points] were represented as dots or something, then it would be more obvious that they represent different categories." The following figure shows our solution to this problem, which was to visually separate the categories through alternating white and light gray panes.
Although we understood what our user was saying, we believed that having lines connecting the categories made it easier to follow one persona across the parallel coordinates graph and better represented how closely related two or more personas were. This can be seen in how the BASIC Characters (orange line) and Comedians (green line) are similar in every category besides the number of funny votes they receive for their reviews.
Design Decisions
Word Clouds vs. a List of Words for the Most Used Words of Each Persona
Because the text of Yelp reviews took up about three quarters of our dataset, we decided to conduct some text analysis to help characterize each persona. We planned on using word clouds as a way to visualize the ten most used words of each persona. However, we were not creating the word clouds with the intent of comparing words between the personas, but just to show the characteristics of each persona.
Unfortunately, there are many problems with word clouds that prevented us from using them. The main reason we abandoned the idea was that word clouds make poor use of preattentive attributes such as size and position, both because some words are naturally longer than others and because the position of a word in a word cloud has nothing to do with how frequently it appears in the body of text (Scott, "Text Visualization"). Another downside is that word clouds require the user to linearly scan the data, whereas a table with a list of the top ten words only requires a quick vertical glance (Scott, "Text Visualization"). These points suggest that sometimes a simple table or list is the best way to represent data, thanks to its intuitive nature and systematic structure. Another struggle we had in visualizing the ten most used words was whether we should explicitly rank them. Although the words could be ranked by frequency, we wondered whether that was relevant to the information we wanted the user to take away from our visualization. In addition, the relative frequencies of the words were so close to each other that ranking them would not have said anything meaningful about their usage. In the end, we decided to have the words ordered but not explicitly labeled with a rank, because the words are there to characterize the personas, not to encourage comparison.
The Use of the 'Evil' Pie Charts
The use of pie charts is generally discouraged because it is difficult for the viewer to accurately compare the lengths of pie slice arcs. However, we found pie charts to be very useful for part-to-whole comparisons between one persona and the whole Yelp community in categories such as persona population, number of reviews, number of friends, and number of useful votes. Part-to-whole comparison over fewer than seven categories is one of the rare cases where pie charts excel (Aragon, "How to Critique a Visualization"). We implemented this part-to-whole comparison by highlighting the corresponding persona pie slice when it is selected from either the overview area graph or the parallel coordinates chart. In the following figure, the Professional Critics persona is highlighted in each pie chart, which tells quite a story when the pie charts are viewed side-by-side.
In the first pie chart, the viewer can see that Professional Critics make up a small portion of the Yelp community. But in the second pie chart, you can see that they are responsible for the majority of the reviews on Yelp. The third and fourth pie charts show how their reviews attract a substantial number of friends and useful votes, respectively. Stories
such as this one about the Professional Critics would have been difficult to visualize had we not used pie charts, which is why they are part of our final visualization.
Visualizing Uncertainty
One idea that came to us after the usability studies was visualizing uncertainty for each category displayed in the parallel coordinates chart. Our reasoning can be summarized by this quote from Thomas and Cook's book Illuminating the Path: "Uncertainty must be displayed if it is to be reasoned with and incorporated into the visual analytics process" (87). In terms of our visualization, visualizing uncertainty helps users understand the range of values behind the single summary values shown in the parallel coordinates graph. We also knew that the uncertainty display would have to be easily understandable, so we explored various graph types, as seen in the following three figures.
In the first figure, we tried visualizing the distribution for each persona using the position of each dot to represent the range of the values, but the personas were hard to distinguish from one another, so this idea was abandoned.
Another idea we tried was a vertical histogram, shown in the second figure. Although this was a significant improvement over the previous graph, it still did not seem to accurately display the horizontal range of the distribution, especially when looking at the Comedian (green) and Dark Matter (blue) lines.
Eventually, we tried the violin charts shown above. We noticed that they conveyed the information we wanted to show by making good use of preattentive attributes such as length, position, shape, and curvature. During our success evaluation testing phase, three of our users really liked the addition of the violin charts. One of them said, "These [violin] charts make it a lot easier to see the trend [between the personas]." In our final visualization, the violin charts are presented as an optional view, through a button close to the parallel coordinates chart and a menu option when clicking on a persona in the parallel coordinates chart. Our intention was to have these violin charts available for users who wanted to explore personas in more depth, but they ended up being readily explored by all of our users who were simply curious about the optional view of the data.
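Our violin charts were built in Tableau, but for readers who want to reproduce the idea, the sketch below shows an equivalent view with matplotlib (the DataFrame and column names are hypothetical, as in the earlier sketches):

```python
import matplotlib.pyplot as plt
import pandas as pd

def violins_by_persona(users: pd.DataFrame, attribute: str):
    """One violin per persona for a single attribute, with medians marked."""
    personas = sorted(users["persona"].unique())
    data = [users.loc[users["persona"] == p, attribute] for p in personas]
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.violinplot(data, showmedians=True)  # shape encodes the full distribution
    ax.set_xticks(range(1, len(personas) + 1))
    ax.set_xticklabels(personas, rotation=30, ha="right")
    ax.set_ylabel(attribute)
    fig.tight_layout()
    plt.show()
```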
Reducing the Business Information View
In the Usability Studies section, we mentioned that one of our prototypes had a more detailed view of the ten most reviewed businesses underneath the overview. The business information included a table with the names of the top ten businesses, the type of business (restaurant, buffet, breakfast & brunch, etc.), and the price range. There was also a bar graph showing the review count for each of the top ten businesses. Because of the feedback that the business information seemed unrelated to the persona information, we decided to reduce it to the filtered row of business information above the overview area graph and to eliminate the review count bar chart, as it did not display any meaningful data.
Results
Personas
Our analysis produced six personas:
Professional Critics
Professional Critics, though only 8.11% of the user population, are the biggest contributors of business reviews on Yelp. They authored 61.90% of total business reviews and received 77.10% of the total useful votes on user profiles. A Professional Critic is defined by a well-above-average review count and useful vote count: over 74 reviews and 89 useful votes. These values correspond to the 90th percentile of the population.
Comedians
Comedians are a small user group, 0.49% of the user population. They are well above average in review count, review length, and popularity (useful votes and cool votes), but their highlight is having a disproportionately high number of funny votes compared to their other profile votes. They are not afraid of colorful word usage to achieve a comical effect. A Comedian is defined by having over 14 funny votes (1.5 IQR above the upper quartile) and a ratio of funny votes to useful votes over 1.2 (98th percentile).
Goody Goodies
Goody Goodies only give 4- and 5-star reviews. They make up 25.16% of the total population, but contribute 3.94% of total reviews and receive 1.90% of total useful votes. They are below average in review length and are generally newer to Yelp than other users. They do not hesitate to give businesses very kind words, such as "professional," "beautiful," and "thank." A Goody Goody is defined by an average review rating above 4.43 (the upper quartile of the population).
Complainers
Complainers generally give reviews of 3 stars and lower. They make up 13.50% of the total population but contribute only 1.82% of total reviews, although their reviews are longer than average. Like Goody Goodies, they are generally newer to Yelp than other users. They tend to complain in their reviews, using words like "rude" and "worst." A Complainer is defined by an average review rating below 2.73 (the lower quartile of the population).
BASIC Characters
BASIC stands for Balanced Active Social Involved Characters. BASIC Characters are 13.89% of total users and contribute 18.73% of total reviews, though they receive only 12.71% of total useful votes. They are well above average in review count, review length, and popularity, similar to Comedians, but don't receive as many funny votes for their effort. A BASIC Character is defined by not meeting the defining metrics of Goody Goodies, Complainers, Professional Critics, or Comedians, and receiving more than 7 cool votes and 6 useful votes. Both values are the upper quartiles of the respective metrics.
Average Joe
Average Joes make up 37.58% of the user base and are quite average in every attribute we considered. In review count, review length, friend count, and other metrics, they do not differ from the median of the population. They contribute 12.11% of total reviews. An Average Joe is defined by not meeting the defining metrics of Goody Goodies, Complainers, Professional Critics, or Comedians, and having fewer than 13 friends (below 1.5 IQR above the upper quartile).
Uncategorized
The Uncategorized consist of everyone who does not fall into the other categories. They make up 1.25% of the user base. The group's median attribute values vary widely, sometimes above the population median and sometimes below, and their friend counts in particular go hand-in-hand with those of Professional Critics. They are preserved for future analysis.
Final Visualization
Our final visualization can be found at the following link: https://public.tableau.com/profile/publish/AnalyzingPersonasAmongYelpUsers/OURDASHBOARD#!/publishconfirm
The following three figures are snapshots of parts of our dashboard. The first figure presents the overview the user sees upon entering our dashboard. The user can see the overview area graph, which shows the review composition of all businesses contained in the Yelp dataset, divided up by persona, with each persona represented by a different colored area. There is also a line across the same graph that represents the average star rating; for each point on the line, the star rating is calculated over the current month and the preceding five months.
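For reference, this trailing six-month average is straightforward to express in pandas; a minimal sketch, assuming a hypothetical reviews DataFrame with datetime "date" and numeric "stars" columns:

```python
import pandas as pd

def trailing_average_stars(reviews: pd.DataFrame) -> pd.Series:
    """Average star rating per month, smoothed over the current month
    and the preceding five."""
    monthly = reviews.set_index("date")["stars"].resample("M").mean()
    return monthly.rolling(window=6, min_periods=1).mean()
```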
Next to the area graph is the parallel coordinates graph, which shows how each persona is defined along the following categories: average star rating, review count, length of review, number of useful votes, years active as a Yelp user, number of friends, and number of funny votes. To the right of the parallel coordinates graph is a button that leads to a more detailed view of the personas. In the second figure, the Goody Goodies persona is highlighted to show how each section highlights across the area graph, parallel coordinates graph, population details, persona description, and the ten most used words of the Goody Goodies.
The last figure is only a partial view of our detailed persona view because this view is too long to show on one page. It consists of violin charts showing the uncertainty of each category contained in the parallel coordinates graph.
Evaluation
Looking back at the goals we were trying to achieve, we wanted to answer two questions with our data analysis and visualization: 1) What is the breakdown of personas on Yelp? and 2) Can information about personas help users make informed opinions about businesses? We accomplished our first goal because we were able to use data-driven analysis to produce distinct personas. For our second goal, we received mixed reactions about whether using personas to categorize reviews would be an effective tool for our users.
Persona Analysis Evaluation
We came up with two criteria that determined the quality of our personas: user base coverage and persona overlap. For user base coverage, the percentage of the user population covered by the personas indicated the prevalence of our persona categories. For persona overlap, the amount of overlap between each pair of personas showed how independent our personas were, which was important for creating distinct and meaningful personas. We covered 98.75% of the 366,715 users with six personas. Only 4,598 (1.25%) users in the dataset remained uncategorized.
There was a small amount of overlap between personas, as described in the tables below. We noted that there was some overlap between Comedians, Professional Critics, Goody Goodies, and Complainers. Some of the overlap may be due to imprecision in our persona-defining metrics; for example, the defining metrics of Professional Critics only considered the number of reviews written and useful votes received, and failed to account for actual professionalism. Other overlap may indicate users who genuinely share the attributes of both personas, for example professional reviewers who were also funny. We also noted that, according to the table, Professional Critics overlap with Comedians while Comedians do not overlap with Professional Critics. This asymmetry was an artifact of our order of categorization, the Comedians having been selected from the entire user base first. In addition, it can be observed that the Average Joes and BASIC Characters did not overlap with other personas. This was because these two personas were the result of analysis over the subset of users who did not belong to the other four personas. Overall, we believed that the overlap was insignificant and did not affect the result of persona characterization.
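A minimal sketch of how such an overlap table and the coverage figure can be computed, assuming a hypothetical persona_members mapping from persona name to the set of member user ids:

```python
import pandas as pd

def overlap_table(persona_members: dict) -> pd.DataFrame:
    """Pairwise overlap counts: row a, column b holds the number of
    users shared by personas a and b."""
    names = list(persona_members)
    return pd.DataFrame(
        {b: [len(persona_members[a] & persona_members[b]) for a in names]
         for b in names},
        index=names,
    )

def coverage(persona_members: dict, total_users: int) -> float:
    """Fraction of the user base assigned to at least one persona."""
    covered = set().union(*persona_members.values())
    return len(covered) / total_users
```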
Success Evaluation
Methods
To assess the success of our visualization, we developed a semi-structured user test with four participants, consisting of an exploration/familiarization phase followed by a task prompt. During the exploration phase, the test users were to take some time to become familiar with the dashboard and the personas. No effort was made to ensure their "correct" interpretation of any particular portion of the display. Afterwards, users were instructed to pretend that the top 10 business list in the left margin of the dashboard contained results from a Yelp search for restaurants and to use the visualization to choose where to eat. After they had chosen, they were asked to explain their choice.
Rationale
The purpose of the exploration phase was to give the test users time to familiarize themselves with the dashboard, in terms of navigation as well as making sense of the information it contains. Asking the users to make a choice based on our visualization addressed our primary goal, which was to see whether users could use our visualization to understand the features of each persona and then use that information to make good decisions about the reliability of a business's review composition. Our task reflected this goal by allowing users to explore each persona and then decide where to eat based on how each persona populated a business's review composition.
Results
We determined the success of our second goal by whether our participants would use the information about the personas to make decisions about where to eat. Two out of the four participants said that they would use the persona information to gauge the reliability of reviews for any given business. One of our participants noted that she would go to Earl's Sandwich because she "judged the restaurant by the number of professional reviews and the lack of complainers." Another participant mentioned how she "sees that there are a lot of professional critics represented [there], so it must be a good [restaurant] to look out for." Of the other two participants, one cared more about the dollar sign associated with each business and the other was more focused on the average star rating of reviews for that business. These two participants stated that they were less likely to use the persona information to draw conclusions about business reviews. We noticed that the training or previous knowledge participants had of information visualizations, as well as the amount of time they spent on exploration, were significant factors in their understanding of our visualization. As a result, the participants who stated they had a hard time with statistics or did not understand the graphs may not have fully understood our visualization or were hasty in their examination. This suggests that our visualization is not ready for integration with the Yelp website, as web users want to consume information and execute tasks quickly. We should therefore begin measuring improvements to our visualization in terms of speed of understanding and task completion.
Future Work
Persona Analysis
There are still 4,598 users in the dataset who remained uncategorized by our six personas. We wanted to continue research into this small subset of users, possibly through in-depth text analysis. In addition, some potential user groups suggested during the user research phase never came to fruition, such as Evangelists, who are keen on discovering recently opened and underappreciated businesses, and Surfers, who use Yelp primarily as a social network. We hoped to continue to uncover personas such as these as part of future research. We also believed that it would be possible to generalize our analysis process to other user datasets. In particular, we envisioned a system that could automatically run such an analysis on a dataset of users and reveal potential personas that user researchers could look into further.
Improvements to the Visualization
Throughout the process of creating the visualization, we were constantly hindered by the limited interactive capability of Tableau. As a result, we were not able to achieve the fluid interaction we hoped for, such as having the violin charts appear as popups on the parallel coordinates view as a form of detail-on-demand. To create more powerful interactions for our visualization, we may have to move to a new visualization platform, such as D3, to overcome many of Tableau's restrictions.
Applications for our Research
We believe that our research can help in multiple areas, such as finding fake Yelp profiles. Fake Yelp profiles are user accounts created only to inflate the good aspects of a business or to sabotage a business's reviews. Using our visualization, a researcher could isolate suspicious activity within a certain user group to identify fake Yelp accounts. One suspicion we have is that the Uncategorized might comprise fake accounts, because they are highly social but write very short reviews.
Conclusion
Although Yelp is a place to write and read reviews of businesses, it is also a social platform. Due to this social nature, personas have become apparent among Yelp users, and these personas can help define how users utilize Yelp to achieve their own goals. Our research on defining data-driven personas through metrics-based categorization has enabled us to bring attention to a new form of research that has the potential to help both Yelp users and business owners gain insight into how these personas comprise business reviews, and potentially to gauge the reliability of a business's overall star rating. We used multiple forms of information visualization to display our findings and to let the user explore the personas and review composition of the ten most reviewed businesses. We identified six major personas on Yelp: Average Joes, Balanced Active Social Involved Characters (BASIC), Comedians, Complainers, Goody Goodies, and Professional Critics. Each of these personas has defining characteristics that illustrate how different user groups use Yelp as a social platform. There is also the Uncategorized, a small user group that currently has no defining characteristics. We learned from our usability studies and success evaluation that certain personas elicit differing amounts of trust. One example is how Professional Critics comprise a small part of the Yelp community but are the largest contributors in terms of the number of reviews, which in turn attracts a large number of friends and useful votes. Even though our persona list may be incomplete, it is a strong starting point for developing representative user groups on Yelp. We believe that our research can be applied to discovering fake Yelp profiles that either exaggerate the good aspects of select businesses or sabotage a business's reviews. Future iterations will include categorizing the small Uncategorized group and visualizing the connections between different personas in the Yelp community. Most importantly, our final visualization brings to the table a story about the evolution of Yelp, its users, and the reviewed businesses.
Acknowledgements
We would like to thank Cecilia Aragon and Taylor Scott for teaching us the foundations of good design in information visualization. We would also like to thank all of the users who took time out of their day to give us feedback on improving our visualization. Beyond the feedback we got from usability studies, we made many design decisions for our dashboard based on our knowledge of foundational visualization principles and on considering how to display our data in meaningful ways, even when that meant defying some common visualization guidelines. These decisions came after countless prototypes: twenty versions in all, not counting intermediate versions such as 18.1 and 18.2.
References
Aragon, Cecilia. "How to Critique a Visualization." University of Washington, Sieg Hall, Seattle, WA. 7 October 2015. Lecture.
Bird, Steven, Edward Loper, and Ewan Klein. Natural Language Processing with Python. O'Reilly Media, 2009. Print.
Carter, Shan, Amanda Cox, Kevin Quealy, and Amy Schoenfeld. "How Different Groups Spend Their Day." The New York Times. The New York Times, 31 July 2009. Web. 16 Nov. 2015.
Chapman, C. N., E. Love, R. P. Milham, P. Elrif, and J. L. Alford. "Quantitative Evaluation of Personas as Information." Proceedings of the Human Factors and Ergonomics Society Annual Meeting 52.16 (2008): 1107-1111. Web.
Farley, Matt, and Alvin Engler. "Create Effective User Personas, Together." UserForge. N.p., n.d. Web.
Kosara, Robert, and Jock Mackinlay. "Storytelling: The Next Step for Visualization." IEEE Computing Now 46.5 (2013): 44-50. Web. 24 Nov. 2015.
Pennacchiotti, Marco, and Ana-Maria Popescu. "Democrats, Republicans and Starbucks Afficionados." Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11) (2011): 430-438. Web.
"Personapp." Personapp. Spook Studio, n.d. Web.
"Personas." Usability.gov. N.p., n.d. Web. 12 Dec. 2015.
Scott, Taylor. "Text Visualization." University of Washington, Sieg Hall, Seattle, WA. 25 November 2015. Lecture.
Stone, Maureen. "Expert Color Choices for Presenting Data." BeyeNETWORK. BeyeNETWORK, 17 Jan. 2006. Web. 1 Nov. 2015.
Thomas, J. J., and Kristin A. Cook. "Chapter 3: Visual Representations and Interaction Technologies." Illuminating the Path. Los Alamitos, CA: IEEE Computer Society, 2005. 87. Print.
Wagner, Claudia, Sitaram Asur, and Joshua Hailpern. "Religious Politicians and Creative Photographers: Automatic User Categorization in Twitter." 2013 International Conference on Social Computing (2013): 303-310. Web.