A Visualization by Naveen Sinha for CS171

Spaghetti sauce: an allegory

In an entertaining and insightful TED talk, author Malcolm Gladwell talks about spaghetti sauce. For many years, companies kept trying to make the "best" sauce until they hired a consultant, who had the epiphany to offer both thin and chunky styles. Sales soared and supermarket aisles filled with dozens of variations. On-line restaurant rankings are in a similar state: how does one choose between the established magazines (e.g. Boston Magazine, Globe, Herald, and Phoenix) or the newer, more participatory ranking systems (e.g. Yelp.com, Urbanspoon.com, Chow.com)? The goal of this visualization is not necessarily to help the user figure out where to eat, but instead to better understand the pros, cons, and preferences of various review sources. However, by doing this, he or she can hopefully sift through the ocean of opinions and find a reviewer to trust.

Why do the top rankings seem to have negative rankings? (the data in detail)

The lengths of the bars do not correspond to absolute rankings: none of the reviewers had a strong dislike of Clio or O Ya, for instance. Instead, they represent the difference between a particular review source (e.g. Boston Globe) and a composite score (i.e. a weighted average of the Boston Magazine, Boston Globe, the Phantom Gourmet, and other sources). The restaurants are ordered by composite score along the horizontal axis, with the highest-ranked restaurants at the right.

I was inspired by a recent Boston Magazine article about the Top 50 Restaurants. What set this article apart from others I had seen was its quantitative approach to combining data from a variety of sources. Even better, it had an unscientific but thoroughly entertaining summary of each reviewer. For instance, Devra First, the critic for the Boston Globe, loves true neighborhood haunts and jokes with long setups ("Catfish fingers are fish sticks gone to heaven…. But when the barbecue arrives, Houston, Kansas City, and Memphis, we have a problem." [From a 2008 review of Roadhouse]). Mat Schaffer, the chief critic for the Herald, always orders steak tartare, if possible, but hates when it is overly dressed.

I thought that I could present the data visually, rather than as a list, which would allow the viewer to differentiate between the review sources. Unfortunately, this was complicated by the fact that each reviewer used a different scheme (e.g. letter grades, stars, numbers), and the social review sources, like Yelp, all had different numbers of reviewers contributing to the final score (see article for the raw data). I wrote to Prof. Elaine Allen, the professor at Babson College who did the actual analysis for the article, for her advice. She was extremely helpful and I am grateful for her contribution of the normalized rankings for each review source. She determined a consecutive ranking for each restaurant and combined these to create a 0 to 100 score with a mean of 50 for the composite score. The composite score is a weighted average of the Boston Magazine, Globe, Herald, and Phoenix, along with Chow.com, Zagat, and the Phantom Gourmet. In return for going through all this effort, the Boston Magazine is given twice the weight of the other sources. Since the Chowhound website contains a plethora of reviewers, the top five reviewers (in terms of popularity and number of reviewers) ranked the restaurants on a scale of one to five stars and their scores were combined into a single number.
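
As a rough illustration of the weighting described above, here is a minimal Python sketch of a weighted composite. Only the double weight for the Boston Magazine comes from the article; the source keys, the sample scores, and the choice to skip sources that did not review a restaurant are assumptions made for this example, not Prof. Allen's actual procedure.

```python
# Hypothetical weights: Boston Magazine counts twice, every other source once.
WEIGHTS = {
    "boston_magazine": 2.0,
    "globe": 1.0,
    "herald": 1.0,
    "phoenix": 1.0,
    "chow": 1.0,
    "zagat": 1.0,
    "phantom_gourmet": 1.0,
}


def composite_score(scores, weights=WEIGHTS):
    """Weighted average of normalized (0-100) scores; None means not reviewed."""
    total = 0.0
    weight_sum = 0.0
    for source, score in scores.items():
        if score is None:
            continue  # assumption: unreviewed sources are simply left out
        total += weights[source] * score
        weight_sum += weights[source]
    if weight_sum == 0:
        return None
    return total / weight_sum


# Made-up scores for a single restaurant, for illustration only.
example = {"boston_magazine": 90.0, "globe": 60.0, "herald": None, "phoenix": 30.0}
print(composite_score(example))  # (2*90 + 60 + 30) / 4 = 67.5
```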

To get the data about price, cuisine, and location, I wrote a Python script to open each restaurant's page on Yelp.com and extract the information. This took about the same amount of time as opening all the pages manually, since there were numerous differences between the restaurant names in the data table and the names used on the Yelp site (e.g. Rendezvous vs. Rendezvous in Central Square, Helmand vs. The Helmand).
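
The name mismatches mentioned above could be reduced with some simple matching logic. The sketch below is not the script I used; it is one possible approach using Python's standard difflib module. The function names, the prefix heuristic, and the 0.6 cutoff are all assumptions chosen for illustration.

```python
import difflib


def normalize(name):
    """Lowercase, drop a leading 'the', and collapse whitespace."""
    words = name.lower().split()
    if words and words[0] == "the":
        words = words[1:]
    return " ".join(words)


def best_match(name, candidates, cutoff=0.6):
    """Return the candidate closest to `name`, or None if nothing is close."""
    target = normalize(name)
    # Heuristic: accept a candidate that merely adds a suffix (e.g. a location).
    for candidate in candidates:
        if normalize(candidate).startswith(target):
            return candidate
    # Otherwise fall back to fuzzy matching on the normalized names.
    by_norm = {normalize(c): c for c in candidates}
    close = difflib.get_close_matches(target, list(by_norm), n=1, cutoff=cutoff)
    return by_norm[close[0]] if close else None


yelp_names = ["Rendezvous in Central Square", "The Helmand", "Clio"]
print(best_match("Rendezvous", yelp_names))  # Rendezvous in Central Square
print(best_match("Helmand", yelp_names))     # The Helmand
```

A prefix rule like this can produce false matches when two restaurants share a first word, so the results would still need a manual check.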

Why did you design it this way?

As a scientist, my first inclination was to make a 2-D scatter plot, with the scores of a different review source on each axis. The idea of looking at correlation coefficients and differences from the median score came to mind. I made a working prototype and encoded information about the price and type of cuisine by the size and color of the data points, respectively.

However, after doing some beta-testing, I realized how non-intuitive my approach was: if scientists with PhDs in physics and chemistry could not easily decode my graph, there was little hope for a wide readership. I re-did the whole design with small multiples, which captured the essential aspect of the data: the differences between the review sources. Picking out the vertical deviation from a diagonal line on a scatter plot was difficult, but looking at the length of a bar above a horizontal line was easy. Moreover, by using small multiples, I could readily display the information from multiple sources. I initially had a white background, but after exploring various ColorBrewer scales and testing the software with my friends, I found that a black background was far more readable.

What do I do? (instructions)

The top 116 restaurants are shown along the horizontal axis, ordered by composite score, with the highest-ranked restaurants on the right. Each source corresponds to a different horizontal line, with the length of the bar corresponding to the difference from the composite score. For instance, if the particular review source ranked a restaurant as #2, but the composite rank was #5, the bar is shown as three units above the horizontal line. If a restaurant was not reviewed, no bar is drawn, but if the source's score matches the composite, a blue square is drawn. As the cursor is dragged across the plot area, a vertical yellow bar indicates the active restaurant and displays its composite rank and name. Clicking on any bar opens the website for the restaurant in a browser window.
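
The bar-length rule described above can be written out as a small Python sketch. The function name and the use of None for an unreviewed restaurant are assumptions for this example; the actual visualization was written in Processing.

```python
def bar_height(source_rank, composite_rank):
    """Signed bar length in rank units.

    Positive means the source ranked the restaurant higher (a smaller rank
    number) than the composite did. None means the source did not review it.
    """
    if source_rank is None:
        return None  # no bar is drawn
    return composite_rank - source_rank


print(bar_height(2, 5))     # 3: three units above the horizontal line
print(bar_height(5, 5))     # 0: drawn as a blue square
print(bar_height(None, 5))  # None: no bar
```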

As a default, higher-than-composite ranks are shown in green and lower-than-composite in red. However, by clicking each of the filters on the right, a different color scheme can be applied according to the price, cuisine, location, or Chowhound ranking. When a particular color scheme is used, the corresponding legend is outlined. By clicking a single entry within a legend (e.g. $$ for price), that entry is outlined and the corresponding bars for the restaurants are highlighted. If the cursor is dragged over a highlighted restaurant, the text turns blue to facilitate alignment. Clicking the name of a review source will highlight the top 25 restaurants according to that source.

What are the answers to your questions? What other interesting insights about your data did you gain from your visualization?

This visualization has provoked many great conversations among my friends about restaurant rankings. This tool is useful for finding the best-value restaurants (i.e. highest ranked in the lowest price bracket), the personal favorites of certain reviewers (i.e. lowest composite score among the source's top 25), and the worst rip-offs (i.e. lowest ranked in the highest price bracket). The Boston Magazine rankings seem to deviate the least from the composite scores, likely because the magazine is the one that calculated the scores. The Phoenix shows the largest deviation. This is interesting, since the Phoenix critic Robert Nadeau has a reputation for being unpredictable and keeping readers on their toes, according to the Boston Magazine article.
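
The comparisons above were made by eye, but they could also be computed. The following Python sketch shows one way to do so, assuming a mean absolute rank difference as the measure of deviation. The rows are made-up placeholders rather than the real data, and the field layout is hypothetical.

```python
from statistics import mean

# Made-up rows for illustration only:
# (name, composite_rank, price_bracket, {source: rank or None}).
restaurants = [
    ("A", 1, 4, {"magazine": 1, "phoenix": 3}),
    ("B", 2, 1, {"magazine": 2, "phoenix": 1}),
    ("C", 3, 4, {"magazine": 3, "phoenix": None}),
    ("D", 4, 1, {"magazine": 4, "phoenix": 2}),
]


def mean_abs_deviation(rows, source):
    """Average |composite rank - source rank| over restaurants the source reviewed."""
    diffs = [abs(comp - ranks[source])
             for _, comp, _, ranks in rows
             if ranks.get(source) is not None]
    return mean(diffs) if diffs else None


def best_value(rows):
    """Name of the best-ranked restaurant in the cheapest price bracket present."""
    cheapest = min(price for _, _, price, _ in rows)
    return min((r for r in rows if r[2] == cheapest), key=lambda r: r[1])[0]


def worst_rip_off(rows):
    """Name of the worst-ranked restaurant in the most expensive bracket present."""
    priciest = max(price for _, _, price, _ in rows)
    return max((r for r in rows if r[2] == priciest), key=lambda r: r[1])[0]


print(mean_abs_deviation(restaurants, "magazine"))
print(mean_abs_deviation(restaurants, "phoenix"))
print(best_value(restaurants), worst_rip_off(restaurants))
```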

What extensions and improvements can you suggest?

I was only able to display a subset of the available data with this visualization. To set a feasible scale for this assignment, I only looked at fine-dining restaurants. However, there are numerous other review sources (e.g. Phantom Gourmet, Yelp.com, UrbanSpoon) that review all types of dining establishments and could be incorporated into the visualization. Only a small number of the total cuisines and locations are shown due to space limitations, so a drop-down menu or similar device could be added. It might also be useful to be able to re-scale the x-axis to show the Top 20 restaurants, for instance, in more detail.

If I had more time, I would include more ways to interact and display the current data, too. For instance, I would allow multiple category selection to filter the restaurants. It would also be nice to display additional information about a restaurant without taking the user to the Yelp site.

Another direction for improvement would be better ways for the user to match his or her personal preferences. For instance, the user could enter his or her favorite restaurants and the visualization could highlight the review source with the best match. At the very least, keyboard input would allow the user to find a particular restaurant by typing in the name.

Several of my labmates recommended turning this into an iPhone application. It could be quite useful if the location data in the program could be displayed on a map and used with the device's internal GPS tracker.

I would also be interested to look at the content of the reviews for these restaurants, particularly those that deviated the most from the consensus view. This would be an even more effective way to reveal each reviewer's personal preferences. A word cloud or similar visualization could also portray more qualitative information about the restaurant.

What did you most enjoy about working on this project? What was the most challenging aspect? What was the most frustrating? What would you do differently next time?

I enjoyed the numerous discussions I've had as a result of showing this visualization to others. People were always excited to see the rankings of places where they had recently eaten or about which they had read a review.

Even after weeks of work on this problem, the most challenging aspect was finding the "best" way to represent the data. A scatter plot was the most obvious approach to me, but was far from intuitive to understand. The bar chart multiples are an improvement, but they introduce a subtle bias: the higher-ranked restaurants have more red bars, since there are more ways for a source's rank to be lower than higher (the reverse is true for the lower-ranked restaurants). I explored ways to overcome this bias, but they would just complicate the interpretation.
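
The bias can be made concrete by counting how far a source's rank can move in each direction. This Python sketch assumes that a source's ranks fall in the same 1 to 116 range as the composite ranks, which is a simplification for illustration.

```python
N = 116  # number of restaurants in the visualization


def room_to_move(composite_rank, n=N):
    """Return (places a source could rank it higher, places it could rank it lower)."""
    higher = composite_rank - 1
    lower = n - composite_rank
    return higher, lower


for rank in (1, 10, 58, 116):
    print(rank, room_to_move(rank))
# 1 (0, 115)
# 10 (9, 106)
# 58 (57, 58)
# 116 (115, 0)
```

The top-ranked restaurant can only be ranked lower by a source, so any disagreement shows up as a red bar.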

The most frustrating aspect was dealing with idiosyncrasies in the source data. Accent marks were one common source of trouble, since they were not properly recognized by Processing. The Yelp website made automatically scraping the data a bit difficult, since there was no one-to-one correspondence between restaurant names and URLs. Some would have a suffix or prefix and others would go by a different version of the name.
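
One possible workaround for the accent problem would be to strip the accent marks in a preprocessing step before the data reaches Processing. This is a sketch of that idea in Python using the standard unicodedata module, not something I did in this project; the function name is hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Caf\u00e9"))  # Cafe
```

This loses the correct spelling of the names, so it would suit internal matching better than display.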

Next time I would focus on following better object-oriented programming practices to make my program more adaptable and modular. The two-click selection mechanism with the legends, for instance, was used several times and could have been written once as a reusable component.