CS 171 - Final Project: Local Music Performances

Brendan Shea and Philipp Hanes

This visualization tries to answer questions about Boston area music performances, such as "For a given time period, where can I find folk performances for less than $20?"

The target audience is people who are interested in local music performances, and fans of music and live music performances in general.
We chose these motivating questions as they are of interest to both of us, in addition to being generally interesting to others.
Note that the data was scraped once (and took quite a while to download!) in early May 2009, so it is unlikely to stay practically useful for long. But you're free to scrape your own, using the tools provided.

Local Music Performances Visualization

Note: The visualization is embedded in this page, and requires Flash. If you do not have Flash installed please install it here.

Use

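The visualization has two main parts: the filters and the view (Graph or Map).

The filters available are time, price, neighborhood, and genre, plus an 'Include No Price' checkbox. Note that the list of genres selected also doubles as the color legend for genres. The colors selected were from ColorBrewer.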

By changing the filters, the data shown in the view will change to match. You can therefore 'drill down' into a subset of the data that is of interest.
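The drill-down behaviour can be sketched in a few lines of Python. This is only an illustration, not the application's ActionScript: the `Event` record, its field names, and the `apply_filters` function are hypothetical, and the time filter is assumed to be a simple start-time range.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Event:
    # Hypothetical record; the real MusicEvent bean may carry other fields.
    artist: str
    genre: str
    neighborhood: str
    start: datetime
    price: Optional[float]  # None when no price could be determined


def apply_filters(events, genres, neighborhoods, time_range, price_range,
                  include_no_price=False):
    """Return the events that pass every active filter, in input order."""
    earliest, latest = time_range
    cheapest, priciest = price_range
    kept = []
    for e in events:
        if e.genre not in genres or e.neighborhood not in neighborhoods:
            continue
        if not earliest <= e.start <= latest:
            continue
        if e.price is None:
            # Unpriced events are shown only when explicitly requested.
            if include_no_price:
                kept.append(e)
            continue
        if cheapest <= e.price <= priciest:
            kept.append(e)
    return kept
```

Narrowing any one argument (fewer genres, a tighter price range) can only shrink the result, which is what makes drilling down predictable.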

The Graph view shows the data in an abstract representation -- location is encoded by position (the events are grouped by neighborhood), price is encoded by size, and genre is encoded by color. Events where we couldn't determine the price (which you will only see if you check 'Include No Price') show up as small squares, in order to differentiate them.
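As a rough sketch of the price-to-size encoding, the Python function below maps a price to a marker shape and size. The function name, the size limits, the price cap, and the square-root scaling (so that circle area, rather than radius, grows with price) are all assumptions made for illustration; the actual rendering lives in the ActionScript sources and may scale differently.

```python
import math


def marker_for(price, min_radius=4.0, max_radius=20.0, max_price=100.0):
    """Return a (shape, size) pair for an event with the given price.

    All numeric defaults are illustrative, not taken from the application.
    """
    if price is None:
        # Events without a known price are drawn as small squares.
        return ("square", min_radius)
    clamped = max(0.0, min(price, max_price))
    # Square-root scaling: area grows linearly with price.
    radius = min_radius + (max_radius - min_radius) * math.sqrt(clamped / max_price)
    return ("circle", radius)
```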

The Map view shows the same data, but now displayed on a map. Location is of course encoded by position, and genre is encoded by color.

Hovering over an event in either view will highlight its genre and neighborhood in the filter list. Clicking on an event shows detailed information about it, including the full price information, the venue, the time, and a link to the artist's URL, if available.

Data

There are two primary sources of data that we used: boston.com and allmusic.com.

There was a fair amount of cleanup that needed to be done when we were scraping the data. For instance, we would end up with some genres that were functionally similar or identical, despite being different strings, such as "Electronica" and "Techno & Dance". (I suppose the Electronica connoisseur may object to being lumped in with Techno, but for our purposes they were the same.) So we did some cleanup and merging of the genres to get a reasonable, consolidated list.
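A minimal Python sketch of this kind of genre merging follows. The alias table is illustrative only: the merged label "Electronic" and the `canonical_genre` helper are hypothetical, and the real cleanup was done by the Perl scripts listed under Code.

```python
# Hypothetical alias table: maps scraped genre strings to one consolidated label.
GENRE_ALIASES = {
    "Electronica": "Electronic",
    "Techno & Dance": "Electronic",
}


def canonical_genre(raw):
    """Collapse stray whitespace, then map known aliases to a single label."""
    cleaned = " ".join(raw.split())
    return GENRE_ALIASES.get(cleaned, cleaned)
```

Genres that are not in the table pass through unchanged, so the table only needs to list the strings that should be merged.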

Price was another area where cleanup of the data needed to happen. Often, the price from boston.com would be a free-form string rather than a single number. In these cases, we parsed the string and used the first price that we found for filtering purposes (for the examples we looked at, that gave $15, $27.50, and $10, respectively). Even after this, some prices were unavailable. This is the reason we have the 'Include No Price' checkbox.
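The "first price wins" rule can be sketched in Python as below. This is an approximation of the idea, not the Perl actually used for scraping: the regular expression, the `first_price` name, and the sample strings in the usage note are all made up for illustration.

```python
import re
from typing import Optional

# A dollar sign, optional spaces, then digits with an optional cents part.
_PRICE = re.compile(r"\$\s*(\d+(?:\.\d{1,2})?)")


def first_price(text: Optional[str]) -> Optional[float]:
    """Return the first dollar amount found in text, or None if there is none."""
    match = _PRICE.search(text or "")
    return float(match.group(1)) if match else None
```

With invented inputs, `first_price("$15 advance, $18 at the door")` gives `15.0`, while a string with no dollar amount, such as `"Free"`, gives `None` and would only appear when 'Include No Price' is checked.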

The data file (a tab separated file) that we use in the application is: cleangenres.tsv
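For anyone who wants to inspect the file outside the application, a tab separated file can be read with Python's standard `csv` module. The sketch below assumes the file starts with a header row; the column names in the sample are hypothetical and are not taken from cleangenres.tsv.

```python
import csv
import io


def load_events(stream):
    """Read tab separated rows from a text stream into a list of dicts.

    Assumes the first row is a header naming the columns.
    """
    reader = csv.DictReader(stream, delimiter="\t")
    return [dict(row) for row in reader]


# Hypothetical sample standing in for the real file's contents.
sample = io.StringIO("artist\tgenre\tprice\nSome Band\tFolk\t$15\n")
rows = load_events(sample)
```

To read the real file, pass an open file instead of the sample, for example `open("cleangenres.tsv", newline="")`.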

Code

Application source files:

circles.as -- ActionScript responsible for rendering of the Graph view.
controls.as -- ActionScript responsible for handling events on the filter controls.
data.as -- ActionScript responsible for loading and parsing data.
DualDragSlider.mxml -- The dual slider component (for the time and price filters). Adapted from code found here.
GradientBox.mxml -- Used as part of the dual slider component. Adapted from code found here.
map.as -- ActionScript responsible for handling mapping.
MEVComponent.as -- Component to render the music events in the Graph view.
MultiCheckBoxWindow.mxml -- The multi-check box component for selecting neighborhoods and genres. Adapted from code found here.
MusicEvent.as -- Bean to represent a MusicEvent.
Venue.as -- Bean to represent a Venue.
Visualization.mxml -- The main application component.

Data scraping source files:

allmusic.com.pl
boscom.pl
boston.com.pl
cleanup.pl
lib.pl

Also, the source is freely available on github.

Insights

It is interesting to note the mix of genres in each neighborhood. Much of this is due to the venues that are present in those neighborhoods, of course; neighborhoods with fewer venues tend to be dominated by a single genre. For instance, Central Square in Cambridge is mainly Alternative performances, with a smattering of Rock. These are relatively inexpensive, as they are mostly at the Middle East. Harvard Square, on the other hand, has a richer mix of (slightly higher priced) genres, including Jazz, Pop/Rock, and a few Classical and Alternative performances thrown in there as well.

The filters also end up being a reasonably powerful way to find music events and to answer questions such as the initial one posed, "For a given time period, where can I find folk performances for less than $20?" (It turns out that the answer is at Club Passim in Harvard Square.)

Future Improvements

Beyond some incremental cleanup to the data quality and data scraping, we can envision adding the following features:

Final Thoughts

Overall this was an enjoyable project to work on -- in addition to creating a cool visualization, we got to play with a new (to us) technology in Adobe Flex. We also used Git as the source control tool, which was a great learning experience. There were some challenges getting reasonable data to visualize, but we were able to clean it to the point where we can draw some conclusions and have a useful tool.