Brendan Shea and Philipp Hanes
This visualization tries to answer questions about Boston area music performances, such
as:
For a given time period, where can I find folk performances for less than $20?
Are performances of certain music genres more common in some neighborhoods than others?
The target audience is people interested in local music performances, and fans of live music in general.
We chose these motivating questions as they are of interest to both of us, in addition to being generally interesting to others.
Note that the data was scraped once (and took quite a while to download!) in early May 2009, so it is unlikely to stay practically useful for long. But you're free to scrape your own, using the tools provided.
Local Music Performances Visualization
Note: The visualization is embedded in this page, and requires Flash. If you do not have Flash installed please install it here.
Use
The visualization has two main parts:
The view (on the top) shows either the Graph view of the data, or the Map view of the data, depending
on which is selected. (The default view is Graph.)
The filters (on the bottom) allow you to filter data by time, price, genre and neighborhood.
The filters available are:
Time - a dual slider to control the time range that you want to visualize
Price - a dual slider to control the price range that you want to visualize
Include No Price - includes events where we could not determine the price
Neighborhoods - click the title to select what neighborhoods you want to see
Genres - click the title to select what genres you want to see
Note that the list of genres selected also doubles as the color legend for genres. The colors selected were from ColorBrewer.
By changing the filters, the data shown in the view will change to match. You can therefore 'drill down' into
a subset of the data that is of interest.
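The way the filters combine can be sketched in a few lines. This is only an illustration in Python, not the application's ActionScript code: the MusicEvent fields and the filter_events function below are hypothetical names, and we assume every filter must pass for an event to be shown, with unpriced events controlled only by the 'Include No Price' checkbox.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass
class MusicEvent:
    # Hypothetical fields for illustration; the real MusicEvent.as bean may differ.
    artist: str
    start: datetime
    price: Optional[float]  # None when no price could be determined
    neighborhood: str
    genre: str


def filter_events(
    events: Iterable[MusicEvent],
    start: datetime,
    end: datetime,
    min_price: float,
    max_price: float,
    include_no_price: bool,
    neighborhoods: Set[str],
    genres: Set[str],
) -> List[MusicEvent]:
    """Return the events that pass every filter (assumed to be ANDed together)."""
    selected = []
    for ev in events:
        # Time filter: the dual slider gives an inclusive range.
        if not (start <= ev.start <= end):
            continue
        if ev.price is None:
            # Unpriced events bypass the price slider and obey only the checkbox.
            if not include_no_price:
                continue
        elif not (min_price <= ev.price <= max_price):
            continue
        # Neighborhood and genre filters: the event must be in the checked sets.
        if ev.neighborhood not in neighborhoods or ev.genre not in genres:
            continue
        selected.append(ev)
    return selected
```

With this sketch, the motivating question about folk performances under $20 becomes a call with max_price set to 20 and genres set to {"Folk"}.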
The Graph view shows the data in an abstract representation -- location is encoded by position (the events are
grouped by neighborhood), price is encoded by size, and genre is encoded by color. Events where we couldn't determine the price (which you will only see if you check 'Include No Price') show up as small squares, in order to differentiate them.
The Map view shows the same data, but now displayed on a map. Location is of course encoded by position, and genre
is encoded by color.
Hovering over an event in either view will highlight its genre and neighborhood in the filter list. Clicking on the event will show detailed information about it, including the full price information, the venue, the time, and a link to the artist's URL, if available.
Data
There are two primary sources of data that we used:
www.boston.com - this is where we got all of the music events from, plus a bunch of supporting data
www.allmusic.com - we used this to round out genre information, as well as get links
There was a fair amount of cleanup that needed to be done when we were scraping the data. For instance, we would end up with some genres that were functionally similar or identical despite being different strings, such as "Electronica" and "Techno & Dance". (I suppose the Electronica connoisseur may object to being lumped in with Techno, but for our purposes they were the same.) So we did some cleanup and merging of the genres to get a reasonable, consolidated list.
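A merge like this can be expressed as a small alias table. The sketch below is in Python and is not taken from our scraping tools: GENRE_ALIASES and normalize_genre are hypothetical names, and the choice of "Electronica" as the surviving label is an assumption made for the example.

```python
# Hypothetical alias table: maps a raw scraped genre string to the
# consolidated genre it is merged into. Which label survives a merge
# is an assumption made for this example.
GENRE_ALIASES = {
    "Techno & Dance": "Electronica",
}


def normalize_genre(raw: str) -> str:
    """Collapse stray whitespace, then apply the alias table."""
    cleaned = " ".join(raw.split())
    # Genres without an alias entry pass through unchanged.
    return GENRE_ALIASES.get(cleaned, cleaned)
```

Keeping the merges in one table makes it easy to review which genres were lumped together and to undo a merge later.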
Price was another area where cleanup of the data needed to happen. Often, the price from boston.com would be something like:
"$15/$12; half price for mothers!"
"$27.50 - $35.00"
"$10 / $5 on Guest List"
In these cases, we parsed the string and used the first price we found for filtering purposes (so for those examples, it would be $15, $27.50, and $10, respectively). Even after this, some prices were unavailable, which is why we have the 'Include No Price' checkbox.
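The "first price wins" rule can be sketched with a regular expression. This is an illustrative Python version rather than the code we actually ran; the first_price name and the exact pattern are assumptions, and strings with no dollar amount yield None, which corresponds to an event with no price.

```python
import re
from typing import Optional

# A dollar sign, optional whitespace, then digits with an optional
# one- or two-digit decimal part. The pattern is an assumption for
# this sketch, not the expression used in the original scraper.
_PRICE_RE = re.compile(r"\$\s*(\d+(?:\.\d{1,2})?)")


def first_price(text: str) -> Optional[float]:
    """Return the first dollar amount found in text, or None if there is none."""
    match = _PRICE_RE.search(text)
    return float(match.group(1)) if match else None


print(first_price("$15/$12; half price for mothers!"))  # 15.0
print(first_price("$27.50 - $35.00"))                   # 27.5
print(first_price("$10 / $5 on Guest List"))            # 10.0
```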
The data file (a tab separated file) that we use in the application is: cleangenres.tsv
Code
Application source files:
circles.as -- ActionScript responsible for rendering of the Graph view.
controls.as -- ActionScript responsible for handling events on the filter controls.
data.as -- ActionScript responsible for loading and parsing data.
DualDragSlider.mxml -- The dual slider component (for the time and price filters). Adapted from code found here.
GradientBox.mxml -- Used as part of the dual slider component. Adapted from code found here.
map.as -- ActionScript responsible for handling mapping.
MEVComponent.as -- Component to render the music events in the Graph view.
MultiCheckBoxWindow.mxml -- The multi-check box component for selecting neighborhoods and genres. Adapted from code found here.
MusicEvent.as -- Bean to represent a MusicEvent.
Venue.as -- Bean to represent a Venue.
Visualization.mxml -- The main application component.
It is interesting to note the mix of genres in each neighborhood. Much of this is due to the venues that are present in those neighborhoods, of course; neighborhoods with fewer venues tend to be dominated by a single genre. For instance, Central Square in Cambridge is mainly Alternative performances, with a smattering of Rock (and relatively inexpensive ones at that), as they are mostly at the Middle East. Harvard Square, on the other hand, has a richer mix of (slightly higher priced) genres, including Jazz, Pop/Rock, and a few Classical and Alternative performances as well.
The filters also end up being a reasonably powerful way to find music events and to answer questions like the one posed at the start: "For a given time period, where can I find folk performances for less than $20?" (It turns out that the answer is at Club Passim in Harvard Square.)
Future Improvements
Beyond some incremental cleanup to the data quality and data scraping, we can envision adding the following features:
Enabling directions in the map view (to answer the question, 'How do I get from my house to the event?')
Textual searching and filtering (by artist or venue, say)
Showing a data distribution 'density plot' along the x-axes of the time and price sliders, to get an idea of where the bulk of events lie, or whether they are evenly distributed
Final Thoughts
Overall this was an enjoyable project to work on -- in addition to creating a cool visualization, we got to play
with a new (to us) technology in Adobe Flex. We also used Git as the source control tool, which was a great learning experience.
There were some challenges getting reasonable data to visualize, but we were able to successfully clean it to the point where we can draw some conclusions, and have a useful tool.