TEAM: Daniel Geisinger and Wendy Lee
DISCLAIMER: You should not drink alcoholic drinks if you are under the age of 21.

The inspiration for this project came from our personal experiences with poorly mixed, bad-tasting drinks. Strange mixtures are especially common among college students with limited resources. Through our visualization, we wanted to examine the compatibility of various drinks and offer a resource that people can use to figure out which alcoholic and non-alcoholic beverages go best together. We scraped our data from Drinknation.com, a site with over 9000 popular drink combinations. We also ran taste trials with our peers to determine which drinks tasted good together and which did not; we personally ran over 200 of these tests. Google Refine helped us correct some errors in our data, although we did not throw out any information completely. Processing was our main coding tool, along with some Java.
While looking at last year’s projects, we found the smoothie-mixing project interesting, and it was one that we would both actually use. However, its coding and design choices were not the best way to visualize how a smoothie is made: it gave caloric information, but the other statistics it showed were somewhat irrelevant. After browsing the internet for design ideas, we came across a Word Map that we thought could better express the mixing of alcoholic drinks and provide knowledge that every college student can use at some point.
The purpose of our design was to visualize the compatibility of the most popular beverages. We used size to indicate frequency of use, and because all of the shapes are fairly similar, the user can usually judge at a glance which mixer is used more frequently. The user can click on a drink to see its most common mixers, which keeps the design simple, user friendly, and useful. Because many of the boxes are too small to read, another key design feature is the pop-up label that appears when the mouse hovers over a smaller box. This lets the user easily tell which drink the mouse is currently over.
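As a rough illustration of the size encoding (not the project's actual Processing code), the sketch below scales each box's area with how often an ingredient appears. The function name `box_side`, the pixel limits, and the sample counts are all assumptions made up for this example.

```python
import math

def box_side(count, max_count, min_side=8.0, max_side=120.0):
    """Return the side length (in pixels) of a square whose area grows linearly with count."""
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    # Clamp the ratio to [0, 1] so out-of-range counts do not break the layout.
    ratio = min(max(count / max_count, 0.0), 1.0)
    # Interpolate on area rather than side length, since viewers compare areas.
    min_area, max_area = min_side ** 2, max_side ** 2
    area = min_area + ratio * (max_area - min_area)
    return math.sqrt(area)

# Hypothetical ingredient counts, for illustration only.
counts = {"vodka": 1200, "orange juice": 600, "milk": 40}
largest = max(counts.values())
sides = {name: round(box_side(c, largest), 1) for name, c in counts.items()}
print(sides)
```

Keeping a minimum side length means rare ingredients still get a box, which is where a hover label becomes necessary.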
Our data was taken from www.drinknation.com, an online database that contains over 9000 drink recipes, from the most popular drinks out there to some pretty random ones. We wanted to use this database because of the quantity of data available. To better understand drink compatibility, we needed a dataset large enough to ensure that the information wasn’t too biased. In addition, the drinks on the website were very diverse and ranged from shots to martinis. We acquired our data by scraping it off the website with Python. This turned out to be quite a challenging task because of the large number of ingredients for each drink, the lack of distinct tags, and the cluttered layout of the site. After a lot of experimenting with Python, we were finally able to scrape the data successfully. Using Google Refine, we corrected a few small errors, such as stray characters before quotation marks and a few inconsistencies in drink names. Overall, our scraping script captured the ingredients very well.
The data we got is interesting because, unlike a numerical dataset, its connections are less obvious and more difficult to explore. What we did know was that, given the ingredients for all these drinks, we could figure out which combinations of alcohol and mixers were the most popular. For instance, we found vodka and orange juice together in a lot of recipes, which may indicate something about their compatibility. Likewise, we did not expect to see milk and tequila together much. Since the connections were not obvious just from looking at the data set, we were very eager to get started on our visualization, which would be able to answer questions about compatibility.
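The idea of reading compatibility from how often two ingredients share a recipe can be sketched in a few lines of Python. This is a minimal illustration under assumed inputs, not the script we actually ran: the function name `count_pairs` and the sample recipes are hypothetical.

```python
from collections import Counter
from itertools import combinations

def count_pairs(recipes):
    """Count how many recipes contain each unordered pair of ingredients."""
    pair_counts = Counter()
    for ingredients in recipes:
        # Normalize and de-duplicate so a pair is counted once per recipe.
        unique = sorted({name.strip().lower() for name in ingredients if name.strip()})
        # Sorting first gives every pair a single canonical (a, b) ordering.
        pair_counts.update(combinations(unique, 2))
    return pair_counts

# Hypothetical recipes; each inner list may have a different length.
recipes = [
    ["Vodka", "Orange juice"],
    ["Vodka", "Orange juice", "Peach schnapps"],
    ["Tequila", "Lime juice", "Triple sec"],
    ["Vodka", "Cranberry juice"],
]
pairs = count_pairs(recipes)
print(pairs.most_common(1))
```

A pair that never occurs, such as milk and tequila in this sample, simply has a count of zero, since `Counter` returns 0 for missing keys.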
We implemented the visualization in Processing, though we also experimented with Flare. Using Processing let us take advantage of some of the examples provided on Ben Fry’s website. Even with these examples as references, there were several implementation challenges. One of the biggest was the shape of our data set: each recipe has a different number of ingredients, which left us with a different number of columns in each row. In addition, each ingredient of a recipe sits in its own cell in that drink’s row, so checking whether two ingredients are compatible meant checking every ingredient in every row. A more straightforward dataset would have been easier to work with, but we overcame this challenge by cleaning our data in Google Refine, finding a visualization method that made the data easier to work with, and writing many functions to iterate through the data. Other challenges involved the size of our dataset, which has over 9000 rows and many ingredients in each row. We ran out of memory a few times but were able to allocate more. In the end, following the Ben Fry example turned out to be more difficult than expected because of the structure of the WordItems. It was difficult to pass words along and hard to keep count across all words, but we eventually overcame these challenges by creating our own structures and functions and by significantly modifying the code.
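One way to avoid rescanning every row for each lookup is to read the uneven rows once and build an index from each ingredient to the drinks that use it. The sketch below is written in Python for brevity rather than Processing, and it is an assumption about how such a structure could look, not our original code. The row layout (drink name followed by ingredients), the function names, and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict

def load_recipes(csv_text):
    """Parse rows of the form: drink name, ingredient 1, ingredient 2, ... (any length)."""
    recipes = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        name, *ingredients = (cell.strip() for cell in row)
        # Drop empty cells left over from padding short rows.
        recipes[name] = {i for i in ingredients if i}
    return recipes

def build_index(recipes):
    """Map each ingredient to the set of drink names that use it."""
    index = defaultdict(set)
    for name, ingredients in recipes.items():
        for ingredient in ingredients:
            index[ingredient].add(name)
    return index

def common_mixers(ingredient, recipes, index):
    """Count the other ingredients in recipes that contain the given ingredient."""
    mixers = Counter()
    for name in index.get(ingredient, ()):
        mixers.update(recipes[name] - {ingredient})
    return mixers

# Hypothetical sample data; the quoted field keeps a comma inside one ingredient name.
SAMPLE = '''Drink A,Amaretto,Orange juice
Drink B,Amaretto,Orange juice,Vodka
Drink C,"Rum, melon",Pineapple juice
'''
recipes = load_recipes(SAMPLE)
index = build_index(recipes)
print(common_mixers("Amaretto", recipes, index).most_common())
```

With the index in place, clicking an ingredient only touches the recipes that contain it, and storing each recipe as a set sidesteps the uneven column counts.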
Below are a few screenshots from our visualization, which show what the main screen looks like and how the design changes depending on the ingredient selected. As mentioned above, the colors and sizes of the boxes strongly convey the message we wanted to get across. The app runs quickly despite having to iterate through all the cells in our data set. We chose to keep the application simple because of the question we were seeking to answer, which resulted in a simple, easy-to-use, visually appealing application. Because we changed our project topic a few times, we were not able to get feedback on our design and functionality.
Figure 1. Main display of all drink ingredients

Figure 2. Other ingredients in Amaretto recipes

Figure 3. Other ingredients in Rum, melon recipes
While the visualization does its job of showing the most mixable and most commonly mixed drinks, we were unable to implement a rating setting. Because of the huge data set, adding extra variables would have been extremely difficult. The only other problem we encountered was not being able to visualize beyond the mixing of two drinks: many mixtures on the site repeat, so drilling further would have required an effectively endless number of clicks (and a correspondingly large amount of coding). However, since the point of the project was to show which drinks a bored college student could or should mix without being very disappointed by the flavor, we feel that combinations of more than two drinks were not necessary. Since we mainly used Processing, we had to familiarize ourselves with its language. Also, after trying to use online source code from tools like Flare, we learned that some code cannot simply be grabbed off the internet and adapted. The visualization was much more complicated than we originally intended, so the initial creation took much longer than expected. In the future, we should get to know our data set better beforehand so that we can better gauge how much coding and work a project will require.