Homework 3: Multidimensional Data
Due date: Wed, March 18th at 5pm EST
Jason Gao
Part 1: Exploration with Tableau (20 points)
A. What is the relationship between operating system and number of years of programming experience?
There does not seem to be a relationship between operating system and number of years of programming experience. In the visualization below, there is no apparent trend as programming experience increases (as we look left-to-right).
Percentages make more sense here than overall counts because we want to see whether operating system choice becomes disproportionate at different experience levels. The saturation (darkness) of the color indicates how many responses fall into each mark, which is a measure of the "trustworthiness" of that calculated value: a mark with only one response is not very trustworthy and could deviate widely from the true population value.
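The percentage-versus-count reasoning can be sketched outside Tableau as follows. This is only an illustration: the sample rows, the operating system labels, and the function name `os_share_by_experience` are hypothetical and not taken from the survey data.

```python
from collections import Counter, defaultdict

# Hypothetical survey rows: (experience level, operating system).
responses = [
    ("Under 6 months", "Windows"),
    ("Under 6 months", "Mac OS X"),
    ("1-3 years", "Windows"),
    ("1-3 years", "Windows"),
    ("1-3 years", "Linux"),
    ("Over 3 years", "Mac OS X"),
]


def os_share_by_experience(rows):
    """Return {experience: {os: (percent, count)}}.

    Percent is computed within each experience level, so levels with
    different numbers of respondents can be compared directly. The raw
    count is kept alongside it as a rough indicator of how much to
    trust each percentage (the role color saturation plays in the chart).
    """
    counts = defaultdict(Counter)
    for experience, os_name in rows:
        counts[experience][os_name] += 1

    shares = {}
    for experience, os_counts in counts.items():
        total = sum(os_counts.values())
        shares[experience] = {
            os_name: (100.0 * n / total, n) for os_name, n in os_counts.items()
        }
    return shares


shares = os_share_by_experience(responses)
```

In this made-up sample, the single "Over 3 years" response yields a 100% share backed by a count of one, which is exactly the kind of mark that should be drawn faintly.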
B. What concentrations do students who are more comfortable with programming come from?
The students who are more comfortable with programming come from concentrations like Visualization, Software Engineering, and Computer Science, which is hardly surprising.
As in Part A, the saturation (darkness) of the color indicates how many responses fall into each mark, and therefore how trustworthy that calculated average is.
C. What programming languages do experienced programmers use?
Many programmers with over 3 years of experience use Java and, to a lesser extent, C, PHP, and Ruby. C dominates and is increasingly used during the first 3 years of experience, but Java quickly takes over somewhere in the 1-3 year range.
D. What programming languages do experienced programmers who code the most often use?
"Experienced programmers who code the most often" was interpreted to mean programmers who have been programming for over 3 years and code daily. The visualization below shows that they predominantly use Java.
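The interpretation above amounts to a two-condition filter followed by a count per language. A minimal sketch, assuming hypothetical field names (`experience`, `frequency`, `language`) and made-up rows:

```python
from collections import Counter

# Hypothetical rows; field names and values are illustrative only.
responses = [
    {"experience": "Over 3 years", "frequency": "Daily", "language": "Java"},
    {"experience": "Over 3 years", "frequency": "Weekly", "language": "C"},
    {"experience": "1-3 years", "frequency": "Daily", "language": "PHP"},
    {"experience": "Over 3 years", "frequency": "Daily", "language": "Java"},
    {"experience": "Over 3 years", "frequency": "Daily", "language": "Ruby"},
]


def languages_of_daily_veterans(rows):
    """Count languages among respondents with over 3 years who code daily."""
    selected = [
        r["language"]
        for r in rows
        if r["experience"] == "Over 3 years" and r["frequency"] == "Daily"
    ]
    return Counter(selected)


language_counts = languages_of_daily_veterans(responses)
```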
E. What is the relationship between age, programming language, and programming experience?
Most distinctively, C and PHP are mostly used by younger programmers (18 to 24), while Java and HTML / CSS are mostly used by older programmers (24 to 44). Also, C becomes more popular as experience increases, but then drops sharply compared to PHP for programmers with over 3 years of experience, while Java usage increases consistently with experience.
Another interesting result is that the diversity of programming languages used increases considerably as experience increases, and seems to be independent of age. Programmers with less than 6 months of experience use only a few languages, while the "Over 3 years" charts show a much wider range of languages used.
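One way to put a number on "diversity" is to count the distinct languages reported at each experience level. The sketch below assumes hypothetical rows and a hypothetical helper name `language_diversity`. It does not reproduce the survey results.

```python
from collections import defaultdict

# Hypothetical (experience level, primary language) rows.
rows = [
    ("Under 6 months", "C"),
    ("Under 6 months", "C"),
    ("Under 6 months", "PHP"),
    ("Over 3 years", "Java"),
    ("Over 3 years", "Ruby"),
    ("Over 3 years", "C"),
    ("Over 3 years", "Java"),
]


def language_diversity(rows):
    """Return the number of distinct languages seen at each experience level."""
    seen = defaultdict(set)
    for experience, language in rows:
        seen[experience].add(language)
    return {experience: len(languages) for experience, languages in seen.items()}


diversity = language_diversity(rows)
```

Repeating the same count within each age group would be a simple check on whether the effect really is independent of age.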
Part 2: Designing an Interaction (40 points)
This interaction will explore the relationship between programming experience, programming comfort level, and primary programming language.
Encodings
- Programming experience (ordinal) -> Spatial position on Y-axis
This allows users to see if there are trends going from top to bottom along the Y-axis.
- Programming comfort level (ordinal) -> Size of mark
Since there are only five comfort levels, distinguishing between the sizes of mark will be easy.
- Programming language (nominal) -> Color of mark
Because there are several programming languages, and there is no order to the languages, it makes sense to use color, which lets users differentiate between marks, but does not imply an ordering.
In this design, every data point is visible and contributes to the running count along the x-axis. The marks themselves are bars, and size is varied by their height. I investigated the visualization with circles, but this caused spacing issues, which meant that the x-axis could no longer be used as a running count indicator.
The color scheme was chosen so that the colors are easy to distinguish from one another, without any one color being too vibrant or saturated, to reduce the effect of color pop. This follows the color rules and perceptual principles.
Everything is flat (no 3D) in order to keep the lie factor down and to prevent distortions of data. The data-ink ratio was maximized by excluding unnecessary elements such as borders, overly detailed axes, and other decorations. Scales are presented, are linearly spaced, and start at the lowest possible values to avoid distortion.
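The encodings and the running-count layout described above can be sketched as a small layout function. All pixel constants, the hex colors, and the function name `layout` are hypothetical, and only the experience levels named in this write-up are listed; this is not the actual implementation.

```python
# Experience levels named in the text; the real survey may have more.
EXPERIENCE_ORDER = ["Under 6 months", "1-3 years", "Over 3 years"]

# Hypothetical palette: one hue per language, no implied ordering.
LANGUAGE_COLORS = {"Java": "#1f77b4", "C": "#ff7f0e", "PHP": "#2ca02c"}

BAR_WIDTH = 10    # pixels per bar (assumed)
ROW_HEIGHT = 60   # vertical distance between experience rows (assumed)
UNIT_HEIGHT = 10  # pixels of bar height per comfort level (assumed)


def layout(responses):
    """Turn (experience, comfort 1-5, language) tuples into bar marks.

    Every bar has the same width, so a bar's x offset divided by
    BAR_WIDTH, plus one, is the running count within its row.
    """
    next_slot = {level: 0 for level in EXPERIENCE_ORDER}
    marks = []
    for experience, comfort, language in responses:
        slot = next_slot[experience]
        next_slot[experience] += 1
        marks.append({
            "x": slot * BAR_WIDTH,                                  # running count
            "y": EXPERIENCE_ORDER.index(experience) * ROW_HEIGHT,   # experience
            "height": comfort * UNIT_HEIGHT,                        # comfort
            "color": LANGUAGE_COLORS[language],                     # language
        })
    return marks


sample = [
    ("Over 3 years", 5, "Java"),
    ("Under 6 months", 1, "C"),
    ("Over 3 years", 3, "PHP"),
]
marks = layout(sample)
```

Fixed-width bars are what make the x-axis readable as a count; circles sized by comfort would have varying diameters, so equal x offsets would no longer correspond to equal numbers of responses.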
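For reference, the two Tufte measures referred to above are usually defined as follows (standard definitions, not specific to this assignment):

```latex
\[
\text{lie factor} = \frac{\text{size of effect shown in graphic}}{\text{size of effect in data}},
\qquad
\text{data-ink ratio} = \frac{\text{data-ink}}{\text{total ink used to print the graphic}}
\]
```

A lie factor close to 1 means the graphic represents the data faithfully. With flat bars whose height is proportional to comfort level, doubling the comfort value doubles the bar height, which keeps the factor at 1.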
The background is simple and white for the data-bearing elements, in order to maximize readability and contrast.
The interaction design process involved laying out the interface elements in Fireworks and overlaying example data points. Multiple graphics were generated to indicate states with linking / roll-overs and rest states, which were then exported to PDF. You can view the various interaction "slides" as frames within the Fireworks PNG.
Part 3: Interact With Your Classmates (40 points)
Open visualization
The interaction shows some interesting results about the data:
There is a strong relationship between programming experience and programming comfort: as experience increases (moving down the Y-axis), the bars generally become taller (greater comfort).
Also, we can see that Java, Ruby, C#, and C++ are used almost exclusively by experienced and more comfortable programmers (rolling over the Java entry in the legend shows this clearly), while SQL and BASIC are used exclusively by less experienced programmers. PHP, HTML / CSS, and C were used by programmers of all experience and comfort levels, although for C there is an interesting rise from "under 6 months" to "1-3 years," and then a sharp drop in usage at "over 3 years."
I was able to implement the additional feature of dynamically sorting the responses by comfort level, or grouping together responses that use the same programming language. The two modes are selected with two buttons in the lower-right corner.
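The two orderings can be sketched as two sort keys over the marks in a row. This is a minimal illustration with made-up data and hypothetical function names, not the code behind the visualization.

```python
# Hypothetical responses within one experience row: (language, comfort 1-5).
row = [("Java", 4), ("C", 2), ("PHP", 3), ("Java", 5), ("C", 4)]


def sort_by_comfort(marks):
    """Order marks from least to most comfortable."""
    return sorted(marks, key=lambda mark: mark[1])


def group_by_language(marks):
    """Place marks with the same language next to each other.

    sorted() is stable, so marks sharing a language keep their
    original relative order.
    """
    return sorted(marks, key=lambda mark: mark[0])


by_comfort = sort_by_comfort(row)
by_language = group_by_language(row)
```

Each button would simply re-run the layout with one of these orderings, so the running count along the x-axis stays valid in both modes.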
The final colors are the same as those used in Part 1, Question E (Tableau), since that color set was both pleasing and easy to distinguish. ColorBrewer yielded a color set with a "yellow" that was very hard to see.





