Built with Processing
I wanted my visualization to answer three main questions: What are the major clusters around which theorems in mathematics organize themselves? Can we suggest which theorems to study next, based on a set of known theorems? How closely are theorems and concepts linked?
To answer these questions, I wanted to create a user-driven visualization of a large number of theorems across different categories of theoretical math. After doing some research, I found that the general consensus is to classify math into four categories: Algebra, Analysis, Geometry and Foundations of Mathematics. Smaller disciplines within these include number theory, topology, group theory, etc. I decided to scrape theorem data from Wolfram MathWorld, which contains a large database of theorems in which each concept mentioned in a theorem is linked to an article about that concept.
The main principle on which my visualization hinges is that each theorem shares several concepts with other theorems. This lends itself quite naturally to a graph with the theorems as nodes and the shared concepts as edges, so I decided to make this the basis of my visualization. The classification into four categories should be encoded as well. I decided to give each node a color corresponding to its category of mathematics and a size encoding its importance, and to give each edge a thickness encoding its strength.
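As a rough illustration of the graph described above, here is a minimal Python sketch (the applet itself is written in Processing, and this is not its code). It assumes that an edge's strength is simply the number of concepts two theorems share, which is my guess at a reasonable definition rather than something stated above. The function names and the sample concept sets are made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(theorem_concepts):
    """Return {(theorem_a, theorem_b): shared_concepts} for each linked pair."""
    # Invert the mapping: concept -> set of theorems that mention it.
    by_concept = defaultdict(set)
    for theorem, concepts in theorem_concepts.items():
        for concept in concepts:
            by_concept[concept].add(theorem)

    # Every pair of theorems under the same concept gets an edge.
    edges = defaultdict(set)
    for concept, theorems in by_concept.items():
        for a, b in combinations(sorted(theorems), 2):
            edges[(a, b)].add(concept)
    return dict(edges)


def edge_strength(edges):
    """Assumed strength measure: the number of shared concepts."""
    return {pair: len(shared) for pair, shared in edges.items()}


# Illustrative sample only; these concept sets were not scraped from anywhere.
sample = {
    "Lagrange's Theorem": {"group", "subgroup", "order"},
    "Cauchy's Theorem": {"group", "order", "prime"},
    "Fermat's Little Theorem": {"prime", "congruence"},
}
edges = build_graph(sample)
strength = edge_strength(edges)
```

In this toy data, the two group-theory theorems share two concepts and so would get a thicker edge than the pair linked only through "prime".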
To answer the questions, I decided to let the user interact with the data in several modes. The graph itself should answer the first question. The second question is answered through a concept checklist mode. Since concepts are just abstractly defined ideas, a typical user is probably already familiar with a number of them. I provide a list from which the user can select the concepts they are familiar with; the graph then filters out the unfamiliar theorems and displays only those that should be completely known (theorems that involve only known concepts) and those that are missing just one concept to become known (excluding theorems that involve only one concept and no others).
For the last question, the user can narrow the graph down to specific theorems and concepts with a theorem filter mode and a concept filter mode, or with a search mode if they have a specific theorem or concept in mind. To encourage curiosity and exploration, the theorem and concept filter modes include an extra selection box in the lower right-hand corner that lists all the theorems and concepts shown on the graph whenever a filter is applied. To learn more about any of these related theorems or concepts, the user can simply click on it in this box, and the visualization will switch to the proper activation and filtering modes.
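The rule behind the concept checklist mode can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the applet's Processing code. The function name, the sample data and the "Toy Theorem" entry are hypothetical.

```python
def checklist_filter(theorem_concepts, known):
    """Split theorems into fully known ones and ones missing exactly one concept."""
    known = set(known)
    fully_known = []
    one_away = {}  # theorem -> the single concept still missing
    for theorem, concepts in theorem_concepts.items():
        missing = set(concepts) - known
        if not missing:
            fully_known.append(theorem)
        elif len(missing) == 1 and len(concepts) > 1:
            # Theorems that involve only one concept in total are excluded.
            one_away[theorem] = next(iter(missing))
    return sorted(fully_known), one_away


# Illustrative sample only; these concept sets are made up.
sample = {
    "Lagrange's Theorem": {"group", "subgroup", "order"},
    "Cauchy's Theorem": {"group", "order", "prime"},
    "Fermat's Little Theorem": {"prime", "congruence"},
    "Toy Theorem": {"manifold"},
}
fully_known, one_away = checklist_filter(sample, {"group", "order", "prime"})
```

The `one_away` mapping doubles as a study suggestion: it names the one concept to learn in order to unlock each nearly known theorem.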
Data scraping was a big part of the workload in this project. Wolfram MathWorld classifies its theorems into the four categories described above. The articles are hyperlinked with concepts, so I wrote a scraper that crawls articles from a browsing page (accessible from the website menu at left) and keeps those whose titles end in "theorem", "conjecture", "lemma", etc., which denote theorems. For this project, I ended up writing several separate scrapers: one for the names of theorems and their associated concepts, one for the classification of theorems, and one for the inline images and HTML statements of theorems. This last scraper was not used in the final version of the project, but it could be helpful for including theorem statements in something other than Processing. The more useful of these scripts are available for download at the bottom of this webpage.
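The two core steps of the scraping, picking out theorem-like titles and collecting the hyperlinked concepts from an article, might look roughly like the Python sketch below. It uses only the standard library and runs on a made-up HTML snippet. The real page structure and link paths are not reproduced here, so treat the markup and the hrefs as assumptions. Only the three suffixes named above are included, since the full list is not given.

```python
from html.parser import HTMLParser

# Only the suffixes named in the text; the original list was longer ("etc.").
THEOREM_SUFFIXES = ("theorem", "conjecture", "lemma")


def is_theorem_title(title):
    """True if an article title ends in one of the theorem-like suffixes."""
    return title.strip().lower().endswith(THEOREM_SUFFIXES)


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical article body; the hrefs are placeholders.
sample_html = """
<p>Every <a href="/Subgroup.html">subgroup</a> of a finite
<a href="/Group.html">group</a> has order dividing the group order.</p>
"""
collector = LinkCollector()
collector.feed(sample_html)
concepts = [text for _, text in collector.links]
```

Here `concepts` ends up holding the link texts in document order, which is the raw material for the theorem-to-concept mapping.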
Because of the number of nodes, encoding information in node position was impractical: the nodes would overlap. I tried several versions of spring-based physics simulations before settling on the one in the visualization. The final project runs on the assumption that edges behave as springs and that nodes have a weak gravity-like repulsion based on their size. However, the filtering modes can lead to dynamic movement that is somewhat overwhelming. One potential improvement for the project would be to look into reducing this.
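For readers unfamiliar with this kind of layout, here is a minimal Python sketch of one simulation step with spring edges and size-based repulsion. It is not the applet's code. The force laws (Hooke's law for edges, an inverse-square repulsion scaled by the product of node sizes), the constants and the direct position update without velocities are all simplifying assumptions of mine.

```python
import math


def layout_step(positions, sizes, edges, rest_length=1.0,
                k_spring=0.1, k_repel=0.01):
    """Advance a 2D layout by one step; returns new {node: (x, y)}."""
    forces = {node: [0.0, 0.0] for node in positions}

    def direction(a, b):
        # Unit vector from a to b, plus the distance between them.
        dx = positions[b][0] - positions[a][0]
        dy = positions[b][1] - positions[a][1]
        dist = math.hypot(dx, dy) or 1e-9  # guard against division by zero
        return dx / dist, dy / dist, dist

    # Edges act as springs: pull together when stretched, push when compressed.
    for a, b in edges:
        ux, uy, dist = direction(a, b)
        pull = k_spring * (dist - rest_length)
        forces[a][0] += pull * ux
        forces[a][1] += pull * uy
        forces[b][0] -= pull * ux
        forces[b][1] -= pull * uy

    # Every pair of nodes repels weakly, scaled by the node sizes.
    nodes = list(positions)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            ux, uy, dist = direction(a, b)
            push = k_repel * sizes[a] * sizes[b] / dist ** 2
            forces[a][0] -= push * ux
            forces[a][1] -= push * uy
            forces[b][0] += push * ux
            forces[b][1] += push * uy

    # Simplified update: move each node directly along its net force.
    return {n: (positions[n][0] + forces[n][0], positions[n][1] + forces[n][1])
            for n in positions}


start = {"a": (0.0, 0.0), "b": (3.0, 0.0)}
sizes = {"a": 1.0, "b": 1.0}
linked = layout_step(start, sizes, edges=[("a", "b")])
unlinked = layout_step(start, sizes, edges=[])
```

With an edge between them, the two nodes drift toward the spring's rest length. Without one, only the repulsion acts and they move slightly apart. Adding damping to a scheme like this is one common way to calm down movement when many nodes change at once.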
Feel free to interact with the applet above. The graph nodes can be clicked and dragged to fix them in place. The movement of the nodes is intended to be dynamic, but in the filtering modes, it can be a bit overwhelming. Follow the details described above and play around with the different modes. Thanks for viewing my project!
Source code: interaction_v3_2, Edge, ForcedNode, Node, Scrollbar, SpringEdge, Vector3D
Scrapers: Theorem name scraper, Related concepts for each theorem scraper, Theorem category scraper, Theorem statement and inline graphic scraper
Please note that all scrapers run on a time delay, as MathWorld locks out IP addresses that rapidly open a large number of pages. Also, the data used to create the applet were formatted versions of the raw data that these scrapers provide.
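A time delay of this kind can be sketched in Python as a small throttle that enforces a minimum interval between page requests. This is an illustration only, not the code in the scrapers listed above. The class and function names are hypothetical, the `fetch` callable stands in for whatever actually downloads a page, and the interval to use is not stated here, so the value in the usage comment is a placeholder.

```python
import time


class Throttle:
    """Block until at least min_interval seconds have passed since the last call."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the delay can be tested without waiting
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def crawl(urls, fetch, throttle):
    """Fetch each URL in turn, pausing between requests."""
    pages = {}
    for url in urls:
        throttle.wait()
        pages[url] = fetch(url)
    return pages


# Usage sketch (placeholder interval, hypothetical fetch function):
# pages = crawl(url_list, fetch=my_download_function, throttle=Throttle(5.0))
```

The first request goes out immediately. Every later request waits only for whatever part of the interval has not already elapsed.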