Working with several of the available open source visualization tools allowed me to see some of the possibilities visualization provides for mapping data and exposing connections. The data we worked with, units and battles of the Civil War, was relatively simple, but even with this small sample set and its uncomplicated relationships, the visualizations helped reveal connections faster than studying a table of raw data would.
However, one thing I realized even from this limited project is that the compilation and organization of that raw data is the biggest part of any visualization project. Both Palladio and RAW provide pretty user-friendly interfaces for uploading data, but that data has to be properly formatted for the program (and most visualization tools have different formatting requirements). Besides the formatting, which can become fairly obnoxious in and of itself (Gephi especially requires some pretty extensive work to get data properly input), a digital historian first has to compile the data itself, which, unless your professor is kind enough to provide it to you already organized and formatted, can take a significant amount of time and effort. Given the relative accessibility of Palladio and RAW, any visualization project will likely involve far more time spent compiling and organizing data than interfacing with the actual visualization tools.
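To give a sense of what that formatting work can look like, here is a minimal sketch in Python that reshapes a unit-by-battle table into an edge list, one row per unit and battle pair, which is the general shape network tools like Gephi tend to want. Everything specific here is an assumption for illustration: the sample rows are placeholders rather than our class data, the column names `unit` and `battles` and the semicolon separator are made up, and the `Source`/`Target` headers follow what I understand Gephi's spreadsheet import to look for.

```python
import csv
import io

# Hypothetical sample table: one unit per row, battles separated by semicolons.
# These rows are placeholders, not the actual class data set.
SAMPLE = """unit,battles
Unit A,Battle X;Battle Y
Unit B,Battle Y
Unit C,Battle X;Battle Y;Battle Z
"""


def to_edge_list(table_text, sep=";"):
    """Turn a unit/battles table into (unit, battle) pairs, one pair per edge."""
    edges = []
    for row in csv.DictReader(io.StringIO(table_text)):
        for battle in row["battles"].split(sep):
            battle = battle.strip()
            if battle:  # skip empty entries left by stray separators
                edges.append((row["unit"].strip(), battle))
    return edges


def edges_to_csv(edges):
    """Render the pairs as CSV text with Source and Target header columns."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["Source", "Target"])  # assumed header names for edge import
    writer.writerows(edges)
    return out.getvalue()


edges = to_edge_list(SAMPLE)
print(edges_to_csv(edges))
```

The printed text could be saved as a .csv file and tried in an edge-list import. The point is less the script itself than how much reshaping even a tiny table needs before a tool will accept it.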
Uploading the data into Palladio should be quick and easy, but since I’m a Windows user it wasn’t. Initially the data uploaded fine but wouldn’t display in the graph screen. I was able to get it to work by switching from Internet Explorer to Google Chrome.
I thought the Palladio visualization was the most intuitive in showing both the connections between multiple units that saw service in the same battle and the relative amount of combat seen by the various regiments. There aren’t as many options in Palladio as in RAW, at least not for the limited data set we’re working with, although the addition of latitude and longitude would allow us to map the locations of the battles these units fought in.
RAW allows more options for visualization, but once again, with this limited data set only a few are useful. The ones that worked the best were Alluvial Diagram, Circle Packing, Cluster Dendrogram, and Circular Dendrogram.
Of these, the Alluvial Diagram does the best job of illustrating overlapping battles, while the other three are only really useful in highlighting which units participated in more battles. Still, RAW offers more visualization options than Palladio.
Finally, there is Gephi. Gephi may be the exception to the rule I started this post with, that data compilation and organization is the most time-consuming part of visualization. With Gephi, figuring out Gephi is the longest part. I was actually never able to get it to fully work, despite having the data already set up and the very helpful tutorial provided by Elena Friot.