MacKay Data Analysis

From Norsemathology

Small Multiples

Small multiples for the graphic that MacKay used in his 1901 report on phenology, featuring the summarized variables that MacKay chose:

  • Mayflower,
  • Strawberry,
  • Apple,
  • Lilac, and
  • Blackberry.

"Small multiples" is an idea due to Edward Tufte, who is well known for his work on the design of statistical graphics. I found this article entitled "Tufte in R", which might be worth investigating; ironically, its section on "Small Multiples" is "in preparation". Maybe we could do this in R. I do have an example of small multiples in R, which I've made available on mathstat (/var/www/html/mackay/R/SmallMultiples/generate_graphs.R). It also illustrates how to read an Excel spreadsheet in R, and includes some MySQL commands for selecting data. It's all a little overwhelming perhaps, but it's worth digging into.

Here is the Mathematica file that produces the following plots [This needs to be updated, Madison; also, the file for year 07 is called 1917m....]:

G1901.png G1902.png G1903.png G1904.png G1905.png G1906r.png G1907r.png G1908.png G1909.png G1911.png G1912.png G1913.png G14.png G15.png G16.png G17.png G18.png G19.png G20.png G21.png G22.png G23.png

Good work Madison and Steve: Animate.gif

  • I added "padding" for missing years, by repeating the preceding year.
  • To create the animation I used the ImageMagick command convert: convert -delay 200 -loop 0 *19*png animate.gif
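
The padding step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used: the function name pad_missing_years and the (year, source year) output format are assumptions made for the example, and the example years simply reflect that the plot list above has no file for 1910.

```python
def pad_missing_years(available, first, last):
    """For each year in first..last, choose the year whose plot is shown.

    A year with no plot of its own reuses the most recent earlier year
    that has one ("padding" by repeating the preceding year).
    Returns a list of (year, source_year) pairs.
    """
    have = set(available)
    frames = []
    previous = None
    for year in range(first, last + 1):
        if year in have:
            previous = year
        if previous is None:
            # No earlier plot exists yet, so there is nothing to repeat.
            continue
        frames.append((year, previous))
    return frames


if __name__ == "__main__":
    # Hypothetical run: 1901-1913 with 1910 missing, as in the plot list.
    years = [y for y in range(1901, 1914) if y != 1910]
    for year, source in pad_missing_years(years, 1901, 1913):
        note = "" if year == source else "  (padded)"
        print(f"{year}: use plot for {source}{note}")
```

Each padded pair tells you which existing image to copy under the missing year's name before running convert, so that every year occupies one frame of equal duration.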

Now to the important question: What do we learn?

Centroids

  • To get a (very) crude estimate of the centroid of each region, I cut each region out of cardboard and balanced it on a pen to find the balance point. I then used a computer program to find coordinates for each centroid, to compare with the point I had found. Eight of the regions coincide with county borders, so for those I was able to find coordinates for each county and enter them into a program that returns a geographic midpoint. Surprisingly, the results matched the cardboard estimates quite well. The estimates for regions six and seven are cruder still, because those regions split a county; finding their coordinates involved a lot of trial and error.
  • Here are the coordinates I ended up with (latitude, longitude, in degrees):
    • Region 1: 43.9, -65.8
    • Region 2: 44.2, -65
    • Region 3: 44.85, -64.9
    • Region 4: 45.25, -63.6
    • Region 5: 45.1, -62.45
    • Region 6: 45.5, -63.97
    • Region 7: 45.7, -62.75
    • Region 8: 45.85, -60.45
    • Region 9: 46.4, -60.6
    • Region 10: 46.2, -61.1
  • Here is the link to the website I used to find midpoints
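
For reference, one common way to compute a geographic midpoint is to convert each latitude/longitude pair to a point on the unit sphere, average those points, and convert the average back. The sketch below assumes equal weights and a spherical Earth; it is not necessarily the method the website uses, and the function name geographic_midpoint is made up for this example.

```python
import math


def geographic_midpoint(points):
    """Equal-weight geographic midpoint of (lat, lon) pairs in degrees.

    Each point is mapped to a unit vector on a sphere, the vectors are
    averaged, and the mean vector is mapped back to latitude/longitude.
    """
    if not points:
        raise ValueError("at least one point is required")

    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        lat = math.radians(lat_deg)
        lon = math.radians(lon_deg)
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)

    n = len(points)
    x, y, z = x / n, y / n, z / n

    # Convert the mean vector back to angles; it need not be unit length.
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)


if __name__ == "__main__":
    # The ten region centroids listed above, used here only to exercise
    # the function (this is not how the centroids themselves were found).
    regions = [
        (43.9, -65.8), (44.2, -65.0), (44.85, -64.9), (45.25, -63.6),
        (45.1, -62.45), (45.5, -63.97), (45.7, -62.75), (45.85, -60.45),
        (46.4, -60.6), (46.2, -61.1),
    ]
    lat, lon = geographic_midpoint(regions)
    print(f"midpoint: {lat:.2f}, {lon:.2f}")
```

Over an area as small as a single region, this gives nearly the same answer as averaging the latitudes and longitudes directly, which makes for a quick sanity check.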


Missing Data