The GTD provides over 100,000 observations of terrorist incidents between 1970 and 2011. About 2,400 of these occurred in the USA. While this is not a large number, the graph still provides some interesting and intuitive results.
## Load libraries
library(ggplot2)
library(plyr)
library(maps)
library(stringr)

## Load terrorism data
gtd.data <- read.csv("gtd.csv", stringsAsFactors = FALSE)

#### Begin USA heatmap plot

## Subset data to only include terrorist attacks in the USA
gtd.usa <- subset(gtd.data, country_txt == "United States")

## Clean provstate column
## (the unescaped parentheses in the first pattern form a regex group, so it
## strips "U.S. State"; the next two calls strip the leftover "(" and ")")
gtd.usa$provstate <- str_replace(gtd.usa$provstate, "(U.S. State)", "")
gtd.usa$provstate <- str_replace(gtd.usa$provstate, "[(]", "")
gtd.usa$provstate <- str_replace(gtd.usa$provstate, "[)]", "")

## Trim whitespace
gtd.usa$provstate <- str_trim(gtd.usa$provstate)

## Load US state population data
populations <- read.csv("states.csv")

## Create counts of terrorist activity in each state
counts <- count(gtd.usa, "provstate")

## Merge the populations dataset with the counts dataset
gtd.pop.merge <- merge(counts, populations, by.x = "provstate", by.y = "Name")

## Create normalized terrorism frequency by dividing frequency
## by the population of the state, then take log10
gtd.pop.merge <- mutate(gtd.pop.merge, normal = freq / CENSUS2010POP)
gtd.pop.merge$normal <- log10(gtd.pop.merge$normal)

## Lowercase the state names and rename the column to "region"
## so that it matches the key used by map_data("state")
gtd.pop.merge$provstate <- tolower(gtd.pop.merge$provstate)
names(gtd.pop.merge)[names(gtd.pop.merge) == "provstate"] <- "region"

## Load US state data
states <- map_data("state")

## Merge the map data with our previous dataset
merged <- merge(states, gtd.pop.merge, sort = FALSE, by = "region")

## merge() does not guarantee row order, so restore the polygon vertex order
merged <- merged[order(merged$order), ]

## Plot the heatmap
g <- ggplot(merged) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = normal))
g <- g + scale_fill_gradient(low = "lightgreen", high = "blue")
g <- g + theme_bw() +
  labs(fill = "Normalized Frequency of Terrorism") +
  theme(legend.position = "bottom")
g <- g + xlab(NULL) + ylab(NULL)
g <- g + theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
g <- g + theme(axis.text.x = element_blank(), axis.text.y = element_blank())
g <- g + ggtitle("Normalized Frequency of Terrorism in the USA")
g <- g + scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL)
g
To obtain meaningful results, rather than simply plotting the number of terrorist incidents per state, I divided each state’s count by its 2010 population. I know that this is not entirely correct, since state populations have shifted relative to one another between 1970 and 2011, but it was fine for my purposes. The resulting frequencies were clustered, so I took a log10 transform to spread the values out more smoothly.
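To make the normalization concrete, here is a minimal sketch in base R. The data frame `toy` and every number in it are invented purely for illustration (they are not GTD or census figures), and the `per_million` column is an extra I am assuming might help with reading the values; it is not part of the plot above.

```r
# Toy data: all values are made up for illustration only
toy <- data.frame(
  region = c("a", "b", "c", "d"),
  freq   = c(500, 40, 5, 2),                    # hypothetical incident counts
  pop    = c(37000000, 6000000, 1000000, 600000) # hypothetical populations
)

# Per-capita rate: incidents divided by population
toy$rate <- toy$freq / toy$pop

# log10 of the rate, which is the quantity used as the fill colour.
# Rates are far below 1, so these values are negative.
toy$normal <- log10(toy$rate)

# A more readable companion figure: incidents per million residents
toy$per_million <- toy$rate * 1e6

# Show the rows from lowest to highest normalized frequency
print(toy[order(toy$normal), ])
```

Since the rates span several orders of magnitude, the log10 scale keeps a few high-rate states from dominating the colour gradient. Note also that `log10(0)` is `-Inf`; in the script above this should not arise, because states with no recorded incidents never appear in `counts` and so drop out at the merge step.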