A recent factor analysis project (as discussed previously here, here, and here) gave me an opportunity to experiment with different ways of visualizing highly multidimensional data sets. Factor analysis results are often presented in tables of factor loadings, which are good when you want the numerical details but bad when you want to convey larger-scale patterns: loadings of 0.91 and 0.19 look similar in a table but very different in a graph. The detailed code is posted on RPubs because embedding the code, output, and figures in a webpage is much, much easier using RStudio's markdown functions. That version shows how to get the example data and how to format them correctly for these plots. Here I will just post the key plot commands and the figures those commands produce.
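For readers who do not want to click through, here is a rough idea of the shape the plot commands expect. They use a long-format data frame called loadings.m with the columns Test, Factor, and Loading, one row per test-by-factor pair. The sketch below is only an illustration in base R: the test names and loading values are made up, and the RPubs version may well build the data frame differently.

```r
# Hypothetical loadings matrix: 4 tests x 2 factors (made-up numbers,
# not the data from the paper)
loadings <- matrix(
  c( 0.91,  0.19,
     0.75, -0.30,
     0.12,  0.84,
    -0.25,  0.66),
  ncol = 2, byrow = TRUE,
  dimnames = list(Test   = c("TestA", "TestB", "TestC", "TestD"),
                  Factor = c("Factor1", "Factor2"))
)

# Convert the wide matrix to long format: one row per Test x Factor cell.
# The dimnames names become the column names; the values go in "Loading".
loadings.m <- as.data.frame(as.table(loadings), responseName = "Loading")

head(loadings.m)
```

Test and Factor come out as factors whose level order follows the row and column order of the matrix, which is what determines the order of the bars and facets in the plots.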
First, a bar graph showing each measure's factor loadings with each factor in a separate facet (subplot):
Second, the full pairwise correlation matrix with a stacked bar graph showing each measure's (absolute) loading on each factor:
library(ggplot2)

# For each test, plot the loading as the length and fill color of a bar.
# Note that the length will be the absolute value of the loading but the
# fill color will be the signed value, more on this below.
ggplot(loadings.m, aes(Test, abs(Loading), fill = Loading)) +
  facet_wrap(~ Factor, nrow = 1) + # place the factors in separate facets
  geom_bar(stat = "identity") + # make the bars
  coord_flip() + # flip the axes so the test names can be horizontal
  # define the fill color gradient: blue = positive, red = negative
  scale_fill_gradient2(name = "Loading",
                       high = "blue", mid = "white", low = "red",
                       midpoint = 0, guide = "none") +
  ylab("Loading Strength") + # improve y-axis label
  theme_bw(base_size = 10) # use a black-and-white theme with set font size
Fig. 1 from Mirman et al., 2015, Nature Communications
library(grid) # for adjusting plot margins

# place the tests on the x- and y-axes,
# fill the elements with the strength of the correlation
p1 <- ggplot(corrs.m, aes(Test2, Test, fill = abs(Correlation))) +
  geom_tile() + # rectangles for each correlation
  # add actual correlation value in the rectangle
  geom_text(aes(label = round(Correlation, 2)), size = 2.5) +
  theme_bw(base_size = 10) + # black and white theme with set font size
  # rotate x-axis labels so they don't overlap,
  # get rid of unnecessary axis titles,
  # adjust plot margins
  theme(axis.text.x = element_text(angle = 90),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        plot.margin = unit(c(3, 1, 0, 0), "mm")) +
  # set correlation fill gradient
  scale_fill_gradient(low = "white", high = "red") +
  guides(fill = "none") # omit unnecessary gradient legend

p2 <- ggplot(loadings.m, aes(Test, abs(Loading), fill = Factor)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  ylab("Loading Strength") +
  theme_bw(base_size = 10) +
  # remove labels and tweak margins for combining with the correlation matrix plot
  theme(axis.text.y = element_blank(),
        axis.title.y = element_blank(),
        plot.margin = unit(c(3, 1, 39, -3), "mm"))

library(gridExtra) # for combining the two plots
grid.arrange(p1, p2, ncol = 2, widths = c(2, 1)) # side-by-side, matrix gets more space
Fig. 2 from Mirman et al., in press, Neuropsychologia
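The correlation matrix plot likewise assumes a long-format data frame, corrs.m, with the columns Test, Test2, and Correlation. As a hedged sketch of one way to get there in base R (the scores below are random numbers with made-up test names, and the RPubs version may use a different reshaping function):

```r
# Hypothetical data: 20 observations on 4 made-up tests
set.seed(1)
scores <- as.data.frame(
  matrix(rnorm(80), ncol = 4,
         dimnames = list(NULL, c("TestA", "TestB", "TestC", "TestD")))
)

# Full pairwise correlation matrix (4 x 4)
corrs <- cor(scores)

# Name the two dimensions so they become the column names after reshaping
names(dimnames(corrs)) <- c("Test", "Test2")

# Long format: one row per pair of tests, correlation in "Correlation"
corrs.m <- as.data.frame(as.table(corrs), responseName = "Correlation")

head(corrs.m)
```

Because the two plots are only placed side by side and do not share an axis, the bars line up with the matrix rows only if Test has the same factor level order in corrs.m and loadings.m, so it is worth checking that before combining them.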
Neat visualizations! You might also be interested in tableplots - they are very useful for interpreting factor analysis patterns, either with or without shading (R manual: http://cran.r-project.org/web/packages/tableplot/tableplot.pdf and brief article: http://www.datavis.ca/papers/Tableplot-Kwan-etal2009.pdf)