# Exploratory factor analysis

## Install required packages

wants <- c("GPArotation", "mvtnorm", "psych")
has   <- wants %in% rownames(installed.packages())
if(any(!has)) install.packages(wants[!has])

## Factor analysis

### Simulate data

N <- 200
P <- 6
Q <- 2
(Lambda <- matrix(c(0.7,-0.4, 0.8,0, -0.2,0.9, -0.3,0.4, 0.3,0.7, -0.8,0.1),
nrow=P, ncol=Q, byrow=TRUE))
     [,1] [,2]
[1,]  0.7 -0.4
[2,]  0.8  0.0
[3,] -0.2  0.9
[4,] -0.3  0.4
[5,]  0.3  0.7
[6,] -0.8  0.1

Non correlated factors

set.seed(123)
library(mvtnorm)
Kf <- diag(Q)
mu <- c(5, 15)
FF <- rmvnorm(N, mean=mu,        sigma=Kf)
E  <- rmvnorm(N, mean=rep(0, P), sigma=diag(P))
X  <- FF %*% t(Lambda) + E

### Using factanal()

(fa <- factanal(X, factors=2, scores="regression"))

Call:
factanal(x = X, factors = 2, scores = "regression")

Uniquenesses:
 0.616 0.638 0.225 0.885 0.658 0.608

Factor1 Factor2
[1,]  0.570  -0.243
[2,]  0.601
[3,] -0.217   0.853
[4,] -0.197   0.276
[5,]  0.327   0.486
[6,] -0.622

Factor1 Factor2
Proportion Var   0.211   0.184
Cumulative Var   0.211   0.395

Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 1.43 on 4 degrees of freedom.
The p-value is 0.84 

### Using fa() from package psych with rotation

Rotation uses package GPArotation

library(psych)
corMat <- cor(X)
(faPC  <- fa(r=corMat, nfactors=2, n.obs=N, rotate="varimax"))
Factor Analysis using method =  minres
Call: fa(r = corMat, nfactors = 2, n.obs = N, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
MR2   MR1   h2   u2
1  0.57 -0.24 0.38 0.62
2  0.60  0.03 0.36 0.64
3 -0.22  0.85 0.77 0.23
4 -0.20  0.28 0.12 0.88
5  0.33  0.49 0.34 0.66
6 -0.62  0.07 0.39 0.61

MR2  MR1
Proportion Var        0.21 0.18
Cumulative Var        0.21 0.40
Proportion Explained  0.53 0.47
Cumulative Proportion 0.53 1.00

Test of the hypothesis that 2 factors are sufficient.

The degrees of freedom for the null model are  15  and the objective function was  0.82 with Chi Square of  160.4
The degrees of freedom for the model are 4  and the objective function was  0.01

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04

The harmonic number of observations is  200 with the empirical chi square  1.37  with prob <  0.85
The total number of observations was  200  with MLE Chi Square =  1.43  with prob <  0.84

Tucker Lewis Index of factoring reliability =  1.067
RMSEA index =  0  and the 90 % confidence intervals are  NA 0.061
BIC =  -19.77
Fit based upon off diagonal values = 1
Measures of factor score adequacy
MR2  MR1
Correlation of scores with factors             0.81 0.88
Multiple R square of scores with factors       0.66 0.78
Minimum correlation of possible factor scores  0.33 0.56

## Factor scores

bartlett <- fa$scores head(bartlett)  Factor1 Factor2 [1,] -0.14693 -0.4087 [2,] 1.12205 -0.1700 [3,] 0.06507 0.7207 [4,] -0.11232 0.1412 [5,] -0.24712 1.4317 [6,] 1.09157 0.3979 anderson <- factor.scores(x=X, f=faPC, method="Anderson") head(anderson$scores)
         MR2     MR1
[1,] -0.2025 -0.4724
[2,]  1.3725 -0.1340
[3,]  0.1182  0.8227
[4,] -0.1308  0.1545
[5,] -0.2286  1.6144
[6,]  1.3650  0.5099

factor.plot(faPC, cut=0.5) plot of chunk rerMultFA01

fa.diagram(faPC) plot of chunk rerMultFA01

## Determine number of factors

Parallel analysis and a "very simple structure" analysis provide help in selecting the number of factors. Again, package psych has the required functions. vss() takes the polychoric correlation matrix as an argument.

fa.parallel(X)                     # parallel analysis plot of chunk rerMultFA02

Parallel analysis suggests that the number of factors =  2  and the number of components =  2 
vss(X, n.obs=N, rotate="varimax")  # very simple structure plot of chunk rerMultFA02


Very Simple Structure
Call: VSS(x = x, n = n, rotate = rotate, diagonal = diagonal, fm = fm,
n.obs = n.obs, plot = plot, title = title)
VSS complexity 1 achieves a maximimum of 0.59  with  3  factors
VSS complexity 2 achieves a maximimum of 0.72  with  3  factors

The Velicer MAP criterion achieves a minimum of NA  with  1  factors

Velicer MAP
 0.07 0.10 0.25 0.45 1.00   NA

Very Simple Structure Complexity 1
 0.45 0.57 0.59 0.57 0.52 0.53

Very Simple Structure Complexity 2
 0.00 0.69 0.72 0.72 0.67 0.67

## Useful packages

For confirmatory factor analysis (CFA), see packages sem, OpenMx, and lavaan which implement structural equation models. More packages can be found in CRAN task view Psychometric Models.

## Detach (automatically) loaded packages (if possible)

try(detach(package:psych))
try(detach(package:GPArotation))
try(detach(package:mvtnorm))