1. Table of Contents

This project explores different variations of the exploratory factor analysis method for discovering latent patterns in adequately correlated high-dimensional data using various helpful packages in R. Methods applied in the analysis to estimate and identify potential underlying structures from observed variables included Principal Axes Factor Extraction and Maximum Likelihood Factor Extraction. The approaches used to simplify the derived factor structures to achieve a more interpretable pattern of factor loadings included Varimax Rotation and Promax Rotation. Combinations of the factor extraction and rotation methods were separately applied on the original dataset across different numbers of factors, with the model fit evaluated using the standardized root mean square of the residual, Tucker-Lewis fit index, Bayesian information criterion and high residual rate. The extracted and rotated factors were visualized using the factor loading and dandelion plots. All results were consolidated in a Summary presented at the end of the document.

Exploratory factor analysis is an unsupervised learning method that aims to uncover underlying patterns or relationships among a set of observed variables by exploring the structure of the data and identifying latent factors that explain the observed correlations. It also reduces the complexity of the data by grouping related variables together under these latent factors. The algorithms applied in this study (mostly contained in the psych, nFactors and parameters packages) attempt to extract a smaller number of factors that influence the correlations among variables and apply factor rotation methods to transform the initially extracted factors into a more interpretable form.
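As a rough illustration of the extract-then-rotate workflow described above, the sketch below fits a maximum likelihood factor model with base R's stats::factanal and compares Varimax and Promax rotations. It is a stand-in, not the psych-based code used in this project: the data are simulated, the variable names (v1 to v6) and the two latent factors are hypothetical, and factanal does not offer principal axes extraction.

```r
# Minimal sketch: maximum likelihood factor extraction with two rotations,
# using stats::factanal() on simulated data (NOT the JobHiring data).
set.seed(123)
n <- 200

# Two hypothetical latent factors driving six observed variables
f1 <- rnorm(n)
f2 <- rnorm(n)
sim <- data.frame(
  v1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  v2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  v3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  v4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  v5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  v6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

# Same extraction method, two different rotations
fit_varimax <- factanal(sim, factors = 2, rotation = "varimax")
fit_promax  <- factanal(sim, factors = 2, rotation = "promax")

# Suppress small loadings to make the pattern easier to read
print(loadings(fit_varimax), cutoff = 0.3)
print(loadings(fit_promax), cutoff = 0.3)
```

Varimax keeps the factors uncorrelated, while Promax allows them to correlate, so the two loading patterns can differ even though the extraction step is identical.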

1.1 Sample Data

The JobHiring dataset obtained from the Statistics By Jim website by Jim Frost was used for this illustrated example.

Preliminary dataset assessment:

[A] 50 rows (observations)

[B] 12 columns (variables)
     [B.1] 12/12 descriptors = 12/12 ordered category
            [B.1.1] ACAD (Academic Record)
            [B.1.2] APPR (Appearance)
            [B.1.3] COMM (Communication)
            [B.1.4] CFIT (Company Fit)
            [B.1.5] EXPR (Experience)
            [B.1.6] JFIT (Job Fit)
            [B.1.7] LETT (Cover Letter)
            [B.1.8] LIKE (Likeability)
            [B.1.9] ORGN (Organization)
            [B.1.10] POTL (Potential)
            [B.1.11] RESM (Resume)
            [B.1.12] SCON (Self-Confidence)

Code Chunk | Output

##################################
# Loading R libraries
##################################
library(AppliedPredictiveModeling)
library(performance)
library(parameters)
library(HH)
library(tidyr)
library(caret)
library(psych)
library(lattice)
library(dplyr)
library(moments)
library(skimr)
library(RANN)
library(pls)
library(corrplot)
library(lares)
library(DMwR2)
library(gridExtra)
library(rattle)
library(RColorBrewer)
library(stats)
library(factoextra)
library(FactoMineR)
library(gplots)
library(qgraph)
library(ggplot2)
library(nFactors)
library(MBESS)
library(DandEFA)
library(EFAtools)

##################################
# Defining file paths
##################################
DATASETS_PATH <- file.path("datasets")

##################################
# Loading source and
# formulating the analysis set
##################################
JobHiring <- read.csv(file.path("..", DATASETS_PATH, "JobHiring.csv"),
                      na.strings=c("NA","NaN"," ",""),
                      stringsAsFactors = FALSE)
JobHiring <- as.data.frame(JobHiring)

##################################
# Performing a general exploration of the data set
##################################
dim(JobHiring)

## [1] 50 12

str(JobHiring)

## 'data.frame':    50 obs. of  12 variables:
##  $ ACAD: int  6 9 6 7 4 7 6 7 9 6 ...
##  $ APPR: int  8 8 7 8 7 7 8 6 8 8 ...
##  $ COMM: int  7 8 7 6 8 7 6 6 8 7 ...
##  $ CFIT: int  5 8 6 5 6 5 7 5 9 8 ...
##  $ EXPR: int  6 10 6 8 6 5 7 8 9 6 ...
##  $ JFIT: int  5 9 7 5 6 6 7 6 8 7 ...
##  $ LETT: int  7 8 7 9 6 5 8 7 10 8 ...
##  $ LIKE: int  7 9 8 8 7 7 7 6 7 7 ...
##  $ ORGN: int  7 8 8 7 8 8 5 5 8 8 ...
##  $ POTL: int  6 9 6 8 5 7 7 7 8 6 ...
##  $ RESM: int  7 9 6 7 4 4 8 6 9 7 ...
##  $ SCON: int  7 9 8 7 6 6 7 6 8 7 ...

summary(JobHiring)

##       ACAD            APPR            COMM            CFIT      
##  Min.   : 4.00   Min.   : 5.00   Min.   : 3.00   Min.   : 3.00  
##  1st Qu.: 6.25   1st Qu.: 7.00   1st Qu.: 6.00   1st Qu.: 6.00  
##  Median : 7.00   Median : 7.50   Median : 7.00   Median : 7.00  
##  Mean   : 7.40   Mean   : 7.44   Mean   : 6.86   Mean   : 6.88  
##  3rd Qu.: 8.00   3rd Qu.: 8.00   3rd Qu.: 8.00   3rd Qu.: 8.00  
##  Max.   :10.00   Max.   :10.00   Max.   :10.00   Max.   :10.00  
##       EXPR            JFIT            LETT            LIKE           ORGN     
##  Min.   : 5.00   Min.   : 3.00   Min.   : 4.00   Min.   :4.00   Min.   :3.00  
##  1st Qu.: 6.00   1st Qu.: 6.00   1st Qu.: 6.00   1st Qu.:7.00   1st Qu.:5.25  
##  Median : 7.00   Median : 7.00   Median : 7.00   Median :7.50   Median :7.00  
##  Mean   : 7.32   Mean   : 7.02   Mean   : 7.22   Mean   :7.38   Mean   :6.86  
##  3rd Qu.: 8.00   3rd Qu.: 8.00   3rd Qu.: 8.00   3rd Qu.:8.00   3rd Qu.:8.00  
##  Max.   :10.00   Max.   :10.00   Max.   :10.00   Max.   :9.00   Max.   :9.00  
##       POTL           RESM            SCON     
##  Min.   : 4.0   Min.   : 4.00   Min.   :5.00  
##  1st Qu.: 6.0   1st Qu.: 6.00   1st Qu.:7.00  
##  Median : 7.5   Median : 7.00   Median :7.00  
##  Mean   : 7.3   Mean   : 7.24   Mean   :7.34  
##  3rd Qu.: 8.0   3rd Qu.: 9.00   3rd Qu.:8.00  
##  Max.   :10.0   Max.   :10.00   Max.   :9.00

##################################
# Formulating a data type assessment summary
##################################
PDA <- JobHiring
(PDA.Summary <- data.frame(
  Column.Index=c(1:length(names(PDA))),
  Column.Name= names(PDA), 
  Column.Type=sapply(PDA, function(x) class(x)), 
  row.names=NULL)
)

##    Column.Index Column.Name Column.Type
## 1             1        ACAD     integer
## 2             2        APPR     integer
## 3             3        COMM     integer
## 4             4        CFIT     integer
## 5             5        EXPR     integer
## 6             6        JFIT     integer
## 7             7        LETT     integer
## 8             8        LIKE     integer
## 9             9        ORGN     integer
## 10           10        POTL     integer
## 11           11        RESM     integer
## 12           12        SCON     integer

1.2 Data Quality Assessment

[A] No missing observations noted for any variable.

[B] No low variance observed for any variable with First.Second.Mode.Ratio>5.

[C] No low variance observed for any variable with Unique.Count.Ratio<0.01.

[D] No high skewness observed for any variable with Skewness>3 or Skewness<(-3).

Code Chunk | Output

##################################
# Loading dataset
##################################
DQA <- JobHiring

##################################
# Formulating an overall data quality assessment summary
##################################
(DQA.Summary <- data.frame(
  Column.Name= names(DQA),
  Column.Type=sapply(DQA, function(x) class(x)),
  Row.Count=sapply(DQA, function(x) nrow(DQA)),
  NA.Count=sapply(DQA,function(x)sum(is.na(x))),
  Fill.Rate=sapply(DQA,function(x)format(round((sum(!is.na(x))/nrow(DQA)),3),nsmall=3)),
  row.names=NULL)
)

##    Column.Name Column.Type Row.Count NA.Count Fill.Rate
## 1         ACAD     integer        50        0     1.000
## 2         APPR     integer        50        0     1.000
## 3         COMM     integer        50        0     1.000
## 4         CFIT     integer        50        0     1.000
## 5         EXPR     integer        50        0     1.000
## 6         JFIT     integer        50        0     1.000
## 7         LETT     integer        50        0     1.000
## 8         LIKE     integer        50        0     1.000
## 9         ORGN     integer        50        0     1.000
## 10        POTL     integer        50        0     1.000
## 11        RESM     integer        50        0     1.000
## 12        SCON     integer        50        0     1.000

##################################
# Listing all descriptors
##################################
DQA.Descriptors <- DQA

##################################
# Listing all numeric Descriptors
##################################
DQA.Descriptors.Numeric <- DQA.Descriptors[,sapply(DQA.Descriptors, is.numeric)]

if (length(names(DQA.Descriptors.Numeric))>0) {
    print(paste0("There are ",
               (length(names(DQA.Descriptors.Numeric))),
               " numeric descriptor variable(s)."))
} else {
  print("There are no numeric descriptor variables.")
}

## [1] "There are 12 numeric descriptor variable(s)."

##################################
# Listing all factor Descriptors
##################################
DQA.Descriptors.Factor <- DQA.Descriptors[,sapply(DQA.Descriptors, is.factor)]

if (length(names(DQA.Descriptors.Factor))>0) {
    print(paste0("There are ",
               (length(names(DQA.Descriptors.Factor))),
               " factor descriptor variable(s)."))
} else {
  print("There are no factor descriptor variables.")
}

## [1] "There are no factor descriptor variables."

##################################
# Formulating a data quality assessment summary for factor Descriptors
##################################
if (length(names(DQA.Descriptors.Factor))>0) {

  ##################################
  # Formulating a function to determine the first mode
  ##################################
  FirstModes <- function(x) {
    ux <- unique(na.omit(x))
    tab <- tabulate(match(x, ux))
    ux[tab == max(tab)]
  }

  ##################################
  # Formulating a function to determine the second mode
  ##################################
  SecondModes <- function(x) {
    ux <- unique(na.omit(x))
    tab <- tabulate(match(x, ux))
    fm = ux[tab == max(tab)]
    sm = x[!(x %in% fm)]
    usm <- unique(sm)
    tabsm <- tabulate(match(sm, usm))
    ifelse(is.na(usm[tabsm == max(tabsm)])==TRUE,
           return("x"),
           return(usm[tabsm == max(tabsm)]))
  }

  (DQA.Descriptors.Factor.Summary <- data.frame(
  Column.Name= names(DQA.Descriptors.Factor),
  Column.Type=sapply(DQA.Descriptors.Factor, function(x) class(x)),
  Unique.Count=sapply(DQA.Descriptors.Factor, function(x) length(unique(x))),
  First.Mode.Value=sapply(DQA.Descriptors.Factor, function(x) as.character(FirstModes(x)[1])),
  Second.Mode.Value=sapply(DQA.Descriptors.Factor, function(x) as.character(SecondModes(x)[1])),
  First.Mode.Count=sapply(DQA.Descriptors.Factor, function(x) sum(na.omit(x) == FirstModes(x)[1])),
  Second.Mode.Count=sapply(DQA.Descriptors.Factor, function(x) sum(na.omit(x) == SecondModes(x)[1])),
  Unique.Count.Ratio=sapply(DQA.Descriptors.Factor, function(x) format(round((length(unique(x))/nrow(DQA.Descriptors.Factor)),3), nsmall=3)),
  First.Second.Mode.Ratio=sapply(DQA.Descriptors.Factor, function(x) format(round((sum(na.omit(x) == FirstModes(x)[1])/sum(na.omit(x) == SecondModes(x)[1])),3), nsmall=3)),
  row.names=NULL)
  )

}

##################################
# Formulating a data quality assessment summary for numeric Descriptors
##################################
if (length(names(DQA.Descriptors.Numeric))>0) {

  ##################################
  # Formulating a function to determine the first mode
  ##################################
  FirstModes <- function(x) {
    ux <- unique(na.omit(x))
    tab <- tabulate(match(x, ux))
    ux[tab == max(tab)]
  }

  ##################################
  # Formulating a function to determine the second mode
  ##################################
  SecondModes <- function(x) {
    ux <- unique(na.omit(x))
    tab <- tabulate(match(x, ux))
    fm = ux[tab == max(tab)]
    sm = na.omit(x)[!(na.omit(x) %in% fm)]
    usm <- unique(sm)
    tabsm <- tabulate(match(sm, usm))
    ifelse(is.na(usm[tabsm == max(tabsm)])==TRUE,
           return(0.00001),
           return(usm[tabsm == max(tabsm)]))
  }

  (DQA.Descriptors.Numeric.Summary <- data.frame(
  Column.Name= names(DQA.Descriptors.Numeric),
  Column.Type=sapply(DQA.Descriptors.Numeric, function(x) class(x)),
  Unique.Count=sapply(DQA.Descriptors.Numeric, function(x) length(unique(x))),
  Unique.Count.Ratio=sapply(DQA.Descriptors.Numeric, function(x) format(round((length(unique(x))/nrow(DQA.Descriptors.Numeric)),3), nsmall=3)),
  First.Mode.Value=sapply(DQA.Descriptors.Numeric, function(x) format(round((FirstModes(x)[1]),3),nsmall=3)),
  Second.Mode.Value=sapply(DQA.Descriptors.Numeric, function(x) format(round((SecondModes(x)[1]),3),nsmall=3)),
  First.Mode.Count=sapply(DQA.Descriptors.Numeric, function(x) sum(na.omit(x) == FirstModes(x)[1])),
  Second.Mode.Count=sapply(DQA.Descriptors.Numeric, function(x) sum(na.omit(x) == SecondModes(x)[1])),
  First.Second.Mode.Ratio=sapply(DQA.Descriptors.Numeric, function(x) format(round((sum(na.omit(x) == FirstModes(x)[1])/sum(na.omit(x) == SecondModes(x)[1])),3), nsmall=3)),
  Minimum=sapply(DQA.Descriptors.Numeric, function(x) format(round(min(x,na.rm = TRUE),3), nsmall=3)),
  Mean=sapply(DQA.Descriptors.Numeric, function(x) format(round(mean(x,na.rm = TRUE),3), nsmall=3)),
  Median=sapply(DQA.Descriptors.Numeric, function(x) format(round(median(x,na.rm = TRUE),3), nsmall=3)),
  Maximum=sapply(DQA.Descriptors.Numeric, function(x) format(round(max(x,na.rm = TRUE),3), nsmall=3)),
  Skewness=sapply(DQA.Descriptors.Numeric, function(x) format(round(skewness(x,na.rm = TRUE),3), nsmall=3)),
  Kurtosis=sapply(DQA.Descriptors.Numeric, function(x) format(round(kurtosis(x,na.rm = TRUE),3), nsmall=3)),
  Percentile25th=sapply(DQA.Descriptors.Numeric, function(x) format(round(quantile(x,probs=0.25,na.rm = TRUE),3), nsmall=3)),
  Percentile75th=sapply(DQA.Descriptors.Numeric, function(x) format(round(quantile(x,probs=0.75,na.rm = TRUE),3), nsmall=3)),
  row.names=NULL)
  )

}

##    Column.Name Column.Type Unique.Count Unique.Count.Ratio First.Mode.Value
## 1         ACAD     integer            7              0.140            7.000
## 2         APPR     integer            6              0.120            8.000
## 3         COMM     integer            8              0.160            7.000
## 4         CFIT     integer            8              0.160            8.000
## 5         EXPR     integer            6              0.120            8.000
## 6         JFIT     integer            7              0.140            7.000
## 7         LETT     integer            7              0.140            8.000
## 8         LIKE     integer            6              0.120            8.000
## 9         ORGN     integer            7              0.140            7.000
## 10        POTL     integer            7              0.140            8.000
## 11        RESM     integer            7              0.140            9.000
## 12        SCON     integer            5              0.100            7.000
##    Second.Mode.Value First.Mode.Count Second.Mode.Count First.Second.Mode.Ratio
## 1              8.000               14                12                   1.167
## 2              7.000               19                16                   1.188
## 3              8.000               15                14                   1.071
## 4              7.000               15                11                   1.364
## 5              6.000               12                10                   1.200
## 6              8.000               13                12                   1.083
## 7              7.000               12                11                   1.091
## 8              7.000               18                16                   1.125
## 9              5.000               12                10                   1.200
## 10             7.000               15                11                   1.364
## 11             6.000               12                11                   1.091
## 12             8.000               16                14                   1.143
##    Minimum  Mean Median Maximum Skewness Kurtosis Percentile25th Percentile75th
## 1    4.000 7.400  7.000  10.000   -0.091    2.687          6.250          8.000
## 2    5.000 7.440  7.500  10.000   -0.014    2.861          7.000          8.000
## 3    3.000 6.860  7.000  10.000   -0.689    3.572          6.000          8.000
## 4    3.000 6.880  7.000  10.000   -0.324    2.484          6.000          8.000
## 5    5.000 7.320  7.000  10.000   -0.007    2.136          6.000          8.000
## 6    3.000 7.020  7.000  10.000   -0.610    3.541          6.000          8.000
## 7    4.000 7.220  7.000  10.000   -0.140    2.271          6.000          8.000
## 8    4.000 7.380  7.500   9.000   -0.701    3.506          7.000          8.000
## 9    3.000 6.860  7.000   9.000   -0.397    2.238          5.250          8.000
## 10   4.000 7.300  7.500  10.000   -0.457    2.681          6.000          8.000
## 11   4.000 7.240  7.000  10.000   -0.254    2.155          6.000          9.000
## 12   5.000 7.340  7.000   9.000   -0.302    2.365          7.000          8.000

##################################
# Identifying potential data quality issues
##################################

##################################
# Checking for missing observations
##################################
if ((nrow(DQA.Summary[DQA.Summary$NA.Count>0,]))>0){
  print(paste0("Missing observations noted for ",
               (nrow(DQA.Summary[DQA.Summary$NA.Count>0,])),
               " variable(s) with NA.Count>0 and Fill.Rate<1.0."))
  DQA.Summary[DQA.Summary$NA.Count>0,]
} else {
  print("No missing observations noted.")
}

## [1] "No missing observations noted."

##################################
# Checking for zero or near-zero variance Descriptors
##################################
if (length(names(DQA.Descriptors.Factor))==0) {
  print("No factor descriptors noted.")
} else if (nrow(DQA.Descriptors.Factor.Summary[as.numeric(as.character(DQA.Descriptors.Factor.Summary$First.Second.Mode.Ratio))>5,])>0){
  print(paste0("Low variance observed for ",
               (nrow(DQA.Descriptors.Factor.Summary[as.numeric(as.character(DQA.Descriptors.Factor.Summary$First.Second.Mode.Ratio))>5,])),
               " factor variable(s) with First.Second.Mode.Ratio>5."))
  DQA.Descriptors.Factor.Summary[as.numeric(as.character(DQA.Descriptors.Factor.Summary$First.Second.Mode.Ratio))>5,]
} else {
  print("No low variance factor descriptors due to high first-second mode ratio noted.")
}

## [1] "No factor descriptors noted."

if (length(names(DQA.Descriptors.Numeric))==0) {
  print("No numeric descriptors noted.")
} else if (nrow(DQA.Descriptors.Numeric.Summary[as.numeric(as.character(DQA.Descriptors.Numeric.Summary$First.Second.Mode.Ratio))>5,])>0){
  print(paste0("Low variance observed for ",
               (nrow(DQA.Descriptors.Numeric.Summary[as.numeric(as.character(DQA.Descriptors.Numeric.Summary$First.Second.Mode.Ratio))>5,])),
               " numeric variable(s) with First.Second.Mode.Ratio>5."))
  DQA.Descriptors.Numeric.Summary[as.numeric(as.character(DQA.Descriptors.Numeric.Summary$First.Second.Mode.Ratio))>5,]
} else {
  print("No low variance numeric descriptors due to high first-second mode ratio noted.")
}

## [1] "No low variance numeric descriptors due to high first-second mode ratio noted."

if (length(names(DQA.Descriptors.Numeric))==0) {
  print("No numeric descriptors noted.")
} else if (nrow(DQA.Descriptors.Numeric.Summary[as.numeric(as.character(DQA.Descriptors.Numeric.Summary$Unique.Count.Ratio))<0.01,])>0){
  print(paste0("Low variance observed for ",
               (nrow(DQA.Descriptors.Numeric.Summary[as.numeric(as.character(DQA.Descriptors.Numeric.Summary$Unique.Count.Ratio))<0.01,])),
               " numeric variable(s) with Unique.Count.Ratio<0.01."))
  DQA.Descriptors.Numeric.Summary[as.numeric(as.character(DQA.Descriptors.Numeric.Summary$Unique.Count.Ratio))<0.01,]
} else {
  print("No low variance numeric descriptors due to low unique count ratio noted.")
}

## [1] "No low variance numeric descriptors due to low unique count ratio noted."

##################################
# Checking for skewed Descriptors
##################################
if (length(names(DQA.Descriptors.Numeric))==0) {
  print("No numeric descriptors noted.")
} else if (nrow(DQA.Descriptors.Numeric.Summary[as.numeric(as.character(DQA.Descriptors.Numeric.Summary$Skewness))>3 |
                                               as.numeric(as.character(DQA.Descriptors.Numeric.Summary$Skewness))<(-3),])>0){
  print(paste0("High skewness observed for ",
  (nrow(DQA.Descriptors.Numeric.Summary[as.numeric(as.character(DQA.Descriptors.Numeric.Summary$Skewness))>3 |
                                               as.numeric(as.character(DQA.Descriptors.Numeric.Summary$Skewness))<(-3),])),
  " numeric variable(s) with Skewness>3 or Skewness<(-3)."))
  DQA.Descriptors.Numeric.Summary[as.numeric(as.character(DQA.Descriptors.Numeric.Summary$Skewness))>3 |
                                 as.numeric(as.character(DQA.Descriptors.Numeric.Summary$Skewness))<(-3),]
} else {
  print("No skewed numeric descriptors noted.")
}

## [1] "No skewed numeric descriptors noted."

1.3 Data Preprocessing

1.3.1 Outlier Detection

[A] Outliers noted for 3 out of the 12 descriptors, with the number of suspected outliers per descriptor listed below. Descriptor values were visualized through boxplots, with suspected outliers identified using the IQR criterion: all observations above (75th percentile + 1.5 x IQR) or below (25th percentile - 1.5 x IQR) are suspected outliers, where IQR is the difference between the third quartile (75th percentile) and the first quartile (25th percentile).
     [A.1] APPR = 2
     [A.2] LIKE = 3
     [A.3] SCON = 4
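
The IQR criterion described above can be sketched in base R as follows. The helper name iqr_outliers and the toy ratings vector are hypothetical. Note that boxplot.stats, used in the code chunk for this section, computes the fences from hinges rather than from quantile(), so the two approaches may disagree slightly on small samples.

```r
# Minimal sketch of the IQR criterion (hypothetical helper, base R only)
iqr_outliers <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - 1.5 * iqr   # lower fence
  upper <- q[2] + 1.5 * iqr   # upper fence
  x[!is.na(x) & (x < lower | x > upper)]
}

# Toy example: the values 1 and 30 fall outside the fences
ratings <- c(1, 10, 11, 12, 13, 14, 15, 30)
iqr_outliers(ratings)
```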

Code Chunk | Output

##################################
# Loading dataset
##################################
DPA <- JobHiring

##################################
# Gathering descriptive statistics
##################################
(DPA_Skimmed <- skim(DPA))

Data summary
Name	DPA
Number of rows	50
Number of columns	12
_______________________
Column type frequency:
numeric	12
________________________
Group variables	None

Variable type: numeric

skim_variable	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
ACAD	1	7.40	1.29	4	6.25	7.0	8	10	▁▆▇▇▆
APPR	1	7.44	1.01	5	7.00	7.5	8	10	▃▇▇▂▁
COMM	1	6.86	1.47	3	6.00	7.0	8	10	▁▁▇▅▁
CFIT	1	6.88	1.62	3	6.00	7.0	8	10	▂▃▇▇▃
EXPR	1	7.32	1.36	5	6.00	7.0	8	10	▇▆▆▅▁
JFIT	1	7.02	1.60	3	6.00	7.0	8	10	▁▁▇▃▂
LETT	1	7.22	1.67	4	6.00	7.0	8	10	▆▅▇▇▇
LIKE	1	7.38	1.12	4	7.00	7.5	8	9	▁▂▇▇▃
ORGN	1	6.86	1.58	3	5.25	7.0	8	9	▁▃▂▅▇
POTL	1	7.30	1.39	4	6.00	7.5	8	10	▂▅▆▇▅
RESM	1	7.24	1.68	4	6.00	7.0	9	10	▃▆▅▅▇
SCON	1	7.34	1.17	5	7.00	7.0	8	9	▂▃▇▇▅

##################################
# Outlier Detection
##################################

##################################
# Listing all Descriptors
##################################
DPA.Descriptors <- DPA

##################################
# Listing all numeric Descriptors
##################################
DPA.Descriptors.Numeric <- DPA.Descriptors[,sapply(DPA.Descriptors, is.numeric)]

##################################
# Identifying outliers for the numeric Descriptors
##################################
OutlierCountList <- c()

for (i in 1:ncol(DPA.Descriptors.Numeric)) {
  Outliers <- boxplot.stats(DPA.Descriptors.Numeric[,i])$out
  OutlierCount <- length(Outliers)
  OutlierCountList <- append(OutlierCountList,OutlierCount)
  OutlierIndices <- which(DPA.Descriptors.Numeric[,i] %in% c(Outliers))
  print(
  ggplot(DPA.Descriptors.Numeric, aes(x=DPA.Descriptors.Numeric[,i])) +
  geom_boxplot() +
  theme_bw() +
  theme(axis.text.y=element_blank(), 
        axis.ticks.y=element_blank()) +
  xlab(names(DPA.Descriptors.Numeric)[i]) +
  labs(title=names(DPA.Descriptors.Numeric)[i],
       subtitle=paste0(OutlierCount, " Outlier(s) Detected")))
}

1.3.2 Zero and Near-Zero Variance

[A] No low variance observed for any descriptor using a preprocessing summary from the caret package. The nearZeroVar method was applied on the dataset with the freqCut and uniqueCut criteria set at 80/20 and 10, respectively.
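
To make the two criteria concrete, the sketch below approximates the rule in base R: a variable is flagged when the ratio of its most common value to its second most common value exceeds freqCut and its percentage of unique values is at most uniqueCut. The helper near_zero_var_flag is hypothetical and only mirrors the general idea. It is not caret's implementation, and the defaults shown are the freqCut = 80/20 and uniqueCut = 10 values passed to nearZeroVar in this section's code chunk.

```r
# Minimal sketch of a near-zero variance rule (hypothetical helper, base R only)
near_zero_var_flag <- function(x, freq_cut = 80 / 20, unique_cut = 10) {
  x <- x[!is.na(x)]
  counts <- sort(table(x), decreasing = TRUE)
  # A single distinct value means zero variance
  if (length(counts) < 2) return(TRUE)
  freq_ratio <- as.numeric(counts[1]) / as.numeric(counts[2])
  percent_unique <- 100 * length(counts) / length(x)
  freq_ratio > freq_cut && percent_unique <= unique_cut
}

# Dominated by one value: flagged
near_zero_var_flag(c(rep(1, 95), rep(2, 5)))

# Evenly spread values: not flagged
near_zero_var_flag(rep(1:10, each = 10))
```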

Code Chunk | Output

##################################
# Zero and Near-Zero Variance
##################################

##################################
# Identifying columns with low variance
###################################
DPA_LowVariance <- nearZeroVar(DPA,
                               freqCut = 80/20,
                               uniqueCut = 10,
                               saveMetrics= TRUE)
(DPA_LowVariance[DPA_LowVariance$nzv,])

## [1] freqRatio     percentUnique zeroVar       nzv          
## <0 rows> (or 0-length row.names)

if ((nrow(DPA_LowVariance[DPA_LowVariance$nzv,]))==0){
  
  print("No low variance descriptors noted.")
  
} else {

  print(paste0("Low variance observed for ",
               (nrow(DPA_LowVariance[DPA_LowVariance$nzv,])),
               " numeric variable(s) with First.Second.Mode.Ratio>4 and Unique.Count.Ratio<0.10."))
  
  DPA_LowVarianceForRemoval <- (nrow(DPA_LowVariance[DPA_LowVariance$nzv,]))
  
  print(paste0("Low variance can be resolved by removing ",
               (nrow(DPA_LowVariance[DPA_LowVariance$nzv,])),
               " numeric variable(s)."))
  
  for (j in 1:DPA_LowVarianceForRemoval) {
  DPA_LowVarianceRemovedVariable <- rownames(DPA_LowVariance[DPA_LowVariance$nzv,])[j]
  print(paste0("Variable ",
               j,
               " for removal: ",
               DPA_LowVarianceRemovedVariable))
  }
  
  DPA %>%
  skim() %>%
  dplyr::filter(skim_variable %in% rownames(DPA_LowVariance[DPA_LowVariance$nzv,]))

}

## [1] "No low variance descriptors noted."

1.3.3 Collinearity

[A] No multicollinearity with Pearson correlation coefficients >90% was noted among pairs of descriptors using the preprocessing summary from the caret package. The highest pairwise coefficient observed was 88% (CFIT and JFIT).
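
As a complement to counting the highly correlated pairs, the sketch below lists which pairs exceed a cutoff. The helper high_cor_pairs and the toy matrix are hypothetical, and the default cutoff of 0.90 simply mirrors the value used in this section's code chunk.

```r
# Minimal sketch: list variable pairs whose |r| exceeds a cutoff
# (hypothetical helper, base R only)
high_cor_pairs <- function(cor_mat, cutoff = 0.90) {
  # Restrict to the upper triangle so each pair is reported once
  idx <- which(abs(cor_mat) > cutoff & upper.tri(cor_mat), arr.ind = TRUE)
  data.frame(
    var1 = rownames(cor_mat)[idx[, "row"]],
    var2 = colnames(cor_mat)[idx[, "col"]],
    r = cor_mat[idx],
    stringsAsFactors = FALSE
  )
}

# Toy correlation matrix with one pair above the cutoff
toy <- matrix(c(1.00, 0.95, 0.20,
                0.95, 1.00, 0.30,
                0.20, 0.30, 1.00),
              nrow = 3,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
high_cor_pairs(toy)
```

Applied to a real correlation matrix such as DPA_Correlation, lowering the cutoff would show how sensitive the conclusion is to the chosen threshold.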

Code Chunk | Output

##################################
# Visualizing pairwise correlation between descriptors
##################################
(DPA_Correlation <- cor(DPA.Descriptors.Numeric,
                        method = "pearson",
                        use="pairwise.complete.obs"))

##           ACAD      APPR      COMM      CFIT      EXPR      JFIT      LETT
## ACAD 1.0000000 0.4701427 0.4054067 0.4992183 0.6672390 0.5393435 0.3269457
## APPR 0.4701427 1.0000000 0.4939720 0.4170806 0.3099465 0.4105762 0.3034917
## COMM 0.4054067 0.4939720 1.0000000 0.5907387 0.1960094 0.5050808 0.4033781
## CFIT 0.4992183 0.4170806 0.5907387 1.0000000 0.3959678 0.8820580 0.4539438
## EXPR 0.6672390 0.3099465 0.1960094 0.3959678 1.0000000 0.5224501 0.4800248
## JFIT 0.5393435 0.4105762 0.5050808 0.8820580 0.5224501 1.0000000 0.4652077
## LETT 0.3269457 0.3034917 0.4033781 0.4539438 0.4800248 0.4652077 1.0000000
## LIKE 0.3990134 0.6392098 0.4900438 0.4842812 0.3058838 0.5305521 0.3572902
## ORGN 0.3578991 0.5370193 0.8617976 0.6223051 0.1732109 0.5274873 0.3372713
## POTL 0.7951979 0.5133541 0.4206029 0.6586426 0.7035542 0.7057635 0.3846555
## RESM 0.4606419 0.4986309 0.4255336 0.4432190 0.5172440 0.4607897 0.8368872
## SCON 0.4067717 0.6967065 0.4309427 0.4616973 0.4037882 0.4654267 0.2323177
##           LIKE      ORGN      POTL      RESM      SCON
## ACAD 0.3990134 0.3578991 0.7951979 0.4606419 0.4067717
## APPR 0.6392098 0.5370193 0.5133541 0.4986309 0.6967065
## COMM 0.4900438 0.8617976 0.4206029 0.4255336 0.4309427
## CFIT 0.4842812 0.6223051 0.6586426 0.4432190 0.4616973
## EXPR 0.3058838 0.1732109 0.7035542 0.5172440 0.4037882
## JFIT 0.5305521 0.5274873 0.7057635 0.4607897 0.4654267
## LETT 0.3572902 0.3372713 0.3846555 0.8368872 0.2323177
## LIKE 1.0000000 0.5258919 0.4881676 0.4685663 0.7532282
## ORGN 0.5258919 1.0000000 0.4199960 0.2968767 0.4900298
## POTL 0.4881676 0.4199960 1.0000000 0.4570046 0.5006195
## RESM 0.4685663 0.2968767 0.4570046 1.0000000 0.4334882
## SCON 0.7532282 0.4900298 0.5006195 0.4334882 1.0000000

DPA_CorrelationTest <- cor.mtest(DPA.Descriptors.Numeric,
                       method = "pearson",
                       conf.level = 0.95)

corrplot(cor(DPA.Descriptors.Numeric,
             method = "pearson",
             use="pairwise.complete.obs"),
             method = "circle",
             type = "upper",
             order = "original",
             tl.col = "black",
             tl.cex = 0.75,
             tl.srt = 90,
             sig.level = 0.05,
             p.mat = DPA_CorrelationTest$p,
             insig = "blank")

corrplot(cor(DPA.Descriptors.Numeric,
             method = "pearson",
             use="pairwise.complete.obs"),
             method = "number",
             type = "upper",
             order = "original",
             tl.col = "black",
             tl.cex = 0.75,
             tl.srt = 90,
             sig.level = 0.05,
             p.mat = DPA_CorrelationTest$p,
             insig = "blank")

##################################
# Identifying the highly correlated variables
##################################
(DPA_HighlyCorrelatedCount <- sum(abs(DPA_Correlation[upper.tri(DPA_Correlation)])>0.90))

## [1] 0

if (DPA_HighlyCorrelatedCount > 0) {
  DPA_HighlyCorrelated <- findCorrelation(DPA_Correlation, cutoff = 0.90)

  (DPA_HighlyCorrelatedForRemoval <- length(DPA_HighlyCorrelated))

  print(paste0("High correlation can be resolved by removing ",
               (DPA_HighlyCorrelatedForRemoval),
               " numeric variable(s)."))

  for (j in 1:DPA_HighlyCorrelatedForRemoval) {
  DPA_HighlyCorrelatedRemovedVariable <- colnames(DPA.Descriptors.Numeric)[DPA_HighlyCorrelated[j]]
  print(paste0("Variable ",
               j,
               " for removal: ",
               DPA_HighlyCorrelatedRemovedVariable))
  }

}

1.3.4 Linear Dependency

[A] No linear dependencies noted for any subset of descriptors using the preprocessing summary from the caret package. The findLinearCombos method, which utilizes the QR decomposition of a matrix to enumerate sets of linear combinations (if they exist), was applied.
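
The idea behind a QR-based check can be sketched with base R's qr function: if the numerical rank of the data matrix is lower than its number of columns, at least one column is a linear combination of the others. This is only an illustration on simulated data with hypothetical column names (a, b, c). It does not reproduce how findLinearCombos enumerates the dependent subsets.

```r
# Minimal sketch: detect a linear dependency through the QR rank (base R only)
set.seed(1)
X <- cbind(a = rnorm(10), b = rnorm(10))
X <- cbind(X, c = X[, "a"] + X[, "b"])  # c is an exact linear combination

qr_fit <- qr(X)
qr_fit$rank                              # numerical rank of X

# TRUE when at least one column is redundant
has_dependency <- qr_fit$rank < ncol(X)
has_dependency
```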

Code Chunk | Output

##################################
# Linear Dependencies
##################################

##################################
# Finding linear dependencies and
# identifying the linearly dependent variables
##################################
DPA_LinearlyDependent <- findLinearCombos(DPA.Descriptors.Numeric)

(DPA_LinearlyDependentCount <- length(DPA_LinearlyDependent$linearCombos))

## [1] 0

if (DPA_LinearlyDependentCount == 0) {
  print("No linearly dependent predictors noted.")
} else {
  print(paste0("Linear dependency observed for ",
               (DPA_LinearlyDependentCount),
               " subset(s) of numeric variable(s)."))
  
  for (i in 1:DPA_LinearlyDependentCount) {
    DPA_LinearlyDependentSubset <- colnames(DPA.Descriptors.Numeric)[DPA_LinearlyDependent$linearCombos[[i]]]
    print(paste0("Linear dependent variable(s) for subset ",
                 i,
                 " include: ",
                 paste(DPA_LinearlyDependentSubset, collapse = ", ")))
  }
  
}

## [1] "No linearly dependent predictors noted."

##################################
# Identifying the linearly dependent variables for removal
##################################

if (DPA_LinearlyDependentCount > 0) {
  DPA_LinearlyDependent <- findLinearCombos(DPA.Descriptors.Numeric)
  
  DPA_LinearlyDependentForRemoval <- length(DPA_LinearlyDependent$remove)
  
  print(paste0("Linear dependency can be resolved by removing ",
               (DPA_LinearlyDependentForRemoval),
               " numeric variable(s)."))
  
  for (j in 1:DPA_LinearlyDependentForRemoval) {
  DPA_LinearlyDependentRemovedVariable <- colnames(DPA.Descriptors.Numeric)[DPA_LinearlyDependent$remove[j]]
  print(paste0("Variable ",
               j,
               " for removal: ",
               DPA_LinearlyDependentRemovedVariable))
  }

}
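
The QR-based idea behind the linear dependency check can be illustrated with base R alone. This is a minimal sketch, not the caret implementation: the toy columns x1, x2 and x3 are hypothetical, and x3 is deliberately built as an exact linear combination of the other two so that the numerical rank falls below the number of columns.

```r
# Toy matrix (hypothetical data): x3 is an exact linear combination of x1 and x2
set.seed(123)
x1 <- rnorm(20)
x2 <- rnorm(20)
x3 <- 2 * x1 - x2
ToyMatrix <- cbind(x1, x2, x3)

# The QR decomposition exposes the numerical rank of the matrix
ToyQR <- qr(ToyMatrix)
ToyRank <- ToyQR$rank

# A rank below the column count signals at least one linear dependency
ToyHasLinearDependency <- ToyRank < ncol(ToyMatrix)
print(ToyRank)
print(ToyHasLinearDependency)
```

A full-rank result, as obtained for the descriptors in this study, means that no column can be written exactly as a combination of the others.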

1.3.5 Distributional Shape

[A] No shape transformation was necessary: the skewness observed for each individual descriptor was within the acceptable range (all values between -3 and +3), with minimal outliers.

Code Chunk | Output

##################################
# Distributional Shape
##################################

##################################
# Formulating the histogram
# for the numeric descriptors
##################################
for (i in 1:ncol(DPA.Descriptors.Numeric)) {
  Median <- format(round(median(DPA.Descriptors.Numeric[,i],na.rm = TRUE),2), nsmall=2)
  Mean <- format(round(mean(DPA.Descriptors.Numeric[,i],na.rm = TRUE),2), nsmall=2)
  Skewness <- format(round(skewness(DPA.Descriptors.Numeric[,i],na.rm = TRUE),2), nsmall=2)
  print(
  ggplot(DPA.Descriptors.Numeric, aes(x=DPA.Descriptors.Numeric[,i])) +
  geom_histogram(binwidth=1,color="black", fill="white") +
  geom_vline(aes(xintercept=mean(DPA.Descriptors.Numeric[,i])),
            color="blue", size=1) +
    geom_vline(aes(xintercept=median(DPA.Descriptors.Numeric[,i])),
            color="red", size=1) +
  theme_bw() +
  ylab("Count") +
  xlab(names(DPA.Descriptors.Numeric)[i]) +
  labs(title=names(DPA.Descriptors.Numeric)[i],
       subtitle=paste0("Median = ", Median,
                       ", Mean = ", Mean,
                       ", Skewness = ", Skewness)))
}
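
For reference, the skewness statistic reported in the plot subtitles can be approximated with a moment-based formula. The sketch below is an assumption about the estimator (the packaged skewness function used above may apply a slightly different small-sample correction), and the two score vectors are hypothetical rather than taken from the JobHiring data.

```r
# Moment-based sample skewness:
# third central moment divided by the second central moment raised to 1.5
MomentSkewness <- function(x) {
  x <- x[!is.na(x)]
  Deviation <- x - mean(x)
  mean(Deviation^3) / mean(Deviation^2)^1.5
}

# Hypothetical Likert-type scores
SymmetricScores <- c(4, 5, 6, 7, 8, 9, 10)
LeftSkewedScores <- c(3, 7, 8, 8, 9, 9, 9, 10)

# A symmetric distribution gives a value of zero,
# while a long left tail gives a negative value
print(MomentSkewness(SymmetricScores))
print(MomentSkewness(LeftSkewedScores))
```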

1.4 Data Pre-Assessment

1.4.1 Correlation Matrix Assessment - Covariance Validity

Covariance Validity evaluates whether the proportion of associated variables in the data set is sufficient to support the assumption that correlations exist. The criterion is computed as the proportion of variable pairs whose correlation coefficient has an absolute value above 0.30. A value closer to 1 suggests an adequate percentage of pairwise-correlated variables.

[A] Covariance among descriptors in the correlation matrix was sufficient to justify the conduct of an exploratory factor analysis. 94% (62/66) of the pairwise associations using the Pearson correlation coefficient were above 0.30 in absolute value.
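
As a minimal sketch of the criterion, assuming hypothetical simulated data rather than the JobHiring descriptors, the proportion can be computed from the upper triangle of a correlation matrix so that each pair is counted once. The variable names V1 to V4 and the simulation settings are illustrative assumptions.

```r
# Hypothetical data: three variables driven by one shared signal
# plus one unrelated variable
set.seed(123)
SharedSignal <- rnorm(200)
ToyData <- data.frame(V1 = SharedSignal + rnorm(200, sd = 0.5),
                      V2 = SharedSignal + rnorm(200, sd = 0.5),
                      V3 = SharedSignal + rnorm(200, sd = 0.5),
                      V4 = rnorm(200))

# Each variable pair appears once in the upper triangle
ToyCorrelation <- cor(ToyData)
ToyPairs <- ToyCorrelation[upper.tri(ToyCorrelation)]

# Proportion of pairs with an absolute correlation above 0.30
ToyProportion <- mean(abs(ToyPairs) > 0.30)
print(ToyProportion)
```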

Code Chunk | Output

##################################
# Identifying the minimally correlated variables
##################################
(DPA_MinimallyCorrelatedCount <- sum(abs(DPA_Correlation[upper.tri(DPA_Correlation)])>0.30))

## [1] 62

(DPA_AllPairs <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(DPA_MinimallyCorrelatedCountPercentage <- DPA_MinimallyCorrelatedCount/DPA_AllPairs)

## [1] 0.9393939

qgraph(cor(DPA.Descriptors.Numeric),
       cut=0.30,
       details=TRUE,
       posCol="#2F75B5",
       negCol="#FF5050",
       labels=names(DPA.Descriptors.Numeric))

1.4.2 Correlation Matrix Assessment - Determinant Computation

Determinant Computation reflects the extent of multicollinearity among the variables in a correlation matrix. An extremely small determinant value indicates that the variables are highly correlated and nearly linearly dependent which can lead to unstable results and difficulty in interpreting their individual contributions.

[A] The determinant of the correlation matrix was computed as 0.00002 (greater than 0.00001), indicating that a multicollinearity problem was unlikely. The result supported stable matrix operations during exploratory factor analysis.
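
The behaviour of the determinant can be seen on two small hypothetical correlation matrices: one for fully uncorrelated variables and one for variables that are almost interchangeable. Both matrices are assumptions made for illustration only.

```r
# Hypothetical correlation matrix for three uncorrelated variables
IndependentR <- diag(3)

# Hypothetical correlation matrix for three nearly collinear variables
NearlyCollinearR <- matrix(c(1.00, 0.99, 0.99,
                             0.99, 1.00, 0.99,
                             0.99, 0.99, 1.00), nrow = 3)

# The determinant equals 1 without any correlation
# and shrinks toward 0 as the variables approach linear dependence
IndependentDeterminant <- det(IndependentR)
NearlyCollinearDeterminant <- det(NearlyCollinearR)
print(IndependentDeterminant)
print(NearlyCollinearDeterminant)
```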

Code Chunk | Output

##################################
# Computing the determinant of the correlation matrix
##################################
(DPA_CorrelationMatrixDeterminant <- det(cor(DPA.Descriptors.Numeric)))

## [1] 2.258001e-05

1.4.3 Correlation Matrix Assessment - Bartlett’s Test of Sphericity

Bartlett’s Test of Sphericity evaluates whether the correlations between variables in a data set are significant enough to support the assumption that underlying factors exist and can be extracted. The test calculates a Chi-Square statistic based on the difference between the observed correlation matrix and an identity matrix. The null hypothesis states that the correlation matrix is an identity matrix, meaning that the variables are uncorrelated and do not have any underlying structure. The larger the Chi-Square value, the stronger the evidence against this null hypothesis.

[A] The computed p-value from the Bartlett’s Test of Sphericity was statistically significant (<0.00001), rejecting the null hypothesis that the correlation matrix is an identity matrix (ones on the diagonal and zeros on the off-diagonal). Results indicated that there is enough evidence to support the existence of underlying factors, suggesting that the correlation matrix is appropriate for exploratory factor analysis.
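
The test statistic can be reproduced by hand from quantities already reported in this document: 50 observations, 12 variables and a correlation matrix determinant of 2.258001e-05. The sketch below uses the usual closed form of Bartlett's statistic. The variable names are illustrative, and the rounded determinant means the result should only agree approximately with the Chi-Square of 472.51 on 66 degrees of freedom reported for this data set.

```r
# Values reported in this document
SampleSize <- 50
VariableCount <- 12
CorrelationDeterminant <- 2.258001e-05

# Bartlett's statistic: -(n - 1 - (2p + 5)/6) * log(det(R))
BartlettChiSquare <- -(SampleSize - 1 - (2 * VariableCount + 5) / 6) *
  log(CorrelationDeterminant)

# Degrees of freedom: number of unique variable pairs, p(p - 1)/2
BartlettDF <- VariableCount * (VariableCount - 1) / 2

# Upper-tail probability under the Chi-Square distribution
BartlettPValue <- pchisq(BartlettChiSquare, df = BartlettDF, lower.tail = FALSE)

print(BartlettChiSquare)
print(BartlettDF)
print(BartlettPValue)
```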

Code Chunk | Output

##################################
# Calculating the Bartlett's Test of Sphericity
##################################
(DPA_BartlettTest <- cortest.bartlett(DPA.Descriptors.Numeric,
                                      n=nrow(DPA.Descriptors.Numeric)))

## $chisq
## [1] 472.5147
## 
## $p.value
## [1] 9.677847e-63
## 
## $df
## [1] 66

1.4.4 Correlation Matrix Assessment - Kaiser-Meyer-Olkin Factor Adequacy

Kaiser-Meyer-Olkin Factor Adequacy evaluates whether the observed variables are suitable for exploratory factor analysis based on their common variance and the potential for extracting meaningful factors. The criterion is computed as the ratio of the sum of squared correlations between variables to the combined sum of squared correlations and squared partial correlations. A KMO value closer to 1 suggests that the variables have high shared variance and are suitable to proceed with the analysis.

[A] The KMO measure of sampling adequacy was acceptable with a computed value of 0.798 for the complete model, indicating the suitability of the data for exploratory factor analysis. The estimated proportion of variance shared among the observed variables was adequate.

[B] The sampling adequacy for each variable in the model was acceptable with KMO values ranging from 0.676 to 0.883. Results indicated that each variable can be sufficiently predicted by other variables.
     [B.1] ACAD = 0.85
     [B.2] APPR = 0.88
     [B.3] COMM = 0.79
     [B.4] CFIT = 0.81
     [B.5] EXPR = 0.78
     [B.6] JFIT = 0.83
     [B.7] LETT = 0.68
     [B.8] LIKE = 0.85
     [B.9] ORGN = 0.74
     [B.10] POTL = 0.88
     [B.11] RESM = 0.71
     [B.12] SCON = 0.78
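
The KMO ratio can be sketched in base R from a correlation matrix and its inverse. This is an illustrative computation on hypothetical simulated data (four indicators of one shared signal), not the packaged KMO function used in the code chunk, and all object names are assumptions.

```r
# Hypothetical data: four indicators of one shared signal
set.seed(123)
SharedSignal <- rnorm(300)
ToyData <- data.frame(V1 = SharedSignal + rnorm(300),
                      V2 = SharedSignal + rnorm(300),
                      V3 = SharedSignal + rnorm(300),
                      V4 = SharedSignal + rnorm(300))
ToyR <- cor(ToyData)

# Partial correlations obtained from the inverse correlation matrix
ToyRInverse <- solve(ToyR)
ToyPartial <- -ToyRInverse / sqrt(outer(diag(ToyRInverse), diag(ToyRInverse)))

# Only off-diagonal elements enter the criterion
OffDiagonal <- row(ToyR) != col(ToyR)
ToyRSquared <- ToyR^2 * OffDiagonal
ToyPartialSquared <- ToyPartial^2 * OffDiagonal

# Overall KMO: squared correlations relative to
# squared correlations plus squared partial correlations
ToyKMO <- sum(ToyRSquared) / (sum(ToyRSquared) + sum(ToyPartialSquared))

# Per-variable KMO: the same ratio restricted to each variable's row
ToyKMOPerVariable <- rowSums(ToyRSquared) /
  (rowSums(ToyRSquared) + rowSums(ToyPartialSquared))

print(ToyKMO)
print(ToyKMOPerVariable)
```

Small partial correlations relative to the raw correlations push the ratio toward 1, which is the pattern expected when the variables share common factors.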

Code Chunk | Output

##################################
# Calculating the Kaiser-Meyer-Olkin Factor Adequacy
##################################
(DPA_KMOFactorAdequacy <- KMO(DPA.Descriptors.Numeric))

## 
## ── Kaiser-Meyer-Olkin criterion (KMO) ──────────────────────────────────────────
## 
## ✔ The overall KMO value for your data is middling.
##   These data are probably suitable for factor analysis.
## 
##   Overall: 0.798
## 
##   For each variable:
##  ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM  SCON 
## 0.847 0.876 0.790 0.809 0.782 0.827 0.676 0.851 0.740 0.883 0.712 0.777

1.5 Data Exploration

[A] With the descriptors treated as ordered categorical variables, distribution patterns were observed as follows:
     [A.1] As individual descriptors, ACAD, APPR, LIKE and SCON obtained higher proportions of Likert scale scores of 6 and above
     [A.2] As individual descriptors, CFIT and ORGN obtained higher proportions of Likert scale scores of 5 and below

[B] With the descriptors treated as numeric variables, all pairwise associations were positive with the highest Pearson correlation coefficient values noted for the following:
     [B.1] JFIT and CFIT = 0.88
     [B.2] COMM and ORGN = 0.86
     [B.3] RESM and LETT = 0.84

Code Chunk | Output

##################################
# Percentage contribution
# among descriptors when treated
# as ordered categorical variables
##################################
DPA.Descriptors.Ordered <- gather(DPA.Descriptors.Numeric,
                                   "ACAD",
                                   "APPR",
                                   "COMM",
                                   "CFIT",
                                   "EXPR",
                                   "JFIT",
                                   "LETT",
                                   "LIKE",  
                                   "ORGN",
                                   "POTL",
                                   "RESM",
                                   "SCON",
                                   key="Likert_Variable",
                                   value="Likert_Score")


DPA.Descriptors.Ordered$Likert_Variable <- as.factor(DPA.Descriptors.Ordered$Likert_Variable)
DPA.Descriptors.Ordered$Likert_Score <- factor(DPA.Descriptors.Ordered$Likert_Score,
                                               levels=c("1","2","3","4","5","6","7","8","9","10"))

(DPA.Descriptors.OrderedTable <- table(DPA.Descriptors.Ordered$Likert_Variable,
                                       DPA.Descriptors.Ordered$Likert_Score))

##       
##         1  2  3  4  5  6  7  8  9 10
##   ACAD  0  0  0  1  1 11 14 12  9  2
##   APPR  0  0  0  0  1  8 16 19  5  1
##   CFIT  0  0  1  3  8  6 11 15  4  2
##   COMM  0  0  2  2  3 10 15 14  3  1
##   EXPR  0  0  0  0  5 10 12 12  9  2
##   JFIT  0  0  3  0  3 11 13 12  6  2
##   LETT  0  0  0  3  6  7 11 12  6  5
##   LIKE  0  0  0  1  2  6 16 18  7  0
##   ORGN  0  0  1  2 10  5 12 12  8  0
##   POTL  0  0  0  2  3  9 11 15  9  1
##   RESM  0  0  0  4  3 11  9  8 12  3
##   SCON  0  0  0  0  4  7 16 14  9  0

DPA.Descriptors.OrderedTableProportion <- as.data.frame(rbind(prop.table(DPA.Descriptors.OrderedTable,1)))
DPA.Descriptors.OrderedTableProportion$LikertVariables <- rownames(DPA.Descriptors.OrderedTableProportion)
DPA.Descriptors.OrderedTableProportion

##      1 2    3    4    5    6    7    8    9   10 LikertVariables
## ACAD 0 0 0.00 0.02 0.02 0.22 0.28 0.24 0.18 0.04            ACAD
## APPR 0 0 0.00 0.00 0.02 0.16 0.32 0.38 0.10 0.02            APPR
## CFIT 0 0 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04            CFIT
## COMM 0 0 0.04 0.04 0.06 0.20 0.30 0.28 0.06 0.02            COMM
## EXPR 0 0 0.00 0.00 0.10 0.20 0.24 0.24 0.18 0.04            EXPR
## JFIT 0 0 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04            JFIT
## LETT 0 0 0.00 0.06 0.12 0.14 0.22 0.24 0.12 0.10            LETT
## LIKE 0 0 0.00 0.02 0.04 0.12 0.32 0.36 0.14 0.00            LIKE
## ORGN 0 0 0.02 0.04 0.20 0.10 0.24 0.24 0.16 0.00            ORGN
## POTL 0 0 0.00 0.04 0.06 0.18 0.22 0.30 0.18 0.02            POTL
## RESM 0 0 0.00 0.08 0.06 0.22 0.18 0.16 0.24 0.06            RESM
## SCON 0 0 0.00 0.00 0.08 0.14 0.32 0.28 0.18 0.00            SCON

likert(LikertVariables~., 
       DPA.Descriptors.OrderedTableProportion, 
       ReferenceZero=5.5, 
       ylab = "Likert Variables", 
       xlab = "Percentage Distribution of Likert Scale Scores",
       main = list("Variables Characterizing Candidates During Job Hiring"), 
       auto.key = list(columns=10, 
                       space="top",
                       reverse.rows=F))

##################################
# Pairwise association
# between descriptors when treated
# as quantitative variables
##################################
corrplot(cor(DPA.Descriptors.Numeric,
             method = "pearson",
             use="pairwise.complete.obs"),
             method = "number",
             type = "upper",
             order = "original",
             tl.col = "black",
             tl.cex = 0.75,
             tl.srt = 90,
             sig.level = 0.05,
             p.mat = DPA_CorrelationTest$p,
             insig = "blank")

##################################
# Identifying the maximally correlated variables
##################################
DPA_MaximallyCorrelated <- findCorrelation(DPA_Correlation, cutoff = 0.80)
colnames(DPA.Descriptors.Numeric)[DPA_MaximallyCorrelated]

## [1] "JFIT" "RESM" "COMM"

##################################
# Formulating the pairwise scatterplots
# among descriptors
##################################
splom(~DPA.Descriptors.Numeric,
      pch = 16,
      cex = 1,
      alpha = 0.45,
      auto.key = list(points=TRUE,
                      space="top"),
      main = "Exploratory Analysis Between Descriptors",
      xlab = "Scatterplot Matrix of Descriptors")

1.6 Factor Analysis

1.6.1 Principal Axes Factor Extraction and Varimax Rotation (FA_PA_V)

Principal Axes Factor Extraction identifies the underlying constructs that explain the observed correlations among variables by analyzing only the common variance (shared among variables), with the unique variance (specific to each variable) set aside through communality estimates. This process potentially results in factors with lower communalities (explained variance) but with more direct interpretability. The algorithm performs eigenvalue decomposition on the correlation matrix after its diagonal is replaced with communality estimates. The eigenvalues represent the amount of variance explained by each eigenvector. Given a defined number of factors, loadings are calculated for each observed variable on each extracted factor. Factor loadings indicate the strength and direction of the relationship between variables and factors. Factors are interpreted based on the loading patterns. Variables with high loadings on a factor are strongly associated with the factor.
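
The extraction steps described above can be sketched as an iterated principal axes loop in base R. This is a simplified illustration, not the routine used by the fa function: the population loadings are hypothetical, the starting communalities are assumed to be squared multiple correlations, and the stopping rule is an arbitrary choice for this example.

```r
# Hypothetical two-factor population loadings (not estimated from the JobHiring data)
TrueLoadings <- matrix(c(0.8, 0.7, 0.6, 0.0, 0.0, 0.0,
                         0.0, 0.0, 0.0, 0.9, 0.8, 0.7), ncol = 2)
ToyR <- TrueLoadings %*% t(TrueLoadings)
diag(ToyR) <- 1
FactorCount <- 2

# Initial communality estimates: squared multiple correlations
Communality <- 1 - 1 / diag(solve(ToyR))

for (Iteration in 1:500) {
  # Reduced correlation matrix: communalities replace the unit diagonal
  ReducedR <- ToyR
  diag(ReducedR) <- Communality

  # Loadings from the leading eigenvalues and eigenvectors
  Eigen <- eigen(ReducedR, symmetric = TRUE)
  Values <- pmax(Eigen$values[1:FactorCount], 0)
  Loadings <- Eigen$vectors[, 1:FactorCount, drop = FALSE] %*%
    diag(sqrt(Values), nrow = FactorCount)

  # Updated communalities are the row sums of squared loadings
  NewCommunality <- rowSums(Loadings^2)
  Converged <- max(abs(NewCommunality - Communality)) < 1e-8
  Communality <- NewCommunality
  if (Converged) break
}

# Off-diagonal residuals between observed and reproduced correlations
Residual <- ToyR - Loadings %*% t(Loadings)
diag(Residual) <- NA
MaxAbsResidual <- max(abs(Residual), na.rm = TRUE)
print(round(Communality, 3))
print(MaxAbsResidual)
```

Because the toy correlation matrix follows an exact two-factor structure, the loop is expected to settle on communalities close to the row sums of the squared population loadings.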

Varimax Rotation is an orthogonal rotation method which forces the rotated factors to be uncorrelated with each other, leading to simpler and more easily interpretable factor solutions. The algorithm aims to maximize the variance of the squared loadings within each factor which helps identify variables that are strongly associated with a single factor. The results are straightforward to interpret and can be particularly useful when the factors are expected to be independent.
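
Two properties of an orthogonal rotation can be checked on a small example: the rotation matrix is orthogonal, and the communalities are unchanged by the rotation. The sketch below uses the base R varimax function from the stats package as a stand-in for the rotation performed inside fa, and the unrotated loadings are hypothetical values chosen for illustration.

```r
# Hypothetical unrotated loadings for six variables on two factors
UnrotatedLoadings <- matrix(c(0.70, 0.65, 0.60, 0.60, 0.55, 0.50,
                              0.40, 0.45, 0.35, -0.45, -0.40, -0.50), ncol = 2)

# stats::varimax returns the rotated loadings and the rotation matrix
ToyVarimax <- varimax(UnrotatedLoadings)
RotatedLoadings <- unclass(ToyVarimax$loadings)
RotationMatrix <- ToyVarimax$rotmat

# Orthogonality: the rotation matrix times its transpose is the identity
OrthogonalityCheck <- RotationMatrix %*% t(RotationMatrix)

# Communalities (row sums of squared loadings) before and after rotation
CommunalityBefore <- rowSums(UnrotatedLoadings^2)
CommunalityAfter <- rowSums(RotatedLoadings^2)

print(round(RotatedLoadings, 2))
print(round(OrthogonalityCheck, 6))
```

The rotation only redistributes explained variance across factors, which is why fit measures computed from the residuals are the same before and after rotation.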

[A] Applying Principal Axes factor extraction and Varimax rotation, an evaluation was conducted using a set of empirical guidelines to determine the optimal number of factors to be retained for exploratory factor analysis. It was determined that:
     [A.1] 4 factors would be sufficient for an optimal balance between comprehensiveness and parsimony.
     [A.2] To ensure that both under-extraction and over-extraction are assessed, models with 3, 4 and 5 factors were sequentially evaluated for their interpretability and theoretical meaningfulness.
     [A.3] The choice of 4 factors was supported by maximum consensus (35.71%) from 5 methods (Bentler, Beta, Parallel Analysis, Kaiser Criterion and Scree SE) out of 14.

[B] Results for the exploratory factor analysis using a 3-Factor Structure were as follows:
     [B.1] Standardized Root Mean Square of the Residual = 0.07
     [B.2] Tucker-Lewis Fit Index = 0.62
     [B.3] Bayesian Information Criterion = -22.76
     [B.4] High Residual Rate = 0.27
     [B.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [B.5.1] ORGN: Loading = 0.88, Communality = 0.79
            [B.5.2] COMM: Loading = 0.79, Communality = 0.69
            [B.5.3] LIKE: Loading = 0.64, Communality = 0.54
            [B.5.4] APPR: Loading = 0.60, Communality = 0.50
            [B.5.5] SCON: Loading = 0.60, Communality = 0.51
            [B.5.6] CFIT: Loading = 0.60, Communality = 0.61
            [B.5.7] Cronbach’s Alpha = 0.88
     [B.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [B.6.1] POTL: Loading = 0.85, Communality = 0.87
            [B.6.2] EXPR: Loading = 0.76, Communality = 0.69
            [B.6.3] ACAD: Loading = 0.74, Communality = 0.65
            [B.6.4] JFIT: Loading = 0.56, Communality = 0.64
            [B.6.5] Cronbach’s Alpha = 0.88
     [B.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [B.7.1] LETT: Loading = 0.88, Communality = 0.86
            [B.7.2] RESM: Loading = 0.81, Communality = 0.83
            [B.7.3] Cronbach’s Alpha = 0.91

[C] Results for the exploratory factor analysis using a 4-Factor Structure were as follows:
     [C.1] Standardized Root Mean Square of the Residual = 0.04
     [C.2] Tucker-Lewis Fit Index = 0.73
     [C.3] Bayesian Information Criterion = -33.02
     [C.4] High Residual Rate = 0.15
     [C.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [C.5.1] POTL: Loading = 0.85, Communality = 0.89
            [C.5.2] EXPR: Loading = 0.73, Communality = 0.68
            [C.5.3] ACAD: Loading = 0.71, Communality = 0.64
            [C.5.4] JFIT: Loading = 0.61, Communality = 0.73
            [C.5.5] Cronbach’s Alpha = 0.88
     [C.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [C.6.1] ORGN: Loading = 0.84, Communality = 0.86
            [C.6.2] COMM: Loading = 0.76, Communality = 0.73
            [C.6.3] CFIT: Loading = 0.65, Communality = 0.76
            [C.6.4] Cronbach’s Alpha = 0.87
     [C.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [C.7.1] SCON: Loading = 0.82, Communality = 0.79
            [C.7.2] APPR: Loading = 0.70, Communality = 0.65
            [C.7.3] LIKE: Loading = 0.69, Communality = 0.66
            [C.7.4] Cronbach’s Alpha = 0.87
     [C.8] Factor 4 was a latent variable with higher loading towards the following descriptors:
            [C.8.1] LETT: Loading = 0.88, Communality = 0.89
            [C.8.2] RESM: Loading = 0.83, Communality = 0.88
            [C.8.3] Cronbach’s Alpha = 0.91

[D] Results for the exploratory factor analysis using a 5-Factor Structure were as follows:
     [D.1] Standardized Root Mean Square of the Residual = 0.02
     [D.2] Tucker-Lewis Fit Index = 0.96
     [D.3] Bayesian Information Criterion = -43.43
     [D.4] High Residual Rate = 0.00
     [D.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [D.5.1] ACAD: Loading = 0.82, Communality = 0.79
            [D.5.2] POTL: Loading = 0.78, Communality = 0.87
            [D.5.3] EXPR: Loading = 0.70, Communality = 0.67
            [D.5.4] Cronbach’s Alpha = 0.89
     [D.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [D.6.1] SCON: Loading = 0.86, Communality = 0.85
            [D.6.2] LIKE: Loading = 0.72, Communality = 0.69
            [D.6.3] APPR: Loading = 0.66, Communality = 0.64
            [D.6.4] Cronbach’s Alpha = 0.87
     [D.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [D.7.1] COMM: Loading = 0.83, Communality = 0.86
            [D.7.2] ORGN: Loading = 0.83, Communality = 0.89
            [D.7.3] Cronbach’s Alpha = 0.92
     [D.8] Factor 4 was a latent variable with higher loading towards the following descriptors:
            [D.8.1] LETT: Loading = 0.89, Communality = 0.89
            [D.8.2] RESM: Loading = 0.82, Communality = 0.87
            [D.8.3] Cronbach’s Alpha = 0.91
     [D.9] Factor 5 was a latent variable with higher loading towards the following descriptors:
            [D.9.1] JFIT: Loading = 0.80, Communality = 0.93
            [D.9.2] CFIT: Loading = 0.74, Communality = 0.86
            [D.9.3] Cronbach’s Alpha = 0.94

Code Chunk | Output

##################################
# Implementing various procedures for determining
# factor retention based on
# the maximum consensus between methods
##################################
(FA_PA_V_MethodAgreementProcedure <- parameters::n_factors(DPA.Descriptors.Numeric,
                                                           algorithm = "pa",
                                                           rotation = "varimax"))

## # Method Agreement Procedure:
## 
## The choice of 4 dimensions is supported by 5 (35.71%) methods out of 14 (Bentler, beta, Parallel analysis, Kaiser criterion, Scree (SE)).

as.data.frame(FA_PA_V_MethodAgreementProcedure)

##    n_Factors              Method              Family
## 1          1 Acceleration factor               Scree
## 2          1          Scree (R2)            Scree_SE
## 3          2 Optimal coordinates               Scree
## 4          3                 CNG                 CNG
## 5          4             Bentler             Bentler
## 6          4                beta Multiple_regression
## 7          4   Parallel analysis               Scree
## 8          4    Kaiser criterion               Scree
## 9          4          Scree (SE)            Scree_SE
## 10         6            Bartlett             Barlett
## 11         6                   t Multiple_regression
## 12         6                   p Multiple_regression
## 13         7            Anderson             Barlett
## 14         7              Lawley             Barlett

##################################
# Conducting exploratory factor analysis
# using Principal Axes extraction
# and Varimax rotation
# with 3 factors
##################################
(FA_PA_V_3F <- fa(DPA.Descriptors.Numeric,
              nfactors = 3,
              fm="pa",
              rotate = "varimax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  pa
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 3, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "varimax", residuals = TRUE, SMC = TRUE, fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       PA1  PA2  PA3   h2   u2 com
## ACAD 0.28 0.74 0.16 0.65 0.35 1.4
## APPR 0.60 0.32 0.18 0.50 0.50 1.7
## COMM 0.79 0.10 0.22 0.69 0.31 1.2
## CFIT 0.60 0.46 0.22 0.61 0.39 2.2
## EXPR 0.05 0.76 0.34 0.69 0.31 1.4
## JFIT 0.52 0.56 0.24 0.64 0.36 2.4
## LETT 0.22 0.20 0.88 0.86 0.14 1.2
## LIKE 0.64 0.31 0.19 0.54 0.46 1.6
## ORGN 0.88 0.09 0.11 0.79 0.21 1.0
## POTL 0.36 0.85 0.14 0.87 0.13 1.4
## RESM 0.28 0.31 0.81 0.83 0.17 1.5
## SCON 0.60 0.37 0.11 0.51 0.49 1.8
## 
##                        PA1  PA2  PA3
## SS loadings           3.49 2.85 1.84
## Proportion Var        0.29 0.24 0.15
## Cumulative Var        0.29 0.53 0.68
## Proportion Explained  0.43 0.35 0.22
## Cumulative Proportion 0.43 0.78 1.00
## 
## Mean item complexity =  1.6
## Test of the hypothesis that 3 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 33  and the objective function was  2.52 
## 
## The root mean square of the residuals (RMSR) is  0.07 
## The df corrected root mean square of the residuals is  0.1 
## 
## The harmonic n.obs is  50 with the empirical chi square  35.54  with prob <  0.35 
## The total n.obs was  50  with Likelihood Chi Square =  106.33  with prob <  1.2e-09 
## 
## Tucker Lewis Index of factoring reliability =  0.619
## RMSEA index =  0.21  and the 90 % confidence intervals are  0.168 0.259
## BIC =  -22.76
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy             
##                                                    PA1  PA2  PA3
## Correlation of (regression) scores with factors   0.94 0.94 0.94
## Multiple R square of scores with factors          0.88 0.88 0.88
## Minimum correlation of possible factor scores     0.76 0.75 0.75

(FA_PA_V_3F_Summary <- FA_PA_V_3F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (varimax-rotation)
## 
## Variable |  PA1 |  PA2 |  PA3 | Complexity | Uniqueness
## -------------------------------------------------------
## ORGN     | 0.88 |      |      |       1.05 |       0.21
## COMM     | 0.79 |      |      |       1.19 |       0.31
## LIKE     | 0.64 |      |      |       1.65 |       0.46
## APPR     | 0.60 |      |      |       1.72 |       0.50
## SCON     | 0.60 |      |      |       1.76 |       0.49
## CFIT     | 0.60 |      |      |       2.19 |       0.39
## POTL     |      | 0.85 |      |       1.41 |       0.13
## EXPR     |      | 0.76 |      |       1.40 |       0.31
## ACAD     |      | 0.74 |      |       1.40 |       0.35
## JFIT     |      | 0.56 |      |       2.35 |       0.36
## LETT     |      |      | 0.88 |       1.23 |       0.14
## RESM     |      |      | 0.81 |       1.54 |       0.17
## 
## The 3 latent factors (varimax rotation) accounted for 68.18% of the total variance of the original data (PA1 = 29.08%, PA2 = 23.78%, PA3 = 15.33%).

summary(FA_PA_V_3F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   PA1 |   PA2 |   PA3
## -------------------------------------------------------
## Eigenvalues                     | 6.069 | 1.209 | 0.903
## Variance Explained              | 0.291 | 0.238 | 0.153
## Variance Explained (Cumulative) | 0.291 | 0.529 | 0.682
## Variance Explained (Proportion) | 0.427 | 0.349 | 0.225

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_PA_V_3F_Residual <- residuals(FA_PA_V_3F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.03    NA                                                      
## COMM  0.07 -0.06    NA                                                
## CFIT -0.04 -0.13  0.02    NA                                          
## EXPR  0.04 -0.02  0.00 -0.06    NA                                    
## JFIT -0.06 -0.12 -0.01  0.27 -0.01    NA                              
## LETT -0.02 -0.05  0.01  0.04  0.02  0.03    NA                        
## LIKE -0.04  0.12 -0.09 -0.08 -0.03 -0.02 -0.01    NA                  
## ORGN  0.03 -0.04  0.13  0.04  0.03  0.00  0.03 -0.08    NA            
## POTL  0.04  0.00  0.02  0.03 -0.01  0.01  0.01 -0.03  0.01    NA      
## RESM  0.02  0.09 -0.01 -0.05 -0.01 -0.05  0.00  0.04 -0.06 -0.02    NA
## SCON -0.06  0.20 -0.10 -0.09  0.05 -0.08 -0.07  0.24 -0.08 -0.05  0.06
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_PA_V_3F_RMS <- FA_PA_V_3F$rms)

## [1] 0.07338105

(FA_PA_V_3F_TLI <- FA_PA_V_3F$TLI)

## [1] 0.6191606

(FA_PA_V_3F_BIC <- FA_PA_V_3F$BIC)

## [1] -22.76274

(FA_PA_V_3F_MaxResidual   <- max(abs(FA_PA_V_3F_Residual),na.rm=TRUE))

## [1] 0.2668687

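# abs(0.05) evaluates to 0.05, so this counts residuals above +0.05
# across both triangles of the symmetric residual matrix
(FA_PA_V_3F_HighResidual  <- sum(FA_PA_V_3F_Residual>abs(0.05),na.rm=TRUE))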

## [1] 18

(FA_PA_V_3F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_PA_V_3F_HighResidualRate <- FA_PA_V_3F_HighResidual/FA_PA_V_3F_TotalResidual)

## [1] 0.2727273

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_PA_V_3F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Varimax Rotation : 3 Factors",
           cex=0.75)

##################################
# Computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM","LIKE","APPR","SCON","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM", "LIKE", 
##     "APPR", "SCON", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.89     0.9      0.57 7.8 0.026  7.1 1.1     0.53
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.82  0.88  0.92
## Duhachek  0.83  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ORGN      0.84      0.86    0.85      0.55 6.0    0.036 0.014  0.49
## COMM      0.85      0.87    0.86      0.56 6.4    0.034 0.012  0.53
## LIKE      0.86      0.86    0.88      0.56 6.4    0.030 0.019  0.52
## APPR      0.86      0.87    0.89      0.57 6.7    0.030 0.019  0.51
## SCON      0.86      0.87    0.87      0.57 6.5    0.029 0.015  0.53
## CFIT      0.87      0.88    0.90      0.59 7.3    0.029 0.020  0.53
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.87  0.84  0.84   0.78  6.9 1.6
## COMM 50  0.84  0.81  0.79   0.74  6.9 1.5
## LIKE 50  0.78  0.81  0.77   0.69  7.4 1.1
## APPR 50  0.75  0.79  0.73   0.67  7.4 1.0
## SCON 50  0.76  0.80  0.76   0.67  7.3 1.2
## CFIT 50  0.78  0.75  0.66   0.64  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.10 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.20 0.30 0.28 0.06 0.02    0
## LIKE 0.00 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0
## APPR 0.00 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0
## SCON 0.00 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0

alpha(DPA.Descriptors.Numeric[,c("POTL","EXPR","ACAD","JFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("POTL", "EXPR", "ACAD", 
##     "JFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.88    0.87      0.66 7.6 0.028  7.3 1.2     0.69
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.81  0.88  0.93
## Duhachek  0.82  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## POTL      0.79      0.80    0.74      0.58 4.1    0.051 0.0063  0.54
## EXPR      0.86      0.86    0.84      0.68 6.4    0.036 0.0169  0.71
## ACAD      0.84      0.84    0.80      0.64 5.4    0.040 0.0111  0.70
## JFIT      0.89      0.89    0.85      0.72 7.8    0.028 0.0043  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## POTL 50  0.93  0.93  0.92   0.87  7.3 1.4
## EXPR 50  0.83  0.84  0.75   0.71  7.3 1.4
## ACAD 50  0.86  0.87  0.83   0.76  7.4 1.3
## JFIT 50  0.82  0.80  0.71   0.65  7.0 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## POTL 0.00 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0
## ACAD 0.00 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0

alpha(DPA.Descriptors.Numeric[,c("LETT","RESM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("LETT", "RESM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_PA_V_3F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "prax",
                                  nfac = 3,
                                  rotation = "varimax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_PA_V_3F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

##################################
# Conducting exploratory factor analysis
# using Principal Axes extraction
# and Varimax rotation
# with 4 factors
##################################
(FA_PA_V_4F <- fa(DPA.Descriptors.Numeric,
              nfactors = 4,
              fm="pa",
              rotate = "varimax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  pa
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 4, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "varimax", residuals = TRUE, SMC = TRUE, fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       PA1   PA2  PA4  PA3   h2   u2 com
## ACAD 0.71  0.18 0.27 0.16 0.64 0.36 1.5
## APPR 0.23  0.27 0.70 0.17 0.65 0.35 1.7
## COMM 0.12  0.76 0.31 0.21 0.73 0.27 1.5
## CFIT 0.52  0.65 0.15 0.19 0.76 0.24 2.2
## EXPR 0.73 -0.03 0.18 0.33 0.68 0.32 1.5
## JFIT 0.61  0.53 0.17 0.21 0.73 0.27 2.4
## LETT 0.23  0.25 0.06 0.88 0.89 0.11 1.3
## LIKE 0.23  0.32 0.69 0.19 0.66 0.34 1.8
## ORGN 0.09  0.84 0.36 0.09 0.86 0.14 1.4
## POTL 0.85  0.26 0.29 0.12 0.89 0.11 1.5
## RESM 0.28  0.13 0.33 0.83 0.88 0.12 1.6
## SCON 0.27  0.21 0.82 0.09 0.79 0.21 1.4
## 
##                        PA1  PA2  PA4  PA3
## SS loadings           2.75 2.39 2.21 1.80
## Proportion Var        0.23 0.20 0.18 0.15
## Cumulative Var        0.23 0.43 0.61 0.76
## Proportion Explained  0.30 0.26 0.24 0.20
## Cumulative Proportion 0.30 0.56 0.80 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 4 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 24  and the objective function was  1.47 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic n.obs is  50 with the empirical chi square  9.84  with prob <  1 
## The total n.obs was  50  with Likelihood Chi Square =  60.87  with prob <  4.8e-05 
## 
## Tucker Lewis Index of factoring reliability =  0.732
## RMSEA index =  0.174  and the 90 % confidence intervals are  0.122 0.233
## BIC =  -33.02
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    PA1  PA2  PA4  PA3
## Correlation of (regression) scores with factors   0.94 0.93 0.91 0.95
## Multiple R square of scores with factors          0.89 0.87 0.82 0.91
## Minimum correlation of possible factor scores     0.78 0.74 0.65 0.82

(FA_PA_V_4F_Summary <- FA_PA_V_4F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (varimax-rotation)
## 
## Variable |  PA1 |  PA2 |  PA4 |  PA3 | Complexity | Uniqueness
## --------------------------------------------------------------
## POTL     | 0.85 |      |      |      |       1.46 |       0.11
## EXPR     | 0.73 |      |      |      |       1.53 |       0.32
## ACAD     | 0.71 |      |      |      |       1.53 |       0.36
## JFIT     | 0.61 |      |      |      |       2.38 |       0.27
## ORGN     |      | 0.84 |      |      |       1.41 |       0.14
## COMM     |      | 0.76 |      |      |       1.55 |       0.27
## CFIT     |      | 0.65 |      |      |       2.22 |       0.24
## SCON     |      |      | 0.82 |      |       1.39 |       0.21
## APPR     |      |      | 0.70 |      |       1.67 |       0.35
## LIKE     |      |      | 0.69 |      |       1.84 |       0.34
## LETT     |      |      |      | 0.88 |       1.31 |       0.11
## RESM     |      |      |      | 0.83 |       1.62 |       0.12
## 
## The 4 latent factors (varimax rotation) accounted for 76.28% of the total variance of the original data (PA1 = 22.93%, PA2 = 19.92%, PA4 = 18.41%, PA3 = 15.03%).

summary(FA_PA_V_4F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   PA1 |   PA2 |   PA4 |   PA3
## ---------------------------------------------------------------
## Eigenvalues                     | 6.152 | 1.259 | 0.943 | 0.800
## Variance Explained              | 0.229 | 0.199 | 0.184 | 0.150
## Variance Explained (Cumulative) | 0.229 | 0.429 | 0.613 | 0.763
## Variance Explained (Proportion) | 0.301 | 0.261 | 0.241 | 0.197
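
The communalities (h2) and the "SS loadings" row in the printout can be checked by hand from the rotated loading matrix. The sketch below re-enters the rounded 4-factor Varimax loadings shown above (so the results agree with the printed values only up to rounding) and assumes an orthogonal rotation, for which the communality of a variable is the sum of its squared loadings. The object names are illustrative.

```r
# Rounded 4-factor Varimax loadings copied from the fa() printout above
loadings_4f <- matrix(
  c(0.71,  0.18, 0.27, 0.16,
    0.23,  0.27, 0.70, 0.17,
    0.12,  0.76, 0.31, 0.21,
    0.52,  0.65, 0.15, 0.19,
    0.73, -0.03, 0.18, 0.33,
    0.61,  0.53, 0.17, 0.21,
    0.23,  0.25, 0.06, 0.88,
    0.23,  0.32, 0.69, 0.19,
    0.09,  0.84, 0.36, 0.09,
    0.85,  0.26, 0.29, 0.12,
    0.28,  0.13, 0.33, 0.83,
    0.27,  0.21, 0.82, 0.09),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("ACAD", "APPR", "COMM", "CFIT", "EXPR", "JFIT",
      "LETT", "LIKE", "ORGN", "POTL", "RESM", "SCON"),
    c("PA1", "PA2", "PA4", "PA3")))

# Communality (h2): row sums of squared loadings; uniqueness (u2) is the remainder
communality <- rowSums(loadings_4f^2)
uniqueness  <- 1 - communality

# SS loadings: column sums of squared loadings;
# dividing by the number of variables gives "Proportion Var"
ss_loadings    <- colSums(loadings_4f^2)
proportion_var <- ss_loadings / nrow(loadings_4f)

round(cbind(h2 = communality, u2 = uniqueness), 2)
round(rbind(ss_loadings, proportion_var), 2)
```

Because the inputs are rounded to two decimals, small differences from the printed h2 and SS loadings are expected.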

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_PA_V_4F_Residual <- residuals(FA_PA_V_4F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.04    NA                                                      
## COMM  0.07  0.01    NA                                                
## CFIT -0.06 -0.02 -0.05    NA                                          
## EXPR  0.05 -0.03  0.01 -0.05    NA                                    
## JFIT -0.07 -0.03 -0.07  0.15 -0.01    NA                              
## LETT -0.03 -0.01 -0.01  0.00  0.02  0.00    NA                        
## LIKE -0.03 -0.02 -0.03  0.02 -0.03  0.07  0.02    NA                  
## ORGN  0.03  0.02  0.08 -0.05  0.04 -0.06  0.01 -0.03    NA            
## POTL  0.05  0.03  0.01 -0.02  0.00 -0.03  0.01 -0.01  0.01    NA      
## RESM  0.02  0.03  0.02  0.01 -0.01 -0.01  0.00 -0.01 -0.03 -0.01    NA
## SCON -0.06 -0.01 -0.03  0.04  0.04  0.03 -0.01  0.05 -0.01 -0.03 -0.01
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_PA_V_4F_RMS <- FA_PA_V_4F$rms)

## [1] 0.03861068

(FA_PA_V_4F_TLI <- FA_PA_V_4F$TLI)

## [1] 0.7317578

(FA_PA_V_4F_BIC <- FA_PA_V_4F$BIC)

## [1] -33.01886
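
The model degrees of freedom and the BIC reported for the 4- and 5-factor solutions can be reproduced from numbers already in the printouts. The sketch below assumes the usual exploratory factor analysis degrees-of-freedom count and that BIC is the likelihood chi-square minus df times log(n), which is consistent with the printed values. The helper names `efa_df` and `efa_bic` are illustrative and are not functions from psych.

```r
n_obs <- 50   # observations in the dataset
p     <- 12   # observed variables

# Degrees of freedom of an EFA model with k factors on p variables
efa_df <- function(p, k) p * (p - 1) / 2 - p * k + k * (k - 1) / 2

# BIC as chi-square penalised by df * log(n) (assumed definition)
efa_bic <- function(chisq, df, n) chisq - df * log(n)

df_4f  <- efa_df(p, 4)                   # 24, as printed for 4 factors
df_5f  <- efa_df(p, 5)                   # 16, as printed for 5 factors
bic_4f <- efa_bic(60.87, df_4f, n_obs)   # close to the printed -33.02
bic_5f <- efa_bic(19.16, df_5f, n_obs)   # close to the printed -43.43

c(df_4f = df_4f, bic_4f = bic_4f, df_5f = df_5f, bic_5f = bic_5f)
```

A more negative BIC favours the 5-factor model here, in line with the fitted objects.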

(FA_PA_V_4F_MaxResidual   <- max(abs(FA_PA_V_4F_Residual),na.rm=TRUE))

## [1] 0.1504566

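# Note: abs() wraps the threshold rather than the residuals, so this counts
# only positive residuals above 0.05, across both triangles of the symmetric matrix
(FA_PA_V_4F_HighResidual  <- sum(FA_PA_V_4F_Residual>abs(0.05),na.rm=TRUE))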

## [1] 10

(FA_PA_V_4F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_PA_V_4F_HighResidualRate <- FA_PA_V_4F_HighResidual/FA_PA_V_4F_TotalResidual)

## [1] 0.1515152
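
The count above compares signed residuals with 0.05 over the full symmetric residual matrix, while the denominator counts each variable pair once. An alternative is to count absolute residuals over the unique pairs only. The sketch below shows both counts on a small hypothetical residual matrix (illustrative values, not the JobHiring residuals); all object names are assumptions.

```r
# Hypothetical 4 x 4 symmetric residual matrix with NA on the diagonal
res <- matrix(c(   NA,  0.07, -0.06,  0.02,
                 0.07,    NA,  0.01, -0.08,
                -0.06,  0.01,    NA,  0.03,
                 0.02, -0.08,  0.03,    NA),
              nrow = 4, byrow = TRUE)

# Unique variable pairs: lower triangle only
lower <- res[lower.tri(res)]

# Residuals large in absolute value, counted once per pair
high_abs      <- sum(abs(lower) > 0.05)
high_abs_rate <- high_abs / length(lower)

# Signed comparison over the full matrix, for contrast:
# negative residuals are ignored and positive ones are counted twice
high_signed_full <- sum(res > 0.05, na.rm = TRUE)

c(high_abs = high_abs, high_abs_rate = high_abs_rate,
  high_signed_full = high_signed_full)
```

In this toy matrix, three of the six pairs exceed 0.05 in absolute value, whereas the signed full-matrix count is two.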

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_PA_V_4F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Varimax Rotation : 4 Factors",
           cex=0.75)

##################################
# computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("POTL","EXPR","ACAD","JFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("POTL", "EXPR", "ACAD", 
##     "JFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.88    0.87      0.66 7.6 0.028  7.3 1.2     0.69
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.81  0.88  0.93
## Duhachek  0.82  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## POTL      0.79      0.80    0.74      0.58 4.1    0.051 0.0063  0.54
## EXPR      0.86      0.86    0.84      0.68 6.4    0.036 0.0169  0.71
## ACAD      0.84      0.84    0.80      0.64 5.4    0.040 0.0111  0.70
## JFIT      0.89      0.89    0.85      0.72 7.8    0.028 0.0043  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## POTL 50  0.93  0.93  0.92   0.87  7.3 1.4
## EXPR 50  0.83  0.84  0.75   0.71  7.3 1.4
## ACAD 50  0.86  0.87  0.83   0.76  7.4 1.3
## JFIT 50  0.82  0.80  0.71   0.65  7.0 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## POTL 0.00 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0
## ACAD 0.00 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0

alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.87      0.87    0.85      0.69 6.7 0.034  6.9 1.4     0.62
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.79  0.87  0.92
## Duhachek  0.80  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r  S/N alpha se var.r med.r
## ORGN      0.74      0.74    0.59      0.59  2.9    0.073    NA  0.59
## COMM      0.77      0.77    0.62      0.62  3.3    0.066    NA  0.62
## CFIT      0.92      0.93    0.86      0.86 12.5    0.021    NA  0.86
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.93  0.93  0.91   0.83  6.9 1.6
## COMM 50  0.91  0.92  0.89   0.80  6.9 1.5
## CFIT 50  0.84  0.83  0.65   0.63  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.10 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.20 0.30 0.28 0.06 0.02    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0

alpha(DPA.Descriptors.Numeric[,c("SCON","APPR","LIKE")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("SCON", "APPR", "LIKE")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.83       0.7 6.9 0.031  7.4 0.99      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.80  0.87  0.92
## Duhachek  0.81  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## SCON      0.78      0.78    0.64      0.64 3.5    0.062    NA  0.64
## APPR      0.86      0.86    0.75      0.75 6.1    0.040    NA  0.75
## LIKE      0.82      0.82    0.70      0.70 4.6    0.051    NA  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## SCON 50  0.92  0.91  0.86   0.80  7.3 1.2
## APPR 50  0.86  0.87  0.76   0.71  7.4 1.0
## LIKE 50  0.90  0.89  0.82   0.76  7.4 1.1
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## SCON 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## APPR 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0
## LIKE 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0

alpha(DPA.Descriptors.Numeric[,c("LETT","RESM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("LETT", "RESM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0
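
The raw_alpha values reported above follow the standard Cronbach's alpha formula, which compares the sum of the item variances with the variance of the summed score. The sketch below is a minimal base R version applied to a small hypothetical set of ratings. `cronbach_alpha` and the `ITEM*` columns are illustrative names, and the function omits the standardized alpha, confidence bounds and item-dropped statistics that `alpha()` also reports.

```r
# Raw Cronbach's alpha: k/(k-1) * (1 - sum(item variances) / variance of total score)
cronbach_alpha <- function(items) {
  items     <- as.matrix(items)
  k         <- ncol(items)
  item_var  <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Hypothetical ratings for three items (not taken from the JobHiring data)
ratings <- data.frame(ITEM1 = c(7, 8, 6, 9, 5, 7, 8, 6),
                      ITEM2 = c(7, 9, 6, 8, 5, 6, 8, 7),
                      ITEM3 = c(6, 8, 7, 9, 4, 7, 9, 6))

alpha_raw <- cronbach_alpha(ratings)
round(alpha_raw, 2)   # approximately 0.93
```

Items that move together inflate the variance of the total score relative to the item variances, which pushes alpha toward 1.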

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_PA_V_4F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "prax",
                                  nfac = 4,
                                  rotation = "varimax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_PA_V_4F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

##################################
# Conducting exploratory factor analysis
# using Principal Axes extraction
# and Varimax rotation
# with 5 factors
##################################
(FA_PA_V_5F <- fa(DPA.Descriptors.Numeric,
              nfactors = 5,
              fm="pa",
              rotate = "varimax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  pa
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 5, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "varimax", residuals = TRUE, SMC = TRUE, fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       PA5  PA4   PA2  PA3  PA1   h2    u2 com
## ACAD 0.82 0.21  0.21 0.14 0.14 0.79 0.208 1.4
## APPR 0.27 0.66  0.32 0.18 0.05 0.64 0.355 2.0
## COMM 0.14 0.25  0.83 0.21 0.19 0.86 0.143 1.5
## CFIT 0.29 0.21  0.38 0.20 0.74 0.86 0.144 2.2
## EXPR 0.70 0.17 -0.06 0.33 0.19 0.67 0.331 1.8
## JFIT 0.37 0.24  0.23 0.21 0.80 0.93 0.065 2.0
## LETT 0.17 0.07  0.18 0.89 0.21 0.89 0.105 1.3
## LIKE 0.16 0.72  0.25 0.19 0.24 0.69 0.306 1.8
## ORGN 0.09 0.32  0.83 0.08 0.26 0.89 0.115 1.6
## POTL 0.78 0.28  0.18 0.13 0.36 0.87 0.132 1.9
## RESM 0.28 0.31  0.13 0.82 0.10 0.87 0.127 1.6
## SCON 0.22 0.86  0.16 0.09 0.17 0.85 0.153 1.3
## 
##                        PA5  PA4  PA2  PA3  PA1
## SS loadings           2.26 2.21 1.90 1.82 1.63
## Proportion Var        0.19 0.18 0.16 0.15 0.14
## Cumulative Var        0.19 0.37 0.53 0.68 0.82
## Proportion Explained  0.23 0.23 0.19 0.19 0.17
## Cumulative Proportion 0.23 0.46 0.65 0.83 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 5 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 16  and the objective function was  0.47 
## 
## The root mean square of the residuals (RMSR) is  0.01 
## The df corrected root mean square of the residuals is  0.03 
## 
## The harmonic n.obs is  50 with the empirical chi square  1.45  with prob <  1 
## The total n.obs was  50  with Likelihood Chi Square =  19.16  with prob <  0.26 
## 
## Tucker Lewis Index of factoring reliability =  0.965
## RMSEA index =  0.06  and the 90 % confidence intervals are  0 0.153
## BIC =  -43.43
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                    PA5  PA4  PA2  PA3  PA1
## Correlation of (regression) scores with factors   0.92 0.92 0.94 0.95 0.94
## Multiple R square of scores with factors          0.85 0.85 0.88 0.91 0.88
## Minimum correlation of possible factor scores     0.71 0.70 0.77 0.82 0.76

(FA_PA_V_5F_Summary <- FA_PA_V_5F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (varimax-rotation)
## 
## Variable |  PA5 |  PA4 |  PA2 |  PA3 |  PA1 | Complexity | Uniqueness
## ---------------------------------------------------------------------
## ACAD     | 0.82 |      |      |      |      |       1.41 |       0.21
## POTL     | 0.78 |      |      |      |      |       1.92 |       0.13
## EXPR     | 0.70 |      |      |      |      |       1.77 |       0.33
## SCON     |      | 0.86 |      |      |      |       1.32 |       0.15
## LIKE     |      | 0.72 |      |      |      |       1.75 |       0.31
## APPR     |      | 0.66 |      |      |      |       1.99 |       0.36
## COMM     |      |      | 0.83 |      |      |       1.49 |       0.14
## ORGN     |      |      | 0.83 |      |      |       1.57 |       0.11
## LETT     |      |      |      | 0.89 |      |       1.29 |       0.11
## RESM     |      |      |      | 0.82 |      |       1.63 |       0.13
## JFIT     |      |      |      |      | 0.80 |       1.97 |       0.07
## CFIT     |      |      |      |      | 0.74 |       2.24 |       0.14
## 
## The 5 latent factors (varimax rotation) accounted for 81.80% of the total variance of the original data (PA5 = 18.81%, PA4 = 18.44%, PA2 = 15.86%, PA3 = 15.15%, PA1 = 13.55%).

summary(FA_PA_V_5F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   PA5 |   PA4 |   PA2 |   PA3 |   PA1
## -----------------------------------------------------------------------
## Eigenvalues                     | 6.211 | 1.294 | 0.967 | 0.867 | 0.477
## Variance Explained              | 0.188 | 0.184 | 0.159 | 0.152 | 0.135
## Variance Explained (Cumulative) | 0.188 | 0.372 | 0.531 | 0.683 | 0.818
## Variance Explained (Proportion) | 0.230 | 0.225 | 0.194 | 0.185 | 0.166

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_PA_V_5F_Residual <- residuals(FA_PA_V_5F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.01    NA                                                      
## COMM  0.01 -0.02    NA                                                
## CFIT  0.01  0.00  0.00    NA                                          
## EXPR  0.00 -0.04  0.00 -0.03    NA                                    
## JFIT  0.00  0.00  0.00  0.00  0.01    NA                              
## LETT -0.02 -0.01 -0.01  0.00  0.02 -0.01    NA                        
## LIKE  0.01  0.00  0.00 -0.02 -0.02  0.02  0.02    NA                  
## ORGN -0.01  0.01  0.00  0.00  0.02  0.00  0.02  0.00    NA            
## POTL -0.01  0.02 -0.01  0.01  0.01 -0.01  0.01  0.01  0.00    NA      
## RESM  0.01  0.03  0.01  0.02 -0.02  0.00  0.00 -0.01 -0.03 -0.01    NA
## SCON -0.02 -0.01  0.01  0.01  0.05 -0.01 -0.01  0.01  0.01 -0.02 -0.01
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_PA_V_5F_RMS <- FA_PA_V_5F$rms)

## [1] 0.01483907

(FA_PA_V_5F_TLI <- FA_PA_V_5F$TLI)

## [1] 0.9648451

(FA_PA_V_5F_BIC <- FA_PA_V_5F$BIC)

## [1] -43.43181

(FA_PA_V_5F_MaxResidual   <- max(abs(FA_PA_V_5F_Residual),na.rm=TRUE))

## [1] 0.04835397

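# Note: abs() wraps the threshold rather than the residuals, so this counts
# only positive residuals above 0.05, across both triangles of the symmetric matrix
(FA_PA_V_5F_HighResidual  <- sum(FA_PA_V_5F_Residual>abs(0.05),na.rm=TRUE))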

## [1] 0

(FA_PA_V_5F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_PA_V_5F_HighResidualRate <- FA_PA_V_5F_HighResidual/FA_PA_V_5F_TotalResidual)

## [1] 0

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_PA_V_5F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Varimax Rotation : 5 Factors",
           cex=0.75)

##################################
# computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("ACAD","POTL","EXPR")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ACAD", "POTL", "EXPR")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.89      0.89    0.85      0.72 7.8 0.028  7.3 1.2      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.82  0.89  0.93
## Duhachek  0.83  0.89  0.94
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ACAD      0.83      0.83    0.70      0.70 4.7    0.049    NA  0.70
## POTL      0.80      0.80    0.67      0.67 4.0    0.056    NA  0.67
## EXPR      0.88      0.89    0.80      0.80 7.8    0.032    NA  0.80
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ACAD 50  0.91  0.91  0.85   0.79  7.4 1.3
## POTL 50  0.92  0.92  0.88   0.82  7.3 1.4
## EXPR 50  0.88  0.88  0.76   0.72  7.3 1.4
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## ACAD 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## POTL 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0

alpha(DPA.Descriptors.Numeric[,c("SCON","LIKE","APPR")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("SCON", "LIKE", "APPR")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.83       0.7 6.9 0.031  7.4 0.99      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.80  0.87  0.92
## Duhachek  0.81  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## SCON      0.78      0.78    0.64      0.64 3.5    0.062    NA  0.64
## LIKE      0.82      0.82    0.70      0.70 4.6    0.051    NA  0.70
## APPR      0.86      0.86    0.75      0.75 6.1    0.040    NA  0.75
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## SCON 50  0.92  0.91  0.86   0.80  7.3 1.2
## LIKE 50  0.90  0.89  0.82   0.76  7.4 1.1
## APPR 50  0.86  0.87  0.76   0.71  7.4 1.0
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## SCON 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## LIKE 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0
## APPR 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0

alpha(DPA.Descriptors.Numeric[,c("COMM","ORGN")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("COMM", "ORGN")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.92      0.93    0.86      0.86  12 0.021  6.9 1.5     0.86
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.87  0.92  0.96
## Duhachek  0.88  0.92  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## COMM      0.80      0.86    0.74      0.86 6.2       NA     0  0.86
## ORGN      0.92      0.86    0.74      0.86 6.2       NA     0  0.86
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## COMM 50  0.96  0.96   0.9   0.86  6.9 1.5
## ORGN 50  0.97  0.96   0.9   0.86  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5   6    7    8    9   10 miss
## COMM 0.04 0.04 0.06 0.2 0.30 0.28 0.06 0.02    0
## ORGN 0.02 0.04 0.20 0.1 0.24 0.24 0.16 0.00    0

alpha(DPA.Descriptors.Numeric[,c("LETT","RESM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("LETT", "RESM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0

alpha(DPA.Descriptors.Numeric[,c("JFIT","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("JFIT", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.94      0.94    0.88      0.88  15 0.018    7 1.6     0.88
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.89  0.94  0.96
## Duhachek  0.90  0.94  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## JFIT      0.87      0.88    0.78      0.88 7.5       NA     0  0.88
## CFIT      0.90      0.88    0.78      0.88 7.5       NA     0  0.88
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## JFIT 50  0.97  0.97  0.91   0.88  7.0 1.6
## CFIT 50  0.97  0.97  0.91   0.88  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_PA_V_5F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "prax",
                                  nfac = 5,
                                  rotation = "varimax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_PA_V_5F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

par(mfrow=c(1,3))
fa.diagram(FA_PA_V_3F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Varimax Rotation : 3 Factors",
           cex=0.75)
fa.diagram(FA_PA_V_4F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Varimax Rotation : 4 Factors",
           cex=0.75)
fa.diagram(FA_PA_V_5F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Varimax Rotation : 5 Factors",
           cex=0.75)

1.6.2 Principal Axes Factor Extraction and Promax Rotation (FA_PA_P)

Principal Axes Factor Extraction identifies the underlying constructs that explain the observed correlations among variables by separating common variance (shared among variables) from unique variance (specific to each variable) and modeling only the common part. Compared with component-based extraction, this process can result in lower communalities (explained variance) but in factors that are more directly interpretable as latent constructs. The algorithm performs eigenvalue decomposition on the reduced correlation matrix, in which the unit diagonal is replaced by communality estimates (squared multiple correlations as starting values when SMC=TRUE), and typically repeats the decomposition with updated communalities until they stabilize. Each eigenvalue represents the amount of variance explained along its eigenvector. Given a defined number of factors, loadings are calculated for each observed variable on each extracted factor. Factor loadings indicate the strength and direction of the relationship between variables and factors. Factors are interpreted based on the loading patterns: variables with high loadings on a factor are strongly associated with that factor.
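
The extraction step can be sketched in base R as an iterated eigenvalue decomposition of a reduced correlation matrix. This is a minimal illustration under stated assumptions, not the psych implementation: `paf` is a hypothetical helper, the starting communalities are squared multiple correlations, and the input is a constructed population correlation matrix with a known 2-factor structure rather than the JobHiring data.

```r
# Iterated principal axis factoring on a correlation matrix R (illustrative helper)
paf <- function(R, nfactors, max_iter = 500, tol = 1e-10) {
  h2 <- 1 - 1 / diag(solve(R))            # starting communalities: SMC
  for (iter in seq_len(max_iter)) {
    R_reduced <- R
    diag(R_reduced) <- h2                 # replace unit diagonal by communalities
    eig  <- eigen(R_reduced, symmetric = TRUE)
    idx  <- seq_len(nfactors)
    vals <- pmax(eig$values[idx], 0)      # guard against small negative eigenvalues
    L    <- eig$vectors[, idx, drop = FALSE] %*% diag(sqrt(vals), nrow = nfactors)
    h2_new    <- rowSums(L^2)             # updated communalities
    converged <- max(abs(h2_new - h2)) < tol
    h2 <- h2_new
    if (converged) break
  }
  rownames(L) <- rownames(R)
  list(loadings = L, communality = h2, iterations = iter)
}

# Known 2-factor structure: three indicators per factor, no cross-loadings
Lambda <- cbind(c(0.8, 0.7, 0.6, 0, 0, 0),
                c(0, 0, 0, 0.9, 0.8, 0.7))
R <- Lambda %*% t(Lambda)
diag(R) <- 1
dimnames(R) <- list(paste0("V", 1:6), paste0("V", 1:6))

fit <- paf(R, nfactors = 2)
round(cbind(estimated = fit$communality, true = rowSums(Lambda^2)), 3)
```

The signs and ordering of the extracted loadings are arbitrary before rotation, so the comparison is made on communalities, which do not depend on either.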

Promax Rotation is an oblique rotation method that accommodates the possibility of correlated factors. The algorithm starts from an orthogonal Varimax solution, raises its loadings to a power (commonly 4) to build a target matrix in which small loadings shrink toward zero much faster than large ones, and then finds the oblique transformation that best matches this target in the least-squares sense. Because the rotated factors are allowed to correlate, pattern loadings behave like regression weights and can exceed 1 in absolute value, as seen for ORGN (1.04) in the 3-factor solution below. The results provide a more accurate representation of the underlying relationships when the factors are expected to be correlated.
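
The difference between the orthogonal and oblique rotations can be illustrated with the `varimax()` and `promax()` functions in base R's stats package. The sketch below applies both to a small hypothetical unrotated loading matrix (illustrative values, not the JobHiring solution) and derives the factor correlation matrix implied by the Promax transformation. Object names are assumptions, and results from `psych::fa` may differ in normalization details.

```r
# Hypothetical unrotated loadings for six variables on two factors
unrotated <- matrix(c(0.70,  0.40,
                      0.75,  0.35,
                      0.65,  0.45,
                      0.60, -0.45,
                      0.70, -0.40,
                      0.65, -0.50),
                    ncol = 2, byrow = TRUE,
                    dimnames = list(paste0("V", 1:6), c("F1", "F2")))

orthogonal <- varimax(unrotated)        # orthogonal rotation
oblique    <- promax(unrotated, m = 4)  # oblique rotation, power m = 4

# Factor correlations implied by the oblique transformation matrix
U   <- oblique$rotmat
Phi <- solve(t(U) %*% U)

# Communalities are unchanged by rotation once factor correlations are included
pattern      <- unclass(oblique$loadings)
h2_unrotated <- rowSums(unrotated^2)
h2_oblique   <- diag(pattern %*% Phi %*% t(pattern))

round(pattern, 2)
round(Phi, 2)
round(cbind(h2_unrotated, h2_oblique), 3)
```

With correlated factors, the communality is no longer the plain row sum of squared pattern loadings, which is why an individual pattern loading can exceed 1 while the communality stays below 1.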

[A] With Principal Axes factor extraction and Promax rotation applied, a set of empirical guidelines was used to determine the optimal number of factors to retain for exploratory factor analysis. It was determined that:
     [A.1] 4 factors would be sufficient for an optimal balance between comprehensiveness and parsimony.
     [A.2] To ensure that both under-extraction and over-extraction are assessed, models with 3, 4 and 5 factors were sequentially evaluated for their interpretability and theoretical meaningfulness.
     [A.3] The choice of 4 factors was supported by the maximum consensus among methods: 5 of 14 (35.71%), namely Bentler, Beta, Parallel Analysis, Kaiser Criterion and Standardized Scree.

[B] Results for the exploratory factor analysis using a 3-Factor Structure were as follows:
     [B.1] Standardized Root Mean Square of the Residual = 0.07
     [B.2] Tucker-Lewis Fit Index = 0.62
     [B.3] Bayesian Information Criterion = -22.76
     [B.4] High Residual Rate = 0.27
     [B.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [B.5.1] ORGN: Loading = 1.04, Communality = 0.79
            [B.5.2] COMM: Loading = 0.91, Communality = 0.69
            [B.5.3] LIKE: Loading = 0.64, Communality = 0.54
            [B.5.4] APPR: Loading = 0.59, Communality = 0.50
            [B.5.5] SCON: Loading = 0.57, Communality = 0.51
            [B.5.6] CFIT: Loading = 0.52, Communality = 0.61
            [B.5.7] Cronbach’s Alpha = 0.88
     [B.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [B.6.1] POTL: Loading = 0.95, Communality = 0.87
            [B.6.2] EXPR: Loading = 0.88, Communality = 0.69
            [B.6.3] ACAD: Loading = 0.82, Communality = 0.65
            [B.6.4] JFIT: Loading = 0.48, Communality = 0.64
            [B.6.5] Cronbach’s Alpha = 0.88
     [B.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [B.7.1] LETT: Loading = 0.94, Communality = 0.86
            [B.7.2] RESM: Loading = 0.82, Communality = 0.83
            [B.7.3] Cronbach’s Alpha = 0.91
     [B.8] Correlation between factors ranged from 0.48 to 0.64

[C] Results for the exploratory factor analysis using a 4-Factor Structure were as follows:
     [C.1] Standardized Root Mean Square of the Residual = 0.04
     [C.2] Tucker-Lewis Fit Index = 0.73
     [C.3] Bayesian Information Criterion = -33.02
     [C.4] High Residual Rate = 0.15
     [C.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [C.5.1] POTL: Loading = 0.94, Communality = 0.89
            [C.5.2] EXPR: Loading = 0.81, Communality = 0.68
            [C.5.3] ACAD: Loading = 0.77, Communality = 0.64
            [C.5.4] JFIT: Loading = 0.57, Communality = 0.73
            [C.5.5] Cronbach’s Alpha = 0.88
     [C.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [C.6.1] ORGN: Loading = 0.92, Communality = 0.86
            [C.6.2] COMM: Loading = 0.82, Communality = 0.73
            [C.6.3] CFIT: Loading = 0.64, Communality = 0.76
            [C.6.4] Cronbach’s Alpha = 0.87
     [C.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [C.7.1] SCON: Loading = 0.87, Communality = 0.79
            [C.7.2] APPR: Loading = 0.72, Communality = 0.65
            [C.7.3] LIKE: Loading = 0.68, Communality = 0.66
            [C.7.4] Cronbach’s Alpha = 0.87
     [C.8] Factor 4 was a latent variable with higher loading towards the following descriptors:
            [C.8.1] LETT: Loading = 0.95, Communality = 0.89
            [C.8.2] RESM: Loading = 0.85, Communality = 0.88
            [C.8.3] Cronbach’s Alpha = 0.91
     [C.9] Correlation between factors ranged from 0.41 to 0.56

[D] Results for the exploratory factor analysis using a 5-Factor Structure were as follows:
     [D.1] Standardized Root Mean Square of the Residual = 0.01
     [D.2] Tucker-Lewis Fit Index = 0.96
     [D.3] Bayesian Information Criterion = -43.43
     [D.4] High Residual Rate = 0.00
     [D.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [D.5.1] ACAD: Loading = 0.95, Communality = 0.79
            [D.5.2] POTL: Loading = 0.80, Communality = 0.87
            [D.5.3] EXPR: Loading = 0.73, Communality = 0.67
            [D.5.4] Cronbach’s Alpha = 0.89
     [D.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [D.6.1] SCON: Loading = 1.00, Communality = 0.85
            [D.6.2] LIKE: Loading = 0.78, Communality = 0.69
            [D.6.3] APPR: Loading = 0.67, Communality = 0.64
            [D.6.4] Cronbach’s Alpha = 0.87
     [D.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [D.7.1] COMM: Loading = 0.91, Communality = 0.86
            [D.7.2] ORGN: Loading = 0.87, Communality = 0.89
            [D.7.3] Cronbach’s Alpha = 0.92
     [D.8] Factor 4 was a latent variable with higher loading towards the following descriptors:
            [D.8.1] LETT: Loading = 0.96, Communality = 0.89
            [D.8.2] RESM: Loading = 0.85, Communality = 0.87
            [D.8.3] Cronbach’s Alpha = 0.91
     [D.9] Factor 5 was a latent variable with higher loading towards the following descriptors:
            [D.9.1] JFIT: Loading = 0.91, Communality = 0.93
            [D.9.2] CFIT: Loading = 0.81, Communality = 0.86
            [D.9.3] Cronbach’s Alpha = 0.94
     [D.10] Correlation between factors ranged from 0.35 to 0.61

Code Chunk | Output

##################################
# Implementing various procedures for determining
# factor retention based on
# the maximum consensus between methods
##################################
(FA_PA_P_MethodAgreementProcedure <- parameters::n_factors(DPA.Descriptors.Numeric,
                                                           algorithm = "pa",
                                                           rotation = "promax"))

## # Method Agreement Procedure:
## 
## The choice of 4 dimensions is supported by 5 (35.71%) methods out of 14 (Bentler, beta, Parallel analysis, Kaiser criterion, Scree (SE)).

as.data.frame(FA_PA_P_MethodAgreementProcedure)

##    n_Factors              Method              Family
## 1          1 Acceleration factor               Scree
## 2          1          Scree (R2)            Scree_SE
## 3          2 Optimal coordinates               Scree
## 4          3                 CNG                 CNG
## 5          4             Bentler             Bentler
## 6          4                beta Multiple_regression
## 7          4   Parallel analysis               Scree
## 8          4    Kaiser criterion               Scree
## 9          4          Scree (SE)            Scree_SE
## 10         6            Bartlett             Barlett
## 11         6                   t Multiple_regression
## 12         6                   p Multiple_regression
## 13         7            Anderson             Barlett
## 14         7              Lawley             Barlett
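
The consensus reported above can be reproduced by tallying the suggested factor counts. The following is a minimal base R sketch, assuming the n_Factors column is transcribed by hand from the table above; the object names are illustrative and not part of the original analysis.

```r
# Suggested number of factors per method, transcribed from the table above
suggested_factors <- c(1, 1, 2, 3, 4, 4, 4, 4, 4, 6, 6, 6, 7, 7)

# Tally how many methods support each candidate number of factors
votes <- table(suggested_factors)

# The consensus is the candidate with the most votes
consensus <- as.integer(names(votes)[which.max(votes)])

# Share of methods supporting the consensus, in percent
support_pct <- 100 * max(votes) / length(suggested_factors)

cat("Consensus:", consensus, "factors, supported by",
    round(support_pct, 2), "% of methods\n")
```

With 5 of 14 methods in agreement, the 4-factor choice is a plurality rather than a majority, which may be one reason to also inspect the neighbouring 3-factor and 5-factor structures.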

##################################
# Conducting exploratory factor analysis
# using Principal Axes extraction
# and Promax rotation
# with 3 factors
##################################
(FA_PA_P_3F <- fa(DPA.Descriptors.Numeric,
              nfactors = 3,
              fm="pa",
              rotate = "promax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  pa
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 3, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "promax", residuals = TRUE, SMC = TRUE, fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        PA1   PA2   PA3   h2   u2 com
## ACAD  0.02  0.82 -0.06 0.65 0.35 1.0
## APPR  0.59  0.15  0.02 0.50 0.50 1.1
## COMM  0.91 -0.23  0.10 0.69 0.31 1.1
## CFIT  0.52  0.32  0.04 0.61 0.39 1.7
## EXPR -0.31  0.88  0.19 0.69 0.31 1.3
## JFIT  0.37  0.48  0.04 0.64 0.36 1.9
## LETT  0.03 -0.05  0.94 0.86 0.14 1.0
## LIKE  0.64  0.12  0.03 0.54 0.46 1.1
## ORGN  1.04 -0.24 -0.05 0.79 0.21 1.1
## POTL  0.08  0.95 -0.13 0.87 0.13 1.1
## RESM  0.07  0.09  0.82 0.83 0.17 1.0
## SCON  0.57  0.24 -0.07 0.51 0.49 1.4
## 
##                        PA1  PA2  PA3
## SS loadings           3.56 2.92 1.70
## Proportion Var        0.30 0.24 0.14
## Cumulative Var        0.30 0.54 0.68
## Proportion Explained  0.44 0.36 0.21
## Cumulative Proportion 0.44 0.79 1.00
## 
##  With factor correlations of 
##      PA1  PA2  PA3
## PA1 1.00 0.64 0.48
## PA2 0.64 1.00 0.55
## PA3 0.48 0.55 1.00
## 
## Mean item complexity =  1.2
## Test of the hypothesis that 3 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 33  and the objective function was  2.52 
## 
## The root mean square of the residuals (RMSR) is  0.07 
## The df corrected root mean square of the residuals is  0.1 
## 
## The harmonic n.obs is  50 with the empirical chi square  35.54  with prob <  0.35 
## The total n.obs was  50  with Likelihood Chi Square =  106.33  with prob <  1.2e-09 
## 
## Tucker Lewis Index of factoring reliability =  0.619
## RMSEA index =  0.21  and the 90 % confidence intervals are  0.168 0.259
## BIC =  -22.76
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy             
##                                                    PA1  PA2  PA3
## Correlation of (regression) scores with factors   0.96 0.97 0.96
## Multiple R square of scores with factors          0.93 0.93 0.91
## Minimum correlation of possible factor scores     0.85 0.86 0.83
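
Because Promax is an oblique rotation, the communality (h2) of a variable is not simply the sum of its squared pattern loadings; the factor correlations enter as well. The following is a minimal sketch, assuming the standard identity h2 = diag(L Phi L') and using the rounded values printed above for three of the descriptors. The object names are illustrative.

```r
# Pattern loadings (PA1, PA2, PA3) for three variables, copied from the output above
pattern_loadings <- matrix(c(0.02,  0.82, -0.06,   # ACAD
                             0.03, -0.05,  0.94,   # LETT
                             0.08,  0.95, -0.13),  # POTL
                           nrow = 3, byrow = TRUE,
                           dimnames = list(c("ACAD", "LETT", "POTL"),
                                           c("PA1", "PA2", "PA3")))

# Factor correlation matrix (Phi) from the same output
factor_correlations <- matrix(c(1.00, 0.64, 0.48,
                                0.64, 1.00, 0.55,
                                0.48, 0.55, 1.00),
                              nrow = 3, byrow = TRUE)

# Communality under an oblique rotation: diagonal of L %*% Phi %*% t(L)
communality <- diag(pattern_loadings %*% factor_correlations %*% t(pattern_loadings))

# Uniqueness is the remaining variance of each standardized variable
uniqueness <- 1 - communality

print(round(cbind(h2 = communality, u2 = uniqueness), 2))
```

The results should agree with the printed h2 and u2 columns to within about 0.01; the small differences come from the two-decimal rounding of the inputs.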

(FA_PA_P_3F_Summary <- FA_PA_P_3F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (promax-rotation)
## 
## Variable |  PA1 |  PA2 |  PA3 | Complexity | Uniqueness
## -------------------------------------------------------
## ORGN     | 1.04 |      |      |       1.11 |       0.21
## COMM     | 0.91 |      |      |       1.15 |       0.31
## LIKE     | 0.64 |      |      |       1.07 |       0.46
## APPR     | 0.59 |      |      |       1.13 |       0.50
## SCON     | 0.57 |      |      |       1.37 |       0.49
## CFIT     | 0.52 |      |      |       1.67 |       0.39
## POTL     |      | 0.95 |      |       1.05 |       0.13
## EXPR     |      | 0.88 |      |       1.34 |       0.31
## ACAD     |      | 0.82 |      |       1.01 |       0.35
## JFIT     |      | 0.48 |      |       1.91 |       0.36
## LETT     |      |      | 0.94 |       1.01 |       0.14
## RESM     |      |      | 0.82 |       1.04 |       0.17
## 
## The 3 latent factors (promax rotation) accounted for 68.18% of the total variance of the original data (PA1 = 29.69%, PA2 = 24.31%, PA3 = 14.19%).

summary(FA_PA_P_3F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   PA1 |   PA2 |   PA3
## -------------------------------------------------------
## Eigenvalues                     | 6.069 | 1.209 | 0.903
## Variance Explained              | 0.297 | 0.243 | 0.142
## Variance Explained (Cumulative) | 0.297 | 0.540 | 0.682
## Variance Explained (Proportion) | 0.435 | 0.357 | 0.208
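
The variance rows above follow directly from the sums of squared loadings. The following is a minimal sketch, assuming the rounded SS loadings printed earlier for the 3-factor solution and the 12 standardized descriptors. The object names are illustrative.

```r
# SS loadings for the 3-factor solution, copied from the fa() output
ss_loadings <- c(PA1 = 3.56, PA2 = 2.92, PA3 = 1.70)

# Each standardized variable contributes one unit of variance
n_variables <- 12

# Share of total variance accounted for by each factor
variance_explained <- ss_loadings / n_variables

# Running total across factors
variance_cumulative <- cumsum(variance_explained)

# Share of the explained (common) variance attributed to each factor
variance_proportion <- ss_loadings / sum(ss_loadings)

print(round(rbind(variance_explained, variance_cumulative, variance_proportion), 3))
```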

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_PA_P_3F_Residual <- residuals(FA_PA_P_3F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.03    NA                                                      
## COMM  0.07 -0.06    NA                                                
## CFIT -0.04 -0.13  0.02    NA                                          
## EXPR  0.04 -0.02  0.00 -0.06    NA                                    
## JFIT -0.06 -0.12 -0.01  0.27 -0.01    NA                              
## LETT -0.02 -0.05  0.01  0.04  0.02  0.03    NA                        
## LIKE -0.04  0.12 -0.09 -0.08 -0.03 -0.02 -0.01    NA                  
## ORGN  0.03 -0.04  0.13  0.04  0.03  0.00  0.03 -0.08    NA            
## POTL  0.04  0.00  0.02  0.03 -0.01  0.01  0.01 -0.03  0.01    NA      
## RESM  0.02  0.09 -0.01 -0.05 -0.01 -0.05  0.00  0.04 -0.06 -0.02    NA
## SCON -0.06  0.20 -0.10 -0.09  0.05 -0.08 -0.07  0.24 -0.08 -0.05  0.06
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_PA_P_3F_RMS <- FA_PA_P_3F$rms)

## [1] 0.07338105

(FA_PA_P_3F_TLI <- FA_PA_P_3F$TLI)

## [1] 0.6191606

(FA_PA_P_3F_BIC <- FA_PA_P_3F$BIC)

## [1] -22.76274

(FA_PA_P_3F_MaxResidual   <- max(abs(FA_PA_P_3F_Residual),na.rm=TRUE))

## [1] 0.2668687

(FA_PA_P_3F_HighResidual  <- sum(FA_PA_P_3F_Residual>abs(0.05),na.rm=TRUE))

## [1] 18

(FA_PA_P_3F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_PA_P_3F_HighResidualRate <- FA_PA_P_3F_HighResidual/FA_PA_P_3F_TotalResidual)

## [1] 0.2727273
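
The high residual count above compares the signed residuals with 0.05, so residuals below -0.05 are not counted, and any pair stored in both triangles of the residual matrix would be tallied twice. As a hedged alternative, the sketch below counts each variable pair once and uses the absolute value. The toy matrix is hypothetical and the object names are illustrative, so the resulting rate is not a result of this analysis.

```r
# Hypothetical 4 x 4 symmetric residual matrix (illustrative values only,
# not taken from the analysis above); the diagonal is NA, as with diag = FALSE
toy_residual <- matrix(c(   NA,  0.07, -0.06,  0.02,
                          0.07,    NA,  0.01, -0.13,
                         -0.06,  0.01,    NA,  0.04,
                          0.02, -0.13,  0.04,    NA),
                       nrow = 4, byrow = TRUE)

# Keep each variable pair once by reading only the lower triangle
unique_residuals <- toy_residual[lower.tri(toy_residual)]

# Count residuals whose magnitude exceeds the 0.05 cutoff
high_residual_count <- sum(abs(unique_residuals) > 0.05, na.rm = TRUE)

# Number of unique variable pairs, p * (p - 1) / 2
total_pairs <- length(unique_residuals)

high_residual_rate <- high_residual_count / total_pairs
print(high_residual_rate)
```

Applied to a fitted model, the same steps would use the residual matrix in place of `toy_residual`; the rate obtained this way may differ from the value reported above.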

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_PA_P_3F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Promax Rotation : 3 Factors",
           cex=0.75)

##################################
# Computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM","LIKE","APPR","SCON","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM", "LIKE", 
##     "APPR", "SCON", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.89     0.9      0.57 7.8 0.026  7.1 1.1     0.53
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.82  0.88  0.92
## Duhachek  0.83  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ORGN      0.84      0.86    0.85      0.55 6.0    0.036 0.014  0.49
## COMM      0.85      0.87    0.86      0.56 6.4    0.034 0.012  0.53
## LIKE      0.86      0.86    0.88      0.56 6.4    0.030 0.019  0.52
## APPR      0.86      0.87    0.89      0.57 6.7    0.030 0.019  0.51
## SCON      0.86      0.87    0.87      0.57 6.5    0.029 0.015  0.53
## CFIT      0.87      0.88    0.90      0.59 7.3    0.029 0.020  0.53
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.87  0.84  0.84   0.78  6.9 1.6
## COMM 50  0.84  0.81  0.79   0.74  6.9 1.5
## LIKE 50  0.78  0.81  0.77   0.69  7.4 1.1
## APPR 50  0.75  0.79  0.73   0.67  7.4 1.0
## SCON 50  0.76  0.80  0.76   0.67  7.3 1.2
## CFIT 50  0.78  0.75  0.66   0.64  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.10 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.20 0.30 0.28 0.06 0.02    0
## LIKE 0.00 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0
## APPR 0.00 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0
## SCON 0.00 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0

alpha(DPA.Descriptors.Numeric[,c("POTL","EXPR","ACAD","JFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("POTL", "EXPR", "ACAD", 
##     "JFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.88    0.87      0.66 7.6 0.028  7.3 1.2     0.69
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.81  0.88  0.93
## Duhachek  0.82  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## POTL      0.79      0.80    0.74      0.58 4.1    0.051 0.0063  0.54
## EXPR      0.86      0.86    0.84      0.68 6.4    0.036 0.0169  0.71
## ACAD      0.84      0.84    0.80      0.64 5.4    0.040 0.0111  0.70
## JFIT      0.89      0.89    0.85      0.72 7.8    0.028 0.0043  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## POTL 50  0.93  0.93  0.92   0.87  7.3 1.4
## EXPR 50  0.83  0.84  0.75   0.71  7.3 1.4
## ACAD 50  0.86  0.87  0.83   0.76  7.4 1.3
## JFIT 50  0.82  0.80  0.71   0.65  7.0 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## POTL 0.00 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0
## ACAD 0.00 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0

alpha(DPA.Descriptors.Numeric[,c("LETT","RESM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("LETT", "RESM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0
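
The std.alpha values reported above can be related to the average inter-item correlation. The following is a minimal sketch, assuming the usual standardized-alpha formula k * r / (1 + (k - 1) * r) and the rounded average_r values printed above. The function and object names are illustrative.

```r
# Standardized Cronbach's alpha from the number of items (k)
# and the average inter-item correlation (average_r)
standardized_alpha <- function(k, average_r) {
  (k * average_r) / (1 + (k - 1) * average_r)
}

# Inputs copied from the alpha() output above (rounded to two decimals)
alpha_six_items  <- standardized_alpha(k = 6, average_r = 0.57)
alpha_four_items <- standardized_alpha(k = 4, average_r = 0.66)
alpha_two_items  <- standardized_alpha(k = 2, average_r = 0.84)

print(round(c(alpha_six_items, alpha_four_items, alpha_two_items), 2))
```

Because average_r is only available to two decimals here, the results should be expected to match the printed std.alpha values to within about 0.01 rather than exactly.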

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_PA_P_3F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "prax",
                                  nfac = 3,
                                  rotation = "promax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_PA_P_3F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

##################################
# Conducting exploratory factor analysis
# using Principal Axes extraction
# and Promax rotation
# with 4 factors
##################################
(FA_PA_P_4F <- fa(DPA.Descriptors.Numeric,
              nfactors = 4,
              fm="pa",
              rotate = "promax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  pa
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 4, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "promax", residuals = TRUE, SMC = TRUE, fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        PA1   PA2   PA4   PA3   h2   u2 com
## ACAD  0.77 -0.02  0.11 -0.04 0.64 0.36 1.0
## APPR  0.04  0.09  0.72  0.04 0.65 0.35 1.0
## COMM -0.14  0.82  0.11  0.10 0.73 0.27 1.1
## CFIT  0.44  0.64 -0.14  0.01 0.76 0.24 1.9
## EXPR  0.81 -0.28  0.03  0.19 0.68 0.32 1.4
## JFIT  0.57  0.46 -0.10  0.02 0.73 0.27 2.0
## LETT -0.01  0.14 -0.16  0.95 0.89 0.11 1.1
## LIKE  0.02  0.16  0.68  0.05 0.66 0.34 1.1
## ORGN -0.17  0.92  0.17 -0.05 0.86 0.14 1.1
## POTL  0.94  0.04  0.08 -0.13 0.89 0.11 1.1
## RESM  0.04 -0.08  0.20  0.85 0.88 0.12 1.1
## SCON  0.11 -0.01  0.87 -0.09 0.79 0.21 1.1
## 
##                        PA1  PA2  PA4  PA3
## SS loadings           2.85 2.47 2.09 1.74
## Proportion Var        0.24 0.21 0.17 0.15
## Cumulative Var        0.24 0.44 0.62 0.76
## Proportion Explained  0.31 0.27 0.23 0.19
## Cumulative Proportion 0.31 0.58 0.81 1.00
## 
##  With factor correlations of 
##      PA1  PA2  PA4  PA3
## PA1 1.00 0.53 0.52 0.54
## PA2 0.53 1.00 0.56 0.41
## PA4 0.52 0.56 1.00 0.42
## PA3 0.54 0.41 0.42 1.00
## 
## Mean item complexity =  1.3
## Test of the hypothesis that 4 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 24  and the objective function was  1.47 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic n.obs is  50 with the empirical chi square  9.84  with prob <  1 
## The total n.obs was  50  with Likelihood Chi Square =  60.87  with prob <  4.8e-05 
## 
## Tucker Lewis Index of factoring reliability =  0.732
## RMSEA index =  0.174  and the 90 % confidence intervals are  0.122 0.233
## BIC =  -33.02
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    PA1  PA2  PA4  PA3
## Correlation of (regression) scores with factors   0.97 0.96 0.95 0.97
## Multiple R square of scores with factors          0.94 0.93 0.89 0.94
## Minimum correlation of possible factor scores     0.88 0.85 0.79 0.88

(FA_PA_P_4F_Summary <- FA_PA_P_4F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (promax-rotation)
## 
## Variable |  PA1 |  PA2 |  PA4 |  PA3 | Complexity | Uniqueness
## --------------------------------------------------------------
## POTL     | 0.94 |      |      |      |       1.06 |       0.11
## EXPR     | 0.81 |      |      |      |       1.36 |       0.32
## ACAD     | 0.77 |      |      |      |       1.05 |       0.36
## JFIT     | 0.57 |      |      |      |       1.99 |       0.27
## ORGN     |      | 0.92 |      |      |       1.14 |       0.14
## COMM     |      | 0.82 |      |      |       1.13 |       0.27
## CFIT     |      | 0.64 |      |      |       1.88 |       0.24
## SCON     |      |      | 0.87 |      |       1.05 |       0.21
## APPR     |      |      | 0.72 |      |       1.04 |       0.35
## LIKE     |      |      | 0.68 |      |       1.12 |       0.34
## LETT     |      |      |      | 0.95 |       1.11 |       0.11
## RESM     |      |      |      | 0.85 |       1.13 |       0.12
## 
## The 4 latent factors (promax rotation) accounted for 76.28% of the total variance of the original data (PA1 = 23.74%, PA2 = 20.58%, PA4 = 17.43%, PA3 = 14.53%).

summary(FA_PA_P_4F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   PA1 |   PA2 |   PA4 |   PA3
## ---------------------------------------------------------------
## Eigenvalues                     | 6.152 | 1.259 | 0.943 | 0.800
## Variance Explained              | 0.237 | 0.206 | 0.174 | 0.145
## Variance Explained (Cumulative) | 0.237 | 0.443 | 0.618 | 0.763
## Variance Explained (Proportion) | 0.311 | 0.270 | 0.229 | 0.190

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_PA_P_4F_Residual <- residuals(FA_PA_P_4F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.04    NA                                                      
## COMM  0.07  0.01    NA                                                
## CFIT -0.06 -0.02 -0.05    NA                                          
## EXPR  0.05 -0.03  0.01 -0.05    NA                                    
## JFIT -0.07 -0.03 -0.07  0.15 -0.01    NA                              
## LETT -0.03 -0.01 -0.01  0.00  0.02  0.00    NA                        
## LIKE -0.03 -0.02 -0.03  0.02 -0.03  0.07  0.02    NA                  
## ORGN  0.03  0.02  0.08 -0.05  0.04 -0.06  0.01 -0.03    NA            
## POTL  0.05  0.03  0.01 -0.02  0.00 -0.03  0.01 -0.01  0.01    NA      
## RESM  0.02  0.03  0.02  0.01 -0.01 -0.01  0.00 -0.01 -0.03 -0.01    NA
## SCON -0.06 -0.01 -0.03  0.04  0.04  0.03 -0.01  0.05 -0.01 -0.03 -0.01
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_PA_P_4F_RMS <- FA_PA_P_4F$rms)

## [1] 0.03861068

(FA_PA_P_4F_TLI <- FA_PA_P_4F$TLI)

## [1] 0.7317578

(FA_PA_P_4F_BIC <- FA_PA_P_4F$BIC)

## [1] -33.01886

(FA_PA_P_4F_MaxResidual   <- max(abs(FA_PA_P_4F_Residual),na.rm=TRUE))

## [1] 0.1504566

(FA_PA_P_4F_HighResidual  <- sum(FA_PA_P_4F_Residual>abs(0.05),na.rm=TRUE))

## [1] 10

(FA_PA_P_4F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_PA_P_4F_HighResidualRate <- FA_PA_P_4F_HighResidual/FA_PA_P_4F_TotalResidual)

## [1] 0.1515152
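
The model degrees of freedom and BIC values printed for the 3-factor and 4-factor solutions appear consistent with two simple formulas. The following is a minimal sketch, assuming df = ((p - k)^2 - (p + k)) / 2 for p variables and k factors, and BIC = chi-square - df * log(n). The function names are illustrative, and the chi-square inputs are the rounded likelihood chi-square values from the output.

```r
# Degrees of freedom of an exploratory factor model with p variables and k factors
efa_degrees_of_freedom <- function(p, k) {
  ((p - k)^2 - (p + k)) / 2
}

# BIC in the chi-square metric: likelihood chi-square minus df * log(n)
efa_bic <- function(chi_square, df, n_obs) {
  chi_square - df * log(n_obs)
}

# Likelihood chi-square values copied from the fa() output for 3 and 4 factors
df_3f  <- efa_degrees_of_freedom(p = 12, k = 3)
df_4f  <- efa_degrees_of_freedom(p = 12, k = 4)
bic_3f <- efa_bic(chi_square = 106.33, df = df_3f, n_obs = 50)
bic_4f <- efa_bic(chi_square = 60.87,  df = df_4f, n_obs = 50)

print(round(c(bic_3f, bic_4f), 2))
```

Under this definition, a more negative BIC indicates a better trade-off between fit and the number of estimated parameters.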

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_PA_P_4F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Promax Rotation : 4 Factors",
           cex=0.75)

##################################
# Computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("POTL","EXPR","ACAD","JFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("POTL", "EXPR", "ACAD", 
##     "JFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.88    0.87      0.66 7.6 0.028  7.3 1.2     0.69
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.81  0.88  0.93
## Duhachek  0.82  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## POTL      0.79      0.80    0.74      0.58 4.1    0.051 0.0063  0.54
## EXPR      0.86      0.86    0.84      0.68 6.4    0.036 0.0169  0.71
## ACAD      0.84      0.84    0.80      0.64 5.4    0.040 0.0111  0.70
## JFIT      0.89      0.89    0.85      0.72 7.8    0.028 0.0043  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## POTL 50  0.93  0.93  0.92   0.87  7.3 1.4
## EXPR 50  0.83  0.84  0.75   0.71  7.3 1.4
## ACAD 50  0.86  0.87  0.83   0.76  7.4 1.3
## JFIT 50  0.82  0.80  0.71   0.65  7.0 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## POTL 0.00 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0
## ACAD 0.00 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0

alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.87      0.87    0.85      0.69 6.7 0.034  6.9 1.4     0.62
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.79  0.87  0.92
## Duhachek  0.80  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r  S/N alpha se var.r med.r
## ORGN      0.74      0.74    0.59      0.59  2.9    0.073    NA  0.59
## COMM      0.77      0.77    0.62      0.62  3.3    0.066    NA  0.62
## CFIT      0.92      0.93    0.86      0.86 12.5    0.021    NA  0.86
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.93  0.93  0.91   0.83  6.9 1.6
## COMM 50  0.91  0.92  0.89   0.80  6.9 1.5
## CFIT 50  0.84  0.83  0.65   0.63  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.10 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.20 0.30 0.28 0.06 0.02    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0

alpha(DPA.Descriptors.Numeric[,c("SCON","APPR","LIKE")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("SCON", "APPR", "LIKE")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.83       0.7 6.9 0.031  7.4 0.99      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.80  0.87  0.92
## Duhachek  0.81  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## SCON      0.78      0.78    0.64      0.64 3.5    0.062    NA  0.64
## APPR      0.86      0.86    0.75      0.75 6.1    0.040    NA  0.75
## LIKE      0.82      0.82    0.70      0.70 4.6    0.051    NA  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## SCON 50  0.92  0.91  0.86   0.80  7.3 1.2
## APPR 50  0.86  0.87  0.76   0.71  7.4 1.0
## LIKE 50  0.90  0.89  0.82   0.76  7.4 1.1
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## SCON 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## APPR 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0
## LIKE 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0

alpha(DPA.Descriptors.Numeric[,c("LETT","RESM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("LETT", "RESM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_PA_P_4F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "prax",
                                  nfac = 4,
                                  rotation = "promax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_PA_P_4F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

##################################
# Conducting exploratory factor analysis
# using Principal Axes extraction
# and Promax rotation
# with 5 factors
##################################
(FA_PA_P_5F <- fa(DPA.Descriptors.Numeric,
              nfactors = 5,
              fm="pa",
              rotate = "promax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  pa
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 5, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "promax", residuals = TRUE, SMC = TRUE, fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        PA5   PA4   PA2   PA1   PA3   h2    u2 com
## ACAD  0.95 -0.05  0.16 -0.10 -0.07 0.79 0.208 1.1
## APPR  0.13  0.67  0.18 -0.16  0.03 0.64 0.355 1.4
## COMM  0.04 -0.05  0.91 -0.01  0.10 0.86 0.143 1.0
## CFIT  0.02 -0.01  0.17  0.81  0.02 0.86 0.144 1.1
## EXPR  0.73  0.00 -0.21  0.05  0.20 0.67 0.331 1.3
## JFIT  0.08  0.05 -0.05  0.91  0.02 0.93 0.065 1.0
## LETT -0.06 -0.15  0.08  0.09  0.96 0.89 0.105 1.1
## LIKE -0.11  0.78  0.02  0.13  0.05 0.69 0.306 1.1
## ORGN -0.04  0.08  0.87  0.10 -0.05 0.89 0.115 1.1
## POTL  0.80  0.05  0.02  0.22 -0.11 0.87 0.132 1.2
## RESM  0.07  0.18 -0.02 -0.09  0.85 0.87 0.127 1.1
## SCON -0.01  1.00 -0.11  0.04 -0.09 0.85 0.153 1.0
## 
##                        PA5  PA4  PA2  PA1  PA3
## SS loadings           2.22 2.18 1.85 1.81 1.76
## Proportion Var        0.18 0.18 0.15 0.15 0.15
## Cumulative Var        0.18 0.37 0.52 0.67 0.82
## Proportion Explained  0.23 0.22 0.19 0.18 0.18
## Cumulative Proportion 0.23 0.45 0.64 0.82 1.00
## 
##  With factor correlations of 
##      PA5  PA4  PA2  PA1  PA3
## PA5 1.00 0.57 0.36 0.61 0.52
## PA4 0.57 1.00 0.59 0.52 0.45
## PA2 0.36 0.59 1.00 0.55 0.35
## PA1 0.61 0.52 0.55 1.00 0.46
## PA3 0.52 0.45 0.35 0.46 1.00
## 
## Mean item complexity =  1.1
## Test of the hypothesis that 5 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 16  and the objective function was  0.47 
## 
## The root mean square of the residuals (RMSR) is  0.01 
## The df corrected root mean square of the residuals is  0.03 
## 
## The harmonic n.obs is  50 with the empirical chi square  1.45  with prob <  1 
## The total n.obs was  50  with Likelihood Chi Square =  19.16  with prob <  0.26 
## 
## Tucker Lewis Index of factoring reliability =  0.965
## RMSEA index =  0.06  and the 90 % confidence intervals are  0 0.153
## BIC =  -43.43
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                    PA5  PA4  PA2  PA1  PA3
## Correlation of (regression) scores with factors   0.96 0.96 0.97 0.97 0.97
## Multiple R square of scores with factors          0.92 0.92 0.93 0.95 0.94
## Minimum correlation of possible factor scores     0.85 0.84 0.86 0.90 0.88

(FA_PA_P_5F_Summary <- FA_PA_P_5F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (promax-rotation)
## 
## Variable |  PA5 |  PA4 |  PA2 |  PA1 |  PA3 | Complexity | Uniqueness
## ---------------------------------------------------------------------
## ACAD     | 0.95 |      |      |      |      |       1.10 |       0.21
## POTL     | 0.80 |      |      |      |      |       1.20 |       0.13
## EXPR     | 0.73 |      |      |      |      |       1.32 |       0.33
## SCON     |      | 1.00 |      |      |      |       1.04 |       0.15
## LIKE     |      | 0.78 |      |      |      |       1.10 |       0.31
## APPR     |      | 0.67 |      |      |      |       1.35 |       0.36
## COMM     |      |      | 0.91 |      |      |       1.04 |       0.14
## ORGN     |      |      | 0.87 |      |      |       1.05 |       0.11
## JFIT     |      |      |      | 0.91 |      |       1.03 |       0.07
## CFIT     |      |      |      | 0.81 |      |       1.09 |       0.14
## LETT     |      |      |      |      | 0.96 |       1.09 |       0.11
## RESM     |      |      |      |      | 0.85 |       1.12 |       0.13
## 
## The 5 latent factors (promax rotation) accounted for 81.80% of the total variance of the original data (PA5 = 18.48%, PA4 = 18.16%, PA2 = 15.39%, PA1 = 15.12%, PA3 = 14.65%).

summary(FA_PA_P_5F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   PA5 |   PA4 |   PA2 |   PA1 |   PA3
## -----------------------------------------------------------------------
## Eigenvalues                     | 6.211 | 1.294 | 0.967 | 0.867 | 0.477
## Variance Explained              | 0.185 | 0.182 | 0.154 | 0.151 | 0.147
## Variance Explained (Cumulative) | 0.185 | 0.366 | 0.520 | 0.672 | 0.818
## Variance Explained (Proportion) | 0.226 | 0.222 | 0.188 | 0.185 | 0.179

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_PA_P_5F_Residual <- residuals(FA_PA_P_5F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.01    NA                                                      
## COMM  0.01 -0.02    NA                                                
## CFIT  0.01  0.00  0.00    NA                                          
## EXPR  0.00 -0.04  0.00 -0.03    NA                                    
## JFIT  0.00  0.00  0.00  0.00  0.01    NA                              
## LETT -0.02 -0.01 -0.01  0.00  0.02 -0.01    NA                        
## LIKE  0.01  0.00  0.00 -0.02 -0.02  0.02  0.02    NA                  
## ORGN -0.01  0.01  0.00  0.00  0.02  0.00  0.02  0.00    NA            
## POTL -0.01  0.02 -0.01  0.01  0.01 -0.01  0.01  0.01  0.00    NA      
## RESM  0.01  0.03  0.01  0.02 -0.02  0.00  0.00 -0.01 -0.03 -0.01    NA
## SCON -0.02 -0.01  0.01  0.01  0.05 -0.01 -0.01  0.01  0.01 -0.02 -0.01
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_PA_P_5F_RMS <- FA_PA_P_5F$rms)

## [1] 0.01483907

(FA_PA_P_5F_TLI <- FA_PA_P_5F$TLI)

## [1] 0.9648451

(FA_PA_P_5F_BIC <- FA_PA_P_5F$BIC)

## [1] -43.43181

(FA_PA_P_5F_MaxResidual   <- max(abs(FA_PA_P_5F_Residual),na.rm=TRUE))

## [1] 0.04835397

(FA_PA_P_5F_HighResidual  <- sum(abs(FA_PA_P_5F_Residual)>0.05,na.rm=TRUE))

## [1] 0

(FA_PA_P_5F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_PA_P_5F_HighResidualRate <- FA_PA_P_5F_HighResidual/FA_PA_P_5F_TotalResidual)

## [1] 0

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_PA_P_5F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Promax Rotation : 5 Factors",
           cex=0.75)

##################################
# computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("ACAD","POTL","EXPR")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ACAD", "POTL", "EXPR")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.89      0.89    0.85      0.72 7.8 0.028  7.3 1.2      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.82  0.89  0.93
## Duhachek  0.83  0.89  0.94
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ACAD      0.83      0.83    0.70      0.70 4.7    0.049    NA  0.70
## POTL      0.80      0.80    0.67      0.67 4.0    0.056    NA  0.67
## EXPR      0.88      0.89    0.80      0.80 7.8    0.032    NA  0.80
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ACAD 50  0.91  0.91  0.85   0.79  7.4 1.3
## POTL 50  0.92  0.92  0.88   0.82  7.3 1.4
## EXPR 50  0.88  0.88  0.76   0.72  7.3 1.4
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## ACAD 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## POTL 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0
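
For reference, the two alpha columns reported above can be sketched in base R. The helper names `raw_alpha` and `std_alpha` are illustrative and are not functions of the psych package. `raw_alpha` uses the usual variance-based formula, while `std_alpha` uses only the number of items and the average inter-item correlation. The last line plugs in k = 3 and the rounded average_r of 0.72 printed above, which should land close to the reported std.alpha of 0.89.

```r
# Raw Cronbach's alpha from item variances and total-score variance
raw_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_var <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Standardized alpha from the number of items and the
# average inter-item correlation
std_alpha <- function(k, average_r) {
  (k * average_r) / (1 + (k - 1) * average_r)
}

# Values taken from the ACAD / POTL / EXPR output above
std_alpha(k = 3, average_r = 0.72)
```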

alpha(DPA.Descriptors.Numeric[,c("SCON","LIKE","APPR")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("SCON", "LIKE", "APPR")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.83       0.7 6.9 0.031  7.4 0.99      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.80  0.87  0.92
## Duhachek  0.81  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## SCON      0.78      0.78    0.64      0.64 3.5    0.062    NA  0.64
## LIKE      0.82      0.82    0.70      0.70 4.6    0.051    NA  0.70
## APPR      0.86      0.86    0.75      0.75 6.1    0.040    NA  0.75
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## SCON 50  0.92  0.91  0.86   0.80  7.3 1.2
## LIKE 50  0.90  0.89  0.82   0.76  7.4 1.1
## APPR 50  0.86  0.87  0.76   0.71  7.4 1.0
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## SCON 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## LIKE 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0
## APPR 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0

alpha(DPA.Descriptors.Numeric[,c("COMM","ORGN")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("COMM", "ORGN")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.92      0.93    0.86      0.86  12 0.021  6.9 1.5     0.86
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.87  0.92  0.96
## Duhachek  0.88  0.92  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## COMM      0.80      0.86    0.74      0.86 6.2       NA     0  0.86
## ORGN      0.92      0.86    0.74      0.86 6.2       NA     0  0.86
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## COMM 50  0.96  0.96   0.9   0.86  6.9 1.5
## ORGN 50  0.97  0.96   0.9   0.86  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5   6    7    8    9   10 miss
## COMM 0.04 0.04 0.06 0.2 0.30 0.28 0.06 0.02    0
## ORGN 0.02 0.04 0.20 0.1 0.24 0.24 0.16 0.00    0

alpha(DPA.Descriptors.Numeric[,c("LETT","RESM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("LETT", "RESM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0

alpha(DPA.Descriptors.Numeric[,c("JFIT","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("JFIT", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.94      0.94    0.88      0.88  15 0.018    7 1.6     0.88
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.89  0.94  0.96
## Duhachek  0.90  0.94  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## JFIT      0.87      0.88    0.78      0.88 7.5       NA     0  0.88
## CFIT      0.90      0.88    0.78      0.88 7.5       NA     0  0.88
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## JFIT 50  0.97  0.97  0.91   0.88  7.0 1.6
## CFIT 50  0.97  0.97  0.91   0.88  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_PA_P_5F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "prax",
                                  nfac = 5,
                                  rotation = "promax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_PA_P_5F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

par(mfrow=c(1,3))
fa.diagram(FA_PA_P_3F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Promax Rotation : 3 Factors",
           cex=0.75)
fa.diagram(FA_PA_P_4F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Promax Rotation : 4 Factors",
           cex=0.75)
fa.diagram(FA_PA_P_5F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Promax Rotation : 5 Factors",
           cex=0.75)

1.6.3 Maximum Likelihood Factor Extraction and Varimax Rotation (FA_ML_V)

Maximum Likelihood Factor Extraction estimates the factor loadings that maximize the likelihood of the observed data under an assumed factor model, typically with the observed variables treated as multivariate normal. Given the correlation matrix, the algorithm formulates a likelihood function that quantifies how well the correlations implied by the model reproduce the observed ones. Optimization techniques then iteratively adjust the factor loadings and unique variances until the likelihood is maximized. The resulting factor loadings indicate the strength and direction of the relationship between each variable and each factor, and factors are interpreted from their loading patterns: variables with high loadings on a factor are strongly associated with it.
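
The quantity being optimized can be sketched as the standard maximum likelihood discrepancy between the observed correlation matrix S and the model-implied matrix Sigma = Lambda Lambda' + Psi, namely F = log|Sigma| + tr(S Sigma^-1) - log|S| - p. The function below is a minimal illustration under that assumption and is not the internal implementation of any package. The name `ml_discrepancy` and the toy loadings are hypothetical.

```r
# ML discrepancy between an observed correlation matrix S and the
# matrix implied by loadings Lambda and unique variances psi
ml_discrepancy <- function(S, Lambda, psi) {
  Sigma <- Lambda %*% t(Lambda) + diag(psi)
  p <- nrow(S)
  as.numeric(
    determinant(Sigma, logarithm = TRUE)$modulus +
      sum(diag(S %*% solve(Sigma))) -
      determinant(S, logarithm = TRUE)$modulus - p
  )
}

# Toy two-factor model (hypothetical loadings, not from the dataset)
Lambda_true <- matrix(c(0.8, 0.7, 0.6, 0.0, 0.0, 0.0,
                        0.0, 0.0, 0.0, 0.8, 0.7, 0.6), ncol = 2)
psi_true <- 1 - rowSums(Lambda_true^2)
S <- Lambda_true %*% t(Lambda_true) + diag(psi_true)

# Discrepancy at the generating parameters versus shrunken loadings
Lambda_bad <- 0.5 * Lambda_true
psi_bad <- 1 - rowSums(Lambda_bad^2)
c(true = ml_discrepancy(S, Lambda_true, psi_true),
  shrunken = ml_discrepancy(S, Lambda_bad, psi_bad))
```

The discrepancy is zero (up to rounding) when the model reproduces S exactly and positive otherwise, which is why an optimizer can search for the loadings that drive it towards zero.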

Varimax Rotation is an orthogonal rotation method that keeps the rotated factors uncorrelated with each other, leading to simpler and more easily interpretable factor solutions. The algorithm maximizes the variance of the squared loadings within each factor, which pushes each loading towards either zero or a large magnitude and helps identify variables that are strongly associated with a single factor. The results are straightforward to interpret and can be particularly useful when the factors are expected to be independent.
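
The properties described above can be illustrated with `varimax()` from base R's stats package, which is used here instead of the psych rotation called in the analysis. The loadings are hypothetical: a clean two-factor structure is deliberately blurred by an arbitrary 30-degree rotation and then passed to varimax. The helper `varimax_criterion` is an illustrative name for the sum, over factors, of the variance of the squared loadings.

```r
# Simple two-factor structure (hypothetical loadings, not from the dataset)
simple <- matrix(c(0.8, 0.7, 0.6, 0.0, 0.0, 0.0,
                   0.0, 0.0, 0.0, 0.8, 0.7, 0.6), ncol = 2)

# Blur the structure with an arbitrary 30-degree orthogonal rotation
theta <- pi / 6
blur <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), ncol = 2)
unrotated <- simple %*% blur

# Varimax criterion: sum over factors of the variance of squared loadings
varimax_criterion <- function(L) sum(apply(L^2, 2, var))

rotated <- varimax(unrotated, normalize = FALSE)
rotated_loadings <- unclass(rotated$loadings)

# The rotation matrix is orthogonal, so T'T is the identity
crossprod(rotated$rotmat)

# Communalities (row sums of squared loadings) are unchanged by the rotation
cbind(before = rowSums(unrotated^2), after = rowSums(rotated_loadings^2))

# Criterion value before and after rotation
c(before = varimax_criterion(unrotated),
  after = varimax_criterion(rotated_loadings))
```

Because the rotation is orthogonal, it redistributes variance across factors without changing how much of each variable is explained overall. The criterion would be expected to be no lower after rotation.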

[A] Applying Maximum Likelihood factor extraction and Varimax rotation, an evaluation was conducted using a set of empirical guidelines to determine the optimal number of factors to be retained for exploratory factor analysis. It was determined that:
     [A.1] 4 factors would be sufficient for an optimal balance between comprehensiveness and parsimony.
     [A.2] To ensure that both under-extraction and over-extraction are assessed, models with 3, 4 and 5 factors were sequentially evaluated for their interpretability and theoretical meaningfulness.
     [A.3] The choice of 4 factors was supported by the maximum consensus among methods: 5 of 19 (26.32%), namely Bentler, Beta, Parallel Analysis, Kaiser Criterion and Scree (SE).

[B] Results for the exploratory factor analysis using a 3-Factor Structure were as follows:
     [B.1] Standardized Root Mean Square of the Residual = 0.08
     [B.2] Tucker-Lewis Fit Index = 0.69
     [B.3] Bayesian Information Criterion = -35.60
     [B.4] High Residual Rate = 0.24
     [B.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [B.5.1] POTL: Loading = 0.90, Communality = 0.92
            [B.5.2] ACAD: Loading = 0.76, Communality = 0.67
            [B.5.3] EXPR: Loading = 0.71, Communality = 0.61
            [B.5.4] JFIT: Loading = 0.62, Communality = 0.62
            [B.5.5] Cronbach’s Alpha = 0.88
     [B.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [B.6.1] ORGN: Loading = 0.96, Communality = 0.96
            [B.6.2] COMM: Loading = 0.85, Communality = 0.80
            [B.6.3] CFIT: Loading = 0.56, Communality = 0.62
            [B.6.4] LIKE: Loading = 0.48, Communality = 0.44
            [B.6.5] APPR: Loading = 0.48, Communality = 0.46
            [B.6.6] SCON: Loading = 0.43, Communality = 0.40
            [B.6.7] Cronbach’s Alpha = 0.88
     [B.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [B.7.1] RESM: Loading = 0.93, Communality = 1.00
            [B.7.2] LETT: Loading = 0.78, Communality = 0.71
            [B.7.3] Cronbach’s Alpha = 0.91

[C] Results for the exploratory factor analysis using a 4-Factor Structure were as follows:
     [C.1] Standardized Root Mean Square of the Residual = 0.06
     [C.2] Tucker-Lewis Fit Index = 0.76
     [C.3] Bayesian Information Criterion = -37.51
     [C.4] High Residual Rate = 0.15
     [C.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [C.5.1] JFIT: Loading = 0.87, Communality = 0.91
            [C.5.2] CFIT: Loading = 0.78, Communality = 0.85
            [C.5.3] POTL: Loading = 0.66, Communality = 0.65
            [C.5.4] EXPR: Loading = 0.50, Communality = 0.46
            [C.5.5] ACAD: Loading = 0.49, Communality = 0.45
            [C.5.6] Cronbach’s Alpha = 0.90
     [C.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [C.6.1] SCON: Loading = 0.81, Communality = 0.78
            [C.6.2] APPR: Loading = 0.72, Communality = 0.68
            [C.6.3] LIKE: Loading = 0.67, Communality = 0.65
            [C.6.4] Cronbach’s Alpha = 0.87
     [C.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [C.7.1] ORGN: Loading = 0.91, Communality = 0.97
            [C.7.2] COMM: Loading = 0.79, Communality = 0.79
            [C.7.3] Cronbach’s Alpha = 0.92
     [C.8] Factor 4 was a latent variable with higher loading towards the following descriptors:
            [C.8.1] RESM: Loading = 0.91, Communality = 1.00
            [C.8.2] LETT: Loading = 0.81, Communality = 0.78
            [C.8.3] Cronbach’s Alpha = 0.91

[D] Results for the exploratory factor analysis using a 5-Factor Structure were as follows:
     [D.1] Standardized Root Mean Square of the Residual = 0.02
     [D.2] Tucker-Lewis Fit Index = 0.99
     [D.3] Bayesian Information Criterion = -45.32
     [D.4] High Residual Rate = 0.03
     [D.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [D.5.1] ACAD: Loading = 0.81, Communality = 0.78
            [D.5.2] POTL: Loading = 0.79, Communality = 0.87
            [D.5.3] EXPR: Loading = 0.69, Communality = 0.64
            [D.5.4] Cronbach’s Alpha = 0.89
     [D.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [D.6.1] SCON: Loading = 0.87, Communality = 0.87
            [D.6.2] LIKE: Loading = 0.71, Communality = 0.68
            [D.6.3] APPR: Loading = 0.65, Communality = 0.65
            [D.6.4] Cronbach’s Alpha = 0.87
     [D.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [D.7.1] ORGN: Loading = 0.89, Communality = 0.96
            [D.7.2] COMM: Loading = 0.79, Communality = 0.80
            [D.7.3] Cronbach’s Alpha = 0.92
     [D.8] Factor 4 was a latent variable with higher loading towards the following descriptors:
            [D.8.1] RESM: Loading = 0.90, Communality = 1.00
            [D.8.2] LETT: Loading = 0.81, Communality = 0.77
            [D.8.3] Cronbach’s Alpha = 0.91
     [D.9] Factor 5 was a latent variable with higher loading towards the following descriptors:
            [D.9.1] JFIT: Loading = 0.81, Communality = 0.95
            [D.9.2] CFIT: Loading = 0.72, Communality = 0.85
            [D.9.3] Cronbach’s Alpha = 0.94

Code Chunk | Output

##################################
# Implementing various procedures for determining
# factor retention based on
# the maximum consensus between methods
##################################
(FA_ML_V_MethodAgreementProcedure <- parameters::n_factors(DPA.Descriptors.Numeric,
                                                           algorithm = "mle",
                                                           rotation = "varimax"))

## # Method Agreement Procedure:
## 
## The choice of 4 dimensions is supported by 5 (26.32%) methods out of 19 (Bentler, beta, Parallel analysis, Kaiser criterion, Scree (SE)).

as.data.frame(FA_ML_V_MethodAgreementProcedure)

##    n_Factors              Method              Family
## 1          1 Acceleration factor               Scree
## 2          1          Scree (R2)            Scree_SE
## 3          1    VSS complexity 1                 VSS
## 4          2 Optimal coordinates               Scree
## 5          2    VSS complexity 2                 VSS
## 6          3                 CNG                 CNG
## 7          4             Bentler             Bentler
## 8          4                beta Multiple_regression
## 9          4   Parallel analysis               Scree
## 10         4    Kaiser criterion               Scree
## 11         4          Scree (SE)            Scree_SE
## 12         5       Velicer's MAP        Velicers_MAP
## 13         5                 BIC                 BIC
## 14         6            Bartlett             Barlett
## 15         6                   t Multiple_regression
## 16         6                   p Multiple_regression
## 17         6      BIC (adjusted)                 BIC
## 18         7            Anderson             Barlett
## 19         7              Lawley             Barlett
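
The consensus figure can be checked by hand from the table above. The sketch below simply tallies the number of factors suggested by each of the 19 methods, as printed, and reports the most frequent value with its share of the votes. The variable names are illustrative.

```r
# Number of factors suggested by each of the 19 methods,
# copied from the n_Factors column of the table above
n_factors_suggested <- c(1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 4,
                         5, 5, 6, 6, 6, 6, 7, 7)

# Tally the votes and pick the most frequently suggested value
votes <- table(n_factors_suggested)
consensus <- as.integer(names(votes)[which.max(votes)])
support <- max(votes) / length(n_factors_suggested)

consensus
round(100 * support, 2)
```

This reproduces the choice of 4 factors with 5 of 19 votes (26.32%). It also shows how thin the plurality is, since 6 factors received 4 votes, which is one reason neighbouring solutions are evaluated as well.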

##################################
# Conducting exploratory factor analysis
# using Maximum Likelihood extraction
# and Varimax rotation
# with 3 factors
##################################
(FA_ML_V_3F <- fa(DPA.Descriptors.Numeric,
              nfactors = 3,
              fm="ml",
              rotate = "varimax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  ml
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 3, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "varimax", residuals = TRUE, SMC = TRUE, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML3  ML2  ML1   h2    u2 com
## ACAD 0.76 0.24 0.20 0.67 0.326 1.3
## APPR 0.36 0.48 0.32 0.46 0.543 2.7
## COMM 0.16 0.85 0.23 0.80 0.203 1.2
## CFIT 0.53 0.56 0.19 0.62 0.377 2.2
## EXPR 0.71 0.04 0.32 0.61 0.385 1.4
## JFIT 0.62 0.45 0.21 0.62 0.379 2.1
## LETT 0.22 0.25 0.78 0.71 0.289 1.4
## LIKE 0.35 0.48 0.29 0.44 0.563 2.6
## ORGN 0.15 0.96 0.07 0.96 0.043 1.1
## POTL 0.90 0.29 0.14 0.92 0.079 1.3
## RESM 0.30 0.19 0.93 1.00 0.005 1.3
## SCON 0.39 0.43 0.25 0.40 0.597 2.6
## 
##                        ML3  ML2  ML1
## SS loadings           3.15 3.04 2.02
## Proportion Var        0.26 0.25 0.17
## Cumulative Var        0.26 0.52 0.68
## Proportion Explained  0.38 0.37 0.25
## Cumulative Proportion 0.38 0.75 1.00
## 
## Mean item complexity =  1.8
## Test of the hypothesis that 3 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 33  and the objective function was  2.22 
## 
## The root mean square of the residuals (RMSR) is  0.08 
## The df corrected root mean square of the residuals is  0.11 
## 
## The harmonic n.obs is  50 with the empirical chi square  41.84  with prob <  0.14 
## The total n.obs was  50  with Likelihood Chi Square =  93.49  with prob <  1.1e-07 
## 
## Tucker Lewis Index of factoring reliability =  0.686
## RMSEA index =  0.19  and the 90 % confidence intervals are  0.148 0.24
## BIC =  -35.6
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy             
##                                                    ML3  ML2  ML1
## Correlation of (regression) scores with factors   0.96 0.98 0.99
## Multiple R square of scores with factors          0.92 0.96 0.99
## Minimum correlation of possible factor scores     0.84 0.92 0.97

(FA_ML_V_3F_Summary <- FA_ML_V_3F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (varimax-rotation)
## 
## Variable |  ML3 |  ML2 |  ML1 | Complexity | Uniqueness
## -------------------------------------------------------
## POTL     | 0.90 |      |      |       1.25 |       0.08
## ACAD     | 0.76 |      |      |       1.35 |       0.33
## EXPR     | 0.71 |      |      |       1.39 |       0.39
## JFIT     | 0.62 |      |      |       2.09 |       0.38
## ORGN     |      | 0.96 |      |       1.06 |       0.04
## COMM     |      | 0.85 |      |       1.22 |       0.20
## CFIT     |      | 0.56 |      |       2.24 |       0.38
## LIKE     |      | 0.48 |      |       2.56 |       0.56
## APPR     |      | 0.48 |      |       2.67 |       0.54
## SCON     |      | 0.43 |      |       2.60 |       0.60
## RESM     |      |      | 0.93 |       1.29 |   4.98e-03
## LETT     |      |      | 0.78 |       1.38 |       0.29
## 
## The 3 latent factors (varimax rotation) accounted for 68.42% of the total variance of the original data (ML3 = 26.22%, ML2 = 25.33%, ML1 = 16.87%).

summary(FA_ML_V_3F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   ML3 |   ML2 |   ML1
## -------------------------------------------------------
## Eigenvalues                     | 6.079 | 1.261 | 0.922
## Variance Explained              | 0.262 | 0.253 | 0.169
## Variance Explained (Cumulative) | 0.262 | 0.515 | 0.684
## Variance Explained (Proportion) | 0.383 | 0.370 | 0.247

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_ML_V_3F_Residual <- residuals(FA_ML_V_3F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.02    NA                                                      
## COMM  0.04 -0.04    NA                                                
## CFIT -0.07 -0.10 -0.01    NA                                          
## EXPR  0.05 -0.07 -0.02 -0.06    NA                                    
## JFIT -0.08 -0.09 -0.02  0.27  0.00    NA                              
## LETT -0.06 -0.14 -0.02  0.05  0.06  0.06    NA                        
## LIKE -0.04  0.19 -0.04 -0.02 -0.06  0.04 -0.06    NA                  
## ORGN  0.00  0.00  0.00 -0.01  0.01 -0.01  0.01 -0.01    NA            
## POTL  0.01  0.01  0.00  0.00  0.00 -0.01  0.00 -0.01  0.00    NA      
## RESM  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00    NA
## SCON -0.04  0.27 -0.05 -0.03  0.03 -0.02 -0.15  0.34  0.00 -0.01  0.00
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_ML_V_3F_RMS <- FA_ML_V_3F$rms)

## [1] 0.07961754

(FA_ML_V_3F_TLI <- FA_ML_V_3F$TLI)

## [1] 0.6858498

(FA_ML_V_3F_BIC <- FA_ML_V_3F$BIC)

## [1] -35.60434
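
The reported BIC is numerically consistent with the likelihood chi-square penalised by the model degrees of freedom times log(n). The sketch below assumes that definition and the usual degrees-of-freedom formula for an exploratory factor model with p variables and k factors. The helper names `efa_df` and `efa_bic` are illustrative, and the inputs are the rounded values printed in the 3-factor output, so agreement is only expected to about two decimals.

```r
# Degrees of freedom of an exploratory factor model
# with p observed variables and k factors
efa_df <- function(p, k) ((p - k)^2 - (p + k)) / 2

# BIC as the chi-square statistic penalised by df * log(n)
efa_bic <- function(chisq, df, n) chisq - df * log(n)

# 3-factor solution: 12 variables, likelihood chi-square 93.49, n = 50
df_3f <- efa_df(p = 12, k = 3)
df_3f
efa_bic(chisq = 93.49, df = df_3f, n = 50)
```

More negative values indicate a better trade-off between fit and parsimony, which is how the 3-, 4- and 5-factor solutions are compared in the summary.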

(FA_ML_V_3F_MaxResidual   <- max(abs(FA_ML_V_3F_Residual),na.rm=TRUE))

## [1] 0.3363878

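# Note: this comparison counts only positive residuals above 0.05;
# abs(FA_ML_V_3F_Residual) > 0.05 would also count large negative residuals
(FA_ML_V_3F_HighResidual  <- sum(FA_ML_V_3F_Residual>abs(0.05),na.rm=TRUE))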

## [1] 16

(FA_ML_V_3F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_ML_V_3F_HighResidualRate <- FA_ML_V_3F_HighResidual/FA_ML_V_3F_TotalResidual)

## [1] 0.2424242
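
The rate above compares the signed residuals against 0.05, so residuals below -0.05 are not counted. If the residual object holds both triangles of the symmetric matrix, each variable pair may also be counted twice against a denominator of 66 unique pairs. The sketch below is a hedged alternative, assuming the intent is the share of unique variable pairs whose absolute residual exceeds the cutoff. The function name `high_residual_rate` and the toy matrix are hypothetical, and the reported results in this document were not recomputed with it.

```r
# Share of unique off-diagonal residuals whose absolute value
# exceeds a cutoff (lower triangle only, so each pair counts once)
high_residual_rate <- function(resid, cutoff = 0.05) {
  unique_pairs <- resid[lower.tri(resid)]
  mean(abs(unique_pairs) > cutoff, na.rm = TRUE)
}

# Toy symmetric residual matrix (hypothetical values, NA on the diagonal)
toy <- matrix(c(   NA,  0.02, -0.08,  0.10,
                 0.02,    NA,  0.01, -0.03,
                -0.08,  0.01,    NA,  0.06,
                 0.10, -0.03,  0.06,    NA), nrow = 4, byrow = TRUE)

high_residual_rate(toy)
```

In the toy matrix, three of the six unique pairs (-0.08, 0.10 and 0.06) exceed the cutoff in absolute value, whereas a signed comparison would miss the negative one.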

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_ML_V_3F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Varimax Rotation : 3 Factors",
           cex=0.75)

##################################
# computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("POTL","ACAD","EXPR","JFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("POTL", "ACAD", "EXPR", 
##     "JFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.88    0.87      0.66 7.6 0.028  7.3 1.2     0.69
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.81  0.88  0.93
## Duhachek  0.82  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## POTL      0.79      0.80    0.74      0.58 4.1    0.051 0.0063  0.54
## ACAD      0.84      0.84    0.80      0.64 5.4    0.040 0.0111  0.70
## EXPR      0.86      0.86    0.84      0.68 6.4    0.036 0.0169  0.71
## JFIT      0.89      0.89    0.85      0.72 7.8    0.028 0.0043  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## POTL 50  0.93  0.93  0.92   0.87  7.3 1.4
## ACAD 50  0.86  0.87  0.83   0.76  7.4 1.3
## EXPR 50  0.83  0.84  0.75   0.71  7.3 1.4
## JFIT 50  0.82  0.80  0.71   0.65  7.0 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## POTL 0.00 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## ACAD 0.00 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## EXPR 0.00 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0

alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM","LIKE","APPR","SCON","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM", "LIKE", 
##     "APPR", "SCON", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.89     0.9      0.57 7.8 0.026  7.1 1.1     0.53
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.82  0.88  0.92
## Duhachek  0.83  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ORGN      0.84      0.86    0.85      0.55 6.0    0.036 0.014  0.49
## COMM      0.85      0.87    0.86      0.56 6.4    0.034 0.012  0.53
## LIKE      0.86      0.86    0.88      0.56 6.4    0.030 0.019  0.52
## APPR      0.86      0.87    0.89      0.57 6.7    0.030 0.019  0.51
## SCON      0.86      0.87    0.87      0.57 6.5    0.029 0.015  0.53
## CFIT      0.87      0.88    0.90      0.59 7.3    0.029 0.020  0.53
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.87  0.84  0.84   0.78  6.9 1.6
## COMM 50  0.84  0.81  0.79   0.74  6.9 1.5
## LIKE 50  0.78  0.81  0.77   0.69  7.4 1.1
## APPR 50  0.75  0.79  0.73   0.67  7.4 1.0
## SCON 50  0.76  0.80  0.76   0.67  7.3 1.2
## CFIT 50  0.78  0.75  0.66   0.64  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.10 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.20 0.30 0.28 0.06 0.02    0
## LIKE 0.00 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0
## APPR 0.00 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0
## SCON 0.00 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0

alpha(DPA.Descriptors.Numeric[,c("RESM","LETT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("RESM", "LETT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_ML_V_3F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "mle",
                                  nfac = 3,
                                  rotation = "varimax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_ML_V_3F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

##################################
# Conducting exploratory factor analysis
# using Maximum Likelihood extraction
# and Varimax rotation
# with 4 factors
##################################
(FA_ML_V_4F <- fa(DPA.Descriptors.Numeric,
              nfactors = 4,
              fm="ml",
              rotate = "varimax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  ml
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 4, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "varimax", residuals = TRUE, SMC = TRUE, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML3  ML4   ML2  ML1   h2    u2 com
## ACAD 0.49 0.38  0.12 0.24 0.45 0.546 2.5
## APPR 0.18 0.72  0.29 0.22 0.68 0.325 1.7
## COMM 0.21 0.26  0.79 0.23 0.79 0.206 1.6
## CFIT 0.78 0.16  0.42 0.19 0.85 0.148 1.8
## EXPR 0.50 0.30 -0.07 0.35 0.46 0.535 2.6
## JFIT 0.87 0.19  0.29 0.19 0.91 0.089 1.4
## LETT 0.27 0.06  0.22 0.81 0.78 0.224 1.4
## LIKE 0.31 0.67  0.27 0.17 0.65 0.350 1.9
## ORGN 0.23 0.30  0.91 0.06 0.97 0.027 1.4
## POTL 0.66 0.40  0.15 0.18 0.65 0.347 1.9
## RESM 0.22 0.33  0.10 0.91 1.00 0.005 1.4
## SCON 0.27 0.81  0.19 0.10 0.78 0.220 1.4
## 
##                        ML3  ML4  ML2  ML1
## SS loadings           2.71 2.36 2.00 1.91
## Proportion Var        0.23 0.20 0.17 0.16
## Cumulative Var        0.23 0.42 0.59 0.75
## Proportion Explained  0.30 0.26 0.22 0.21
## Cumulative Proportion 0.30 0.56 0.79 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 4 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 24  and the objective function was  1.36 
## 
## The root mean square of the residuals (RMSR) is  0.06 
## The df corrected root mean square of the residuals is  0.1 
## 
## The harmonic n.obs is  50 with the empirical chi square  22.17  with prob <  0.57 
## The total n.obs was  50  with Likelihood Chi Square =  56.38  with prob <  2e-04 
## 
## Tucker Lewis Index of factoring reliability =  0.764
## RMSEA index =  0.163  and the 90 % confidence intervals are  0.11 0.223
## BIC =  -37.51
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    ML3  ML4  ML2  ML1
## Correlation of (regression) scores with factors   0.95 0.91 0.97 0.99
## Multiple R square of scores with factors          0.91 0.83 0.95 0.98
## Minimum correlation of possible factor scores     0.82 0.65 0.90 0.95

(FA_ML_V_4F_Summary <- FA_ML_V_4F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (varimax-rotation)
## 
## Variable |  ML3 |  ML4 |  ML2 |  ML1 | Complexity | Uniqueness
## --------------------------------------------------------------
## JFIT     | 0.87 |      |      |      |       1.43 |       0.09
## CFIT     | 0.78 |      |      |      |       1.79 |       0.15
## POTL     | 0.66 |      |      |      |       1.94 |       0.35
## EXPR     | 0.50 |      |      |      |       2.56 |       0.54
## ACAD     | 0.49 |      |      |      |       2.54 |       0.55
## SCON     |      | 0.81 |      |      |       1.37 |       0.22
## APPR     |      | 0.72 |      |      |       1.66 |       0.32
## LIKE     |      | 0.67 |      |      |       1.90 |       0.35
## ORGN     |      |      | 0.91 |      |       1.37 |       0.03
## COMM     |      |      | 0.79 |      |       1.58 |       0.21
## RESM     |      |      |      | 0.91 |       1.41 |   5.00e-03
## LETT     |      |      |      | 0.81 |       1.39 |       0.22
## 
## The 4 latent factors (varimax rotation) accounted for 74.82% of the total variance of the original data (ML3 = 22.58%, ML4 = 19.64%, ML2 = 16.70%, ML1 = 15.90%).

summary(FA_ML_V_4F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   ML3 |   ML4 |   ML2 |   ML1
## ---------------------------------------------------------------
## Eigenvalues                     | 6.146 | 1.248 | 0.890 | 0.846
## Variance Explained              | 0.226 | 0.196 | 0.167 | 0.159
## Variance Explained (Cumulative) | 0.226 | 0.422 | 0.589 | 0.748
## Variance Explained (Proportion) | 0.302 | 0.263 | 0.223 | 0.212

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_ML_V_4F_Residual <- residuals(FA_ML_V_4F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.03    NA                                                      
## COMM  0.05 -0.01    NA                                                
## CFIT -0.04  0.00  0.00    NA                                          
## EXPR  0.24 -0.05 -0.02 -0.08    NA                                    
## JFIT -0.04  0.00  0.00  0.01 -0.01    NA                              
## LETT -0.04 -0.03 -0.03 -0.01  0.06  0.00    NA                        
## LIKE -0.08 -0.01  0.00 -0.01 -0.09  0.03  0.04    NA                  
## ORGN  0.00  0.00  0.00  0.00  0.01  0.00  0.01  0.00    NA            
## POTL  0.26  0.03  0.02 -0.02  0.20 -0.02  0.00 -0.05  0.00    NA      
## RESM  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00    NA
## SCON -0.08 -0.01 -0.01  0.02  0.00  0.00 -0.01  0.05  0.00 -0.05  0.00
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_ML_V_4F_RMS <- FA_ML_V_4F$rms)

## [1] 0.05795189

(FA_ML_V_4F_TLI <- FA_ML_V_4F$TLI)

## [1] 0.7644223

(FA_ML_V_4F_BIC <- FA_ML_V_4F$BIC)

## [1] -37.50857

(FA_ML_V_4F_MaxResidual   <- max(abs(FA_ML_V_4F_Residual),na.rm=TRUE))

## [1] 0.2590579

(FA_ML_V_4F_HighResidual  <- sum(FA_ML_V_4F_Residual>abs(0.05),na.rm=TRUE))

## [1] 10

(FA_ML_V_4F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_ML_V_4F_HighResidualRate <- FA_ML_V_4F_HighResidual/FA_ML_V_4F_TotalResidual)

## [1] 0.1515152
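
The high residual count above compares the signed residuals against the cutoff, so residuals below -0.05 are not counted. In addition, because the residual object appears to hold the full symmetric matrix, each qualifying pair seems to be counted twice before dividing by the 66 unique pairs. A minimal sketch of an alternative that counts absolute residuals over unique pairs only is given below. The helper `high_residual_rate` and the toy matrix are hypothetical illustrations and are not part of the original analysis or of the psych package.

```r
# Hypothetical helper: share of unique off-diagonal residuals whose
# absolute value exceeds the cutoff (assumes a symmetric residual matrix).
high_residual_rate <- function(resid_mat, cutoff = 0.05) {
  resid_mat <- unclass(resid_mat)                     # drop any S3 class, keep the matrix
  unique_pairs <- resid_mat[lower.tri(resid_mat)]     # one value per variable pair
  mean(abs(unique_pairs) > cutoff, na.rm = TRUE)
}

# Toy symmetric residual matrix (made-up values, diagonal set to NA)
toy_residuals <- matrix(c(   NA,  0.24, -0.08,  0.01,
                           0.24,    NA,  0.03, -0.06,
                          -0.08,  0.03,    NA,  0.00,
                           0.01, -0.06,  0.00,    NA),
                        nrow = 4, byrow = TRUE)

high_residual_rate(toy_residuals)  # 3 of the 6 unique pairs exceed 0.05 in absolute value
```

Applied to a fitted model, such a helper would generally give a different rate than the one reported here, so the reported values should be read with the original counting rule in mind.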

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_ML_V_4F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Varimax Rotation : 4 Factors",
           cex=0.75)

##################################
# computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("JFIT","CFIT","POTL","EXPR","ACAD")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("JFIT", "CFIT", "POTL", 
##     "EXPR", "ACAD")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##        0.9       0.9    0.92      0.64 8.8 0.024  7.2 1.2     0.66
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84   0.9  0.94
## Duhachek  0.85   0.9  0.94
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## JFIT      0.86      0.87    0.86      0.62 6.5    0.034 0.021  0.66
## CFIT      0.88      0.88    0.87      0.66 7.6    0.028 0.011  0.69
## POTL      0.85      0.85    0.88      0.58 5.6    0.036 0.029  0.53
## EXPR      0.89      0.89    0.91      0.68 8.5    0.025 0.022  0.68
## ACAD      0.88      0.88    0.90      0.64 7.3    0.029 0.028  0.68
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## JFIT 50  0.88  0.87  0.86   0.80  7.0 1.6
## CFIT 50  0.84  0.82  0.80   0.72  6.9 1.6
## POTL 50  0.91  0.92  0.90   0.85  7.3 1.4
## EXPR 50  0.76  0.78  0.71   0.64  7.3 1.4
## ACAD 50  0.81  0.83  0.78   0.72  7.4 1.3
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0
## POTL 0.00 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0
## ACAD 0.00 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0

alpha(DPA.Descriptors.Numeric[,c("SCON","APPR","LIKE")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("SCON", "APPR", "LIKE")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.83       0.7 6.9 0.031  7.4 0.99      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.80  0.87  0.92
## Duhachek  0.81  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## SCON      0.78      0.78    0.64      0.64 3.5    0.062    NA  0.64
## APPR      0.86      0.86    0.75      0.75 6.1    0.040    NA  0.75
## LIKE      0.82      0.82    0.70      0.70 4.6    0.051    NA  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## SCON 50  0.92  0.91  0.86   0.80  7.3 1.2
## APPR 50  0.86  0.87  0.76   0.71  7.4 1.0
## LIKE 50  0.90  0.89  0.82   0.76  7.4 1.1
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## SCON 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## APPR 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0
## LIKE 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0

alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.92      0.93    0.86      0.86  12 0.021  6.9 1.5     0.86
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.87  0.92  0.96
## Duhachek  0.88  0.92  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ORGN      0.92      0.86    0.74      0.86 6.2       NA     0  0.86
## COMM      0.80      0.86    0.74      0.86 6.2       NA     0  0.86
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.97  0.96   0.9   0.86  6.9 1.6
## COMM 50  0.96  0.96   0.9   0.86  6.9 1.5
## 
## Non missing response frequency for each item
##         3    4    5   6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.1 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.2 0.30 0.28 0.06 0.02    0

alpha(DPA.Descriptors.Numeric[,c("RESM","LETT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("RESM", "LETT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0
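
The alpha coefficients reported above can be related to the standard Cronbach formulas. The sketch below, in base R, computes raw alpha from an item covariance matrix and standardized alpha from the number of items and the average inter-item correlation. The function names `cronbach_alpha` and `std_alpha` are hypothetical; the average correlations plugged in are the rounded `average_r` values printed above, so results agree with `std.alpha` only up to rounding.

```r
# Raw Cronbach's alpha from the item covariance matrix (hypothetical helper)
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_cov <- cov(items)
  (k / (k - 1)) * (1 - sum(diag(item_cov)) / sum(item_cov))
}

# Standardized alpha from the number of items and the mean inter-item correlation
std_alpha <- function(k, mean_r) (k * mean_r) / (1 + (k - 1) * mean_r)

std_alpha(2, 0.84)  # RESM, LETT: about 0.91
std_alpha(2, 0.86)  # ORGN, COMM: about 0.92 to 0.93
std_alpha(3, 0.70)  # SCON, APPR, LIKE: about 0.87 to 0.88

# Sanity check of the raw formula: two identical items are perfectly consistent
set.seed(1)
x <- rnorm(20)
cronbach_alpha(cbind(x, x))
```

With only two items per factor, alpha is driven entirely by a single correlation, which is worth keeping in mind when comparing reliabilities across factors of different sizes.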

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_ML_V_4F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "mle",
                                  nfac = 4,
                                  rotation = "varimax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_ML_V_4F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

##################################
# Conducting exploratory factor analysis
# using Maximum Likelihood extraction
# and Varimax rotation
# with 5 factors
##################################
(FA_ML_V_5F <- fa(DPA.Descriptors.Numeric,
              nfactors = 5,
              fm="ml",
              rotate = "varimax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  ml
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 5, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "varimax", residuals = TRUE, SMC = TRUE, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML5  ML4   ML2  ML1  ML3   h2    u2 com
## ACAD 0.81 0.20  0.20 0.17 0.14 0.78 0.218 1.4
## APPR 0.28 0.65  0.31 0.21 0.05 0.65 0.353 2.1
## COMM 0.14 0.25  0.79 0.24 0.20 0.80 0.196 1.6
## CFIT 0.30 0.22  0.37 0.21 0.72 0.85 0.154 2.3
## EXPR 0.69 0.19 -0.04 0.29 0.20 0.64 0.362 1.7
## JFIT 0.37 0.23  0.23 0.21 0.81 0.95 0.050 1.9
## LETT 0.19 0.07  0.20 0.81 0.21 0.77 0.229 1.4
## LIKE 0.17 0.71  0.24 0.18 0.25 0.68 0.319 1.8
## ORGN 0.10 0.31  0.89 0.07 0.25 0.96 0.035 1.5
## POTL 0.79 0.27  0.18 0.12 0.35 0.87 0.132 1.8
## RESM 0.27 0.30  0.10 0.90 0.10 1.00 0.005 1.5
## SCON 0.22 0.87  0.17 0.09 0.16 0.87 0.130 1.3
## 
##                        ML5  ML4  ML2  ML1  ML3
## SS loadings           2.29 2.18 1.92 1.82 1.61
## Proportion Var        0.19 0.18 0.16 0.15 0.13
## Cumulative Var        0.19 0.37 0.53 0.68 0.82
## Proportion Explained  0.23 0.22 0.20 0.19 0.16
## Cumulative Proportion 0.23 0.45 0.65 0.84 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 5 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 16  and the objective function was  0.42 
## 
## The root mean square of the residuals (RMSR) is  0.02 
## The df corrected root mean square of the residuals is  0.04 
## 
## The harmonic n.obs is  50 with the empirical chi square  2.44  with prob <  1 
## The total n.obs was  50  with Likelihood Chi Square =  17.27  with prob <  0.37 
## 
## Tucker Lewis Index of factoring reliability =  0.986
## RMSEA index =  0.034  and the 90 % confidence intervals are  0 0.141
## BIC =  -45.32
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                    ML5  ML4  ML2  ML1  ML3
## Correlation of (regression) scores with factors   0.92 0.93 0.97 0.99 0.95
## Multiple R square of scores with factors          0.85 0.86 0.94 0.98 0.90
## Minimum correlation of possible factor scores     0.71 0.72 0.88 0.96 0.80

(FA_ML_V_5F_Summary <- FA_ML_V_5F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (varimax-rotation)
## 
## Variable |  ML5 |  ML4 |  ML2 |  ML1 |  ML3 | Complexity | Uniqueness
## ---------------------------------------------------------------------
## ACAD     | 0.81 |      |      |      |      |       1.40 |       0.22
## POTL     | 0.79 |      |      |      |      |       1.84 |       0.13
## EXPR     | 0.69 |      |      |      |      |       1.71 |       0.36
## SCON     |      | 0.87 |      |      |      |       1.30 |       0.13
## LIKE     |      | 0.71 |      |      |      |       1.81 |       0.32
## APPR     |      | 0.65 |      |      |      |       2.14 |       0.35
## ORGN     |      |      | 0.89 |      |      |       1.45 |       0.04
## COMM     |      |      | 0.79 |      |      |       1.62 |       0.20
## RESM     |      |      |      | 0.90 |      |       1.47 |   4.99e-03
## LETT     |      |      |      | 0.81 |      |       1.40 |       0.23
## JFIT     |      |      |      |      | 0.81 |       1.94 |       0.05
## CFIT     |      |      |      |      | 0.72 |       2.34 |       0.15
## 
## The 5 latent factors (varimax rotation) accounted for 81.81% of the total variance of the original data (ML5 = 19.05%, ML4 = 18.17%, ML2 = 15.99%, ML1 = 15.16%, ML3 = 13.44%).

summary(FA_ML_V_5F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   ML5 |   ML4 |   ML2 |   ML1 |   ML3
## -----------------------------------------------------------------------
## Eigenvalues                     | 6.215 | 1.301 | 0.958 | 0.877 | 0.477
## Variance Explained              | 0.190 | 0.182 | 0.160 | 0.152 | 0.134
## Variance Explained (Cumulative) | 0.190 | 0.372 | 0.532 | 0.684 | 0.818
## Variance Explained (Proportion) | 0.233 | 0.222 | 0.195 | 0.185 | 0.164
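
The variance rows above follow directly from the sums of squared loadings. As a small arithmetic sketch, using the rounded SS loadings printed for the 5-factor Varimax model and the 12 descriptors (variable names here are illustrative only):

```r
# Rounded SS loadings copied from the 5-factor output above
ss_loadings <- c(ML5 = 2.29, ML4 = 2.18, ML2 = 1.92, ML1 = 1.82, ML3 = 1.61)
n_items <- 12  # number of observed descriptors

proportion_var       <- ss_loadings / n_items           # share of total variance per factor
cumulative_var       <- cumsum(proportion_var)          # running total across factors
proportion_explained <- ss_loadings / sum(ss_loadings)  # share of the common variance

round(rbind(proportion_var, cumulative_var, proportion_explained), 3)
```

Because the inputs are rounded to two decimals, the results match the printed table only to about two decimal places.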

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_ML_V_5F_Residual <- residuals(FA_ML_V_5F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.01    NA                                                      
## COMM  0.02 -0.01    NA                                                
## CFIT  0.00  0.00  0.00    NA                                          
## EXPR  0.00 -0.07 -0.03 -0.05    NA                                    
## JFIT  0.00  0.00  0.00  0.00  0.01    NA                              
## LETT -0.04 -0.04 -0.03 -0.01  0.07  0.00    NA                        
## LIKE  0.01  0.01  0.01 -0.03 -0.04  0.01  0.03    NA                  
## ORGN  0.00  0.00  0.00  0.00  0.01  0.00  0.01  0.00    NA            
## POTL  0.00  0.01  0.00  0.01  0.01  0.00  0.01  0.01  0.00    NA      
## RESM  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00    NA
## SCON -0.01 -0.01  0.00  0.01  0.04  0.00 -0.01  0.00  0.00  0.00  0.00
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_ML_V_5F_RMS <- FA_ML_V_5F$rms)

## [1] 0.01923904

(FA_ML_V_5F_TLI <- FA_ML_V_5F$TLI)

## [1] 0.9858829

(FA_ML_V_5F_BIC <- FA_ML_V_5F$BIC)

## [1] -45.32318

(FA_ML_V_5F_MaxResidual   <- max(abs(FA_ML_V_5F_Residual),na.rm=TRUE))

## [1] 0.0733741

(FA_ML_V_5F_HighResidual  <- sum(FA_ML_V_5F_Residual>abs(0.05),na.rm=TRUE))

## [1] 2

(FA_ML_V_5F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_ML_V_5F_HighResidualRate <- FA_ML_V_5F_HighResidual/FA_ML_V_5F_TotalResidual)

## [1] 0.03030303
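
The model degrees of freedom and BIC values printed above can be cross-checked by hand. The sketch below assumes the usual exploratory factor analysis degrees-of-freedom formula and a BIC of the form chi-square minus df times log(n), which reproduces the reported values to two decimals. The helper names `efa_df` and `efa_bic` are hypothetical, and the chi-square inputs are the rounded likelihood chi-square values from the output.

```r
# Degrees of freedom for an EFA model with p variables and k factors
efa_df <- function(p, k) ((p - k)^2 - (p + k)) / 2

# BIC computed from the likelihood chi-square, model df and sample size
efa_bic <- function(chi_square, df, n_obs) chi_square - df * log(n_obs)

efa_df(12, 3:5)                                    # 33 24 16

bic_4f <- efa_bic(56.38, efa_df(12, 4), n_obs = 50)  # about -37.51
bic_5f <- efa_bic(17.27, efa_df(12, 5), n_obs = 50)  # about -45.32
c(bic_4f = bic_4f, bic_5f = bic_5f)
```

Lower (more negative) BIC values favor the 5-factor model here, although with only 50 observations the comparison should be interpreted cautiously.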

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_ML_V_5F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Varimax Rotation : 5 Factors",
           cex=0.75)

##################################
# computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("ACAD","POTL","EXPR")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ACAD", "POTL", "EXPR")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.89      0.89    0.85      0.72 7.8 0.028  7.3 1.2      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.82  0.89  0.93
## Duhachek  0.83  0.89  0.94
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ACAD      0.83      0.83    0.70      0.70 4.7    0.049    NA  0.70
## POTL      0.80      0.80    0.67      0.67 4.0    0.056    NA  0.67
## EXPR      0.88      0.89    0.80      0.80 7.8    0.032    NA  0.80
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ACAD 50  0.91  0.91  0.85   0.79  7.4 1.3
## POTL 50  0.92  0.92  0.88   0.82  7.3 1.4
## EXPR 50  0.88  0.88  0.76   0.72  7.3 1.4
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## ACAD 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## POTL 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0

alpha(DPA.Descriptors.Numeric[,c("SCON","LIKE","APPR")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("SCON", "LIKE", "APPR")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.83       0.7 6.9 0.031  7.4 0.99      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.80  0.87  0.92
## Duhachek  0.81  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## SCON      0.78      0.78    0.64      0.64 3.5    0.062    NA  0.64
## LIKE      0.82      0.82    0.70      0.70 4.6    0.051    NA  0.70
## APPR      0.86      0.86    0.75      0.75 6.1    0.040    NA  0.75
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## SCON 50  0.92  0.91  0.86   0.80  7.3 1.2
## LIKE 50  0.90  0.89  0.82   0.76  7.4 1.1
## APPR 50  0.86  0.87  0.76   0.71  7.4 1.0
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## SCON 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## LIKE 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0
## APPR 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0

alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.92      0.93    0.86      0.86  12 0.021  6.9 1.5     0.86
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.87  0.92  0.96
## Duhachek  0.88  0.92  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ORGN      0.92      0.86    0.74      0.86 6.2       NA     0  0.86
## COMM      0.80      0.86    0.74      0.86 6.2       NA     0  0.86
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.97  0.96   0.9   0.86  6.9 1.6
## COMM 50  0.96  0.96   0.9   0.86  6.9 1.5
## 
## Non missing response frequency for each item
##         3    4    5   6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.1 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.2 0.30 0.28 0.06 0.02    0

alpha(DPA.Descriptors.Numeric[,c("RESM","LETT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("RESM", "LETT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0

alpha(DPA.Descriptors.Numeric[,c("JFIT","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("JFIT", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.94      0.94    0.88      0.88  15 0.018    7 1.6     0.88
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.89  0.94  0.96
## Duhachek  0.90  0.94  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## JFIT      0.87      0.88    0.78      0.88 7.5       NA     0  0.88
## CFIT      0.90      0.88    0.78      0.88 7.5       NA     0  0.88
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## JFIT 50  0.97  0.97  0.91   0.88  7.0 1.6
## CFIT 50  0.97  0.97  0.91   0.88  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_ML_V_5F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "mle",
                                  nfac = 5,
                                  rotation = "varimax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_ML_V_5F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

par(mfrow=c(1,3))
fa.diagram(FA_ML_V_3F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Varimax Rotation : 3 Factors",
           cex=0.75)
fa.diagram(FA_ML_V_4F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Varimax Rotation : 4 Factors",
           cex=0.75)
fa.diagram(FA_ML_V_5F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Varimax Rotation : 5 Factors",
           cex=0.75)

1.6.4 Maximum Likelihood Factor Extraction and Promax Rotation (FA_ML_P)

Maximum Likelihood Factor Extraction estimates the factor loadings and unique variances that make the observed data most probable under an assumed common factor model, typically with multivariate normality assumed. Given the correlation matrix, the algorithm formulates a likelihood function that measures how closely the correlations implied by the model reproduce the observed correlations. Optimization techniques then iteratively adjust the parameter estimates until the likelihood is maximized. The resulting factor loadings indicate the strength and direction of the relationship between each variable and each factor. Factors are interpreted from the loading patterns: variables with high loadings on a factor are strongly associated with that factor.
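
As a minimal sketch of the extraction step in isolation, the example below fits an unrotated maximum likelihood factor model with base R's `factanal` on simulated data and checks how well the implied correlation matrix reproduces the observed one. The simulated variables (`a1` to `b3`) and the two-factor structure are assumptions made purely for illustration; the analysis in this document uses `psych::fa` on the JobHiring descriptors instead.

```r
# Simulate 6 indicators driven by 2 independent latent factors (illustrative only)
set.seed(2024)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
sim_data <- cbind(a1 = 0.8 * f1 + rnorm(n, sd = 0.6),
                  a2 = 0.7 * f1 + rnorm(n, sd = 0.7),
                  a3 = 0.6 * f1 + rnorm(n, sd = 0.8),
                  b1 = 0.8 * f2 + rnorm(n, sd = 0.6),
                  b2 = 0.7 * f2 + rnorm(n, sd = 0.7),
                  b3 = 0.6 * f2 + rnorm(n, sd = 0.8))

# Maximum likelihood extraction without rotation
ml_fit <- factanal(sim_data, factors = 2, rotation = "none")

# Correlation matrix implied by the model: loadings x loadings' + uniquenesses
ml_loadings <- unclass(ml_fit$loadings)
implied_cor <- ml_loadings %*% t(ml_loadings) + diag(ml_fit$uniquenesses)

# Largest discrepancy between implied and observed correlations
max(abs(implied_cor - cor(sim_data)))
```

A small maximum discrepancy indicates that the estimated loadings and uniquenesses reproduce the observed correlations closely, which is the sense in which the likelihood measures model fit.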

Promax Rotation is an oblique rotation method that allows the factors to be correlated. The algorithm starts from an orthogonal (typically Varimax) solution, raises the loadings to a power to form a target matrix in which small loadings shrink toward zero, and then finds the oblique transformation that best matches this target in the least-squares sense. The result is usually a simpler loading pattern than an orthogonal rotation can provide when the factors are expected to be correlated. Because the rotation is oblique, the reported loadings are pattern coefficients (regression-type weights) rather than correlations, so values above 1 can occur, and the correlations between factors are reported alongside the loadings.
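
The contrast between an orthogonal and an oblique rotation can be sketched with base R's `varimax` and `promax` functions. The example below uses simulated data with two correlated latent factors, which is an assumption made for illustration only; the power `m = 4` is the base R default and may differ from the setting used by `psych::fa` in this document.

```r
# Simulate 6 indicators driven by 2 correlated latent factors (illustrative only)
set.seed(2024)
n  <- 500
f1 <- rnorm(n)
f2 <- 0.6 * f1 + 0.8 * rnorm(n)   # population correlation with f1 of 0.6
sim_data <- cbind(a1 = 0.8 * f1 + rnorm(n, sd = 0.6),
                  a2 = 0.7 * f1 + rnorm(n, sd = 0.7),
                  a3 = 0.6 * f1 + rnorm(n, sd = 0.8),
                  b1 = 0.8 * f2 + rnorm(n, sd = 0.6),
                  b2 = 0.7 * f2 + rnorm(n, sd = 0.7),
                  b3 = 0.6 * f2 + rnorm(n, sd = 0.8))

unrotated <- loadings(factanal(sim_data, factors = 2, rotation = "none"))

# Orthogonal rotation: the rotation matrix is orthonormal, factors stay uncorrelated
varimax_fit <- varimax(unrotated)

# Oblique rotation: the transformation is no longer orthonormal
promax_fit <- promax(unrotated, m = 4)

# Factor correlation matrix implied by the promax transformation
factor_cor <- solve(crossprod(promax_fit$rotmat))
round(factor_cor, 2)
```

The off-diagonal entry of `factor_cor` plays the same role as the "With factor correlations of" block printed by `fa` for the Promax models.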

[A] Applying Maximum Likelihood factor extraction and Promax rotation, an evaluation was conducted using a set of empirical guidelines to determine the optimal number of factors to be retained for exploratory factor analysis. It was determined that:
     [A.1] 4 factors would be sufficient for an optimal balance between comprehensiveness and parsimony.
     [A.2] To ensure that both under-extraction and over-extraction are assessed, models with 3, 4 and 5 factors were sequentially evaluated for their interpretability and theoretical meaningfulness.
     [A.3] The choice of 4 factors was supported by the maximum consensus of 5 (Bentler, Beta, Parallel Analysis, Kaiser Criterion and Standardized Scree) out of 19 methods (26.32%).
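
The consensus figure in [A.3] is a simple vote count across methods. As a sketch, the vector below copies the number of factors suggested by each of the 19 methods in the method agreement output for this section, then tallies the most frequent choice (variable names are illustrative only):

```r
# Number of factors suggested by each of the 19 methods (copied from the output)
n_factors_by_method <- c(1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7)

votes <- table(n_factors_by_method)

# Most frequently suggested number of factors and its share of all methods
consensus_n_factors <- as.integer(names(votes)[which.max(votes)])
consensus_support   <- round(100 * max(votes) / length(n_factors_by_method), 2)

consensus_n_factors  # 4
consensus_support    # 26.32
```

Note that 6 factors received 4 votes, only one fewer than the consensus choice, which is one reason neighboring factor counts were also evaluated.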

[B] Results for the exploratory factor analysis using a 3-Factor Structure were as follows:
     [B.1] Standardized Root Mean Square of the Residual = 0.08
     [B.2] Tucker-Lewis Fit Index = 0.69
     [B.3] Bayesian Information Criterion = -35.60
     [B.4] High Residual Rate = 0.24
     [B.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [B.5.1] POTL: Loading = 1.08, Communality = 0.92
            [B.5.2] ACAD: Loading = 0.88, Communality = 0.67
            [B.5.3] EXPR: Loading = 0.85, Communality = 0.61
            [B.5.4] JFIT: Loading = 0.60, Communality = 0.62
            [B.5.5] Cronbach’s Alpha = 0.88
     [B.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [B.6.1] ORGN: Loading = 1.18, Communality = 0.96
            [B.6.2] COMM: Loading = 0.99, Communality = 0.80
            [B.6.3] CFIT: Loading = 0.47, Communality = 0.62
            [B.6.4] LIKE: Loading = 0.41, Communality = 0.44
            [B.6.5] APPR: Loading = 0.40, Communality = 0.46
            [B.6.6] SCON: Loading = 0.35, Communality = 0.40
            [B.6.7] Cronbach’s Alpha = 0.88
     [B.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [B.7.1] RESM: Loading = 1.06, Communality = 1.00
            [B.7.2] LETT: Loading = 0.87, Communality = 0.71
            [B.7.3] Cronbach’s Alpha = 0.91
     [B.8] Correlation between factors ranged from 0.58 to 0.67
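
Pattern loadings above 1, such as those of POTL, ORGN and RESM in the 3-factor structure, are compatible with communalities of at most 1 because the factors are correlated. The sketch below uses the rounded pattern loadings and factor correlations printed for the 3-factor Promax model to compute the structure loadings (variable-factor correlations) and the communalities. The object names are illustrative, and results agree with the printed h2 values only up to rounding of the inputs.

```r
# Rounded pattern loadings for three descriptors (from the 3-factor Promax output)
pattern <- matrix(c( 1.08, -0.02, -0.18,
                    -0.20,  1.18, -0.18,
                    -0.02, -0.10,  1.06),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("POTL", "ORGN", "RESM"),
                                  c("ML3", "ML2", "ML1")))

# Rounded factor correlation matrix from the same output
factor_cor <- matrix(c(1.00, 0.67, 0.64,
                       0.67, 1.00, 0.58,
                       0.64, 0.58, 1.00),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(colnames(pattern), colnames(pattern)))

# Structure loadings: correlations between variables and factors
structure <- pattern %*% factor_cor

# Communality of each variable: diagonal of pattern x factor_cor x pattern'
communality <- rowSums(structure * pattern)

round(structure, 2)
round(communality, 2)
```

The structure loadings stay within the range of a correlation even though the corresponding pattern loadings exceed 1.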

[C] Results for the exploratory factor analysis using a 4-Factor Structure were as follows:
     [C.1] Standardized Root Mean Square of the Residual = 0.06
     [C.2] Tucker-Lewis Fit Index = 0.76
     [C.3] Bayesian Information Criterion = -37.51
     [C.4] High Residual Rate = 0.15
     [C.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [C.5.1] JFIT: Loading = 1.04, Communality = 0.91
            [C.5.2] CFIT: Loading = 0.91, Communality = 0.85
            [C.5.3] POTL: Loading = 0.70, Communality = 0.65
            [C.5.4] EXPR: Loading = 0.47, Communality = 0.46
            [C.5.5] ACAD: Loading = 0.44, Communality = 0.45
            [C.5.6] Cronbach’s Alpha = 0.90
     [C.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [C.6.1] SCON: Loading = 0.93, Communality = 0.78
            [C.6.2] APPR: Loading = 0.80, Communality = 0.68
            [C.6.3] LIKE: Loading = 0.70, Communality = 0.65
            [C.6.4] Cronbach’s Alpha = 0.87
     [C.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [C.7.1] ORGN: Loading = 0.99, Communality = 0.97
            [C.7.2] COMM: Loading = 0.90, Communality = 0.79
            [C.7.3] Cronbach’s Alpha = 0.92
     [C.8] Factor 4 was a latent variable with higher loading towards the following descriptors:
            [C.8.1] RESM: Loading = 0.89, Communality = 1.00
            [C.8.2] LETT: Loading = 0.76, Communality = 0.78
            [C.8.3] Cronbach’s Alpha = 0.91
     [C.9] Correlation between factors ranged from 0.27 to 0.66

[D] Results for the exploratory factor analysis using a 5-Factor Structure were as follows:
     [D.1] Standardized Root Mean Square of the Residual = 0.02
     [D.2] Tucker-Lewis Fit Index = 0.99
     [D.3] Bayesian Information Criterion = -45.32
     [D.4] High Residual Rate = 0.03
     [D.5] Factor 1 was a latent variable with higher loading towards the following descriptors:
            [D.5.1] ACAD: Loading = 0.95, Communality = 0.78
            [D.5.2] POTL: Loading = 0.82, Communality = 0.87
            [D.5.3] EXPR: Loading = 0.73, Communality = 0.64
            [D.5.4] Cronbach’s Alpha = 0.89
     [D.6] Factor 2 was a latent variable with higher loading towards the following descriptors:
            [D.6.1] SCON: Loading = 1.02, Communality = 0.87
            [D.6.2] LIKE: Loading = 0.76, Communality = 0.68
            [D.6.3] APPR: Loading = 0.64, Communality = 0.65
            [D.6.4] Cronbach’s Alpha = 0.87
     [D.7] Factor 3 was a latent variable with higher loading towards the following descriptors:
            [D.7.1] ORGN: Loading = 0.96, Communality = 0.96
            [D.7.2] COMM: Loading = 0.85, Communality = 0.80
            [D.7.3] Cronbach’s Alpha = 0.92
     [D.8] Factor 4 was a latent variable with higher loading towards the following descriptors:
            [D.8.1] RESM: Loading = 0.93, Communality = 1.00
            [D.8.2] LETT: Loading = 0.79, Communality = 0.77
            [D.8.3] Cronbach’s Alpha = 0.91
     [D.9] Factor 5 was a latent variable with higher loading towards the following descriptors:
            [D.9.1] JFIT: Loading = 0.96, Communality = 0.95
            [D.9.2] CFIT: Loading = 0.87, Communality = 0.85
            [D.9.3] Cronbach’s Alpha = 0.94
     [D.10] Correlation between factors ranged from 0.39 to 0.62

Code Chunk | Output

##################################
# Implementing various procedures for determining
# factor retention based on
# the maximum consensus between methods
##################################
(FA_ML_P_MethodAgreementProcedure <- parameters::n_factors(DPA.Descriptors.Numeric,
                                                           algorithm = "mle",
                                                           rotation = "promax"))

## # Method Agreement Procedure:
## 
## The choice of 4 dimensions is supported by 5 (26.32%) methods out of 19 (Bentler, beta, Parallel analysis, Kaiser criterion, Scree (SE)).

as.data.frame(FA_ML_P_MethodAgreementProcedure)

##    n_Factors              Method              Family
## 1          1 Acceleration factor               Scree
## 2          1          Scree (R2)            Scree_SE
## 3          1    VSS complexity 1                 VSS
## 4          2 Optimal coordinates               Scree
## 5          2    VSS complexity 2                 VSS
## 6          3                 CNG                 CNG
## 7          4             Bentler             Bentler
## 8          4                beta Multiple_regression
## 9          4   Parallel analysis               Scree
## 10         4    Kaiser criterion               Scree
## 11         4          Scree (SE)            Scree_SE
## 12         5       Velicer's MAP        Velicers_MAP
## 13         5                 BIC                 BIC
## 14         6            Bartlett             Barlett
## 15         6                   t Multiple_regression
## 16         6                   p Multiple_regression
## 17         6      BIC (adjusted)                 BIC
## 18         7            Anderson             Barlett
## 19         7              Lawley             Barlett
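
The consensus reported by the method agreement procedure can be checked by tallying the `n_Factors` column. The following is a minimal sketch in base R, with the 19 values copied by hand from the table above; the variable names are illustrative and not part of the parameters package.

```r
# n_Factors values copied from the method agreement table above
n_factors_by_method <- c(1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7)

# Count how many methods support each candidate number of factors
support <- table(n_factors_by_method)

# The consensus choice is the value supported by the most methods
consensus <- as.integer(names(support)[which.max(support)])

# Share of methods (in percent) backing the consensus choice
consensus_share <- round(100 * max(support) / length(n_factors_by_method), 2)

print(support)
print(consensus)        # 4
print(consensus_share)  # 26.32
```

In a live session the hand-copied vector could presumably be replaced by `as.data.frame(FA_ML_P_MethodAgreementProcedure)$n_Factors`. Only 5 of the 19 methods agree on 4 factors, which is a reason to compare neighbouring solutions rather than rely on the consensus alone.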

##################################
# Conducting exploratory factor analysis
# using Maximum Likelihood extraction
# and Promax rotation
# with 3 factors
##################################
(FA_ML_P_3F <- fa(DPA.Descriptors.Numeric,
              nfactors = 3,
              fm="ml",
              rotate = "promax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  ml
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 3, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "promax", residuals = TRUE, SMC = TRUE, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        ML3   ML2   ML1   h2    u2 com
## ACAD  0.88 -0.04 -0.05 0.67 0.326 1.0
## APPR  0.19  0.40  0.19 0.46 0.543 1.9
## COMM -0.20  0.99  0.05 0.80 0.203 1.1
## CFIT  0.44  0.47 -0.05 0.62 0.377 2.0
## EXPR  0.85 -0.32  0.17 0.61 0.385 1.4
## JFIT  0.60  0.28 -0.04 0.62 0.379 1.4
## LETT -0.08  0.04  0.87 0.71 0.289 1.0
## LIKE  0.19  0.41  0.15 0.44 0.563 1.7
## ORGN -0.20  1.18 -0.18 0.96 0.043 1.1
## POTL  1.08 -0.02 -0.18 0.92 0.079 1.1
## RESM -0.02 -0.10  1.06 1.00 0.005 1.0
## SCON  0.28  0.35  0.09 0.40 0.597 2.1
## 
##                        ML3  ML2  ML1
## SS loadings           3.27 3.05 1.89
## Proportion Var        0.27 0.25 0.16
## Cumulative Var        0.27 0.53 0.68
## Proportion Explained  0.40 0.37 0.23
## Cumulative Proportion 0.40 0.77 1.00
## 
##  With factor correlations of 
##      ML3  ML2  ML1
## ML3 1.00 0.67 0.64
## ML2 0.67 1.00 0.58
## ML1 0.64 0.58 1.00
## 
## Mean item complexity =  1.4
## Test of the hypothesis that 3 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 33  and the objective function was  2.22 
## 
## The root mean square of the residuals (RMSR) is  0.08 
## The df corrected root mean square of the residuals is  0.11 
## 
## The harmonic n.obs is  50 with the empirical chi square  41.84  with prob <  0.14 
## The total n.obs was  50  with Likelihood Chi Square =  93.49  with prob <  1.1e-07 
## 
## Tucker Lewis Index of factoring reliability =  0.686
## RMSEA index =  0.19  and the 90 % confidence intervals are  0.148 0.24
## BIC =  -35.6
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy             
##                                                    ML3  ML2  ML1
## Correlation of (regression) scores with factors   0.98 0.99 1.00
## Multiple R square of scores with factors          0.96 0.97 1.00
## Minimum correlation of possible factor scores     0.91 0.95 0.99
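
Because promax is an oblique rotation, the `h2` column is not simply the sum of squared pattern loadings; the factor correlations also contribute. The sketch below assumes the standard relation h2 = l' Phi l, where l holds a variable's pattern loadings and Phi is the factor correlation matrix, and uses the rounded ACAD loadings and factor correlations printed above, so the result only approximates the reported value.

```r
# Pattern loadings for ACAD, copied from the rounded 3-factor output above
loadings_acad <- c(ML3 = 0.88, ML2 = -0.04, ML1 = -0.05)

# Factor correlation matrix, copied from the same output
phi <- matrix(c(1.00, 0.67, 0.64,
                0.67, 1.00, 0.58,
                0.64, 0.58, 1.00),
              nrow = 3, byrow = TRUE)

# Communality under an oblique rotation: h2 = l' Phi l
h2_acad <- as.numeric(t(loadings_acad) %*% phi %*% loadings_acad)

# Naive sum of squared loadings, which ignores the factor correlations
h2_naive <- sum(loadings_acad^2)

print(round(h2_acad, 3))   # about 0.677, close to the reported h2 of 0.67
print(round(h2_naive, 3))  # about 0.779, which overstates the communality here
```

The small gap between 0.677 and the reported 0.67 comes from working with two-decimal inputs.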

(FA_ML_P_3F_Summary <- FA_ML_P_3F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (promax-rotation)
## 
## Variable |  ML3 |  ML2 |  ML1 | Complexity | Uniqueness
## -------------------------------------------------------
## POTL     | 1.08 |      |      |       1.06 |       0.08
## ACAD     | 0.88 |      |      |       1.01 |       0.33
## EXPR     | 0.85 |      |      |       1.36 |       0.39
## JFIT     | 0.60 |      |      |       1.44 |       0.38
## ORGN     |      | 1.18 |      |       1.10 |       0.04
## COMM     |      | 0.99 |      |       1.09 |       0.20
## CFIT     |      | 0.47 |      |       2.02 |       0.38
## LIKE     |      | 0.41 |      |       1.70 |       0.56
## APPR     |      | 0.40 |      |       1.90 |       0.54
## SCON     |      | 0.35 |      |       2.08 |       0.60
## RESM     |      |      | 1.06 |       1.02 |   4.98e-03
## LETT     |      |      | 0.87 |       1.02 |       0.29
## 
## The 3 latent factors (promax rotation) accounted for 68.42% of the total variance of the original data (ML3 = 27.29%, ML2 = 25.39%, ML1 = 15.75%).

summary(FA_ML_P_3F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   ML3 |   ML2 |   ML1
## -------------------------------------------------------
## Eigenvalues                     | 6.079 | 1.261 | 0.922
## Variance Explained              | 0.273 | 0.254 | 0.157
## Variance Explained (Cumulative) | 0.273 | 0.527 | 0.684
## Variance Explained (Proportion) | 0.399 | 0.371 | 0.230

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_ML_P_3F_Residual <- residuals(FA_ML_P_3F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.02    NA                                                      
## COMM  0.04 -0.04    NA                                                
## CFIT -0.07 -0.10 -0.01    NA                                          
## EXPR  0.05 -0.07 -0.02 -0.06    NA                                    
## JFIT -0.08 -0.09 -0.02  0.27  0.00    NA                              
## LETT -0.06 -0.14 -0.02  0.05  0.06  0.06    NA                        
## LIKE -0.04  0.19 -0.04 -0.02 -0.06  0.04 -0.06    NA                  
## ORGN  0.00  0.00  0.00 -0.01  0.01 -0.01  0.01 -0.01    NA            
## POTL  0.01  0.01  0.00  0.00  0.00 -0.01  0.00 -0.01  0.00    NA      
## RESM  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00    NA
## SCON -0.04  0.27 -0.05 -0.03  0.03 -0.02 -0.15  0.34  0.00 -0.01  0.00
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_ML_P_3F_RMS <- FA_ML_P_3F$rms)

## [1] 0.07961754

(FA_ML_P_3F_TLI <- FA_ML_P_3F$TLI)

## [1] 0.6858498

(FA_ML_P_3F_BIC <- FA_ML_P_3F$BIC)

## [1] -35.60434

(FA_ML_P_3F_MaxResidual   <- max(abs(FA_ML_P_3F_Residual),na.rm=TRUE))

## [1] 0.3363878

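# Caution: abs(0.05) is just 0.05, so this counts only positive residuals
# above 0.05, and counts them in both triangles of the symmetric matrix;
# abs(FA_ML_P_3F_Residual) > 0.05 would also catch large negative residuals
(FA_ML_P_3F_HighResidual  <- sum(FA_ML_P_3F_Residual>abs(0.05),na.rm=TRUE))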

## [1] 16

(FA_ML_P_3F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_ML_P_3F_HighResidualRate <- FA_ML_P_3F_HighResidual/FA_ML_P_3F_TotalResidual)

## [1] 0.2424242
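
The high residual count above compares the signed residuals with `abs(0.05)`, so negative residuals are never counted, and each qualifying pair is counted once per triangle of the symmetric matrix while the denominator counts unique pairs. The following is a minimal sketch of an alternative rate based on absolute residuals over unique pairs. The helper `high_residual_rate` is hypothetical (it is not part of psych) and assumes the residual object behaves like a symmetric numeric matrix with `NA` on the diagonal.

```r
# Hypothetical helper: share of unique variable pairs whose absolute
# residual exceeds a cutoff
high_residual_rate <- function(residual_matrix, cutoff = 0.05) {
  residual_matrix <- as.matrix(unclass(residual_matrix))
  # Keep each variable pair once by taking the lower triangle only
  unique_pairs <- residual_matrix[lower.tri(residual_matrix)]
  mean(abs(unique_pairs) > cutoff, na.rm = TRUE)
}

# Toy symmetric residual matrix with NA on the diagonal
toy <- matrix(c(   NA,  0.06, -0.10,
                 0.06,    NA,  0.01,
                -0.10,  0.01,    NA),
              nrow = 3, byrow = TRUE)

# Expression in the style used above: misses -0.10 and counts 0.06 twice
original_count <- sum(toy > abs(0.05), na.rm = TRUE)

# Alternative: 2 of the 3 unique pairs exceed 0.05 in magnitude
alternative_rate <- high_residual_rate(toy)

print(original_count)    # 2
print(alternative_rate)  # 0.6666667
```

Applied to the fitted models, this would likely give rates that differ from the values reported in this document, so the reported figures are best read as counts of large positive residuals only.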

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_ML_P_3F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Promax Rotation : 3 Factors",
           cex=0.75)

##################################
# Computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("POTL","ACAD","EXPR","JFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("POTL", "ACAD", "EXPR", 
##     "JFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.88    0.87      0.66 7.6 0.028  7.3 1.2     0.69
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.81  0.88  0.93
## Duhachek  0.82  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## POTL      0.79      0.80    0.74      0.58 4.1    0.051 0.0063  0.54
## ACAD      0.84      0.84    0.80      0.64 5.4    0.040 0.0111  0.70
## EXPR      0.86      0.86    0.84      0.68 6.4    0.036 0.0169  0.71
## JFIT      0.89      0.89    0.85      0.72 7.8    0.028 0.0043  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## POTL 50  0.93  0.93  0.92   0.87  7.3 1.4
## ACAD 50  0.86  0.87  0.83   0.76  7.4 1.3
## EXPR 50  0.83  0.84  0.75   0.71  7.3 1.4
## JFIT 50  0.82  0.80  0.71   0.65  7.0 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## POTL 0.00 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## ACAD 0.00 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## EXPR 0.00 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0

alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM","LIKE","APPR","SCON","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM", "LIKE", 
##     "APPR", "SCON", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.88      0.89     0.9      0.57 7.8 0.026  7.1 1.1     0.53
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.82  0.88  0.92
## Duhachek  0.83  0.88  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ORGN      0.84      0.86    0.85      0.55 6.0    0.036 0.014  0.49
## COMM      0.85      0.87    0.86      0.56 6.4    0.034 0.012  0.53
## LIKE      0.86      0.86    0.88      0.56 6.4    0.030 0.019  0.52
## APPR      0.86      0.87    0.89      0.57 6.7    0.030 0.019  0.51
## SCON      0.86      0.87    0.87      0.57 6.5    0.029 0.015  0.53
## CFIT      0.87      0.88    0.90      0.59 7.3    0.029 0.020  0.53
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.87  0.84  0.84   0.78  6.9 1.6
## COMM 50  0.84  0.81  0.79   0.74  6.9 1.5
## LIKE 50  0.78  0.81  0.77   0.69  7.4 1.1
## APPR 50  0.75  0.79  0.73   0.67  7.4 1.0
## SCON 50  0.76  0.80  0.76   0.67  7.3 1.2
## CFIT 50  0.78  0.75  0.66   0.64  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.10 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.20 0.30 0.28 0.06 0.02    0
## LIKE 0.00 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0
## APPR 0.00 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0
## SCON 0.00 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0

alpha(DPA.Descriptors.Numeric[,c("RESM","LETT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("RESM", "LETT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0
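
For reference, the `raw_alpha` value reported by `alpha()` corresponds to the classical Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The following is a minimal base R sketch on made-up ratings; the function name `cronbach_alpha` and the toy data are illustrative assumptions and not part of the analysis.

```r
# Illustrative implementation of the classical Cronbach's alpha formula
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                          # number of items
  item_variances <- apply(items, 2, var)    # variance of each item
  total_variance <- var(rowSums(items))     # variance of the summed score
  (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
}

# Made-up ratings for three items from five respondents
toy_items <- data.frame(a = c(7, 8, 6, 9, 5),
                        b = c(6, 8, 7, 9, 5),
                        c = c(7, 9, 6, 8, 4))

toy_alpha <- cronbach_alpha(toy_items)
print(round(toy_alpha, 2))  # approximately 0.95
```

With only two items per factor, as for RESM and LETT, the formula depends entirely on a single covariance, so such values are better treated as rough indications than as stable reliability estimates.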

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_ML_P_3F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "mle",
                                  nfac = 3,
                                  rotation = "promax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_ML_P_3F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

##################################
# Conducting exploratory factor analysis
# using Maximum Likelihood extraction
# and Promax rotation
# with 4 factors
##################################
(FA_ML_P_4F <- fa(DPA.Descriptors.Numeric,
              nfactors = 4,
              fm="ml",
              rotate = "promax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  ml
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 4, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "promax", residuals = TRUE, SMC = TRUE, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        ML3   ML4   ML1   ML2   h2    u2 com
## ACAD  0.44  0.26  0.07 -0.03 0.45 0.546 1.7
## APPR -0.13  0.80  0.07  0.15 0.68 0.325 1.1
## COMM  0.03  0.10  0.16  0.76 0.79 0.206 1.1
## CFIT  0.91 -0.18 -0.04  0.28 0.85 0.148 1.3
## EXPR  0.47  0.16  0.22 -0.24 0.46 0.535 2.2
## JFIT  1.04 -0.16 -0.07  0.11 0.91 0.089 1.1
## LETT  0.06 -0.21  0.90  0.14 0.78 0.224 1.2
## LIKE  0.09  0.70 -0.02  0.11 0.65 0.350 1.1
## ORGN  0.08  0.16 -0.06  0.89 0.97 0.027 1.1
## POTL  0.70  0.22 -0.06 -0.04 0.65 0.347 1.2
## RESM -0.12  0.15  0.99 -0.03 1.00 0.005 1.1
## SCON  0.01  0.93 -0.12  0.01 0.78 0.220 1.0
## 
##                        ML3  ML4  ML1  ML2
## SS loadings           2.93 2.36 1.85 1.84
## Proportion Var        0.24 0.20 0.15 0.15
## Cumulative Var        0.24 0.44 0.60 0.75
## Proportion Explained  0.33 0.26 0.21 0.20
## Cumulative Proportion 0.33 0.59 0.80 1.00
## 
##  With factor correlations of 
##      ML3  ML4  ML1  ML2
## ML3 1.00 0.66 0.60 0.40
## ML4 0.66 1.00 0.53 0.41
## ML1 0.60 0.53 1.00 0.27
## ML2 0.40 0.41 0.27 1.00
## 
## Mean item complexity =  1.3
## Test of the hypothesis that 4 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 24  and the objective function was  1.36 
## 
## The root mean square of the residuals (RMSR) is  0.06 
## The df corrected root mean square of the residuals is  0.1 
## 
## The harmonic n.obs is  50 with the empirical chi square  22.17  with prob <  0.57 
## The total n.obs was  50  with Likelihood Chi Square =  56.38  with prob <  2e-04 
## 
## Tucker Lewis Index of factoring reliability =  0.764
## RMSEA index =  0.163  and the 90 % confidence intervals are  0.11 0.223
## BIC =  -37.51
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    ML3  ML4  ML1  ML2
## Correlation of (regression) scores with factors   0.98 0.95 1.00 0.98
## Multiple R square of scores with factors          0.95 0.90 0.99 0.97
## Minimum correlation of possible factor scores     0.90 0.81 0.99 0.93

(FA_ML_P_4F_Summary <- FA_ML_P_4F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (promax-rotation)
## 
## Variable |  ML3 |  ML4 |  ML1 |  ML2 | Complexity | Uniqueness
## --------------------------------------------------------------
## JFIT     | 1.04 |      |      |      |       1.08 |       0.09
## CFIT     | 0.91 |      |      |      |       1.27 |       0.15
## POTL     | 0.70 |      |      |      |       1.22 |       0.35
## EXPR     | 0.47 |      |      |      |       2.21 |       0.54
## ACAD     | 0.44 |      |      |      |       1.67 |       0.55
## SCON     |      | 0.93 |      |      |       1.04 |       0.22
## APPR     |      | 0.80 |      |      |       1.14 |       0.32
## LIKE     |      | 0.70 |      |      |       1.08 |       0.35
## RESM     |      |      | 0.99 |      |       1.08 |   5.00e-03
## LETT     |      |      | 0.90 |      |       1.18 |       0.22
## ORGN     |      |      |      | 0.89 |       1.09 |       0.03
## COMM     |      |      |      | 0.76 |       1.13 |       0.21
## 
## The 4 latent factors (promax rotation) accounted for 74.82% of the total variance of the original data (ML3 = 24.45%, ML4 = 19.64%, ML1 = 15.41%, ML2 = 15.32%).

summary(FA_ML_P_4F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   ML3 |   ML4 |   ML1 |   ML2
## ---------------------------------------------------------------
## Eigenvalues                     | 6.146 | 1.248 | 0.890 | 0.846
## Variance Explained              | 0.245 | 0.196 | 0.154 | 0.153
## Variance Explained (Cumulative) | 0.245 | 0.441 | 0.595 | 0.748
## Variance Explained (Proportion) | 0.327 | 0.263 | 0.206 | 0.205

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_ML_P_4F_Residual <- residuals(FA_ML_P_4F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.03    NA                                                      
## COMM  0.05 -0.01    NA                                                
## CFIT -0.04  0.00  0.00    NA                                          
## EXPR  0.24 -0.05 -0.02 -0.08    NA                                    
## JFIT -0.04  0.00  0.00  0.01 -0.01    NA                              
## LETT -0.04 -0.03 -0.03 -0.01  0.06  0.00    NA                        
## LIKE -0.08 -0.01  0.00 -0.01 -0.09  0.03  0.04    NA                  
## ORGN  0.00  0.00  0.00  0.00  0.01  0.00  0.01  0.00    NA            
## POTL  0.26  0.03  0.02 -0.02  0.20 -0.02  0.00 -0.05  0.00    NA      
## RESM  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00    NA
## SCON -0.08 -0.01 -0.01  0.02  0.00  0.00 -0.01  0.05  0.00 -0.05  0.00
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_ML_P_4F_RMS <- FA_ML_P_4F$rms)

## [1] 0.05795189

(FA_ML_P_4F_TLI <- FA_ML_P_4F$TLI)

## [1] 0.7644223

(FA_ML_P_4F_BIC <- FA_ML_P_4F$BIC)

## [1] -37.50857

(FA_ML_P_4F_MaxResidual   <- max(abs(FA_ML_P_4F_Residual),na.rm=TRUE))

## [1] 0.2590579

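# Caution: abs(0.05) is just 0.05, so this counts only positive residuals
# above 0.05, and counts them in both triangles of the symmetric matrix;
# abs(FA_ML_P_4F_Residual) > 0.05 would also catch large negative residuals
(FA_ML_P_4F_HighResidual  <- sum(FA_ML_P_4F_Residual>abs(0.05),na.rm=TRUE))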

## [1] 10

(FA_ML_P_4F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_ML_P_4F_HighResidualRate <- FA_ML_P_4F_HighResidual/FA_ML_P_4F_TotalResidual)

## [1] 0.1515152

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_ML_P_4F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Promax Rotation : 4 Factors",
           cex=0.75)

##################################
# Computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("JFIT","CFIT","POTL","EXPR","ACAD")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("JFIT", "CFIT", "POTL", 
##     "EXPR", "ACAD")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##        0.9       0.9    0.92      0.64 8.8 0.024  7.2 1.2     0.66
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84   0.9  0.94
## Duhachek  0.85   0.9  0.94
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## JFIT      0.86      0.87    0.86      0.62 6.5    0.034 0.021  0.66
## CFIT      0.88      0.88    0.87      0.66 7.6    0.028 0.011  0.69
## POTL      0.85      0.85    0.88      0.58 5.6    0.036 0.029  0.53
## EXPR      0.89      0.89    0.91      0.68 8.5    0.025 0.022  0.68
## ACAD      0.88      0.88    0.90      0.64 7.3    0.029 0.028  0.68
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## JFIT 50  0.88  0.87  0.86   0.80  7.0 1.6
## CFIT 50  0.84  0.82  0.80   0.72  6.9 1.6
## POTL 50  0.91  0.92  0.90   0.85  7.3 1.4
## EXPR 50  0.76  0.78  0.71   0.64  7.3 1.4
## ACAD 50  0.81  0.83  0.78   0.72  7.4 1.3
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0
## POTL 0.00 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0
## ACAD 0.00 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0

alpha(DPA.Descriptors.Numeric[,c("SCON","APPR","LIKE")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("SCON", "APPR", "LIKE")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.83       0.7 6.9 0.031  7.4 0.99      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.80  0.87  0.92
## Duhachek  0.81  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## SCON      0.78      0.78    0.64      0.64 3.5    0.062    NA  0.64
## APPR      0.86      0.86    0.75      0.75 6.1    0.040    NA  0.75
## LIKE      0.82      0.82    0.70      0.70 4.6    0.051    NA  0.70
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## SCON 50  0.92  0.91  0.86   0.80  7.3 1.2
## APPR 50  0.86  0.87  0.76   0.71  7.4 1.0
## LIKE 50  0.90  0.89  0.82   0.76  7.4 1.1
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## SCON 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## APPR 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0
## LIKE 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0

alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.92      0.93    0.86      0.86  12 0.021  6.9 1.5     0.86
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.87  0.92  0.96
## Duhachek  0.88  0.92  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ORGN      0.92      0.86    0.74      0.86 6.2       NA     0  0.86
## COMM      0.80      0.86    0.74      0.86 6.2       NA     0  0.86
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.97  0.96   0.9   0.86  6.9 1.6
## COMM 50  0.96  0.96   0.9   0.86  6.9 1.5
## 
## Non missing response frequency for each item
##         3    4    5   6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.1 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.2 0.30 0.28 0.06 0.02    0

alpha(DPA.Descriptors.Numeric[,c("RESM","LETT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("RESM", "LETT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_ML_P_4F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                  cormeth = "pearson",
                                  method = "mle",
                                  nfac = 4,
                                  rotation = "promax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_ML_P_4F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

##################################
# Conducting exploratory factor analysis
# using Maximum Likelihood extraction
# and Promax rotation
# with 5 factors
##################################
(FA_ML_P_5F <- fa(DPA.Descriptors.Numeric,
              nfactors = 5,
              fm="ml",
              rotate = "promax",
              residuals=TRUE,
              SMC=TRUE,
              n.obs=nrow(DPA.Descriptors.Numeric)))

## Factor Analysis using method =  ml
## Call: fa(r = DPA.Descriptors.Numeric, nfactors = 5, n.obs = nrow(DPA.Descriptors.Numeric), 
##     rotate = "promax", residuals = TRUE, SMC = TRUE, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        ML5   ML4   ML2   ML3   ML1   h2    u2 com
## ACAD  0.95 -0.06  0.13 -0.10 -0.04 0.78 0.218 1.1
## APPR  0.15  0.64  0.18 -0.17  0.07 0.65 0.353 1.5
## COMM  0.02 -0.02  0.85  0.00  0.12 0.80 0.196 1.0
## CFIT  0.03  0.00  0.16  0.79  0.03 0.85 0.154 1.1
## EXPR  0.73  0.02 -0.19  0.06  0.15 0.64 0.362 1.2
## JFIT  0.07  0.04 -0.05  0.93  0.02 0.95 0.050 1.0
## LETT -0.03 -0.15  0.09  0.10  0.87 0.77 0.229 1.1
## LIKE -0.09  0.76  0.02  0.15  0.03 0.68 0.319 1.1
## ORGN -0.04  0.06  0.96  0.06 -0.09 0.96 0.035 1.0
## POTL  0.82  0.04  0.03  0.20 -0.12 0.87 0.132 1.2
## RESM  0.03  0.17 -0.08 -0.08  0.96 1.00 0.005 1.1
## SCON -0.03  1.02 -0.09  0.03 -0.09 0.87 0.130 1.0
## 
##                        ML5  ML4  ML2  ML3  ML1
## SS loadings           2.23 2.15 1.87 1.80 1.77
## Proportion Var        0.19 0.18 0.16 0.15 0.15
## Cumulative Var        0.19 0.36 0.52 0.67 0.82
## Proportion Explained  0.23 0.22 0.19 0.18 0.18
## Cumulative Proportion 0.23 0.45 0.64 0.82 1.00
## 
##  With factor correlations of 
##      ML5  ML4  ML2  ML3  ML1
## ML5 1.00 0.57 0.39 0.62 0.53
## ML4 0.57 1.00 0.58 0.51 0.46
## ML2 0.39 0.58 1.00 0.56 0.40
## ML3 0.62 0.51 0.56 1.00 0.47
## ML1 0.53 0.46 0.40 0.47 1.00
## 
## Mean item complexity =  1.1
## Test of the hypothesis that 5 factors are sufficient.
## 
## df null model =  66  with the objective function =  10.7 with Chi Square =  472.51
## df of  the model are 16  and the objective function was  0.42 
## 
## The root mean square of the residuals (RMSR) is  0.02 
## The df corrected root mean square of the residuals is  0.04 
## 
## The harmonic n.obs is  50 with the empirical chi square  2.44  with prob <  1 
## The total n.obs was  50  with Likelihood Chi Square =  17.27  with prob <  0.37 
## 
## Tucker Lewis Index of factoring reliability =  0.986
## RMSEA index =  0.034  and the 90 % confidence intervals are  0 0.141
## BIC =  -45.32
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                    ML5  ML4  ML2  ML3  ML1
## Correlation of (regression) scores with factors   0.96 0.96 0.98 0.98 1.00
## Multiple R square of scores with factors          0.92 0.93 0.97 0.96 0.99
## Minimum correlation of possible factor scores     0.85 0.85 0.94 0.92 0.98

(FA_ML_P_5F_Summary <- FA_ML_P_5F %>%
  model_parameters(sort = TRUE, threshold = "max"))

## # Rotated loadings from Factor Analysis (promax-rotation)
## 
## Variable |  ML5 |  ML4 |  ML2 |  ML3 |  ML1 | Complexity | Uniqueness
## ---------------------------------------------------------------------
## ACAD     | 0.95 |      |      |      |      |       1.07 |       0.22
## POTL     | 0.82 |      |      |      |      |       1.17 |       0.13
## EXPR     | 0.73 |      |      |      |      |       1.24 |       0.36
## SCON     |      | 1.02 |      |      |      |       1.03 |       0.13
## LIKE     |      | 0.76 |      |      |      |       1.12 |       0.32
## APPR     |      | 0.64 |      |      |      |       1.45 |       0.35
## ORGN     |      |      | 0.96 |      |      |       1.04 |       0.04
## COMM     |      |      | 0.85 |      |      |       1.04 |       0.20
## JFIT     |      |      |      | 0.93 |      |       1.02 |       0.05
## CFIT     |      |      |      | 0.79 |      |       1.09 |       0.15
## RESM     |      |      |      |      | 0.96 |       1.09 |   4.99e-03
## LETT     |      |      |      |      | 0.87 |       1.11 |       0.23
## 
## The 5 latent factors (promax rotation) accounted for 81.81% of the total variance of the original data (ML5 = 18.55%, ML4 = 17.95%, ML2 = 15.60%, ML3 = 15.00%, ML1 = 14.72%).

summary(FA_ML_P_5F_Summary)

## # (Explained) Variance of Components
## 
## Parameter                       |   ML5 |   ML4 |   ML2 |   ML3 |   ML1
## -----------------------------------------------------------------------
## Eigenvalues                     | 6.215 | 1.301 | 0.958 | 0.877 | 0.477
## Variance Explained              | 0.185 | 0.179 | 0.156 | 0.150 | 0.147
## Variance Explained (Cumulative) | 0.185 | 0.365 | 0.521 | 0.671 | 0.818
## Variance Explained (Proportion) | 0.227 | 0.219 | 0.191 | 0.183 | 0.180

##################################
# Extracting the residuals
# from the Exploratory Factor Analysis
##################################
(FA_ML_P_5F_Residual <- residuals(FA_ML_P_5F,
                              diag=FALSE,
                              na.rm=TRUE))

##      ACAD  APPR  COMM  CFIT  EXPR  JFIT  LETT  LIKE  ORGN  POTL  RESM 
## ACAD    NA                                                            
## APPR  0.01    NA                                                      
## COMM  0.02 -0.01    NA                                                
## CFIT  0.00  0.00  0.00    NA                                          
## EXPR  0.00 -0.07 -0.03 -0.05    NA                                    
## JFIT  0.00  0.00  0.00  0.00  0.01    NA                              
## LETT -0.04 -0.04 -0.03 -0.01  0.07  0.00    NA                        
## LIKE  0.01  0.01  0.01 -0.03 -0.04  0.01  0.03    NA                  
## ORGN  0.00  0.00  0.00  0.00  0.01  0.00  0.01  0.00    NA            
## POTL  0.00  0.01  0.00  0.01  0.01  0.00  0.01  0.01  0.00    NA      
## RESM  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00    NA
## SCON -0.01 -0.01  0.00  0.01  0.04  0.00 -0.01  0.00  0.00  0.00  0.00
## [1]    NA

##################################
# Obtaining Fit Indices
##################################
(FA_ML_P_5F_RMS <- FA_ML_P_5F$rms)

## [1] 0.01923904

(FA_ML_P_5F_TLI <- FA_ML_P_5F$TLI)

## [1] 0.9858829

(FA_ML_P_5F_BIC <- FA_ML_P_5F$BIC)

## [1] -45.32318

(FA_ML_P_5F_MaxResidual   <- max(abs(FA_ML_P_5F_Residual),na.rm=TRUE))

## [1] 0.0733741

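# Caution: abs(0.05) is just 0.05, so this counts only positive residuals
# above 0.05, and counts them in both triangles of the symmetric matrix;
# abs(FA_ML_P_5F_Residual) > 0.05 would also catch large negative residuals
(FA_ML_P_5F_HighResidual  <- sum(FA_ML_P_5F_Residual>abs(0.05),na.rm=TRUE))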

## [1] 2

(FA_ML_P_5F_TotalResidual <- length(DPA.Descriptors.Numeric)*(length(DPA.Descriptors.Numeric)-1)/2)

## [1] 66

(FA_ML_P_5F_HighResidualRate <- FA_ML_P_5F_HighResidual/FA_ML_P_5F_TotalResidual)

## [1] 0.03030303
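
The fit indices obtained for the 3-, 4- and 5-factor solutions can be placed side by side to make the comparison explicit. The following is a minimal sketch with the values copied by hand from the outputs above; the data frame name `fit_comparison` is illustrative.

```r
# Fit indices copied from the 3-, 4- and 5-factor outputs above
fit_comparison <- data.frame(
  Factors = c(3, 4, 5),
  RMSR    = c(0.07961754, 0.05795189, 0.01923904),
  TLI     = c(0.6858498, 0.7644223, 0.9858829),
  BIC     = c(-35.60434, -37.50857, -45.32318)
)

# Lower RMSR and BIC, and higher TLI, indicate better fit
best_by_rmsr <- fit_comparison$Factors[which.min(fit_comparison$RMSR)]
best_by_tli  <- fit_comparison$Factors[which.max(fit_comparison$TLI)]
best_by_bic  <- fit_comparison$Factors[which.min(fit_comparison$BIC)]

print(fit_comparison)
print(c(RMSR = best_by_rmsr, TLI = best_by_tli, BIC = best_by_bic))  # all 5
```

In a live session the hard-coded numbers could be replaced by the stored objects, for example `FA_ML_P_3F_RMS`, `FA_ML_P_4F_TLI` and `FA_ML_P_5F_BIC`. Among these three solutions, all three indices favour the 5-factor model.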

##################################
# Graph the factor loading matrices
##################################
fa.diagram(FA_ML_P_5F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Promax Rotation : 5 Factors",
           cex=0.75)

##################################
# Computing the internal consistency
# measure of reliability using the
# Cronbach's alpha coefficient
# for each factor
##################################
alpha(DPA.Descriptors.Numeric[,c("ACAD","POTL","EXPR")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ACAD", "POTL", "EXPR")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.89      0.89    0.85      0.72 7.8 0.028  7.3 1.2      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.82  0.89  0.93
## Duhachek  0.83  0.89  0.94
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ACAD      0.83      0.83    0.70      0.70 4.7    0.049    NA  0.70
## POTL      0.80      0.80    0.67      0.67 4.0    0.056    NA  0.67
## EXPR      0.88      0.89    0.80      0.80 7.8    0.032    NA  0.80
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ACAD 50  0.91  0.91  0.85   0.79  7.4 1.3
## POTL 50  0.92  0.92  0.88   0.82  7.3 1.4
## EXPR 50  0.88  0.88  0.76   0.72  7.3 1.4
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## ACAD 0.02 0.02 0.22 0.28 0.24 0.18 0.04    0
## POTL 0.04 0.06 0.18 0.22 0.30 0.18 0.02    0
## EXPR 0.00 0.10 0.20 0.24 0.24 0.18 0.04    0

alpha(DPA.Descriptors.Numeric[,c("SCON","LIKE","APPR")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("SCON", "LIKE", "APPR")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.83       0.7 6.9 0.031  7.4 0.99      0.7
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.80  0.87  0.92
## Duhachek  0.81  0.87  0.93
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## SCON      0.78      0.78    0.64      0.64 3.5    0.062    NA  0.64
## LIKE      0.82      0.82    0.70      0.70 4.6    0.051    NA  0.70
## APPR      0.86      0.86    0.75      0.75 6.1    0.040    NA  0.75
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## SCON 50  0.92  0.91  0.86   0.80  7.3 1.2
## LIKE 50  0.90  0.89  0.82   0.76  7.4 1.1
## APPR 50  0.86  0.87  0.76   0.71  7.4 1.0
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## SCON 0.00 0.08 0.14 0.32 0.28 0.18 0.00    0
## LIKE 0.02 0.04 0.12 0.32 0.36 0.14 0.00    0
## APPR 0.00 0.02 0.16 0.32 0.38 0.10 0.02    0

alpha(DPA.Descriptors.Numeric[,c("ORGN","COMM")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("ORGN", "COMM")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.92      0.93    0.86      0.86  12 0.021  6.9 1.5     0.86
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.87  0.92  0.96
## Duhachek  0.88  0.92  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## ORGN      0.92      0.86    0.74      0.86 6.2       NA     0  0.86
## COMM      0.80      0.86    0.74      0.86 6.2       NA     0  0.86
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## ORGN 50  0.97  0.96   0.9   0.86  6.9 1.6
## COMM 50  0.96  0.96   0.9   0.86  6.9 1.5
## 
## Non missing response frequency for each item
##         3    4    5   6    7    8    9   10 miss
## ORGN 0.02 0.04 0.20 0.1 0.24 0.24 0.16 0.00    0
## COMM 0.04 0.04 0.06 0.2 0.30 0.28 0.06 0.02    0

alpha(DPA.Descriptors.Numeric[,c("RESM","LETT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("RESM", "LETT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.91      0.91    0.84      0.84  10 0.025  7.2 1.6     0.84
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.84  0.91  0.95
## Duhachek  0.86  0.91  0.96
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## RESM      0.84      0.84     0.7      0.84 5.1       NA     0  0.84
## LETT      0.83      0.84     0.7      0.84 5.1       NA     0  0.84
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## RESM 50  0.96  0.96  0.88   0.84  7.2 1.7
## LETT 50  0.96  0.96  0.88   0.84  7.2 1.7
## 
## Non missing response frequency for each item
##         4    5    6    7    8    9   10 miss
## RESM 0.08 0.06 0.22 0.18 0.16 0.24 0.06    0
## LETT 0.06 0.12 0.14 0.22 0.24 0.12 0.10    0

alpha(DPA.Descriptors.Numeric[,c("JFIT","CFIT")])

## 
## Reliability analysis   
## Call: alpha(x = DPA.Descriptors.Numeric[, c("JFIT", "CFIT")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.94      0.94    0.88      0.88  15 0.018    7 1.6     0.88
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.89  0.94  0.96
## Duhachek  0.90  0.94  0.97
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## JFIT      0.87      0.88    0.78      0.88 7.5       NA     0  0.88
## CFIT      0.90      0.88    0.78      0.88 7.5       NA     0  0.88
## 
##  Item statistics 
##       n raw.r std.r r.cor r.drop mean  sd
## JFIT 50  0.97  0.97  0.91   0.88  7.0 1.6
## CFIT 50  0.97  0.97  0.91   0.88  6.9 1.6
## 
## Non missing response frequency for each item
##         3    4    5    6    7    8    9   10 miss
## JFIT 0.06 0.00 0.06 0.22 0.26 0.24 0.12 0.04    0
## CFIT 0.02 0.06 0.16 0.12 0.22 0.30 0.08 0.04    0
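
Each alpha() call above reports Cronbach's alpha for the items assigned to one factor. As a rough illustration of what raw_alpha measures, the sketch below computes alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score) in base R on simulated ratings. The data, the seed and the helper name `cronbach_alpha_sketch` are assumptions made for this example and are not part of the analysis.

```r
# Hypothetical helper: raw Cronbach's alpha from a numeric item matrix
cronbach_alpha_sketch <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                       # number of items
  item_var <- apply(items, 2, var)       # variance of each item
  total_var <- var(rowSums(items))       # variance of the summed score
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Simulated example: three items driven by one shared latent score plus noise
set.seed(123)
latent <- rnorm(50)
toy_items <- data.frame(ITEM1 = latent + rnorm(50, sd = 0.5),
                        ITEM2 = latent + rnorm(50, sd = 0.5),
                        ITEM3 = latent + rnorm(50, sd = 0.5))

toy_alpha <- cronbach_alpha_sketch(toy_items)
print(round(toy_alpha, 2))
```

For the two-item factors, the standardized alpha reduces to 2r/(1+r). With the rounded r = 0.86 for ORGN and COMM this gives roughly 0.92, close to the std.alpha of 0.93 reported above.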

##################################
# Formulating the dandelion plot to
# visualize both factor variances and loadings
# from the factor loading matrices
##################################
FA_ML_P_5F_FactorLoading <- factload(DPA.Descriptors.Numeric,
                                     cormeth = "pearson",
                                     method = "mle",
                                     nfac = 5,
                                     rotation = "promax")

DandelionPlotPalette <- rev(rainbow(100, start = 0, end = 0.2))

dandelion(FA_ML_P_5F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

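##################################
# Comparing the factor loading diagrams
# for the 3-, 4- and 5-factor solutions
##################################
par(mfrow=c(1,3))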
fa.diagram(FA_ML_P_3F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Promax Rotation : 3 Factors",
           cex=0.75)
fa.diagram(FA_ML_P_4F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Promax Rotation : 4 Factors",
           cex=0.75)
fa.diagram(FA_ML_P_5F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Maximum Likelihood Factor Extraction + Promax Rotation : 5 Factors",
           cex=0.75)

1.7 Consolidated Findings

[A] Among the candidate models, the 5-factor structure gave the best results, and its latent variables were contextually meaningful as individual factors. Across extraction and rotation methods, the 5-factor models showed the:
     [A.1] Lowest Standardized Root Mean Square of the Residual
     [A.2] Highest Tucker-Lewis Fit Index
     [A.3] Lowest Bayesian Information Criterion
     [A.4] Lowest High Residual Rate

[B] Among the candidate models, Principal Axes factor extraction gave the smallest errors, demonstrating the:
     [B.1] Lowest Standardized Root Mean Square of the Residual
     [B.2] Lowest High Residual Rate

[C] Among the candidate models, Promax rotation was more appropriate because the extracted factors were considerably correlated. Rotation does not change the fit indices, so this choice rests on the factor correlations rather than on model fit.

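[D] The selected exploratory factor analysis model (Principal Axes extraction with Promax rotation, 5 factors) produced factors with high internal consistency (Cronbach's alpha between 0.87 and 0.94), described as follows: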
     [D.1] Factor 1 corresponds to the academic and professional background of the candidate.
            [D.1.1] ACAD (Academic Record)
            [D.1.2] POTL (Potential)
            [D.1.3] EXPR (Experience)
     [D.2] Factor 2 corresponds to the candidate’s personality.
            [D.2.1] SCON (Self-Confidence)
            [D.2.2] LIKE (Likeability)
            [D.2.3] APPR (Appearance)
     [D.3] Factor 3 corresponds to the candidate’s soft skills.
            [D.3.1] COMM (Communication)
            [D.3.2] ORGN (Organization)
     [D.4] Factor 4 corresponds to the candidate’s presentation of skills and qualifications during application.
            [D.4.1] LETT (Cover Letter)
            [D.4.2] RESM (Resume)
     [D.5] Factor 5 corresponds to the candidate’s overall fit to the company and job.
            [D.5.1] JFIT (Job Fit)
            [D.5.2] CFIT (Company Fit)
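
Finding [C] depends on how strongly the extracted factors correlate with each other. As a hedged illustration of how such correlations can be inspected, the sketch below swaps in base R's factanal() (not the psych functions used in this analysis) on simulated data with two correlated latent factors. All variable names and values are hypothetical. The factor correlation matrix is derived from the promax rotation matrix U as the inverse of U'U.

```r
# Simulate six indicators driven by two correlated latent factors
# (all names and values here are hypothetical)
set.seed(2023)
n <- 500
f1 <- rnorm(n)
f2 <- 0.6 * f1 + sqrt(1 - 0.6^2) * rnorm(n)   # built to correlate about 0.6 with f1
toy <- data.frame(A1 = f1 + rnorm(n, sd = 0.6),
                  A2 = f1 + rnorm(n, sd = 0.6),
                  A3 = f1 + rnorm(n, sd = 0.6),
                  B1 = f2 + rnorm(n, sd = 0.6),
                  B2 = f2 + rnorm(n, sd = 0.6),
                  B3 = f2 + rnorm(n, sd = 0.6))

# Maximum likelihood extraction with an oblique (promax) rotation
toy_fit <- factanal(toy, factors = 2, rotation = "promax")

# Factor correlation matrix implied by the rotation matrix: (U'U)^-1
rot_inv <- solve(toy_fit$rotmat)
toy_phi <- rot_inv %*% t(rot_inv)
print(round(toy_phi, 2))
```

A sizeable off-diagonal entry in this matrix suggests that an orthogonal rotation such as Varimax would force apart factors that are in fact related, which is the usual argument for an oblique rotation.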

Code Chunk | Output

##################################
# Consolidating the fit indices
##################################
FA_RMSSummary <- c(FA_PA_V_3F_RMS,
                   FA_PA_V_4F_RMS,
                   FA_PA_V_5F_RMS,
                   FA_PA_P_3F_RMS,
                   FA_PA_P_4F_RMS,
                   FA_PA_P_5F_RMS,
                   FA_ML_V_3F_RMS,
                   FA_ML_V_4F_RMS,
                   FA_ML_V_5F_RMS,
                   FA_ML_P_3F_RMS,
                   FA_ML_P_4F_RMS,
                   FA_ML_P_5F_RMS)

FA_TLISummary <- c(FA_PA_V_3F_TLI,
                   FA_PA_V_4F_TLI,
                   FA_PA_V_5F_TLI,
                   FA_PA_P_3F_TLI,
                   FA_PA_P_4F_TLI,
                   FA_PA_P_5F_TLI,
                   FA_ML_V_3F_TLI,
                   FA_ML_V_4F_TLI,
                   FA_ML_V_5F_TLI,
                   FA_ML_P_3F_TLI,
                   FA_ML_P_4F_TLI,
                   FA_ML_P_5F_TLI)

FA_BICSummary <- c(FA_PA_V_3F_BIC,
                   FA_PA_V_4F_BIC,
                   FA_PA_V_5F_BIC,
                   FA_PA_P_3F_BIC,
                   FA_PA_P_4F_BIC,
                   FA_PA_P_5F_BIC,
                   FA_ML_V_3F_BIC,
                   FA_ML_V_4F_BIC,
                   FA_ML_V_5F_BIC,
                   FA_ML_P_3F_BIC,
                   FA_ML_P_4F_BIC,
                   FA_ML_P_5F_BIC)

FA_HighResidualRateSummary <- c(FA_PA_V_3F_HighResidualRate,
                                FA_PA_V_4F_HighResidualRate,
                                FA_PA_V_5F_HighResidualRate,
                                FA_PA_P_3F_HighResidualRate,
                                FA_PA_P_4F_HighResidualRate,
                                FA_PA_P_5F_HighResidualRate,
                                FA_ML_V_3F_HighResidualRate,
                                FA_ML_V_4F_HighResidualRate,
                                FA_ML_V_5F_HighResidualRate,
                                FA_ML_P_3F_HighResidualRate,
                                FA_ML_P_4F_HighResidualRate,
                                FA_ML_P_5F_HighResidualRate)

FA_AlgorithmSummary <- c("FA_PA_V_3F",
                         "FA_PA_V_4F",
                         "FA_PA_V_5F",
                         "FA_PA_P_3F",
                         "FA_PA_P_4F",
                         "FA_PA_P_5F",
                         "FA_ML_V_3F",
                         "FA_ML_V_4F",
                         "FA_ML_V_5F",
                         "FA_ML_P_3F",
                         "FA_ML_P_4F",
                         "FA_ML_P_5F")

# Building the data frame directly keeps the fit indices numeric,
# so no conversion back from character is needed
FA_Summary <- data.frame(RMS = FA_RMSSummary,
                         TLI = FA_TLISummary,
                         BIC = FA_BICSummary,
                         HighResidualRate = FA_HighResidualRateSummary,
                         Algorithm = FA_AlgorithmSummary)

FA_Summary$Algorithm <- factor(FA_Summary$Algorithm,
                               levels = FA_AlgorithmSummary)

print(FA_Summary, row.names=FALSE)

##         RMS       TLI       BIC HighResidualRate  Algorithm
##  0.07338105 0.6191606 -22.76274       0.27272727 FA_PA_V_3F
##  0.03861068 0.7317578 -33.01886       0.15151515 FA_PA_V_4F
##  0.01483907 0.9648451 -43.43181       0.00000000 FA_PA_V_5F
##  0.07338105 0.6191606 -22.76274       0.27272727 FA_PA_P_3F
##  0.03861068 0.7317578 -33.01886       0.15151515 FA_PA_P_4F
##  0.01483907 0.9648451 -43.43181       0.00000000 FA_PA_P_5F
##  0.07961754 0.6858498 -35.60434       0.24242424 FA_ML_V_3F
##  0.05795189 0.7644223 -37.50857       0.15151515 FA_ML_V_4F
##  0.01923904 0.9858829 -45.32318       0.03030303 FA_ML_V_5F
##  0.07961754 0.6858498 -35.60434       0.24242424 FA_ML_P_3F
##  0.05795189 0.7644223 -37.50857       0.15151515 FA_ML_P_4F
##  0.01923904 0.9858829 -45.32318       0.03030303 FA_ML_P_5F
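
The table can also be queried directly rather than read by eye. The sketch below re-enters the printed values in a standalone data frame (named `FA_Summary_Sketch` here, a hypothetical name chosen to avoid overwriting FA_Summary) and lists the models tied for the best value of each index. Lower is treated as better for RMS, BIC and the high residual rate, and higher as better for TLI.

```r
# Fit indices copied from the summary table above
FA_Summary_Sketch <- data.frame(
  Algorithm = c("FA_PA_V_3F", "FA_PA_V_4F", "FA_PA_V_5F",
                "FA_PA_P_3F", "FA_PA_P_4F", "FA_PA_P_5F",
                "FA_ML_V_3F", "FA_ML_V_4F", "FA_ML_V_5F",
                "FA_ML_P_3F", "FA_ML_P_4F", "FA_ML_P_5F"),
  RMS = c(0.07338105, 0.03861068, 0.01483907,
          0.07338105, 0.03861068, 0.01483907,
          0.07961754, 0.05795189, 0.01923904,
          0.07961754, 0.05795189, 0.01923904),
  TLI = c(0.6191606, 0.7317578, 0.9648451,
          0.6191606, 0.7317578, 0.9648451,
          0.6858498, 0.7644223, 0.9858829,
          0.6858498, 0.7644223, 0.9858829),
  BIC = c(-22.76274, -33.01886, -43.43181,
          -22.76274, -33.01886, -43.43181,
          -35.60434, -37.50857, -45.32318,
          -35.60434, -37.50857, -45.32318),
  HighResidualRate = c(0.27272727, 0.15151515, 0.00000000,
                       0.27272727, 0.15151515, 0.00000000,
                       0.24242424, 0.15151515, 0.03030303,
                       0.24242424, 0.15151515, 0.03030303),
  stringsAsFactors = FALSE
)

# Models tied for the best value on each fit index
S <- FA_Summary_Sketch
BestByIndex <- list(
  RMS              = S$Algorithm[S$RMS == min(S$RMS)],
  TLI              = S$Algorithm[S$TLI == max(S$TLI)],
  BIC              = S$Algorithm[S$BIC == min(S$BIC)],
  HighResidualRate = S$Algorithm[S$HighResidualRate == min(S$HighResidualRate)]
)
print(BestByIndex)
```

Every label returned ends in 5F. The Varimax and Promax rows tie on every index because rotation leaves the model fit unchanged.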

##################################
# Consolidating all calculated values
# for the Standardized Root Mean Square of the Residual
##################################
(RMS_Plot <- dotplot(Algorithm ~ RMS,
                     data = FA_Summary,
                     main = "Algorithm Comparison : Standardized Root Mean Square of the Residual",
                     ylab = "Algorithm",
                     xlab = "RMSR",
                     auto.key = list(adj=1),
                     type=c("p", "h"),  
                     origin = 0,
                     alpha = 0.45,
                     pch = 16,
                     cex = 2))

##################################
# Consolidating all calculated values
# for the Tucker-Lewis Fit Index
##################################
(TLI_Plot <- dotplot(Algorithm ~ TLI,
                     data = FA_Summary,
                     main = "Algorithm Comparison : Tucker-Lewis Fit Index",
                     ylab = "Algorithm",
                     xlab = "TLI",
                     auto.key = list(adj=1),
                     type=c("p", "h"),  
                     origin = 0,
                     alpha = 0.45,
                     pch = 16,
                     cex = 2))

##################################
# Consolidating all calculated values
# for the Bayesian Information Criterion
##################################
(BIC_Plot <- dotplot(Algorithm ~ BIC,
                     data = FA_Summary,
                     main = "Algorithm Comparison : Bayesian Information Criterion",
                     ylab = "Algorithm",
                     xlab = "BIC",
                     auto.key = list(adj=1),
                     type=c("p", "h"),  
                     origin = 0,
                     alpha = 0.45,
                     pch = 16,
                     cex = 2))

##################################
# Consolidating all calculated values
# for the High Residual Rate
##################################
(HighResidualRate_Plot <- dotplot(Algorithm ~ HighResidualRate,
                                  data = FA_Summary,
                                  main = "Algorithm Comparison : High Residual Rate",
                                  ylab = "Algorithm",
                                  xlab = "High Residual Rate",
                                  auto.key = list(adj=1),
                                  type=c("p", "h"),
                                  origin = 0,
                                  alpha = 0.45,
                                  pch = 16,
                                  cex = 2))

##################################
# Plotting the Factor Loading Diagram
# for the optimal EFA model
##################################
fa.diagram(FA_PA_P_5F,
           sort=TRUE,
           cut=0,
           digits=3,
           main="Principal Axes Factor Extraction + Promax Rotation : 5 Factors",
           cex=0.75)

##################################
# Plotting the Dandelion Plot
# for the optimal EFA model
##################################
dandelion(FA_PA_P_5F_FactorLoading,
          bound=0,
          mcex=c(1,1.2),
          palet=DandelionPlotPalette)

2. Summary

3. References

[Book] Using Multivariate Analysis by Barbara Tabachnick and Linda Fidell
[Book] A Step-by-Step Guide to Exploratory Factor Analysis with R and RStudio by Marley Watkins
[Book] Just Enough R by Ben Whalley
[Book] Multiple Factor Analysis by Example Using R by Jerome Pages
[Book] Nonlinear Principal Component Analysis and Its Applications by Yuichi Mori, Masahiro Kuroda and Naomichi Makino
[Book] Applied Predictive Modeling by Max Kuhn and Kjell Johnson
[Book] An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie and Rob Tibshirani
[Book] Multivariate Data Visualization with R by Deepayan Sarkar
[Book] Machine Learning by Samuel Jackson
[Book] Data Modeling Methods by Jacob Larget
[Book] Introduction to R and Statistics by University of Western Australia
[Book] Feature Engineering and Selection: A Practical Approach for Predictive Models by Max Kuhn and Kjell Johnson
[Book] Introduction to Research Methods by Eric van Holm
[R Package] AppliedPredictiveModeling by Max Kuhn
[R Package] caret by Max Kuhn
[R Package] rpart by Terry Therneau and Beth Atkinson
[R Package] lattice by Deepayan Sarkar
[R Package] dplyr by Hadley Wickham
[R Package] tidyr by Hadley Wickham
[R Package] moments by Lukasz Komsta and Frederick Novomestky
[R Package] skimr by Elin Waring
[R Package] RANN by Sunil Arya, David Mount, Samuel Kemp and Gregory Jefferis
[R Package] corrplot by Taiyun Wei
[R Package] tidyverse by Hadley Wickham
[R Package] lares by Bernardo Lares
[R Package] DMwR2 by Luis Torgo
[R Package] gridExtra by Baptiste Auguie and Anton Antonov
[R Package] rattle by Graham Williams
[R Package] RColorBrewer by Erich Neuwirth
[R Package] stats by R Core Team
[R Package] factoextra by Alboukadel Kassambara and Fabian Mundt
[R Package] FactoMineR by Francois Husson, Julie Josse, Sebastien Le and Jeremy Mazet
[R Package] gplots by Tal Galili
[R Package] qgraph by Sacha Epskamp
[R Package] ggplot2 by Hadley Wickham, Winston Chang, Lionel Henry and Thomas Lin Pedersen
[R Package] psych by William Revelle
[R Package] nFactors by Gilles Raiche and David Magis
[R Package] MBESS by Ken Kelley
[R Package] DandEFA by Artur Manukyan, Ahmet Sedef, Erhan Cene and Ibrahim Demir
[R Package] EFAtools by Markus Steiner and Silvia Grieder
[R Package] parameters by Daniel Ludecke
[R Package] performance by Daniel Ludecke
[R Package] HH by Richard Heiberger
[Article] 6 Dimensionality Reduction Techniques in R (with Examples) by CMDLineTips Team
[Article] 6 Dimensionality Reduction Algorithms With Python by Jason Brownlee
[Article] Introduction to Dimensionality Reduction for Machine Learning by Jason Brownlee
[Article] Introduction to Dimensionality Reduction by Geeks For Geeks
[Article] Factor Analysis with the psych package by Michael Clark
[Article] Factor Analysis in R with Psych Package: Measuring Consumer Involvement by Peter Prevos
[Article] Factor Analysis in R by Jinjian Mu
[Article] How To: Use the psych package for Factor Analysis and Data Reduction by William Revelle
[Article] A Practical Introduction to Factor Analysis: Exploratory Factor Analysis by UCLA Advanced Research Computing Team
[Article] Examining the Big 5 Personality Dataset with Factor Analysis by Tarid Wongvorachan
[Article] Principal Component Analysis versus Exploratory Factor Analysis by Diana Suhr
[Article] Exploratory Factor Analysis by Columbia University Irving Medical Center
[Article] Factor Analysis Example by Charles Zaiontz
[Article] Factor Analysis Guide with an Example by Jim Frost
[Article] What Is Factor Analysis and How Does It Simplify Research Findings? by Qualtrics Team
[Article] How Can I Perform A Factor Analysis With Categorical (Or Categorical And Continuous) Variables? by UCLA Advanced Research Computing Team
[Article] Factor Analysis on Ordinal Data Example in R (psych, homals) by Jiayu Wu
[Article] Factor Analysis by HandWiki Team
[Article] On Likert Scales In R by Jake Chanenson
[Publication] General Intelligence Objectively Determined and Measured by Charles Spearman (The American Journal of Psychology)
[Publication] The Effect of Standardization on a Chi-Square Approximation in Factor Analysis by Maurice Bartlett (Psychometrika)
[Publication] A Second Generation Little Jiffy by Henry Kaiser (Psychometrika)
[Publication] Tests of Significance in Factor Analysis by Maurice Bartlett (British Journal of Statistical Psychology)
[Publication] Test of Linear Trend in Eigenvalues of a Covariance Matrix with Application to Data Analysis by Peter Bentler and Ke-Hai Yuan (British Journal of Mathematical and Statistical Psychology)
[Publication] The Scree Test For The Number Of Factors by Raymond Cattell (Multivariate Behavioral Research)
[Publication] Using Fit Statistic Differences to Determine the Optimal Number of Factors to Retain in an Exploratory Factor Analysis by William Finch (Educational and Psychological Measurement)
[Publication] An Objective Counterpart to the Visual Scree Test for Factor Analysis: The Standard Error Scree by Keith Zoski and Stephen Jurs (Educational and Psychological Measurement)
[Publication] The Performance of Regression-Based Variations of the Visual Scree for Determining the Number of Common Factors by Fadia Nasser, Jeri Benson and Joseph Wisenbaker (Educational and Psychological Measurement)
[Publication] Investigating the Performance of Exploratory Graph Analysis and Traditional Techniques to Identify the Number of Latent Factors: A Simulation and Tutorial by Hudson Golino, Dingjing Shi, Alexander Christensen, Luis Garrido, Maria Nieto, Ritu Sadana, Jotheeswaran Thiyagarajan and Agustin Martinez-Molina (Psychological Methods)
[Publication] Exploratory Graph Analysis: A New Approach for Estimating the Number of Dimensions in Psychological Research by Hudson Golino and Sacha Epskamp (PLOS ONE)
[Publication] Very Simple Structure: An Alternative Procedure For Estimating The Optimal Number Of Interpretable Factors by William Revelle and Thomas Rocklin (Multivariate Behavioral Research)
[Publication] Determining the Number of Components from the Matrix of Partial Correlations by Wayne Velicer (Psychometrika)
[Publication] Dandelion Plot: A Method for the Visualization of R-mode Exploratory Factor Analyses by Artur Manukyan, Erhan Cene, Ahmet Sedef and Ibrahim Demir (Computational Statistics)
[Course] Applied Data Mining and Statistical Learning by Penn State Eberly College of Science

Unsupervised Learning : Discovering Latent Variables in High-Dimensional Data using Exploratory Factor Analysis

John Pauline Pineda

August 13, 2023

1. Table of Contents

1.1 Sample Data

1.2 Data Quality Assessment

1.3 Data Preprocessing

1.3.1 Outlier Detection

1.3.2 Zero and Near-Zero Variance

1.3.3 Collinearity

1.3.4 Linear Dependency

1.3.5 Distributional Shape

1.4 Data Pre-Assessment

1.4.1 Correlation Matrix Assessment - Covariance Validity

1.4.2 Correlation Matrix Assessment - Determinant Computation

1.4.3 Correlation Matrix Assessment - Bartlett’s Test of Sphericity

1.4.4 Correlation Matrix Assessment - Kaiser-Meyer-Olkin Factor Adequacy

1.5 Data Exploration

1.6 Factor Analysis

1.6.1 Principal Axes Factor Extraction and Varimax Rotation (FA_PA_V)

1.6.2 Principal Axes Factor Extraction and Promax Rotation (FA_PA_P)

1.6.3 Maximum Likelihood Factor Extraction and Varimax Rotation (FA_ML_V)

1.6.4 Maximum Likelihood Factor Extraction and Promax Rotation (FA_ML_P)

1.7 Consolidated Findings

2. Summary

3. References