Association tests and measures for unordered categorical variables

TODO

  • link to correlation, associationOrder, diagCategorical

Install required packages

coin, DescTools

\((2 \times 2)\)-tables

Fisher’s exact test

       diagT
disease isHealthy isIll Sum
    no          8     2  10
    yes         1     4   5
    Sum         9     6  15

    Fisher's Exact Test for Count Data

data:  contT1
p-value = 0.04695
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 1.031491      Inf
sample estimates:
odds ratio 
  12.49706 
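
The code that produced this output is not shown, so the following is a minimal base-R sketch that rebuilds the table from the printed counts and reruns the one-sided test. The object name `contT1` is taken from the output; `fisherRes` is an assumed name.

```r
# Counts copied from the table above
# (rows: true disease status, columns: diagnostic test result)
contT1 <- as.table(matrix(c(8, 1, 2, 4), nrow = 2,
                          dimnames = list(disease = c("no", "yes"),
                                          diagT   = c("isHealthy", "isIll"))))
addmargins(contT1)    # adds the "Sum" row and column

# One-sided test: the alternative is that the true odds ratio exceeds 1
fisherRes <- fisher.test(contT1, alternative = "greater")
fisherRes$p.value     # exact one-sided p-value
fisherRes$estimate    # conditional maximum-likelihood estimate of the odds ratio
```

The estimate reported by `fisher.test()` is the conditional maximum-likelihood estimate, which is why it differs from the sample odds ratio computed directly from the cell counts.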

Prevalence, sensitivity, specificity, CCR, \(F\)-score

[1] 0.3333333
[1] 0.8
[1] 0.8
[1] 0.6666667
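
Judging from the cell counts, the four values above are prevalence (5/15), sensitivity (4/5), specificity (8/10) and positive predictive value (4/6). A sketch of how they can be computed in base R follows; all variable names except `contT1` are assumptions.

```r
contT1 <- as.table(matrix(c(8, 1, 2, 4), nrow = 2,
                          dimnames = list(disease = c("no", "yes"),
                                          diagT   = c("isHealthy", "isIll"))))

TN <- contT1["no",  "isHealthy"]   # true negatives
FP <- contT1["no",  "isIll"]       # false positives
FN <- contT1["yes", "isHealthy"]   # false negatives
TP <- contT1["yes", "isIll"]       # true positives

prevalence  <- (TP + FN) / sum(contT1)  # share of truly diseased cases
sensitivity <- TP / (TP + FN)           # P(test positive | diseased)
specificity <- TN / (TN + FP)           # P(test negative | not diseased)
ppv         <- TP / (TP + FP)           # P(diseased | test positive)

c(prevalence = prevalence, sensitivity = sensitivity,
  specificity = specificity, ppv = ppv)
```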

Correct classification rate (CCR)

[1] 0.8
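
The correct classification rate is the share of cases on the main diagonal of the table. A minimal sketch, assuming the table is stored as `contT1` with matching row and column order (`CCR` is an assumed name):

```r
contT1 <- as.table(matrix(c(8, 1, 2, 4), nrow = 2,
                          dimnames = list(disease = c("no", "yes"),
                                          diagT   = c("isHealthy", "isIll"))))

# Diagonal cells hold the correctly classified cases (8 + 4 of 15)
CCR <- sum(diag(contT1)) / sum(contT1)
CCR
```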

\(F\)-score

[1] 0.7272727
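
The value above matches the harmonic mean of positive predictive value and sensitivity, i.e. the \(F_{1}\)-score. A sketch under that assumption, with assumed variable names:

```r
contT1 <- as.table(matrix(c(8, 1, 2, 4), nrow = 2,
                          dimnames = list(disease = c("no", "yes"),
                                          diagT   = c("isHealthy", "isIll"))))

TP <- contT1["yes", "isIll"]       # true positives
FP <- contT1["no",  "isIll"]       # false positives
FN <- contT1["yes", "isHealthy"]   # false negatives

ppv         <- TP / (TP + FP)      # precision
sensitivity <- TP / (TP + FN)      # recall

# Harmonic mean of precision and recall
fScore <- 2 * ppv * sensitivity / (ppv + sensitivity)
fScore
```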

Odds ratio, Yule’s \(Q\) and risk ratio

Odds ratio

odds ratio     lwr.ci     upr.ci 
 16.000000   1.092859 234.247896 
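
The printed interval agrees with a Wald interval on the log-odds scale. The output format suggests a DescTools function produced it, but since the original call is not shown, here is a base-R sketch of the same calculation; `OR`, `seLogOR` and `ciOR` are assumed names.

```r
contT1 <- as.table(matrix(c(8, 1, 2, 4), nrow = 2,
                          dimnames = list(disease = c("no", "yes"),
                                          diagT   = c("isHealthy", "isIll"))))

# Sample odds ratio: cross-product ratio of the four cells
OR <- (contT1[1, 1] * contT1[2, 2]) / (contT1[1, 2] * contT1[2, 1])

# Wald standard error of log(OR): sqrt of the summed reciprocal cell counts
seLogOR <- sqrt(sum(1 / contT1))

# 95% confidence interval, back-transformed from the log scale
ciOR <- exp(log(OR) + c(-1, 1) * qnorm(0.975) * seLogOR)
c("odds ratio" = OR, lwr.ci = ciOR[1], upr.ci = ciOR[2])
```

The Wald interval needs all four cells to be non-zero, and with counts this small it is only a rough approximation.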

Yule’s \(Q\)

[1] 0.8823529
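
Yule's \(Q\) rescales the odds ratio to the interval \([-1, 1]\) via \(Q = (OR - 1) / (OR + 1)\). A minimal sketch with assumed variable names:

```r
contT1 <- as.table(matrix(c(8, 1, 2, 4), nrow = 2,
                          dimnames = list(disease = c("no", "yes"),
                                          diagT   = c("isHealthy", "isIll"))))

OR    <- (contT1[1, 1] * contT1[2, 2]) / (contT1[1, 2] * contT1[2, 1])
yuleQ <- (OR - 1) / (OR + 1)   # here (16 - 1) / (16 + 1)
yuleQ
```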

Risk ratio

[1] 4
       diagT
disease isHealthy isIll
    no        0.8   0.2
    yes       0.2   0.8
[1] 4
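
The value 4 corresponds to the ratio of the two row-wise proportions of an "isIll" diagnosis (0.8 for diseased versus 0.2 for non-diseased cases). A base-R sketch of that calculation, with `rowProp` and `riskRatio` as assumed names:

```r
contT1 <- as.table(matrix(c(8, 1, 2, 4), nrow = 2,
                          dimnames = list(disease = c("no", "yes"),
                                          diagT   = c("isHealthy", "isIll"))))

# Conditional relative frequencies within each row (disease status)
rowProp <- prop.table(contT1, margin = 1)
rowProp

# Ratio of the "isIll" proportions: diseased vs. non-diseased
riskRatio <- rowProp["yes", "isIll"] / rowProp["no", "isIll"]
riskRatio
```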

\((r \times c)\)-tables

\(\chi^{2}\)-test

      siblings
smokes  0  1  2 Sum
   no   5 19  6  30
   yes  3 16  1  20
   Sum  8 35  7  50

    Pearson's Chi-squared test

data:  cTab
X-squared = 2.4256, df = 2, p-value = 0.2974
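
The original data were probably simulated and are not shown, so this sketch enters the printed counts directly. The name `cTab` comes from the output; `chisqRes` is an assumed name.

```r
# Counts copied from the table above (rows: smokes, columns: siblings)
cTab <- as.table(matrix(c(5, 3, 19, 16, 6, 1), nrow = 2,
                        dimnames = list(smokes   = c("no", "yes"),
                                        siblings = c("0", "1", "2"))))
addmargins(cTab)

# Several expected counts are below 5, so chisq.test() warns that the
# chi-squared approximation may be inaccurate; the warning is silenced here
chisqRes <- suppressWarnings(chisq.test(cTab))
chisqRes
chisqRes$expected   # expected counts under independence
```

With expected counts this small, `chisq.test(cTab, simulate.p.value = TRUE)` or `fisher.test(cTab)` may be a safer choice.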

Also for higher-order tables
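
The original example for this case is not shown. One way to test the mutual independence of all factors in a higher-order table is `summary()` applied to a table object; the sketch below uses the built-in three-way table `UCBAdmissions` purely as stand-in data.

```r
# UCBAdmissions is a 2 x 2 x 6 table (Admit x Gender x Dept) shipped with R
dim(UCBAdmissions)

# summary() of a table runs a chi-squared test for independence of all factors
sumRes <- summary(UCBAdmissions)
sumRes
sumRes$statistic   # chi-squared statistic
sumRes$parameter   # degrees of freedom
sumRes$p.value
```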

Measures of association: \(\phi\), Cramer’s \(V\), contingency coefficient

     DV2
DV1    A  B  C Sum
  A    2  0  0   2
  B    0  3  2   5
  C    0  1  2   3
  Sum  2  4  4  10

                       estimate  lwr.ci  upr.ci
Contingency Coeff.       0.7184       -       -
Cramer V                 0.7303  0.0000  1.0000
Kendall Tau-b            0.6350  0.1884  1.0000
Goodman Kruskal Gamma    0.8333  0.4513  1.0000
Stuart Tau-c             0.6000  0.1151  1.0000
Somers D C|R             0.6452  0.2040  1.0000
Somers D R|C             0.6250  0.1897  1.0000
Pearson Correlation      0.7254  0.1763  0.9302
Spearman Correlation     0.6761  0.0810  0.9159
Lambda C|R               0.5000  0.0000  1.0000
Lambda R|C               0.4000  0.0000  0.8294
Lambda sym               0.4545  0.0591  0.8500
Uncertainty Coeff. C|R   0.4774  0.1492  0.8055
Uncertainty Coeff. R|C   0.4890  0.1519  0.8260
Uncertainty Coeff. sym   0.4831  0.1522  0.8140
Mutual Information       0.7610       -       -
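
The long listing looks like the output of a DescTools summary function. Cramér's \(V\) and the contingency coefficient can also be derived by hand from the \(\chi^{2}\) statistic, as in this base-R sketch (all variable names are assumptions):

```r
# Counts copied from the table above (rows: DV1, columns: DV2)
cTab2 <- as.table(matrix(c(2, 0, 0,  0, 3, 1,  0, 2, 2), nrow = 3,
                         dimnames = list(DV1 = c("A", "B", "C"),
                                         DV2 = c("A", "B", "C"))))

# Pearson chi-squared statistic (small-sample warning silenced)
chisqVal <- unname(suppressWarnings(chisq.test(cTab2, correct = FALSE))$statistic)
N <- sum(cTab2)          # total number of observations
k <- min(dim(cTab2))     # smaller of the number of rows and columns

cramerV  <- sqrt(chisqVal / (N * (k - 1)))   # Cramer's V
contCoef <- sqrt(chisqVal / (chisqVal + N))  # contingency coefficient
c(cramerV = cramerV, contCoef = contCoef)
```

For a \((2 \times 2)\)-table, \(\phi = \sqrt{\chi^{2} / N}\) and coincides with Cramér's \(V\).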

Cochran-Mantel-Haenszel test for three-way tables


    Approximative Generalized Cochran-Mantel-Haenszel Test

data:  sex by
     work (home, office) 
     stratified by group
chi-squared = 1.8, p-value = 0.4269
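
The output above comes from an approximative (resampling-based) test, presumably from package coin, and its data are not shown. As a rough base-R counterpart, `mantelhaen.test()` runs the asymptotic Cochran-Mantel-Haenszel test on a three-way table; the sketch below swaps in the built-in `UCBAdmissions` table as stand-in data.

```r
# Test conditional independence of Admit and Gender within each stratum (Dept)
# UCBAdmissions: 2 x 2 x 6 table, the third dimension defines the strata
cmhRes <- mantelhaen.test(UCBAdmissions)
cmhRes
cmhRes$statistic   # Mantel-Haenszel chi-squared statistic
cmhRes$p.value
cmhRes$estimate    # common odds ratio across strata (2 x 2 x K tables only)
```

Because this variant relies on the asymptotic \(\chi^{2}\) distribution rather than resampling, its p-values will generally differ somewhat from an approximative test on the same data.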

Useful packages

  • Package riskyr provides many analysis methods for confusion tables, including sensitivity, specificity, relative risk, etc.
  • Package exact2x2 addresses the problem that the p-value and the confidence interval reported by some base-R functions, such as fisher.test(), can contradict each other (e.g., a significant p-value alongside an interval that includes the null value). It also provides unconditional tests for tables whose margins are not fixed, such as the Boschloo test.
  • Package contingencytables provides many more options to analyze contingency tables.

Detach (automatically) loaded packages (if possible)
