Frequency tables

Install required packages

DescTools, dplyr

Category frequencies for one variable

Absolute frequencies

 [1] "C" "C" "B" "B" "C" "E" "D" "A" "B" "C" "E" "C"
myLetters
A B C D E 
1 3 5 1 2 
[1] "A" "B" "C" "D" "E"
B 
3 
[Figure: plot of chunk rerFrequencies01]
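
The output above can be reproduced with a sketch like the following. The data are typed in literally from the printed vector (the original chunk presumably generated them some other way), and the choice of barplot() for the figure is an assumption.

```r
# Data copied from the printed output above
myLetters <- c("C", "C", "B", "B", "C", "E", "D", "A", "B", "C", "E", "C")

tab <- table(myLetters)   # absolute frequency of each category
print(tab)

names(tab)                # category labels: "A" "B" "C" "D" "E"
tab["B"]                  # frequency of a single category, selected by name

# Assumed plot type: one bar per category
barplot(tab, main = "Counts")
```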

(Cumulative) relative frequencies

myLetters
         A          B          C          D          E 
0.08333333 0.25000000 0.41666667 0.08333333 0.16666667 
         A          B          C          D          E 
0.08333333 0.33333333 0.75000000 0.83333333 1.00000000 
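
A minimal sketch of how these numbers can be obtained in base R: prop.table() divides each count by the total, and cumsum() accumulates the resulting proportions. The variable names relFreq and cumFreq are illustrative.

```r
myLetters <- c("C", "C", "B", "B", "C", "E", "D", "A", "B", "C", "E", "C")
tab <- table(myLetters)

relFreq <- prop.table(tab)   # relative frequencies, sum to 1
cumFreq <- cumsum(relFreq)   # cumulative relative frequencies

print(relFreq)
print(cumFreq)
```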

Counting non-existent categories

 [1] C C B B C E D A B C E C
Levels: A B C D E Q
letFac
A B C D E Q 
1 3 5 1 2 0 
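
One way to get a zero count for a category that never occurs is to convert the vector to a factor whose levels include that category. The sketch below assumes the extra level "Q" shown in the output.

```r
myLetters <- c("C", "C", "B", "B", "C", "E", "D", "A", "B", "C", "E", "C")

# Declare all possible categories, including the unobserved "Q"
letFac <- factor(myLetters, levels = c(LETTERS[1:5], "Q"))
print(letFac)

# table() reports every factor level, so "Q" appears with count 0
table(letFac)
```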

Counting runs

 [1] "f" "m" "m" "m" "f" "f" "m" "m" "m" "m" "f" "m" "m"
Run Length Encoding
  lengths: int [1:6] 1 3 2 4 1 2
  values : chr [1:6] "f" "m" "f" "m" "f" "m"
[1] 6
 [1] "f" "m" "m" "m" "f" "f" "m" "m" "m" "m" "f" "m" "m"
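
The run-length output above matches what base R's rle() returns. In this sketch the vector is entered literally, and inverse.rle() is used to recover the original sequence.

```r
vec <- c("f", "m", "m", "m", "f", "f", "m", "m", "m", "m", "f", "m", "m")

res <- rle(vec)        # lengths and values of consecutive runs
print(res)

length(res$lengths)    # number of runs

inverse.rle(res)       # reconstruct the original vector from the runs
```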

Contingency tables for two or more variables

Absolute frequencies using table()

 [1] f f m f f f f m m f
Levels: f m
 [1] office home   office home   office office home   home   home   home  
Levels: home office
   work
sex home office
  f    4      3
  m    2      1
Number of cases in table: 10 
Number of factors: 2 
Test for independence of all factors:
    Chisq = 0.07937, df = 1, p-value = 0.7782
    Chi-squared approximation may be incorrect
[Figure: plot of chunk rerFrequencies02]
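
A sketch that reproduces the table and the independence test, with both factors entered literally from the output. The grouped bar plot is an assumed stand-in for the original figure.

```r
sex  <- factor(c("f", "f", "m", "f", "f", "f", "f", "m", "m", "f"))
work <- factor(c("office", "home", "office", "home", "office",
                 "office", "home", "home", "home", "home"))

cTab <- table(sex, work)   # two-way contingency table
print(cTab)

# Number of cases, number of factors, chi-squared test of independence
summary(cTab)

# Assumed plot type: bars grouped by work location
barplot(cTab, beside = TRUE, legend.text = rownames(cTab), ylab = "absolute frequency")
```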

Using xtabs()

   sex   work counts
1    f office      3
2    f   home      1
3    m office      4
4    f   home      0
5    f office      0
6    f office      1
7    f   home      2
8    m   home      3
9    m   home      4
10   f   home      4
   work
sex home office
  f    4      3
  m    2      1
   work
sex home office
  f    7      4
  m    7      4
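
The two tables above are consistent with the formula interface of xtabs(): an empty left-hand side counts rows, while a variable on the left-hand side sums that variable within each cell. The data frame name persons is an assumption.

```r
persons <- data.frame(
  sex    = c("f", "f", "m", "f", "f", "f", "f", "m", "m", "f"),
  work   = c("office", "home", "office", "home", "office",
             "office", "home", "home", "home", "home"),
  counts = c(3, 1, 4, 0, 0, 1, 2, 3, 4, 4)
)

# Count rows per sex x work cell
tabRows <- xtabs(~ sex + work, data = persons)
print(tabRows)

# Sum the counts variable per sex x work cell
tabSums <- xtabs(counts ~ sex + work, data = persons)
print(tabSums)
```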

Marginal sums and means

f m 
7 3 
  home office 
     3      2 
Margins computed over dimensions
in the following order:
1: sex
2: work
      work
sex    home office mean
  f     4.0    3.0  3.5
  m     2.0    1.0  1.5
  mean  3.0    2.0  2.5
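
A possible way to compute these margins in base R: apply() or rowSums() for the row totals, colMeans() for the column means, and addmargins() with FUN=mean to append the means to the table itself.

```r
sex  <- factor(c("f", "f", "m", "f", "f", "f", "f", "m", "m", "f"))
work <- factor(c("office", "home", "office", "home", "office",
                 "office", "home", "home", "home", "home"))
cTab <- table(sex, work)

apply(cTab, MARGIN = 1, FUN = sum)   # row sums, same as rowSums(cTab)
colMeans(cTab)                       # column means

# Append means over both dimensions as an extra row and column
withMeans <- addmargins(cTab, margin = c(1, 2), FUN = mean)
print(withMeans)
```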

Relative frequencies

   work
sex home office
  f  0.4    0.3
  m  0.2    0.1
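
Without a margin argument, prop.table() divides every cell by the grand total, which gives the joint relative frequencies shown above. A short sketch:

```r
sex  <- factor(c("f", "f", "m", "f", "f", "f", "f", "m", "m", "f"))
work <- factor(c("office", "home", "office", "home", "office",
                 "office", "home", "home", "home", "home"))
cTab <- table(sex, work)

relTab <- prop.table(cTab)   # each cell divided by the total of 10 cases
print(relTab)
```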

Conditional relative frequencies

   work
sex      home    office
  f 0.5714286 0.4285714
  m 0.6666667 0.3333333
   work
sex      home    office
  f 0.6666667 0.7500000
  m 0.3333333 0.2500000
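
The first table conditions on sex (each row sums to 1), the second on work (each column sums to 1). With prop.table() this corresponds to margin 1 and margin 2 respectively, as sketched here.

```r
sex  <- factor(c("f", "f", "m", "f", "f", "f", "f", "m", "m", "f"))
work <- factor(c("office", "home", "office", "home", "office",
                 "office", "home", "home", "home", "home"))
cTab <- table(sex, work)

bySex  <- prop.table(cTab, margin = 1)   # proportions within each sex (rows)
byWork <- prop.table(cTab, margin = 2)   # proportions within each work location (columns)

print(bySex)
print(byWork)
```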

Flat contingency tables for more than two variables

 [1] A B B A B A A B B A
Levels: A B
       sex   f   m  
       group A B A B
work                
home         3 1 0 2
office       2 1 0 1
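
A flat table like the one above can be built with ftable() from a three-way table. The sketch enters all three factors literally and places work in the rows, with sex and group spread across the columns.

```r
sex   <- factor(c("f", "f", "m", "f", "f", "f", "f", "m", "m", "f"))
work  <- factor(c("office", "home", "office", "home", "office",
                  "office", "home", "home", "home", "home"))
group <- factor(c("A", "B", "B", "A", "B", "A", "A", "B", "B", "A"))

tab3 <- table(work, sex, group)   # three-dimensional contingency table

# Flatten: work in rows, sex x group combinations in columns
fTab <- ftable(tab3, row.vars = "work", col.vars = c("sex", "group"))
print(fTab)
```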

Recovering the original data from contingency tables

Individual-level data frame

   sex   work
1    f   home
2    f   home
3    f   home
4    f   home
5    m   home
6    m   home
7    f office
8    f office
9    f office
10   m office
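
One base R way to expand a contingency table back into one row per person is to repeat each cell's row as often as its frequency. This is a sketch, not necessarily the approach the original chunk used (the DescTools package, listed above, offers Untable() for the same purpose).

```r
sex  <- factor(c("f", "f", "m", "f", "f", "f", "f", "m", "m", "f"))
work <- factor(c("office", "home", "office", "home", "office",
                 "office", "home", "home", "home", "home"))
cTab <- table(sex, work)

cells <- as.data.frame(cTab)                    # one row per cell, with column Freq
idx   <- rep(seq_len(nrow(cells)), cells$Freq)  # repeat each row index Freq times
indDf <- cells[idx, c("sex", "work")]
rownames(indDf) <- NULL                         # renumber rows 1..10

print(indDf)
```

The row order follows the cells of the table, not the order of the original observations, because a contingency table does not store that order.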

Group-level data frame

  sex   work Freq
1   f   home    4
2   m   home    2
3   f office    3
4   m office    1
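
The group-level layout above is what as.data.frame() returns for a table object: one row per cell, with the count in a column named Freq.

```r
sex  <- factor(c("f", "f", "m", "f", "f", "f", "f", "m", "m", "f"))
work <- factor(c("office", "home", "office", "home", "office",
                 "office", "home", "home", "home", "home"))
cTab <- table(sex, work)

grpDf <- as.data.frame(cTab)   # columns: sex, work, Freq
print(grpDf)
```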

Percentile rank

 [1]  0.90  0.88  0.82  0.69  0.55 -0.06 -0.31 -0.38 -0.69 -0.21
 [1] 100  90  80  70  60  50  30  20  10  40
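
The percentile ranks above equal the rank of each value divided by the number of values, times 100. The sketch below uses base R's rank() to make that arithmetic explicit; the original chunk may have used a different function.

```r
vec <- c(0.90, 0.88, 0.82, 0.69, 0.55, -0.06, -0.31, -0.38, -0.69, -0.21)

# Percentage of values less than or equal to each value
pRank <- 100 * rank(vec) / length(vec)
print(pRank)
```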

Using base R

 [1] 1.0 0.9 0.8 0.7 0.6 0.5 0.3 0.2 0.1 0.4
[1] 50
 [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
 [1] -0.69 -0.38 -0.31 -0.21 -0.06  0.55  0.69  0.82  0.88  0.90
[Figure: plot of chunk rerFrequencies03]
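
These results are consistent with the empirical cumulative distribution function from ecdf(). In this sketch Fn is an illustrative name, and plotting the step function is an assumption about the figure.

```r
vec <- c(0.90, 0.88, 0.82, 0.69, 0.55, -0.06, -0.31, -0.38, -0.69, -0.21)

Fn <- ecdf(vec)    # returns a function: proportion of values <= x

Fn(vec)            # cumulative proportion at each observed value
100 * Fn(0.1)      # percentile rank of an arbitrary value, here 0.1
Fn(sort(vec))      # cumulative proportions in ascending order
knots(Fn)          # the sorted values where the step function jumps

plot(Fn, main = "Cumulative frequencies")
```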

Using package dplyr

Data set
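
The data frame is not printed at this point, but its contents can be read off the output further down. The sketch below enters those values literally. The name datL and the factor level order (CG, WL, T) are assumptions inferred from the printed tables.

```r
datL <- data.frame(
  id     = 1:12,
  sex    = factor(c("m", "m", "f", "f", "m", "f", "m", "m", "f", "m", "m", "f")),
  group  = factor(c("WL", "T", "T", "T", "CG", "WL", "CG", "CG", "CG", "WL", "WL", "T"),
                  levels = c("CG", "WL", "T")),
  age    = c(29, 31, 20, 31, 24, 20, 32, 22, 25, 27, 35, 27),
  IQ     = c(106, 112, 98, 97, 86, 89, 79, 109, 100, 90, 96, 88),
  rating = c(1, 3, 2, 4, 2, 2, 3, 4, 1, 2, 2, 4)
)

str(datL)
```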

Absolute frequencies

count() returns the frequency of each combination of the given variables in a new column, which is named n by default.

  sex group n
1   f    CG 1
2   f    WL 1
3   f     T 3
4   m    CG 3
5   m    WL 3
6   m     T 1
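
A minimal sketch of the count() call, using only the two grouping columns of the data. The name datL and the level order of group are assumptions.

```r
library(dplyr)

datL <- data.frame(
  sex   = factor(c("m", "m", "f", "f", "m", "f", "m", "m", "f", "m", "m", "f")),
  group = factor(c("WL", "T", "T", "T", "CG", "WL", "CG", "CG", "CG", "WL", "WL", "T"),
                 levels = c("CG", "WL", "T"))
)

# One row per observed sex x group combination, frequency in column n
freqs <- datL %>% count(sex, group)
print(freqs)
```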

By default, count() drops combinations of groups that have no entries. Use option .drop=FALSE to keep them with frequency 0. In this data set every combination occurs, so the output is the same as before.

  sex group n
1   f    CG 1
2   f    WL 1
3   f     T 3
4   m    CG 3
5   m    WL 3
6   m     T 1
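
To see the effect of .drop=FALSE, the sketch below removes the single male participant in group T first, which is a deliberate change to the data for illustration only. The names datL and datSub are assumptions.

```r
library(dplyr)

datL <- data.frame(
  sex   = factor(c("m", "m", "f", "f", "m", "f", "m", "m", "f", "m", "m", "f")),
  group = factor(c("WL", "T", "T", "T", "CG", "WL", "CG", "CG", "CG", "WL", "WL", "T"),
                 levels = c("CG", "WL", "T"))
)

# Illustration only: drop the one m / T row so that this combination is empty
datSub <- datL %>% filter(!(sex == "m" & group == "T"))

dropped <- datSub %>% count(sex, group)                 # empty combination omitted
kept    <- datSub %>% count(sex, group, .drop = FALSE)  # empty combination kept with n = 0

print(dropped)
print(kept)
```

The empty combination is only known to count() because sex and group are factors, which carry their full set of levels.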

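Add the corresponding count as a new column n to the existing data frame, so that each row carries the frequency of its own sex and group combination.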

   id sex group age  IQ rating n
1   1   m    WL  29 106      1 3
2   2   m     T  31 112      3 1
3   3   f     T  20  98      2 3
4   4   f     T  31  97      4 3
5   5   m    CG  24  86      2 3
6   6   f    WL  20  89      2 1
7   7   m    CG  32  79      3 3
8   8   m    CG  22 109      4 3
9   9   f    CG  25 100      1 1
10 10   m    WL  27  90      2 3
11 11   m    WL  35  96      2 3
12 12   f     T  27  88      4 3
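
dplyr's add_count() keeps all rows and appends the group frequency as a column. The sketch uses a reduced version of the data (id, sex, group only), and the name datL is an assumption.

```r
library(dplyr)

datL <- data.frame(
  id    = 1:12,
  sex   = factor(c("m", "m", "f", "f", "m", "f", "m", "m", "f", "m", "m", "f")),
  group = factor(c("WL", "T", "T", "T", "CG", "WL", "CG", "CG", "CG", "WL", "WL", "T"),
                 levels = c("CG", "WL", "T"))
)

# Same number of rows as datL, plus column n with the sex x group frequency
withN <- datL %>% add_count(sex, group)
print(withN)
```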

Relative frequencies

  sex group n   freq_rel
1   f    CG 1 0.08333333
2   f    WL 1 0.08333333
3   f     T 3 0.25000000
4   m    CG 3 0.25000000
5   m    WL 3 0.25000000
6   m     T 1 0.08333333
   id sex group age  IQ rating n   freq_rel
1   1   m    WL  29 106      1 3 0.25000000
2   2   m     T  31 112      3 1 0.08333333
3   3   f     T  20  98      2 3 0.25000000
4   4   f     T  31  97      4 3 0.25000000
5   5   m    CG  24  86      2 3 0.25000000
6   6   f    WL  20  89      2 1 0.08333333
7   7   m    CG  32  79      3 3 0.25000000
8   8   m    CG  22 109      4 3 0.25000000
9   9   f    CG  25 100      1 1 0.08333333
10 10   m    WL  27  90      2 3 0.25000000
11 11   m    WL  35  96      2 3 0.25000000
12 12   f     T  27  88      4 3 0.25000000
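
Relative frequencies follow from dividing each count by the total number of observations. The sketch shows both the aggregated and the row-wise variant, with the column name freq_rel taken from the output and the name datL assumed.

```r
library(dplyr)

datL <- data.frame(
  id    = 1:12,
  sex   = factor(c("m", "m", "f", "f", "m", "f", "m", "m", "f", "m", "m", "f")),
  group = factor(c("WL", "T", "T", "T", "CG", "WL", "CG", "CG", "CG", "WL", "WL", "T"),
                 levels = c("CG", "WL", "T"))
)

# Aggregated: one row per combination, count divided by the sum of counts
relAgg <- datL %>%
  count(sex, group) %>%
  mutate(freq_rel = n / sum(n))

# Row-wise: keep all rows, count divided by the number of rows
relRows <- datL %>%
  add_count(sex, group) %>%
  mutate(freq_rel = n / n())

print(relAgg)
print(relRows)
```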

Conditional relative frequencies

# A tibble: 6 x 5
  sex   group n_sex_group n_sex freq_cond_rel
  <fct> <fct>       <int> <int>         <dbl>
1 f     CG              1     5         0.2  
2 f     WL              1     5         0.2  
3 f     T               3     5         0.6  
4 m     CG              3     7         0.429
5 m     WL              3     7         0.429
6 m     T               1     7         0.143
   id sex group n_sex n_sex_group freq_cond_rel
1   1   m    WL     7           3     0.4285714
2   2   m     T     7           1     0.1428571
3   3   f     T     5           3     0.6000000
4   4   f     T     5           3     0.6000000
5   5   m    CG     7           3     0.4285714
6   6   f    WL     5           1     0.2000000
7   7   m    CG     7           3     0.4285714
8   8   m    CG     7           3     0.4285714
9   9   f    CG     5           1     0.2000000
10 10   m    WL     7           3     0.4285714
11 11   m    WL     7           3     0.4285714
12 12   f     T     5           3     0.6000000
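
Here the frequencies are conditional on sex: each sex x group count is divided by the number of participants of that sex. One way to sketch this with dplyr, using the column names from the output (datL is an assumed name):

```r
library(dplyr)

datL <- data.frame(
  id    = 1:12,
  sex   = factor(c("m", "m", "f", "f", "m", "f", "m", "m", "f", "m", "m", "f")),
  group = factor(c("WL", "T", "T", "T", "CG", "WL", "CG", "CG", "CG", "WL", "WL", "T"),
                 levels = c("CG", "WL", "T"))
)

# Aggregated: count per sex x group, then total per sex within each sex
condAgg <- datL %>%
  count(sex, group, name = "n_sex_group") %>%
  group_by(sex) %>%
  mutate(n_sex = sum(n_sex_group),
         freq_cond_rel = n_sex_group / n_sex) %>%
  ungroup()

# Row-wise: append both counts to every row, then divide
condRows <- datL %>%
  add_count(sex, name = "n_sex") %>%
  add_count(sex, group, name = "n_sex_group") %>%
  mutate(freq_cond_rel = n_sex_group / n_sex)

print(condAgg)
print(condRows)
```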

Percent rank

   id sex group age  IQ rating rating_pr
1   1   m    WL  29 106      1   0.00000
2   2   m     T  31 112      3  63.63636
3   3   f     T  20  98      2  18.18182
4   4   f     T  31  97      4  81.81818
5   5   m    CG  24  86      2  18.18182
6   6   f    WL  20  89      2  18.18182
7   7   m    CG  32  79      3  63.63636
8   8   m    CG  22 109      4  81.81818
9   9   f    CG  25 100      1   0.00000
10 10   m    WL  27  90      2  18.18182
11 11   m    WL  35  96      2  18.18182
12 12   f     T  27  88      4  81.81818
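
The values in rating_pr match dplyr's percent_rank(), which rescales the minimum rank to the interval from 0 to 1, multiplied by 100. A sketch on the rating column only (datL is an assumed name):

```r
library(dplyr)

datL <- data.frame(
  id     = 1:12,
  rating = c(1, 3, 2, 4, 2, 2, 3, 4, 1, 2, 2, 4)
)

# percent_rank() computes (min_rank - 1) / (n - 1); ties share the lowest rank
ranked <- datL %>%
  mutate(rating_pr = 100 * percent_rank(rating))

print(ranked)
```

Unlike the ecdf-based percentile rank, this definition assigns 0 to the smallest value and counts only strictly smaller values, so the two measures differ when there are ties.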

Detach (automatically) loaded packages (if possible)
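
A minimal sketch for this step, assuming the two packages named at the top of the page. Wrapping detach() in try() keeps the script running if a package was never attached.

```r
# Detach the packages if they are currently attached; ignore errors otherwise
try(detach(package:dplyr), silent = TRUE)
try(detach(package:DescTools), silent = TRUE)

search()   # inspect which packages remain attached
```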
