Factors: Representing categorical data

TODO

  • link to recode for transforming continuous variables into factors

Install required packages

forcats
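
A minimal sketch of one way to make sure the package is available before running the examples. The variable names `wants` and `has` are assumptions for illustration, not taken from the original.

```r
# Packages this page relies on
wants <- c("forcats")

# Check which of them are already installed
has <- wants %in% rownames(installed.packages())

# Install only the missing ones (needs a configured CRAN mirror)
if (any(!has)) install.packages(wants[!has])
```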

Unordered factors

Create factors from existing variables

[1] m f f m m m f f
Levels: f m
[1] 1 1 3 3 4 4
Levels: 1 2 3 4 5
 [1] 0 1 1 1 1 1 0 0 0 0
 [1] man   woman woman woman woman woman man   man   man   man  
Levels: man woman
[1] male   female female male   male   male   female female
Levels: female male
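
The output above can be reproduced along the following lines. Variable names (`sex`, `sexFac`, `sexNum`) are assumptions, and the 0/1 vector is typed in by hand from the printed values; it may originally have been drawn at random.

```r
# Character vector to factor: levels default to the sorted unique values
sex    <- c("m", "f", "f", "m", "m", "m", "f", "f")
sexFac <- factor(sex)
sexFac

# Declare levels explicitly, including levels that do not occur in the data
factor(c(1, 1, 3, 3, 4, 4), levels = 1:5)

# Attach labels to a numeric 0/1 coding (0 becomes "man", 1 becomes "woman")
sexNum <- c(0, 1, 1, 1, 1, 1, 0, 0, 0, 0)
sexNum
factor(sexNum, labels = c("man", "woman"))

# Rename levels in place; new names are matched to the current level order
levels(sexFac) <- c("female", "male")
sexFac
```

Because `levels<-` matches by position, "f" becomes "female" and "m" becomes "male" only because the existing levels are sorted as `f`, `m`.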

Generate factors

 [1] A A A A A B B B B B
Levels: A B
 [1] less less less less less more more more more more
Levels: less < more
 [1] more less more more less less less more less more
Levels: less < more
   IV1 IV2
1    a   1
2    a   1
3    b   1
4    b   1
5    a   2
6    a   2
7    b   2
8    b   2
9    a   3
10   a   3
11   b   3
12   b   3
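
A sketch of calls that would produce output of this form. The names `fac1`, `fac2` and `design` are assumptions; the permutation from `sample()` changes between runs, so it will not match the printed order exactly.

```r
# Repeat each label a given number of times
fac1 <- factor(rep(c("A", "B"), c(5, 5)))
fac1

# gl() builds a factor from the number of levels and replications per level
fac2 <- gl(2, 5, labels = c("less", "more"), ordered = TRUE)
fac2

# Random permutation of an existing factor (result differs between runs)
sample(fac2, length(fac2), replace = FALSE)

# All combinations of two factors, with two rows per combination
design <- expand.grid(IV1 = gl(2, 2, labels = c("a", "b")), IV2 = gl(3, 1))
design
```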

Information about factors

[1] 2
female   male 
     4      4 
[1] "female" "male"  
 Factor w/ 2 levels "female","male": 2 1 1 2 2 2 1 1
[1] 2 1 1 2 2 2 1 1
attr(,"levels")
[1] "female" "male"  
[1] 1 2 3 4 5 6
attr(,"levels")
[1] "10" "11" "12" "13" "14" "15"
[1] "male"   "female" "female" "male"   "male"   "male"   "female" "female"
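
The following base R calls return the kinds of information shown above. The factor `sexFac` is rebuilt here so the block runs on its own; its name is an assumption.

```r
sexFac <- factor(c("m", "f", "f", "m", "m", "m", "f", "f"),
                 labels = c("female", "male"))

nlevels(sexFac)          # number of levels
summary(sexFac)          # frequency of each level
levels(sexFac)           # level names as a character vector
str(sexFac)              # compact structure: levels plus integer codes
unclass(sexFac)          # underlying integer codes with the levels attribute
unclass(factor(10:15))   # codes run 1..6 even though the labels are "10".."15"
as.character(sexFac)     # back to the level labels as strings
```

The `unclass(factor(10:15))` line shows why `as.numeric()` on a factor returns codes rather than the original numbers; converting via `as.character()` first avoids that.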

Joining factors

Concatenating factors

[1] A B E D
Levels: A B C D E
[1] e b d
Levels: a b c d e
[1] A B E D e b d
Levels: A B C D E a b c d e
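
One portable way to get the combined factor is to concatenate the labels and the level sets explicitly. The names `fac1`, `fac2` and `facAll` are assumptions. From R 4.1.0 on, `c(fac1, fac2)` should give the same result directly; older versions return the integer codes instead.

```r
fac1 <- factor(c("A", "B", "E", "D"), levels = LETTERS[1:5])
fac2 <- factor(c("e", "b", "d"),      levels = letters[1:5])
fac1
fac2

# Concatenate the labels, then rebuild the factor with the union of the levels
facAll <- factor(c(as.character(fac1), as.character(fac2)),
                 levels = c(levels(fac1), levels(fac2)))
facAll
```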

Repeating factors

[1] A B E D A B E D
Levels: A B C D E
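
Repetition keeps the factor class and all levels, including unused ones. A short sketch, with `fac1` and `facRep` as assumed names:

```r
fac1 <- factor(c("A", "B", "E", "D"), levels = LETTERS[1:5])

# rep() has a factor method, so the result is again a factor
facRep <- rep(fac1, times = 2)
facRep
```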

Crossing two factors

 [1] lo lo lo lo lo lo hi hi hi hi hi hi
Levels: hi lo
 [1] 1 2 3 1 2 3 1 2 3 1 2 3
Levels: 1 2 3
 [1] lo.1 lo.2 lo.3 lo.1 lo.2 lo.3 hi.1 hi.2 hi.3 hi.1 hi.2 hi.3
Levels: hi.1 lo.1 hi.2 lo.2 hi.3 lo.3
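
The crossed factor above is what `interaction()` returns for two factors of equal length. The names `Njk`, `P`, `Q`, `IV1`, `IV2` and `IV12` are assumptions chosen for this sketch.

```r
Njk <- 2   # observations per cell
P   <- 2   # number of levels of the first factor
Q   <- 3   # number of levels of the second factor

IV1 <- factor(rep(c("lo", "hi"), each = Njk * Q))
IV2 <- factor(rep(1:Q, times = Njk * P))
IV1
IV2

# Each combination of levels becomes one level of the new factor
IV12 <- interaction(IV1, IV2)
IV12
```

In the level order of the result, the levels of the first factor vary fastest, which is why `hi.1` is followed by `lo.1`.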

Ordered factors

[1] hi  lo  hi  mid
Levels: hi lo mid
[1] hi  lo  hi  mid
Levels: lo < mid < hi
[1] TRUE
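
An ordered factor can be built by passing the levels in ascending order together with `ordered = TRUE`. The names `status` and `ordStat` are assumptions.

```r
status <- factor(c("hi", "lo", "hi", "mid"))
status

# Levels listed from lowest to highest
ordStat <- factor(status, levels = c("lo", "mid", "hi"), ordered = TRUE)
ordStat

# Comparison operators work on ordered factors: is "hi" greater than "lo"?
ordStat[1] > ordStat[2]
```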

Control the order of factor levels

Free ordering of group levels

 [1] "A" "A" "A" "A" "A" "B" "B" "B" "B" "B" "C" "C" "C" "C" "C"
 [1] A A A A A B B B B B C C C C C
Levels: A B C
 [1] A A A A A B B B B B C C C C C
Levels: C A B
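
The level order is set through the `levels` argument of `factor()`; the data values themselves stay unchanged. A sketch with assumed names `chars`, `fac1` and `facCAB`:

```r
chars <- rep(LETTERS[1:3], each = 5)
chars

# Default: alphabetical level order
fac1 <- factor(chars)
fac1

# Explicit level order
facCAB <- factor(chars, levels = c("C", "A", "B"))
facCAB
```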

Using fct_relevel() from package forcats

 [1] A A A A A B B B B B C C C C C
Levels: A B C
 [1] A A A A A B B B B B C C C C C
Levels: B A C
 [1] A A A A A B B B B B C C C C C
Levels: B C A
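
With forcats, single levels can be moved without spelling out the full order. This sketch assumes forcats is installed; `fac1`, `facB` and `facEnd` are assumed names.

```r
library(forcats)

fac1 <- factor(rep(LETTERS[1:3], each = 5))
fac1

# Move "B" to the front; the remaining levels keep their order
facB <- fct_relevel(fac1, "B")
facB

# Move "A" to the end
facEnd <- fct_relevel(fac1, "A", after = Inf)
facEnd
```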

Reorder group levels according to group statistics

        A         B         C 
10.571183  3.775138 13.249144 
 [1] A A A A A B B B B B C C C C C
attr(,"scores")
        A         B         C 
10.571183  3.775138 13.249144 
Levels: B A C
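
`reorder()` sorts the levels by a statistic computed per group. The scores in `vec` below are hypothetical values made up for this sketch, not the data behind the output above, so the group means differ from the printed ones.

```r
fac1 <- factor(rep(LETTERS[1:3], each = 5))

# Hypothetical scores: five per group, in the order A, B, C
vec <- c( 9, 10, 11, 10, 12,
          3,  4,  5,  2,  4,
         13, 14, 12, 15, 13)

# Group means
tapply(vec, fac1, FUN = mean)

# Levels sorted by increasing group mean
facRe <- reorder(fac1, vec, FUN = mean)
facRe
```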

Relevance of level order for sorting factors

 [1] B A B A B B A B A A
Levels: B A
 [1] B B B B B A A A A A
Levels: B A
 [1] "A" "A" "A" "A" "A" "B" "B" "B" "B" "B"
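
Sorting a factor follows the level order, while sorting the character labels follows alphabetical order. A sketch, with `fac2` as an assumed name and the values typed in from the output above:

```r
fac2 <- factor(c("B", "A", "B", "A", "B", "B", "A", "B", "A", "A"),
               levels = c("B", "A"))
fac2

# Sorted by level order: all "B" before "A"
sort(fac2)

# Sorted alphabetically after conversion to character
sort(as.character(fac2))
```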

Add, combine and remove factor levels

Add factor levels

[1] hi lo hi
Levels: hi lo
[1] hi   lo   hi   <NA>
Levels: hi lo
[1] hi  lo  hi  mid
Levels: hi lo mid
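
A value can only be assigned if it is already among the levels; otherwise R issues a warning and stores `NA`. The sketch below shows both cases, with `status` and `withNA` as assumed names.

```r
status <- factor(c("hi", "lo", "hi"))
status

# "mid" is not a level yet: this gives a warning and an NA
withNA <- status
withNA[4] <- "mid"
withNA

# Add the level first, then assign
levels(status) <- c(levels(status), "mid")
status[4] <- "mid"
status
```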

Using package forcats

[1] hi  lo  hi  mid
Levels: hi lo mid new_level
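
With forcats, `fct_expand()` appends levels without touching the data. This sketch assumes forcats is installed; `status` and `statusNew` are assumed names.

```r
library(forcats)

status <- factor(c("hi", "lo", "hi", "mid"))

# Append a level that does not occur in the data yet
statusNew <- fct_expand(status, "new_level")
statusNew
```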

Combine factor levels

[1] hi    notHi hi    notHi
Levels: hi notHi
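
In base R, levels can be merged by assigning a named list to `levels()`: each list name is a new level, each element lists the old levels it absorbs. The names `status` and `hiNotHi` are assumptions.

```r
status <- factor(c("hi", "lo", "hi", "mid"))

# Merge "mid" and "lo" into the single level "notHi"
hiNotHi <- status
levels(hiNotHi) <- list(hi = "hi", notHi = c("mid", "lo"))
hiNotHi
```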

Using package forcats

[1] hi    notHi hi    notHi
Levels: hi notHi
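
The forcats equivalent is `fct_collapse()`, which takes the new level as argument name and the old levels as a character vector. This sketch assumes forcats is installed; `status` and `hiNotHi2` are assumed names.

```r
library(forcats)

status <- factor(c("hi", "lo", "hi", "mid"))

# Merge "mid" and "lo" into "notHi"; "hi" is left as it is
hiNotHi2 <- fct_collapse(status, notHi = c("mid", "lo"))
hiNotHi2
```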

Remove factor levels

[1] hi lo
Levels: hi lo mid
[1] hi lo
Levels: hi lo
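
Subsetting a factor keeps all levels, even those that no longer occur. `droplevels()` removes the unused ones. The names `status`, `sub` and `subDrop` are assumptions.

```r
status <- factor(c("hi", "lo", "hi", "mid"))

# The subset still carries the unused level "mid"
sub <- status[1:2]
sub

# Drop levels that do not occur in the data
subDrop <- droplevels(sub)
subDrop
```

An alternative is `status[1:2, drop = TRUE]`, which drops unused levels while subsetting.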

Using package forcats

[1] hi lo
Levels: hi lo
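
With forcats, `fct_drop()` does the same job. This sketch assumes forcats is installed; `status` and `subDrop2` are assumed names.

```r
library(forcats)

status <- factor(c("hi", "lo", "hi", "mid"))

# Remove levels that do not occur in the subset
subDrop2 <- fct_drop(status[1:2])
subDrop2
```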

Detach (automatically) loaded packages (if possible)
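
A minimal sketch of detaching the package at the end of a session. The check against `search()` is an assumption added here so the call does not fail when the package was never attached.

```r
# Detach forcats only if it is currently on the search path
if ("package:forcats" %in% search()) {
    detach("package:forcats")
}
```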
