is.na()
In R, missing values are coded as NA (not available).
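A minimal sketch of how such values might be entered. The name vec1 is borrowed from the matrix output further down, and the exact calls are an assumption rather than the original code.

```r
# Sketch only: the calls below are assumed, chosen to match the printed output.
NA                                   # a single missing value
vec1 <- c(10, 20, NA, 40, 50, NA)    # NA can sit alongside ordinary values
vec1
length(vec1)                         # NA elements still count toward the length
```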
[1] NA
[1] 10 20 NA 40 50 NA
[1] 6
Identify missing values with is.na()
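The following sketch shows calls that would typically be used here. vec1 and vec2 take their values from the printed matrix below; vecMat is a hypothetical name introduced only for this illustration.

```r
vec1 <- c(10, 20, NA, 40, 50, NA)
vec2 <- c(NA, 7, 9, 10, 1, 8)

is.na(vec1)          # logical vector, TRUE where a value is missing
any(is.na(vec1))     # is at least one value missing?
which(is.na(vec1))   # positions of the missing values
sum(is.na(vec1))     # number of missing values

vecMat <- rbind(vec1, vec2)   # hypothetical name, one vector per row
vecMat
is.na(vecMat)        # is.na() keeps the shape of the matrix
```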
[1] FALSE FALSE TRUE FALSE FALSE TRUE
[1] TRUE
[1] 3 6
[1] 2
[,1] [,2] [,3] [,4] [,5] [,6]
vec1 10 20 NA 40 50 NA
vec2 NA 7 9 10 1 8
[,1] [,2] [,3] [,4] [,5] [,6]
vec1 FALSE FALSE TRUE FALSE FALSE TRUE
vec2 TRUE FALSE FALSE FALSE FALSE FALSE
NA in different situations
[1] "A" NA "C"
[1] A <NA> C
Levels: A C
[1] A <NA> C
Levels: A C <NA>
[1] NA
[1] TRUE
[1] FALSE TRUE FALSE NA FALSE TRUE
[1] 2 NA 5
[1] 2 5
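The results above are consistent with calls like the following. This is a sketch under assumptions: the values of vecNA are not shown in the text and were chosen to agree with the printed results.

```r
LETTERS[c(1, NA, 3)]                          # indexing with NA yields NA
factor(LETTERS[c(1, NA, 3)])                  # NA is not a level by default
factor(LETTERS[c(1, NA, 3)], exclude = NULL)  # keep NA as its own level

NA & TRUE     # NA: the result depends on the unknown value
TRUE | NA     # TRUE: the result is TRUE whatever the unknown value is

vecNA <- c(-3, 2, 0, NA, -7, 5)   # assumed values
vecNA > 0                  # comparing NA gives NA
vecNA[vecNA > 0]           # logical indexing keeps the NA
vecNA[which(vecNA > 0)]    # which() drops the NA before indexing
```

Wrapping a logical index in which() is a common way to keep NA out of a subset.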
NA
When data is entered in other applications (spreadsheets, SPSS, etc.), missing values are often coded as a reserved numeric value, e.g., 99 or 9999. These values need to be replaced with NA.
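One way to do the replacement is sketched here. The names vecAge and matAge are hypothetical, and the reserved codes -999 and 999 are assumed from the printed matrix below.

```r
# Hypothetical data: -999 and 999 stand in for missing values.
vecAge <- c(30, 25, 23, 21, -999, 999)
vecAge[vecAge %in% c(-999, 999)] <- NA    # recode the reserved values
vecAge

matAge <- matrix(c(30, 25, 23, 21, -999, 999), nrow = 2)
matAge
matAge[matAge %in% c(-999, 999)] <- NA    # works element-wise on matrices too
matAge
```

Using %in% rather than == keeps the index free of NA if the data already contains missing values.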
[1] 30 25 23 21 NA NA
[,1] [,2] [,3]
[1,] 30 23 -999
[2,] 25 21 999
[,1] [,2] [,3]
[1,] 30 23 NA
[2,] 25 21 NA
[1] NA
[1] -0.6
[1] 4.615192
[1] -3
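The four results above illustrate how statistical functions react to NA. A sketch that reproduces them, assuming a vector vecNA whose values are inferred from the printed numbers rather than given in the text:

```r
vecNA <- c(-3, 2, 0, NA, -7, 5)   # assumed values

mean(vecNA)                   # NA: missing values propagate by default
mean(vecNA[!is.na(vecNA)])    # drop the missing values by hand first
sd(na.omit(vecNA))            # na.omit() removes missing values
sum(vecNA, na.rm = TRUE)      # many functions accept na.rm = TRUE
```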
ageNA <- c(18, NA, 27, 22)
DV1 <- c(NA, 1, 5, -3)
DV2 <- c(9, 4, 2, 7)
(matNA <- cbind(ageNA, DV1, DV2))
ageNA DV1 DV2
[1,] 18 NA 9
[2,] NA 1 4
[3,] 27 5 2
[4,] 22 -3 7
[1] NA NA 11.333333 8.666667
[1] 13.500000 2.500000 11.333333 8.666667
[1] TRUE TRUE FALSE FALSE
ageNA DV1 DV2
[1,] 27 5 2
[2,] 22 -3 7
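The row means and the reduced matrix above can be obtained along these lines. The calls are an assumption that matches the printed values; rowNAidx is a hypothetical name.

```r
ageNA <- c(18, NA, 27, 22)
DV1   <- c(NA, 1, 5, -3)
DV2   <- c(9, 4, 2, 7)
matNA <- cbind(ageNA, DV1, DV2)

apply(matNA, 1, mean)                 # rows containing NA give NA
apply(matNA, 1, mean, na.rm = TRUE)   # mean of the available values per row

rowNAidx <- apply(is.na(matNA), 1, any)   # hypothetical name: rows with missing data
rowNAidx
matNA[!rowNAidx, ]                    # keep only the complete rows
```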
ageNA DV1 DV2
[1,] 27 5 2
[2,] 22 -3 7
attr(,"na.action")
[1] 2 1
attr(,"class")
[1] "omit"
ageNA DV1 DV2
24.5 1.0 4.5
ageNA DV1 DV2
ageNA 12.5 20 -12.5
DV1 20.0 32 -20.0
DV2 -12.5 -20 12.5
[1] TRUE
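The matrix carrying an "na.action" attribute, the column means, and the covariance matrix above correspond to casewise deletion. A sketch of calls that would give these results, with matComplete as a hypothetical name:

```r
ageNA <- c(18, NA, 27, 22)
DV1   <- c(NA, 1, 5, -3)
DV2   <- c(9, 4, 2, 7)
matNA <- cbind(ageNA, DV1, DV2)

matComplete <- na.omit(matNA)       # hypothetical name: rows without any NA
matComplete                         # removed rows are recorded in attr "na.action"
colMeans(matComplete)
cov(matNA, use = "complete.obs")    # casewise deletion inside cov()

# both routes lead to the same covariance matrix
all.equal(cov(matNA, use = "complete.obs"), cov(matComplete))
```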
Set casewise deletion as a permanent option for statistical functions that consult the na.action setting, such as lm() (another choice is "na.fail").
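A sketch of how the option might be set and what it changes, assuming a small regression on the variables defined earlier. dfNA, fit, and res are hypothetical names. Functions such as mean() and cov() ignore this option and rely on their own na.rm or use arguments.

```r
dfNA <- data.frame(ageNA = c(18, NA, 27, 22),
                   DV1   = c(NA, 1, 5, -3),
                   DV2   = c(9, 4, 2, 7))

old <- options(na.action = "na.omit")   # casewise deletion; old value is saved
getOption("na.action")
fit <- lm(DV2 ~ ageNA, data = dfNA)     # the row with missing ageNA is dropped
nobs(fit)                               # 3

options(na.action = "na.fail")          # stop instead of silently dropping rows
res <- try(lm(DV2 ~ ageNA, data = dfNA), silent = TRUE)
inherits(res, "try-error")              # TRUE

options(old)                            # restore the previous setting
```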
[1] NA NA 11.333333 8.666667
[1] 26.5 23.0
ageNA DV1 DV2
ageNA 20.33333 20 -16.000000
DV1 20.00000 16 -10.000000
DV2 -16.00000 -10 9.666667
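The last covariance matrix above matches what pairwise deletion gives for matNA: each entry uses all rows in which both variables are observed. A sketch, assuming the use argument of cov() is what produced it; covPair is a hypothetical name.

```r
ageNA <- c(18, NA, 27, 22)
DV1   <- c(NA, 1, 5, -3)
DV2   <- c(9, 4, 2, 7)
matNA <- cbind(ageNA, DV1, DV2)

# each covariance is based on the rows complete for that pair of columns
covPair <- cov(matNA, use = "pairwise.complete.obs")   # hypothetical name
covPair
```

Because each entry can rest on a different subset of rows, a matrix obtained this way is not guaranteed to be positive semi-definite.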
Multiple imputation is supported by functions in packages mice and Amelia.
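As a rough illustration of the mice workflow, the sketch below imputes the nhanes example data that ships with that package. The data set, the model bmi ~ age, and the names imp, completed, fit, and pooled are choices made for this sketch only; the code is skipped when mice is not installed.

```r
completed <- NULL
if (requireNamespace("mice", quietly = TRUE)) {
    # create 5 imputed versions of the example data
    imp <- mice::mice(mice::nhanes, m = 5, printFlag = FALSE, seed = 123)

    completed <- mice::complete(imp, 1)   # first completed data set
    print(anyNA(completed))

    # fit the same model in every imputed data set, then pool the results
    fit    <- with(imp, lm(bmi ~ age))
    pooled <- mice::pool(fit)
    print(summary(pooled))
}
```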