Data import and export

Install required packages

haven, readxl, writexl, RSQLite

Read and write data in text format

Read and write text data from a local file

If reading from a file, check with getwd() that you are in the correct directory - or specify full file path. If the result should be a data frame, use read.table().

Some relevant options for read.table()

Some relevant options for write.table()

To read comma-separated-value files, use read.csv(), for fixed-with-files read.fwf(). To read a raw (unstructured) text file, use readLines(). The result is a character vector with one element per line.

Read text data from other sources

The described R functions can also read data from a character string.

    X     Y      Z
1 123 A B C   34.8
2  99 D E F 3543.2

Other sources can be standard-input (if R is used in batch mode via Rscript), the clipboard, or directly from an online address.

Read and write data in R binary format

Save multiple objects together with their respective symbol (=name). Object names do not need to be assigned when reading data.

Save objects separately without their symbol (=name) -> preferred. Object name needs to be assigned when reading data.

Use R for data entry

Read data from the console with scan(). Lines are terminated by hitting the Return key, scan() quits when Return is hit on a blank line.

Save R data to file

Create a logfile for commands and output

Save and read R objects in text format

Exchange data with other statistics software and spreadsheets

Exchange data with SPSS, SAS and Stata

One option is to use text files (tab-separated or comma-separated) as described above to exchange data with other statistics software packages.

Another option to exchange data with SPSS, SAS and Stata (among others) is the haven package. Example for SPSS:

SPSS has the concept of “variable labels” and “value labels” which do not directly map to R concepts. Variables read from SPSS data files therefore have special labels and are of class haven_labelled. Categorical variables can be converted to regular R factors using as_factor().

When writing data to an SPSS file, labels can be set manually, and the class can be changed to haven_labelled.

Use the SPSS Integration Plug-in for R

If you have SPSS available, install the “Integration plug-in for R” add-in. This allows you to run R within SPSS. The plug-in includes an R package with functions that transfer the active SPSS data frame to R (and back) - including labeled factor levels, dates and German umlauts. Once installed, you can use it like this in the SPSS syntax window:

BEGIN PROGRAM R.
# from here on, you can use R syntax
myDf <- spssdata.GetDataFromSPSS(missingValueToNA=TRUE,
                                 factorMode="labels",
                                 rDate="POSIXct")
save(myDf, file="d:/path/to/your/myDf.Rdata")
END PROGRAM.

Exchange data with Excel

One option is to use text files (tab-separated or comma-separated) as described above to exchange data with spreadsheet applications.

Read and write data from a database

There are R packages that provide an interface to all common database types. Using databases is described in more detail in:

  • Adler, J. (2012). R in a Nutshell (2nd ed.). Sebastopol, CA: O’Reilly.
  • Spector, P. (2008). Data Manipulation with R. New York, NY: Springer.

SQL commands are described here and here.

Using the DBI interface with RSQLite

Simulate data

Save data frame in SQLite database. This is usually a file. In this example, the file is created in memory only. Use dbConnect(<driver object>, dbname="file_name.db") to create a file on disk.

Find out which tables are present, and which fields are in a specific table.

[1] "MyDataFrame"
[1] "sex"    "IQ"     "rating"

Read complete table, then send SQL-query.

  sex       IQ rating
1   f 81.58281      A
2   m 87.37380      B
3   f 86.16037      B
4   m 61.90655      C
  sex       mIQ      sIQ
1   f 103.96425 5198.212
2   m  97.69169 4884.584

Query database and read results in smaller partial chunks. Useful for large queries.

         IQ rating
1  81.58281      A
2 121.76113      A
3 101.29880      A
4 107.66656      A
         IQ rating
1 111.76580      A
2  81.58281      A
3 121.76113      A
4 122.56658      A
         IQ rating
1  96.41313      A
2  89.68006      A
3  99.58071      A
4 106.50953      A
         IQ rating
1  86.16037      A
2  61.90655      A
3 107.66656      A
4 106.50953      A
         IQ rating
1  87.37380      A
2  82.94033      A
3 106.09860      A
4 122.56658      A
        IQ rating
1 114.2887      A
2 121.7611      A
3 121.5551      A
4 106.0986      A
         IQ rating
1  89.68006      A
2 111.76580      A
3 114.28868      A

Clean query, remove the created table, and close the database connection.

Useful resources

Detach (automatically) loaded packages (if possible)

Get the article source from GitHub

R markdown - markdown - R code - all posts