The idea behind this package is to maintain a ‘data’ directory for a BABS project in the same way that an R package maintains a ‘data’ directory.

The package creates a subdirectory within the project, and a ‘data’ directory inside that subdirectory. As with R packages, objects saved into that ‘data’ directory are accessible via the data() function.

By default the data repository is created in the project directory, which is defined as the first 10 directories of the path to the current working directory (this is the standard path to a BABS project). Within the project directory, a ‘data-repository’ directory is created where documents and other files can be saved. Inside that, a ‘data’ directory holds the saved R objects.
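As a rough illustration of the default described above, the project directory can be derived by keeping the first 10 components of the working directory. This is a minimal base-R sketch, not the package's own implementation: project_root() is a hypothetical helper, it assumes POSIX-style paths, and it assumes the root "/" is not counted as one of the 10 directories.

```r
# Hypothetical helper (not part of datarepository): keep the first `depth`
# directory components of a POSIX-style path.
project_root <- function(path = getwd(), depth = 10) {
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]
  parts <- parts[nzchar(parts)]  # drop the empty string before the leading "/"
  paste0("/", paste(utils::head(parts, depth), collapse = "/"))
}

# Default locations under that root, following the layout described above
root <- project_root()
repository <- file.path(root, "data-repository")
data_dir <- file.path(repository, "data")
```

If the working directory is fewer than 10 levels deep, this sketch simply returns the working directory itself. How the package handles that case is not stated here.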

The default paths can be overridden by setting options().

Typical usage

From any location within a BABS project, data can be saved and loaded. First, the environment is initialised by loading the package and defining two test variables.

library(datarepository)

foo <- bar <- data.frame(v=letters[1:4])

Saving data

Now the foo variable will be saved to the (default) data repository at /project/path/data-repository/data.

save(foo)

The behaviour of base::save() is still available by passing a file argument to the function.
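The dispatch between the repository and base::save() could look roughly like the sketch below. It is an assumption-laden illustration rather than the package's code: save_object() is a hypothetical name, the one-file-per-object layout and the ".rda" extension are guesses, and a temporary directory stands in for the real repository.

```r
# Hypothetical wrapper (not the package's save()): write each object to
# <repo>/<name>.rda, or defer to base::save() when `file` is supplied.
save_object <- function(..., file = NULL, repo, envir = parent.frame()) {
  force(envir)  # fix the caller's environment before any nested calls
  names <- vapply(as.list(substitute(list(...)))[-1L], deparse, character(1))
  if (!is.null(file)) {
    base::save(list = names, file = file, envir = envir)
    return(invisible(file))
  }
  dir.create(repo, recursive = TRUE, showWarnings = FALSE)
  paths <- file.path(repo, paste0(names, ".rda"))
  for (i in seq_along(names)) {
    base::save(list = names[i], file = paths[i], envir = envir)
  }
  invisible(paths)
}

foo <- data.frame(v = letters[1:4])
save_repo <- file.path(tempfile("datarepo-"), "data-repository", "data")
save_object(foo, repo = save_repo)                   # goes to the repository
save_object(foo, file = tempfile(fileext = ".rda"))  # plain base::save()
```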

A table can be written to the same /project/path/data-repository/data path using write(). Depending on the format argument, a csv, tsv or xlsx file is written using readr::write_csv(), readr::write_tsv() or openxlsx::write.xlsx(), respectively.

write(foo, bar, format='csv')
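Choosing a writer from the format argument can be sketched as follows. To keep the example free of dependencies it swaps in base utils::write.table() for the readr functions and leaves out xlsx, so it is only an approximation of the behaviour described above. write_table() and its arguments are hypothetical names, and a temporary directory stands in for the repository.

```r
# Hypothetical helper: write a data frame to <repo>/<name>.<format>.
# Base R is used here in place of readr::write_csv()/readr::write_tsv().
write_table <- function(x, name, format = c("csv", "tsv"), repo) {
  format <- match.arg(format)
  dir.create(repo, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(repo, paste0(name, ".", format))
  sep <- if (format == "csv") "," else "\t"
  utils::write.table(x, file = path, sep = sep, row.names = FALSE)
  invisible(path)
}

tab <- data.frame(v = letters[1:4])
write_repo <- file.path(tempfile("datarepo-"), "data-repository", "data")
csv_path <- write_table(tab, "foo", format = "csv", repo = write_repo)
tsv_path <- write_table(tab, "foo", format = "tsv", repo = write_repo)
```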

Loading data

The object can be loaded back into the environment using the data function. This function can also load a package’s data in the expected way. By default, the function looks for the specified object(s) in /project/path/data-repository/data as well as the data subdirectory of each loaded package.

If the object was saved as a table, the file is parsed automatically.

data(foo)
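One way to picture the lookup is a function that tries a few file extensions in the repository and parses whatever it finds. The sketch below is a guess at that logic, not the package's data() function: load_object() is a hypothetical name, the extensions and their priority order are assumptions, and the search through the data directories of loaded packages is omitted.

```r
# Hypothetical loader: find <repo>/<name>.{rda,csv,tsv} and bring the object
# into `envir`, parsing tables with base R readers.
load_object <- function(name, repo, envir = parent.frame()) {
  force(envir)
  candidates <- file.path(repo, paste0(name, c(".rda", ".csv", ".tsv")))
  hit <- candidates[file.exists(candidates)][1]
  if (is.na(hit)) stop("no data set called '", name, "' in ", repo)
  ext <- tools::file_ext(hit)
  if (ext == "rda") {
    load(hit, envir = envir)
  } else {
    sep <- if (ext == "csv") "," else "\t"
    tab <- utils::read.table(hit, sep = sep, header = TRUE, stringsAsFactors = FALSE)
    assign(name, tab, envir = envir)
  }
  invisible(name)
}

load_repo <- file.path(tempfile("datarepo-"), "data-repository", "data")
dir.create(load_repo, recursive = TRUE)
utils::write.csv(data.frame(v = letters[1:4]), file.path(load_repo, "foo.csv"), row.names = FALSE)
loaded <- new.env()
load_object("foo", repo = load_repo, envir = loaded)
```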

A specific file in the data repository can be found using system.file(). The paths of any requested files that exist in the repository are returned.

system.file('raw/foo.tsv', package='datarepository')
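The file lookup amounts to filtering candidate paths by existence. The sketch below shows that idea in base R. find_in_repo() is a hypothetical name, and returning an empty string when nothing matches mirrors base::system.file() rather than anything confirmed about this package.

```r
# Hypothetical lookup: return the paths under `repo` that exist, or "" if
# none do (the same convention as base::system.file()).
find_in_repo <- function(..., repo) {
  paths <- file.path(repo, c(...))
  found <- paths[file.exists(paths)]
  if (length(found)) found else ""
}

find_repo <- file.path(tempfile("datarepo-"), "data-repository")
dir.create(file.path(find_repo, "raw"), recursive = TRUE)
file.create(file.path(find_repo, "raw", "foo.tsv"))
find_in_repo("raw/foo.tsv", "raw/missing.tsv", repo = find_repo)
```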

Path customisation

There are two variables that can be customised: the path to the project and the path within the project to the data repository. Both can be defined in the options() list. datarepo.path can be thought of as a lib.loc and datarepo.subdir as a library.

options(datarepo.path='~/scratch/a_test_project')
options(datarepo.subdir='a-different-data-repository/and-subdirectory')
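How the two options combine into a repository path can be sketched with getOption(). This is an illustration only: repo_path() is a hypothetical helper, and for simplicity it falls back to the working directory rather than the project directory when datarepo.path is unset.

```r
# Hypothetical resolver: combine the two options into the repository path,
# with simplified fallbacks when they are unset.
repo_path <- function() {
  path <- getOption("datarepo.path", default = getwd())
  subdir <- getOption("datarepo.subdir", default = "data-repository")
  file.path(path.expand(path), subdir)
}

old <- options(datarepo.path = "/tmp/a_test_project",
               datarepo.subdir = "a-different-data-repository/and-subdirectory")
custom_repo <- repo_path()
options(old)  # restore the previous settings
```

Keeping the previous values returned by options() and restoring them afterwards avoids leaking the customisation into the rest of the session.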

How to use `data repository`

Christopher Barrington

2021-07-22
