Organizing Custom File Structures in R Packages for Efficient Project Management


Introduction

As R packages grow in size, managing their structure becomes increasingly important. While the traditional R directory layout is straightforward, some projects need a more customized way to organize files and directories. In this article, we explore how to use custom file and directory structures in the pkg/R and pkg/src folders of an R package.

The Traditional R Package Directory Layout

Before diving into custom layouts, let’s review the traditional R package directory structure:

pkg/
├── DESCRIPTION
├── NAMESPACE
├── R
│   └── main.R
├── inst
│   └── doc
└── src
    └── alg1.cpp

The R directory holds the package's R code and the src directory houses C/C++/Fortran code, while the contents of inst are copied into the installed package (inst/doc is a common place for documentation). The traditional layout is simple but limited in its ability to accommodate complex projects.

Using Nested Subfolders in pkg/src

As described in the ‘Writing R Extensions’ manual, a Makevars file under pkg/src lets you build C/C++/Fortran code that lives in nested subfolders. This makes it possible to organize your project’s compiled sources more flexibly:

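pkg/
├── DESCRIPTION
├── R
│   └── main.R
├── data
│   └── dataset.R
├── inst
│   └── doc
└── src
    ├── algorithms
    │   └── algo1.cpp
    └── Makevars

In this example, the algorithms subfolder contains the C++ code for algorithm 1, and Makevars tells R how to build it. R files are not read from src, so main.R sits in pkg/R and the R data file dataset.R sits in the top-level data directory.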
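A minimal sketch of how such a Makevars could be produced, assuming the layout above with a single C++ file under src/algorithms. OBJECTS is the make variable R reads for the list of object files to link; the scratch directory and the idea of generating the list with find are illustrative choices, not something the manual prescribes.

```shell
#!/bin/sh
set -eu

# Scratch copy of the layout (hypothetical paths, for illustration only)
pkg="$(mktemp -d)/pkg"
mkdir -p "$pkg/src/algorithms"
touch "$pkg/src/algorithms/algo1.cpp"

# Collect every .cpp file under src and turn each path into its .o counterpart
cd "$pkg/src"
objects=$(find . -name '*.cpp' | sed -e 's|^\./||' -e 's|\.cpp$|.o|' | sort | tr '\n' ' ')

# Write Makevars so the nested sources are part of the build
printf 'OBJECTS = %s\n' "$objects" > Makevars
cat Makevars
```

Listing the object files explicitly keeps the Makevars free of GNU make extensions such as $(wildcard ...), at the cost of regenerating the file whenever a source file is added.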

Creating a Custom Directory Structure in pkg/R

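Unfortunately, R only collects code from files directly under pkg/R (apart from the OS-specific unix and windows subfolders), so nested folders there are ignored at install time. One workaround is to keep the files in nested folders and symlink each of them into the top level of pkg/R. In the layout below, typeA/file1.R and typeB/file2.R are illustrative names:

pkg/
├── DESCRIPTION
├── R
│   ├── typeA
│   │   └── file1.R
│   ├── typeB
│   │   └── file2.R
│   └── main.R
├── data
│   └── dataset.R
├── inst
│   └── doc
└── src
    ├── algorithms
    │   └── algo1.cpp
    └── Makevars

Under pkg/R, create one symlink per nested file (sudo is not needed for files you own):

# Link each nested file into the top level of pkg/R
cd pkg/R
ln -s typeA/file1.R typeA_file1.R
ln -s typeB/file2.R typeB_file2.R

The links make the nested files visible to R under the flat names typeA_file1.R and typeB_file2.R, while you keep editing them in typeA/ and typeB/. Symlinks are not portable (notably on Windows), and ‘Writing R Extensions’ advises against file links in package sources, so treat this as a development-time layout.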
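With more than a handful of files, the links can be created in a loop. The following is a sketch only, assuming nested folders named typeA and typeB under pkg/R; the file names, the file contents, and the folder_file.R naming scheme are hypothetical.

```shell
#!/bin/sh
set -eu

# Scratch package with two nested R files (hypothetical names and contents)
pkg="$(mktemp -d)/pkg"
mkdir -p "$pkg/R/typeA" "$pkg/R/typeB"
echo 'file1 <- function(x) x + 1' > "$pkg/R/typeA/file1.R"
echo 'file2 <- function(x) x * 2' > "$pkg/R/typeB/file2.R"

cd "$pkg/R"
for f in */*.R; do
  # typeA/file1.R is exposed as typeA_file1.R at the top level of R/
  flat=$(printf '%s' "$f" | tr '/' '_')
  # The link target is relative to R/, so the package directory stays relocatable
  ln -s "$f" "$flat"
done
ls -l
```

Prefixing each link with its folder name avoids clashes when two folders contain files with the same name.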

Loading and Unloading a Package with a Custom Directory Structure

To load and unload a package with a custom directory structure during development, create a script (e.g., load_package.R) that uses load_all() and unload() from the devtools package:

library(devtools)

# Load the package from its source directory
# (the path is relative to the current working directory)
load_all("pkg")

# Unload the package by name
unload("myPackage")

You can also extend this script to launch unit tests on your package, for example with devtools::test() for R tests written with testthat.

Exporting the Package to be CRAN-Compliant

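CRAN builds and checks packages with R CMD build and R CMD check, so pkg/src needs a Makevars file that names the object files in nested subfolders. No generator is required; the file can be written by hand:

# pkg/src/Makevars
# List every object file explicitly, including those in subfolders
OBJECTS = algorithms/algo1.o

With this setting, R compiles algorithms/algo1.cpp into the package's shared library. Once OBJECTS is set, any source files directly under src must be listed as well. Symlinks under pkg/R should be replaced by regular files before submission; according to ‘Writing R Extensions’, R CMD build replaces file links with copies where possible.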
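One way to prepare a copy for submission is to flatten the R directory, since nested folders and symbolic links under pkg/R are not portable. The sketch below assumes a development layout in which a nested file is linked into the top level of R/; the export directory and all file names are hypothetical.

```shell
#!/bin/sh
set -eu

# Scratch development layout: one nested file plus its top-level symlink
work="$(mktemp -d)"
pkg="$work/pkg"
mkdir -p "$pkg/R/typeA"
echo 'file1 <- function(x) x + 1' > "$pkg/R/typeA/file1.R"
( cd "$pkg/R" && ln -s typeA/file1.R typeA_file1.R )

# Copy the package while following symlinks, so each link becomes a regular file
export_dir="$work/export"
mkdir -p "$export_dir"
cp -RL "$pkg" "$export_dir/pkg"

# Remove the nested folders; only the flat copies remain under R/
for d in "$export_dir/pkg/R"/*/; do
  rm -rf "$d"
done
ls "$export_dir/pkg/R"
```

The development tree is left untouched, so the nested folders remain the place where files are edited and the flattened copy can be regenerated at any time.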

Example Use Case

Suppose you have an R package called myPackage that contains several algorithms, data files, and main functions. You can use the custom directory structure to organize these files as follows (typeA/file1.R is an illustrative name):

pkg/
├── DESCRIPTION
├── R
│   ├── typeA
│   │   └── file1.R
│   └── main.R
├── data
│   └── dataset.R
├── inst
│   └── doc
└── src
    ├── algorithms
    │   └── algo1.cpp
    └── Makevars

Under pkg/R, link the nested R file into the top level:

# Expose the nested R file under a flat name
cd pkg/R
ln -s typeA/file1.R typeA_file1.R

Load and unload the package using the script above. When exporting the package, make sure pkg/src/Makevars includes the following setting:

OBJECTS = algorithms/algo1.o

With this in place, R CMD build and R CMD INSTALL compile the nested C++ source and pick up the linked R file.

Conclusion

Using custom directory structures in pkg/R and pkg/src folders enables you to organize your R project’s files more efficiently. By leveraging nested subfolders and symlinks, you can create a customized layout that meets your project’s specific needs.


Last modified on 2024-03-06