Portable paths for sharing projects

library(fromhere)

Using appropriate relative file paths is essential for creating portable and reproducible projects. Portable projects are self-contained, meaning all code, data and other files needed by your project are contained within it.

Relative paths play a central role in portability, allowing you to specify file locations relative to a known reference point (e.g., the project root). Compared to system-specific absolute paths, relative file paths work on any system, allowing others to reproduce your work without modifying file paths.

This vignette demonstrates how relative paths from different project roots can be created using the fromhere package, with discussion on how each path type affects the portability of your project.

Project portability

How you organise your project files and specify paths to them changes the portability of your project. Less portable file paths require more specific file structures. For example, in increasing portability:

The working directory can differ depending on context. In general,

Portability example

The files (and paths) used within each part of your project affect its portability.

Consider the R project "my-awesome-project", which has been saved to the Documents folder on John’s computer ("/Users/John/Documents/"). It has the following contents:

my-awesome-project/
├── data-raw/                   # Raw / unprocessed data files
├── data/                       # Clean / processed data files
│   ├── survey_cleaned_2024.csv
│   └── population_summary.csv
├── R/                          # R scripts
├── docs/                       # Documents
│   ├── analysis.qmd
│   ├── memo.qmd
│   └── resources/              # Document resources
│       ├── logo.png
│       └── signature.png
├── outputs/                    # Results, figures, tables, and other outputs
│   ├── figures/                # Graphs and charts
│   └── tables/                 # Data tables and results
│       └── regression_summary.csv
├── README.md                   # Project description and instructions
└── my-awesome-project.Rproj    # R Project file

Consider the two documents in this project, "docs/analysis.qmd" and "docs/memo.qmd":

The memo.qmd document is more portable than analysis.qmd since it can be moved independently of the broader project. It is good practice to organise your project such that individual components are portable. In this case the logo and signature are kept within the docs folder because they are only needed for documents. While less portable than the memo, the analysis is still reasonably portable since everything it needs is contained in the project folder (rather than in files scattered around the computer).

By adopting relative paths and understanding project portability, you can create projects that are robust, shareable, and adaptable to new environments. The fromhere package simplifies this process, allowing you to declare paths relative to various project roots, enhancing the portability of your work.

Absolute vs. Relative Paths

Absolute paths (not portable)

An absolute path includes the full location of the files on the computer, so reading the survey data would be:

# Path hardcoded to John's specific system
read.csv("/Users/John/Documents/my-awesome-project/data/survey_cleaned_2024.csv")

This is NOT portable. Imagine this project was shared with Jane, who saved it in the Downloads folder of her Windows computer. The file path starting with "/Users/John/Documents" doesn’t exist on Jane’s computer. Instead, the suitable path for Jane is:

# Path hardcoded to Jane's specific system
# (backslashes must be doubled inside R strings)
read.csv("C:\\Users\\Jane\\Downloads\\my-awesome-project\\data\\survey_cleaned_2024.csv")

As a result, the code cannot be reproduced without modification. Jane will have to update all the paths for them to work on her computer.

Relative paths (portable)

A relative path specifies the location of files from a starting folder. In an R session, relative paths start from the current working directory. When you open an RStudio Project, the working directory is automatically set to your project folder.

The code for reading the survey data with a relative path is:

# Path relative to the working directory
read.csv("data/survey_cleaned_2024.csv")

Notice that the relative path doesn’t include any details about the location of the project on the computer. This allows the project folder to be moved around (on the same computer or shared with others) without needing to change the file paths.
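The dependence on the working directory can be checked with a small sketch in base R. It assumes a throwaway copy of the project layout created under tempdir(); the folder and file contents are illustrative stand-ins, not real survey data.

```r
# Build a throwaway project in a temporary folder (illustrative layout only).
project <- file.path(tempdir(), "my-awesome-project")
dir.create(file.path(project, "data"), recursive = TRUE, showWarnings = FALSE)
write.csv(
  data.frame(id = 1:3),
  file.path(project, "data", "survey_cleaned_2024.csv"),
  row.names = FALSE
)

# Relative paths are resolved against the current working directory.
old_wd <- setwd(project)
found_from_root <- file.exists("data/survey_cleaned_2024.csv")

# From a different working directory the same relative path no longer resolves.
setwd(file.path(project, "data"))
found_from_data <- file.exists("data/survey_cleaned_2024.csv")

setwd(old_wd)  # restore the original working directory
```

Here found_from_root is TRUE and found_from_data is FALSE: the path itself never changed, only the folder it was resolved from.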

The fromhere package allows you to specify the starting point of relative paths explicitly. This is useful because the paths will work even if the working directory in R changes. For example, the working directory is changed to the document’s folder ("docs") when rendering R Markdown or Quarto documents. Using fromhere::from_rproj() explicitly specifies paths starting from the project folder, even in R Markdown or Quarto documents:

# Path relative to the project folder
library(fromhere)
read.csv(from_rproj("data/survey_cleaned_2024.csv"))
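To illustrate what "starting from the project folder" means, the following base R sketch locates a project root by walking up from the working directory until it finds an .Rproj file. The helper find_rproj_root() is hypothetical and written only for this illustration; it is not the fromhere implementation, which may work differently.

```r
# Hypothetical helper (NOT part of fromhere): walk up from `start` until a
# folder containing an .Rproj file is found.
find_rproj_root <- function(start = getwd()) {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (length(list.files(dir, pattern = "\\.Rproj$")) > 0) return(dir)
    parent <- dirname(dir)
    if (identical(parent, dir)) stop("No .Rproj file found above ", start)
    dir <- parent
  }
}

# Throwaway project in a temporary folder (illustrative layout only).
project <- file.path(tempdir(), "my-awesome-project")
dir.create(file.path(project, "docs"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(project, "my-awesome-project.Rproj"))

# Simulate rendering a document: the working directory becomes docs/.
old_wd <- setwd(file.path(project, "docs"))
root <- find_rproj_root()
survey_path <- file.path(root, "data", "survey_cleaned_2024.csv")
setwd(old_wd)  # restore the original working directory
```

Even though the working directory was "docs", survey_path points at the data folder under the project root, which is the behaviour the paragraph above describes for paths anchored to the project.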

Aside: . and ..

File paths can also include . and .., which refer to the current directory (.) and the parent directory (..).

This allows you to navigate up folders. For example, from_rproj("..") will give you the folder above the project ("/Users/John/Documents" for John, and "C:\Users\Jane\Downloads" for Jane).

Using from_rproj("..") results in a less portable project, since your work now depends on the file structure above your project folder.
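The meaning of . and .. can be seen with base R alone. This sketch assumes a throwaway project folder under tempdir() and uses normalizePath() to resolve the special components; it does not use fromhere.

```r
# Throwaway project folder (illustrative only).
project <- file.path(tempdir(), "my-awesome-project")
dir.create(project, recursive = TRUE, showWarnings = FALSE)

# "." refers to the folder itself, ".." to the folder above it.
same_folder   <- normalizePath(file.path(project, "."),  winslash = "/")
parent_folder <- normalizePath(file.path(project, ".."), winslash = "/")

# Reference values for comparison.
project_resolved <- normalizePath(project, winslash = "/")
```

same_folder resolves to the project folder, while parent_folder resolves to whatever happens to sit above it on this particular machine, which is why paths that rely on .. are less portable.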