---
title: "z - Advanced topic: Reproducible Analytical Pipelines with Nix"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{z-advanced-topic-reproducible-analytical-pipelines-with-nix}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

Isolated environments are well suited to running pipelines safely and reproducibly. This vignette details how to build a reproducible analytical pipeline using an environment built with Nix that contains the right versions of R and packages.

## An example of a reproducible analytical pipeline using Nix

Suppose that you've used `{targets}` to build a pipeline for a project and that you did so using a tailor-made Nix environment. Here is the call to `rix()` that you could have used to build that environment:

```{r, eval = FALSE}
path_default_nix <- tempdir()

rix(
  r_ver = "4.2.2",
  r_pkgs = c("targets", "tarchetypes", "rmarkdown"),
  system_pkgs = NULL,
  git_pkgs = list(
    package_name = "housing",
    repo_url = "https://github.com/rap4all/housing/",
    commit = "1c860959310b80e67c41f7bbdc3e84cef00df18e"
  ),
  ide = "other",
  project_path = path_default_nix,
  overwrite = TRUE
)
```

This call to `rix()` generates the following `default.nix` file:

```
let
  pkgs = import (fetchTarball "https://github.com/rstats-on-nix/nixpkgs/archive/2023-02-13.tar.gz") {};

  rpkgs = builtins.attrValues {
    inherit (pkgs.rPackages)
      rmarkdown
      tarchetypes
      targets;
  };

  housing = (pkgs.rPackages.buildRPackage {
    name = "housing";
    src = pkgs.fetchgit {
      url = "https://github.com/rap4all/housing/";
      rev = "1c860959310b80e67c41f7bbdc3e84cef00df18e";
      sha256 = "sha256-s4KGtfKQ7hL0sfDhGb4BpBpspfefBN6hf+XlslqyEn4=";
    };
    propagatedBuildInputs = builtins.attrValues {
      inherit (pkgs.rPackages)
        dplyr
        ggplot2
        janitor
        purrr
        readxl
        rlang
        rvest
        stringr
        tidyr;
    };
  });

  system_packages = builtins.attrValues {
    inherit (pkgs)
      glibcLocales
      nix
      R;
  };

in

pkgs.mkShell {
  LOCALE_ARCHIVE = if pkgs.system == "x86_64-linux" then "${pkgs.glibcLocales}/lib/locale/locale-archive" else "";
  LANG = "en_US.UTF-8";
  LC_ALL = "en_US.UTF-8";
  LC_TIME = "en_US.UTF-8";
  LC_MONETARY = "en_US.UTF-8";
  LC_PAPER = "en_US.UTF-8";
  LC_MEASUREMENT = "en_US.UTF-8";

  buildInputs = [ housing rpkgs system_packages ];
}
```

The environment that gets built from this `default.nix` file contains R version 4.2.2, the `{targets}`, `{tarchetypes}` and `{rmarkdown}` packages, as well as the `{housing}` package, which is hosted only on GitHub and provides some data and useful functions for the project. Because it is on GitHub, it gets installed using the `buildRPackage` function from Nix. You can use this environment to work on your project, or to launch a `{targets}` pipeline. [This GitHub repository](https://github.com/b-rodrigues/nix_targets_pipeline/tree/master) contains the finalized project.

On your local machine, you could execute the pipeline in the environment by running this in a terminal:

```
cd /absolute/path/to/housing/ && nix-shell default.nix --run "Rscript -e 'targets::tar_make()'"
```

If you wish to run the pipeline whenever you drop into the Nix shell, you could add a *shell hook* to the generated `default.nix` file by passing the `shell_hook` argument to `rix()`:

```{r, eval = FALSE}
path_default_nix <- tempdir()

rix(
  r_ver = "4.2.2",
  r_pkgs = c("targets", "tarchetypes", "rmarkdown"),
  system_pkgs = NULL,
  git_pkgs = list(
    package_name = "housing",
    repo_url = "https://github.com/rap4all/housing/",
    commit = "1c860959310b80e67c41f7bbdc3e84cef00df18e"
  ),
  ide = "other",
  shell_hook = "Rscript -e 'targets::tar_make()'",
  project_path = path_default_nix,
  overwrite = TRUE
)
```

Now, each time you drop into the Nix shell for that project using `nix-shell`, the pipeline gets executed automatically.

`{rix}` also features a function called `tar_nix_ga()` that adds a GitHub Actions workflow file to make the pipeline run automatically on GitHub Actions.
The GitHub repository linked above has such a file, so each time changes get pushed, the pipeline runs on GitHub Actions and the results are automatically pushed to a branch called `targets-runs`. See the workflow file [here](https://github.com/b-rodrigues/nix_targets_pipeline/blob/master/.github/workflows/run-pipeline.yaml). This feature is heavily inspired by, and adapted from, the `targets::tar_github_actions()` function.
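
As a minimal sketch of how such a workflow file could be generated, the chunk below calls `tar_nix_ga()` from the root of a project. It rests on two assumptions that are not stated above: that `tar_nix_ga()` is called without arguments, and that it writes the file relative to the current working directory, at the same `.github/workflows/run-pipeline.yaml` path used in the linked repository. The `project_dir` path is hypothetical and only serves as an illustration.

```{r, eval = FALSE}
library(rix)

# Hypothetical project location, used only for illustration;
# replace it with the root of your own project.
project_dir <- file.path(tempdir(), "housing_pipeline")
dir.create(project_dir, recursive = TRUE, showWarnings = FALSE)

# Assumption: tar_nix_ga() writes relative to the working directory,
# so move into the project root and restore the old directory afterwards.
old_wd <- setwd(project_dir)
tar_nix_ga()
setwd(old_wd)

# Check whether the workflow file landed at the expected (assumed) path.
workflow_file <- file.path(
  project_dir, ".github", "workflows", "run-pipeline.yaml"
)
file.exists(workflow_file)
```

Once the file is committed and pushed together with `default.nix` and `_targets.R`, GitHub Actions should pick it up on the next push.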