--- title: "From R to CFF" subtitle: "Crosswalk" description: >- A comprehensive description of the internal mappings performed by the cffr package. author: Diego Hernangómez bibliography: REFERENCES.bib link-citations: yes output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{From R to CFF} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(cffr) ``` The goal of this vignette is to provide an explicit map between the metadata fields used by **cffr** and each one of the valid keys of the [Citation File Format schema version 1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys). ## Summary {#summary} We summarize here the fields that **cffr** can coerce and the original source of information for each one of them. The details on each key are presented on the next section of the document. The assessment of fields are based on the [Guide to Citation File Format schema version 1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys) [@druskat_citation_2021]. ```{r summary , echo=FALSE} keys <- cff_schema_keys(sorted = TRUE) origin <- vector(length = length(keys)) origin[keys == "cff-version"] <- "parameter on function" origin[keys == "type"] <- "Fixed value: 'software'" origin[keys == "identifiers"] <- "DESCRIPTION/CITATION files" origin[keys == "references"] <- "DESCRIPTION/CITATION files" origin[keys %in% c( "message", "title", "version", "authors", "abstract", "repository", "repository-code", "url", "date-released", "contact", "keywords", "license", "commit" )] <- "DESCRIPTION file" origin[keys %in% c( "doi", "preferred-citation" )] <- "CITATION file" origin[origin == FALSE] <- "Ignored by cffr" df <- data.frame( key = paste0("", keys, ""), source = origin ) knitr::kable(df, escape = FALSE) ``` ## Details ### abstract This key is extracted from the `"Description"` field of the `DESCRIPTION` file.
Example ```{r abstract} library(cffr) # Create cffr for yaml cff_obj <- cff_create("rmarkdown") # Get DESCRIPTION of rmarkdown to check pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION")) cat(cff_obj$abstract) cat(pkg$get("Description")) ```
[Back to summary](#summary). ### authors This key is coerced from the `"Authors"` or `"Authors@R"` field of the `DESCRIPTION` file. By default persons with the role `"aut"` or `"cre"` are considered, however this can be modified via the `authors_roles` parameter.
Example ```{r authors} # An example DESCRIPTION path <- system.file("examples/DESCRIPTION_many_persons", package = "cffr") pkg <- desc::desc(path) # See persons listed pkg$get_authors() # Default behaviour, use authors and creators (maintainers) cff_obj <- cff_create(path) cff_obj$authors # Use now Copyright holders and maintainers cff_obj_alt <- cff_create(path, authors_roles = c("cre", "cph")) cff_obj_alt$authors ```
[Back to summary](#summary). ### cff-version This key can be set via the parameters of the `cff_create()`/`cff_write()` functions:
Example ```{r cffversion} cff_objv110 <- cff_create("jsonlite", cff_version = "v1.1.0") cat(cff_objv110$`cff-version`) ```
[Back to summary](#summary). ### commit This key is extracted from the `"RemoteSha"` field of the `DESCRIPTION` file. This is the case of packages installed using the [r-universe](https://r-universe.dev/) or packages such as **remotes** or **pak**.
Example ```{r commit} # An example DESCRIPTION path <- system.file("examples/DESCRIPTION_r_universe", package = "cffr") pkg <- desc::desc(path) # See RemoteSha pkg$get("RemoteSha") cff_read(path) ```
[Back to summary](#summary). ### contact This key is coerced from the `"Authors"` or `"Authors@R"` field of the `DESCRIPTION` file. Only persons with the role `"cre"` (i.e, the maintainer(s)) are considered.
Example ```{r contact} cff_obj <- cff_create("rmarkdown") pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION")) cff_obj$contact pkg$get_author() ```
[Back to summary](#summary). ### date-released This key is extracted from the `DESCRIPTION` file following this logic: - `"Date"` field or, - If not present, from `"Date/Publication"`. This is present on packages built on **CRAN** and **Bioconductor**. or, - If not present, from `"Packaged"`, that is present on packages built by the [r-universe](https://r-universe.dev/search/).
Example ```{r date-released} # From an installed package cff_obj <- cff_create("rmarkdown") pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION")) cat(pkg$get("Date/Publication")) cat(cff_obj$`date-released`) # A DESCRIPTION file without a Date nodate <- system.file("examples/DESCRIPTION_basic", package = "cffr") tmp <- tempfile("DESCRIPTION") # Create a temporary file file.copy(nodate, tmp) pkgnodate <- desc::desc(tmp) cffnodate <- cff_create(tmp) # Won't appear cat(cffnodate$`date-released`) pkgnodate # Adding a Date desc::desc_set("Date", "1999-01-01", file = tmp) cat(cff_create(tmp)$`date-released`) ```
[Back to summary](#summary). ### doi {#doi} This key is coerced from the `"doi"` field of the [preferred-citation](#preferred-citation) object. If not present and the package is on **CRAN**, it would be populated with the doi provided by **CRAN** (e.g. ).
Example ```{r doi} cff_doi <- cff_create("cffr") cat(cff_doi$doi) cat(cff_doi$`preferred-citation`$doi) ```
[Back to summary](#summary). ### identifiers This key includes all the possible identifiers of the package: - From the `DESCRIPTION` field, it includes all the urls not included in [url](#url) or [repository-code](#repository-code). - From the `CITATION` file, it includes all the dois not included in [doi](#doi) and the identifiers (if any) not included in the `"identifiers"` key of [preferred-citation](#preferred-citation). - If the package is on **CRAN** and it has a `CITATION` file providing a doi, the doi provided by **CRAN** would be added as well.
Example ```{r identifiers} file <- system.file("examples/DESCRIPTION_many_urls", package = "cffr") pkg <- desc::desc(file) cat(pkg$get_urls()) cat(cff_create(file)$url) cat(cff_create(file)$`repository-code`) cff_create(file)$identifiers ```
[Back to summary](#summary). ### keywords This key is extracted from the `DESCRIPTION` file. The keywords should appear in the `DESCRIPTION` as: ``` ... X-schema.org-keywords: keyword1, keyword2, keyword3 ```
Example ```{r keyword} # A DESCRIPTION file without keywords nokeywords <- system.file("examples/DESCRIPTION_basic", package = "cffr") tmp2 <- tempfile("DESCRIPTION") # Create a temporary file file.copy(nokeywords, tmp2) pkgnokeywords <- desc::desc(tmp2) cffnokeywords <- cff_create(tmp2) # Won't appear cat(cffnokeywords$keywords) pkgnokeywords # Adding Keywords desc::desc_set("X-schema.org-keywords", "keyword1, keyword2, keyword3", file = tmp2 ) cat(cff_create(tmp2)$keywords) ```
Additionally, if the source code of the package is hosted on GitHub, **cffr** can retrieve the topics of your repo via the [GitHub API](https://docs.github.com/en/rest) and include those topics as keywords. This option is controlled via the `gh_keywords` parameter:
Example ```{r ghkeyword} # Get cff object from jsonvalidate jsonval <- cff_create("jsonvalidate") # Keywords are retrieved from the GitHub repo jsonval # Check keywords jsonval$keywords # The repo jsonval$`repository-code` ```
[Back to summary](#summary). ### license This key is extracted from the `"License"` field of the `DESCRIPTION` file.
Example ```{r license} cff_obj <- cff_create("yaml") cat(cff_obj$license) pkg <- desc::desc(file.path(find.package("yaml"), "DESCRIPTION")) cat(pkg$get("License")) ```
[Back to summary](#summary). ### license-url This key is not extracted from the metadata of the package. See the description on the [Guide to CFF schema v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#license-url). > - **description**: The URL of the license text under which the software or > dataset is licensed (only for non-standard licenses not included in the > [SPDX License > List](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#definitionslicense-enum)). > > - **usage**: > > ``` yaml > license-url: "https://obscure-licenses.com?id=1234" > ``` [Back to summary](#summary). ### message This key is extracted from the `DESCRIPTION` field, specifically as: ```{r eval=FALSE} msg <- paste0( 'To cite package "', "NAME_OF_THE_PACKAGE", '" in publications use:' ) ```
Example ```{r message} cat(cff_create("jsonlite")$message) ```
[Back to summary](#summary). ### preferred-citation {#preferred-citation} This key is extracted from the `CITATION` file. If several references are provided, it would select the first citation as the `"preferred-citation"` and the rest of them as [references](#references).
Example ```{r preferred-citation} cffobj <- cff_create("rmarkdown") cffobj$`preferred-citation` citation("rmarkdown")[1] ```
[Back to summary](#summary). ### references {#references} This key is extracted from the `CITATION` file if several references are provided. The first citation is considered as the [preferred-citation](#preferred-citation) and the rest of them as `"references"`. It also extracts the package dependencies and adds those to this fields using `citation(auto = TRUE)` on each dependency.
Example ```{r references} cffobj <- cff_create("rmarkdown") cffobj$references citation("rmarkdown")[-1] ```
[Back to summary](#summary). ### repository This key is extracted from the `"Repository"` field of the `DESCRIPTION` file. Usually, this field is auto-populated when a package is hosted on a repo (like **CRAN** or the [r-universe](https://r-universe.dev/)). For packages without this field on the `DESCRIPTION` (that is the typical case for an in-development package), **cffr** would try to search the package on any of the default repositories specified on `options("repos")`. In the case of [Bioconductor](https://bioconductor.org/) packages, those are identified if a ["biocViews"](https://contributions.bioconductor.org/description.html#biocviews) is present on the `DESCRIPTION` file. If **cffr** detects that the package is available on **CRAN**, it would return the canonical url form of the package (i.e. ).
Example ```{r repository} # Installed package inst <- cff_create("jsonlite") cat(inst$repository) # Demo file downloaded from the r-universe runiv <- system.file("examples/DESCRIPTION_r_universe", package = "cffr") runiv_cff <- cff_create(runiv) cat(runiv_cff$repository) desc::desc(runiv)$get("Repository") # For in development package norepo <- system.file("examples/DESCRIPTION_basic", package = "cffr") # No repo norepo_cff <- cff_create(norepo) cat(norepo_cff[["repository"]]) # Change the name to a known package on CRAN: ggplot2 tmp <- tempfile("DESCRIPTION") file.copy(norepo, tmp) # Change name desc::desc_set("Package", "ggplot2", file = tmp) cat(cff_create(tmp)[["repository"]]) ```
[Back to summary](#summary). ### repository-artifact This key is not extracted from the metadata of the package. See the description on the [Guide to CFF schema v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#repository-artifact). > - **description**: The URL of the work in a build artifact/binary repository > (when the work is software). > > - **usage**: > > ``` yaml > repository-artifact: "https://search.maven.org/artifact/org.corpus-tools/cff-maven-plugin/0.4.0/maven-plugin" > ``` [Back to summary](#summary). ### repository-code {#repository-code} This key is extracted from the `"BugReports"` or `"URL"` fields on the `DESCRIPTION` file. **cffr** tries to identify the url of the source on the following repositories: - [GitHub](https://github.com/). - [GitLab](https://about.gitlab.com/). - [R-Forge](https://r-forge.r-project.org/). - [Bitbucket](https://bitbucket.org/).
Example ```{r repository-code} # Installed package on GitHub cff_create("jsonlite")$`repository-code` # GitLab gitlab <- system.file("examples/DESCRIPTION_gitlab", package = "cffr") cat(cff_create(gitlab)$`repository-code`) # Check desc::desc(gitlab) ```
[Back to summary](#summary). ### title This key is extracted from the `"Description"` field of the `DESCRIPTION` file. ```{r eval=FALSE} title <- paste0( "NAME_OF_THE_PACKAGE", ": ", "TITLE_OF_THE_PACKAGE" ) ```
Example ```{r title} # Installed package cat(cff_create("testthat")$title) ```
[Back to summary](#summary). ### type Fixed value equal to `"software"`. The other possible value is `"dataset"`. See the description on the [Guide to CFF schema v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#type). [Back to summary](#summary). ### url {#url} This key is extracted from the `"BugReports"` or `"URL"` fields on the `DESCRIPTION` file. It corresponds to the first url that is different to [repository-code](#repository-code).
Example ```{r url} # Many urls manyurls <- system.file("examples/DESCRIPTION_many_urls", package = "cffr") cat(cff_create(manyurls)$url) # Check desc::desc(manyurls) ```
[Back to summary](#summary). ### version This key is extracted from the `"Version"` field on the `DESCRIPTION` file. ```{r version} # Should be (>= 3.0.0) cat(cff_create("testthat")$version) ``` [Back to summary](#summary). ## References