---
title: "From R to CFF"
subtitle: "Crosswalk"
description: >-
  A comprehensive description of the internal mappings performed by
  the cffr package.
author: Diego Hernangómez
bibliography: REFERENCES.bib
link-citations: yes
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{From R to CFF}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(cffr)
```
The goal of this vignette is to provide an explicit map between the metadata
fields used by **cffr** and each one of the valid keys of the [Citation File
Format schema version
1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys).
## Summary {#summary}
This section summarizes the fields that **cffr** can coerce and the original
source of information for each of them. The details of each key are presented in
the next section of the document. The assessment of the fields is based on the [Guide
to Citation File Format schema version
1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys)
[@druskat_citation_2021].
```{r summary , echo=FALSE}
keys <- cff_schema_keys(sorted = TRUE)
origin <- vector(length = length(keys))
origin[keys == "cff-version"] <- "parameter on function"
origin[keys == "type"] <- "Fixed value: 'software'"
origin[keys == "identifiers"] <- "DESCRIPTION/CITATION files"
origin[keys == "references"] <- "DESCRIPTION/CITATION files"
origin[keys %in% c(
  "message",
  "title",
  "version",
  "authors",
  "abstract",
  "repository",
  "repository-code",
  "url",
  "date-released",
  "contact",
  "keywords",
  "license",
  "commit"
)] <- "DESCRIPTION file"
origin[keys %in% c(
  "doi",
  "preferred-citation"
)] <- "CITATION file"
origin[origin == FALSE] <- "Ignored by cffr"
df <- data.frame(
  key = paste0("", keys, ""),
  source = origin
)
knitr::kable(df, escape = FALSE)
```
## Details
### abstract
This key is extracted from the `"Description"` field of the `DESCRIPTION` file.
Example
```{r abstract}
library(cffr)
# Create a cff object for rmarkdown
cff_obj <- cff_create("rmarkdown")
# Get DESCRIPTION of rmarkdown to check
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))
cat(cff_obj$abstract)
cat(pkg$get("Description"))
```
[Back to summary](#summary).
### authors
This key is coerced from the `"Author"` or `"Authors@R"` field of the
`DESCRIPTION` file. By default, persons with the role `"aut"` or `"cre"` are
considered; this can be modified via the `authors_roles` parameter.
Example
```{r authors}
# An example DESCRIPTION
path <- system.file("examples/DESCRIPTION_many_persons", package = "cffr")
pkg <- desc::desc(path)
# See persons listed
pkg$get_authors()
# Default behaviour, use authors and creators (maintainers)
cff_obj <- cff_create(path)
cff_obj$authors
# Use now Copyright holders and maintainers
cff_obj_alt <- cff_create(path, authors_roles = c("cre", "cph"))
cff_obj_alt$authors
```
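The role-based selection described above can be sketched with base R `person` objects. This is only an illustration of the filtering idea, not the code that **cffr** runs: the persons are invented and `filter_by_role()` is a hypothetical helper.

```r
# Illustrative persons (invented for this sketch, not taken from any package)
persons <- c(
  person("Jane", "Doe", role = c("aut", "cre"), email = "jane@example.com"),
  person("John", "Smith", role = "ctb"),
  person("ACME Corp", role = "cph")
)

# Hypothetical helper: keep the persons holding at least one of the given roles
filter_by_role <- function(persons, roles = c("aut", "cre")) {
  keep <- vapply(
    seq_along(persons),
    function(i) any(persons[i]$role %in% roles),
    logical(1)
  )
  persons[keep]
}

# Default roles: authors and maintainers
default_authors <- filter_by_role(persons)

# Alternative roles: maintainers and copyright holders
alt_authors <- filter_by_role(persons, roles = c("cre", "cph"))

length(default_authors)
length(alt_authors)
```

With the default roles only the first person is kept, while the alternative roles also bring in the copyright holder.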
[Back to summary](#summary).
### cff-version
This key can be set via the `cff_version` parameter of the
`cff_create()`/`cff_write()` functions:
Example
```{r cffversion}
cff_objv110 <- cff_create("jsonlite", cff_version = "v1.1.0")
cat(cff_objv110$`cff-version`)
```
[Back to summary](#summary).
### commit
This key is extracted from the `"RemoteSha"` field of the `DESCRIPTION` file.
This field is present in packages installed from the
[r-universe](https://r-universe.dev/) or with packages such as **remotes** or
**pak**.
Example
```{r commit}
# An example DESCRIPTION
path <- system.file("examples/DESCRIPTION_r_universe", package = "cffr")
pkg <- desc::desc(path)
# See RemoteSha
pkg$get("RemoteSha")
cff_read(path)
```
[Back to summary](#summary).
### contact
This key is coerced from the `"Author"` or `"Authors@R"` field of the
`DESCRIPTION` file. Only persons with the role `"cre"` (i.e., the maintainer(s))
are considered.
Example
```{r contact}
cff_obj <- cff_create("rmarkdown")
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))
cff_obj$contact
pkg$get_author()
```
[Back to summary](#summary).
### date-released
This key is extracted from the `DESCRIPTION` file following this logic:
- `"Date"` field or,
- If not present, from `"Date/Publication"`, which is present on packages built
  on **CRAN** and **Bioconductor**, or,
- If not present, from `"Packaged"`, which is present on packages built by the
  [r-universe](https://r-universe.dev/search/).
Example
```{r date-released}
# From an installed package
cff_obj <- cff_create("rmarkdown")
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))
cat(pkg$get("Date/Publication"))
cat(cff_obj$`date-released`)
# A DESCRIPTION file without a Date
nodate <- system.file("examples/DESCRIPTION_basic", package = "cffr")
tmp <- tempfile("DESCRIPTION")
# Create a temporary file
file.copy(nodate, tmp)
pkgnodate <- desc::desc(tmp)
cffnodate <- cff_create(tmp)
# Won't appear
cat(cffnodate$`date-released`)
pkgnodate
# Adding a Date
desc::desc_set("Date", "1999-01-01", file = tmp)
cat(cff_create(tmp)$`date-released`)
```
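The fallback order listed above can be sketched in base R as follows. `pick_date_released()` is a hypothetical helper written for this vignette, and truncating the value to its first ten characters (the `YYYY-MM-DD` part) is an assumption of the sketch rather than a description of **cffr** internals.

```r
# Hypothetical helper (not part of cffr): return the first available date field
pick_date_released <- function(desc_fields) {
  for (field in c("Date", "Date/Publication", "Packaged")) {
    value <- desc_fields[[field]]
    if (!is.null(value) && !is.na(value) && nzchar(value)) {
      # Assumption: keep only the leading YYYY-MM-DD part of the field
      return(substr(value, 1, 10))
    }
  }
  # None of the three fields is available
  NULL
}

# "Date" takes precedence over "Date/Publication"
with_date <- pick_date_released(list(
  Date = "1999-01-01",
  `Date/Publication` = "2020-02-02 08:00:02 UTC"
))

# Only "Packaged" is available
only_packaged <- pick_date_released(list(
  Packaged = "2023-05-01 10:22:13 UTC; builder"
))

# No date information at all
no_date <- pick_date_released(list(Package = "demo"))
```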
[Back to summary](#summary).
### doi {#doi}
This key is coerced from the `"doi"` field of the
[preferred-citation](#preferred-citation) object. If not present and the package
is on **CRAN**, it is populated with the DOI provided by **CRAN**.
Example
```{r doi}
cff_doi <- cff_create("cffr")
cat(cff_doi$doi)
cat(cff_doi$`preferred-citation`$doi)
```
[Back to summary](#summary).
### identifiers
This key includes all the possible identifiers of the package:
- From the `DESCRIPTION` file, it includes all the URLs not included in
  [url](#url) or [repository-code](#repository-code).
- From the `CITATION` file, it includes all the DOIs not included in
  [doi](#doi) and the identifiers (if any) not included in the `"identifiers"`
  key of [preferred-citation](#preferred-citation).
- If the package is on **CRAN** and it has a `CITATION` file providing a DOI,
  the DOI provided by **CRAN** is added as well.
Example
```{r identifiers}
file <- system.file("examples/DESCRIPTION_many_urls", package = "cffr")
pkg <- desc::desc(file)
cat(pkg$get_urls())
cat(cff_create(file)$url)
cat(cff_create(file)$`repository-code`)
cff_create(file)$identifiers
```
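As a rough illustration of how the URLs of a `DESCRIPTION` file end up split across keys, the base R sketch below partitions a vector of URLs. The URLs are invented, and the sketch assumes the `repository-code` value is already known. It is a simplified model of the rule described above, not the implementation used by **cffr**.

```r
# Invented URLs, as they could appear in the URL/BugReports fields
urls <- c(
  "https://github.com/user/pkg",
  "https://user.github.io/pkg/",
  "https://example.org/pkg",
  "https://example.org/mirror"
)

# Assumption: the source repository has already been identified
repository_code <- "https://github.com/user/pkg"

# The first remaining URL plays the role of the `url` key
others <- setdiff(urls, repository_code)
main_url <- others[1]

# Anything left is listed as an identifier of type "url"
identifiers <- lapply(others[-1], function(u) list(type = "url", value = u))

main_url
str(identifiers)
```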
[Back to summary](#summary).
### keywords
This key is extracted from the `DESCRIPTION` file. The keywords should appear in
the `DESCRIPTION` as:
```
...
X-schema.org-keywords: keyword1, keyword2, keyword3
```
Example
```{r keyword}
# A DESCRIPTION file without keywords
nokeywords <- system.file("examples/DESCRIPTION_basic", package = "cffr")
tmp2 <- tempfile("DESCRIPTION")
# Create a temporary file
file.copy(nokeywords, tmp2)
pkgnokeywords <- desc::desc(tmp2)
cffnokeywords <- cff_create(tmp2)
# Won't appear
cat(cffnokeywords$keywords)
pkgnokeywords
# Adding Keywords
desc::desc_set("X-schema.org-keywords", "keyword1, keyword2, keyword3",
file = tmp2
)
cat(cff_create(tmp2)$keywords)
```
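For reference, a comma-separated field like the one above can be turned into a character vector with base R alone. This is a minimal sketch that writes a throwaway `DESCRIPTION` and reads it with `read.dcf()`. The package name `demo` is invented, and the parsing shown here is an assumption about one reasonable approach, not necessarily what **cffr** does internally.

```r
# Write a minimal, invented DESCRIPTION to a temporary file
desc_path <- tempfile("DESCRIPTION")
writeLines(
  c(
    "Package: demo",
    "X-schema.org-keywords: keyword1, keyword2, keyword3"
  ),
  desc_path
)

# read.dcf() returns a character matrix with one column per field
fields <- read.dcf(desc_path)
raw_keywords <- fields[1, "X-schema.org-keywords"]

# Split on commas, trim whitespace and drop duplicates
keywords <- unique(trimws(strsplit(raw_keywords, ",", fixed = TRUE)[[1]]))
keywords
```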
Additionally, if the source code of the package is hosted on GitHub, **cffr**
can retrieve the topics of your repo via the [GitHub
API](https://docs.github.com/en/rest) and include those topics as keywords. This
option is controlled via the `gh_keywords` parameter:
Example
```{r ghkeyword}
# Get cff object from jsonvalidate
jsonval <- cff_create("jsonvalidate")
# Keywords are retrieved from the GitHub repo
jsonval
# Check keywords
jsonval$keywords
# The repo
jsonval$`repository-code`
```
[Back to summary](#summary).
### license
This key is extracted from the `"License"` field of the `DESCRIPTION` file.
Example
```{r license}
cff_obj <- cff_create("yaml")
cat(cff_obj$license)
pkg <- desc::desc(file.path(find.package("yaml"), "DESCRIPTION"))
cat(pkg$get("License"))
```
[Back to summary](#summary).
### license-url
This key is not extracted from the metadata of the package. See the description
on the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#license-url).
> - **description**: The URL of the license text under which the software or
> dataset is licensed (only for non-standard licenses not included in the
> [SPDX License
> List](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#definitionslicense-enum)).
>
> - **usage**:
>
> ``` yaml
> license-url: "https://obscure-licenses.com?id=1234"
> ```
[Back to summary](#summary).
### message
This key is built from the package name declared in the `DESCRIPTION` file, specifically as:
```{r eval=FALSE}
msg <- paste0(
'To cite package "',
"NAME_OF_THE_PACKAGE",
'" in publications use:'
)
```
Example
```{r message}
cat(cff_create("jsonlite")$message)
```
[Back to summary](#summary).
### preferred-citation {#preferred-citation}
This key is extracted from the `CITATION` file. If several references are
provided, the first citation is selected as the `"preferred-citation"` and
the rest of them are added as [references](#references).
Example
```{r preferred-citation}
cffobj <- cff_create("rmarkdown")
cffobj$`preferred-citation`
citation("rmarkdown")[1]
```
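The "first entry versus the rest" split can be reproduced on any `bibentry` object with plain indexing. The entries below are invented for illustration. The sketch covers only the split itself, not the conversion to CFF.

```r
# Three invented citation entries, as a CITATION file could provide
refs <- c(
  bibentry("Misc",
    title = "First entry", author = person("Jane", "Doe"), year = "2021"
  ),
  bibentry("Misc",
    title = "Second entry", author = person("John", "Smith"), year = "2022"
  ),
  bibentry("Misc",
    title = "Third entry", author = person("Jane", "Doe"), year = "2023"
  )
)

# The first entry plays the role of the preferred citation
preferred <- refs[1]

# The remaining entries play the role of the references
remaining <- refs[-1]

preferred$title
length(remaining)
```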
[Back to summary](#summary).
### references {#references}
This key is extracted from the `CITATION` file if several references are
provided. The first citation is considered as the
[preferred-citation](#preferred-citation) and the rest of them as
`"references"`. It also extracts the package dependencies and adds those to this
field using `citation(auto = TRUE)` on each dependency.
Example
```{r references}
cffobj <- cff_create("rmarkdown")
cffobj$references
citation("rmarkdown")[-1]
```
[Back to summary](#summary).
### repository
This key is extracted from the `"Repository"` field of the `DESCRIPTION` file.
Usually, this field is auto-populated when a package is hosted on a repo (like
**CRAN** or the [r-universe](https://r-universe.dev/)). For packages without
this field in the `DESCRIPTION` (the typical case for an in-development
package), **cffr** tries to find the package on any of the default
repositories specified in `options("repos")`.

[Bioconductor](https://bioconductor.org/) packages are identified by the
presence of a
["biocViews"](https://contributions.bioconductor.org/description.html#biocviews)
field in the `DESCRIPTION` file.

If **cffr** detects that the package is available on **CRAN**, it returns
the canonical URL form of the package.
Example
```{r repository}
# Installed package
inst <- cff_create("jsonlite")
cat(inst$repository)
# Demo file downloaded from the r-universe
runiv <- system.file("examples/DESCRIPTION_r_universe", package = "cffr")
runiv_cff <- cff_create(runiv)
cat(runiv_cff$repository)
desc::desc(runiv)$get("Repository")
# For in development package
norepo <- system.file("examples/DESCRIPTION_basic", package = "cffr")
# No repo
norepo_cff <- cff_create(norepo)
cat(norepo_cff[["repository"]])
# Change the name to a known package on CRAN: ggplot2
tmp <- tempfile("DESCRIPTION")
file.copy(norepo, tmp)
# Change name
desc::desc_set("Package", "ggplot2", file = tmp)
cat(cff_create(tmp)[["repository"]])
```
[Back to summary](#summary).
### repository-artifact
This key is not extracted from the metadata of the package. See the description
on the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#repository-artifact).
> - **description**: The URL of the work in a build artifact/binary repository
> (when the work is software).
>
> - **usage**:
>
> ``` yaml
> repository-artifact: "https://search.maven.org/artifact/org.corpus-tools/cff-maven-plugin/0.4.0/maven-plugin"
> ```
[Back to summary](#summary).
### repository-code {#repository-code}
This key is extracted from the `"BugReports"` or `"URL"` fields on the
`DESCRIPTION` file. **cffr** tries to identify the url of the source on the
following repositories:
- [GitHub](https://github.com/).
- [GitLab](https://about.gitlab.com/).
- [R-Forge](https://r-forge.r-project.org/).
- [Bitbucket](https://bitbucket.org/).
Example
```{r repository-code}
# Installed package on GitHub
cff_create("jsonlite")$`repository-code`
# GitLab
gitlab <- system.file("examples/DESCRIPTION_gitlab", package = "cffr")
cat(cff_create(gitlab)$`repository-code`)
# Check
desc::desc(gitlab)
```
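One way to picture the detection step is a pattern match on the host of each candidate URL. The sketch below is a simplified assumption: `find_repo_code()` is a hypothetical helper, the URLs are invented, only the public hosts listed above are matched (self-hosted instances are ignored), and the `/issues` suffix typical of `"BugReports"` is stripped.

```r
# Hypothetical helper: first URL hosted on a known code-hosting platform
find_repo_code <- function(urls) {
  pattern <- paste0(
    "^https?://(www\\.)?",
    "(github\\.com|gitlab\\.com|r-forge\\.r-project\\.org|bitbucket\\.org)/"
  )
  hits <- urls[grepl(pattern, urls, ignore.case = TRUE)]
  if (length(hits) == 0) {
    return(NULL)
  }
  # "BugReports" usually points to the issue tracker: drop that suffix
  sub("/issues/?$", "", hits[1])
}

# The documentation site is skipped, the issue tracker reveals the repo
from_bugreports <- find_repo_code(c(
  "https://example.org/docs",
  "https://github.com/user/pkg/issues"
))

# No known host: nothing is returned
no_repo <- find_repo_code("https://example.org/docs")

from_bugreports
```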
[Back to summary](#summary).
### title
This key is built from the `"Package"` and `"Title"` fields of the `DESCRIPTION` file:
```{r eval=FALSE}
title <- paste0(
"NAME_OF_THE_PACKAGE",
": ",
"TITLE_OF_THE_PACKAGE"
)
```
Example
```{r title}
# Installed package
cat(cff_create("testthat")$title)
```
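The template above can be tried without **cffr** by reading a `DESCRIPTION` with `read.dcf()`. The file written here is invented for the sketch, and the snippet only mirrors the string template shown above.

```r
# Write a minimal, invented DESCRIPTION to a temporary file
desc_path <- tempfile("DESCRIPTION")
writeLines(
  c(
    "Package: demo",
    "Title: A Demo Package",
    "Version: 0.1.0"
  ),
  desc_path
)

# read.dcf() returns a character matrix with one column per field
fields <- read.dcf(desc_path)

# Compose the title as "<Package>: <Title>"
title <- paste0(fields[1, "Package"], ": ", fields[1, "Title"])
cat(title)
```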
[Back to summary](#summary).
### type
Fixed value equal to `"software"`. The other possible value is `"dataset"`. See
the description on the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#type).
[Back to summary](#summary).
### url {#url}
This key is extracted from the `"BugReports"` or `"URL"` fields on the
`DESCRIPTION` file. It corresponds to the first url that is different to
[repository-code](#repository-code).
Example
```{r url}
# Many urls
manyurls <- system.file("examples/DESCRIPTION_many_urls", package = "cffr")
cat(cff_create(manyurls)$url)
# Check
desc::desc(manyurls)
```
[Back to summary](#summary).
### version
This key is extracted from the `"Version"` field on the `DESCRIPTION` file.
```{r version}
# Should be (>= 3.0.0)
cat(cff_create("testthat")$version)
```
[Back to summary](#summary).
## References