Directly quoting from Reactome:
REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database. Our goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology and education. Founded in 2003, the Reactome project is led by Lincoln Stein of OICR, Peter D’Eustachio of NYULMC, Henning Hermjakob of EMBL-EBI, and Guanming Wu of OHSU.
(source: https://reactome.org/what-is-reactome)
Reactome provides two RESTful API services: Reactome content services and Reactome analysis services. In rbioapi, the naming scheme is that any function belonging to the analysis services starts with rba_reactome_analysis*. Other rba_reactome_* functions, without the ‘analysis’ infix, correspond to the content services API.
Before reading further, it is a good idea to read the Reactome Data Model page.
This section mostly revolves around the rba_reactome_analysis() function, so naturally we will start with that. As explained in the function’s manual, you have considerable freedom in providing the main input for this function: you can supply an R object (a data frame, matrix, or simple vector), a URL, or a local file path. Note that the type of analysis is decided based on whether your input is 1-dimensional or 2-dimensional. This is explained in detail in the manual of rba_reactome_analysis(); see that for more information.
rba_reactome_analysis() is the API equivalent of Reactome’s analyse gene list tool. You can see that the function’s arguments correspond to what you would choose in the webpage’s wizard.
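To make the 1-dimensional versus 2-dimensional distinction concrete, here is a minimal sketch of a 2-dimensional input. The expression values and column names are invented purely for illustration, and the layout (identifiers in the first column, numeric values in the remaining columns) is an assumption based on the function's manual, so please check the manual before relying on it.

```r
library(rbioapi)

# Hypothetical expression values, invented for illustration only
expr_input <- data.frame(
  gene     = c("BRCA1", "CDK1", "PLK1", "RHOA"),
  sample_1 = c(2.1, 0.5, 1.3, 3.2),
  sample_2 = c(1.8, 0.7, 1.1, 2.9)
)

# A 2-dimensional input should be handled as an expression analysis
expr_analyzed <- rba_reactome_analysis(input = expr_input)

# The summary element reports which analysis type was actually performed
expr_analyzed$summary$type
```

Supplying only the first column as a plain vector would instead request an over-representation analysis, which is what the example that follows does.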
## 1 We create a simple vector with our genes
genes <- c("p53", "BRCA1", "cdk2", "Q99835", "CDC42", "CDK1", "KIF23", "PLK1", "RAC2", "RACGAP1", "RHOA", "RHOB", "MSL1", "PHF21A", "INSR", "JADE2", "P2RX7", "CCDC101", "PPM1B", "ANAPC16", "CDH8", "HSPA1L", "CUL2", "ZNF302", "CUX1", "CYTH2", "SEC22C", "EIF4E3", "ROBO2", "CXXC1", "LINC01314", "ATP5F1")
## 2 We call Reactome analysis, setting only the projection and the p-value cutoff explicitly
analyzed <- rba_reactome_analysis(input = genes,
                                  projection = TRUE,
                                  p_value = 0.01)
## 3 As always, we use str() to inspect the results
str(analyzed, 1)
#> List of 8
#> $ summary :List of 7
#> $ expression :List of 1
#> $ identifiersNotFound: int 1
#> $ pathwaysFound : int 79
#> $ pathways :'data.frame': 79 obs. of 19 variables:
#> $ resourceSummary :'data.frame': 3 obs. of 3 variables:
#> $ speciesSummary :'data.frame': 1 obs. of 5 variables:
#> $ warnings : list()
## 4 Note that in the summary element: (analyzed$summary)
### 4.a because we supplied a simple vector, the analysis type was: over-representation
### 4.b You need the token for other rba_reactome_analysis_* functions
## 5 Analysis results are in the pathways data frame (analyzed$pathways)
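A short sketch of how you might look at that data frame. To stand alone, it re-runs a small analysis with a handful of gene symbols from the list above. The column names `stId` and `name` are an assumption about the returned structure, so the sketch only selects them if they are present.

```r
library(rbioapi)

# Re-run a small over-representation analysis so this sketch stands alone
demo_genes <- c("BRCA1", "CDK1", "PLK1", "RHOA", "CDC42")
demo <- rba_reactome_analysis(input = demo_genes)
pathways <- demo$pathways

# List the available columns first; the exact set may change between releases
names(pathways)

# Keep the columns assumed to hold the pathway ID and name, if they exist
wanted <- intersect(c("stId", "name"), names(pathways))
head(pathways[, wanted, drop = FALSE])
```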
As mentioned, some of rba_reactome_analysis()’s arguments correspond to the wizard of the analyse gene list tool; other arguments correspond to the contents of the “Filter your results” tab in the results page.
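As an illustration of the filter-style arguments, the sketch below restricts the results by resource, p-value, disease pathways, and pathway size. The argument names and values shown here are my reading of the function's manual and the thresholds are arbitrary, so treat them as assumptions and confirm them with ?rba_reactome_analysis.

```r
library(rbioapi)

# A handful of gene symbols taken from the example above
demo_genes <- c("BRCA1", "CDK1", "PLK1", "RHOA", "CDC42")

# Arbitrary filter values, chosen only to show where each filter goes
filtered <- rba_reactome_analysis(input = demo_genes,
                                  resource = "UNIPROT",     # restrict to one resource
                                  p_value = 0.05,           # p-value cutoff
                                  include_disease = FALSE,  # drop disease pathways
                                  min = 10,                 # smallest pathway size
                                  max = 500)                # largest pathway size

# Number of pathways that passed the filters
filtered$pathwaysFound
```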
Having the analysis’s token, you can retrieve the analysis results in many formats using rba_reactome_analysis_pdf() and rba_reactome_analysis_download():
# download a full pdf report
rba_reactome_analysis_pdf(token = analyzed$summary$token,
                          species = 9606)
# download the complete results as a JSON file
rba_reactome_analysis_download(token = analyzed$summary$token,
                               request = "results",
                               save_to = "reactome_results.json")
Your token is only guaranteed to be stored for 7 days. After that, you can upload the JSON file that you downloaded with rba_reactome_analysis_download() and get a new token for it.
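A minimal sketch of that round trip follows. It assumes that rba_reactome_analysis_import() is the rbioapi function for re-uploading a saved result and that it accepts a local file path through its input argument; the temporary file location is just for illustration.

```r
library(rbioapi)

# Run a small analysis to obtain a token (gene symbols from the example above)
demo <- rba_reactome_analysis(input = c("BRCA1", "CDK1", "PLK1", "RHOA", "CDC42"))

# Save the complete results as JSON in a temporary location
json_path <- file.path(tempdir(), "reactome_results.json")
rba_reactome_analysis_download(token = demo$summary$token,
                               request = "results",
                               save_to = json_path)

# Later, for example after the token has expired, import the saved file
re_imported <- rba_reactome_analysis_import(input = json_path)

# Inspect the output to locate the newly issued token
str(re_imported, 1)
```

The new token can then be passed to the other rba_reactome_analysis_* functions in the same way as the original one.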
Please Note: Other services supported by rbioapi also provide Over-representation analysis tools. Please see the vignette article Do with rbioapi: Over-Representation (Enrichment) Analysis in R (link to the documentation site) for an in-depth review.
rbioapi functions that correspond to Reactome content services are those starting with rba_reactome_* but without the “_analysis” infix. These functions cover what you can do with objects in the Reactome knowledge-base. In simpler terms, most of them (but not all) correspond to what you can find in the Reactome Pathway Browser and search results (e.g. a pathway, a reaction, a physical entity, etc.).
Using rba_reactome_query(), you can retrieve any object from the Reactome knowledge-base. In simpler terms, an object here is roughly anything that Reactome has assigned an ID to. This can range from a person’s entry to proteins, reactions, pathways, species, and many more! You can explore Reactome’s data schema to learn about Reactome knowledge-base objects and their organization. Here are some examples. Note that you are not limited to one ID per query: you can use a vector of inputs. The only limitation is that when you supply more than one ID, you cannot have enhanced = TRUE.
## 1 query a pathway Entry
pathway <- rba_reactome_query(ids = "R-HSA-109581", enhanced = TRUE)
## 2 As always we use str() to inspect the output's structure
str(pathway, 2)
#> List of 26
#> $ dbId : int 109581
#> $ displayName : chr "Apoptosis"
#> $ stId : chr "R-HSA-109581"
#> $ stIdVersion : chr "R-HSA-109581.6"
#> $ created :List of 5
#> ..$ dbId : int 109608
#> ..$ displayName: chr "Alnemri, E, Hengartner, Michael, Tschopp, Jürg, Tsujimoto, Yoshihide, Hardwick, JM, 2004-01-16"
#> ..$ dateTime : chr "2004-01-16 21:01:51"
#> ..$ className : chr "InstanceEdit"
#> ..$ schemaClass: chr "InstanceEdit"
#> $ modified :List of 6
#> ..$ dbId : int 10931649
#> ..$ displayName: chr "Wright, Adam, 2024-03-08"
#> ..$ dateTime : chr "2024-03-08 03:53:59"
#> ..$ note : chr "Inserted by org.reactome.orthoinference"
#> ..$ className : chr "InstanceEdit"
#> ..$ schemaClass: chr "InstanceEdit"
#> $ isInDisease : logi FALSE
#> $ isInferred : logi FALSE
#> $ name :List of 1
#> ..$ : chr "Apoptosis"
#> $ releaseDate : chr "2004-09-20"
#> $ speciesName : chr "Homo sapiens"
#> $ authored :List of 1
#> ..$ : int 109608
#> $ edited :List of 1
#> ..$ :List of 5
#> $ figure :List of 1
#> ..$ :List of 5
#> $ goBiologicalProcess:List of 9
#> ..$ dbId : int 2273
#> ..$ displayName : chr "apoptotic process"
#> ..$ accession : chr "0006915"
#> ..$ databaseName: chr "GO"
#> ..$ definition : chr "A programmed cell death process which begins when a cell receives an internal (e.g. DNA damage) or external sig"| __truncated__
#> ..$ name : chr "apoptotic process"
#> ..$ url : chr "https://www.ebi.ac.uk/QuickGO/term/GO:0006915"
#> ..$ className : chr "GO_BiologicalProcess"
#> ..$ schemaClass : chr "GO_BiologicalProcess"
#> $ literatureReference:List of 7
#> ..$ :List of 11
#> ..$ :List of 11
#> ..$ :List of 11
#> ..$ :List of 11
#> ..$ :List of 11
#> ..$ :List of 11
#> ..$ :List of 11
#> $ orthologousEvent :List of 14
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> ..$ :List of 15
#> $ reviewed :List of 1
#> ..$ :List of 5
#> $ species :List of 1
#> ..$ :List of 8
#> $ summation :List of 1
#> ..$ :List of 5
#> $ reviewStatus :List of 6
#> ..$ dbId : int 9821382
#> ..$ displayName: chr "five stars"
#> ..$ definition : chr "externally reviewed"
#> ..$ name :List of 1
#> ..$ className : chr "ReviewStatus"
#> ..$ schemaClass: chr "ReviewStatus"
#> $ hasDiagram : logi TRUE
#> $ hasEHLD : logi TRUE
#> $ hasEvent :List of 4
#> ..$ :List of 15
#> ..$ :List of 16
#> ..$ :List of 16
#> ..$ :List of 15
#> $ className : chr "Pathway"
#> $ schemaClass : chr "Pathway"
## 3 You can compare it with the webpage of R-HSA-109581 entry:
# https://reactome.org/content/detail/R-HSA-109581
## 1 query a protein Entry
protein <- rba_reactome_query(ids = 66247, enhanced = TRUE)
## 2 As always we use str() to inspect the output's structure
str(protein, 1)
#> List of 27
#> $ dbId : int 66247
#> $ displayName : chr "UniProt:P25942-1 CD40"
#> $ modified :List of 6
#> $ databaseName : chr "UniProt"
#> $ identifier : chr "P25942"
#> $ name :List of 1
#> $ otherIdentifier :List of 108
#> $ url : chr "https://purl.uniprot.org/uniprot/P25942-1"
#> $ crossReference :List of 29
#> $ referenceDatabase :List of 8
#> $ physicalEntity :List of 1
#> $ checksum : chr "BC8776EC2C4A5680"
#> $ comment :List of 1
#> $ description :List of 1
#> $ geneName :List of 2
#> $ isSequenceChanged : logi FALSE
#> $ keyword :List of 16
#> $ secondaryIdentifier:List of 8
#> $ sequenceLength : int 277
#> $ species : int 48887
#> $ chain :List of 2
#> $ referenceGene :List of 12
#> $ referenceTranscript:List of 4
#> $ variantIdentifier : chr "P25942-1"
#> $ isoformParent :List of 1
#> $ className : chr "ReferenceIsoform"
#> $ schemaClass : chr "ReferenceIsoform"
## 3 You can compare it with the corresponding entry on the Reactome website
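The note above about supplying several IDs at once can be sketched as follows. Both identifiers are the ones already used in this article, and the sketch assumes that stable IDs and dbIds can be mixed in a single vector. The enhanced argument is left at its default because more than one ID is supplied.

```r
library(rbioapi)

# One pathway stable ID and one dbId, both taken from the examples above
several <- rba_reactome_query(ids = c("R-HSA-109581", "66247"))

# Inspect the top level of the returned structure
str(several, 1)
```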
As you can see, in the second example of rba_reactome_query() usage we used Reactome’s dbId 66247 to query the CD40 protein. How did we obtain that in the first place? You can use rba_reactome_xref() to map any cross-reference (external) ID to Reactome IDs.
## 1 We supply an HGNC symbol to find the corresponding database ID in Reactome
xref_protein <- rba_reactome_xref("CD40")
## 2 As always use str() to inspect the output's structure
str(xref_protein, 1)
#> List of 19
#> $ dbId : int 66247
#> $ displayName : chr "UniProt:P25942-1 CD40"
#> $ databaseName : chr "UniProt"
#> $ identifier : chr "P25942"
#> $ name :List of 1
#> $ otherIdentifier :List of 1
#> $ url : chr "https://purl.uniprot.org/uniprot/P25942-1"
#> $ checksum : chr "BC8776EC2C4A5680"
#> $ comment :List of 1
#> $ description :List of 1
#> $ geneName :List of 1
#> $ isSequenceChanged : logi FALSE
#> $ keyword :List of 1
#> $ secondaryIdentifier:List of 1
#> $ sequenceLength : int 277
#> $ chain :List of 1
#> $ variantIdentifier : chr "P25942-1"
#> $ className : chr "ReferenceIsoform"
#> $ schemaClass : chr "ReferenceIsoform"
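Putting the two steps together, a minimal sketch of going from an external identifier to a full Reactome entry might look like this. It relies on the dbId element shown in the output above.

```r
library(rbioapi)

# Step 1: map the external identifier to a Reactome entry
xref_protein <- rba_reactome_xref("CD40")

# Step 2: reuse the returned dbId to query the full entry
cd40_id <- xref_protein$dbId
protein <- rba_reactome_query(ids = cd40_id, enhanced = TRUE)

# Confirm which entry was retrieved
protein$displayName
```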
While we are on the topic of cross-references, here is another useful resource. Using rba_reactome_mapping() you can find the Reactome pathways or reactions which include your external ID:
## 1 Again, consider CD40 protein:
xref_mapping <- rba_reactome_mapping(id = "CD40",
                                     resource = "hgnc",
                                     map_to = "pathways")
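Since the mapping can target reactions as well as pathways, the same call can be sketched with the other target. This assumes that "reactions" is the accepted value of map_to for that case.

```r
library(rbioapi)

# Same external ID as above, but mapped to reactions instead of pathways
xref_reactions <- rba_reactome_mapping(id = "CD40",
                                       resource = "hgnc",
                                       map_to = "reactions")

# Inspect the top level of the returned structure
str(xref_reactions, 1)
```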
There are still more rbioapi Reactome content functions that were not covered in this vignette. Here is a brief overview; see the functions’ manuals for detailed guides and examples.
rba_reactome_version(): Return the current Reactome version.
rba_reactome_diseases(): Retrieve a list of diseases annotated in Reactome.
rba_reactome_species(): Retrieve a list of species annotated in Reactome.
rba_reactome_complex_list(): Get a list of complexes that contain your molecule.
rba_reactome_complex_subunits(): Get the list of subunits in your complex.
rba_reactome_participant_of(): Get a list of Reactome sets and complexes in which your entity (event, molecule, reaction, pathway, etc.) is a participant.
rba_reactome_entity_other_forms()
rba_reactome_event_ancestors()
rba_reactome_participants()
rba_reactome_pathways_events()
rba_reactome_orthology()
rba_reactome_event_hierarchy(): Retrieve the full event hierarchy of a species.
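As a quick illustration, the simplest of these functions can be called without any arguments. This is only a sketch that assumes the default argument values are suitable; the structure of each result is best explored with str().

```r
library(rbioapi)

# Current Reactome release
reactome_version <- rba_reactome_version()
reactome_version

# Species and diseases annotated in Reactome, using default arguments
species <- rba_reactome_species()
diseases <- rba_reactome_diseases()

# Inspect the top level of each result
str(species, 1)
str(diseases, 1)
```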
To cite Reactome (Please see https://reactome.org/cite):
To cite rbioapi:
#> R version 4.3.3 (2024-02-29 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 11 x64 (build 22631)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=C
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> time zone: Europe/Brussels
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] rbioapi_0.8.1
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.35 R6_2.5.1 fastmap_1.1.1 xfun_0.43
#> [5] magrittr_2.0.3 cachem_1.0.8 knitr_1.45 htmltools_0.5.8
#> [9] rmarkdown_2.26 lifecycle_1.0.4 DT_0.32 cli_3.6.2
#> [13] sass_0.4.9 jquerylib_0.1.4 compiler_4.3.3 httr_1.4.7
#> [17] rstudioapi_0.16.0 tools_4.3.3 curl_5.2.1 evaluate_0.23
#> [21] bslib_0.6.2 yaml_2.3.8 htmlwidgets_1.6.4 rlang_1.1.3
#> [25] jsonlite_1.8.8 crosstalk_1.2.1