% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/gsnORAtest.R
\name{gsnORAtest}
\alias{gsnORAtest}
\title{gsnORAtest}
\usage{
gsnORAtest(l, bg, geneSetCollection, Alpha = 0.05, full = FALSE)
}
\arguments{
\item{l}{A vector containing an experimentally-derived set of genes. These may be significantly differentially
expressed genes, genes with differential chromatin accessibility or positives from a screen.}

\item{bg}{A vector containing a background of observable genes.}

\item{geneSetCollection}{A gene set collection to query, either a tmod object or a list of character vectors containing
gene sets for which the list element names are the gene set IDs.}

\item{Alpha}{The significance cutoff applied to adjusted p-values (default 0.05).}

\item{full}{If \code{TRUE}, additional data are included in the results set, specifically the contingency table
values (default \code{FALSE}).}
}
\value{
Returns a data.frame with an ORA (overrepresentation analysis) results set containing the following columns:

\itemize{
\item{\emph{ID}: the gene set identifiers.}
\item{\emph{Title}: The "Title" field from \code{tmod} class gene set collection objects, corresponding to the reformatted
\code{STANDARD_NAME} field in an MSigDB xml file, with underscores replaced by spaces and only the initial letter
capitalized. \strong{NOTE:} If the search is done using a list of gene sets rather than a \code{tmod} object, this
column will contain NA.}
\item{\emph{a}: the number of genes observed in the background but not in \emph{l} or the queried gene set. (present
only if \code{full == TRUE})}
\item{\emph{b}: the number of observed genes in \emph{l} but not the queried gene set. (present only if \code{full == TRUE})}
\item{\emph{c}: the number of observed genes in the queried gene set but not \emph{l}. (present only if \code{full == TRUE})}
\item{\emph{d}: the number of observed genes in both \emph{l} and the queried gene set, i.e. the overlap. (present only
if \code{full == TRUE})}
\item{\emph{N}: the number of observed genes in the queried gene set.}
\item{\emph{Enrichment}: The fold overrepresentation of genes in the overlap set \emph{d} calculated as:
\deqn{E = (d / (c+d)) / ((b+d)/(a+b+c+d))}
}
\item{\emph{P_2S}: 2-sided Fisher \emph{p}-value. (\emph{NOT} log-transformed, present only if \code{full == TRUE})}
\item{\emph{adj.P.2S}: 2-sided Fisher \emph{p}-value corrected using the method of Benjamini & Hochberg(1) and implemented in
the \code{stats} package. (present only if \code{full == TRUE})}
\item{\emph{P_1S}: 1-sided Fisher \emph{p}-value. (\emph{NOT} log-transformed.)}
\item{\emph{adj.P.1S}: 1-sided Fisher \emph{p}-value corrected using the method of Benjamini & Hochberg(1) and implemented in
the \code{stats} package. (present only if \code{full == TRUE})}
}
}
\description{
Perform an ORA test using an experimentally-derived gene set to query a gene set collection.
}
\details{
This function allows rapid and easy overrepresentation analysis using an unordered experimental
gene set to query a gene set collection that may be either an arbitrary list of gene sets or a \code{tmod} class
gene set collection. The statistical tests provided include both the standard two-sided Fisher test and a 1-sided
Fisher test, similar to what is provided by the DAVID pathways analysis web application(2).

If a list of gene sets is provided as the \code{geneSetCollection} argument, it must be structured as a list of
character vectors containing gene symbols (or whatever identifiers are used for the supplied experimental gene set),
with the list element names being the gene set IDs.
}
\examples{

library(GSNA)

# From a differential expression data set, we can generate a
# subset of genes with significant differential expression,
# up or down. Here we will extract genes with significant
# negative differential expression with
# avg_log2FC < 0 and p_val_adj < 0.05 from Seurat data:

sig_DN.genes <-
   toupper( rownames(subset( Bai_CiHep_v_Fib2.de,
                       avg_log2FC < 0  & p_val_adj < 0.05 )))

# Using all the genes in the differential expression data set,
# we can obtain a suitable background:
bg <- toupper(rownames( Bai_CiHep_v_Fib2.de ))

# Now, we can do an overrepresentation analysis search on this
# data using the Bai_gsc.tmod gene set collection included in
# the sample data:
sig_DN.gsnora <- gsnORAtest( l = sig_DN.genes,
                             bg = bg,
                             geneSetCollection = Bai_gsc.tmod )
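
# The contingency table columns (a, b, c, d), N, Enrichment and the
# Fisher p-values described under Value can be illustrated with a
# minimal base-R sketch on toy data. This is NOT the GSNA
# implementation: the toy gene IDs, the object names (toy_bg, toy_l,
# toy_gsc, toy_ora) and the use of stats::fisher.test() are
# assumptions made purely for illustration.
toy_bg  <- paste0( "G", 1:20 )   # background of 20 observable genes
toy_l   <- paste0( "G", 1:6 )    # experimentally derived gene set
toy_gsc <- list( SET_A = paste0( "G", c( 1:4, 11:12 ) ),
                 SET_B = paste0( "G", c( 6, 13:17 ) ) )

toy_ora <- do.call( rbind, lapply( names( toy_gsc ), function( id ){
  # Restrict the gene set to genes observable in the background.
  gs  <- intersect( toy_gsc[[id]], toy_bg )
  n_d <- length( intersect( toy_l, gs ) )     # in l and in the gene set
  n_b <- length( setdiff( toy_l, gs ) )       # in l only
  n_c <- length( setdiff( gs, toy_l ) )       # in the gene set only
  n_a <- length( toy_bg ) - n_b - n_c - n_d   # in neither
  # 2 x 2 table: rows = in l / not in l, columns = in set / not in set.
  ct  <- matrix( c( n_d, n_c, n_b, n_a ), nrow = 2 )
  data.frame( ID = id, a = n_a, b = n_b, c = n_c, d = n_d,
              N = n_c + n_d,
              Enrichment = ( n_d / ( n_c + n_d ) ) /
                           ( ( n_b + n_d ) / ( n_a + n_b + n_c + n_d ) ),
              P_2S = stats::fisher.test( ct,
                                         alternative = "two.sided" )$p.value,
              P_1S = stats::fisher.test( ct,
                                         alternative = "greater" )$p.value )
} ) )

# Benjamini & Hochberg correction across the tested gene sets:
toy_ora$adj.P.2S <- stats::p.adjust( toy_ora$P_2S, method = "BH" )
toy_ora$adj.P.1S <- stats::p.adjust( toy_ora$P_1S, method = "BH" )
toy_ora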

}
\references{
\enumerate{
\item Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. \emph{Journal of the Royal Statistical Society Series B}, \strong{57}, 289–300. <\href{http://www.jstor.org/stable/2346101}{http://www.jstor.org/stable/2346101}>.
\item Dennis G Jr, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA. (2003). DAVID: Database for Annotation, Visualization, and Integrated Discovery. \emph{Genome Biol.}, \strong{4}(5):P3. Epub 2003 Apr 3.
}
}
\seealso{
\code{\link{gsnORAtest_cpp}}
\code{\link[stats]{p.adjust}}
}
