% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/fdr_od.R
\name{fdr_od}
\alias{fdr_od}
\title{Tail-area FDR and Confidence Interval}
\usage{
fdr_od(
  obsp,
  permp = NULL,
  pnm,
  ntests,
  thres,
  cl = 0.95,
  c1 = NA,
  meff = TRUE,
  seff = TRUE,
  mymat,
  nperms = 5
)
}
\arguments{
\item{obsp}{observed vector of p-values.}

\item{permp}{list of data frames, each of which includes a column of
permutation p-values (or statistics). The length of the list equals the
number of permutations. If \code{permp = NULL}, the parametric estimator is calculated.}

\item{pnm}{name of column in each list component dataframe that includes
p-values (or statistics).}

\item{ntests}{total number of observed tests, which is usually the same as
the length of obsp and the number of rows in each permp dataframe. However,
this may not be the case if results were filtered by a p-value threshold or
statistic threshold. If filtering was conducted then thres must be smaller
(more extreme) than the filtering criterion.}

\item{thres}{significance threshold.}

\item{cl}{confidence level (default is .95).}

\item{c1}{overdispersion parameter. If this parameter is not specified
(default initial value is NA), then the parameter is estimated from the
data. If all tests are known to be independent, then this parameter should
be set to 1.}

\item{meff}{(For parametric estimation, if \code{permp = NULL}.) Logical. If \code{TRUE}, the effective number of tests is calculated using the JM estimator. Default is \code{TRUE}.}

\item{seff}{(For parametric estimation, if \code{permp = NULL}.) Logical. If \code{TRUE}, the effective number of rejected hypotheses is calculated using the JM estimator. Default is \code{TRUE}.}

\item{mymat}{(For parametric estimation, if \code{permp = NULL}.) Matrix. Design matrix used to calculate the p-values provided in \code{obsp}.}

\item{nperms}{(For parametric estimation, if \code{permp = NULL}.) Integer. Number of permutations used to estimate the effective number of (rejected) tests. Must be non-zero; default is 5.}
}
\value{
For the permutation-based estimator, a list which includes:
\item{FDR}{FDR point estimate}
\item{ll}{lower confidence limit}
\item{ul}{upper confidence limit}
\item{pi0}{proportion of true null hypotheses}
\item{c1}{overdispersion parameter}
\item{S}{observed number of positive tests}
\item{Sp}{total number of positive tests summed across all permuted result sets}

For the parametric estimator, a list which includes:
\item{FDR}{FDR point estimate}
\item{ll}{lower confidence limit}
\item{ul}{upper confidence limit}
\item{M}{total number of tests}
\item{M.eff}{effective number of tests. \code{NA} if \code{meff = FALSE}}
\item{S}{observed number of positive tests}
\item{S.eff}{effective number of positive tests. \code{NA} if \code{seff = FALSE}}
}
\description{
This function estimates the tail-area FDR and its confidence interval at a
selected significance threshold. Given results from permuted data, it also
estimates pi0, the proportion of true null hypotheses. If \code{permp = NULL},
a parametric estimator is used instead of permutations.
}
\details{
If a very large number of tests are conducted, it may be useful to filter
results, that is, to save only the results of tests that meet some relaxed
nominal significance threshold. This alleviates the need to record results
for tests that are clearly non-significant. Results from \code{fdr_od()} are
valid as long as \code{thres} is smaller than the relaxed nominal
significance threshold applied to both the observed and permuted results.

The input to \code{fdr_od()} need not be p-values. However, \code{fdr_od()}
is designed for statistics in which smaller values are more extreme than
larger values, as is the case for p-values. Therefore, if raw statistics are
used, a transformation may be necessary to ensure that smaller values are
more likely to be associated with false null hypotheses than larger values.

In certain situations, for instance when a large proportion of tests meet
the significance threshold, \code{pi0} is estimated to be very small and thus
has a large influence on the FDR estimate. To limit this influence,
\code{pi0} is constrained to be 0.5 or greater, resulting in a more
conservative estimate under these conditions.
}
\examples{

ss=100
nvar=100
X = as.data.frame(matrix(rnorm(ss*nvar),nrow=ss,ncol=nvar))
e = as.data.frame(matrix(rnorm(ss*nvar),nrow=ss,ncol=nvar))
Y = .1*X + e
nperm = 10

myanalysis = function(X,Y){
	ntests = ncol(X)
	rslts = as.data.frame(matrix(NA,nrow=ntests,ncol=2))
	names(rslts) = c("ID","pvalue")
	rslts[,"ID"] = 1:ntests
	for(i in 1:ntests){
		fit = cor.test(X[,i],Y[,i],na.action="na.exclude",
			alternative="two.sided",method="pearson")
		rslts[i,"pvalue"] = fit$p.value
	}
	return(rslts)
} # End myanalysis

# Generate observed results
obs = myanalysis(X,Y)

## Generate permuted results
perml = vector('list',nperm)
for(p_ in 1:nperm){
	# Permute the rows (samples) of X; the row count is ss, not nvar
	X1 = X[order(runif(ss)),]
	perml[[p_]] = myanalysis(X1,Y)
}

## FDR results (permutation)
fdr_od(obs$pvalue,perml,"pvalue",nvar, thres = .05)
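
## Filtered input (a sketch of the filtering described in Details): keep only
## results with p < 0.2, but still pass the unfiltered test count as ntests.
## The 0.2 cutoff and the names 'filt', 'obs_f', and 'perml_f' are
## illustrative assumptions; thres (0.05) stays below the filtering cutoff.
filt = 0.2
obs_f = obs[obs$pvalue < filt,]
perml_f = lapply(perml, function(d) d[d$pvalue < filt,])
fdr_od(obs_f$pvalue,perml_f,"pvalue",nvar, thres = .05)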

## FDR results (parametric)
fdr_od(obs$pvalue, permp = NULL, thres = 0.05, meff = FALSE, seff = FALSE, mymat = X)
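
## Raw statistics instead of p-values (a sketch of the transformation
## described in Details): a larger |r| is more extreme, so 1 - |r| is used to
## make smaller values more extreme. The helper 'mystat', the column name
## "stat", and the cutoff |r| >= 0.2 (thres = 0.8) are illustrative
## assumptions, not part of the package.
mystat = function(X,Y){
	r = sapply(1:ncol(X), function(i) cor(X[,i],Y[,i]))
	data.frame(ID = 1:ncol(X), stat = 1 - abs(r))
}
obs_s = mystat(X,Y)
perml_s = lapply(1:nperm, function(p_) mystat(X[sample(ss),],Y))
fdr_od(obs_s$stat,perml_s,"stat",nvar, thres = .8)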
}
\references{
Millstein J, Volfson D. 2013. Computationally efficient
permutation-based confidence interval estimation for tail-area FDR.
Frontiers in Genetics (Statistical Genetics and Methodology) 4(179):1-11.
}
\author{
Joshua Millstein, Eric S. Kawaguchi
}
\keyword{htest}
\keyword{nonparametric}
