% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/lookoutliers.R
\name{lookout}
\alias{lookout}
\title{Identify outliers using the lookout algorithm}
\usage{
lookout(
  X,
  alpha = 0.01,
  beta = 0.9,
  gamma = 0.97,
  bw = NULL,
  gpd = NULL,
  scale = TRUE,
  fast = NROW(X) > 1000,
  old_version = FALSE
)
}
\arguments{
\item{X}{The numerical input data in a data.frame, matrix or tibble format.}

\item{alpha}{The level of significance. Default is \code{0.01}, so a
non-outlying point has a 1/100 chance of being falsely classified as an outlier.}

\item{beta}{The quantile threshold used in the GPD estimation. Default is \code{0.90}.
To ensure there is enough data available, values greater than 0.90 are set to 0.90.}

\item{gamma}{Parameter for bandwidth calculation giving the quantile of the
Rips death radii to use for the bandwidth. Default is \code{0.97}. Ignored
under the old version, where the lower limit of the maximum Rips death radii
difference is used. Also ignored if \code{bw} is provided.}

\item{bw}{Bandwidth parameter. If \code{NULL} (default), the bandwidth is
found using persistent homology.}

\item{gpd}{Generalized Pareto distribution parameters. If \code{NULL} (the
default), these are estimated from the data.}

\item{scale}{If \code{TRUE}, the data is standardized. Under the old version,
unit scaling is applied so that each column is in the range \code{[0,1]}.
Under the new version, robust rotation and scaling is used so that the columns
are approximately uncorrelated with unit variance. Default is \code{TRUE}.}

\item{fast}{If \code{TRUE}, speeds up the computation by subsetting the data
for the bandwidth calculation. Default is \code{TRUE} when \code{X} has more
than 1000 rows, and \code{FALSE} otherwise.}

\item{old_version}{Logical indicator of which version of the algorithm to use.
Default is \code{FALSE}, meaning the newer version is used.}
}
\value{
A list with the following components:
\item{\code{outliers}}{The set of outliers.}
\item{\code{outlier_probability}}{The GPD probability of the data.}
\item{\code{outlier_scores}}{The outlier scores of the data.}
\item{\code{bandwidth}}{The bandwidth selected using persistent homology.}
\item{\code{kde}}{The kernel density estimate values.}
\item{\code{lookde}}{The leave-one-out kde values.}
\item{\code{gpd}}{The fitted GPD parameters.}
}
\description{
This function identifies outliers using the lookout algorithm, an outlier
detection method based on leave-one-out kernel density estimates and
generalized Pareto distributions.
}
\examples{
X <- rbind(
  data.frame(
    x = rnorm(500),
    y = rnorm(500)
  ),
  data.frame(
    x = rnorm(5, mean = 10, sd = 0.2),
    y = rnorm(5, mean = 10, sd = 0.2)
  )
)
lo <- lookout(X)
lo
autoplot(lo)
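
# A minimal sketch of working with the result, assuming the components
# listed under Value can be accessed by name from the returned list.
lo$outliers   # the points flagged as outliers
lo$bandwidth  # the bandwidth selected using persistent homology

# Use a less strict significance level than the default of 0.01.
lo_05 <- lookout(X, alpha = 0.05)

# Supply a bandwidth directly, here reusing the one selected above.
# When bw is provided, gamma is ignored.
lo_bw <- lookout(X, bw = lo$bandwidth)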
}
\references{
Kandanaarachchi, S, and Hyndman, RJ (2022) Leave-one-out kernel
density estimates for outlier detection,
\emph{J Computational & Graphical Statistics}, \strong{31}(2), 586-599.
\url{https://robjhyndman.com/publications/lookout/}.

Hyndman, RJ, Kandanaarachchi, S, and Turner, K (2026) When lookout meets
crackle: Anomaly detection using kernel density estimation, in preparation.
\url{https://robjhyndman.com/publications/lookout2.html}
}
