% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/stat-quant-eq.R
\name{stat_quant_eq}
\alias{stat_quant_eq}
\title{Equation, rho, AIC and BIC from quantile regression}
\usage{
stat_quant_eq(
  mapping = NULL,
  data = NULL,
  geom = "text_npc",
  position = "identity",
  ...,
  formula = NULL,
  quantiles = c(0.25, 0.5, 0.75),
  method = "rq:br",
  method.args = list(),
  eq.with.lhs = TRUE,
  eq.x.rhs = NULL,
  coef.digits = 3,
  coef.keep.zeros = TRUE,
  rho.digits = 2,
  label.x = "left",
  label.y = "top",
  label.x.npc = NULL,
  label.y.npc = NULL,
  hstep = 0,
  vstep = NULL,
  output.type = "expression",
  na.rm = FALSE,
  orientation = NA,
  parse = NULL,
  show.legend = FALSE,
  inherit.aes = TRUE
)
}
\arguments{
\item{mapping}{The aesthetic mapping, usually constructed with
\code{\link[ggplot2]{aes}}. Only needs to be
set at the layer level if you are overriding the plot defaults.}

\item{data}{A layer specific dataset, only needed if you want to override
the plot defaults.}

\item{geom}{The geometric object to use to display the data.}

\item{position}{The position adjustment to use for overlapping points on this
layer}

\item{...}{other arguments passed on to \code{\link[ggplot2]{layer}}. This
can include aesthetics whose values you want to set, not map. See
\code{\link[ggplot2]{layer}} for more details.}

\item{formula}{a formula object. Using aesthetic names instead of
original variable names.}

\item{quantiles}{numeric vector Values in 0..1 indicating the quantiles.}

\item{method}{function or character If character, "rq" or the name of a model
fit function are accepted, possibly followed by the fit function's
\code{method} argument separated by a colon (e.g. \code{"rq:br"}). If a
function different to \code{rq()}, it must accept arguments named
\code{formula}, \code{data}, \code{weights}, \code{tau} and \code{method}
and return a model fit object of class \code{rq} or \code{rqs}.}

\item{method.args}{named list with additional arguments passed to \code{rq()}
or to a function passed as argument to \code{method}.}

\item{eq.with.lhs}{If \code{character}, the string is pasted to the front of
the equation label before parsing; a \code{logical} is also accepted (see
note).}

\item{eq.x.rhs}{\code{character} this string will be used as replacement for
\code{"x"} in the model equation when generating the label before parsing
it.}

\item{coef.digits, rho.digits}{integer Number of significant digits to use for
the fitted coefficients and rho in labels.}

\item{coef.keep.zeros}{logical Keep or drop trailing zeros when formatting
the fitted coefficients.}

\item{label.x, label.y}{\code{numeric} with range 0..1 "normalized parent
coordinates" (npc units) or character if using \code{geom_text_npc()} or
\code{geom_label_npc()}. If using \code{geom_text()} or \code{geom_label()}
numeric in native data units. If too short they will be recycled.}

\item{label.x.npc, label.y.npc}{\code{numeric} with range 0..1 (npc units)
DEPRECATED, use label.x and label.y instead; together with a geom
using npcx and npcy aesthetics.}

\item{hstep, vstep}{numeric in npc units, the horizontal and vertical step
used between labels for different groups.}

\item{output.type}{character One of \code{"expression"}, \code{"LaTeX"},
\code{"text"}, \code{"markdown"} or \code{"numeric"}. In most cases,
instead of using this statistics to obtain numeric values, it is better to
use \code{stat_fit_tidy()}.}

\item{na.rm}{a logical indicating whether NA values should be stripped before
the computation proceeds.}

\item{orientation}{character Either \code{"x"} or \code{"y"} controlling the
default for \code{formula}.}

\item{parse}{logical Passed to the geom. If \code{TRUE}, the labels will be
parsed into expressions and displayed as described in \code{?plotmath}.
Default is \code{TRUE} if \code{output.type = "expression"} and
\code{FALSE} otherwise.}

\item{show.legend}{logical. Should this layer be included in the legends?
\code{NA} includes if any aesthetics are mapped. \code{FALSE}, the default,
never includes, and \code{TRUE} always includes.}

\item{inherit.aes}{If \code{FALSE}, overrides the default aesthetics, rather
than combining with them. This is most useful for helper functions that
define both data and aesthetics and shouldn't inherit behaviour from the
default plot specification, e.g. \code{\link[ggplot2]{borders}}.}
}
\description{
\code{stat_quant_eq} fits a polynomial model by quantile regression and
generates several labels including the equation, rho, 'AIC' and 'BIC'.
}
\details{
This statistic interprets the argument passed to \code{formula} differently
than \code{\link[ggplot2]{stat_quantile}}, accepting \code{y} as well as
\code{x} as explanatory variable, matching \code{stat_quant_line()}.

When two variables are subject to mutual constraints, it is useful to consider
both of them as explanatory and interpret the relationship based on them. So,
from version 0.4.1 'ggpmisc' makes it possible to easily implement the
approach described by Cardoso (2019) under the name of "Double quantile
regression".

This stat can be used to automatically annotate a plot with rho or
  the fitted model equation. The model fitting is done using package
  'quantreg'; please consult its documentation for the
  details. Only linear models fitted with function \code{rq()} are supported.
  With \code{method = "br"} passed to \code{rq()}, fitting should work well
  with up to several thousand observations. The rho, AIC, BIC and n
  annotations can be used with
  any linear model formula. The fitted equation label is correctly generated
  for polynomials or quasi-polynomials through the origin. Model formulas can
  use \code{poly()} or be defined algebraically with terms of powers of
  increasing magnitude with no missing intermediate terms, except possibly
  for the intercept indicated by \code{"- 1"} or \code{"-1"} or \code{"+ 0"}
  in the formula. The validity of the \code{formula} is not checked in the
  current implementation. The default aesthetics sets rho as label for the
  annotation.  This stat generates labels as R expressions by default but
  LaTeX (use TikZ device), markdown (use package 'ggtext') and plain text are
  also supported, as well as numeric values for user-generated text labels.
  The value of \code{parse} is set automatically based on \code{output.type},
  but if you assemble labels that need parsing from \code{numeric} output,
  the default needs to be overridden. This stat only generates annotation
  labels, the predicted values/line need to be added to the plot as a
  separate layer using \code{\link{stat_quant_line}},
  \code{\link{stat_quant_band}} or \code{\link[ggplot2]{stat_quantile}}, so
  to make sure that the same model formula is used in all steps it is best to
  save the formula as an object and supply this object as argument to the
  different statistics.

  A ggplot statistic receives as data a data frame that is not the one passed
  as argument by the user, but instead a data frame with the variables mapped
  to aesthetics. \code{stat_quant_eq()} mimics how \code{stat_smooth()}
  works, except that only polynomials can be fitted. In other words, it
  respects the grammar of graphics. This helps ensure that the model is
  fitted to the same data as plotted in other layers.
}
\note{
For backward compatibility a logical is accepted as argument for
  \code{eq.with.lhs}. If \code{TRUE}, the default is used, either
  \code{"x"} or \code{"y"}, depending on the argument passed to \code{formula}.
  However, \code{"x"} or \code{"y"} can be substituted by providing a
  suitable replacement character string through \code{eq.x.rhs}.
  Parameter \code{orientation} is redundant as it only affects the default
  for \code{formula} but is included for consistency with
  \code{ggplot2::stat_smooth()}.

Support for the \code{angle} aesthetic is not automatic and requires
  that the user passes as argument suitable numeric values to override the
  defaults for label positions.
}
\section{Aesthetics}{
 \code{stat_quant_eq()} understands \code{x} and \code{y},
  to be referenced in the \code{formula}, and \code{weight}, passed as argument
  to parameter \code{weights} of \code{rq()}. All three must be mapped to
  \code{numeric} variables. In addition, the aesthetics understood by the
  geom used (\code{"text_npc"} by default) are understood and grouping
  respected.
}

\section{Computed variables}{

If \code{output.type} is different from \code{"numeric"}, the returned tibble
contains the columns below in addition to a modified version of the original
\code{group}:
\describe{
  \item{x,npcx}{x position}
  \item{y,npcy}{y position}
  \item{eq.label}{equation for the fitted polynomial as a character string to be parsed}
  \item{r.label, and one of cor.label, rho.label, or tau.label}{\eqn{rho} of the fitted model as a character string to be parsed}
  \item{AIC.label}{AIC for the fitted model.}
  \item{n.label}{Number of observations used in the fit.}
  \item{method.label}{Set according to the \code{method} used.}
  \item{rq.method}{character, method used.}
  \item{rho, n}{numeric values extracted or computed from fit object.}
  \item{hjust, vjust}{Set to "inward" to override the default of the "text" geom.}
  \item{quantile}{Numeric value of the quantile used for the fit}
  \item{quantile.f}{Factor with a level for each quantile}
  }

If \code{output.type} is \code{"numeric"}, the returned tibble contains the
columns below in addition to a modified version of the original \code{group}:
\describe{
  \item{x,npcx}{x position}
  \item{y,npcy}{y position}
  \item{coef.ls}{list containing the "coefficients" matrix from the summary of the fit object}
  \item{rho, AIC, n}{numeric values extracted or computed from fit object}
  \item{rq.method}{character, method used.}
  \item{hjust, vjust}{Set to "inward" to override the default of the "text" geom.}
  \item{quantile}{Numeric value of the quantile used for the fit}
  \item{quantile.f}{Factor with a level for each quantile}
  \item{b_0.constant}{TRUE if polynomial is forced through the origin}
  \item{b_i}{One or more columns with the coefficient estimates}}

To explore the computed values returned for a given input we suggest the use
of \code{\link[gginnards]{geom_debug}} as shown in the example below.
}

\examples{
# generate artificial data
set.seed(4321)
x <- 1:100
y <- (x + x^2 + x^3) + rnorm(length(x), mean = 0, sd = mean(x^3) / 4)
y <- y / max(y)
my.data <- data.frame(x = x, y = y,
                      group = c("A", "B"),
                      y2 = y * c(1, 2) + max(y) * c(0, 0.1),
                      w = sqrt(x))

# using defaults
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line() +
  stat_quant_eq()

ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line() +
  stat_quant_eq(use_label(c("eq", "method")))
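
# changing the number of significant digits used for the coefficients;
# the value 4 is an arbitrary choice made only for illustration
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line() +
  stat_quant_eq(use_label("eq"), coef.digits = 4)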

# same formula as default
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = y ~ x) +
  stat_quant_eq(formula = y ~ x)

# explicit formula "x explained by y"
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = x ~ y) +
  stat_quant_eq(formula = x ~ y)
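
# both "y explained by x" and "x explained by y" in the same plot; a sketch
# of how the two formulas can be combined when both variables are treated as
# explanatory (see Details); the quantile, colour and label position used
# here are arbitrary assumptions made only for illustration
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = y ~ x, quantiles = 0.5) +
  stat_quant_line(formula = x ~ y, quantiles = 0.5, color = "red") +
  stat_quant_eq(formula = y ~ x, quantiles = 0.5) +
  stat_quant_eq(formula = x ~ y, quantiles = 0.5, color = "red",
                label.y = 0.9)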

# using color
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(aes(color = after_stat(quantile.f))) +
  stat_quant_eq(aes(color = after_stat(quantile.f))) +
  labs(color = "Quantiles")

# location and colour
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(aes(color = after_stat(quantile.f))) +
  stat_quant_eq(aes(color = after_stat(quantile.f)),
                label.y = "bottom", label.x = "right") +
  labs(color = "Quantiles")

# give a name to a formula
formula <- y ~ poly(x, 3, raw = TRUE)

ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = formula, size = 0.5) +
  stat_quant_eq(formula = formula)
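
# replacing the left-hand side and the "x" of the equation label through
# 'eq.with.lhs' and 'eq.x.rhs'; the symbols h and z are hypothetical names
# used only for illustration, written as strings to be parsed by plotmath
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = formula) +
  stat_quant_eq(use_label("eq"),
                formula = formula,
                eq.with.lhs = "italic(h)~`=`~",
                eq.x.rhs = "~italic(z)")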

# angle
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = formula) +
  stat_quant_eq(formula = formula, angle = 90, hstep = 0.05, vstep = 0,
                label.y = 0.98, hjust = 1)

ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = formula) +
  stat_quant_eq(formula = formula, angle = 90,
                hstep = 0.05, vstep = 0, hjust = 0,
                label.y = 0.25)

# user set quantiles
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = formula, quantiles = 0.5) +
  stat_quant_eq(formula = formula, quantiles = 0.5)
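
# selecting the algorithm used by rq() through the string passed to 'method';
# "fn" (Frisch-Newton) is one of the methods documented for quantreg::rq();
# this sketch assumes that stat_quant_line() accepts the same "rq:<method>"
# syntax, and uses a straight-line fit to keep the example simple
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = y ~ x, quantiles = 0.5, method = "rq:fn") +
  stat_quant_eq(formula = y ~ x, quantiles = 0.5, method = "rq:fn")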

ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_band(formula = formula,
                  quantiles = c(0.1, 0.5, 0.9)) +
  stat_quant_eq(formula = formula, parse = TRUE,
                quantiles = c(0.1, 0.5, 0.9))

# grouping
ggplot(my.data, aes(x, y2, color = group)) +
  geom_point() +
  stat_quant_line(formula = formula, size = 0.5) +
  stat_quant_eq(formula = formula)

ggplot(my.data, aes(x, y2, color = group)) +
  geom_point() +
  stat_quant_band(formula = formula, size = 0.75) +
  stat_quant_eq(formula = formula) +
  theme_bw()

# labelling equations
ggplot(my.data, aes(x, y2,  shape = group, linetype = group,
       grp.label = group)) +
  geom_point() +
  stat_quant_band(formula = formula, color = "black", size = 0.75) +
  stat_quant_eq(use_label(c("grp", "eq"), sep = "*\": \"*"),
                formula = formula) +
  expand_limits(y = 3) +
  theme_classic()

# using weights
ggplot(my.data, aes(x, y, weight = w)) +
  geom_point() +
  stat_quant_line(formula = formula, size = 0.5) +
  stat_quant_eq(formula = formula)

# no weights, quantile set to upper boundary
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = formula, quantiles = 0.95) +
  stat_quant_eq(formula = formula, quantiles = 0.95)

# manually assemble and map a specific label using paste() and aes()
ggplot(my.data, aes(x, y2, color = group, grp.label = group)) +
  geom_point() +
  stat_quant_line(method = "rq", formula = formula,
                quantiles = c(0.05, 0.5, 0.95),
                size = 0.5) +
  stat_quant_eq(aes(label = paste(after_stat(grp.label), "*\": \"*",
                                   after_stat(eq.label), sep = "")),
                quantiles = c(0.05, 0.5, 0.95),
                formula = formula, size = 3)

# manually assemble and map a specific label using sprintf() and aes()
ggplot(my.data, aes(x, y2, color = group, grp.label = group)) +
  geom_point() +
  stat_quant_band(method = "rq", formula = formula,
                  quantiles = c(0.05, 0.5, 0.95)) +
  stat_quant_eq(aes(label = sprintf("\%s*\": \"*\%s",
                                    after_stat(grp.label),
                                    after_stat(eq.label))),
                quantiles = c(0.05, 0.5, 0.95),
                formula = formula, size = 3)
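
# assembling a label from numeric output; as explained in Details, the
# default for 'parse' must be overridden when the assembled string needs
# parsing; 'rho' is one of the variables listed under "Computed variables"
# and the format string is only an illustrative choice
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = formula, quantiles = 0.5) +
  stat_quant_eq(aes(label = sprintf("rho~`=`~\%.3g", after_stat(rho))),
                formula = formula, quantiles = 0.5,
                output.type = "numeric", parse = TRUE)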

# label positions set explicitly (same values as the defaults)
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_quant_line(formula = formula, quantiles = 0.5) +
  stat_quant_eq(label.x = "left", label.y = "top",
                formula = formula,
                quantiles = 0.5)

# Inspecting the returned data using geom_debug()
# This provides a quick way of finding out the names of the variables that
# are available for mapping to aesthetics using after_stat().

gginnards.installed <- requireNamespace("gginnards", quietly = TRUE)

if (gginnards.installed)
  library(gginnards)

if (gginnards.installed)
  ggplot(my.data, aes(x, y)) +
    geom_point() +
    stat_quant_eq(formula = formula, geom = "debug")

\dontrun{
if (gginnards.installed)
  ggplot(my.data, aes(x, y)) +
    geom_point() +
    stat_quant_eq(aes(label = after_stat(eq.label)),
                  formula = formula, geom = "debug",
                  output.type = "markdown")

if (gginnards.installed)
  ggplot(my.data, aes(x, y)) +
    geom_point() +
    stat_quant_eq(formula = formula, geom = "debug", output.type = "text")

if (gginnards.installed)
  ggplot(my.data, aes(x, y)) +
    geom_point() +
    stat_quant_eq(formula = formula, geom = "debug", output.type = "numeric")

if (gginnards.installed)
  ggplot(my.data, aes(x, y)) +
    geom_point() +
    stat_quant_eq(formula = formula, quantiles = c(0.25, 0.5, 0.75),
                  geom = "debug", output.type = "text")

if (gginnards.installed)
  ggplot(my.data, aes(x, y)) +
    geom_point() +
    stat_quant_eq(formula = formula, quantiles = c(0.25, 0.5, 0.75),
                  geom = "debug", output.type = "numeric")
}

}
\references{
Written as an answer to question 65695409 by Mark Neal at
  Stackoverflow.
}
\seealso{
The quantile fit is done with function \code{\link[quantreg]{rq}},
  please consult its documentation. This \code{stat_quant_eq} statistic can
  return ready formatted labels depending on the argument passed to
  \code{output.type}. This is possible because only polynomial models are
  supported. For other types of models, statistics
  \code{\link{stat_fit_glance}} and \code{\link{stat_fit_tidy}}
  should be used instead and the code for
  construction of character strings from numeric values and their mapping to
  aesthetic \code{label} needs to be explicitly supplied in the call.

Other ggplot statistics for quantile regression: 
\code{\link{stat_quant_band}()},
\code{\link{stat_quant_line}()}
}
\concept{ggplot statistics for quantile regression}
