% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/impute.R
\name{Impute.coin}
\alias{Impute.coin}
\title{Impute a data set in a coin}
\usage{
\method{Impute}{coin}(
  x,
  dset,
  f_i = NULL,
  f_i_para = NULL,
  impute_by = "column",
  use_group = NULL,
  group_level = NULL,
  normalise_first = NULL,
  out2 = "coin",
  write_to = NULL,
  disable = FALSE,
  ...
)
}
\arguments{
\item{x}{A coin class object}

\item{dset}{The name of the data set to apply the function to, which should be accessible in \code{.$Data}.}

\item{f_i}{An imputation function. See details.}

\item{f_i_para}{Further arguments to pass to \code{f_i}, other than \code{x}. See details.}

\item{impute_by}{Specifies how to impute: if \code{"column"}, passes each column (indicator) separately as a numeric
vector to \code{f_i}; if \code{"row"}, passes each \emph{row} separately; and if \code{"df"} passes the entire data set (data frame) to
\code{f_i}. The function called by \code{f_i} should be compatible with the type of data passed to it.}

\item{use_group}{Optional grouping variable name to pass to imputation function if this supports group
imputation.}

\item{group_level}{A level of the framework to use for grouping indicators. This is only
relevant if \code{impute_by = "row"} or \code{"df"}. In that case, indicators will be split into their groups at the
level specified by \code{group_level}, and imputation will be performed across rows of the group, rather
than the whole data set. This can make more sense because indicators within a group are likely to be
more similar.}

\item{normalise_first}{Logical: if \code{TRUE}, each column is normalised using a min-max operation before
imputation. By default this is \code{FALSE} unless \code{impute_by = "row"}. See details.}

\item{out2}{Either \code{"coin"} to return the imputed data set back to the coin, or \code{"df"} to simply return a data
frame.}

\item{write_to}{Optional character string for naming the data set in the coin. Data will be written to
\code{.$Data[[write_to]]}. Default is \code{write_to = "Imputed"}.}

\item{disable}{Logical: if \code{TRUE} will disable imputation completely and write the unaltered data set. This option is mainly useful
in sensitivity and uncertainty analysis (to test the effect of turning imputation on/off).}

\item{...}{arguments passed to or from other methods.}
}
\value{
An updated coin with the imputed data set at \code{.$Data[[write_to]]}, or a data frame if \code{out2 = "df"}.
}
\description{
This imputes any \code{NA}s in the data set specified by \code{dset}
by invoking the function \code{f_i}, with any optional arguments \code{f_i_para}, on one column at a time (if
\code{impute_by = "column"}), on one row at a time (if \code{impute_by = "row"}), or on the entire
data frame (if \code{impute_by = "df"}).
}
\details{
The function \code{f_i} needs to be able to accept the data class passed to it: if
\code{impute_by} is \code{"row"} or \code{"column"} this will be a numeric vector, and if \code{"df"} it will be a data
frame. Moreover, this function should return a vector or data frame identical to the vector/data frame passed to
it except for \code{NA} values, which can be replaced. The function \code{f_i} is not required to replace \emph{all} \code{NA}
values.

When imputing row-wise, prior normalisation of the data is recommended. This is because imputation
will use e.g. the mean of the unit values over all indicators (columns). If the indicators are on
very different scales, the result will likely make no sense. If the indicators are normalised first,
more sensible results can be obtained. There are two options to pre-normalise. The first is to set
\code{normalise_first = TRUE}, which is the default if \code{impute_by = "row"}. In this case, you also
need to supply a vector of directions. The data will then be normalised using a min-max approach
before imputation, followed by the inverse operation to return the data to the original scales.

Another approach which gives more control is to simply run \code{\link[=Normalise]{Normalise()}} first, and work with the
normalised data from that point onwards. In that case it is better to set \code{normalise_first = FALSE},
since by default if \code{impute_by = "row"} it will be set to \code{TRUE}.

Checks are made on the format of the data returned by imputation functions, to ensure that the
type is correct and that non-\code{NA} values have not been inadvertently altered. This latter check is allowed
a degree of tolerance for numerical precision, controlled by the \code{sfigs} argument. This is because
if the data frame is normalised, and/or depending on the imputation function, there may be very
small differences. By default \code{sfigs = 9}, meaning that the non-\code{NA} values pre- and post-imputation
are compared to 9 significant figures.

See also documentation for \code{\link[=Impute.data.frame]{Impute.data.frame()}} and \code{\link[=Impute.numeric]{Impute.numeric()}} which are called by this function.
}
\examples{
# build coin
coin <- build_example_coin(up_to = "new_coin")

# impute raw data set using population groups
# output to data frame directly
Impute(coin, dset = "Raw", f_i = "i_mean_grp",
               use_group = "Pop_group", out2 = "df")
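
# A minimal sketch of a user-defined imputation function that follows the
# contract described in Details: it takes a numeric vector and returns the
# same vector with only the NA values replaced.
# The name i_min_custom is hypothetical and is not part of the package.
i_min_custom <- function(x) {
  stopifnot(is.numeric(x))
  # replace each NA with the smallest observed value
  x[is.na(x)] <- min(x, na.rm = TRUE)
  x
}

# try it on a small vector: only the NA entry changes
i_min_custom(c(3, NA, 1))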

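# A rough base-R sketch of the idea behind normalise_first with row-wise
# imputation, as described in Details. This is an illustration only, not the
# package's internal code: the toy data and object names are hypothetical,
# and indicator directions are ignored for simplicity.
toy <- data.frame(a = c(1, 2, 3), b = c(100, NA, 300))
col_min <- sapply(toy, min, na.rm = TRUE)
col_max <- sapply(toy, max, na.rm = TRUE)

# min-max normalise each column onto [0, 1]
toy_n <- as.data.frame(Map(function(x, lo, hi) (x - lo) / (hi - lo),
                           toy, col_min, col_max))

# replace each NA with the mean of the other values in the same row
toy_n_imp <- as.data.frame(t(apply(toy_n, 1, function(r) {
  r[is.na(r)] <- mean(r, na.rm = TRUE)
  r
})))

# invert the normalisation to return to the original scales
toy_imp <- as.data.frame(Map(function(x, lo, hi) x * (hi - lo) + lo,
                             toy_n_imp, col_min, col_max))
toy_imp
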
}
