% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/concept-load.R
\name{load_concepts}
\alias{load_concepts}
\alias{load_concepts.character}
\alias{load_concepts.concept}
\alias{load_concepts.cncpt}
\alias{load_concepts.num_cncpt}
\alias{load_concepts.unt_cncpt}
\alias{load_concepts.fct_cncpt}
\alias{load_concepts.lgl_cncpt}
\alias{load_concepts.rec_cncpt}
\alias{load_concepts.item}
\alias{load_concepts.itm}
\title{Load concept data}
\usage{
load_concepts(x, ...)

\method{load_concepts}{character}(
  x,
  src = NULL,
  concepts = NULL,
  ...,
  dict_name = "concept-dict",
  dict_dirs = NULL
)

\method{load_concepts}{concept}(
  x,
  src = NULL,
  aggregate = NULL,
  merge_data = TRUE,
  verbose = TRUE,
  ...
)

\method{load_concepts}{cncpt}(x, aggregate = NULL, ..., progress = NULL)

\method{load_concepts}{num_cncpt}(x, aggregate = NULL, ..., progress = NULL)

\method{load_concepts}{unt_cncpt}(x, aggregate = NULL, ..., progress = NULL)

\method{load_concepts}{fct_cncpt}(x, aggregate = NULL, ..., progress = NULL)

\method{load_concepts}{lgl_cncpt}(x, aggregate = NULL, ..., progress = NULL)

\method{load_concepts}{rec_cncpt}(
  x,
  aggregate = NULL,
  patient_ids = NULL,
  id_type = "icustay",
  interval = hours(1L),
  ...,
  progress = NULL
)

\method{load_concepts}{item}(
  x,
  patient_ids = NULL,
  id_type = "icustay",
  interval = hours(1L),
  progress = NULL,
  ...
)

\method{load_concepts}{itm}(
  x,
  patient_ids = NULL,
  id_type = "icustay",
  interval = hours(1L),
  ...
)
}
\arguments{
\item{x}{Object specifying the data to be loaded}

\item{...}{Passed to downstream methods}

\item{src}{A character vector, used to subset the \code{concepts}; \code{NULL}
means no subsetting}

\item{concepts}{The concepts to be used or \code{NULL} in which case
\code{\link[=load_dictionary]{load_dictionary()}} is called}

\item{dict_name, dict_dirs}{In case no concepts are passed as \code{concepts},
these are forwarded to \code{\link[=load_dictionary]{load_dictionary()}} as \code{name} and \code{file} arguments}

\item{aggregate}{Controls how data within concepts is aggregated}

\item{merge_data}{Logical flag, specifying whether to merge concepts into
wide format or return a list, each entry corresponding to a concept}

\item{verbose}{Logical flag for muting informational output}

\item{progress}{Either \code{NULL}, or a progress bar object as created by
\link[progress:progress_bar]{progress::progress_bar}}

\item{patient_ids}{Optional vector of patient ids to subset the fetched data
with}

\item{id_type}{String specifying the patient id type to return}

\item{interval}{The time interval used to discretize time stamps,
specified as a \code{\link[base:difftime]{base::difftime()}} object}
}
\value{
An \code{id_tbl}/\code{ts_tbl} or a list thereof, depending on loaded
concepts and the value passed as \code{merge_data}.
}
\description{
Concept objects are used in \code{ricu} as a way to specify how a clinical
concept, such as heart rate, can be loaded from a data source. Building on
this abstraction, \code{load_concepts()} powers concise loading of data with
data source specific preprocessing hidden away from the user, thereby
providing a data source agnostic interface to data loading. At the default
value of the argument \code{merge_data}, a tabular data structure (either a
\code{\link[=ts_tbl]{ts_tbl}} or an \code{\link[=id_tbl]{id_tbl}}, depending on what kind of
concepts are requested), inheriting from
\code{\link[data.table:data.table]{data.table}}, is returned, representing the data
in wide format (i.e. returning concepts as columns).
}
\details{
In order to allow for a large degree of flexibility (and extensibility),
which is much needed owing to the considerable heterogeneity presented by
different data sources, several nested S3 classes are involved in
representing a concept, and \code{load_concepts()} follows this hierarchy of
classes recursively when resolving a concept. This hierarchy can be outlined
as follows:
\itemize{
\item \code{concept}: contains many \code{cncpt} objects (of potentially differing
sub-types), each comprising some meta-data and an \code{item} object.
\item \code{item}: contains many \code{itm} objects (of potentially differing
sub-types), each encoding how to retrieve a data item.
}

The design choice of wrapping a vector of \code{cncpt} objects with a container
class \code{concept} is motivated by the requirement of having several different
sub-types of \code{cncpt} objects (all inheriting from the parent type \code{cncpt}),
while retaining control over how this vector of objects (homogeneous with
respect to parent type, but heterogeneous with respect to sub-type) behaves
in terms of S3 generic functions.
}
\section{Concept}{

Top-level entry points are either a character vector, which is used to
subset a \code{concept} object or an entire \link[=load_dictionary]{concept dictionary}, or a \code{concept} object. When passing a
character vector as first argument, the most important further arguments at
that level control from where the dictionary is taken (\code{dict_name} or
\code{dict_dirs}). At \code{concept} level, the most important additional arguments
control the result structure: data merging can be disabled using
\code{merge_data} and data aggregation is governed by the \code{aggregate} argument.
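
As a minimal sketch of the effect of \code{merge_data}, the following compares
the default wide-format result with a per-concept list. It assumes that the
\code{mimic.demo} data package is installed and that the concepts \code{glu} and
\code{hr} are available from the default dictionary; the variable names are
chosen for illustration only.

\preformatted{
library(ricu)

if (require(mimic.demo)) {

  # default: concepts are merged into a single wide-format table
  wide <- load_concepts(c("glu", "hr"), "mimic_demo", verbose = FALSE)

  # merge_data = FALSE: a list with one table per concept is returned
  per_concept <- load_concepts(c("glu", "hr"), "mimic_demo",
                               merge_data = FALSE, verbose = FALSE)
  names(per_concept)
}
}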

Data aggregation is important for merging several concepts into a
wide-format table, as this requires data to be unique per observation (i.e.
by either id or combination of id and index). Several value types are
acceptable as \code{aggregate} argument, the most important being \code{FALSE}, which
disables aggregation, \code{NULL}, which auto-determines a suitable aggregation
function, or a string which is ultimately passed to \code{\link[=dt_gforce]{dt_gforce()}}, where it
identifies a function such as \code{sum()}, \code{mean()}, \code{min()} or \code{max()}. More
information on aggregation is available at \link[=rename_cols]{aggregate()}.
If the object passed as \code{aggregate} is scalar, it is applied to all
requested concepts in the same way. In order to customize aggregation per
concept, a named object (with names corresponding to concepts) of the same
length as the number of requested concepts may be passed.
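
The different ways of specifying \code{aggregate} might be used as in the
following sketch. It assumes that the \code{mimic.demo} data package is
installed, that the concepts \code{glu} and \code{hr} are available from the
default dictionary, and that a named list is an acceptable form of the named
object described above; the variable names are illustrative.

\preformatted{
library(ricu)

if (require(mimic.demo)) {

  # a scalar value is applied to all requested concepts
  max_vals <- load_concepts(c("glu", "hr"), "mimic_demo", aggregate = "max",
                            verbose = FALSE)

  # a named object customizes aggregation per concept
  mixed <- load_concepts(c("glu", "hr"), "mimic_demo",
                         aggregate = list(glu = "max", hr = "mean"),
                         verbose = FALSE)

  # FALSE disables aggregation; merging is turned off here as well, since
  # the data is then no longer guaranteed to be unique per observation
  raw <- load_concepts("glu", "mimic_demo", aggregate = FALSE,
                       merge_data = FALSE, verbose = FALSE)
}
}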

Under the hood, a \code{concept} object comprises several \code{cncpt} objects
with varying sub-types (for example \code{num_cncpt}, representing continuous
numeric data, or \code{fct_cncpt}, representing categorical data). This
implementation detail is of no further importance for understanding concept
loading; for more information, please refer to the
\code{\link[=concept]{concept}} documentation. The only argument that is introduced
at \code{cncpt} level is \code{progress}, which controls progress reporting. If
called directly, the default value of \code{NULL} yields messages sent to the
terminal. Internally, if called from \code{load_concepts()} at \code{concept} level
(with \code{verbose} set to \code{TRUE}), a \link[progress:progress_bar]{progress::progress_bar} is set up in a
way that allows nested messages to be captured and not interrupt progress
reporting (see \code{\link[=msg_progress]{msg_progress()}}).
}

\section{Item}{

A single \code{cncpt} object contains an \code{item} object, which in turn is
composed of several \code{itm} objects with varying sub-types. The relationship
of \code{item} to \code{itm} is that of \code{concept} to \code{cncpt} and the rationale for
this implementation choice is the same as before: a container class is
used for representing a vector of objects of varying sub-types, all inheriting
from a common super-type. For more information on the \code{item} class, please
refer to the \link[=item]{relevant documentation}. Arguments introduced at \code{item}
level include \code{patient_ids}, \code{id_type} and \code{interval}. Acceptable values for
\code{interval} are scalar-valued \code{\link[base:difftime]{base::difftime()}} objects (see also helper
functions such as \code{\link[=hours]{hours()}}) and this argument essentially controls the
time-resolution of the returned time-series. Of course, the limiting factor is
the raw time resolution, which is on the order of hours for data sets like
\href{https://physionet.org/content/mimiciii/}{MIMIC-III} or
\href{https://physionet.org/content/eicu-crd}{eICU} but can be much higher for a
data set like \href{https://physionet.org/content/hirid/}{HiRID}. The argument
\code{id_type} is used to specify what kind of id system should be used to
identify different time series in the returned data. A data set like
MIMIC-III, for example, allows data to be resolved to three nested
ID systems:
\itemize{
\item \code{patient} (\code{subject_id}): identifies a person
\item \code{hadm} (\code{hadm_id}): identifies a hospital admission (several of which are
possible for a given person)
\item \code{icustay} (\code{icustay_id}): identifies an admission to an ICU (several
of which are possible for a given hospital admission)
}

Acceptable values for \code{id_type} are strings that match ID systems as specified
by the \link[=load_src_cfg]{data source configuration}. Finally, \code{patient_ids}
is used to define a patient cohort for which data can be requested. Values
may either be a vector of IDs (which are assumed to be of the same type as
specified by the \code{id_type} argument) or a tabular object inheriting from
\code{data.frame}, which must contain a column named after the data set-specific
ID system identifier (for MIMIC-III and an \code{id_type} argument of \code{hadm},
for example, that would be \code{hadm_id}).
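
The arguments introduced at \code{item} level might be combined as in the
following sketch. It assumes that the \code{mimic.demo} data package is
installed and that the concept \code{hr} is available from the default
dictionary; the variable names are illustrative and the cohort IDs are taken
from previously loaded data instead of being hard-coded.

\preformatted{
library(ricu)

if (require(mimic.demo)) {

  # heart rate per hospital admission, on a 30 minute time grid
  hr_hadm <- load_concepts("hr", "mimic_demo", id_type = "hadm",
                           interval = mins(30L), verbose = FALSE)

  # heart rate per ICU stay (the default id_type) at hourly resolution
  hr_icu <- load_concepts("hr", "mimic_demo", verbose = FALSE)

  # restrict loading to a cohort, specified as a vector of ICU stay IDs
  cohort <- head(unique(hr_icu[["icustay_id"]]), 5L)
  hr_sub <- load_concepts("hr", "mimic_demo", patient_ids = cohort,
                          verbose = FALSE)
}
}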
}

\section{Extensions}{

The presented hierarchy of S3 classes is designed with extensibility in
mind: while the current range of functionality covers settings encountered
when dealing with the included concepts and datasets, further data sets
and/or clinical concepts might necessitate different behavior for data
loading. For this reason, various parts in the cascade of calls to
\code{load_concepts()} can be adapted for new requirements by defining new
sub-classes to \code{cncpt} or \code{itm} and providing methods for the generic function
\code{load_concepts()} specific to these new classes. At \code{cncpt} level, method
dispatch defaults to \code{load_concepts.cncpt()} if no method specific to the
new class is provided, while at \code{itm} level, no default function is
available.

Roughly speaking, the semantics for the two functions are as follows:
\itemize{
\item \code{cncpt}: Called with arguments \code{x} (the current \code{cncpt} object),
\code{aggregate} (controlling how aggregation per time-point and ID is
handled), \code{...} (further arguments passed to downstream methods) and
\code{progress} (controlling progress reporting), this function should be able
to load and aggregate data for the given concept. Usually this involves
extracting the \code{item} object and calling \code{load_concepts()} again,
dispatching on the \code{item} class with arguments \code{x} (the given \code{item}),
arguments passed as \code{...}, as well as \code{progress}.
\item \code{itm}: Called with arguments \code{x} (the current object inheriting from
\code{itm}), \code{patient_ids} (\code{NULL} or a patient ID selection), \code{id_type} (a
string specifying what ID system to retrieve), and \code{interval} (the time
series interval), this function actually carries out the loading of
individual data items, using the specified ID system, rounding times to
the correct interval and subsetting on patient IDs. As return value, an
object of the class specified by the \code{target} entry is expected and all
\code{\link[=data_vars]{data_vars()}} should be named consistently, as data corresponding to
multiple \code{itm} objects is concatenated in row-wise fashion as in
\code{\link[base:cbind]{base::rbind()}}.
}
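
As a rough sketch of the \code{cncpt} level extension point, a method for a
new sub-class might delegate to the default method and add its own
post-processing. The class name \code{my_cncpt} and the post-processing step
are hypothetical and not part of \code{ricu}; the sketch only assumes that
objects of the new class carry the class attribute
\code{c("my_cncpt", "cncpt")}.

\preformatted{
# method for the hypothetical sub-class `my_cncpt`, using the same
# arguments as the other cncpt-level methods
load_concepts.my_cncpt <- function(x, aggregate = NULL, ..., progress = NULL) {

  # delegate loading and aggregation to the next method in line, which is
  # load_concepts.cncpt() for the class attribute assumed above
  res <- NextMethod()

  # class-specific post-processing of `res` would go here

  res
}
}

If no such method is defined, dispatch for \code{my_cncpt} objects falls back to
\code{load_concepts.cncpt()}, as described above.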
}

\examples{
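if (require(mimic.demo)) {
# load the glucose concept as defined in the default dictionary
dat <- load_concepts("glu", "mimic_demo")

# define a glucose concept manually, based on MIMIC-III lab item IDs
gluc <- concept("gluc",
  item("mimic_demo", "labevents", "itemid", list(c(50809L, 50931L)))
)

# compare data loaded from the manually defined concept to `dat`
identical(load_concepts(gluc), dat)

# the class of the result depends on the kind of concepts requested
class(dat)
class(load_concepts(c("sex", "age"), "mimic_demo"))
}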

}
