% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/f_s3generics_clvdata_plot.R
\name{plot.clv.data}
\alias{plot.clv.data}
\title{Plot Diagnostics for the Transaction Data in a clv.data Object}
\usage{
\method{plot}{clv.data}(
  x,
  which = c("tracking", "frequency", "spending", "interpurchasetime", "timings"),
  prediction.end = NULL,
  cumulative = FALSE,
  trans.bins = 0:9,
  count.repeat.trans = TRUE,
  count.remaining = TRUE,
  label.remaining = "10+",
  mean.spending = TRUE,
  annotate.ids = FALSE,
  ids = c(),
  sample = c("estimation", "full", "holdout"),
  geom = "line",
  color = "black",
  plot = TRUE,
  verbose = TRUE,
  ...
)
}
\arguments{
\item{x}{The clv.data object to plot}

\item{which}{Which plot to produce, either "tracking", "frequency", "spending", "interpurchasetime", or "timings".
May be abbreviated but only one may be selected. Defaults to "tracking".}

\item{prediction.end}{"tracking": Until what point in time to plot. This can be the number of periods (numeric) or a form of date/time object. See details.}

\item{cumulative}{"tracking": Whether the cumulative actual repeat transactions should be plotted.}

\item{trans.bins}{"frequency": Vector of integers indicating the number of transactions (x axis) for which the customers should be counted.}

\item{count.repeat.trans}{"frequency": Whether repeat transactions (TRUE, default) or all transactions (FALSE) should be counted.}

\item{count.remaining}{"frequency": Whether the customers who are not captured by \code{trans.bins} should be counted in a separate last bar.}

\item{label.remaining}{"frequency": Label for the last bar, if \code{count.remaining=TRUE}.}

\item{mean.spending}{"spending": Whether customers' mean spending per transaction (\code{TRUE}, default) or the
value of every transaction in the data (\code{FALSE}) should be plotted.}

\item{annotate.ids}{"timings": Whether timelines should be annotated with customer ids.}

\item{ids}{"timings": A character vector of customer ids or a single integer specifying the number of customers to sample.
Defaults to \code{NULL} for which 50 random customers are selected.}

\item{sample}{Name of the sample for which the plot should be made, either
"estimation", "full", or "holdout". Defaults to "estimation". Not for "tracking" and "timings".}

\item{geom}{"spending" and "interpurchasetime": The geometric object of ggplot2 to display the data. Forwarded to
\link[ggplot2:stat_density]{ggplot2::stat_density}.}

\item{color}{Color of the resulting geom object in the plot. Not for "tracking" and "timings".}

\item{plot}{Whether a plot should be created or only the assembled data returned.}

\item{verbose}{Show details about the running of the function.}

\item{...}{Forwarded to \link[ggplot2:stat_density]{ggplot2::stat_density} ("spending", "interpurchasetime")
or \link[ggplot2:geom_bar]{ggplot2::geom_bar} ("frequency"). Not for "tracking" and "timings".}
}
\value{
An object of class \code{ggplot} from package \code{ggplot2} is returned by default.
If \code{plot=FALSE}, the data that would have been used to create the plot is returned.
Depending on which plot was selected, this is a \code{data.table}
which contains some of the following columns:
\item{Id}{Customer Id}
\item{period.until}{The timepoint that marks the end (up until and including) of the period to which the data in this row refers.}
\item{Spending}{Spending as defined by parameter \code{mean.spending}.}
\item{mean.interpurchase.time}{Mean number of periods between transactions per customer,
excluding customers with no repeat-transactions.}
\item{num.transactions}{The number of (repeat) transactions, depending on \code{count.repeat.trans}.}
\item{num.customers}{The number of customers.}
\item{type}{"timings": The purpose for which the value in this row is used.}
\item{variable}{"tracking": The number of actual repeat transactions in the period that ends at \code{period.until}.\cr
                "timings": Coordinate (x or y) for which the value in this row is used.}
\item{value}{"timings":  Date or numeric (stored as string) \cr
             "tracking": numeric}
}
\description{
Depending on the value of parameter \code{which}, one of the following plots will be produced.
Note that the \code{sample} parameter determines the period for which the
selected plot is made (either estimation, holdout, or full).

\subsection{Tracking Plot}{
Plot the aggregated repeat transactions per period over the given time-horizon (\code{prediction.end}).
See Details for the definition of plotting periods.
}

\subsection{Frequency Plot}{
Plot the distribution of transactions or repeat transactions per customer, after aggregating transactions
of the same customer on a single time point.
Note that if \code{trans.bins} is changed, \code{label.remaining} usually needs to be adapted as well.
}

\subsection{Spending Plot}{
Plot the empirical density of either customers' average spending per transaction or the value
of every transaction in the data, after aggregating transactions of the same customer on a single time point.
Note that in all cases this includes all transactions and not only repeat-transactions.
}

\subsection{Interpurchase Time Plot}{
Plot the empirical density of customers' mean time (in number of periods) between transactions,
after aggregating transactions of the same customer on a single time point.
Note that customers without repeat-transactions are removed.
}

\subsection{Transaction Timing Plot}{
Plot the transaction timings of selected or sampled customers on their respective timelines.
}
}
\details{
\code{prediction.end} indicates until when to predict or plot and can be given as either
a point in time (of class \code{Date}, \code{POSIXct}, or \code{character}) or the number of periods.
If \code{prediction.end} is of class character, the date/time format set when creating the data object is used for parsing.
If \code{prediction.end} is the number of periods, the end of the estimation period serves as the reference point
from which periods are counted. Only full periods may be specified.
If \code{prediction.end} is omitted or NULL, it defaults to the end of the holdout period if present and to the
end of the estimation period otherwise.

The first prediction period is defined to start right after the end of the estimation period.
If, for example, weekly time units are used and the estimation period ends on Sunday 2019-01-06, then the first day
of the first prediction period is Monday 2019-01-07. Each prediction period includes a total of 7 days and
the first prediction period therefore will end on, and include, Sunday 2019-01-13. Subsequent prediction periods
again start on Mondays and end on Sundays.
If \code{prediction.end} indicates a timepoint on which to end, this timepoint is included in the prediction period.

If there are no repeat transactions until \code{prediction.end}, only the time for which there is data
is plotted. If the data is returned (i.e. with argument \code{plot=FALSE}), the respective rows
contain \code{NA} in column \code{Number of Repeat Transactions}.
}
\examples{

\donttest{

data("cdnow")
clv.cdnow <- clvdata(cdnow, time.unit="w", estimation.split=37,
                     date.format="ymd")

### TRACKING PLOT
# Plot the actual repeat transactions
plot(clv.cdnow)
# same, explicitly
plot(clv.cdnow, which="tracking")

# plot cumulative repeat transactions
plot(clv.cdnow, cumulative=TRUE)

# Do not plot automatically but tweak further
library(ggplot2) # for ggtitle()
gg.cdnow <- plot(clv.cdnow)
# change Title
gg.cdnow + ggtitle("CDnow repeat transactions")

# Do not return a plot but only the data from
#   which it would have been created
dt.plot.data <- plot(clv.cdnow, plot=FALSE)
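
# The two ways of specifying prediction.end described in the details
#   can be sketched as follows. The horizon of 10 periods and the date
#   "1998-01-01" are arbitrary illustrations, assumed to lie within
#   the holdout period of the cdnow data
# End the plot 10 full weeks after the end of the estimation period
plot(clv.cdnow, prediction.end=10)
# End the plot on a given date, parsed with the date.format
#   that was set when creating the data object
plot(clv.cdnow, prediction.end="1998-01-01")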


### FREQUENCY PLOT
plot(clv.cdnow, which="frequency")

# Bins from 0 to 15, all remaining in bin labelled "16+"
plot(clv.cdnow, which="frequency", trans.bins=0:15,
     label.remaining="16+")

# Count all transactions, not only repeat
#  Note that the bins have to be adapted to start from 1
plot(clv.cdnow, which="frequency", count.repeat.trans = FALSE,
     trans.bins=1:9)
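
# The sample argument selects the period for which the plot is made.
#   A sketch, assuming the holdout period that results from
#   estimation.split=37 above
# Frequency plot for the holdout period only
plot(clv.cdnow, which="frequency", sample="holdout")
# Return the underlying counts instead of a plot
dt.freq.data <- plot(clv.cdnow, which="frequency", plot=FALSE)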


### SPENDING DENSITY
# plot customer's average transaction value
plot(clv.cdnow, which="spending", mean.spending = TRUE)

# distribution of the values of every transaction
plot(clv.cdnow, which="spending", mean.spending = FALSE)


### INTERPURCHASE TIME DENSITY
# plot as small points, in blue
plot(clv.cdnow, which="interpurchasetime",
     geom="point", color="blue", size=0.02)
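
# As for the tracking plot, the returned ggplot object may be
#   extended further. A sketch only, the axis label is an
#   illustrative choice and not part of the package
gg.ipt <- plot(clv.cdnow, which="interpurchasetime")
gg.ipt + ggplot2::xlab("Mean number of weeks between transactions")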


### TIMING PATTERNS
# selected customers and annotating them
plot(clv.cdnow, which="timings", ids=c("123", "1041"), annotate.ids=TRUE)

# plot 25 random customers
plot(clv.cdnow, which="timings", ids=25)

# plot all customers
plot(clv.cdnow, which="timings", ids=nobs(clv.cdnow))
}

}
\seealso{
\link[ggplot2:stat_density]{ggplot2::stat_density} and \link[ggplot2:geom_bar]{ggplot2::geom_bar}
for possible arguments to \code{...}

\link[CLVTools:plot.clv.fitted.transactions]{plot} to plot fitted transaction models

\link[CLVTools:plot.clv.fitted.spending]{plot} to plot fitted spending models
}
