simrec: An R-Package for Simulation of Recurrent Event Data

Katharina Ingel¹

Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI, Mainz)

Stella Preussler²

Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI, Mainz)

Federico Marini³

Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI, Mainz)

Antje Jahn-Eimermacher⁴

Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI, Mainz)

2023-09-06

Abstract

simrec allows simulation of recurrent event data following the multiplicative intensity model described by Andersen and Gill (1982), with the baseline hazard being a function of the total/calendar time. To induce between-subject heterogeneity, a random effect covariate (frailty term) can be incorporated. Furthermore, simreccomp simulates data following a multistate model with recurrent events of one type and a competing event. simrecint additionally simulates a recruitment time for each individual and cuts the data to an interim data set. The data can be plotted with simrecPlot and simreccompPlot.

Keywords: recurrent event data, competing event, frailty, simulation, total-time model

1 Introduction

The simrec package includes the functions simrec, simreccomp and simrecint and allows simulation of recurrent event data. To induce between-subject heterogeneity, a random effect covariate (frailty term) can be incorporated. With simreccomp, time-to-event data that follow a multistate model with recurrent events of one type and a competing event can be simulated. Data output is in the counting process format. simrecint additionally simulates a recruitment time for each individual and cuts the data to an interim data set. The data can be plotted with simrecPlot and simreccompPlot.

2 The `simrec` function

Description

The function simrec allows simulation of recurrent event data following the multiplicative intensity model described by Andersen and Gill (1982), with the baseline hazard being a function of the total/calendar time. To induce between-subject heterogeneity, a random effect covariate (frailty term) can be incorporated. Data for individual \(i\) are generated according to the intensity process \[Y_i(t)\cdot \lambda_0(t)\cdot Z_i \cdot \exp(\beta^t X_i)\] where \(X_i\) denotes the covariate vector and \(\beta\) the regression coefficient vector. \(\lambda_0(t)\) denotes the baseline hazard, being a function of the total/calendar time \(t\), and \(Y_i(t)\) the predictable process that equals one as long as individual \(i\) is under observation and at risk for experiencing events. \(Z_i\) denotes the frailty variable, with \((Z_i)_i\) iid, \(E(Z_i)=1\) and \(Var(Z_i)=\theta\). The parameter \(\theta\) describes the degree of between-subject heterogeneity. Data output is in the counting process format.
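The conditions \(E(Z_i)=1\) and \(Var(Z_i)=\theta\) determine the parameters of both frailty families that simrec offers. The following is a minimal base-R sketch under the standard parameterisations of the Gamma and lognormal distributions; the helper `draw_frailty` is hypothetical and not part of simrec, and the package may implement this differently.

```r
# Hypothetical helper (not part of simrec): draw n frailty values with
# mean 1 and variance theta.
draw_frailty <- function(n, theta, dist = c("gamma", "lognormal")) {
  dist <- match.arg(dist)
  if (theta == 0) return(rep(1, n))        # no frailty: Z is identically 1
  if (dist == "gamma") {
    # Gamma(shape = 1/theta, scale = theta) has mean 1 and variance theta
    rgamma(n, shape = 1 / theta, scale = theta)
  } else {
    # Lognormal with sigma^2 = log(theta + 1) and mu = -sigma^2 / 2
    sigma2 <- log(theta + 1)
    rlnorm(n, meanlog = -sigma2 / 2, sdlog = sqrt(sigma2))
  }
}

set.seed(1)
z_gamma <- draw_frailty(1e5, theta = 0.25, dist = "gamma")
z_lnorm <- draw_frailty(1e5, theta = 0.25, dist = "lognormal")
c(mean = mean(z_gamma), var = var(z_gamma))   # close to 1 and 0.25
c(mean = mean(z_lnorm), var = var(z_lnorm))   # close to 1 and 0.25
```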

simrec(N, fu.min, fu.max, cens.prob = 0, dist.x = "binomial", par.x = 0,
  beta.x = 0, dist.z = "gamma", par.z = 0, dist.rec, par.rec, pfree = 0,
  dfree = 0)

Parameters

N Number of individuals
fu.min Minimum length of follow-up.
fu.max Maximum length of follow-up. Each individual's length of follow-up is generated from a uniform distribution on [fu.min, fu.max]. If fu.min=fu.max, then all individuals have a common follow-up.
cens.prob Gives the probability of being censored due to loss to follow-up before fu.max. For a random set of individuals defined by a B(N,cens.prob)-distribution, the time to censoring is generated from a uniform distribution on [0, fu.max]. Default is cens.prob=0, i.e. no censoring due to loss to follow-up.
dist.x Distribution of the covariate(s) \(X\). If there is more than one covariate, dist.x must be a vector of distributions with one entry for each covariate. Possible values are "binomial" and "normal", default is "binomial".
par.x Parameters of the covariate distribution(s). For "binomial", par.x is the probability for \(x=1\). For "normal", par.x is the vector \((\mu, \sigma)\), where \(\mu\) is the mean and \(\sigma\) is the standard deviation of a normal distribution. If one of the covariates is defined to be normally distributed, par.x must be a list, e.g. dist.x <- c("binomial", "normal") and par.x <- list(0.5, c(1,2)). Default is par.x = 0, i.e. \(x=0\) for all individuals.
beta.x Regression coefficient(s) for the covariate(s) \(x\). If there is more than one covariate, beta.x must be a vector of coefficients with one entry for each covariate. simrec generates as many covariates as there are entries in beta.x. Default is beta.x = 0, corresponding to no effect of the covariate \(x\).
dist.z Distribution of the frailty variable \(Z\) with \(E(Z)=1\) and \(Var(Z)=\theta\). Possible values are "gamma" for a Gamma distributed frailty and "lognormal" for a lognormal distributed frailty. Default is dist.z="gamma".
par.z Parameter \(\theta\) for the frailty distribution: this parameter gives the variance of the frailty variable \(Z\). Default is par.z=0, which causes \(Z\equiv 1\), i.e. no frailty effect.
dist.rec Form of the baseline hazard function. Possible values are "weibull" or "gompertz" or "lognormal" or "step".
par.rec Parameters for the distribution of the event data.
If dist.rec="weibull", the hazard function is \[\lambda_0(t)=\lambda\cdot\nu\cdot t^{\nu - 1}\] where \(\lambda>0\) is the scale and \(\nu>0\) is the shape parameter. Then par.rec=c(lambda, nu). A special case of this is the exponential distribution for \(\nu=1\).
If dist.rec="gompertz", the hazard function is \[\lambda_0(t)=\lambda\cdot \exp(\alpha t)\] where \(\lambda>0\) is the scale and \(\alpha\in(-\infty,+\infty)\) is the shape parameter. Then par.rec=c(lambda, alpha).
If dist.rec="lognormal", the hazard function is \[\lambda_0(t) = \frac{1}{\sigma t} \cdot \frac{\phi\left(\frac{\ln(t)-\mu}{\sigma}\right)}{\Phi\left(-\frac{\ln(t)-\mu}{\sigma}\right)}\] where \(\phi\) is the probability density function and \(\Phi\) is the cumulative distribution function of the standard normal distribution, \(\mu\in(-\infty,+\infty)\) is a location parameter and \(\sigma>0\) is a shape parameter. Then par.rec=c(mu,sigma). Please note that specifying dist.rec="lognormal" together with covariates does not specify the usual lognormal model (in which covariates act on the parameters of the lognormal distribution, resulting in non-proportional hazards); it only defines the baseline hazard, and covariate effects are incorporated under the proportional hazards assumption.
If dist.rec="step", the hazard function is \[\lambda_0(t)=\begin{cases} a, & t\leq t_1\\ b, & t>t_1\end{cases}\] Then par.rec=c(a,b,t_1), with \(a,b\geq 0\).
pfree Probability that after experiencing an event the individual is not at risk for experiencing further events for a length of dfree time units. Default is pfree = 0.
dfree Length of the risk-free interval. Must be in the same time unit as fu.max. Default is dfree = 0, i.e. the individual is continuously at risk for experiencing events until end of follow-up.
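To see how the choice of dist.rec and par.rec shapes the baseline hazard, the four hazard functions can be written out directly. This is an illustrative base-R sketch; the function names are hypothetical and not exported by simrec. The lognormal hazard is written as the lognormal density divided by the lognormal survival function.

```r
# Illustrative baseline hazards (function names are hypothetical).
haz_weibull   <- function(t, lambda, nu)    lambda * nu * t^(nu - 1)
haz_gompertz  <- function(t, lambda, alpha) lambda * exp(alpha * t)
haz_lognormal <- function(t, mu, sigma) {
  u <- (log(t) - mu) / sigma
  dnorm(u) / (sigma * t * pnorm(-u))   # density divided by survival function
}
haz_step <- function(t, a, b, t1) ifelse(t <= t1, a, b)

t <- c(0.5, 1, 2)
haz_weibull(t, lambda = 1, nu = 2)     # 1 2 4
haz_weibull(t, lambda = 1, nu = 1)     # 1 1 1 (constant hazard, exponential case)
haz_gompertz(t, lambda = 1, alpha = 0) # 1 1 1 (alpha = 0 also gives a constant hazard)
haz_step(t, a = 0.5, b = 2, t1 = 1)    # 0.5 0.5 2.0
```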

Output

The output is a data.frame consisting of the columns:

id An integer number for identification of each individual
x or x.V1, x.V2, ... - depending on the covariate matrix. Contains the randomly generated value of the covariate(s) \(X\) for each individual.
z Contains the randomly generated value of the frailty variable \(Z\) for each individual.
start The start of interval [start, stop], when the individual starts to be at risk for a next event.
stop The time of an event or censoring, i.e. the end of interval [start, stop].
status An indicator of whether an event occurred at time stop (status=1) or the individual is censored at time stop (status=0).
fu Length of follow-up period [0,fu] for each individual.

For each individual there are as many lines as it experiences events, plus one line if being censored. The data format corresponds to the counting process format.
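Because every row is one at-risk interval, per-individual summaries follow from simple grouping. The sketch below uses a small hand-typed data frame with illustrative values (not actual simrec output) to count events and compute the time at risk per individual.

```r
# Hand-typed toy data in the counting process format (illustrative values).
toy <- data.frame(
  id     = c(1, 1, 1, 2, 2),
  start  = c(0.0, 0.5, 1.3, 0.0, 0.9),
  stop   = c(0.5, 1.3, 2.0, 0.9, 2.0),
  status = c(1, 1, 0, 1, 0),
  fu     = c(2, 2, 2, 2, 2)
)

# Number of observed events per individual (2 for id 1, 1 for id 2)
n_events <- tapply(toy$status == 1, toy$id, sum)
# Total time at risk per individual; risk-free gaps would not be counted
time_at_risk <- tapply(toy$stop - toy$start, toy$id, sum)
# Crude event rate per unit of time at risk
rate <- n_events / time_at_risk
rate
```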

Details

Data are simulated by extending the methods proposed by Bender, Augustin, and Blettner (2005) to the multiplicative intensity model. For details, see Jahn-Eimermacher et al. (2015).
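The core idea can be sketched for a Weibull baseline hazard. On the total time scale, given the previous event time \(t_k\), the next event time \(T_{k+1}\) satisfies \(\Lambda_0(T_{k+1}) = \Lambda_0(t_k) - \log(U)/(Z_i \exp(\beta^t X_i))\) with \(U\) uniform on \((0,1)\) and \(\Lambda_0(t)=\lambda t^{\nu}\) the cumulative baseline hazard. The code below is an illustration of this inversion step only, not the simrec source; the function `sim_events_weibull` and its argument `mult` are hypothetical, and risk-free intervals and random censoring are left out.

```r
# Illustrative sketch (not the simrec source): event times for one individual
# under a Weibull baseline hazard on the total time scale.
sim_events_weibull <- function(lambda, nu, mult = 1, fu = 2) {
  cumhaz     <- function(t) lambda * t^nu          # Lambda_0(t)
  inv_cumhaz <- function(h) (h / lambda)^(1 / nu)  # inverse of Lambda_0
  times  <- numeric(0)
  t_prev <- 0
  repeat {
    # Lambda_0(T_next) = Lambda_0(t_prev) - log(U) / mult, with U ~ U(0, 1)
    t_next <- inv_cumhaz(cumhaz(t_prev) - log(runif(1)) / mult)
    if (t_next > fu) break                         # censored at end of follow-up
    times  <- c(times, t_next)
    t_prev <- t_next
  }
  times
}

set.seed(42)
# mult plays the role of Z_i * exp(beta' X_i); here z = 1.2, beta = 0.3, x = 1
events <- sim_events_weibull(lambda = 1, nu = 2, mult = 1.2 * exp(0.3), fu = 2)
events
```

Each draw starts from the cumulative hazard already accumulated at the previous event, which is what makes the hazard depend on total time rather than on the time since the last event.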

Example

library(simrec)
### Example:
### A sample of 10 individuals

N <- 10

### with a binomially distributed covariate with a regression coefficient
### of beta=0.3, and a standard normally distributed covariate with a
### regression coefficient of beta=0.2,

dist.x <- c("binomial", "normal")
par.x <- list(0.5, c(0, 1))
beta.x <- c(0.3, 0.2)

### a gamma distributed frailty variable with variance 0.25

dist.z <- "gamma"
par.z <- 0.25

### and a Weibull-shaped baseline hazard with scale parameter lambda=1
### and shape parameter nu=2.

dist.rec <- "weibull"
par.rec <- c(1,2)

### Subjects are to be followed for two years with 20% of the subjects
### being censored according to a uniformly distributed censoring time
### within [0,2] (in years).

fu.min <- 2
fu.max <- 2
cens.prob <- 0.2

### After each event a subject is not at risk for experiencing further events
### for a period of 30 days with a probability of 50%.

dfree <- 30/365
pfree <- 0.5

simdata <- simrec(N, fu.min, fu.max, cens.prob, dist.x, par.x, beta.x,
                  dist.z, par.z, dist.rec, par.rec, pfree, dfree)
print(simdata[1:10,])
DT::datatable(simdata)

##    id x.V1       x.V2        z     start      stop status fu
## 1   1    1 -0.7893980 1.115146 0.0000000 0.5016207      1  2
## 2   1    1 -0.7893980 1.115146 0.5016207 1.3302807      1  2
## 3   1    1 -0.7893980 1.115146 1.3302807 1.4132175      1  2
## 4   1    1 -0.7893980 1.115146 1.4132175 1.7863954      1  2
## 5   1    1 -0.7893980 1.115146 1.7863954 1.8778984      1  2
## 6   1    1 -0.7893980 1.115146 1.9600902 2.0000000      0  2
## 7   2    1 -0.5516677 1.169574 0.0000000 0.8522717      1  2
## 8   2    1 -0.5516677 1.169574 0.8522717 1.0128146      1  2
## 9   2    1 -0.5516677 1.169574 1.0950063 1.4411274      1  2
## 10  2    1 -0.5516677 1.169574 1.5233191 1.7305123      1  2

3 The `simreccomp` function

Description

The function simreccomp allows simulation of time-to-event data that follow a multistate model with recurrent events of one type and a competing event. The baseline hazards of the cause-specific hazards are functions of the total/calendar time. To induce between-subject heterogeneity, a random effect covariate (frailty term) can be incorporated for the recurrent and the competing event.

Data for the recurrent events of individual \(i\) are generated according to the cause-specific hazard \[\lambda_{0r}(t)\cdot Z_{ri} \cdot \exp(\beta_r^t X_i)\] where \(X_i\) denotes the covariate vector and \(\beta_r\) the regression coefficient vector. \(\lambda_{0r}(t)\) denotes the baseline hazard, being a function of the total/calendar time \(t\), and \(Z_{ri}\) denotes the frailty variable, with \((Z_{ri})_i\) iid, \(E(Z_{ri})=1\) and \(Var(Z_{ri})=\theta_r\). The parameter \(\theta_r\) describes the degree of between-subject heterogeneity for the recurrent event. Analogously, the competing event is generated according to the cause-specific hazard conditional on the frailty variable and covariates: \[\lambda_{0c}(t)\cdot Z_{ci} \cdot \exp(\beta_c^t X_i)\] Data output is in the counting process format.
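One way to picture this model for a single individual: draw the competing event time from its cause-specific hazard, then generate recurrent events on the total time scale until the competing event or the end of follow-up, whichever comes first. The sketch below assumes Weibull baseline hazards for both event types and independence of the two processes given frailties and covariates. It is an illustration, not the simreccomp source; `sim_one_comp`, `mult_r` and `mult_c` are hypothetical names, and risk-free intervals and random censoring are omitted.

```r
# Illustrative sketch (not the simreccomp source): one individual with Weibull
# baseline hazards for the recurrent events and the competing event.
sim_one_comp <- function(par.rec, par.comp, mult_r = 1, mult_c = 1, fu = 2) {
  # Competing event time by inverting its cumulative hazard lambda * t^nu
  d   <- (-log(runif(1)) / (par.comp[1] * mult_c))^(1 / par.comp[2])
  end <- min(d, fu)
  # Recurrent events on the total time scale until `end`
  lambda <- par.rec[1]
  nu     <- par.rec[2]
  stop   <- numeric(0)
  t_prev <- 0
  repeat {
    t_next <- ((lambda * t_prev^nu - log(runif(1)) / mult_r) / lambda)^(1 / nu)
    if (t_next > end) break
    stop   <- c(stop, t_next)
    t_prev <- t_next
  }
  data.frame(
    start  = c(0, stop),
    stop   = c(stop, end),
    # 1 = recurrent event, 2 = competing event, 0 = censored at end of follow-up
    status = c(rep(1, length(stop)), if (d <= fu) 2 else 0)
  )
}

set.seed(7)
# mult_r and mult_c stand for Z_r * exp(beta_r' X) and Z_c * exp(beta_c' X)
one <- sim_one_comp(par.rec = c(1, 2), par.comp = c(1, 2),
                    mult_r = 0.8 * exp(0.3), mult_c = 0.9 * exp(0.5))
one
```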

simreccomp(N, fu.min, fu.max, cens.prob = 0, dist.x = "binomial", par.x = 0,
           beta.xr = 0, beta.xc = 0, dist.zr = "gamma", par.zr = 0, a = NULL,
           dist.zc = NULL, par.zc = NULL, dist.rec, par.rec,
           dist.comp, par.comp, pfree = 0, dfree = 0)

Parameters

N Number of individuals
fu.min Minimum length of follow-up.
fu.max Maximum length of follow-up. Each individual's length of follow-up is generated from a uniform distribution on [fu.min, fu.max]. If fu.min=fu.max, then all individuals have a common follow-up.
cens.prob Gives the probability of being censored due to loss to follow-up before fu.max. For a random set of individuals defined by a B(N,cens.prob)-distribution, the time to censoring is generated from a uniform distribution on [0, fu.max]. Default is cens.prob=0, i.e. no censoring due to loss to follow-up.
dist.x Distribution of the covariate(s) \(X\). If there is more than one covariate, dist.x must be a vector of distributions with one entry for each covariate. Possible values are "binomial" and "normal", default is "binomial".
par.x Parameters of the covariate distribution(s). For "binomial", par.x is the probability for \(x=1\). For "normal", par.x is the vector \((\mu, \sigma)\), where \(\mu\) is the mean and \(\sigma\) is the standard deviation of a normal distribution. If one of the covariates is defined to be normally distributed, par.x must be a list, e.g. dist.x <- c("binomial", "normal") and par.x <- list(0.5, c(1,2)). Default is par.x = 0, i.e. \(x=0\) for all individuals.
beta.xr Regression coefficient(s) for the covariate(s) \(x\) corresponding to the recurrent events. If there is more than one covariate, beta.xr must be a vector of coefficients with one entry for each covariate. simreccomp generates as many covariates as there are entries in beta.xr. Default is beta.xr = 0, corresponding to no effect of the covariate \(x\) on the recurrent events.
beta.xc Regression coefficient(s) for the covariate(s) \(x\) corresponding to the competing event. If there is more than one covariate, beta.xc must be a vector of coefficients with one entry for each covariate. simreccomp generates as many covariates as there are entries in beta.xc. Default is beta.xc = 0, corresponding to no effect of the covariate \(x\) on the competing event.
dist.zr Distribution of the frailty variable \(Z_r\) for the recurrent events with \(E(Z_r)=1\) and \(Var(Z_r)=\theta_r\). Possible values are "gamma" for a Gamma distributed frailty and "lognormal" for a lognormal distributed frailty. Default is dist.zr="gamma".
par.zr Parameter \(\theta_r\) for the frailty distribution: this parameter gives the variance of the frailty variable \(Z_r\). Default is par.zr=0, which causes \(Z_r\equiv 1\), i.e. no frailty effect for the recurrent events.
dist.zc Distribution of the frailty variable \(Z_c\) for the competing event with \(E(Z_c)=1\) and \(Var(Z_c)=\theta_c\). Possible values are "gamma" for a Gamma distributed frailty and "lognormal" for a lognormal distributed frailty.
par.zc Parameter \(\theta_c\) for the frailty distribution: this parameter gives the variance of the frailty variable \(Z_c\).
a Alternatively, the frailty distribution for the competing event can be computed through the distribution of the frailty variable \(Z_r\) by \(Z_c=Z_r^a\). Either a or dist.zc and par.zc must be specified.
dist.rec Form of the baseline hazard function for the recurrent events. Possible values are "weibull" or "gompertz" or "lognormal" or "step".
par.rec Parameters for the distribution of the recurrent event data.
If dist.rec="weibull", the hazard function is \[\lambda_0(t)=\lambda\cdot\nu\cdot t^{\nu - 1}\] where \(\lambda>0\) is the scale and \(\nu>0\) is the shape parameter. Then par.rec=c(lambda, nu). A special case of this is the exponential distribution for \(\nu=1\).
If dist.rec="gompertz", the hazard function is \[\lambda_0(t)=\lambda\cdot \exp(\alpha t)\] where \(\lambda>0\) is the scale and \(\alpha\in(-\infty,+\infty)\) is the shape parameter. Then par.rec=c(lambda, alpha).
If dist.rec="lognormal", the hazard function is \[\lambda_0(t) = \frac{1}{\sigma t} \cdot \frac{\phi\left(\frac{\ln(t)-\mu}{\sigma}\right)}{\Phi\left(-\frac{\ln(t)-\mu}{\sigma}\right)}\] where \(\phi\) is the probability density function and \(\Phi\) is the cumulative distribution function of the standard normal distribution, \(\mu\in(-\infty,+\infty)\) is a location parameter and \(\sigma>0\) is a shape parameter. Then par.rec=c(mu,sigma). Please note that specifying dist.rec="lognormal" together with covariates does not specify the usual lognormal model (in which covariates act on the parameters of the lognormal distribution, resulting in non-proportional hazards); it only defines the baseline hazard, and covariate effects are incorporated under the proportional hazards assumption.
If dist.rec="step", the hazard function is \[\lambda_0(t)=\begin{cases} a, & t\leq t_1\\ b, & t>t_1\end{cases}\] Then par.rec=c(a,b,t_1), with \(a,b\geq 0\).
dist.comp Form of the baseline hazard function for the competing event. Possible values are "weibull" or "gompertz" or "lognormal" or "step".
par.comp Parameters for the distribution of the competing event data. For more details see par.rec.
pfree Probability that after experiencing an event the individual is not at risk for experiencing further events for a length of dfree time units. Default is pfree = 0.
dfree Length of the risk-free interval. Must be in the same time unit as fu.max. Default is dfree = 0, i.e. the individual is continuously at risk for experiencing events until end of follow-up.
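The follow-up and censoring scheme described by fu.min, fu.max and cens.prob can be sketched in a few lines of base R. This is an illustration of the description above, not the package source. In particular, whether the package takes the minimum of planned follow-up and censoring time or simply overwrites the follow-up is not stated here; the sketch assumes the minimum.

```r
# Illustrative sketch of the follow-up and censoring scheme (not package code).
set.seed(3)
N <- 1000
fu.min <- 1
fu.max <- 2
cens.prob <- 0.2

fu <- runif(N, min = fu.min, max = fu.max)        # planned follow-up per individual
n_cens <- rbinom(1, size = N, prob = cens.prob)   # number lost to follow-up, B(N, cens.prob)
lost <- sample(N, n_cens)                         # which individuals are lost
cens_time <- runif(n_cens, min = 0, max = fu.max) # uniform censoring time on [0, fu.max]
# Assumption: censoring can only shorten the planned follow-up
fu[lost] <- pmin(fu[lost], cens_time)

range(fu)
n_cens / N   # realised share of individuals lost to follow-up
```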

Output

The output is a data.frame consisting of the columns:

id An integer number for identification of each individual
x or x.V1, x.V2, ... - depending on the covariate matrix. Contains the randomly generated value of the covariate(s) \(X\) for each individual.
zr Contains the randomly generated value of the frailty variable \(Z_r\) for each individual.
zc Contains the randomly generated value of the frailty variable \(Z_c\) for each individual.
start The start of interval [start, stop], when the individual starts to be at risk for a next event.
stop The time of an event or censoring, i.e. the end of interval [start, stop].
status An indicator of whether a recurrent event occurred at time stop (status=1), the individual is censored at time stop (status=0), or the competing event occurred at time stop (status=2).
fu Length of follow-up period [0,fu] for each individual.

For each individual there are as many lines as it experiences events, plus one line if being censored. The data format corresponds to the counting process format.
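When the argument a is used instead of dist.zc and par.zc, the zc column is a deterministic function of zr. A quick check, using three frailty pairs copied from the first example output shown later in this section (simdata1, generated with a = 0.5):

```r
# Frailty values copied from the printed simdata1 output (a = 0.5).
zr <- c(0.4606165, 0.7828881, 0.2647578)
zc <- c(0.6786873, 0.8848096, 0.5145462)
a  <- 0.5

# If Z_c = Z_r^a, the two columns agree up to printing precision
agree <- isTRUE(all.equal(zr^a, zc, tolerance = 1e-5))
agree
```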

Example

library(simrec)
### Example:
### A sample of 10 individuals

N <- 10

### with a binomially distributed covariate and a standard normally distributed
### covariate with regression coefficients of beta.xr=0.3 and beta.xr=0.2,
### respectively, for the recurrent events,
### as well as regression coefficients of beta.xc=0.5 and beta.xc=0.25,
### respectively, for the competing event.

dist.x  <- c("binomial", "normal")
par.x <- list(0.5, c(0, 1))
beta.xr <- c(0.3, 0.2)
beta.xc <- c(0.5, 0.25)

### a gamma distributed frailty variable for the recurrent event with
### variance 0.25 and for the competing event with variance 0.3,

dist.zr <- "gamma"
par.zr <- 0.25

dist.zc <- "gamma"
par.zc <- 0.3

### alternatively the frailty variable for the competing event can be computed
### via a:
a <- 0.5

### Furthermore a Weibull-shaped baseline hazard for the recurrent event with
### scale parameter lambda=1 and shape parameter nu=2,

dist.rec <- "weibull"
par.rec <- c(1, 2)

### and a Weibull-shaped baseline hazard for the competing event with
### scale parameter lambda=1 and shape parameter nu=2.

dist.comp <- "weibull"
par.comp <- c(1, 2)

### Subjects are to be followed for two years with 20% of the subjects
### being censored according to a uniformly distributed censoring time
### within [0,2] (in years).

fu.min <- 2
fu.max <- 2
cens.prob <- 0.2

### After each event a subject is not at risk for experiencing further events
### for a period of 30 days with a probability of 50%.

dfree <- 30/365
pfree <- 0.5

simdata1 <- simreccomp(N = N, fu.min = fu.min, fu.max = fu.max, cens.prob = cens.prob,
                       dist.x = dist.x, par.x = par.x, beta.xr = beta.xr,
                       beta.xc = beta.xc, dist.zr = dist.zr, par.zr = par.zr, a = a,
                       dist.rec = dist.rec, par.rec = par.rec, dist.comp = dist.comp,
                       par.comp = par.comp, pfree = pfree, dfree = dfree)
simdata2 <- simreccomp(N = N, fu.min = fu.min, fu.max = fu.max, cens.prob = cens.prob,
                       dist.x = dist.x, par.x = par.x, beta.xr = beta.xr,
                       beta.xc = beta.xc, dist.zr = dist.zr, par.zr = par.zr,
                       dist.zc = dist.zc, par.zc = par.zc, dist.rec = dist.rec,
                       par.rec = par.rec, dist.comp = dist.comp,
                       par.comp = par.comp, pfree = pfree, dfree = dfree)

print(simdata1[1:10, ])
print(simdata2[1:10, ])
DT::datatable(simdata1)
DT::datatable(simdata2)

##    id x.V1       x.V2        zr        zc     start       stop status         fu
## 1   1    1 -2.1939082 0.4606165 0.6786873 0.0000000 0.05488195      0 0.05488195
## 3   2    1  1.3596272 0.7828881 0.8848096 0.0000000 0.33509524      1 1.07296452
## 4   2    1  1.3596272 0.7828881 0.8848096 0.3350952 0.93037208      1 1.07296452
## 5   2    1  1.3596272 0.7828881 0.8848096 0.9303721 1.07296452      2 1.07296452
## 9   3    0 -0.5140190 0.2647578 0.5145462 0.0000000 0.96246147      2 0.96246147
## 12  4    1  1.2931537 1.1051093 1.0512418 0.0000000 0.98722198      2 0.98722198
## 21  5    1 -0.6677042 0.8519025 0.9229857 0.0000000 0.68104225      1 1.05826532
## 22  5    1 -0.6677042 0.8519025 0.9229857 0.7632340 1.05826532      2 1.05826532
## 28  6    1 -1.4736345 1.6174605 1.2717942 0.0000000 0.42065435      1 0.85924993
## 29  6    1 -1.4736345 1.6174605 1.2717942 0.5028461 0.52903958      1 0.85924993
##    id x.V1        x.V2        zr        zc     start      stop status        fu
## 1   1    1 -1.15006687 0.7498753 0.8613313 0.0000000 0.1058893      2 0.1058893
## 3   2    0 -0.75189143 0.9142893 0.1872216 0.0000000 0.4741828      0 0.4741828
## 5   3    0 -2.16474687 2.8161489 2.7111986 0.0000000 0.2014064      2 0.2014064
## 14  4    1  0.99538003 1.1940654 0.3344309 0.0000000 0.9455841      1 1.4552197
## 15  4    1  0.99538003 1.1940654 0.3344309 0.9455841 1.0724881      1 1.4552197
## 16  4    1  0.99538003 1.1940654 0.3344309 1.1546799 1.2978204      1 1.4552197
## 17  4    1  0.99538003 1.1940654 0.3344309 1.2978204 1.4552197      2 1.4552197
## 21  5    0  0.02024433 1.5690281 0.6988923 0.0000000 0.9457083      1 0.9741986
## 22  5    0  0.02024433 1.5690281 0.6988923 0.9457083 0.9735811      1 0.9741986
## 23  5    0  0.02024433 1.5690281 0.6988923 0.9735811 0.9741986      2 0.9741986

4 The `simrecint` function

Description

With this function, previously simulated data (for example, simulated with simrec or simreccomp) can be cut to an interim data set. The simulated data must be in patient time (i.e. time since the patient entered the study; for an example see below) and in the counting process format. Furthermore, the data set must contain the variables id, start, stop, and status, like data simulated with simrec or simreccomp. For every individual, a recruitment time is then generated in study time (i.e. time since the start of the study; for an example see below), uniformly distributed on \([0, t_R]\). The timing of the interim analysis \(t_I\) is set in study time, and the data are cut to everything that is available at the interim analysis. If you only wish to simulate a recruitment time, \(t_I\) can be set to \(t_R + fu.max\) or any other value beyond the end of the study.

Figure: Simulated data in patient time.

Figure: Simulated data in study time with time of interim analysis (\(t_I\)), end of recruitment period (\(t_R\)) and end of study.

simrecint(data, N, tR, tI)

Parameters

data Previously generated data (in patient time) that are to be cut to interim data
N Number of individuals for whom data were generated
tR Length of the recruitment period (in study time)
tI Timing of the interim analysis (in study time)

Output

The output is a data.frame consisting of the columns of the input data and, additionally, the following columns:

rectime The recruitment time for each individual (in study time).
interimtime The time of the interim analysis tI (in study time).
stop_study The stopping time for each event in study time.

Individuals that are not already recruited at the interim analysis are left out here.
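The cutting logic can be illustrated on a hand-typed toy data set. This is a sketch of the description above, not the simrecint source: recruitment times are drawn uniformly on \([0, t_R]\), intervals that have not started by \(t_I\) are dropped, and intervals still running at \(t_I\) are shortened and marked as censored. The column names follow the output description; the object name `interim` is hypothetical.

```r
# Illustrative sketch (not the simrecint source) on hand-typed toy data.
toy <- data.frame(
  id     = c(1, 1, 2, 3),
  start  = c(0.0, 0.4, 0.0, 0.0),
  stop   = c(0.4, 2.0, 2.0, 1.5),
  status = c(1, 0, 0, 1)
)
tR <- 2
tI <- 1

set.seed(11)
rectime <- runif(3, min = 0, max = tR)       # one recruitment time per individual
toy$rectime     <- rectime[toy$id]
toy$interimtime <- tI
toy$stop_study  <- toy$rectime + toy$stop    # original stop time in study time

# Keep only intervals that have started before the interim analysis ...
interim <- toy[toy$rectime + toy$start < tI, ]
# ... and censor those that are still running at the interim analysis
running <- interim$stop_study > tI
interim$stop[running]   <- tI - interim$rectime[running]
interim$status[running] <- 0
interim
```

In this sketch stop_study keeps the uncut stopping time, so it can lie beyond tI for rows that were shortened.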

Example

### Example - see example for simrec
library(simrec)
N         <- 10
dist.x    <- c("binomial", "normal")
par.x     <- list(0.5, c(0,1))
beta.x    <- c(0.3, 0.2)
dist.z    <- "gamma"
par.z     <- 0.25
dist.rec  <- "weibull"
par.rec   <- c(1,2)
fu.min    <- 2
fu.max    <- 2
cens.prob <- 0.2

simdata <- simrec(N, fu.min, fu.max, cens.prob, dist.x, par.x, beta.x, dist.z,
                  par.z, dist.rec, par.rec)

### Now simulate for each patient a recruitment time in [0,tR=2]
### and cut data to the time of the interim analysis at tI=1:

simdataint <- simrecint(simdata, N = N, tR = 2, tI = 1)
print(simdataint)  # only run for small N!
DT::datatable(simdataint)

##    id x.V1       x.V2         z     start      stop status       fu   rectime interimtime
## 1   1    0  0.6823446 0.9188553 0.0000000 0.2284217      0 2.000000 0.7715783           1
## 6   3    0 -0.2524965 1.3456617 0.0000000 0.5411292      0 1.096144 0.4588708           1
## 42 10    0 -0.9202819 1.3565448 0.0000000 0.4958149      1 2.000000 0.4340184           1
## 43 10    0 -0.9202819 1.3565448 0.4958149 0.5659816      0 2.000000 0.4340184           1
##    stop_study
## 1   1.2376194
## 6   1.2143280
## 42  0.9298333
## 43  1.6060269

5 Plotting the Data

Description

The functions simrecPlot and simreccompPlot allow plotting of recurrent event data, possibly with a competing event.

simrecPlot(simdata, id = "id", start = "start", stop = "stop", status = "status")
simreccompPlot(simdata, id = "id", start = "start", stop = "stop", status = "status")

Parameters

simdata A data set of recurrent event data to be plotted.
The input data must include columns corresponding to:
- id - patient ID
- start - beginning of an interval where the patient is at risk for an event
- stop - end of the interval due to an event or censoring
- status - an indicator of the patient status at stop, with 0 = censoring and 1 = event for simrecPlot, and additionally 2 = competing event for simreccompPlot
id the name of the id column, default is "id"
start the name of the start column, default is "start"
stop the name of the stop column, default is "stop"
status the name of the status column, default is "status"

Output

The output is a plot of the data with a bullet (\(\bullet\)) indicating a recurrent event, a circle (\(\circ\)) indicating censoring and \(\times\) indicating the competing event.
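A comparable picture can be drawn with base graphics, which may help when the plot needs customising. The sketch below is not the simrecPlot or simreccompPlot source; it uses hand-typed toy data and maps the three status codes to the symbols described above.

```r
# Illustrative base-graphics sketch (not the simrecPlot source).
toy <- data.frame(
  id     = c(1, 1, 1, 2, 2),
  start  = c(0.0, 0.5, 1.3, 0.0, 0.9),
  stop   = c(0.5, 1.3, 2.0, 0.9, 1.6),
  status = c(1, 1, 0, 1, 2)
)
# Symbols: circle = censoring (0), bullet = event (1), cross = competing event (2)
pch_by_status <- c("0" = 1, "1" = 16, "2" = 4)
pch <- unname(pch_by_status[as.character(toy$status)])

plot(NULL, xlim = c(0, max(toy$stop)), ylim = c(0.5, max(toy$id) + 0.5),
     xlab = "time", ylab = "id", yaxt = "n")
axis(2, at = unique(toy$id))
segments(toy$start, toy$id, toy$stop, toy$id)   # intervals at risk
points(toy$stop, toy$id, pch = pch)             # event, censoring or competing event
```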

Example

simrecPlot(simdata)

simreccompPlot(simdata1)

Session Info

sessionInfo()

## R version 4.3.0 (2023-04-21)
## Platform: x86_64-apple-darwin20 (64-bit)
## Running under: macOS Monterey 12.6.4
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Berlin
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] simrec_1.0.1 knitr_1.43  
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.33     R6_2.5.1          fastmap_1.1.1     xfun_0.40        
##  [5] magrittr_2.0.3    cachem_1.0.8      htmltools_0.5.6   rmarkdown_2.24   
##  [9] DT_0.28           cli_3.6.1         sass_0.4.7        jquerylib_0.1.4  
## [13] compiler_4.3.0    highr_0.10        rstudioapi_0.15.0 tools_4.3.0      
## [17] ellipsis_0.3.2    evaluate_0.21     bslib_0.5.1       yaml_2.3.7       
## [21] htmlwidgets_1.6.2 rlang_1.1.1       jsonlite_1.8.7    crosstalk_1.2.0

References

Andersen, P. K., and R. D. Gill. 1982. “Cox’s Regression Model for Counting Processes: A Large Sample Study.” Ann. Statist. 10 (4): 1100–1120. https://doi.org/10.1214/aos/1176345976.

Bender, Ralf, Thomas Augustin, and Maria Blettner. 2005. “Generating Survival Times to Simulate Cox Proportional Hazards Models.” Statistics in Medicine 24 (11): 1713–23. https://doi.org/10.1002/sim.2059.

Jahn-Eimermacher, Antje, Katharina Ingel, Ann-Kathrin Ozga, Stella Preussler, and Harald Binder. 2015. “Simulating Recurrent Event Data with Hazard Functions Defined on a Total Time Scale.” BMC Medical Research Methodology 15 (1): 16. https://doi.org/10.1186/s12874-015-0005-2.

¹ ingel@uni-mainz.de; currently at BIOTRONIK SE & Co. KG
² currently at Institute of Medical Biometry and Informatics (IMBI, Heidelberg)
³ marinif@uni-mainz.de
⁴ currently at University of Applied Sciences Darmstadt, antje.jahn@h-da.de
