Type: Package
Title: Time-Dependent Precision-Recall Curve Estimation for Right-Censored Data
Version: 1.0.0
Description: Contains functions for estimating the time-dependent precision-recall curve (PRC) and the corresponding area under the PRC for right-censored survival data. It also computes the time-dependent ROC curve and the corresponding area under the ROC curve (AUC). See Beyene, Chen and Kifle (2024) <doi:10.1002/bimj.202300135>.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Depends: R(≥ 4.0)
Imports: survidm, graphics, stats
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-09-10 14:19:59 UTC; m2kas
Author: Kassu Mehari Beyene [aut, cre], Ding-Geng Chen [ctb], Yehenew Getachew Kifle [ctb]
Maintainer: Kassu Mehari Beyene <m2kassu@gmail.com>
Repository: CRAN
Date/Publication: 2025-09-15 09:00:02 UTC

tdPRC

Description

Tools for estimating and visualizing time-dependent precision-recall and receiver operating characteristic (ROC) curves. The area under the precision-recall curve (AUPRC) and the area under the ROC curve (AUC) can be compared with statistical tests based on bootstrap standard errors. Confidence intervals can be computed for AUPRC and AUC.
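As a rough illustration of such a comparison, the sketch below forms a Wald-type z statistic from two area estimates and their paired bootstrap replicates. The helper compare_areas and all numbers are hypothetical and are not part of the package; the test the package actually implements may differ.

```r
# Hypothetical helper: Wald-type comparison of two area estimates using
# bootstrap replicates computed on the same resamples (paired design).
compare_areas <- function(est1, est2, boot1, boot2) {
  diff_hat <- est1 - est2
  se_diff  <- sd(boot1 - boot2)          # bootstrap SE of the difference
  z        <- diff_hat / se_diff
  c(difference = diff_hat, se = se_diff, z = z,
    p.value = 2 * pnorm(-abs(z)))        # two-sided normal p-value
}

# Made-up bootstrap replicates for two markers, for illustration only
set.seed(1)
boot_a <- rnorm(200, mean = 0.62, sd = 0.04)
boot_b <- boot_a - rnorm(200, mean = 0.05, sd = 0.03)
compare_areas(0.62, 0.57, boot_a, boot_b)
```

Taking the standard deviation of the replicate differences, rather than combining two separate standard errors, accounts for the correlation between estimates obtained from the same subjects.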

Abbreviations

In this package, the following abbreviations are commonly used:

TPR

True Positive Rate.

FPR

False Positive Rate.

PPV

Positive Predictive Value.

AUPRC

Area under the precision-recall curve.

ROC

The Receiver Operating Characteristic curve.

AUC

Area under the ROC curve at a given time horizon t.
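For orientation, TPR, FPR and PPV can be computed as follows when the case status at time t is fully observed. The toy vectors and the convention that larger marker values indicate higher risk are assumptions made for illustration; with censoring the status is not always known, which is the situation the package's estimators address.

```r
# Hypothetical toy data: marker values and fully observed case status at t
marker <- c(0.2, 0.9, 0.4, 0.7, 0.1, 0.8)
case   <- c(0,   1,   0,   1,   0,   0)    # 1 = event occurred before t
cutoff <- 0.5

test_positive <- marker > cutoff           # assumed: high marker = high risk

tpr <- sum(test_positive & case == 1) / sum(case == 1)      # P(M > c | case)
fpr <- sum(test_positive & case == 0) / sum(case == 0)      # P(M > c | control)
ppv <- sum(test_positive & case == 1) / sum(test_positive)  # P(case | M > c)

c(TPR = tpr, FPR = fpr, PPV = ppv)
```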

Dataset

This package comes with a right-censored data set with 312 observations and 4 variables. For details, see mayo.

Installing and using

Ensure that your system has an active internet connection, then execute the following command in the R console to install the package:

install.packages("tdPRC")

To load the package after installation, use the following command:

library(tdPRC)

Author(s)

Kassu Mehari Beyene, Ding-Geng Chen and Yehenew Getachew Kifle

Maintainer: Kassu Mehari Beyene <m2kassu@gmail.com>

References

Beran, R. (1981). Nonparametric regression with randomly censored survival data. Technical report, University of California, Berkeley.

Beyene, K.M., Chen, D.G., and Kifle, Y.G. (2024). A novel nonparametric time‐dependent precision–recall curve estimator for right‐censored survival data. Biometrical Journal, 66(3), 2300135.

Beyene, K.M. and El Ghouch, A. (2020). Smoothed time-dependent receiver operating characteristic curve for right censored survival data. Statistics in Medicine, 39: 3373-3396.

Heagerty, P.J., and Zheng, Y. (2005). Survival model predictive accuracy and ROC curves. Biometrics, 61(1), 92-105.

Li, L., Greene, T. and Hu, B. (2016). A simple method to estimate the time-dependent receiver operating characteristic curve and the area under the curve with right censored data. Statistical Methods in Medical Research, 27(8): 2264-2278.

Sheather, S.J. and Jones, M.C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society, Series B (Methodological), 53(3): 683-690.


Estimate of the conditional probability given the observed data

Description

This is the base function of the package that uses the Beran nonparametric conditional survival function estimator for right-censored data to estimate the conditional probability of the event T<t (i.e., the event occurring before time t) or T>t (i.e., the event occurring after time t), given the observed data.
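The idea can be sketched in base R as follows. This is a minimal illustration that assumes a Gaussian kernel and the usual product form of the Beran estimator; the helpers beran_surv and cond_positive and the toy data are hypothetical and are not the package's internal code.

```r
# Beran estimate of S(t | m) using Gaussian kernel weights
beran_surv <- function(t, m, Y, M, censor, h) {
  w <- dnorm((M - m) / h)
  w <- w / sum(w)                          # Nadaraya-Watson weights
  ev <- which(censor == 1 & Y <= t)        # observed events up to t
  if (length(ev) == 0) return(1)
  at_risk <- vapply(ev, function(i) sum(w[Y >= Y[i]]), numeric(1))
  prod(1 - w[ev] / at_risk)
}

# Probability that each subject had the event before t, given what was observed
cond_positive <- function(Y, M, censor, t, h) {
  pos <- numeric(length(Y))                # Y > t: still event-free at t
  pos[Y <= t & censor == 1] <- 1           # event observed before t
  for (i in which(Y <= t & censor == 0)) { # censored before t: status unknown
    s_t <- beran_surv(t,    M[i], Y, M, censor, h)
    s_y <- beran_surv(Y[i], M[i], Y, M, censor, h)
    pos[i] <- 1 - s_t / s_y                # P(T <= t | T > Y_i, M_i)
  }
  pos
}

# Hypothetical toy data
Y      <- c(1, 2, 3, 4, 5)
censor <- c(1, 0, 1, 1, 0)
M      <- c(0.1, 0.4, 0.3, 0.8, 0.6)
pos    <- cond_positive(Y, M, censor, t = 3.5, h = 0.3)
cbind(positive = pos, negative = 1 - pos)
```

Only subjects censored before t receive a fractional value; everyone else is a known case (1) or a known control (0).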

Usage

Csurv(Y, M, censor, t, h=NULL, ktype="gaussian")

Arguments

Y

A numeric vector of observed times (event or censoring times).

M

A numeric vector of marker values.

censor

The censoring indicator, 1 if event, 0 otherwise.

t

A scalar time point at which the conditional probabilities are estimated.

h

A scalar bandwidth value for kernel weights estimation. The default is the value obtained using the method of Sheather and Jones (1991).

ktype

A character string specifying the kernel used for the Beran weight calculation. The possible options are "normal", "epanechnikov", "tricube", "boxcar", "triangular", or "quartic". The default is the "gaussian" kernel.
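For reference, the sketch below writes out the textbook forms of these kernels and obtains a Sheather-Jones bandwidth with bw.SJ from the stats package. The helper kernel_fun and the simulated marker are hypothetical; the package's internal kernel definitions and scaling may differ.

```r
# Textbook kernel shapes; all but the Gaussian are supported on [-1, 1]
kernel_fun <- function(u, ktype = "gaussian") {
  inside <- abs(u) <= 1
  switch(ktype,
    gaussian     = dnorm(u),
    epanechnikov = 0.75 * (1 - u^2) * inside,
    tricube      = (70 / 81) * (1 - abs(u)^3)^3 * inside,
    boxcar       = 0.5 * inside,
    triangular   = (1 - abs(u)) * inside,
    quartic      = (15 / 16) * (1 - u^2)^2 * inside,
    stop("unknown kernel: ", ktype)
  )
}

set.seed(1)
M <- rnorm(200)                  # hypothetical marker values
h <- stats::bw.SJ(M)             # Sheather-Jones bandwidth

# Normalized kernel weights of all subjects around the first subject's marker
w <- kernel_fun((M - M[1]) / h, "epanechnikov")
w <- w / sum(w)
```

Compactly supported kernels give zero weight to subjects whose marker lies more than h away, whereas the Gaussian kernel gives every subject a positive weight.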

Value

Returns a list containing:

positive

estimate of P(T<t|Y,censor,M).

negative

estimate of P(T>t|Y,censor,M).

References

Beyene, K.M., Chen, D.G., and Kifle, Y.G. (2024). A novel nonparametric time‐dependent precision-recall curve estimator for right‐censored survival data. Biometrical Journal, 66(3), 2300135.

Beyene, K.M. and El Ghouch A. (2020). Smoothed time-dependent receiver operating characteristic curve for right censored survival data. Statistics in Medicine, 39: 3373-3396.

Examples

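library(tdPRC)
data(mayo)

# Observed time, censoring indicator, and the five-covariate marker
data <- mayo[, c("time", "censor", "mayoscore5")]
t <- 365 * 6  # six-year horizon, assuming time is recorded in days

est <- Csurv(Y = data$time, M = data$mayoscore5, censor = data$censor, t = t,
             ktype = "gaussian")

# Distribution of the estimated P(T<t|Y,censor,M) across subjects
summary(est$positive)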


Mayo Marker Data

Description

Two marker values with event time and censoring status for the subjects in the Mayo PBC data.

Usage

data(mayo)

Format

A data frame with 312 observations and 4 variables: time (event time/censoring time), censor (censoring indicator), mayoscore4, mayoscore5. The two scores are derived from 4 and 5 covariates respectively.

References

Heagerty, P.J., and Zheng, Y. (2005). Survival model predictive accuracy and ROC curves. Biometrics, 61(1), 92-105.


Time-dependent precision-recall curve (PRC) estimation from right-censored survival data

Description

This function empirically estimates the time-dependent precision-recall curve and ROC curve for right-censored survival data using the cumulative sensitivity and dynamic specificity definitions. It also calculates the time-dependent area under the precision-recall curve (AUPRC) and the area under the ROC curve (AUC). The function computes the standard error and confidence interval of AUPRC and AUC using a nonparametric bootstrap approach.
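A minimal sketch of how such curves and areas can be formed once every subject has an estimated probability of being a case at time t (for example the positive element returned by Csurv). The toy inputs, the helper names, the trapezoidal rule and the handling of the cutoff with no subjects above it are all assumptions made for illustration, not a description of the package's internals.

```r
# Hypothetical inputs: markers and estimated probabilities of an event before t
M   <- c(0.2, 0.9, 0.4, 0.7, 0.1, 0.8)
pos <- c(0.0, 1.0, 0.3, 1.0, 0.0, 0.5)
neg <- 1 - pos

# Probability-weighted TPR, FPR and PPV at one cutoff
rates_at <- function(cutoff) {
  above <- M > cutoff
  c(TPR = sum(pos * above) / sum(pos),
    FPR = sum(neg * above) / sum(neg),
    PPV = sum(pos * above) / sum(above))   # NaN when nobody is above the cutoff
}

cutoffs <- c(-Inf, sort(M))
curve   <- t(sapply(cutoffs, rates_at))    # one row per cutoff

# Trapezoidal rule for points already ordered by increasing x
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# Rates decrease as the cutoff increases, so reverse before integrating
auc <- trapz(rev(curve[, "FPR"]), rev(curve[, "TPR"]))

ok    <- is.finite(curve[, "PPV"])         # drop the undefined PPV point
auprc <- trapz(rev(curve[ok, "TPR"]), rev(curve[ok, "PPV"]))

c(AUC = auc, AUPRC = auprc)
```

In this sketch the PR area is integrated only from the smallest attained recall up to 1; how the curve is extended toward zero recall is a convention that affects the value.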

Usage

tdPRC(
  Y,
  M,
  censor,
  t,
  cut = NULL,
  len = 1000,
  h = 0.1,
  ktype = "gaussian",
  B = 0,
  alpha = 0.05,
  plot = FALSE
)

Arguments

Y

A numeric vector of observed times (event or censoring times).

M

The numeric vector of marker values.

censor

The censoring indicator, 1 if event, 0 otherwise.

t

A scalar time point at which the precision-recall curve is calculated.

cut

A grid of cutoff values at which the curves are estimated. The default is NULL, in which case a sequence of len values between 0 and 1 is used.

len

The number of grid points. The default is 1000 (see Usage).

h

A scalar bandwidth value for Beran's weight calculations. The default is 0.1 (see Usage).

ktype

A character string specifying the kernel used for the Beran weight calculation. The possible options are "normal", "epanechnikov", "tricube", "boxcar", "triangular", or "quartic". The default is the "gaussian" kernel.

B

The number of bootstrap samples used for variance estimation. The default is 0 (no variance estimation).

alpha

The significance level. The default is 0.05.

plot

A logical value indicating whether to plot the estimated curve. The default is FALSE.
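To show how B and alpha typically enter a nonparametric bootstrap, the sketch below resamples subjects with replacement, recomputes a statistic on each resample, and reports a standard error with normal-approximation and percentile intervals. The helpers boot_ci and weighted_auc and the simulated data are hypothetical; which interval the package reports is not stated here, and a faithful bootstrap would also re-estimate the case probabilities within each resample.

```r
# Illustrative statistic: probability-weighted concordance form of the AUC
weighted_auc <- function(d) {
  gt <- outer(d$M, d$M, ">") + 0.5 * outer(d$M, d$M, "==")
  sum(outer(d$pos, 1 - d$pos) * gt) / (sum(d$pos) * sum(1 - d$pos))
}

boot_ci <- function(data, statistic, B = 200, alpha = 0.05) {
  est  <- statistic(data)
  reps <- replicate(B, {
    idx <- sample(nrow(data), replace = TRUE)    # resample subjects
    statistic(data[idx, , drop = FALSE])
  })
  se <- sd(reps)                                 # bootstrap standard error
  z  <- qnorm(1 - alpha / 2)
  list(estimate   = est,
       se         = se,
       normal     = c(est - z * se, est + z * se),
       percentile = unname(quantile(reps, c(alpha / 2, 1 - alpha / 2))))
}

set.seed(2024)
toy     <- data.frame(M = rnorm(100))            # hypothetical marker
toy$pos <- plogis(2 * toy$M)                     # hypothetical case probabilities
res     <- boot_ci(toy, weighted_auc, B = 200, alpha = 0.05)
res$se
res$percentile
```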

Value

Returns the following items:

TPR

vector of estimated TPR.

FPR

vector of estimated FPR.

PPV

vector of estimated PPV.

AUPRC

estimated area under the PR curve at a given time horizon t.

AUC

estimated area under the ROC curve at a given time horizon t.

APbot

estimated area under the PR curve for each bootstrap sample at a given time horizon t.

dat

a data frame with two columns: po = positive and M = marker.

References

Beyene, K.M., Chen, D.G., and Kifle, Y.G. (2024). A novel nonparametric time‐dependent precision-recall curve estimator for right‐censored survival data. Biometrical Journal, 66(3), 2300135.

Beyene, K.M. and El Ghouch, A. (2020). Smoothed time-dependent receiver operating characteristic curve for right censored survival data. Statistics in Medicine, 39: 3373-3396.

Li, L., Greene, T. and Hu, B. (2016). A simple method to estimate the time-dependent receiver operating characteristic curve and the area under the curve with right censored data. Statistical Methods in Medical Research, 27(8): 2264-2278.

Examples

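library(tdPRC)
data(mayo)

# Observed time, censoring indicator, and the five-covariate marker
data <- mayo[, c("time", "censor", "mayoscore5")]
t <- 365 * 6  # six-year horizon, assuming time is recorded in days

# Estimate the curves at time t and draw the plot
resu <- tdPRC(Y = data$time, M = data$mayoscore5, censor = data$censor, t = t,
              cut = NULL, len = 1000, h = 0.1, plot = TRUE)
resu$AUPRC  # estimated area under the PR curve at t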