Binary outcome variables

Johannes Cepicka

Suppose we are planning a drug development program testing the superiority of an experimental treatment over a control treatment. Our drug development program consists of an exploratory phase II trial which is, in case of promising results, followed by a confirmatory phase III trial.

The drugdevelopR package enables us to optimally plan such programs using a utility-maximizing approach. For a brief introduction, see the very basic example of how the package works in Introduction to planning phase II and phase III trials with drugdevelopR. In the introduction, the observed outcome variable “tumor growth” was normally distributed. However, the drugdevelopR package is not restricted to normally distributed outcome variables; it also supports binary and time-to-event outcome variables. In this article, we explain how the setting with binary outcome variables works.

The example setting

Suppose we are developing a new treatment, exper, to prevent strokes. The patient variable that we want to investigate is whether the patient experienced a stroke (unfavorable outcome with value 1) or not (favorable outcome with value 0) over a pre-defined observation period. This is a binary outcome variable.

Within our drug development program, we will compare our experimental treatment exper to the control treatment contro. The treatment effect measure is given by \(\rho = -\log(RR)\), which is the negative logarithm of the risk ratio (relative risk) \(RR = \frac{p_1}{p_0}\). Here, \(p_1\) is the failure probability of the experimental treatment (i.e. \(\mathbb{P}(X_i = 1 \mid \text{experimental})\), the probability that patient \(i\) has a stroke) and analogously \(p_0\) is the failure probability of the control treatment.
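For the rates used later in this example (0.5 in the control group and 0.3 in the experimental group), the effect measure can be computed by hand. The following base R sketch does this; the variable names are chosen for illustration only and are not part of the package.

```r
# Assumed true rates of the unfavorable outcome (values from this example)
p0 <- 0.5  # control treatment
p1 <- 0.3  # experimental treatment

RR  <- p1 / p0   # risk ratio; values below 1 favor the experimental treatment
rho <- -log(RR)  # treatment effect measure; positive when RR < 1

RR
rho
```

A risk ratio of 0.6 corresponds to \(\rho \approx 0.51\). The sign convention means that a larger \(\rho\) indicates a larger benefit of the experimental treatment.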

Applying the package to the example

After installing the package according to the installation instructions, we can load it using the following code:

library(drugdevelopR)

Defining all necessary parameters

In order to apply the package to the setting from our example, we need to specify the following parameters:

As in the setting with normally distributed outcomes, the treatment effect may be fixed (as in this example) or may follow a prior distribution. Furthermore, all options to adapt the program to your specific needs are also available in this setting.

Now that we have defined all parameters needed for our example, we are ready to feed them to the package. We will use the function optimal_binary(), which calculates the optimal sample size and the optimal threshold value for a binary distributed outcome variable.

res <- optimal_binary(p0 = 0.5, p11 = 0.3,                # assumed true rates of the unfavorable outcome (control, treatment)
   n2min = 20, n2max = 400, stepn2 = 4,                   # define optimization set for n2
   rrgomin = 0.7, rrgomax = 0.9, steprrgo = 0.01,         # define optimization set for RRgo
   alpha = 0.025, beta = 0.1,                             # drug development planning parameters
   c2 = 0.75, c3 = 1, c02 = 100, c03 = 150,               # define fixed and variable costs for phase II and III
   K = Inf, N = Inf, S = -Inf,                            # constraints (no constraints are active here)
   steps1 = 1, stepm1 = 0.95, stepl1 = 0.85,              # treatment effect size categories as proposed by IQWiG (2016)
   b1 = 1000, b2 = 2000, b3 = 3000,                       # expected benefit categories
   w = 0.3, p12 = 0.5, in1 = 30, in2 = 60,                # prior (https://web.imbi.uni-heidelberg.de/prior/)
   gamma = 0,                                             # population structures in phase II and III
   fixed = TRUE,                                          # choose if true treatment effects are fixed or random
   skipII = FALSE,                                        # choose if skipping phase II would be an option
   num_cl = 1)                                            # number of cores for parallelized computing
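The arguments n2min, n2max, stepn2 and rrgomin, rrgomax, steprrgo define the grid over which the optimization runs. As a rough illustration of its size, the grid can be rebuilt in base R, assuming the package steps through both ranges in the same way seq() does. The variable names below are illustrative and not part of the package.

```r
# Candidate phase II sample sizes and go thresholds, rebuilt from the call above
n2_grid   <- seq(20, 400, by = 4)
rrgo_grid <- seq(0.7, 0.9, by = 0.01)

length(n2_grid)                      # number of candidate sample sizes
length(rrgo_grid)                    # number of candidate thresholds
length(n2_grid) * length(rrgo_grid)  # number of (n2, RRgo) combinations
```

Under this assumption there are 96 candidate sample sizes and 21 candidate thresholds, so 2016 combinations in total. Finer steps make the optimum more precise but increase the number of utility evaluations accordingly.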

Interpreting the output

After setting all these input parameters and running the function, let’s take a look at the output of the program.

res
#> Optimization result:
#>  Utility: 1399.56
#>  Sample size:
#>    phase II: 260, phase III: 346, total: 606
#>  Probability to go to phase III: 0.99
#>  Total cost:
#>    phase II: 295, phase III: 494, cost constraint: Inf
#>  Fixed cost:
#>    phase II: 100, phase III: 150
#>  Variable cost per patient:
#>    phase II: 0.75, phase III: 1
#>  Effect size categories (expected gains):
#>   small: 1 (1000), medium: 0.95 (2000), large: 0.85 (3000)
#>  Success probability: 0.83
#>  Joint probability of success and observed effect of size ... in phase III:
#>    small: 0.06, medium: 0.18, large: 0.59
#>  Significance level: 0.025
#>  Targeted power: 0.9
#>  Decision rule threshold: 0.86 [RRgo] 
#>  Assumed true effect:
#>   rate in the control: 0.5, rate in the treatment group: 0.3
#>  Treatment effect offset between phase II and III: 0 [gamma]
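Several of the printed numbers can be cross-checked by hand, which helps to see how they relate to each other. The base R sketch below recomputes the phase II cost, the total sample size, and the success probability from the rounded output, and approximates the expected utility as expected gains minus total costs. It also approximates the probability of going to phase III with a normal approximation for the log risk ratio, assuming equal allocation to both arms in phase II and assuming that the program proceeds when the observed risk ratio falls below RRgo. These are back-of-the-envelope calculations with illustrative variable names, not the package's internal formulas, so small deviations from the printed values are expected.

```r
# Values copied from the function call and the printed output above
n2 <- 260; n3 <- 346
c02 <- 100; c2 <- 0.75
cost2 <- 295; cost3 <- 494
gains <- c(small = 1000, medium = 2000, large = 3000)
p_cat <- c(small = 0.06, medium = 0.18, large = 0.59)

# Phase II cost: fixed cost plus per-patient cost times sample size
cost2_check <- c02 + c2 * n2

# Total sample size across both phases
n_total <- n2 + n3

# The joint probabilities per effect size category add up to the
# overall success probability
p_success <- sum(p_cat)

# Expected utility, approximated from rounded values:
# expected gains minus total costs
utility_approx <- sum(gains * p_cat) - cost2 - cost3

# Probability to go to phase III, normal approximation on the log scale
# (assumption: n2 / 2 patients per arm, go if observed RR < RRgo)
p0 <- 0.5; p1 <- 0.3; RRgo <- 0.86
n_arm <- n2 / 2
se    <- sqrt((1 - p1) / (p1 * n_arm) + (1 - p0) / (p0 * n_arm))
p_go  <- pnorm((-log(p1 / p0) - (-log(RRgo))) / se)

c(cost2 = cost2_check, n_total = n_total, p_success = p_success,
  utility = utility_approx, p_go = round(p_go, 2))
```

The phase II cost (295), the total sample size (606), and the success probability (0.83) agree with the printed output. The approximate utility of 1401 is close to the reported 1399.56; the gap comes from working with rounded probabilities and costs. The approximate go probability rounds to the reported 0.99.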

The program returns a total of thirteen values and the input values. For now, we will only look at the most important ones:

Where to go from here

In this article, we introduced the setting in which the outcome variable is binary. For more information on how to use the package, see: