library(CLRtools)
library(dplyr)
library(dagitty)
library(ggdag)
library(ggplot2)
library(rstan)

In the previous vignettes, we applied the Purposeful Selection Method to the GLOW500 dataset to build a parsimonious multivariable logistic regression model for fracture risk. Four variables were excluded during this process because they did not meet the statistical significance criteria at various stages of the selection steps.
In this follow-up analysis, we examine one of those excluded variables, weight, from a different perspective. Rather than focusing on statistical significance for model building, we approach the question from a causal inference angle:
Does this variable have a causal effect on fracture risk?
To answer this, we:
- construct a causal diagram (DAG) encoding our assumptions about the GLOW500 variables;
- use the DAG to identify adjustment sets for the direct and total effects of weight on fracture;
- fit Bayesian logistic regression models in Stan for each effect; and
- check the priors, posterior predictions, and sampling diagnostics.
This approach shows how causal diagrams and Bayesian methods can give a different perspective from frequentist model selection. It also helps to check whether variables dropped by statistical criteria may still be important from a causal point of view. Posterior predictive checks and effective sample size summaries assist in interpreting the results and assessing the robustness of the causal conclusions. The package CLRtools provides several functions for these diagnostics, which will be used in the analysis below.
The GLOW500 dataset includes several variables related to fracture risk in women over 50. The variables used in this analysis include:
- age: age at enrollment (years)
- weight: weight (kg)
- height: height (cm)
- bmi: body mass index (kg/m²)
- priorfrac: history of prior fracture (Yes/No)
- premeno: menopause before age 45 (Yes/No)
- momfrac: mother had a hip fracture (Yes/No)
- armassist: arms are needed to stand from a chair (Yes/No)
- smoke: former or current smoker (Yes/No)
- raterisk: self-reported risk of fracture relative to women of the same age
- fracture: any fracture in the first year of follow-up (the outcome)
To better understand the relationships between these variables and fracture risk, we constructed a causal diagram, or directed acyclic graph (DAG), based on our knowledge of the factors involved. The DAG visually represents assumed causal connections among the variables in the GLOW500 dataset.
In this graph, age influences weight, height, smoking status, prior fracture history (priorfrac), menopause status (premeno), and the fracture risk score (raterisk), which in turn affects the outcome (fracture). Height affects weight and body mass index (BMI), while weight also influences BMI. BMI is connected to arm assistance and directly to fracture. This diagram is a simplified representation intended to guide the causal analysis and identify which variables to adjust for when estimating the effects of BMI and weight. It is not an exhaustive or definitive model but a starting point based on the available knowledge.
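As an illustration, this DAG might be encoded with the dagitty package as in the sketch below, which also defines the dag object used by adjustmentSets() later. This is a sketch only: the edges into fracture from priorfrac, premeno, smoke, raterisk, and momfrac, as well as the weight → fracture edge, are assumptions we add here so that the adjustment sets reported below can be reproduced; the actual DAG may differ in detail.
# Encode the assumed causal structure as a dagitty object
dag <- dagitty("dag {
  age -> weight
  age -> height
  age -> smoke
  age -> priorfrac
  age -> premeno
  age -> raterisk
  height -> weight
  height -> bmi
  weight -> bmi
  weight -> fracture
  bmi -> armassist
  bmi -> fracture
  priorfrac -> fracture
  premeno -> fracture
  smoke -> fracture
  raterisk -> fracture
  momfrac -> fracture
}")
# Visualize the assumed causal structure
ggdag(dag) + theme_dag()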
To estimate the effects of weight on fracture risk while accounting for confounding, we first identified the appropriate adjustment sets based on the causal DAG. Using functions from the dagitty and ggdag packages, we determined which variables to control for when estimating the total and direct effects of weight.
This step is essential because adjusting for the correct variables helps avoid biased estimates caused by confounding, collider effects, or inappropriate adjustment for mediators.
After establishing the adjustment sets, we proceeded to fit Bayesian logistic regression models. For each model, we specified prior distributions and used Hamiltonian Monte Carlo (HMC) sampling, as implemented in Stan, to efficiently draw samples from the posterior distributions. This approach lets us better understand the uncertainty around the effects and express the likelihood of different results.
The following sections describe the adjustment set identification and Bayesian modeling steps for weight in detail.
Before fitting the Bayesian models, we prepared the data to ensure consistent scaling and proper format for Stan. Variables with “Yes” or “No” responses were converted to numeric indicators. Continuous variables such as age, weight, height, and BMI were standardized to have mean zero and standard deviation one. This standardization helps prevent any one variable from dominating the model due to scale differences.
Next, we selected the relevant variables for the analysis and converted the dataset into a named list format, as required by Stan for input. This list includes the number of observations and all predictors and outcome variables needed for the model.
# Load the GLOW500 dataset
data("glow500")
# Convert "Yes"/"No" variables to numeric (1 for "Yes", 0 for "No")
glow500[] <- lapply(glow500, function(x) {
if (all(x %in% c("Yes", "No"))) {
return(as.numeric(x == "Yes"))
} else {
return(x)
}
})
# Standardize continuous variables to mean zero and standard deviation one
glow500[, c('age','weight','height', 'bmi')] <- scale(glow500[, c('age','weight','height', 'bmi')])
# Convert data frame subset to a named list for Stan
glow500_subset <- glow500[ , c('age','weight','height','bmi','priorfrac','premeno','momfrac','armassist','smoke','raterisk', 'fracture')]
glow500_list <- as.list(glow500_subset)
# Add total number of observations as 'N'
glow500_list$N <- nrow(glow500_subset)

The key question in this analysis is: Is there an effect of weight on fracture risk? To ensure accurate estimates, we first identified the variables that need adjustment using the causal DAG. This helps control for confounding and reduces bias. We used the adjustmentSets() function from the dagitty package to find these sets.
When we study how one variable affects another, we consider two kinds of effects: the total effect, which captures the exposure's influence through all causal paths, including those that pass through mediators, and the direct effect, which captures only the influence that does not pass through mediators.
Because these two effects involve different paths, we need to adjust for different variables in each case. For the direct effect, we adjust for confounders and also for mediators to isolate the direct link. For the total effect, we adjust only for confounders, so we capture the full effect including indirect paths.
# Find adjustment sets for estimating the direct effect of weight on fracture
adjustmentSets(dag, exposure = "weight", outcome = "fracture", effect = "direct")
#> { bmi, premeno, priorfrac, raterisk, smoke }
#> { age, bmi }
# Find adjustment sets for estimating the total effect of weight on fracture
adjustmentSets(dag, exposure = "weight", outcome = "fracture")
#> { height, premeno, priorfrac, raterisk, smoke }
#> { age, height }

Using the adjustmentSets() function, we identified the sets of variables to adjust for when estimating the direct and total effects of weight on fracture risk.
For the direct effect, the adjustment sets included:
- { age, bmi }
- { bmi, premeno, priorfrac, raterisk, smoke }
For the total effect, the adjustment sets included:
- { age, height }
- { height, premeno, priorfrac, raterisk, smoke }
For both effects, we chose the simpler set ({ age, bmi } for the direct effect and { age, height } for the total effect) because it is easier to interpret and avoids unnecessary complexity.
We begin by estimating the direct effect of weight on fracture risk, adjusting for age and BMI as identified earlier. Because BMI acts as a mediator between weight and fracture, the direct effect model consists of two linked parts: a linear model for the mediator (BMI as a function of weight) and a logistic model for the outcome (fracture as a function of age, BMI, and weight):
\[ BMI_i \sim \sf Normal(\mu_i, \sigma)\] \[\mu_i = \alpha_{BM} + \beta_{BM} weight_i\]
\[ fracture_i \sim \sf Binomial(1, p_i)\] \[\text{logit}(p_i)=\alpha+ \beta_{age} age_i+ \beta_{BMI} BMI_i+ \beta_{weight} weight_i\]
Before fitting the model to the observed data, we first performed prior predictive checks to ensure that our chosen priors produce plausible values for both the mediator (BMI) and the outcome (fracture probability). The idea is to sample from the model using only the priors, without using the actual data, and verify that the simulated patterns are reasonable given the scientific context. This step helps confirm that our chosen priors are neither overly restrictive nor unrealistically wide before proceeding to the data analysis.
direct_glow_prior <- "
data {
int<lower=0> N; // Number of observations for simulation
vector[N] age; // Vector of age values
vector[N] weight; // Vector of weight values
}
generated quantities {
// Mediator model parameters
real aBM = normal_rng(0, 0.7); // Intercept for BMI model
real bBM = normal_rng(0, 0.7); // Slope for weight in BMI model
real<lower=0> sigma = exponential_rng(1); // Standard deviation for BMI residuals
// Fracture model parameters
real a = normal_rng(0, 0.7); // Intercept for fracture model
real bAGE = normal_rng(0, 0.7); // Coefficient for age
real bBMI = normal_rng(0, 0.7); // Coefficient for BMI
real bWEIGHT = normal_rng(0, 0.7);// Coefficient for weight
vector[N] mu_BMI; // Predicted BMI values from mediator model
vector[N] p; // Predicted fracture probabilities from outcome model
// Loop over all observations to generate predicted BMI and fracture probabilities
for (i in 1:N) {
mu_BMI[i] = aBM + bBM * weight[i]; // BMI predicted by weight
p[i] = inv_logit(a + bAGE * age[i] + bBMI * mu_BMI[i] + bWEIGHT * weight[i]); // Fracture probability
}
}
"
N <- 1000
weight_seq <- seq(min(glow500_list$weight), max(glow500_list$weight), length.out = N) # Create sequence of weight values
age_seq <- seq(min(glow500_list$age), max(glow500_list$age), length.out = N) # Create sequence of age values
# Run Stan prior predictive simulation with fixed parameters (no data fitting)
direct_prior <- stan(
model_code = direct_glow_prior,
data = list(N = N, weight = weight_seq, age = age_seq),
chains = 4,
iter = 2000,
seed = 1000,
algorithm = 'Fixed_param'
)
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
#> Chain 1: Iteration: 1 / 2000 [ 0%] (Sampling)
#> Chain 1: Iteration: 200 / 2000 [ 10%] (Sampling)
#> Chain 1: Iteration: 400 / 2000 [ 20%] (Sampling)
#> Chain 1: Iteration: 600 / 2000 [ 30%] (Sampling)
#> Chain 1: Iteration: 800 / 2000 [ 40%] (Sampling)
#> Chain 1: Iteration: 1000 / 2000 [ 50%] (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0 seconds (Warm-up)
#> Chain 1: 0.143 seconds (Sampling)
#> Chain 1: 0.143 seconds (Total)
#> Chain 1:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
#> Chain 2: Iteration: 1 / 2000 [ 0%] (Sampling)
#> Chain 2: Iteration: 200 / 2000 [ 10%] (Sampling)
#> Chain 2: Iteration: 400 / 2000 [ 20%] (Sampling)
#> Chain 2: Iteration: 600 / 2000 [ 30%] (Sampling)
#> Chain 2: Iteration: 800 / 2000 [ 40%] (Sampling)
#> Chain 2: Iteration: 1000 / 2000 [ 50%] (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2:
#> Chain 2: Elapsed Time: 0 seconds (Warm-up)
#> Chain 2: 0.148 seconds (Sampling)
#> Chain 2: 0.148 seconds (Total)
#> Chain 2:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
#> Chain 3: Iteration: 1 / 2000 [ 0%] (Sampling)
#> Chain 3: Iteration: 200 / 2000 [ 10%] (Sampling)
#> Chain 3: Iteration: 400 / 2000 [ 20%] (Sampling)
#> Chain 3: Iteration: 600 / 2000 [ 30%] (Sampling)
#> Chain 3: Iteration: 800 / 2000 [ 40%] (Sampling)
#> Chain 3: Iteration: 1000 / 2000 [ 50%] (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 3:
#> Chain 3: Elapsed Time: 0 seconds (Warm-up)
#> Chain 3: 0.142 seconds (Sampling)
#> Chain 3: 0.142 seconds (Total)
#> Chain 3:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
#> Chain 4: Iteration: 1 / 2000 [ 0%] (Sampling)
#> Chain 4: Iteration: 200 / 2000 [ 10%] (Sampling)
#> Chain 4: Iteration: 400 / 2000 [ 20%] (Sampling)
#> Chain 4: Iteration: 600 / 2000 [ 30%] (Sampling)
#> Chain 4: Iteration: 800 / 2000 [ 40%] (Sampling)
#> Chain 4: Iteration: 1000 / 2000 [ 50%] (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 4:
#> Chain 4: Elapsed Time: 0 seconds (Warm-up)
#> Chain 4: 0.142 seconds (Sampling)
#> Chain 4: 0.142 seconds (Total)
#> Chain 4:

The first plot shows prior predictive lines from the mediator model, in which BMI and weight are standardized. Each line, drawn from the Normal(0, 0.7) priors on the intercept and slope, represents a possible relationship between weight and BMI before seeing any data. Most lines are relatively flat, but both positive and negative trends are allowed.
The second plot shows prior predictive curves for fracture probability as a function of standardized weight, from a Bayesian outcome model including weight, age, and BMI. Coefficients have Normal(0, 0.7) priors, producing a wide range of curves that reflect weak assumptions and allow for positive or negative associations. This ensures flexibility and plausible probabilities between zero and one.
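These two plots can be reproduced from the prior simulation with code along the following lines (a sketch mirroring the base-R plotting approach used later for the total effect model; mu_BMI and p are the quantities produced by the generated quantities block above):
# Extract prior predictive draws for the mediator and outcome models
mu_BMI_sim <- rstan::extract(direct_prior)$mu_BMI
p_sim_direct <- rstan::extract(direct_prior)$p
# Plot prior predictive BMI lines against weight (both standardized)
plot(NULL, xlim = range(weight_seq), ylim = c(-3, 3), xlab = "weight", ylab = "BMI")
for (i in 1:100) {
  lines(weight_seq, mu_BMI_sim[i, ], col = alpha("black", 0.1))
}
# Plot prior predictive fracture probability curves against weight
plot(NULL, xlim = range(weight_seq), ylim = c(0, 1), xlab = "weight", ylab = "Probability")
for (i in 1:100) {
  lines(weight_seq, p_sim_direct[i, ], col = alpha("black", 0.1))
}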
After verifying that the chosen priors produce reasonable predictions, we proceed to fit the full model to the observed data to estimate the direct effect of weight on fracture risk. The following Bayesian regression estimates the direct effect of weight on fracture risk while accounting for age as a confounder and BMI as a mediator.
direct_glow <- "
data {
int<lower=0> N; // Number of observations
vector[N] age; // Age variable
vector[N] bmi; // Body Mass Index (mediator)
vector[N] weight; // Weight (exposure)
int<lower=0,upper=1> fracture[N]; // Fracture outcome (binary)
}
parameters {
// Parameters for outcome model (fracture as outcome)
real a; // Intercept for outcome model
real bWEIGHT; // Effect of weight on fracture
real bAGE; // Effect of age on fracture
real bBMI; // Effect of BMI on fracture
// Parameters for mediator model (BMI as outcome)
real aBM; // Intercept for mediator model
real bBM; // Effect of weight on BMI
real<lower=0> sigma; // Standard deviation of BMI residuals
}
model {
vector[N] p; // Probability of fracture for each observation
vector[N] mu_bmi; // Predicted BMI mean for each observation
// Priors for mediator model parameters
aBM ~ normal(0, 0.7);
bBM ~ normal(0, 0.7);
sigma ~ exponential(1);
// Calculate predicted BMI means for all observations based on weight
for (i in 1:N) {
mu_bmi[i] = aBM + bBM * weight[i];
}
// Likelihood for mediator: observed BMI assumed normal around predicted mean
bmi ~ normal(mu_bmi, sigma);
// Priors for outcome model parameters
a ~ normal(0, 0.7);
bAGE ~ normal(0, 0.7);
bBMI ~ normal(0, 0.7);
bWEIGHT ~ normal(0, 0.7);
// Calculate fracture probabilities for all observations
for (i in 1:N) {
p[i] = inv_logit(a + bAGE * age[i] + bBMI * bmi[i] + bWEIGHT * weight[i]);
}
// Likelihood for outcome: fracture modeled as Bernoulli with probability p
fracture ~ binomial(1, p);
}
generated quantities {
vector[N] mu_BMI; // Predicted BMI means for posterior predictive checks
vector[N] p; // Predicted fracture probabilities
vector[N] log_lik; // Log likelihood for each observation (for model comparison)
int fracture_pred[N]; // Simulated fracture outcomes from posterior predictive distribution
for (i in 1:N) {
mu_BMI[i] = aBM + bBM * weight[i];
p[i] = inv_logit(a + bAGE * age[i] + bBMI * bmi[i] + bWEIGHT * weight[i]);
log_lik[i] = binomial_lpmf(fracture[i] | 1, p[i]);
// Simulate fracture outcome based on predicted probability
fracture_pred[i] = bernoulli_rng(p[i]);
}
}
"
# Fit the Bayesian model for the direct effect using Stan
fit_direct_glow <- stan(
model_code = direct_glow,
data = glow500_list,
chains = 4,
iter = 2000,
warmup = 1000,
seed = 1000
)
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 0.000125 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.25 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 1.125 seconds (Warm-up)
#> Chain 1: 1.109 seconds (Sampling)
#> Chain 1: 2.234 seconds (Total)
#> Chain 1:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
#> Chain 2:
#> Chain 2: Gradient evaluation took 8.3e-05 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.83 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2:
#> Chain 2:
#> Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2:
#> Chain 2: Elapsed Time: 1.13 seconds (Warm-up)
#> Chain 2: 1.072 seconds (Sampling)
#> Chain 2: 2.202 seconds (Total)
#> Chain 2:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
#> Chain 3:
#> Chain 3: Gradient evaluation took 8.7e-05 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.87 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3:
#> Chain 3:
#> Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 3:
#> Chain 3: Elapsed Time: 1.143 seconds (Warm-up)
#> Chain 3: 1.064 seconds (Sampling)
#> Chain 3: 2.207 seconds (Total)
#> Chain 3:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
#> Chain 4:
#> Chain 4: Gradient evaluation took 7.2e-05 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.72 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4:
#> Chain 4:
#> Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 4:
#> Chain 4: Elapsed Time: 1.116 seconds (Warm-up)
#> Chain 4: 1.109 seconds (Sampling)
#> Chain 4: 2.225 seconds (Total)
#> Chain 4:

To analyze the Bayesian model results, we use the summarize_results() function from the CLRtools package. This function provides a summary of the model output along with visualizations that help us understand the posterior distributions and the model fit.
The first plot shows the distribution of possible values for each factor's effect, such as \(\alpha\), \(\beta_{weight}\), \(\beta_{age}\), and so on. Because the Bayesian model produces many simulated values for each parameter, this chart uses density areas to display the range of likely values for each effect.
The second plot compares the average fracture risk predicted by the model with the actual average fracture rate observed in the data. The actual average is calculated directly from the data by computing the mean of the fracture variable and is shown as a red line. The model’s predictions come from simulating many datasets using the estimated coefficients, and the plot shows how the average and standard deviation of these simulated outcomes compare to the real data. This helps us understand whether the model’s predictions match the true fracture risk and its natural variability.
This function allows the user to provide their own matrix of posterior predictive samples through the ypredict argument, as shown below. It can also calculate the posterior samples internally, but in that case the model is treated as a simple logistic regression without mediators.
# Extracting posterior predictive samples
O_rep <- rstan::extract(fit_direct_glow)$fracture_pred
summarize_results(
model = fit_direct_glow,
ypredict = O_rep,
data = glow500_subset,
outcome = 'fracture',
var.param = c('a' = 'a', 'age' = 'bAGE', 'bmi' = 'bBMI', 'weight' = 'bWEIGHT', 'aBM' = 'aBM', 'bBM' = 'bBM'),
rounding = 6,
prob = 0.89,
point.est = 'median'
)
#> Inference for Stan model: anon_model.
#> 4 chains, each with iter=2000; warmup=1000; thin=1;
#> post-warmup draws per chain=1000, total post-warmup draws=4000.
#>
#> mean se_mean sd 2.5% 25% 50% 75%
#> a -1.145313 0.001377 0.107440 -1.359067 -1.216469 -1.142647 -1.073389
#> bAGE 0.462957 0.001424 0.108347 0.249313 0.387805 0.463862 0.535752
#> bBMI 0.621122 0.004652 0.257882 0.133834 0.447237 0.615587 0.790412
#> bWEIGHT -0.525299 0.004840 0.264699 -1.065047 -0.698595 -0.520909 -0.345535
#> aBM -0.000310 0.000220 0.015779 -0.031481 -0.011325 -0.000196 0.010406
#> bBM 0.937061 0.000226 0.016052 0.905269 0.926461 0.937233 0.947628
#> 97.5% n_eff Rhat
#> a -0.942391 6091 1.000105
#> bAGE 0.680765 5792 0.999704
#> bBMI 1.142266 3073 1.000184
#> bWEIGHT -0.023397 2992 1.000221
#> aBM 0.030424 5153 0.999718
#> bBM 0.968636 5067 1.000274
#>
#> Samples were drawn using NUTS(diag_e) at Mon Jan 26 09:30:19 2026.
#> For each parameter, n_eff is a crude measure of effective sample size,
#> and Rhat is the potential scale reduction factor on split chains (at
#> convergence, Rhat=1).
#> Note: in most cases the default test statistic 'mean' is too weak to detect anything of interest.
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.

All variables were standardized, so the reported effects correspond to a one standard deviation increase. The coefficient for weight was −0.53 (OR ≈ 0.59, 95% credible interval 0.34 to 0.98), indicating that, among people of the same age and BMI, higher weight was linked to lower odds of fracture. This represents the direct effect of weight, not the total effect: it captures the influence of weight on fracture that does not operate through BMI. Notably, the 95% credible interval for the odds ratio excludes 1, so most of the posterior probability supports a genuinely negative direct effect.
BMI showed a positive association with fracture (OR ≈ 1.86, 95% credible interval 1.14 to 3.13) when controlling for weight and age, and age also had a positive association with fracture at fixed BMI and weight. However, the coefficients for BMI and age reflect conditional associations rather than total causal effects and should not be interpreted as direct effects to avoid the Table 2 fallacy.
In the mediator model, weight was a strong predictor of BMI (coefficient 0.94, 95% credible interval 0.91 to 0.97), meaning that BMI increased by nearly one standard deviation for each one standard deviation increase in weight.
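As a quick check, the odds ratios quoted above can be recovered from the posterior draws; the sketch below uses rstan::extract() to pull the post-warmup samples and exponentiates the logistic coefficients:
# Summarize the logistic coefficients on the odds-ratio scale
post_direct <- rstan::extract(fit_direct_glow)
round(quantile(exp(post_direct$bWEIGHT), probs = c(0.025, 0.5, 0.975)), 2) # direct effect of weight
round(quantile(exp(post_direct$bBMI), probs = c(0.025, 0.5, 0.975)), 2)    # conditional association of BMI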
The posterior predictive checks show that the model fits the observed data well in terms of both the mean and the standard deviation. In the left panel, the simulated means from the model closely match the observed mean, which lies clearly within the range of simulated values. Similarly, the right panel demonstrates that the observed standard deviation aligns well with the distribution of simulated standard deviations. These findings suggest that the model can capture important features of the data (average value and variability).
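The same comparison can be made numerically: the observed fracture rate should fall comfortably inside the spread of the simulated means. A minimal sketch, reusing the O_rep draws extracted above:
# Compare the observed fracture rate with the simulated posterior predictive means
obs_mean <- mean(glow500_subset$fracture)
sim_means <- rowMeans(O_rep) # one mean per posterior draw
c(observed = obs_mean,
  lower = unname(quantile(sim_means, 0.025)),
  upper = unname(quantile(sim_means, 0.975)))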
Additionally, the Bayesian model needs to be checked to ensure it is reliable and to determine if more iterations, better priors, or model adjustments are necessary. The diagnostic_bayes() function from the CLRtools package produces several key diagnostic plots including Rhat statistics, Effective Sample Size (ESS), trace plots, and rank plots. Rhat statistics measure how well the Markov chains have mixed and converged, with values close to 1 indicating good convergence and values below 1.01 preferred. ESS indicates how many independent samples the MCMC draws represent, reflecting sampling efficiency. Trace plots display the sampled parameter values across iterations, helping assess chain mixing and exploration of the parameter space, while rank plots detect whether the sampler explores the parameter space evenly.
diagnostic_bayes(
model = fit_direct_glow,
var.param = c('a', 'bAGE', 'bBMI', 'bWEIGHT', 'aBM', 'bBM')
)
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
Model diagnostics indicate good convergence and sampling quality. The Rhat statistics were all below 1.002, suggesting the chains mixed well. The ratios of effective sample size to total draws \((N_{eff}/N)\) all exceeded 0.05, and the effective sample sizes themselves were mostly between 3,000 and 6,000, indicating sufficient effective sampling. Trace and rank plots showed strong overlap across all four chains for all coefficients, supporting stable and reliable posterior estimates.
Next, we turn to estimating the total effect of weight on fracture risk. Unlike the direct effect, which holds BMI constant to isolate weight’s influence independent of BMI, the total effect captures the overall impact of weight on fracture, including both direct pathways and those mediated through BMI. Estimating the total effect allows us to understand the full influence of weight on fracture risk without adjusting for intermediate variables like BMI. This provides a more complete picture of how weight affects fracture, through all paths.
For estimating the total effect of weight on fracture risk, we use the adjustment set identified from the causal DAG, which includes age and height. The model is:
\[ fracture_i \sim \sf Binomial (1, p_i)\] \[\text{logit}(p_i)=\alpha+ \beta_{age} age_i+ \beta_{height} height_i+ \beta_{weight} weight_i\]
For the total effect model, we similarly conducted prior predictive checks to validate that the specified priors yield plausible values for fracture risk when adjusting for the new set of covariates. This step ensures that the model’s assumptions remain appropriate and that the priors support realistic outcome patterns before incorporating the observed data.
total_glow_prior <- "
data {
int<lower=0> N; // Number of observations for simulation
vector[N] age; // Vector of age values
vector[N] weight; // Vector of weight values
vector[N] height; // Vector of height values
}
generated quantities {
// Fracture model parameters
real a = normal_rng(0, 0.7); // Intercept for fracture model
real bAGE = normal_rng(0, 0.7); // Coefficient for age
real bHEIGHT = normal_rng(0, 0.7); // Coefficient for height
real bWEIGHT = normal_rng(0, 0.7); // Coefficient for weight
vector[N] p;
// Loop over all observations to generate fracture probabilities
for (i in 1:N) {
p[i] = inv_logit(a + bAGE * age[i] + bHEIGHT * height[i] + bWEIGHT * weight[i]);
}
}
"
N <- 1000
height_seq <- seq(min(glow500_list$height), max(glow500_list$height), length.out = N) # Create sequence of height values
weight_seq <- seq(min(glow500_list$weight), max(glow500_list$weight), length.out = N) # Create sequence of weight values
age_seq <- seq(min(glow500_list$age), max(glow500_list$age), length.out = N) # Create sequence of age values
# Run Stan prior predictive simulation with fixed parameters (no data fitting)
total_prior <- stan(
model_code = total_glow_prior,
data = list(N = N, height = height_seq, weight = weight_seq, age = age_seq),
chains = 4,
iter = 2000,
seed = 1000,
algorithm = 'Fixed_param'
)
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
#> Chain 1: Iteration: 1 / 2000 [ 0%] (Sampling)
#> Chain 1: Iteration: 200 / 2000 [ 10%] (Sampling)
#> Chain 1: Iteration: 400 / 2000 [ 20%] (Sampling)
#> Chain 1: Iteration: 600 / 2000 [ 30%] (Sampling)
#> Chain 1: Iteration: 800 / 2000 [ 40%] (Sampling)
#> Chain 1: Iteration: 1000 / 2000 [ 50%] (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0 seconds (Warm-up)
#> Chain 1: 0.113 seconds (Sampling)
#> Chain 1: 0.113 seconds (Total)
#> Chain 1:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
#> Chain 2: Iteration: 1 / 2000 [ 0%] (Sampling)
#> Chain 2: Iteration: 200 / 2000 [ 10%] (Sampling)
#> Chain 2: Iteration: 400 / 2000 [ 20%] (Sampling)
#> Chain 2: Iteration: 600 / 2000 [ 30%] (Sampling)
#> Chain 2: Iteration: 800 / 2000 [ 40%] (Sampling)
#> Chain 2: Iteration: 1000 / 2000 [ 50%] (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2:
#> Chain 2: Elapsed Time: 0 seconds (Warm-up)
#> Chain 2: 0.115 seconds (Sampling)
#> Chain 2: 0.115 seconds (Total)
#> Chain 2:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
#> Chain 3: Iteration: 1 / 2000 [ 0%] (Sampling)
#> Chain 3: Iteration: 200 / 2000 [ 10%] (Sampling)
#> Chain 3: Iteration: 400 / 2000 [ 20%] (Sampling)
#> Chain 3: Iteration: 600 / 2000 [ 30%] (Sampling)
#> Chain 3: Iteration: 800 / 2000 [ 40%] (Sampling)
#> Chain 3: Iteration: 1000 / 2000 [ 50%] (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 3:
#> Chain 3: Elapsed Time: 0 seconds (Warm-up)
#> Chain 3: 0.11 seconds (Sampling)
#> Chain 3: 0.11 seconds (Total)
#> Chain 3:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
#> Chain 4: Iteration: 1 / 2000 [ 0%] (Sampling)
#> Chain 4: Iteration: 200 / 2000 [ 10%] (Sampling)
#> Chain 4: Iteration: 400 / 2000 [ 20%] (Sampling)
#> Chain 4: Iteration: 600 / 2000 [ 30%] (Sampling)
#> Chain 4: Iteration: 800 / 2000 [ 40%] (Sampling)
#> Chain 4: Iteration: 1000 / 2000 [ 50%] (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 4:
#> Chain 4: Elapsed Time: 0 seconds (Warm-up)
#> Chain 4: 0.109 seconds (Sampling)
#> Chain 4: 0.109 seconds (Total)
#> Chain 4:
# Extract simulated fracture probabilities
p_sim_total <- extract(total_prior)$p
# Plot prior predictive fracture probability curves against weight
plot(NULL, xlim=range(weight_seq), ylim=c(0,1), xlab = "weight", ylab = "Probability")
for (i in 1:100) {
lines(weight_seq, p_sim_total[i, ], col = alpha("black", 0.1))
}

The prior predictive plot shows possible links between standardized weight and the probability of the outcome, based only on the priors. The smooth S-shaped curves look like realistic logistic patterns and allow for both positive and negative effects. This suggests the priors are reasonable, flexible, and not overly informative, letting the data drive the inference without strong assumptions.
After confirming that the priors produce reasonable predictions, we fit the total effect model to the observed data. This Bayesian regression estimates the overall effect of weight on fracture risk while adjusting for age and height, capturing both direct and indirect pathways.
total_glow <- "
data {
int<lower=0> N; // Number of observations
vector[N] age; // Age variable
vector[N] height; // Height Variable
vector[N] weight; // Weight (exposure)
int<lower=0,upper=1> fracture[N]; // Fracture outcome (binary)
}
parameters {
// Parameters for outcome model
real a; // Intercept for outcome model
real bWEIGHT; // Effect of weight on fracture
real bAGE; // Effect of age on fracture
real bHEIGHT; // Effect of height on fracture
}
model {
vector[N] p; // Probability of fracture for each observation
// Priors for outcome model parameters
a ~ normal(0, 0.7);
bAGE ~ normal(0, 0.7);
bWEIGHT ~ normal(0, 0.7);
bHEIGHT ~ normal(0, 0.7);
// Calculate fracture probabilities for all observations
for (i in 1:N) {
p[i] = inv_logit(a + bAGE * age[i] + bHEIGHT * height[i] + bWEIGHT * weight[i]);
}
// Likelihood for outcome: fracture modeled as Bernoulli with probability p
fracture ~ binomial( 1 , p );
}
generated quantities {
vector[N] p; // Predicted fracture probabilities
vector[N] log_lik; // Log likelihood for each observation (for model comparison)
int fracture_pred[N]; // Simulated fracture outcomes from posterior predictive distribution
for (i in 1:N) {
p[i] = inv_logit(a + bAGE * age[i] + bHEIGHT * height[i] + bWEIGHT * weight[i]);
log_lik[i] = binomial_lpmf(fracture[i] | 1, p[i]);
// Simulate fracture outcome based on predicted probability
fracture_pred[i] = bernoulli_rng(p[i]);
}
}
"
# Fit the Bayesian model for the total effect using Stan
fit_total_glow <- stan(
model_code = total_glow,
data = glow500_list,
chains = 4,
iter = 2000,
warmup = 1000,
seed = 1000
)
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 0.000107 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.07 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0.373 seconds (Warm-up)
#> Chain 1: 0.393 seconds (Sampling)
#> Chain 1: 0.766 seconds (Total)
#> Chain 1:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
#> Chain 2:
#> Chain 2: Gradient evaluation took 5.5e-05 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.55 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2:
#> Chain 2:
#> Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2:
#> Chain 2: Elapsed Time: 0.397 seconds (Warm-up)
#> Chain 2: 0.401 seconds (Sampling)
#> Chain 2: 0.798 seconds (Total)
#> Chain 2:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
#> Chain 3:
#> Chain 3: Gradient evaluation took 5.4e-05 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.54 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3:
#> Chain 3:
#> Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 3:
#> Chain 3: Elapsed Time: 0.39 seconds (Warm-up)
#> Chain 3: 0.416 seconds (Sampling)
#> Chain 3: 0.806 seconds (Total)
#> Chain 3:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
#> Chain 4:
#> Chain 4: Gradient evaluation took 5.7e-05 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.57 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4:
#> Chain 4:
#> Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 4:
#> Chain 4: Elapsed Time: 0.391 seconds (Warm-up)
#> Chain 4: 0.373 seconds (Sampling)
#> Chain 4: 0.764 seconds (Total)
#> Chain 4:

For the total effect analysis, we again use the summarize_results() function from the CLRtools package to explore the Bayesian model output. This function provides helpful summaries and visualizations of the posterior distributions and model fit, just as in the direct effect analysis.
By comparing the predicted fracture risk with the observed data and visualizing the range of plausible effect sizes, we can assess how well the total effect model captures the data and the uncertainty around the weight effect when adjusting for age and height.
# Extracting posterior predictive samples
O_rep <- rstan::extract(fit_total_glow)$fracture_pred
summarize_results(
model = fit_total_glow,
ypredict = O_rep,
data= glow500_subset,
outcome = 'fracture',
var.param = c('a'='a', 'age' = "bAGE", 'height'='bHEIGHT', 'weight'='bWEIGHT'),
rounding = 6,
prob = 0.89,
point.est = 'median')
#> Inference for Stan model: anon_model.
#> 4 chains, each with iter=2000; warmup=1000; thin=1;
#> post-warmup draws per chain=1000, total post-warmup draws=4000.
#>
#> mean se_mean sd 2.5% 25% 50% 75%
#> a -1.148928 0.001816 0.108706 -1.371591 -1.221071 -1.148366 -1.075872
#> bAGE 0.449854 0.001815 0.110382 0.225079 0.375763 0.449850 0.522928
#> bHEIGHT -0.280300 0.001933 0.116410 -0.507786 -0.360941 -0.279338 -0.200052
#> bWEIGHT 0.133377 0.001974 0.113760 -0.088531 0.057337 0.134674 0.211793
#> 97.5% n_eff Rhat
#> a -0.938441 3581 0.999753
#> bAGE 0.665068 3699 1.000151
#> bHEIGHT -0.051998 3627 1.000282
#> bWEIGHT 0.349875 3322 1.000195
#>
#> Samples were drawn using NUTS(diag_e) at Mon Jan 26 09:31:51 2026.
#> For each parameter, n_eff is a crude measure of effective sample size,
#> and Rhat is the potential scale reduction factor on split chains (at
#> convergence, Rhat=1).
#> Note: in most cases the default test statistic 'mean' is too weak to detect anything of interest.
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.

All variables were standardized, so the coefficients represent the expected change in the log odds of fracture for a one standard deviation increase in each variable, holding the others constant.
The coefficient for weight is 0.13 (95% credible interval from about −0.09 to 0.35), meaning the data are consistent with either a small positive effect or no effect of weight on fracture risk after adjusting for age and height. Because the interval includes zero, it remains uncertain whether weight has a meaningful total effect in this model.
Age shows a positive association with fracture risk (coefficient 0.45, 95% credible interval approximately 0.23 to 0.67), indicating that higher age is linked to higher fracture risk, holding height and weight constant. Height has a negative association (coefficient −0.28, 95% credible interval approximately −0.51 to −0.05), suggesting taller individuals tend to have lower fracture risk when accounting for age and weight. However, these coefficients describe how each factor relates to fracture risk when the other factors are held fixed; they are conditional associations, not total causal effects, and should not be interpreted as direct effects, to avoid the Table 2 fallacy.
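One way to quantify the uncertainty about the sign of the total effect is the posterior probability that the coefficient is positive, computed directly from the draws (a minimal sketch):
# Share of posterior draws in which weight's total effect is positive
post_total <- rstan::extract(fit_total_glow)
mean(post_total$bWEIGHT > 0)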
The posterior predictive checks suggest the model represents the observed data well. The left panel shows that the model's simulated means are close to the actual mean, while the right panel shows that the simulated standard deviations match the observed variability. Together, these plots indicate that the model can reproduce the main features of the data: its central value and its spread.
As with the direct effect, we check the total effect model to ensure the Bayesian results are reliable. We use the same diagnostics from CLRtools, including Rhat statistics, effective sample size, and posterior plots, to confirm that the model has converged and the posterior samples are robust.
diagnostic_bayes(
model = fit_total_glow,
var.param = c( 'a', "bAGE", 'bWEIGHT', 'bHEIGHT'))
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
The model diagnostics confirm that the posterior estimates are stable and reliable. Rhat values were all below 1.01, showing that the chains converged properly. Effective sample sizes were sufficient, mostly ranging from 2,000 to 6,000 draws, and both trace and rank plots showed consistent overlap across the four chains for all coefficients. These results indicate the model produced stable samples for inference.
Under the assumptions encoded in our DAG, weight does have a causal effect on fracture risk, but the effect depends on the pathway considered. The direct effect, which isolates the influence of weight independent of BMI, is negative, suggesting that at the same BMI, higher weight lowers fracture risk. At the same time, the indirect effect through BMI is positive, because higher weight increases BMI, and higher BMI raises fracture risk. When combined, the total effect is small and slightly positive, meaning that overall, weight has only a minor influence on fracture risk. These results highlight that weight's causal impact is complex: it acts in opposite directions along different pathways, and overall the effect is small, as illustrated below.
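To see how the two pathways combine, the indirect effect can be approximated on the log-odds scale with the product-of-coefficients method and added to the direct effect. This is a rough sketch only: the product method is approximate for binary outcomes because the logit link is non-collapsible.
# Approximate decomposition of weight's effect on the log-odds scale
post_direct <- rstan::extract(fit_direct_glow)
indirect <- post_direct$bBM * post_direct$bBMI # weight -> BMI -> fracture
direct <- post_direct$bWEIGHT                  # weight -> fracture, holding BMI fixed
round(quantile(direct + indirect, probs = c(0.025, 0.5, 0.975)), 2)
Given the posterior summaries above, this sum should be small and centered just above zero, broadly consistent with the total effect estimated from the separate model.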