Multi-task regression and network estimation with missing responses — no imputation required!
missoNet jointly estimates regression coefficients and the response network (precision matrix) from multi-response data where some responses are missing (MCAR/MAR/MNAR). Estimation is based on unbiased estimating equations with separate L1 regularization for coefficients and the precision matrix, enabling robust multi-trait analysis under incomplete outcomes.
A fit returns both the regression coefficients (Beta) and the conditional dependency structure (Theta). If you only have a single response, classical lasso/elastic net (e.g., glmnet) is simpler and likely faster.
CRAN (stable)
install.packages("missoNet")
GitHub (development)
# install.packages("devtools")
devtools::install_github("yixiao-zeng/missoNet", build_vignettes = TRUE)
library(missoNet)
# Example data with ~15% missing responses (MCAR)
sim <- generateData(n = 300, p = 50, q = 10, rho = 0.15, missing.type = "MCAR")
# Fit along two lambda paths; choose via BIC (no CV)
fit <- missoNet(X = sim$X, Y = sim$Z, GoF = "BIC")
# Extract estimates at the selected solution
Beta <- fit$est.min$Beta    # p x q regression coefficients
Theta <- fit$est.min$Theta  # q x q precision (conditional network)
# Visualize selection path
plot(fit, type = "scatter")
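Because Theta is a precision matrix, its off-diagonal entries can be rescaled into partial correlations, which are often easier to read as network edge weights. The following is a minimal base-R sketch, not part of the package: `precision_to_partial_cor` is a hypothetical helper, and `Theta_toy` is a made-up matrix standing in for `fit$est.min$Theta`.

```r
# Hypothetical helper (not a missoNet function): convert a precision matrix
# into partial correlations via rho_ij = -theta_ij / sqrt(theta_ii * theta_jj).
precision_to_partial_cor <- function(Theta) {
  d <- sqrt(diag(Theta))
  pcor <- -Theta / outer(d, d)
  diag(pcor) <- 1
  pcor
}

# Toy 3 x 3 precision matrix (an assumption for illustration only);
# in practice you would pass fit$est.min$Theta instead.
Theta_toy <- matrix(c( 2.0, -0.8, 0.0,
                      -0.8,  2.0, 0.5,
                       0.0,  0.5, 1.0), nrow = 3, byrow = TRUE)

pcor <- precision_to_partial_cor(Theta_toy)

# Edges of the conditional network: nonzero entries above the diagonal
edges <- which(Theta_toy != 0 & upper.tri(Theta_toy), arr.ind = TRUE)
```

A zero entry in Theta means the two responses are conditionally independent given the others, so `edges` lists only the pairs that remain connected after L1 regularization.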
# 5-fold CV over (lambda.beta, lambda.theta)
cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, kfold = 5)
# Inspect CV heatmap and selected models (min and 1-SE variants)
plot(cvfit, type = "heatmap")
# Predict responses on new data
Y_hat <- predict(cvfit, newx = sim$X, s = "lambda.min")
Tip: Try s = "lambda.1se.beta" or "lambda.1se.theta" for more conservative sparsity when available.
library(parallel)
cl <- makeCluster(max(1, detectCores() - 1))
cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, kfold = 5,
                     parallel = TRUE, cl = cl)
stopCluster(cl)
# Lessen the penalty for prior-important predictors
p <- ncol(sim$X); q <- ncol(sim$Z)
beta.pen.factor <- matrix(1, p, q)
beta.pen.factor[c(1, 2), ] <- 0.1
fit <- missoNet(X = sim$X, Y = sim$Z,
                beta.pen.factor = beta.pen.factor)
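When predictors and responses carry names, it can be less error-prone to fill the penalty matrix by name rather than by row index. The sketch below uses base R only; the predictor and response names are hypothetical, and it assumes (as the p x q shape suggests) that each entry scales the penalty for one predictor-response pair.

```r
# Hypothetical names standing in for colnames(X) and colnames(Y)
predictors <- c("age", "sex", "snp1", "snp2", "snp3")
responses  <- c("cpg1", "cpg2", "cpg3")

# Start from the standard penalty (factor 1) for every coefficient
beta.pen.factor <- matrix(1,
                          nrow = length(predictors),
                          ncol = length(responses),
                          dimnames = list(predictors, responses))

# Lessen the penalty on two covariates across all responses
beta.pen.factor[c("age", "sex"), ] <- 0.1

# Lessen the penalty on a single predictor-response pair
# (assumption: entry-wise factors are honored)
beta.pen.factor["snp1", "cpg2"] <- 0.5
```

The resulting matrix has the same p x q layout as the index-based example above and would be passed through the same `beta.pen.factor` argument.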
fit <- missoNet(X = sim$X, Y = sim$Z,
                adaptive.search = TRUE,
                n.lambda.beta = 50,
                n.lambda.theta = 50)
vignette("missoNet-introduction")
vignette("missoNet-cross-validation")
vignette("missoNet-case-study")
If vignettes are not available from CRAN binaries on your platform, install from source using the GitHub command above with build_vignettes = TRUE.
Actual performance will depend on sparsity, signal-to-noise, and missingness mechanisms.
Great for
Not ideal for
- Single-response regression (use glmnet or similar)
- Extremely sparse information (e.g., >50% missing responses across most traits)
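A quick way to see whether a response matrix falls into the second case is to compute the missing rate per trait before fitting. This is a base-R sketch on a made-up matrix; `Z_toy` and the 0.5 cutoff are illustrative assumptions, not package defaults.

```r
# Toy response matrix: 10 samples x 4 traits (values are arbitrary)
Z_toy <- matrix(seq_len(40), nrow = 10, ncol = 4,
                dimnames = list(NULL, paste0("trait", 1:4)))

# Knock out some entries to mimic missing responses
Z_toy[1:2, 1] <- NA   # trait1: 20% missing
Z_toy[1:6, 2] <- NA   # trait2: 60% missing
Z_toy[5, 3]   <- NA   # trait3: 10% missing

# Proportion of missing values per trait
miss_rate <- colMeans(is.na(Z_toy))

# Traits where more than half of the responses are missing
flagged <- names(miss_rate)[miss_rate > 0.5]
```

If most traits end up in `flagged`, there may be too little observed information for joint estimation to help.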
If you use missoNet in your research, please cite:
@article{zeng2025missonet,
title = {Multivariate regression with missing response data for modelling regional DNA methylation QTLs},
author = {Zeng, Yixiao and Alam, Shomoita and Bernatsky, Sasha and Hudson, Marie and Colmegna, In{\'e}s and Stephens, David A and Greenwood, Celia MT and Yang, Archer Y},
journal = {arXiv preprint arXiv:2507.05990},
year = {2025},
url = {https://arxiv.org/abs/2507.05990}
}
Contributions and issues are welcome! Please open a discussion or pull request on the GitHub repository.
GPL-2. See the LICENSE file.