Type: Package
Title: Lasso Penalized Precision Matrix Estimation
Version: 1.0
Date: 2018-05-31
Description: Estimates a lasso penalized precision matrix via blockwise coordinate descent (BCD). This package is a simple wrapper around the popular 'glasso' package that extends and enhances its capabilities. These enhancements include built-in cross validation and visualizations. See Friedman et al. (2008) <doi:10.1093/biostatistics/kxm045> for details regarding the estimation method.
URL: https://github.com/MGallow/CVglasso
BugReports: https://github.com/MGallow/CVglasso/issues
License: GPL-2 | GPL-3 [expanded from: GPL (>= 2)]
ByteCompile: TRUE
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.0.1
Imports: stats, parallel, foreach, ggplot2, dplyr, glasso
Depends: doParallel
Suggests: testthat
NeedsCompilation: no
Packaged: 2018-05-31 23:30:18 UTC; Matt
Author: Matt Galloway [aut, cre]
Maintainer: Matt Galloway <gall0441@umn.edu>
Repository: CRAN
Date/Publication: 2018-06-04 08:42:55 UTC
Cross Validation
Description
K-fold cross validation for tuning parameter selection. Runs sequentially by default; set cores > 1 for parallel execution.
Usage
CV(X = NULL, S = NULL, lam = 10^seq(-2, 2, 0.2), diagonal = FALSE,
  path = FALSE, tol = 1e-04, maxit = 10000, adjmaxit = NULL, K = 5,
  crit.cv = c("loglik", "AIC", "BIC"), start = c("warm", "cold"),
  cores = 1, trace = c("progress", "print", "none"), ...)
Arguments
X: option to provide a nxp data matrix. Each row corresponds to a single observation and each column contains n observations of a single feature/variable.
S: option to provide a pxp sample covariance matrix (denominator n). If this argument is NULL and X is provided instead, S will be computed automatically.
lam: positive tuning parameters for the lasso penalty. If a vector of parameters is provided, they should be in increasing order. Defaults to the grid of values 10^seq(-2, 2, 0.2).
diagonal: option to penalize the diagonal elements of the estimated precision matrix (Omega). Defaults to FALSE.
path: option to return the regularization path. This option should be used with extreme care if the dimension is large. If set to TRUE, cores must be set to 1, and errors and optimal tuning parameters will be based on the full sample. Defaults to FALSE.
tol: convergence tolerance. Iterations will stop when the average absolute difference in parameter estimates is less than tol. Defaults to 1e-4.
maxit: maximum number of iterations. Defaults to 1e4.
adjmaxit: adjusted maximum number of iterations. During cross validation this option allows the user to adjust the maximum number of iterations after the first tuning parameter has converged (for each fold). This option is intended to be paired with warm starts and allows for 'one-step' estimators. Defaults to NULL.
K: specify the number of folds for cross validation. Defaults to 5.
crit.cv: cross validation criterion (loglik, AIC, or BIC). Defaults to loglik.
start: specify a warm or cold start for cross validation. Default is warm.
cores: option to run CV in parallel. Defaults to cores = 1.
trace: option to display progress of CV. Choose one of progress (print a progress bar), print (print completed tuning parameters), or none. Defaults to progress.
...: additional arguments to pass to glasso.
Value
returns a list which includes:
lam: optimal tuning parameter.
min.error: minimum average cross validation error (cv.crit) for the optimal parameters.
avg.error: average cross validation error (cv.crit) across all folds.
cv.error: cross validation errors (cv.crit).
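No examples accompany CV above; the following is a minimal sketch of a direct call, assuming CV is exported as documented (the variable names are illustrative):

set.seed(123)
X = matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
S = cov(X) * (nrow(X) - 1) / nrow(X)  # pxp sample covariance with denominator n, as the S argument expects
# 5-fold CV over the default lambda grid, run sequentially (cores = 1)
cv.out = CV(X = X, S = S, K = 5, crit.cv = "loglik", trace = "none")
cv.out$lam        # optimal tuning parameter
cv.out$min.error  # minimum average CV error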
Parallel Cross Validation
Description
Parallel implementation of cross validation.
Usage
CVP(X = NULL, lam = 10^seq(-2, 2, 0.2), diagonal = FALSE, tol = 1e-04,
  maxit = 10000, adjmaxit = NULL, K = 5, crit.cv = c("loglik", "AIC",
  "BIC"), start = c("warm", "cold"), cores = 1, trace = c("progress",
  "print", "none"), ...)
Arguments
X: nxp data matrix. Each row corresponds to a single observation and each column contains n observations of a single feature/variable.
lam: positive tuning parameters for the lasso penalty. If a vector of parameters is provided, they should be in increasing order. Defaults to the grid of values 10^seq(-2, 2, 0.2).
diagonal: option to penalize the diagonal elements of the estimated precision matrix (Omega). Defaults to FALSE.
tol: convergence tolerance. Iterations will stop when the average absolute difference in parameter estimates is less than tol. Defaults to 1e-4.
maxit: maximum number of iterations. Defaults to 1e4.
adjmaxit: adjusted maximum number of iterations. During cross validation this option allows the user to adjust the maximum number of iterations after the first tuning parameter has converged (for each fold). This option is intended to be paired with warm starts and allows for 'one-step' estimators. Defaults to NULL.
K: specify the number of folds for cross validation. Defaults to 5.
crit.cv: cross validation criterion (loglik, AIC, or BIC). Defaults to loglik.
start: specify a warm or cold start for cross validation. Default is warm.
cores: option to run CV in parallel. Defaults to cores = 1.
trace: option to display progress of CV. Choose one of progress (print a progress bar), print (print completed tuning parameters), or none. Defaults to progress.
...: additional arguments to pass to glasso.
Value
returns a list which includes:
lam: optimal tuning parameter.
min.error: minimum average cross validation error (cv.crit) for the optimal parameters.
avg.error: average cross validation error (cv.crit) across all folds.
cv.error: cross validation errors (cv.crit).
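Similarly, a minimal sketch of a parallel call, assuming CVP is exported as documented (a doParallel backend is implied by the package's Depends; the grid below is illustrative):

set.seed(123)
X = matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
# 5-fold CV over a small custom grid, with folds distributed across 2 cores
cvp.out = CVP(X = X, lam = 10^seq(-2, 0, 0.5), K = 5, cores = 2, trace = "none")
cvp.out$lam        # optimal tuning parameter
cvp.out$avg.error  # average CV error across folds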
Penalized precision matrix estimation
Description
Penalized precision matrix estimation using the graphical lasso (glasso) algorithm.
Consider the case where X_{1}, ..., X_{n} are iid N_{p}(\mu, \Sigma) and we are tasked with estimating the precision matrix, denoted \Omega \equiv \Sigma^{-1}. This function solves the following optimization problem:
Objective:
\hat{\Omega}_{\lambda} = \arg\min_{\Omega \in S_{+}^{p}} \left\{ \mathrm{Tr}\left(S\Omega\right) - \log \det\left(\Omega\right) + \lambda \left\| \Omega \right\|_{1} \right\}
where \lambda > 0 and \left\| A \right\|_{1} = \sum_{i, j} \left| A_{ij} \right|.
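To make the objective concrete, the sketch below evaluates the penalized quantity for a candidate Omega; the helper name objective is hypothetical and not part of the package:

# Tr(S Omega) - log det(Omega) + lambda * ||Omega||_1 (full L1 norm, matching
# the display above; the package's diagonal = FALSE default instead leaves the
# diagonal unpenalized)
objective = function(Omega, S, lam) {
  sum(diag(S %*% Omega)) -
    as.numeric(determinant(Omega, logarithm = TRUE)$modulus) +
    lam * sum(abs(Omega))
}
# e.g., objective(diag(5), diag(5), lam = 0.1)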
Usage
CVglasso(X = NULL, S = NULL, nlam = 10, lam.min.ratio = 0.01,
  lam = NULL, diagonal = FALSE, path = FALSE, tol = 1e-04,
  maxit = 10000, adjmaxit = NULL, K = 5, crit.cv = c("loglik", "AIC",
  "BIC"), start = c("warm", "cold"), cores = 1, trace = c("progress",
  "print", "none"), ...)
Arguments
X: option to provide a nxp data matrix. Each row corresponds to a single observation and each column contains n observations of a single feature/variable.
S: option to provide a pxp sample covariance matrix (denominator n). If this argument is NULL and X is provided instead, S will be computed automatically.
nlam: number of lam tuning parameters to generate automatically for the penalty term. Defaults to 10.
lam.min.ratio: smallest lam value provided as a fraction of the maximum lam value (lam.max). The function will automatically generate nlam tuning parameters from lam.min.ratio*lam.max to lam.max on the log10 scale, as sketched below. Defaults to 0.01.
lam: option to provide positive tuning parameters for the penalty term. Providing this argument will cause nlam and lam.min.ratio to be disregarded. If a vector of parameters is provided, they should be in increasing order. Defaults to NULL.
diagonal: option to penalize the diagonal elements of the estimated precision matrix (Omega). Defaults to FALSE.
path: option to return the regularization path. This option should be used with extreme care if the dimension is large. If set to TRUE, cores must be set to 1, and errors and optimal tuning parameters will be based on the full sample. Defaults to FALSE.
tol: convergence tolerance. Iterations will stop when the average absolute difference in parameter estimates is less than tol. Defaults to 1e-4.
maxit: maximum number of iterations. Defaults to 1e4.
adjmaxit: adjusted maximum number of iterations. During cross validation this option allows the user to adjust the maximum number of iterations after the first tuning parameter has converged (for each fold). This option is intended to be paired with warm starts and allows for 'one-step' estimators. Defaults to NULL.
K: specify the number of folds for cross validation. Defaults to 5.
crit.cv: cross validation criterion (loglik, AIC, or BIC). Defaults to loglik.
start: specify a warm or cold start for cross validation. Default is warm.
cores: option to run CV in parallel. Defaults to cores = 1.
trace: option to display progress of CV. Choose one of progress (print a progress bar), print (print completed tuning parameters), or none. Defaults to progress.
...: additional arguments to pass to glasso.
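As a sketch of how nlam and lam.min.ratio presumably combine to form the grid (this reconstruction is an assumption, not the package source; lam.max is fixed here for illustration, whereas the package derives it from the data):

nlam = 10
lam.min.ratio = 0.01
lam.max = 1  # illustrative value only
# nlam values evenly spaced on the log10 scale from lam.min.ratio*lam.max to lam.max
lam = 10^seq(log10(lam.min.ratio * lam.max), log10(lam.max), length.out = nlam)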
Details
For details on the implementation of the glasso function, see Tibshirani's website: http://statweb.stanford.edu/~tibs/glasso/.
Value
returns an object of class CVglasso which includes:
Call: function call.
Iterations: number of iterations.
Tuning: optimal tuning parameter (lam).
Lambdas: grid of lambda values used for CV.
maxit: maximum number of iterations for the outer (blockwise) loop.
Omega: estimated penalized precision matrix.
Sigma: estimated covariance matrix from the penalized precision matrix (inverse of Omega).
Path: array containing the solution path. Solutions are ordered by ascending lambda values.
MIN.error: minimum average cross validation error (cv.crit) for the optimal parameters.
AVG.error: average cross validation error (cv.crit) across all folds.
CV.error: cross validation errors (cv.crit).
Author(s)
Matt Galloway gall0441@umn.edu
References
Friedman, Jerome, Hastie, Trevor, and Tibshirani, Robert. 2008. 'Sparse Inverse Covariance Estimation with the Graphical Lasso.' Biostatistics 9 (3): 432-441.
Banerjee, Onureena, El Ghaoui, Laurent, and d'Aspremont, Alexandre. 2008. 'Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data.' Journal of Machine Learning Research 9: 485-516.
Tibshirani, Robert. 1996. 'Regression Shrinkage and Selection via the Lasso.' Journal of the Royal Statistical Society, Series B (Methodological) 58 (1): 267-288.
Meinshausen, Nicolai and Buhlmann, Peter. 2006. 'High-Dimensional Graphs and Variable Selection with the Lasso.' The Annals of Statistics 34 (3): 1436-1462.
Witten, Daniela M, Friedman, Jerome H, and Simon, Noah. 2011. 'New Insights and Faster Computations for the Graphical Lasso.' Journal of Computational and Graphical Statistics 20 (4): 892-900.
Tibshirani, Robert, Bien, Jacob, Friedman, Jerome, Hastie, Trevor, Simon, Noah, Taylor, Jonathan, and Tibshirani, Ryan J. 2012. 'Strong Rules for Discarding Predictors in Lasso-Type Problems.' Journal of the Royal Statistical Society, Series B (Statistical Methodology) 74 (2): 245-266.
El Ghaoui, Laurent, Viallon, Vivian, and Rabbani, Tarek. 2010. 'Safe Feature Elimination for the Lasso and Sparse Supervised Learning Problems.' arXiv preprint arXiv:1009.4219.
Osborne, Michael R, Presnell, Brett, and Turlach, Berwin A. 2000. 'On the Lasso and its Dual.' Journal of Computational and Graphical Statistics 9 (2): 319-337.
Rothman, Adam. 2017. 'STAT 8931 Notes on an Algorithm to Compute the Lasso-Penalized Gaussian Likelihood Precision Matrix Estimator.'
See Also
plot.CVglasso and print.CVglasso.
Examples
# generate data whose true precision matrix is sparse
# first compute the covariance matrix with AR(1)-type entries 0.7^|i - j|
S = matrix(0.7, nrow = 5, ncol = 5)
for (i in 1:5){
 for (j in 1:5){
   S[i, j] = S[i, j]^abs(i - j)
 }
}
# generate 100 x 5 matrix with rows drawn iid from N_p(0, S)
set.seed(123)  # for reproducibility
Z = matrix(rnorm(100*5), nrow = 100, ncol = 5)
out = eigen(S, symmetric = TRUE)
S.sqrt = out$vectors %*% diag(out$values^0.5)
S.sqrt = S.sqrt %*% t(out$vectors)
X = Z %*% S.sqrt
# lasso penalty CV
CVglasso(X)
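The components of the fitted object can be inspected directly; a brief sketch continuing the example above:

fit = CVglasso(X, trace = 'none')
fit$Tuning  # optimal tuning parameter
fit$Omega   # penalized precision matrix estimate
fit$Sigma   # implied covariance estimate (inverse of Omega)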
Plot CVglasso object
Description
Produces a plot for the cross validation errors, if available.
Usage
## S3 method for class 'CVglasso'
plot(x, type = c("line", "heatmap"), footnote = TRUE,
  ...)
Arguments
x: class object CVglasso.
type: produce either a 'line' graph or a 'heatmap' of the cross validation errors. Defaults to 'line'.
footnote: option to print a footnote of the optimal values. Defaults to TRUE.
...: additional arguments.
Examples
# generate data whose true precision matrix is sparse
# first compute the covariance matrix with AR(1)-type entries 0.7^|i - j|
S = matrix(0.7, nrow = 5, ncol = 5)
for (i in 1:5){
 for (j in 1:5){
   S[i, j] = S[i, j]^abs(i - j)
 }
}
# generate 100 x 5 matrix with rows drawn iid from N_p(0, S)
set.seed(123)  # for reproducibility
Z = matrix(rnorm(100*5), nrow = 100, ncol = 5)
out = eigen(S, symmetric = TRUE)
S.sqrt = out$vectors %*% diag(out$values^0.5)
S.sqrt = S.sqrt %*% t(out$vectors)
X = Z %*% S.sqrt
# fit once, then plot
fit = CVglasso(X)
# produce line graph for CVglasso
plot(fit)
# produce CV heat map for CVglasso
plot(fit, type = 'heatmap')
Print CVglasso object
Description
Prints CVglasso object and suppresses output if needed.
Usage
## S3 method for class 'CVglasso'
print(x, ...)
Arguments
x: class object CVglasso.
...: additional arguments.
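No examples accompany the print method above; a minimal sketch, reusing the simulated X from the earlier examples:

fit = CVglasso(X, trace = 'none')
print(fit)  # exact printed fields depend on the method's implementation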