| Title: | Mixed, Low-Rank, and Sparse Multivariate Regression on High-Dimensional Data | 
| Version: | 0.1.0 | 
| Description: | Mixed, low-rank, and sparse multivariate regression ('mixedLSR') provides tools for performing mixture regression when the coefficient matrix is low-rank and sparse. 'mixedLSR' allows subgroup identification by alternating optimization with simulated annealing to encourage global optimum convergence. This method is data-adaptive, automatically performing parameter selection to identify low-rank substructures in the coefficient matrix. | 
| License: | MIT + file LICENSE | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.2.1 | 
| Depends: | R (≥ 4.1.0) | 
| Imports: | grpreg, purrr, MASS, stats, ggplot2 | 
| Suggests: | knitr, rmarkdown, mclust | 
| VignetteBuilder: | knitr | 
| BugReports: | https://github.com/alexanderjwhite/mixedLSR | 
| URL: | https://alexanderjwhite.github.io/mixedLSR/ | 
| NeedsCompilation: | no | 
| Packaged: | 2022-11-04 10:33:31 UTC; whitealj | 
| Author: | Alexander White  | 
| Maintainer: | Alexander White <whitealj@iu.edu> | 
| Repository: | CRAN | 
| Date/Publication: | 2022-11-04 20:00:02 UTC | 
Compute Bayesian information criterion for a mixedLSR model
Description
Compute Bayesian information criterion for a mixedLSR model
Usage
bic_lsr(a, n, llik)
Arguments
a | 
 A list of coefficient matrices.  | 
n | 
 The sample size.  | 
llik | 
 The log-likelihood of the model.  | 
Value
The BIC.
Examples
n <- 50
simulate <- simulate_lsr(n)
model <- mixed_lsr(simulate$x, simulate$y, k = 2, init_lambda = c(1,1), alt_iter = 0)
bic_lsr(model$A, n = n, model$llik)
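For orientation, the general shape of a BIC computation can be sketched in base R. This is not the package's implementation: the helper name bic_sketch and the choice to count nonzero coefficients as the number of free parameters are assumptions made only for illustration, and bic_lsr may count parameters differently.

```r
# Hypothetical helper: generic BIC = -2 * llik + log(n) * df.
# Assumption: df is the number of nonzero coefficients across all
# matrices in `a`; this is an illustrative choice, not bic_lsr()'s rule.
bic_sketch <- function(a, n, llik) {
  df <- sum(unlist(a) != 0)
  -2 * llik + log(n) * df
}

# Two toy coefficient matrices with two nonzero entries in total.
a <- list(matrix(c(1, 0, 0, 2), 2, 2), matrix(0, 2, 2))
bic_sketch(a, n = 50, llik = -120)
```

Lower values indicate a better trade-off between fit and complexity under this convention.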
Internal Alternating Optimization Function
Description
Internal Alternating Optimization Function
Usage
fct_alt_optimize(
  x,
  y,
  k,
  clust_assign,
  lambda,
  alt_iter,
  anneal_iter,
  em_iter,
  temp,
  mu,
  eps,
  accept_prob,
  sim_N,
  verbose
)
Arguments
x | 
 A matrix of predictors.  | 
y | 
 A matrix of responses.  | 
k | 
 The number of groups.  | 
clust_assign | 
 The current clustering assignment.  | 
lambda | 
 A vector of penalization parameters.  | 
alt_iter | 
 The maximum number of times to alternate between the classification expectation maximization algorithm and the simulated annealing algorithm.  | 
anneal_iter | 
 The maximum number of simulated annealing iterations.  | 
em_iter | 
 The maximum number of EM iterations.  | 
temp | 
 The initial simulated annealing temperature, temp > 0.  | 
mu | 
 The simulated annealing temperature decrease fraction. Once the best configuration cannot be improved, the temperature T is reduced to mu * T, where 0 < mu < 1.  | 
eps | 
 The final simulated annealing temperature, eps > 0.  | 
accept_prob | 
 The simulated annealing probability of accepting a new assignment, 0 < accept_prob < 1. When closer to 1, trial assignments will only be small perturbations of the current assignment. When closer to 0, trial assignments are closer to random.  | 
sim_N | 
 The number of simulated annealing iterations for reaching equilibrium.  | 
verbose | 
 A boolean indicating whether to print to screen.  | 
Value
A final mixedLSR fit.
Internal Double Penalized Projection Function
Description
Internal Double Penalized Projection Function
Usage
fct_dpp(
  y,
  x,
  rank,
  lambda = NULL,
  alpha = 2 * sqrt(3),
  beta = 1,
  sigma,
  ptype = "grLasso",
  y_sparse = TRUE
)
Arguments
y | 
 A matrix of responses.  | 
x | 
 A matrix of predictors.  | 
rank | 
 The rank, if known.  | 
lambda | 
 A vector of penalization parameters.  | 
alpha | 
 A positive constant DPP parameter.  | 
beta | 
 A positive constant DPP parameter.  | 
sigma | 
 An estimated standard deviation.  | 
ptype | 
 A group penalized regression penalty type. See grpreg.  | 
y_sparse | 
 Should Y coefficients be treated as sparse?  | 
Value
A list containing estimated coefficients, covariance, and penalty parameters.
Internal EM Algorithm
Description
Internal EM Algorithm
Usage
fct_em(x, y, k, lambda, clust_assign, lik_track, em_iter, verbose)
Arguments
x | 
 A matrix of predictors.  | 
y | 
 A matrix of responses.  | 
k | 
 The number of groups.  | 
lambda | 
 A vector of penalization parameters.  | 
clust_assign | 
 The current clustering assignment.  | 
lik_track | 
 A vector storing the log-likelihood by iteration.  | 
em_iter | 
 The maximum number of EM iterations.  | 
verbose | 
 A boolean indicating whether to print to screen.  | 
Value
A mixedLSR model.
Internal Posterior Calculation
Description
Internal Posterior Calculation
Usage
fct_gamma(
  x,
  y,
  k,
  N,
  clust_assign,
  pi_vec,
  lambda,
  alpha,
  beta,
  y_sparse,
  rank,
  max_rank
)
Arguments
x | 
 A matrix of predictors.  | 
y | 
 A matrix of responses.  | 
k | 
 The number of groups.  | 
N | 
 The sample size.  | 
clust_assign | 
 The current clustering assignment.  | 
pi_vec | 
 A vector of mixing probabilities for each cluster label.  | 
lambda | 
 A vector of penalization parameters.  | 
alpha | 
 A positive constant DPP parameter.  | 
beta | 
 A positive constant DPP parameter.  | 
y_sparse | 
 Should Y coefficients be treated as sparse?  | 
rank | 
 The rank, if known.  | 
max_rank | 
 The maximum allowed rank.  | 
Value
A list with the posterior, coefficients, and estimated covariance.
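The posterior step of a mixture model can be sketched in base R as follows. The helper name posterior_sketch and its inputs (a matrix of per-observation, per-cluster log-likelihoods) are assumptions for illustration; fct_gamma also estimates coefficients and covariance, which this sketch leaves out.

```r
# Hypothetical helper: posterior membership probabilities from an
# N x k matrix of log-likelihoods and a length-k mixing vector.
posterior_sketch <- function(loglik, pi_vec) {
  # Add the log mixing probability of each cluster to its column.
  log_w <- sweep(loglik, 2, log(pi_vec), "+")
  # Subtract each row's maximum before exponentiating (log-sum-exp trick).
  log_w <- log_w - apply(log_w, 1, max)
  w <- exp(log_w)
  # Normalize so that each row sums to one.
  w / rowSums(w)
}

loglik <- matrix(c(-1, -5, -3, -3), nrow = 2, byrow = TRUE)
posterior_sketch(loglik, pi_vec = c(0.5, 0.5))
```

Working on the log scale avoids underflow when the number of response features is large and the log-likelihoods are strongly negative.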
Internal Partition Initialization Function
Description
Internal Partition Initialization Function
Usage
fct_initialize(k, N)
Arguments
k | 
 The number of groups.  | 
N | 
 The sample size.  | 
Value
A vector of assignments.
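A random starting partition of this kind might look like the base R sketch below. The helper name init_sketch and the uniform sampling scheme are assumptions; fct_initialize may use a different rule, for example one that enforces a minimum group size.

```r
# Hypothetical helper: draw N labels uniformly from 1..k with replacement.
init_sketch <- function(k, N) {
  sample(seq_len(k), size = N, replace = TRUE)
}

set.seed(1)
assign0 <- init_sketch(k = 2, N = 10)
# Count how many observations fall in each group.
table(factor(assign0, levels = 1:2))
```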
Internal Likelihood Function
Description
Internal Likelihood Function
Usage
fct_j_lik(
  x,
  y,
  k,
  clust_assign,
  lambda,
  alpha = 2 * sqrt(3),
  beta = 1,
  y_sparse = TRUE,
  max_rank = 3,
  rank = NULL
)
Arguments
x | 
 A matrix of predictors.  | 
y | 
 A matrix of responses.  | 
k | 
 The number of groups.  | 
clust_assign | 
 A vector of cluster labels.  | 
lambda | 
 A vector of penalization parameters.  | 
alpha | 
 A positive constant DPP parameter.  | 
beta | 
 A positive constant DPP parameter.  | 
y_sparse | 
 Should Y coefficients be treated as sparse?  | 
max_rank | 
 The maximum allowed rank.  | 
rank | 
 The rank, if known.  | 
Value
The weighted log-likelihood.
Internal Log-Likelihood Function
Description
Internal Log-Likelihood Function
Usage
fct_log_lik(mu_mat, sig_vec, y, N, m)
Arguments
mu_mat | 
 The mean matrix.  | 
sig_vec | 
 A vector of sigma.  | 
y | 
 The output matrix.  | 
N | 
 The sample size.  | 
m | 
 The number of y features.  | 
Value
A posterior matrix.
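As a point of reference, a Gaussian log-likelihood evaluated from a mean matrix and a vector of standard deviations can be sketched in base R. The helper name loglik_sketch, the assumption of independent Gaussian errors with one standard deviation per response column, and the per-observation return value are all illustrative choices, not a description of fct_log_lik.

```r
# Hypothetical helper: per-observation Gaussian log-likelihood.
# Assumption: errors are independent, with one sd per response column.
loglik_sketch <- function(mu_mat, sig_vec, y) {
  # Repeat each column's standard deviation down the rows.
  sd_mat <- matrix(sig_vec, nrow = nrow(y), ncol = ncol(y), byrow = TRUE)
  ll <- dnorm(y, mean = mu_mat, sd = sd_mat, log = TRUE)
  dim(ll) <- dim(y)
  # Sum the log-densities over the response features of each observation.
  rowSums(ll)
}

y <- matrix(0, nrow = 3, ncol = 2)
loglik_sketch(mu_mat = y, sig_vec = c(1, 1), y = y)
```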
Internal Perturb Function
Description
Internal Perturb Function
Usage
fct_new_assign(assign, k, p)
Arguments
assign | 
 The current clustering assignments.  | 
k | 
 The number of groups.  | 
p | 
 The acceptance probability.  | 
Value
A perturbed assignment.
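One way to realize a perturbation with the behavior described for accept_prob (values near 1 give small changes, values near 0 give nearly random assignments) is sketched below. The helper name perturb_sketch and the keep-or-resample rule are assumptions, not the package's code.

```r
# Hypothetical helper: keep each label with probability p, otherwise
# replace it with a label drawn uniformly from 1..k.
perturb_sketch <- function(assign, k, p) {
  keep <- runif(length(assign)) < p
  proposal <- sample(seq_len(k), length(assign), replace = TRUE)
  ifelse(keep, assign, proposal)
}

set.seed(2)
current <- c(1, 1, 2, 2, 1, 2)
perturb_sketch(current, k = 2, p = 0.9)
```

A resampled label can coincide with the old one, so the expected fraction of changed labels is at most 1 - p.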
Internal Pi Function
Description
Internal Pi Function
Usage
fct_pi_vec(clust_assign, k, N)
Arguments
clust_assign | 
 The current clustering assignment.  | 
k | 
 The number of groups.  | 
N | 
 The sample size.  | 
Value
A mixing vector.
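Mixing proportions for a hard clustering are typically the share of observations in each group. A minimal base R sketch, assuming that convention (the helper name pi_sketch is hypothetical):

```r
# Hypothetical helper: proportion of observations assigned to each of
# the k groups; groups with no members get a proportion of zero.
pi_sketch <- function(clust_assign, k, N) {
  tabulate(clust_assign, nbins = k) / N
}

pi_sketch(c(1, 1, 2, 1), k = 3, N = 4)
```

Using tabulate with nbins = k keeps the result at length k even when a group is empty, which table would not do without a factor.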
Internal Rank Estimation Function
Description
Internal Rank Estimation Function
Usage
fct_rank(x, y, sigma, eta)
Arguments
x | 
 A matrix of predictors.  | 
y | 
 A matrix of responses.  | 
sigma | 
 An estimated noise level.  | 
eta | 
 A rank selection parameter.  | 
Value
The estimated rank.
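A common way to estimate rank in reduced-rank regression is to count the singular values of the fitted responses that exceed a noise-dependent threshold. The sketch below follows that idea; the helper name rank_sketch and the threshold sigma * eta are assumptions for illustration and may differ from the rule used by fct_rank.

```r
# Hypothetical helper: count singular values of the projection of y onto
# the column space of x that exceed the (assumed) threshold sigma * eta.
rank_sketch <- function(x, y, sigma, eta) {
  fitted <- qr.fitted(qr(x), y)
  sum(svd(fitted)$d > sigma * eta)
}

set.seed(3)
x <- matrix(rnorm(50 * 5), 50, 5)
a <- outer(rnorm(5), rnorm(4))            # rank-one coefficient matrix
y <- x %*% a + 0.01 * matrix(rnorm(50 * 4), 50, 4)
rank_sketch(x, y, sigma = 0.01, eta = 10)
```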
Internal Penalty Parameter Selection Function
Description
Internal Penalty Parameter Selection Function
Usage
fct_select_lambda(
  x,
  y,
  k,
  clust_assign = NULL,
  initial = FALSE,
  type = "all",
  verbose
)
Arguments
x | 
 A matrix of predictors.  | 
y | 
 A matrix of responses.  | 
k | 
 The number of groups.  | 
clust_assign | 
 The current clustering assignment.  | 
initial | 
 An initial penalty parameter.  | 
type | 
 A type.  | 
verbose | 
 A boolean indicating whether to print to screen.  | 
Value
A selected penalty parameter.
Internal Sigma Estimation Function
Description
Internal Sigma Estimation Function
Usage
fct_sigma(y, N, m)
Arguments
y | 
 A matrix of responses.  | 
N | 
 The sample size.  | 
m | 
 The number of outcome variables.  | 
Value
The estimated sigma.
Internal Simulated Annealing Function
Description
Internal Simulated Annealing Function
Usage
fct_sim_anneal(
  x,
  y,
  k,
  init_assign,
  lambda,
  temp,
  mu,
  eps,
  accept_prob,
  sim_N,
  track,
  anneal_iter = 1000,
  verbose
)
Arguments
x | 
 A matrix of predictors.  | 
y | 
 A matrix of responses.  | 
k | 
 The number of groups.  | 
init_assign | 
 An initial clustering assignment.  | 
lambda | 
 A vector of penalization parameters.  | 
temp | 
 The initial simulated annealing temperature, temp > 0.  | 
mu | 
 The simulated annealing temperature decrease fraction. Once the best configuration cannot be improved, the temperature T is reduced to mu * T, where 0 < mu < 1.  | 
eps | 
 The final simulated annealing temperature, eps > 0.  | 
accept_prob | 
 The simulated annealing probability of accepting a new assignment, 0 < accept_prob < 1. When closer to 1, trial assignments will only be small perturbations of the current assignment. When closer to 0, trial assignments are closer to random.  | 
sim_N | 
 The number of simulated annealing iterations for reaching equilibrium.  | 
track | 
 A likelihood tracking vector.  | 
anneal_iter | 
 The maximum number of simulated annealing iterations.  | 
verbose | 
 A boolean indicating whether to print to screen.  | 
Value
An updated clustering vector.
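The roles of temp, mu, eps, accept_prob, and sim_N can be illustrated with a generic simulated annealing loop over cluster labels. Everything below is a simplified sketch under stated assumptions: the helper name anneal_sketch, the toy objective, the Metropolis acceptance rule, and cooling after every batch of sim_N proposals (rather than only when the best configuration stops improving) are illustrative choices, not the package's implementation.

```r
# Hypothetical helper: maximize score(assign) over label vectors.
anneal_sketch <- function(score, init_assign, k, temp = 10, mu = 0.9,
                          eps = 1e-3, accept_prob = 0.9, sim_N = 20) {
  current <- init_assign
  current_score <- score(current)
  best <- current
  best_score <- current_score
  while (temp > eps) {
    for (i in seq_len(sim_N)) {
      # Proposal: keep each label with probability accept_prob.
      keep <- runif(length(current)) < accept_prob
      resampled <- sample(seq_len(k), length(current), replace = TRUE)
      trial <- ifelse(keep, current, resampled)
      trial_score <- score(trial)
      # Metropolis rule: accept improvements, sometimes accept worse moves.
      if (trial_score > current_score ||
          runif(1) < exp((trial_score - current_score) / temp)) {
        current <- trial
        current_score <- trial_score
      }
      # Remember the best configuration seen so far.
      if (current_score > best_score) {
        best <- current
        best_score <- current_score
      }
    }
    temp <- mu * temp  # geometric cooling toward eps
  }
  list(assign = best, score = best_score)
}

# Toy objective (assumption): negative within-cluster sum of squares.
set.seed(42)
z <- c(rnorm(10, -3), rnorm(10, 3))
score <- function(a) -sum(tapply(z, a, function(v) sum((v - mean(v))^2)))
init <- sample(1:2, 20, replace = TRUE)
fit <- anneal_sketch(score, init, k = 2)
fit$score >= score(init)
```

Because the best configuration is tracked separately from the current one, the returned score can never be worse than that of the initial assignment.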
Internal Weighted Log Likelihood Function
Description
Internal Weighted Log Likelihood Function
Usage
fct_weighted_ll(gamma)
Arguments
gamma | 
 A posterior matrix.  | 
Value
A weighted log-likelihood vector.
Mixed Low-Rank and Sparse Multivariate Regression for High-Dimensional Data
Description
Mixed Low-Rank and Sparse Multivariate Regression for High-Dimensional Data
Usage
mixed_lsr(
  x,
  y,
  k,
  nstart = 1,
  init_assign = NULL,
  init_lambda = NULL,
  alt_iter = 5,
  anneal_iter = 1000,
  em_iter = 1000,
  temp = 1000,
  mu = 0.95,
  eps = 1e-06,
  accept_prob = 0.95,
  sim_N = 200,
  verbose = TRUE
)
Arguments
x | 
 A matrix of predictors.  | 
y | 
 A matrix of responses.  | 
k | 
 The number of groups.  | 
nstart | 
 The number of random initializations; the result with the maximum likelihood is returned.  | 
init_assign | 
 A vector of initial assignments, NULL by default.  | 
init_lambda | 
 A vector with the values to initialize the penalization parameter for each group, e.g., c(1,1,1). Set to NULL by default.  | 
alt_iter | 
 The maximum number of times to alternate between the classification expectation maximization algorithm and the simulated annealing algorithm.  | 
anneal_iter | 
 The maximum number of simulated annealing iterations.  | 
em_iter | 
 The maximum number of EM iterations.  | 
temp | 
 The initial simulated annealing temperature, temp > 0.  | 
mu | 
 The simulated annealing temperature decrease fraction. Once the best configuration cannot be improved, the temperature T is reduced to mu * T, where 0 < mu < 1.  | 
eps | 
 The final simulated annealing temperature, eps > 0.  | 
accept_prob | 
 The simulated annealing probability of accepting a new assignment, 0 < accept_prob < 1. When closer to 1, trial assignments will only be small perturbations of the current assignment. When closer to 0, trial assignments are closer to random.  | 
sim_N | 
 The number of simulated annealing iterations for reaching equilibrium.  | 
verbose | 
 A boolean indicating whether to print to screen.  | 
Value
A list containing the likelihood, the partition, the coefficient matrices, and the BIC.
Examples
simulate <- simulate_lsr(50)
mixed_lsr(simulate$x, simulate$y, k = 2, init_lambda = c(1,1), alt_iter = 0)
Heatmap Plot of the mixedLSR Coefficient Matrices
Description
Heatmap Plot of the mixedLSR Coefficient Matrices
Usage
plot_lsr(a, abs = TRUE)
Arguments
a | 
 A coefficient matrix from a mixed_lsr model.  | 
abs | 
 A boolean for taking the absolute value of the coefficient matrix.  | 
Value
A ggplot2 heatmap of the coefficient matrix, separated by subgroup.
Examples
simulate <- simulate_lsr()
plot_lsr(simulate$a)
Simulate Heterogeneous, Low-Rank, and Sparse Data
Description
Simulate Heterogeneous, Low-Rank, and Sparse Data
Usage
simulate_lsr(
  N = 100,
  k = 2,
  p = 30,
  m = 35,
  b = 1,
  d = 20,
  h = 0.2,
  case = "independent"
)
Arguments
N | 
 The sample size, default = 100.  | 
k | 
 The number of groups, default = 2.  | 
p | 
 The number of predictor features, default = 30.  | 
m | 
 The number of response features, default = 35.  | 
b | 
 The signal-to-noise ratio, default = 1.  | 
d | 
 The singular value, default = 20.  | 
h | 
 The lower bound for the singular matrix simulation, default = 0.2.  | 
case | 
 The covariance case, "independent" or "dependent", default = "independent".  | 
Value
A list of simulation values, including the x matrix, the y matrix, the coefficients, and the true clustering assignments.
Examples
simulate_lsr()
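To make "low-rank and sparse" concrete, the base R sketch below builds a single coefficient matrix that has both properties and generates responses from it. The construction (a row-sparse factor times a dense factor, with standard normal noise) and the variables r and s are assumptions chosen for illustration; simulate_lsr uses its own scheme controlled by b, d, h, and case.

```r
set.seed(1)
N <- 100; p <- 30; m <- 35
r <- 2   # target rank (illustrative)
s <- 5   # number of predictors with signal (illustrative)

# Row-sparse left factor: only the first s predictors carry signal.
U <- matrix(0, p, r)
U[seq_len(s), ] <- rnorm(s * r)
V <- matrix(rnorm(m * r), m, r)

# p x m coefficient matrix with rank at most r and p - s zero rows.
A <- U %*% t(V)

x <- matrix(rnorm(N * p), N, p)
y <- x %*% A + matrix(rnorm(N * m), N, m)

qr(A)$rank                 # numerical rank of A
sum(rowSums(A != 0) > 0)   # predictors with nonzero coefficients
```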