FBMS - Flexible Bayesian Model Selection and Model Averaging

The FBMS package provides functions for Flexible Bayesian Model Selection and Model Averaging.

Examples

Below are provided examples on how to run the algorithm and what the results tell us, we begin by loading the package and a supplied dataset

library(FBMS)
#> Loading required package: fastglm
#> Loading required package: bigmemory
#> Loading required package: GenSA
#> Loading required package: parallel
library(GenSA)
library(fastglm)
data("breastcancer")
bc <- breastcancer[,c(ncol(breastcancer),2:(ncol(breastcancer)-1))]

We need some nonlinear transformations for the algorithm to use, the package offers a selection of these, but you are also able to define your own. Here we create a list of the ones to use, all but one are supplied by the package.

to3 <- function(x) x^3

transforms <- c("sigmoid","sin_deg","exp_dbl","p0","troot","to3")

By calling two functions in the package, a list of probabilities for various parts of the algorithm, as well as a list of parameters are created. The list of probabilities needs the list of transformations to be able to create the vector of probabilities for the different transformations

probs <- gen.probs.gmjmcmc(transforms)
params <- gen.params.gmjmcmc(bc)

We can use one of the supplied likelihood functions, but here we demonstrate how to create our own, it takes four arguments, the dependent \(y\) variable, the matrix \(X\) containing all independent variables, the model as a logical vector specifying the columns of \(X\), and a list of complexity measures for the features involved in the model

loglik.example <- function (y, x, model, complex, params) {
  r <- 20/223
  suppressWarnings({mod <- fastglm(as.matrix(x[,model]), y, family=binomial())})
  ret <- (-(mod$deviance -2*log(r)*sum(complex$width)))/2
  return(list(crit=ret, coefs=mod$coefficients))
}

To be able to calculate the alphas when using for example strategy 3 as per Hubin et al., we need a function for the log likelihood, in this example we will use the function supplied by the package called logistic.loglik.alpha. With that function as a starting point, you can also create your own function for this. We also adjust our parameter list to use the third strategy.

params$feat$alpha <- 3

We are now ready to run the algorithm, in this vignette we will only run very few iterations for demonstration purposes, but the only thing that needs to be changed are the number or populations to visit T, the number of iterations per population N and the number of iterations for the final population N.final

set.seed(1234)
result <- gmjmcmc(bc, loglik.example, logistic.loglik.alpha, transforms, P=3, N=30, N.final=60, probs, params)
#> Data coerced to matrix type.
#> Intercept added to data.
#> Population 1 begin.
New best crit: -33.7601529708647 
#> 
New best crit: -31.3487134729661 
#> 
New best crit: -30.9876784635977 
#> 
New best crit: -28.9372739751026 
#> 
New best crit: -26.5258344770878 
#> 
New best crit: -24.1143949791927 
#> 
New best crit: -21.7029554813142 
#> 
New best crit: -19.2915159834322 
#> 
New best crit: -16.8800764855705 
#> 
New best crit: -16.8800764855416 
#> 
New best crit: -16.8800764855347 
#> 
#> Population 1 done.
#> 
Current best crit: -16.8800764855347 
#> Feature importance:
#>             !                 | radius_mean 
#>  ###########!#################| texture_mean 
#>  ###########!#################| perimeter_mean 
#>             !                 | area_mean 
#>             !                 | smoothness_mean 
#>  ###########!#################| compactness_mean 
#>             !                 | concavity_mean 
#>             !                 | concave.points_mean 
#>             !               ##| symmetry_mean 
#>             !                 | fractal_dimension_mean 
#>             !                 | radius_se 
#>             !                 | texture_se 
#>             !              ###| perimeter_se 
#>  ###########!#################| area_se 
#>             !                 | smoothness_se 
#>             !                 | compactness_se 
#>             !                 | concavity_se 
#>             !                 | concave.points_se 
#>             !                 | symmetry_se 
#>             !                 | fractal_dimension_se 
#>             !                 | radius_worst 
#>             !                 | texture_worst 
#>             !                 | perimeter_worst 
#>             !                 | area_worst 
#>             !                 | smoothness_worst 
#>             !                 | compactness_worst 
#>             !               ##| concavity_worst 
#> ############!#################| concave.points_worst 
#>    #########!#################| symmetry_worst 
#>             !                 | fractal_dimension_worst 
#> Replaced feature concavity_mean with (compactness_mean*compactness_mean) 
#> Replaced feature texture_se with (concavity_mean*area_worst) 
#> Replaced feature symmetry_se with exp_dbl(symmetry_se) 
#> Replaced feature radius_worst with to3(symmetry_worst) 
#> Replaced feature smoothness_worst with sin_deg(-146+12.84*concave.points_worst+-7.51*fractal_dimension_worst+-25.51*symmetry_worst+0.19*area_mean+23.98*concavity_mean+1.96*texture_se) 
#> Replaced feature concavity_worst with exp_dbl(smoothness_mean) 
#> Added feature p0(2.11+-0.38*symmetry_worst+-0.07*texture_mean+-2.33*concave.points_worst+-0.05*exp_dbl(smoothness_mean)+-0.82*fractal_dimension_mean+-1.28*compactness_mean+-3.59*compactness_worst) 
#> Added feature (perimeter_mean*compactness_mean) 
#> Added feature (area_worst*area_se) 
#> Added feature (sin_deg(350.74+-554.06*concave.points_worst+-531.58*fractal_dimension_worst+-356.89*symmetry_worst+-489.51*area_mean+-408.97*concavity_mean+-318.42*texture_se)*compactness_mean) 
#> Added feature ((area_worst*area_se)*to3(symmetry_worst)) 
#> Added feature (concave.points_worst*symmetry_worst) 
#> Added feature p0(symmetry_worst) 
#> Added feature (smoothness_worst*concave.points_worst) 
#> Added feature (smoothness_se*area_se) 
#> Added feature sigmoid(compactness_mean) 
#> Added feature to3(perimeter_mean) 
#> Added feature (concavity_mean*symmetry_worst) 
#> Added feature troot(area_se) 
#> Population 2 begin.
New best crit: -96.4575799165155 
#> 
New best crit: -94.0461404185737 
#> 
New best crit: -47.9676443754457 
#> 
New best crit: -43.2993839991298 
#> 
New best crit: -41.9021668179924 
#> 
New best crit: -37.1878156666056 
#> 
New best crit: -36.1725870809054 
#> 
New best crit: -32.3687735695188 
#> 
New best crit: -31.3377227748698 
#> 
#> Population 2 done.
#> 
Current best crit: -31.3377227748698 
#> Feature importance:
#>             !                #| radius_mean 
#>             !                 | texture_mean 
#>  ###########!#################| perimeter_mean 
#>             !                #| area_mean 
#>             !                 | smoothness_mean 
#>             !                 | compactness_mean 
#>             !                 | (compactness_mean*compactness_mean) 
#>  ###########!#################| concave.points_mean 
#>             !                #| symmetry_mean 
#>  ###########!#################| fractal_dimension_mean 
#>             !                #| radius_se 
#>             !                 | (concavity_mean*area_worst) 
#>  ###########!#################| perimeter_se 
#>             !                #| area_se 
#>             !                #| smoothness_se 
#>             !                 | compactness_se 
#>             !                 | concavity_se 
#>             !                 | concave.points_se 
#>             !                 | exp_dbl(symmetry_se) 
#>  ###########!#################| fractal_dimension_se 
#>             !                 | to3(symmetry_worst) 
#>             !                 | texture_worst 
#>             !                #| perimeter_worst 
#>  ###########!#################| area_worst 
#>             !                 | sin_deg(-146+12.84*concave.points_worst+-7.51*fractal_dimension_worst+-25.51*symmetry_worst+0.19*area_mean+23.98*concavity_mean+1.96*texture_se) 
#>  ###########!#################| compactness_worst 
#>             !                 | exp_dbl(smoothness_mean) 
#>             !                #| concave.points_worst 
#>             !                 | symmetry_worst 
#>             !                 | fractal_dimension_worst 
#>             !                 | p0(2.11+-0.38*symmetry_worst+-0.07*texture_mean+-2.33*concave.points_worst+-0.05*exp_dbl(smoothness_mean)+-0.82*fractal_dimension_mean+-1.28*compactness_mean+-3.59*compactness_worst) 
#>             !                 | (perimeter_mean*compactness_mean) 
#>             !                 | (area_worst*area_se) 
#>             !                 | (sin_deg(350.74+-554.06*concave.points_worst+-531.58*fractal_dimension_worst+-356.89*symmetry_worst+-489.51*area_mean+-408.97*concavity_mean+-318.42*texture_se)*compactness_mean) 
#>             !                 | ((area_worst*area_se)*to3(symmetry_worst)) 
#>             !                 | (concave.points_worst*symmetry_worst) 
#>  ###########!#################| p0(symmetry_worst) 
#>             !                 | (smoothness_worst*concave.points_worst) 
#>             !                 | (smoothness_se*area_se) 
#>             !                 | sigmoid(compactness_mean) 
#>             !                 | to3(perimeter_mean) 
#>             !                 | (concavity_mean*symmetry_worst) 
#>             !                 | troot(area_se) 
#> Replaced feature symmetry_mean with to3(-2.39+3.25*concave.points_mean+0*area_worst) 
#> Replaced feature compactness_se with p0((compactness_mean*compactness_mean)) 
#> Replaced feature concavity_se with to3(perimeter_se) 
#> Replaced feature to3(symmetry_worst) with (perimeter_se*fractal_dimension_mean) 
#> Replaced feature sin_deg(-146+12.84*concave.points_worst+-7.51*fractal_dimension_worst+-25.51*symmetry_worst+0.19*area_mean+23.98*concavity_mean+1.96*texture_se) with troot(4.11+-0.03*perimeter_mean+1.25*p0(symmetry_worst)+2.87*fractal_dimension_mean) 
#> Replaced feature (sin_deg(350.74+-554.06*concave.points_worst+-531.58*fractal_dimension_worst+-356.89*symmetry_worst+-489.51*area_mean+-408.97*concavity_mean+-318.42*texture_se)*compactness_mean) with (p0(symmetry_worst)*compactness_se) 
#> Replaced feature sigmoid(compactness_mean) with p0(2.46+-4.19*compactness_se+-0.08*perimeter_mean+0.08*perimeter_se+0.01*area_worst) 
#> Replaced feature to3(perimeter_mean) with (p0(symmetry_worst)*area_mean) 
#> Replaced feature (concavity_mean*symmetry_worst) with radius_worst 
#> Population 3 begin.
 |=                                       |
 |==                                      |
New best crit: -200.149478326352 
#> 
 |===                                     |
New best crit: -195.326599330544 
#> 
 |====                                    |
New best crit: -183.269401841014 
#> 
 |=====                                   |
New best crit: -36.1715924687893 
#> 
 |======                                  |
 |=======                                 |
 |========                                |
 |=========                               |
New best crit: -26.5258344771775 
#> 
 |==========                              |
 |===========                             |
 |============                            |
 |=============                           |
New best crit: -21.7029554815947 
#> 
 |==============                          |
New best crit: -19.2915159836881 
#> 
 |===============                         |
 |================                        |
 |=================                       |
 |==================                      |
 |===================                     |
 |====================                    |
 |=====================                   |
 |======================                  |
 |=======================                 |
 |========================                |
 |=========================               |
 |==========================              |
 |===========================             |
 |============================            |
 |=============================           |
 |==============================          |
 |===============================         |
 |================================        |
 |=================================       |
 |==================================      |
 |===================================     |
 |====================================    |
 |=====================================   |
 |======================================  |
 |======================================= |
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
 |========================================|
#> Population 3 done.
#> 
Current best crit: -19.2915159836881 
#> Feature importance:
#>             !                 | radius_mean 
#>             !                 | texture_mean 
#>             !                 | perimeter_mean 
#>             !                 | area_mean 
#>             !                 | smoothness_mean 
#>             !                 | compactness_mean 
#>             !                 | (compactness_mean*compactness_mean) 
#>             !                 | concave.points_mean 
#>             !                 | to3(-2.39+3.25*concave.points_mean+0*area_worst) 
#>             !                 | fractal_dimension_mean 
#>             !               ##| radius_se 
#>             !                 | (concavity_mean*area_worst) 
#>             !                 | perimeter_se 
#>             !                 | area_se 
#>             !                 | smoothness_se 
#>             !                 | p0((compactness_mean*compactness_mean)) 
#>             !                 | to3(perimeter_se) 
#>             !                 | concave.points_se 
#>             !                 | exp_dbl(symmetry_se) 
#>             !                 | fractal_dimension_se 
#>             !                 | (perimeter_se*fractal_dimension_mean) 
#>  ###########!#################| texture_worst 
#>             !                 | perimeter_worst 
#>             !                 | area_worst 
#>             !                 | troot(4.11+-0.03*perimeter_mean+1.25*p0(symmetry_worst)+2.87*fractal_dimension_mean) 
#>             !                 | compactness_worst 
#>             !                 | exp_dbl(smoothness_mean) 
#>             !                 | concave.points_worst 
#>  ###########!#################| symmetry_worst 
#>             !                 | fractal_dimension_worst 
#>             !                 | p0(2.11+-0.38*symmetry_worst+-0.07*texture_mean+-2.33*concave.points_worst+-0.05*exp_dbl(smoothness_mean)+-0.82*fractal_dimension_mean+-1.28*compactness_mean+-3.59*compactness_worst) 
#>             !                 | (perimeter_mean*compactness_mean) 
#>             !                 | (area_worst*area_se) 
#>             !                 | (p0(symmetry_worst)*compactness_se) 
#>             !                 | ((area_worst*area_se)*to3(symmetry_worst)) 
#>  ###########!#################| (concave.points_worst*symmetry_worst) 
#>             !                 | p0(symmetry_worst) 
#>             !                 | (smoothness_worst*concave.points_worst) 
#>             !                 | (smoothness_se*area_se) 
#>             !                 | p0(2.46+-4.19*compactness_se+-0.08*perimeter_mean+0.08*perimeter_se+0.01*area_worst) 
#>             !                 | (p0(symmetry_worst)*area_mean) 
#>             !                 | radius_worst 
#>  ###########!#################| troot(area_se)

We can then summarize the results using the supplied function and get a plot of the importance of the parameters in the last population of features

plot(result)