README

SBMSplitMerge is an R package for performing inference on the number of blocks and the edge-state parameters in a Generalised Stochastic Block Model (GSBM) using a split-merge sampler. This allows for inference in non-conjugate edge-state models in a Bayesian framework.

The git repository, available on GitHub, stores this package along with its examples and example data:

Getting the package

Model for the package

Suppose that you have network data in the form of counts of interactions between a set of \(N\) animals. You think the animals may display group-like behaviour in their interactions, such that individuals in the same group have a higher chance of interacting. You are also unsure of the number of groups in the network.
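To make this setting concrete, the following base R sketch simulates a toy matrix of interaction counts with group structure. Everything in it is an assumption made for illustration: the Poisson counts, the two groups, the rates, the symmetric layout with an empty diagonal, and the names groups, rates and interacts. It does not use any function from the package.

```r
set.seed(1)

N <- 20                                    # number of animals (hypothetical)
K <- 2                                     # number of groups used to simulate
groups <- rep(seq_len(K), length.out = N)  # hypothetical group labels

# Mean interaction count: higher within a group than between groups
rate_within <- 5
rate_between <- 1
same_group <- outer(groups, groups, "==")
rates <- ifelse(same_group, rate_within, rate_between)

# Draw symmetric Poisson counts with an empty diagonal (no self-interaction)
interacts <- matrix(0L, N, N)
upper <- upper.tri(interacts)
interacts[upper] <- rpois(sum(upper), rates[upper])
interacts <- interacts + t(interacts)

mean(interacts[same_group & upper])   # average within-group count
mean(interacts[!same_group & upper])  # average between-group count
```

In real data the group labels and the number of groups are unknown; only a matrix like interacts is observed.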

Classic network modelling techniques are based on graphs where the interactions are binary. This is a major restriction. Some methods allow the practitioner to specify a distribution for the interactions, so long as it has a conjugate prior. A flexible model would allow the practitioner to place any distribution on the interaction process.

The number of groups, or blocks, in the network is a key parameter. Some models in the literature set a joint prior on the number of blocks and the assignment of individuals to blocks. This can lead to too many small blocks and inconsistent posterior behaviour (cf. Miller and Harrison).

The GSBM allows an explicit prior on the number of groups and an arbitrary distribution for the interaction process.
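As a rough sketch of this structure, the model can be written hierarchically. The notation is assumed here rather than taken from the package: \(K\) is the number of blocks, \(z_i\) the block of individual \(i\), \(\theta_{kl}\) the edge-state parameter between blocks \(k\) and \(l\), \(x_{ij}\) the observed interaction between individuals \(i\) and \(j\), and \(f\) the chosen edge-state distribution. The priors are left generic because this section does not specify them.

\[
\begin{aligned}
K &\sim p(K), \\
z_1, \dots, z_N \mid K &\sim p(z \mid K), \\
\theta_{kl} \mid K &\sim p(\theta), \qquad 1 \le k, l \le K, \\
x_{ij} \mid z, \theta &\sim f(\,\cdot \mid \theta_{z_i z_j}).
\end{aligned}
\]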

Using the package

Consider the animal interaction example above. Suppose the interactions are in an \(N\) by \(N\) matrix in R called interacts. First, you must specify the parts of the model:

- the interaction process in an edgemod object
- the prior for the parameters of the edge-model in a parammod object
- the prior for the number of blocks in a blockmod object

The general method is to specify each of these objects in a list (named, say, model), then call

posterior <- sampler(interacts, model, 10000, "rj")

to run 10000 iterations of the split-merge sampler. The returned list, posterior, contains the traces for the number of blocks, the parameter values, and the block memberships.

print(eval_plots(posterior))

This plots the traces of the above variables, along with the posterior probability that each pair of individuals belongs to the same block.
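The pairwise same-block probability can be estimated from a trace of block memberships by averaging, over iterations, the indicator that two individuals share a block. The sketch below shows the idea in base R on a small hand-written trace. The layout of z_trace (one row per iteration, one column per individual) and the name coclust are assumptions for illustration and may not match how the package stores or computes these quantities.

```r
# Hypothetical trace: one row per iteration, one column per individual
z_trace <- rbind(
  c(1, 1, 2, 2),
  c(1, 1, 1, 2),
  c(2, 2, 1, 1),
  c(1, 1, 2, 2)
)

n_iter <- nrow(z_trace)
N <- ncol(z_trace)

# Average, over iterations, the indicator that i and j share a block
coclust <- matrix(0, N, N)
for (s in seq_len(n_iter)) {
  coclust <- coclust + outer(z_trace[s, ], z_trace[s, ], "==")
}
coclust <- coclust / n_iter

coclust[1, 2]  # individuals 1 and 2 share a block in every iteration
coclust[1, 3]  # individuals 1 and 3 share a block in one of four iterations
```

Because only equality of labels is compared, the estimate is unaffected by block labels being swapped between iterations, as in the third row of z_trace.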

For details on setting up the model, see the accompanying vignette on a model with Weibull-distributed edges:

vignette("Weibull-edges")