Welcome to ClientVPS Mirrors

Finding communities in large datasets

Finding communities in large datasets

library(dendroNetwork)

Community detection in very large datasets

When using larger datasets of tree-ring series, calculating the table with similarities can take a lot of time, but finding communities even more. It is therefore recommended to use of parallel computing for Clique Percolation: clique_community_names_par(network, k=3, n_core = 4). This reduces the amount of time significantly. For most datasets clique_community_names() is sufficiently fast and for smaller datasets clique_community_names_par() can even be slower due to the parallelisation. Therefore, the funtion clique_community_names() should be used initially and if this is very slow, start using clique_community_names_par().

The workflow is similar as described in the vignette("dendroNetwork"), but with minor changes:

  1. load network.

  2. compute similarities.

  3. find the maximum clique size: igraph::clique_num(network) .

  4. detect communities for each clique size separately:

    • com_cpm_k3 <- clique_community_names_par(network, k=3, n_core = 6).

    • com_cpm_k4 <- clique_community_names_par(network, k=4, n_core = 6).

    • and so on until the maximum clique size.

  5. merge these into a single data frame by com_cpm_all <- rbind(com_cpm_k3,com_cpm_k4, com_cpm_k5,... ) .

  6. create table for use in cytoscape with all communities: com_cpm_all <- com_cpm_all |> dplyr::count(node, com_name) |> tidyr::spread(com_name, n) .

  7. Continue with the visualisation in Cytoscape, see the relevant section in the vignette("dendroNetwork") .

Need a high-speed mirror for your open-source project?
Contact our mirror admin team at info@clientvps.com.

This archive is provided as a free public service to the community.
Proudly supported by infrastructure from VPSPulse , RxServers , BuyNumber , UnitVPS , OffshoreName and secure payment technology by ArionPay.