---
title: "Manhattan Plots"
output:
    rmarkdown::html_vignette:
        toc: true
description: >
    Visualize a summary of cluster-feature and feature-feature associations.
vignette: >
    %\VignetteIndexEntry{Manhattan Plots}
    %\VignetteEngine{knitr::rmarkdown}
    %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Download a copy of the vignette to follow along here: [manhattan_plots.Rmd](https://raw.githubusercontent.com/BRANCHlab/metasnf/main/vignettes/manhattan_plots.Rmd)

Manhattan plots can quickly visualize the relationships between features and cluster solutions.
metasnf provides three main Manhattan plot variations:

1. `esm_manhattan_plot`: Visualize how a set of cluster solutions separate over input/out-of-model features
2. `mc_manhattan_plot`: Visualize how representative solutions from defined meta clusters separate over input/out-of-model features
3. `var_manhattan_plot`: Visualize how one raw feature associates with other raw features (similar to `assoc_pval_heatmap`)

## Data set-up

The example below is taken from the ["complete example" vignette](https://branchlab.github.io/metasnf/articles/a_complete_example.html).

```{r eval = FALSE}
library(metasnf)

# Start by making a data list containing all our dataframes to more easily
# identify subjects without missing data
full_data_list <- generate_data_list(
    list(subc_v, "subcortical_volume", "neuroimaging", "continuous"),
    list(income, "household_income", "demographics", "continuous"),
    list(pubertal, "pubertal_status", "demographics", "continuous"),
    list(anxiety, "anxiety", "behaviour", "ordinal"),
    list(depress, "depressed", "behaviour", "ordinal"),
    uid = "unique_id"
)

# Partition into a data and target list (optional)
data_list <- full_data_list[1:3]
target_list <- full_data_list[4:5]

# Build space of settings to cluster over
set.seed(42)
settings_matrix <- generate_settings_matrix(
    data_list,
    nrow = 20,
    min_k = 20,
    max_k = 50
)

# Clustering
solutions_matrix <- batch_snf(data_list, settings_matrix)

# Calculate p-values between cluster solutions and features
extended_solutions_matrix <- extend_solutions(
    solutions_matrix,
    data_list = data_list,
    target_list = target_list,
    min_pval = 1e-10 # p-values below 1e-10 will be thresholded to 1e-10
)
```

## Associations with Multiple Cluster Solutions (`esm_manhattan_plot`)

```{r eval = FALSE}
esm_manhattan <- esm_manhattan_plot(
    extended_solutions_matrix[1:5, ],
    neg_log_pval_thresh = 5,
    threshold = 0.05,
    point_size = 3,
    jitter_width = 0.1,
    jitter_height = 0.1,
    plot_title = "Feature-Solution Associations",
    text_size = 14,
    bonferroni_line = TRUE
)

ggplot2::ggsave(
    "esm_manhattan.png",
    esm_manhattan,
    height = 5,
    width = 8,
    dpi = 100
)
```
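
To build intuition for the vertical axis and the reference lines, the base R sketch below walks through the usual Manhattan-plot arithmetic on a handful of made-up p-values. It is an illustration under stated assumptions rather than the code metasnf runs internally: the p-values and feature names are hypothetical, and it assumes that `neg_log_pval_thresh` caps the plotted -log10(p) values, that `threshold` marks a nominal significance line, and that `bonferroni_line` divides that threshold by the number of features.

```r
# Hypothetical p-values for four features (illustrative only, not real results)
pvals <- c(
    feature_a = 0.20,
    feature_b = 0.03,
    feature_c = 1e-4,
    feature_d = 1e-10
)

# Manhattan plots show -log10(p), so smaller p-values sit higher on the plot
neg_log_pvals <- -log10(pvals)

# Cap large values so one extreme association does not stretch the axis
# (assumed here to be the role of `neg_log_pval_thresh`)
neg_log_pval_thresh <- 5
capped <- pmin(neg_log_pvals, neg_log_pval_thresh)

# Reference lines: the nominal threshold and a Bonferroni-corrected version
# (assumed here to divide the threshold by the number of features)
threshold <- 0.05
nominal_line <- -log10(threshold)
bonferroni_line <- -log10(threshold / length(pvals))

print(round(capped, 2))
print(round(c(nominal = nominal_line, bonferroni = bonferroni_line), 2))
```

In this toy example, a feature counts as nominally significant when its capped value lies above the nominal line, and it survives the correction only when it also clears the higher Bonferroni line.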