---
date: "`r Sys.Date()`"
title: "Correlation Analysis with manymodelr"
output: html_document
vignette: >
  %\VignetteIndexEntry{ "Variable Correlations with manymolder"}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```



- `get_var_corr`

As can probably(hopefully) be guessed from the name, this provides a convenient way to get variable correlations. It enables one to get correlation between one variable and all other variables in the data set.

**Previously, one would set `get_all` to `TRUE` if they wanted to get correlations between all variables. This argument has been dropped in favor of simply supplying an optional `other_vars` vector if one does not want to get all correlations.** 



```{r}
library(manymodelr)
# getall correlations

# default pearson

head( corrs <- get_var_corr(mtcars,comparison_var="mpg") )


```

**Previously, one would also set `drop_columns` to `TRUE` if they wanted to drop factor columns.** Now, a user simply provides a character vector specifying which column types(classes) should be dropped. It defaults to `c("character","factor")`.

```{r}
data("yields", package="manymodelr")
# purely demonstrative
get_var_corr(yields,"height",other_vars="weight",
             drop_columns=c("factor","character"),method="spearman",
             exact=FALSE)


```


Similarly, `get_var_corr_` (note the underscore at the end) provides  a convenient way to get combination-wise correlations.

```{r}

head(get_var_corr_(yields),6)

```

To use only a subset of the data, we can use provide a list of columns to `subset_cols`. By default, the first value(vector) in the list is mapped to `comparison_var` and the other to `other_Var`. The list is therefore of length 2.

```{r}

head(get_var_corr_(mtcars,subset_cols=list(c("mpg","vs"),c("disp","wt")),
                   method="spearman",exact=FALSE))

```


- `plot_corr`

Obtaining correlations would mostly likely benefit from some form of visualization. `plot_corr` aims to achieve just that. There are currently two plot styles, `squares` and `circles`. `circles` has a `shape` argument that can allow for more flexibility. It should be noted that the correlation matrix supplied to this function is an object produced by `get_var_corr_`.



To modify the plot a bit, we can choose to switch the x and y values as shown below.

```{r}


plot_corr(mtcars,show_which = "corr",
          round_which = "correlation",decimals = 2,x="other_var",  y="comparison_var",plot_style = "squares"
          ,width = 1.1,custom_cols = c("green","blue","red"),colour_by = "correlation")

```


To show significance of the results instead of the correlations themselves, we can set `show_which` to "signif"  as shown below. By default, significance is set to 0.05. You can override this by supplying a different `signif_cutoff`.

```{r}
# color by p value
# change custom colors by supplying custom_cols
# significance is default 
set.seed(233)
plot_corr(mtcars, x="other_var", y="comparison_var",plot_style = "circles",show_which = "signif", colour_by = "p.value", sample(colours(),3))

```



To explore more options, please take a look at the documentation.