---
title: "Run Canek on a toy example"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Run Canek on a toy example}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  fig.width = 5,
  fig.height = 5,
  fig.align = "center",
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(Canek)

# Functions
## Plot the first two PCA coordinates, colored by a factor label
plotPCA <- function(pcaData = NULL, label = NULL, legPosition = "topleft"){
  # Integer codes of the factor levels are used as plotting colors
  col <- as.integer(label)
  plot(x = pcaData[,"PC1"], y = pcaData[,"PC2"],
       col = col, cex = 0.75, pch = 19,
       xlab = "PC1", ylab = "PC2")
  # Legend colors follow the level order so they match the point colors
  legend(legPosition, pch = 19,
         legend = levels(label),
         col = seq_along(levels(label)))
}
```

## Load the data

In this toy example we use the two simulated batches included in the `SimBatches` data from the Canek package. `SimBatches` is a list containing:

* `batches`: Simulated scRNA-seq datasets with genes (rows) and cells (columns). Simulations were performed using [Splatter](https://bioconductor.org/packages/devel/bioc/vignettes/splatter/inst/doc/splatter.html).
* `cell_types`: A factor containing the cell-type labels of the batches.

```{r}
lsData <- list(B1 = SimBatches$batches[[1]], B2 = SimBatches$batches[[2]])
batch <- factor(c(rep("Batch-1", ncol(lsData[[1]])),
                  rep("Batch-2", ncol(lsData[[2]]))))
celltype <- SimBatches$cell_types
table(batch)
table(celltype)
```

## PCA before correction

We perform Principal Component Analysis (PCA) on the joined datasets and plot the first two PCs as a scatter plot. The batch effect causes cells to group by batch.

```{r}
data <- Reduce(cbind, lsData)
pcaData <- prcomp(t(data), center = TRUE, scale. = TRUE)$x
```

```{r}
plotPCA(pcaData = pcaData, label = batch, legPosition = "bottomleft")
plotPCA(pcaData = pcaData, label = celltype, legPosition = "bottomleft")
```

## Run Canek

We correct the toy batches using the function `RunCanek`. This function accepts:

* List of matrices
* Seurat object
* List of Seurat objects
* SingleCellExperiment object
* List of SingleCellExperiment objects

In this example we use the list of matrices created before.

```{r}
data <- RunCanek(lsData)
```

## PCA after correction

We perform PCA on the corrected datasets and plot the first two PCs. After correction, the cells group by their corresponding cell type.

```{r}
pcaData <- prcomp(t(data), center = TRUE, scale. = TRUE)$x
```

```{r}
plotPCA(pcaData = pcaData, label = batch, legPosition = "topleft")
plotPCA(pcaData = pcaData, label = celltype, legPosition = "topleft")
```

## Session info

```{r}
sessionInfo()
```
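
## Optional: quantify the batch effect

The PCA plots above are judged by eye. As a rough numeric complement, the sketch below computes the fraction of the variance of PC1 that is explained by a grouping factor (the R-squared of a one-way linear model). The helper `batchR2` and the `toy*` objects are hypothetical names introduced only for this illustration; they are not part of Canek, and the data are simulated with base R rather than taken from `SimBatches`.

```{r}
# Hypothetical helper (not part of Canek): R-squared of a one-way linear
# model, i.e. the share of the variance of `pc` explained by `label`.
batchR2 <- function(pc, label) {
  summary(lm(pc ~ label))$r.squared
}

set.seed(1)

# Assumed toy data: 50 genes (rows) x 40 cells (columns) per batch.
# The second batch is shifted by a constant to mimic a batch effect.
toyB1 <- matrix(rnorm(50 * 40), nrow = 50)
toyB2 <- matrix(rnorm(50 * 40), nrow = 50) + 2
toyBatch <- factor(rep(c("Batch-1", "Batch-2"), each = 40))

# Same PCA call as in the sections above, applied to the joined toy batches
toyPCA <- prcomp(t(cbind(toyB1, toyB2)), center = TRUE, scale. = TRUE)$x

# Values close to 1 indicate that PC1 mostly separates the batches
batchR2(toyPCA[, "PC1"], toyBatch)
```

The same helper could be applied to `pcaData[, "PC1"]` with `batch` as the label, once before and once after running `RunCanek`. A lower value after correction would be consistent with reduced batch separation along PC1, although a single component gives only a partial picture.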