--- title: "Classification Units" author: "rassta: Raster-based Spatial Stratification Algorithms" output: rmarkdown::html_vignette: dev: "jpeg" vignette: > %\VignetteIndexEntry{Classification Units} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` -------------------------------------------------------------------------------- Get the data required for this vignette ```{r data, fig.show = 'hold', fig.height = 2.5, fig.width = 2.8, message = FALSE, warning = FALSE, error = TRUE} # Compressed folder with files from rassta’s installation folder wasoil <- system.file("exdat/wasoil.zip", package = "rassta") # Directory for temporary files o <- tempdir() # Copy compressed folder to directory for temporary files file.copy(from = wasoil, to = o) # Extract files to subfolder d <- paste(o, "/rassta", sep = "") unzip(paste(o, "/wasoil.zip", sep = ""), exdir = d) ``` -------------------------------------------------------------------------------- Classification units are first-level spatial entities that result from the stratification of a landscape factor represented by *n* variables(s). One example of a landscape factor is **climate**, which can be represented by variables such as mean annual temperature or total annual precipitation. Accordingly, climatic classification units can result from the spatial stratification of temperature and precipitation through methods such as (bio)climatic classification schemes, cluster analysis, etc. Currently, **rassta** allows the creation of classification units based on dimension reduction and cluster analysis. ## Dimension reduction and estimation of k The process to create classification units starts with the reduction of the feature space represented by *n* variables. This low-dimension representation of the feature space is created using the self-organizing map (SOM) technique of Kohonen (1982, 1990). Subsequently, cluster analysis is performed on the reduced feature space using the partitioning around medoids (PAM) algorithm (Kaufman and Rousseeuw, 1990). The 'optimal' number of clusters *k* is estimated with the *gap statistic* of Tibshirani et al. (2001). The packages **kohonen** (Wehrens and Kruisselbrink, 2018) and **cluster** (Maechler et al., 2021) are imported by **rassta** to perform the SOM technique and the PAM clustering with gap statistic, respectively. The following code examples demonstrate how to reduce the feature space and estimate the optimum *k* using four terrain variables and the function *som_gap*. ```{r terrvars, fig.show = 'hold', fig.height = 5, fig.width = 5.5, message = FALSE, warning = FALSE, error = TRUE} # Load rassta and terra packages library(rassta) library(terra) # Multi-layer SpatRaster with 4 terrain variables var <- c("height.tif", "midslope.tif", "slope.tif", "wetness.tif") vardir <- paste(d, var, sep = "/") terr.var <- rast(vardir) # Plot terrain variables par(mfrow = c(2,2)) nc <- c("Zissou", "Batlow", "Lajolla", "Spectral") for(i in 1:4){ plot(terr.var[[i]], main = names(terr.var)[i], col = hcl.colors(100, nc[i]), mar = c(2, 2, 2, 3.5) ) } ``` ```{r terrclas, fig.show = 'hold', fig.height = 3, fig.width = 2.8, message = FALSE, warning = FALSE, error = TRUE} # Set seed set.seed(963) # Scale variables to mean = 0 and standard deviation = 1 terr.varscale <- scale(terr.var) # With terra::aggregate(), reduce spatial resolution to reduce computing time terr.varscale <- aggregate(terr.varscale, fact = 3) # Dimensionality reduction and estimation of optimum k (max k to evaluate: 9) terr.som <- som_gap(terr.varscale, xdim = 9, ydim = 9, K.max = 9) # Plot results plot(terr.som$SOM, main = "SOM Codes") # Self-Organizing Map's codes par(mar = c(4.5, 4.5, 2, 1)) plot(terr.som$SOMgap, main = "Gap Statistic") # Gap statistic abline(v = terr.som$Kopt) # Optimum k ``` ## Rasterization of the SOM and PAM outputs Once the feature space has been reduced and the optimum *k* has been estimated with *som_gap*, rasterized versions of the SOM and PAM outputs can be created with the function *som_pam*. The rasterized PAM represents the final set of classification units for a landscape factor. Note that any other clustering algorithm outside of **rassta** can be used to create classification units from the output SOM of *som_gap*. The code below demonstrates the creation of classification units with *som_pam*. ```{r terrclasr, fig.show = 'hold', fig.height = 2.5, fig.width = 2.8, message = FALSE, warning = FALSE, error = TRUE} # Create reference SpatRaster rastref <- aggregate(terr.var[[1]], fact = 3) # Rasterization of terrain SOM grid and terrain PAM clustering terr.sompam <- som_pam(ref.rast = rastref, kohsom = terr.som$SOM, k = terr.som$Kopt ) # Plot results plot(terr.sompam$sompam.rast[[1]], main = "Terrain Self-Organizing Map", mar = c(1.5, 1.3, 1.5, 3.3), col = hcl.colors(100, "spectral", rev = TRUE) ) plot(terr.sompam$sompam.rast[[2]], main = "Terrain Classification Units", type = "classes", col = hcl.colors(terr.som$Kopt, "spectral", rev = TRUE), mar = c(1.5, 2, 1.5, 2.5) ) ``` ### Reclassification Note that using **terra**, the numeric IDs of classification units can be reassigned according to the interpretation of each unit. The assignment of new IDs for classes is commonly referred to as 'reclassification' in the geospatial analysis literature. In the next code example, the numeric ID of the terrain classification units are reassigned based on the variable *height*. Terrain classes at higher relative positions (e.g., summits) will have values closer to one. Conversely, terrain classes at lower relative positions (e.g., stream channels) will have values closer to 6. ```{r terrclasrec, fig.show = 'hold', fig.height = 2.5, fig.width = 2.8, message = FALSE, warning = FALSE, error = TRUE} # Extract rasterized PAM solution terr.cu <- terr.sompam$sompam.rast[[2]] # With terra::zonal(), unit-wise mean value of terrain height terrh <- aggregate(terr.var$height, fact = 3) # Match spatial resolution terr.stat <- zonal(terrh, terr.cu, fun = mean) # Order numeric IDs based on terrain height (descending) terr.stat <- terr.stat[order(terr.stat$height, decreasing = TRUE), ] # Column with original numeric IDs terr.stat$CU <- seq(1, terr.som$Kopt) # With terra::classify(), reclassify numeric IDs terr.cu <- classify(terr.cu, terr.stat[, c(1,3)]) # Plot original and reclassified terrain classification units plot(terr.sompam$sompam.rast[[2]], main = "Terrain Classification Units", type = "classes", col = hcl.colors(terr.som$Kopt, "spectral", rev = TRUE), mar = c(1.5, 2, 1.5, 2.5) ) plot(terr.cu, type = "classes", col = hcl.colors(terr.som$Kopt, "spectral"), main = "Reclassified Terrain Classification Units", mar = c(1.5, 2, 1.5, 2.5) ) ``` --- Clean files from temporary directory ```{r clean, message = FALSE, warning = FALSE, error = TRUE} unlink(c(paste(o, "/wasoil.zip", sep = ""), d), recursive = TRUE) ``` --- ## References - B.A. Fuentes, M.J. Dorantes, and J.R. Tipton. rassta: Raster-based Spatial Stratification Algorithms. *EarthArXiv*, 2021. https://doi.org/10.31223/X50S57 - L. Kaufman and P. Rousseeuw. Finding groups in data: an introduction to cluster analysis. *John Wiley & Sons*, 1990. https://doi.org/10.1002/9780470316801 - T. Kohonen. Self-organized formation of topologically correct feature maps. *Biological Cybernetics*, 43 (1):59–69, 1982. https://doi.org/10.1007/bf00337288 - T. Kohonen. The self-organizing map. *Proceedings of the IEEE*, 1990. https://doi.org/10.1016/s0925-2312(98)00030-7 - M. Maechler, P. Rousseeuw, A. Struyf, M. Hubert, and K. Hornik. cluster: Cluster Analysis Basics and Extensions. *R package* version 2.1.2, 2021. https://CRAN.R-project.org/package=cluster - R. Tibshirani, G. Walther, and T. Hastie. Estimating the number of clusters in a data set via the gap statistic. *Journal of the Royal Statistical Society*: Series B (Statistical Methodology), 63(2):411–423, 2001. https://doi.org/10.1111/1467-9868.00293 - R. Wehrens and J. Kruisselbrink. Flexible self-organizing maps in kohonen 3.0. *Journal of Statistical Software*, 87(1):1–18, 2018. https://doi.org/10.18637/jss.v087.i07