--- title: "Exploring Pattern Causality: An Intuitive Guide" author: "Stavros Stavroglou, Athanasios Pantelous, Hui Wang" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to patterncausality} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( warning = FALSE, collapse = TRUE, comment = "#>" ) ``` **Pattern Causality** is a novel approach for detecting and analyzing causal relationships within complex systems. It is particularly effective at: 1. **Identifying hidden patterns in time series data:** Uncovering underlying regularities in seemingly random data. 2. **Quantifying different types of causality:** Distinguishing between positive, negative, and "dark" causal influences, which are often difficult to observe directly. 3. **Providing robust statistical analysis:** Ensuring the reliability of causal inferences through rigorous statistical methods. This algorithm is especially useful in the following domains: - **Financial market analysis:** Understanding the interdependencies between different assets. - **Climate system interactions:** Revealing the complex relationships between various factors in climate change. - **Medical diagnosis:** Assisting in the analysis of disease progression patterns to identify potential causes. - **Complex system dynamics:** Studying the interactions between components in various complex systems. ## Application in Financial Markets First, we import stock data for Apple (AAPL) and Microsoft (MSFT). You can also import data using the **yahooo** API. ```{r message = FALSE} library(patterncausality) data(DJS) head(DJS) ``` The `data(DJS)` function loads the Dow Jones stocks dataset, which includes daily stock prices for several companies, including Apple and Microsoft. The `head(DJS)` function displays the first few rows of the dataset. Next, we visualize the stock prices to observe their trends more intuitively. ```{r echo=FALSE} library(ggplot2) library(ggthemes) df <- data.frame( Date = as.Date(DJS$Date), Value = c( DJS$Apple, DJS$Microsoft ), Type = c( rep("Apple", dim(DJS)[1]), rep("Microsoft", dim(DJS)[1]) ) ) ggplot(df) + geom_line(aes(Date, Value, group = Type, colour = Type), linewidth = 0.4) + theme_few(base_size = 12) + xlab("Time") + ylab("Stock Price") + theme( legend.position = "bottom", legend.box.background = element_rect(fill = NA, color = "black", linetype = 1), legend.key = element_blank(), legend.title = element_blank(), legend.background = element_blank(), axis.text = element_text(size = rel(0.8)), strip.text = element_text(size = rel(0.8)) ) + scale_color_manual(values = c("#DC143C", "#191970")) ``` ### Parameter Optimization To achieve the best results with pattern causality analysis, it's important to find the optimal parameters. The following code demonstrates how to search for these parameters using the `optimalParametersSearch` function (note that this code is not evaluated by default due to the time it may take). ```{r eval=FALSE} dataset <- DJS[,-1] parameter <- optimalParametersSearch(Emax = 5, tauMax = 5, metric = "euclidean", dataset = dataset) ``` ### Calculating Causality After determining the parameters, we can calculate the causality between the two series. ```{r} X <- DJS$Apple Y <- DJS$Microsoft pc <- pcLightweight(X, Y, E = 3, tau = 2, metric = "euclidean", h = 1, weighted = TRUE) print(pc) ``` The `pcLightweight` function performs a lightweight pattern causality analysis. - `X`: The first time series (Apple stock prices). - `Y`: The second time series (Microsoft stock prices). - `E`: The embedding dimension. - `tau`: The time delay. - `metric`: The distance metric to use. - `h`: The prediction horizon. - `weighted`: A boolean indicating whether to use erf function. The `print(pc)` function displays the results of the causality analysis, including the total, positive, negative, and dark causality percentages. Finally, we can visualize the results to better understand the causal relationships using the `plot_total` and `plot_components` functions. ```{r} plot_total(pc) plot_components(pc) ``` The `plot_total` function visualizes the total causality, and `plot_components` visualizes the different components of causality (positive, negative, and dark). ### Detailed Analysis For more detailed causality information, use the `pcFullDetails` function. ```{r eval=FALSE} X <- DJS$Apple Y <- DJS$Microsoft detail <- pcFullDetails(X, Y, E = 3, tau = 2, metric = "euclidean", h = 1, weighted = TRUE) # Access the causality components causality_real <- detail$causality_real causality_pred <- detail$causality_pred print(causality_pred) ``` This completes the entire process of the pattern causality algorithm.