---
title: "tsmarch demo"
author: "Alexios Galanos"
date: "`r Sys.Date()`"
output: 
    rmarkdown::html_vignette:
        css: custom.css
        code_folding: show
        toc: true
link-citations: yes
bibliography-style: apalike
natbiboptions: round
bibliography: tsmarch.bib
vignette: >
  %\VignetteIndexEntry{tsmarch demo}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction


All models in the `tsmarch` package expect as input a zero-mean time series of returns, 
which means that conditional mean filtration should be performed outside the package.
However, all methods provide an argument (`cond_mean`) for passing an already estimated
conditional mean, which takes care of centering the conditional distribution,
particularly during the prediction step. Alternatively, the user can take the simulated
distribution of innovations and use it as an input to the conditional mean dynamics simulation.
Details of how to work with conditional mean dynamics are available in another
vignette. For this demo, we simply center the data by their mean.


```{r mod_calc,highlight=TRUE}
suppressMessages(library(tsmarch))
suppressMessages(library(tstests))
suppressMessages(library(xts))
suppressMessages(library(shape))
suppressMessages(library(tsgarch))
suppressMessages(library(tsdistributions))
data(globalindices)
Sys.setenv(TZ = "UTC")
train_set <- 1:1600
test_set <- 1601:1698
series <- 1:5
y <- as.xts(globalindices[, series])
train <- y[train_set,]
mu <- colMeans(train)
train <- sweep(train, 2, mu, "-")
test <- y[test_set,]
test <- sweep(test, 2, mu, "-")
oldpar <- par(mfrow = c(1,1))
```


# GOGARCH Dynamics

## Model Specification and Estimation


```{r gogarch_model}
gogarch_mod <- gogarch_modelspec(train, distribution = "nig", model = "gjrgarch", components = 4) |> estimate()
summary(gogarch_mod)
```

In the code snippet above we used dimensionality reduction in the whitening stage 
with the `components` argument. The returned object is of class `r class(gogarch_mod)` 
from which we can then proceed to further analyze the series:

```{r newsimpact,fig.width=6,fig.height=4}
gogarch_mod |> newsimpact(type = "correlation", pair = c(2,5), factor = c(1,3)) |> plot()
```

## Filtering {#sec-gogarch-filtering}

Online filtering of new data with an existing estimated model can be achieved
via the `tsfilter` method, which returns an object of class `r class(gogarch_mod)` 
updated with the new information. This allows us to filter newly arrived information
without having to re-estimate the model, and since the returned object has the same
class as the estimated object, all existing methods remain available for analyzing
the new data. The next code snippet shows how to perform rolling 1-step ahead
predictions and compute the value at risk of an equally weighted portfolio at the
10% quantile.

```{r var_calc,highlight=TRUE}
h <- 98
w <- rep(1/5, 5)
gogarch_filter_mod <- gogarch_mod
var_value <- rep(0, 98)
actual <- as.numeric(coredata(test) %*% w)

# first prediction without filtering update
var_value[1] <- predict(gogarch_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
for (i in 2:h) {
  gogarch_filter_mod <- tsfilter(gogarch_filter_mod, y = test[i - 1,])
  var_value[i]  <- predict(gogarch_filter_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
}
```

At time `T+0` the initial prediction is made for `T+1`; the model is then updated with 
new information using the `tsfilter` method, bringing the model's information set to time
`T+1`, from which the prediction for `T+2` is made, and so forth. This is equivalent
to a rolling 1-step ahead prediction without re-estimation.

Having obtained the predicted value at risk from the simulated distribution, we can then
use the `var_cp_test` function from the [tstests](https://CRAN.R-project.org/package=tstests) 
package to evaluate the accuracy of the calculation:

```{r}
as_flextable(var_cp_test(actual, var_value, alpha = 0.1), include.decision = TRUE)
```


## Conditional Co-moments

There are three methods related to the conditional co-moments of the model: `tscov` (and `tscor`)
returns the `NxNxT` conditional covariance (correlation) array, 
`tscoskew` returns the `NxNxNxT` conditional co-skewness array and `tscokurt`
returns the `NxNxNxNxT` conditional co-kurtosis array. These methods benefit
from the use of multiple threads, which can be set via 
the `RcppParallel::setThreadOptions` function (though care should be taken regarding
the availability of RAM).
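For example, on a multi-core machine one might restrict the calculation to a couple of
threads; the number of threads used here is purely illustrative:

```{r, eval=FALSE}
# purely illustrative: limit the co-moment calculations to two threads
RcppParallel::setThreadOptions(numThreads = 2)
```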

```{r comoments,highlight=TRUE}
V <- tscov(gogarch_mod)
S <- tscoskew(gogarch_mod, standardized = TRUE, folded = TRUE)
K <- tscokurt(gogarch_mod, standardized = TRUE, folded = TRUE)
```

Notice that the `standardized` and `folded` arguments are used to return the standardized
co-moments in either folded or unfolded form. The unfolded form, representing the flattened
tensor of the co-moments, is useful for calculating the portfolio weighted moments
via the Kronecker product. These methods are available for both estimated and predicted/simulated 
objects. To illustrate this, we also generate a 25 step ahead prediction, generate the
co-kurtosis distribution and then combine the estimated and predicted values into a `tsmodel.predict`
object for which special purpose plots are available from the 
[tsmethods](https://CRAN.R-project.org/package=tsmethods) package.

```{r,fig.width=6,fig.height=3}
p <- predict(gogarch_mod, h = 25, nsim = 1000, seed = 100)
K_p <- tscokurt(p, standardized = TRUE, folded = TRUE, distribution = TRUE, index = 1:25)
K_p <- t(K_p[1,1,1,1,,])
colnames(K_p) <- as.character(p$forc_dates)
class(K_p) <- "tsmodel.distribution"
L <- list(original_series = xts(K[1,1,1,1,], as.Date(gogarch_mod$spec$target$index)), distribution = K_p)
class(L) <- "tsmodel.predict"
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8)
plot(L, gradient_color = "orange", interval_color = "cadetblue", median_color = "black", median_type = 2, median_width = 1, 
     n_original = 100, main = "Kurtosis [AEX]", xlab = "", cex.main = 0.8)
par(oldpar)
```

To calculate the weighted portfolio moments we can use the `tsaggregate` method and similarly
form an object for plotting, this time for the portfolio skewness.

```{r port_comoments,fig.width=6,fig.height=3}
port_moments_estimate <- tsaggregate(gogarch_mod, weights = w)
port_moments_predict <- tsaggregate(p, weights = w, distribution = TRUE)
L <- list(original_series = port_moments_estimate$skewness, distribution = port_moments_predict$skewness)
class(L) <- "tsmodel.predict"
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8)
plot(L, gradient_color = "orange", interval_color = "cadetblue", median_color = "black", median_type = 2, median_width = 1, 
     n_original = 100, main = "Portfolio Skewness", xlab = "", cex.main = 0.8)
par(oldpar)
```
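For intuition, the sketch below shows how the portfolio skewness at a single time point
could in principle be assembled by hand from the unfolded (non-standardized) co-moments
via the Kronecker product of the weight vector. The exact layout of the unfolded
co-skewness array (assumed here to be an `N x N^2` slice per time point) is an assumption
and should be checked against the package documentation; `tsaggregate` remains the
recommended route.

```{r, eval=FALSE}
# hedged sketch (not run): assumes the unfolded co-skewness is N x N^2 per time
# point and uses the (non-standardized) conditional covariance V computed earlier
S_u <- tscoskew(gogarch_mod, standardized = FALSE, folded = FALSE)
ix <- 100
sigma_p <- sqrt(as.numeric(w %*% V[, , ix] %*% w))
skew_p <- as.numeric(w %*% S_u[, , ix] %*% kronecker(w, w)) / sigma_p^3
```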

## Weighted Portfolio Distribution

The main vignette discusses in detail the convolution approach for generating
a weighted portfolio distribution from which the density and quantile functions
can then be approximated. A short example is provided below where we evaluate
the FFT approximation against the exact moments for a 98-step ahead prediction.

```{r convolution,fig.width=6,fig.height=5}
p <- predict(gogarch_mod, h = 98, nsim = 1000)
port_f_moments <- do.call(cbind, tsaggregate(p, weights = w, distribution = FALSE))
pconv <- tsconvolve(p, weights = w, fft_support = NULL, fft_step = 0.0001, fft_by = 0.00001, distribution = FALSE)
p_c_moments <- matrix(0, ncol = 4, nrow = 98)
for (i in 1:98) {
  df <- dfft(pconv, index = i)
  mu <- pconv$mu[i]
  f_2 <- function(x) (x - mu)^2 * df(x)
  f_3 <- function(x) (x - mu)^3 * df(x)
  f_4 <- function(x) (x - mu)^4 * df(x)
  sprt <- attr(pconv$y[[i]],"support")
  p_c_moments[i,2] <- sqrt(integrate(f_2, sprt[1], sprt[2], abs.tol = 1e-8, subdivisions = 500)$value)
  p_c_moments[i,3] <- integrate(f_3, sprt[1], sprt[2], abs.tol = 1e-8, subdivisions = 500)$value/p_c_moments[i,2]^3
  p_c_moments[i,4] <- integrate(f_4, sprt[1], sprt[2], abs.tol = 1e-8, subdivisions = 500)$value/p_c_moments[i,2]^4
}
par(mar = c(2,2,2,2), mfrow = c(3,1), pty = "m")
matplot(cbind(as.numeric(port_f_moments[,2]), p_c_moments[,2]), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), main = "Sigma", xaxt = "n")
grid()
matplot(cbind(as.numeric(port_f_moments[,3]), p_c_moments[,3]), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), main = "Skewness", xaxt = "n")
grid()
matplot(cbind(as.numeric(port_f_moments[,4]), p_c_moments[,4]), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), main = "Kurtosis")
grid()
par(oldpar)
```

This provides a correctness check of the FFT approximation to the inversion of the characteristic
function, as we observe that the approximated and exact moments are identical. However, in certain
cases care must be taken in calibrating the step size as well as the integration
tolerance levels to achieve the desired accuracy.

The next plot shows how to generate a value at risk surface for the prediction period. It should be
noted that the sample quantiles from the simulated distribution will not exactly match the FFT 
approximation, since the former is based on simulation whereas the latter is an analytic approximation
to the weighted density.

```{r port_convolution,fig.width=6,fig.height=4}
p <- predict(gogarch_mod, h = 98, nsim = 5000)
pconv <- tsconvolve(p, weights = w, fft_support = NULL, fft_step = 0.0001, fft_by = 0.00001, distribution = FALSE)
q_seq <- seq(0.025, 0.975, by = 0.025)
q_surface = matrix(NA, ncol = length(q_seq), nrow = 98)
for (i in 1:98) {
  q_surface[i,] <- qfft(pconv, index = i)(q_seq)
}
par(mar = c(1.8,1.8,1.1,1), pty = "m")
col_palette <- drapecol(q_surface, col = femmecol(100), NAcol = "white")
persp(x = 1:98, y = q_seq, z = q_surface,  col = col_palette, theta = 45, 
      phi = 15, expand = 0.5, ltheta = 25, shade = 0.25, 
      ticktype = "simple", xlab = "Time", ylab = "Quantile", 
      zlab = "VaR", cex.axis = 0.8, main = "Value at Risk Prediction Surface")
par(oldpar)
```
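As a quick illustration of the mismatch noted above, one can compare, for a single horizon,
the simulation-based quantile of the weighted draws with the FFT-based quantile function.
This sketch assumes the predicted `$mu` array has the horizon x series x draws layout used
later in this demo:

```{r, eval=FALSE}
# horizon 1: weighted portfolio draws from the simulation vs the FFT quantile
sim_port <- as.numeric(w %*% p$mu[1, , ])
quantile(sim_port, 0.1)
qfft(pconv, index = 1)(0.1)
```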

We can also generate the probability integral transform of the weighted distribution
which can be used in the expected shortfall test:

```{r pit}
pit_value <- pit(pconv, actual)
as_flextable(shortfall_de_test(pit_value, alpha = 0.1), include.decision = TRUE)
```


# DCC Dynamics

## Model Specification and Estimation

In the DCC model we need to pre-estimate the univariate dynamics before passing 
them to the DCC specification as a `multi_garch` class object. Unlike in the Copula
model, the marginal distributions of the univariate GARCH models
should always be Normal, irrespective of whether a multivariate Normal or Student
is chosen as the DCC model distribution. No checks are performed for this,
and it is up to the user to ensure that this is the case. Additionally, to allow
calculation of the partitioned Hessian, the argument
`keep_tmb` should be set to `TRUE` in the estimation routine of the univariate models.


```{r garch_dcc_model}
garch_model <- lapply(1:5, function(i) {
  garch_modelspec(train[,i], model = "gjrgarch") |> estimate(keep_tmb = TRUE)
})
garch_model <- to_multi_estimate(garch_model)
names(garch_model) <- colnames(train)
```

Once the univariate models have been estimated and converted to the appropriate
class, we can then pass the object to the DCC model for estimation:

```{r dcc_model, message=FALSE}
dcc_mod <- dcc_modelspec(garch_model, dynamics = "adcc", distribution = "mvt") |> estimate()
dcc_mod |> summary()
```

We chose `adcc` dynamics for this demo, which allows an asymmetric reaction to
positive and negative shocks; this is nicely visualized using a news impact correlation 
surface plot:

```{r dcc_newsimpact, fig.width=6,fig.height=4}
newsimpact(dcc_mod, pair = c(1,2)) |> plot()
```

## Filtering

We perform a similar exercise as in the GOGARCH filtering section:

```{r dcc_var_calc,highlight=TRUE}
h <- 98
w <- rep(1/5, 5)
dcc_filter_mod <- dcc_mod
var_value <- rep(0, 98)
actual <- as.numeric(coredata(test) %*% w)
# first prediction without filtering update
var_value[1] <- predict(dcc_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
for (i in 2:h) {
  dcc_filter_mod <- tsfilter(dcc_filter_mod, y = test[i - 1,])
  var_value[i]  <- predict(dcc_filter_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
}
as_flextable(var_cp_test(actual, var_value, alpha = 0.1), include.decision = TRUE)
```

## Weighted Portfolio Distribution

There are 2 ways to obtain the weighted portfolio distribution for the DCC model:

1. Use the simulated distribution and aggregate (method used in `tsaggregate`)
2. Make use of the analytic form for the weighted multivariate Normal and Student distributions.

We illustrate both approaches in a quick prediction exercise:

```{r dcc_weighted, fig.width=6,fig.height=4}
p <- predict(dcc_mod, h = 98, nsim = 5000)
simulated_aggregate <- tsaggregate(p, weights = w, distribution = TRUE)
# there are no conditional mean dynamics, so the simulated mean reflects only simulation uncertainty around zero
weighted_mu <- t(apply(p$mu, 1, rowMeans)) %*% w
H <- tscov(p, distribution = FALSE)
weighted_sigma <- sqrt(sapply(1:98, function(i) w %*% H[,,i] %*% w))
shape <- unname(coef(dcc_mod)["shape"])
simulated_var <- unname(apply(simulated_aggregate$mu, 2, quantile, 0.05))
analytic_var <- qstd(0.05, mu = weighted_mu, sigma = weighted_sigma, shape = shape)
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8, cex.main = 0.8)
plot(as.Date(p$forc_dates), simulated_var, type = "l", ylab = "", xlab = "", main = "Value at Risk [5%]", ylim = c(-0.039, -0.033))
lines(as.Date(p$forc_dates), analytic_var, col = 2, lty = 2)
legend("topright", c("Simulated","Analytic"), col = 1:2, lty = 1:2, bty = "n")
par(oldpar)
```

Note that the DCC dynamics do not have a closed form solution for the
multi-step ahead forecast. Approximations have been used in the literature, but in
the `tsmarch` package we have instead opted for a simulation approach. This means
that when calling the `tscov` method on a predicted object, it will return either
the full simulated array of covariance matrices or, when the `distribution` argument
is set to `FALSE`, their average across each horizon.
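For example (the dimensions noted in the comments are assumptions and worth verifying
against the documentation):

```{r, eval=FALSE}
H_avg <- tscov(p, distribution = FALSE) # average across draws: assumed N x N x horizon
H_sim <- tscov(p, distribution = TRUE)  # full simulated array: assumed N x N x horizon x draws
dim(H_avg)
dim(H_sim)
```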

# Copula with DCC Dynamics

The Copula model allows different distributions for the margins, providing
an additional layer of flexibility. The next sections use the same type of 
code examples as in the DCC model. Once a model is estimated, the methods
applied to it, and all subsequent methods, are the same as in the DCC and
GOGARCH models.

## Model Specification and Estimation

```{r garch_copula_model}
distributions <- c(rep("jsu",4), rep("sstd",1))
garch_model <- lapply(1:5, function(i) {
  garch_modelspec(train[,i], model = "gjrgarch", distribution = distributions[i]) |> estimate(keep_tmb = TRUE)
})
garch_model <- to_multi_estimate(garch_model)
names(garch_model) <- colnames(train)
```

```{r cgarch_model, message=FALSE}
cgarch_mod <- cgarch_modelspec(garch_model, dynamics = "adcc", 
                               transformation = "parametric", 
                               copula = "mvt") |> estimate()
cgarch_mod |> summary()
```


```{r cgarch_newsimpact, fig.width=6,fig.height=4}
newsimpact(cgarch_mod, pair = c(1,2)) |> plot()
```


## Filtering


```{r cgarch_var_calc,highlight=TRUE}
h <- 98
w <- rep(1/5, 5)
cgarch_filter_mod <- cgarch_mod
var_value <- rep(0, 98)
actual <- as.numeric(coredata(test) %*% w)
# first prediction without filtering update
var_value[1] <- predict(cgarch_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
for (i in 2:h) {
  cgarch_filter_mod <- tsfilter(cgarch_filter_mod, y = test[i - 1,])
  var_value[i]  <- predict(cgarch_filter_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
}
as_flextable(var_cp_test(actual, var_value, alpha = 0.1), include.decision = TRUE)
```

## Weighted Portfolio Distribution

For the Copula model we rely on the simulated distribution for all calculations:

```{r cgarch_weighted, fig.width=6,fig.height=4}
p <- predict(cgarch_mod, h = 98, nsim = 5000)
simulated_aggregate <- tsaggregate(p, weights = w, distribution = TRUE)
simulated_var <- unname(apply(simulated_aggregate$mu, 2, quantile, 0.05))
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8, cex.main = 0.8)
plot(as.Date(p$forc_dates), simulated_var, type = "l", ylab = "", xlab = "", main = "Value at Risk [5%]")
par(oldpar)
```



# Conditional Mean Dynamics

We briefly address in this section the question of how to handle conditional
mean dynamics. There are effectively two approaches available to the user:

1. Pass in the `cond_mean` at every stage of the analysis 
(i.e. estimation, prediction, filtering, simulation, etc.) and the underlying
code will take care of re-centering the simulated distributions.
2. Do not pass anything for the `cond_mean`, but instead take the predicted
simulated zero-mean correlated residuals and use them as inputs to the prediction 
method of the conditional mean dynamics.

Whilst the second method may be preferred, few packages offer either an
option for generating a simulated predictive distribution or the ability to take
a pre-created matrix of correlated residuals as input. In the next sections we
illustrate both approaches.


## Method 1: Conditional Mean Re-centering


We first estimate the conditional mean dynamics using an AR(6) model, extract
the residuals and fitted values and then make a 25 step ahead prediction.

```{r}
arima_model <- lapply(1:5, function(i){
  arima(train[,i], order = c(6,0,0), method = "ML")
})
.residuals <- do.call(cbind, lapply(arima_model, function(x) as.numeric(residuals(x))))
colnames(.residuals) <- colnames(train)
.residuals <- xts(.residuals, index(train))
.fitted <- train - .residuals
.predicted <- do.call(cbind, lapply(1:5, function(i){
  as.numeric(predict(arima_model[[i]], n.ahead = 25)$pred)
}))
colnames(.predicted) <- colnames(train)
```

We then pass the `.fitted` values to the estimation method and the `.predicted`
values to the prediction method. Technically, the estimation method does not
require this if we are only interested in prediction, since the fitted values will not be used.
All three models in the package handle the conditional mean inputs in the same way,
ensuring that any output which depends on them is correctly reflected across the
different methods. For this example we will use the DCC model:

```{r}
dcc_mod_mean <- dcc_modelspec(garch_model, dynamics = "adcc", distribution = "mvt", cond_mean = .fitted) |> estimate()
all.equal(fitted(dcc_mod_mean), .fitted)
```

As expected, the `fitted` method now picks up the `cond_mean` passed to the model.

```{r}
p <- predict(dcc_mod_mean, h = 25, cond_mean = .predicted, nsim = 5000, seed = 100)
simulated_mean <- as.matrix(t(apply(p$mu, 1, rowMeans)))
colnames(simulated_mean) <- colnames(train)
all.equal(simulated_mean, .predicted)
```

The mean of the simulated predictive distribution for each series and horizon is 
now the same as the matrix passed (`.predicted`), as a result of the re-centering
operation which is automatically carried out.
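To make the re-centering explicit, the sketch below reproduces it by hand: the draws are
de-meaned per series and horizon and the supplied predictions are added back, so that the
cross-sectional mean at every horizon equals `cond_mean`. The horizon x series x draws
layout of `p$mu` is assumed, consistent with its use elsewhere in this demo.

```{r, eval=FALSE}
# hedged sketch: manual re-centering of the simulated predictive distribution
recentered <- p$mu
for (i in 1:5) {
  recentered[, i, ] <- p$mu[, i, ] - rowMeans(p$mu[, i, ]) + .predicted[, i]
}
all.equal(t(apply(recentered, 1, rowMeans)), unname(.predicted))
```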

## Method 2: Innovation Distribution Injection

In the injection approach, we pass the simulated correlated innovations from the DCC model
to the ARIMA simulation, making sure to also pass enough start-up innovations to
produce a forward simulation equivalent to a simulated forecast.

```{r, fig.width=6,fig.height=4}
res <- p$mu
arima_pred <- lapply(1:5, function(i){
  # we eliminate the mean prediction from the simulated predictive distribution
  # to obtain the zero mean innovations
  res_i <- scale(t(res[,i,]), scale = FALSE, center = TRUE)
  sim_p <- do.call(rbind, lapply(1:5000, function(j) {
    sim_j <- arima.sim(model = list(ar = coef(arima_model[[i]])[1:6]), n.start = 20, n = 25,
                       innov = res_i[j, ], start.innov = as.numeric(tail(.residuals[, i], 20)))
    as.numeric(sim_j) + coef(arima_model[[i]])[7]
  }))
  return(sim_p)
})
arima_pred <- array(unlist(arima_pred), dim = c(5000, 25, 5))
arima_pred <- aperm(arima_pred, c(2, 3, 1))

simulated_mean <- as.matrix(t(apply(arima_pred, 1, rowMeans)))
colnames(simulated_mean) <- colnames(train)
par(mfrow = c(3,2), mar = c(2,2,2,2))
for (i in 1:5) {
  matplot(cbind(simulated_mean[,i], .predicted[,i]), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), ylab = "", xaxt = "n")
  grid()
}
par(oldpar)
```

As expected, the simulated mean is effectively no different from the prediction mean of the ARIMA model.

## Method Comparison

We visually inspect the two methods by creating a couple of overlaid distribution plots:

```{r, fig.width=6,fig.height=4}
i <- 1
sim_1a <- t(p$mu[,i,])
sim_1b <- t(arima_pred[,i,])
colnames(sim_1a) <- colnames(sim_1b) <- as.character(p$forc_dates)
class(sim_1a) <- class(sim_1b) <- "tsmodel.distribution"
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8)
plot(sim_1a, gradient_color = "whitesmoke", interval_color = "orange", median_color = "orange")
plot(sim_1b, add = TRUE, gradient_color = "whitesmoke", interval_color = "steelblue", median_color = "steelblue", median_type = 2)
par(oldpar)
```

Next we visually inspect the pairwise correlations between the two methods:

```{r, fig.width=6,fig.height=4}
j <- 2
sim_2a <- t(p$mu[,j,])
sim_2b <- t(arima_pred[,j,])
colnames(sim_2a) <- colnames(sim_2b) <- as.character(p$forc_dates)
class(sim_2a) <- class(sim_2b) <- "tsmodel.distribution"
C_a <- sapply(1:25, function(i) cor(sim_1a[,i], sim_2a[,i]))
C_b <- sapply(1:25, function(i) cor(sim_1b[,i], sim_2b[,i]))
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8, cex.main = 0.8)
matplot(cbind(C_a, C_b), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), ylab = "", main = "Pairwise Correlation")
grid()
par(oldpar)
```

As can be observed, both methods produce almost identical output.