\input{share/preamble} %\VignetteIndexEntry{Risk and ruin theory} %\VignettePackage{actuar} %\SweaveUTF8 \title{Risk and ruin theory features of \pkg{actuar}} \author{Christophe Dutang \\ Université Paris Dauphine \\[3ex] Vincent Goulet \\ Université Laval \\[3ex] Mathieu Pigeon \\ Université du Québec à Montréal} \date{} %% Additional math commands \newcommand{\VaR}{\mathrm{VaR}} \newcommand{\CTE}{\mathrm{CTE}} <>= library(actuar) options(width = 52, digits = 4) @ \begin{document} \maketitle \section{Introduction} \label{sec:introduction} Risk theory refers to a body of techniques to model and measure the risk associated with a portfolio of insurance contracts. A first approach consists in modeling the distribution of total claims over a fixed period of time using the classical collective model of risk theory. A second input of interest to the actuary is the evolution of the surplus of the insurance company over many periods of time. In \emph{ruin theory}, the main quantity of interest is the probability that the surplus becomes negative, in which case technical ruin of the insurance company occurs. The interested reader can read more on these subjects in \cite{LossModels4e,Gerber_MRT,DenuitCharpentier1,MART:2e}, among others. The current version of \pkg{actuar} \citep{actuar} contains four visible functions related to the above problems: two for the calculation of the aggregate claim amount distribution and two for ruin probability calculations. \section{The collective risk model} \label{sec:collective-risk-model} Let random variable $S$ represent the aggregate claim amount (or total amount of claims) of a portfolio of independent risks over a fixed period of time, random variable $N$ represent the number of claims (or frequency) in the portfolio over that period, and random variable $C_j$ represent the amount of claim $j$ (or severity). Then, we have the random sum \begin{equation} \label{eq:definition-S} S = C_1 + \dots + C_N, \end{equation} where we assume that $C_1, C_2, \dots$ are mutually independent and identically distributed random variables each independent of $N$. The task at hand consists in evaluating numerically the cdf of $S$, given by \begin{align} \label{eq:cdf-S} F_S(x) &= \Pr[S \leq x] \notag \\ &= \sum_{n = 0}^\infty \Pr[S \leq x|N = n] p_n \notag \\ &= \sum_{n = 0}^\infty F_C^{*n}(x) p_n, \end{align} where $F_C(x) = \Pr[C \leq x]$ is the common cdf of $C_1, \dots, C_n$, $p_n = \Pr[N = n]$ and $F_C^{*n}(x) = \Pr[C_1 + \dots + C_n \leq x]$ is the $n$-fold convolution of $F_C(\cdot)$. If $C$ is discrete on $0, 1, 2, \dots$, one has \begin{equation} \label{eq:convolution-formula} F_C^{*k}(x) = \begin{cases} I\{x \geq 0\}, & k = 0 \\ F_C(x), & k = 1 \\ \sum_{y = 0}^x F_C^{*(k - 1)}(x - y) f_C(y), & k = 2, 3, \dots, \end{cases} \end{equation} where $I\{\mathcal{A}\} = 1$ if $\mathcal{A}$ is true and $I\{\mathcal{A}\} = 0$ otherwise. \section{Discretization of claim amount distributions} \label{sec:discretization} Some numerical techniques to compute the aggregate claim amount distribution (see \autoref{sec:aggregate}) require a discrete arithmetic claim amount distribution; that is, a distribution defined on $0, h, 2h, \dots$ for some step (or span, or lag) $h$. The package provides function \code{discretize} to discretize a continuous distribution. (The function can also be used to modify the support of an already discrete distribution, but this requires additional care.) Let $F(x)$ denote the cdf of the distribution to discretize on some interval $(a, b)$ and $f_x$ denote the probability mass at $x$ in the discretized distribution. Currently, \code{discretize} supports the following four discretization methods. \begin{enumerate} \item Upper discretization, or forward difference of $F(x)$: \begin{equation} \label{eq:discretization:upper} f_x = F(x + h) - F(x) \end{equation} for $x = a, a + h, \dots, b - h$. The discretized cdf is always above the true cdf. \item Lower discretization, or backward difference of $F(x)$: \begin{equation} \label{eq:discretization:lower} f_x = \begin{cases} F(a), & x = a \\ F(x) - F(x - h), & x = a + h, \dots, b. \end{cases} \end{equation} The discretized cdf is always under the true cdf. \item Rounding of the random variable, or the midpoint method: \begin{equation} \label{eq:discretization:midpoint} f_x = \begin{cases} F(a + h/2), & x = a \\ F(x + h/2) - F(x - h/2), & x = a + h, \dots, b - h. \end{cases} \end{equation} The true cdf passes exactly midway through the steps of the discretized cdf. \item Unbiased, or local matching of the first moment method: \begin{equation} \label{eq:discretization:unbiased} f_x = \begin{cases} \dfrac{\E{X \wedge a} - \E{X \wedge a + h}}{h} + 1 - F(a), & x = a \\ \dfrac{2 \E{X \wedge x} - \E{X \wedge x - h} - \E{X \wedge x + h}}{h}, & a < x < b \\ \dfrac{\E{X \wedge b} - \E{X \wedge b - h}}{h} - 1 + F(b), & x = b. \end{cases} \end{equation} The discretized and the true distributions have the same total probability and expected value on $(a, b)$. \end{enumerate} \autoref{fig:discretization-methods} illustrates the four methods. It should be noted that although very close in this example, the rounding and unbiased methods are not identical. \begin{figure}[t] \centering <>= fu <- discretize(plnorm(x), method = "upper", from = 0, to = 5) fl <- discretize(plnorm(x), method = "lower", from = 0, to = 5) fr <- discretize(plnorm(x), method = "rounding", from = 0, to = 5) fb <- discretize(plnorm(x), method = "unbiased", from = 0, to = 5, lev = levlnorm(x)) par(mfrow = c(2, 2), mar = c(5, 2, 4, 2)) curve(plnorm(x), from = 0, to = 5, lwd = 2, main = "Upper", ylab = "F(x)") plot(stepfun(0:4, diffinv(fu)), pch = 20, add = TRUE) curve(plnorm(x), from = 0, to = 5, lwd = 2, main = "Lower", ylab = "F(x)") plot(stepfun(0:5, diffinv(fl)), pch = 20, add = TRUE) curve(plnorm(x), from = 0, to = 5, lwd = 2, main = "Rounding", ylab = "F(x)") plot(stepfun(0:4, diffinv(fr)), pch = 20, add = TRUE) curve(plnorm(x), from = 0, to = 5, lwd = 2, main = "Unbiased", ylab = "F(x)") plot(stepfun(0:5, diffinv(fb)), pch = 20, add = TRUE) ## curve(plnorm(x), from = 0, to = 5, lwd = 2, ylab = "F(x)") ## par(col = "blue") ## plot(stepfun(0:4, diffinv(fu)), pch = 19, add = TRUE) ## par(col = "red") ## plot(stepfun(0:5, diffinv(fl)), pch = 19, add = TRUE) ## par(col = "green") ## plot(stepfun(0:4, diffinv(fr)), pch = 19, add = TRUE) ## par(col = "magenta") ## plot(stepfun(0:5, diffinv(fb)), pch = 19, add = TRUE) ## legend(3, 0.3, legend = c("upper", "lower", "rounding", "unbiased"), ## col = c("blue", "red", "green", "magenta"), lty = 1, pch = 19, ## text.col = "black") @ \caption{Comparison of four discretization methods} \label{fig:discretization-methods} \end{figure} Usage of \code{discretize} is similar to R's plotting function \code{curve}. The cdf to discretize and, for the unbiased method only, the limited expected value function are passed to \code{discretize} as expressions in \code{x}. The other arguments are the upper and lower bounds of the discretization interval, the step $h$ and the discretization method. For example, upper and unbiased discretizations of a Gamma$(2, 1)$ distribution on $(0, 17)$ with a step of $0.5$ are achieved with, respectively, <>= fx <- discretize(pgamma(x, 2, 1), method = "upper", from = 0, to = 17, step = 0.5) fx <- discretize(pgamma(x, 2, 1), method = "unbiased", lev = levgamma(x, 2, 1), from = 0, to = 17, step = 0.5) @ Function \code{discretize} is written in a modular fashion making it simple to add other discretization methods if needed. \section{Calculation of the aggregate claim amount distribution} \label{sec:aggregate} Function \code{aggregateDist} serves as a unique front end for various methods to compute or approximate the cdf of the aggregate claim amount random variable $S$. Currently, five methods are supported. \begin{enumerate} \item Recursive calculation using the algorithm of \cite{Panjer_81}. This requires the severity distribution to be discrete arithmetic on $0, 1, 2, \dots, m$ for some monetary unit and the frequency distribution to be a member of either the $(a, b, 0)$ or $(a, b, 1)$ class of distributions \citep{LossModels4e}. (These classes contain the Poisson, binomial, negative binomial and logarithmic distributions and their zero-truncated and zero-modified extensions allowing for a zero or arbitrary mass at $x = 0$.) The general recursive formula is: \begin{displaymath} f_S(x) = \frac{(p_1 - (a + b)p_0)f_C(x) + \sum_{y=1}^{\min(x, m)}(a + by/x)f_C(y)f_S(x - y)}{1 - a f_C(0)}, \end{displaymath} with starting value $f_S(0) = P_N(f_C(0))$, where $P_N(\cdot)$ is the probability generating function of $N$. Probabilities are computed until their sum is arbitrarily close to 1. The recursions are done in C to dramatically increase speed. One difficulty the programmer is facing is the unknown length of the output. This was solved using a common, simple and fast technique: first allocate an arbitrary amount of memory and double this amount each time the allocated space gets full. \item Exact calculation by numerical convolutions using \eqref{eq:cdf-S} and \eqref{eq:convolution-formula}. This also requires a discrete severity distribution. However, there is no restriction on the shape of the frequency distribution. The package merely implements the sum \eqref{eq:cdf-S}, the convolutions being computed with R's function \code{convolve}, which in turn uses the Fast Fourier Transform. This approach is practical for small problems only, even on today's fast computers. \item Normal approximation of the cdf, that is \begin{equation} \label{eq:normal-approximation} F_S(x) \approx \Phi \left( \frac{x - \mu_S}{\sigma_S} \right), \end{equation} where $\mu_S = \E{S}$ and $\sigma_S^2 = \VAR{S}$. For most realistic models, this approximation is rather crude in the tails of the distribution. \item Normal Power II approximation of the cdf, that is \begin{equation} \label{eq:np2-approximation} F_S(x) \approx \Phi \left( -\frac{3}{\gamma_S} + \sqrt{\frac{9}{\gamma_S^2} + 1 + \frac{6}{\gamma_S} \frac{x - \mu_S}{\sigma_S}} \right), \end{equation} where $\gamma_S = \E{(S - \mu_S)^3}/\sigma_S^{3/2}$. The approximation is valid for $x > \mu_S$ only and performs reasonably well when $\gamma_S < 1$. See \cite{Daykin_et_al} for details. \item Simulation of a random sample from $S$ and approximation of $F_S(x)$ by the empirical cdf \begin{equation} F_n(x) = \frac{1}{n} \sum_{j = 1}^n I\{x_j \leq x\}. \end{equation} The simulation itself is done with function \code{simul} (see the \code{"simulation"} vignette). This function admits very general hierarchical models for both the frequency and the severity components. \end{enumerate} Here also, adding other methods to \code{aggregateDist} is simple due to its modular conception. The arguments of \code{aggregateDist} differ according to the chosen calculation method; see the help page for details. One interesting argument to note is \code{x.scale} to specify the monetary unit of the severity distribution. This way, one does not have to mentally do the conversion between the support of $0, 1, 2, \dots$ assumed by the recursive and convolution methods, and the true support of $S$. The recursive method fails when the expected number of claims is so large that $f_S(0)$ is numerically equal to zero. One solution proposed by \citet{LossModels4e} consists in dividing the appropriate parameter of the frequency distribution by $2^n$, with $n$ such that $f_S(0) > 0$ and the recursions can start. One then computes the aggregate claim amount distribution using the recursive method and then convolves the resulting distribution $n$ times with itself to obtain the final distribution. Function \code{aggregateDist} supports this procedure through its argument \code{convolve}. A common problem with the recursive method is failure to obtain a cumulative distribution function that reaching (close to) $1$. This is usually due to too coarse a discretization of the severity distribution. One should make sure to use a small enough discretization step and to discretize the severity distribution far in the right tail. The function \code{aggregateDist} returns an object of class \code{"aggregateDist"} inheriting from the \code{"function"} class. Thus, one can use the object as a function to compute the value of $F_S(x)$ in any $x$. For illustration purposes, consider the following model: the distribution of $S$ is a compound Poisson with parameter $\lambda = 10$ and severity distribution Gamma$(2, 1)$. To obtain an approximation of the cdf of $S$ we first discretize the gamma distribution on $(0, 22)$ with the unbiased method and a step of $0.5$, and then use the recursive method in \code{aggregateDist}: <>= fx <- discretize(pgamma(x, 2, 1), method = "unbiased", from = 0, to = 22, step = 0.5, lev = levgamma(x, 2, 1)) Fs <- aggregateDist("recursive", model.freq = "poisson", model.sev = fx, lambda = 10, x.scale = 0.5) summary(Fs) @ Although useless here, the following is essentially equivalent, except in the far right tail for numerical reasons: <>= Fsc <- aggregateDist("recursive", model.freq = "poisson", model.sev = fx, lambda = 5, convolve = 1, x.scale = 0.5) summary(Fsc) @ We return to object \code{Fs}. It contains an empirical cdf with support <>= knots(Fs) @ A nice graph of this function is obtained with a method of \code{plot} (see \autoref{fig:Fs}): <>= plot(Fs, do.points = FALSE, verticals = TRUE, xlim = c(0, 60)) @ \begin{figure}[t] \centering <>= plot(Fs, do.points = FALSE, verticals = TRUE, xlim = c(0, 60)) @ \caption{Graphic of the empirical cdf of $S$ obtained with the recursive method} \label{fig:Fs} \end{figure} The package defines a few summary methods to extract information from \code{"aggregateDist"} objects. First, there are methods of \code{mean} and \code{quantile} to easily compute the mean and obtain the quantiles of the approximate distribution: <>= mean(Fs) quantile(Fs) quantile(Fs, 0.999) @ Second, a method of \texttt{diff} gives easy access to the underlying probability mass function: <>= diff(Fs) @ Of course, this is defined (and makes sense) for the recursive, direct convolution and simulation methods only. Third, the package introduces the generic functions \code{VaR} and \code{CTE} (with alias \code{TVaR}) with methods for objects of class \code{"aggregateDist"}. The former computes the value-at-risk $\VaR_\alpha$ such that \begin{equation} \label{eq:VaR} \Pr[S \leq \VaR_\alpha] = \alpha, \end{equation} where $\alpha$ is the confidence level. Thus, the value-at-risk is nothing else than a quantile. As for the method of \code{CTE}, it computes the conditional tail expectation (also called Tail Value-at-Risk) \begin{equation} \label{eq:CTE} \CTE_\alpha = \E{S|S > \VaR_\alpha}. \end{equation} Here are examples using object \code{Fs} obtained above: <>= VaR(Fs) CTE(Fs) @ To conclude on the subject, \autoref{fig:Fs-comparison} shows the cdf of $S$ using five of the many combinations of discretization and calculation method supported by \pkg{actuar}. \begin{figure}[t] \centering <>= fx.u <- discretize(pgamma(x, 2, 1), from = 0, to = 22, step = 0.5, method = "upper") Fs.u <- aggregateDist("recursive", model.freq = "poisson", model.sev = fx.u, lambda = 10, x.scale = 0.5) fx.l <- discretize(pgamma(x, 2, 1), from = 0, to = 22, step = 0.5, method = "lower") Fs.l <- aggregateDist("recursive", model.freq = "poisson", model.sev = fx.l, lambda = 10, x.scale = 0.5) Fs.n <- aggregateDist("normal", moments = c(20, 60)) Fs.s <- aggregateDist("simulation", model.freq = expression(y = rpois(10)), model.sev = expression(y = rgamma(2, 1)), nb.simul = 10000) par(col = "black") plot(Fs, do.points = FALSE, verticals = TRUE, xlim = c(0, 60), sub = "") par(col = "blue") plot(Fs.u, do.points = FALSE, verticals = TRUE, add = TRUE, sub = "") par(col = "red") plot(Fs.l, do.points = FALSE, verticals = TRUE, add = TRUE, sub = "") par(col = "green") plot(Fs.s, do.points = FALSE, verticals = TRUE, add = TRUE, sub = "") par(col = "magenta") plot(Fs.n, add = TRUE, sub = "") legend(30, 0.4, c("recursive + unbiased", "recursive + upper", "recursive + lower", "simulation", "normal approximation"), col = c("black", "blue", "red", "green", "magenta"), lty = 1, text.col = "black") @ \caption{Comparison between the empirical or approximate cdf of $S$ obtained with five different methods} \label{fig:Fs-comparison} \end{figure} \section{The continuous time ruin model} \label{sec:ruin-model} We now turn to the multi-period ruin problem. Let $U(t)$ denote the surplus of an insurance company at time $t$, $c(t)$ denote premiums collected through time $t$, and $S(t)$ denote aggregate claims paid through time $t$. If $u$ is the initial surplus at time $t = 0$, then a mathematically convenient definition of $U(t)$ is \begin{equation} \label{eq:definition-surplus} U(t) = u + c(t) - S(t). \end{equation} As mentioned previously, technical ruin of the insurance company occurs when the surplus becomes negative. Therefore, the definition of the infinite time probability of ruin is \begin{equation} \label{eq:definition-ruin} \psi(u) = \Pr[U(t) < 0 \text{ for some } t \geq 0]. \end{equation} We define some other quantities needed in the sequel. Let $N(t)$ denote the number of claims up to time $t \geq 0$ and $C_j$ denote the amount of claim $j$. Then the definition of $S(t)$ is analogous to \eqref{eq:definition-S}: \begin{equation} \label{eq:definition-S(t)} S(t) = C_1 + \dots + C_{N(t)}, \end{equation} assuming $N(0) = 0$ and $S(t) = 0$ as long as $N(t) = 0$. Furthermore, let $T_j$ denote the time when claim $j$ occurs, such that $T_1 < T_2 < T_3 < \dots$ Then the random variable of the interarrival (or wait) time between claim $j - 1$ and claim $j$ is defined as $W_1 = T_1$ and \begin{equation} \label{eq:definition-wait} W_j = T_j - T_{j - 1}, \quad j \geq 2. \end{equation} For the rest of this discussion, we make the following assumptions: \begin{enumerate} \item premiums are collected at a constant rate $c$, hence $c(t) = ct$; \item the sequence $\{T_j\}_{j \geq 1}$ forms an ordinary renewal process, with the consequence that random variables $W_1, W_2, \dots$ are independent and identically distributed; \item claim amounts $C_1, C_2, \dots$ are independent and identically distributed. \end{enumerate} \section{Adjustment coefficient} \label{sec:adjustment-coefficient} The quantity known as the adjustment coefficient $\rho$ hardly has any physical interpretation, but it is useful as an approximation to the probability of ruin since we have the inequality \begin{displaymath} \psi(u) \leq e^{-\rho u}, \quad u \geq 0. \end{displaymath} The adjustment coefficient is defined as the smallest strictly positive solution (if it exists) of the Lundberg equation \begin{equation} \label{eq:definition-adjcoef} h(t) = \E{e^{t C - t c W}} = 1, \end{equation} where the premium rate $c$ satisfies the positive safety loading constraint $\E{C - cW} < 0$. If $C$ and $W$ are independent, as in the most common models, then the equation can be rewritten as \begin{equation} \label{eq:definition-adjcoef-ind} h(t) = M_C(t) M_W(-tc) = 1. \end{equation} Function \code{adjCoef} of \pkg{actuar} computes the adjustment coefficient $\rho$ from the following arguments: either the two moment generating functions $M_C(t)$ and $M_W(t)$ (thereby assuming independence) or else function $h(t)$; the premium rate $c$; the upper bound of the support of $M_C(t)$ or any other upper bound for $\rho$. For example, if $W$ and $C$ are independent and each follow an exponential distribution, $W$ with parameter $2$ and $C$ with parameter $1$, and the premium rate is $c = 2.4$ (for a safety loading of 20\% using the expected value premium principle), then the adjustment coefficient is <>= adjCoef(mgf.claim = mgfexp(x), mgf.wait = mgfexp(x, 2), premium.rate = 2.4, upper = 1) @ The function also supports models with proportional or excess-of-loss reinsurance \citep{Centeno_02}. Under the first type of treaty, an insurer pays a proportion $\alpha$ of every loss and the rest is paid by the reinsurer. Then, for fixed $\alpha$ the adjustment coefficient is the solution of \begin{equation} \label{eq:definition-adjcoef-prop} h(t) = \E{e^{t \alpha C - t c(\alpha) W}} = 1. \end{equation} Under an excess-of-loss treaty, the primary insurer pays each claim up to a limit $L$. Again, for fixed $L$, the adjustment coefficient is the solution of \begin{equation} \label{eq:definition-adjcoef-xl} h(t) = \E{e^{t \min(C, L) - t c(L) W}} = 1. \end{equation} For models with reinsurance, \code{adjCoef} returns an object of class \code{"adjCoef"} inheriting from the \code{"function"} class. One can then use the object to compute the adjustment coefficient for any retention rate $\alpha$ or retention limit $L$. The package also defines a method of \code{plot} for these objects. For example, using the same assumptions as above with proportional reinsurance and a 30\% safety loading for the reinsurer, the adjustment coefficient as a function of $\alpha \in [0, 1]$ is (see \autoref{fig:adjcoef} for the graph): <>= mgfx <- function(x, y) mgfexp(x * y) p <- function(x) 2.6 * x - 0.2 rho <- adjCoef(mgfx, mgfexp(x, 2), premium = p, upper = 1, reins = "prop", from = 0, to = 1) rho(c(0.75, 0.8, 0.9, 1)) plot(rho) @ \begin{figure}[t] \centering <>= plot(rho) @ \caption{Adjustment coefficient as a function of the retention rate} \label{fig:adjcoef} \end{figure} \section{Probability of ruin} \label{sec:ruin} In this subsection, we always assume that interarrival times and claim amounts are independent. The main difficulty with the calculation of the infinite time probability of ruin lies in the lack of explicit formulas except for the most simple models. If interarrival times are Exponential$(\lambda)$ distributed (Poisson claim number process) and claim amounts are Exponential$(\beta)$ distributed, then \begin{equation} \label{eq:ruin-cramer-lundberg} \psi(u) = \frac{\lambda}{c \beta}\, e^{-(\beta - \lambda/c) u}. \end{equation} If the frequency assumption of this model is defensible, the severity assumption can hardly be used beyond illustration purposes. Fortunately, phase-type distributions have come to the rescue since the early 1990s. \cite{AsmussenRolski_91} first show that in the classical Cramér--Lundberg model where interarrival times are Exponential$(\lambda)$ distributed, if claim amounts are Phase-type$(\mat{\pi}, \mat{T})$ distributed, then $\psi(u) = 1 - F(u)$, where $F$ is Phase-type$(\mat{\pi}_+, \mat{Q})$ with \begin{equation} \label{eq:prob-ruin:cramer-lundberg} \begin{split} \mat{\pi}_+ &= - \frac{\lambda}{c}\, \mat{\pi} \mat{T}^{-1} \\ \mat{Q} &= \mat{T} + \mat{t} \mat{\pi}_+, \end{split} \end{equation} and $\mat{t} = -\mat{T} \mat{e}$, $\mat{e}$ is a column vector with all components equal to 1; see the \code{"lossdist"} vignette for details. In the more general Sparre~Andersen model where interarrival times can have any Phase-type$(\mat{\nu}, \mat{S})$ distribution, \cite{AsmussenRolski_91} also show that using the same claim severity assumption as above, one still has $\psi(u) = 1 - F(u)$ where $F$ is Phase-type$(\mat{\pi}_+, \mat{Q})$, but with parameters \begin{equation} \label{eq:prob-ruin:sparre:pi+} \mat{\pi}_+ = \frac{\mat{e}^\prime (\mat{Q} - \mat{T})}{% c \mat{e}^\prime \mat{t}} \end{equation} and $\mat{Q}$ solution of \begin{equation} \label{eq:eq:prob-ruin:sparre:Q} \begin{split} \mat{Q} &= \Psi(\mat{Q}) \\ &= \mat{T} - \mat{t} \mat{\pi} \left[ (\mat{I}_n \otimes \mat{\nu}) (\mat{Q} \oplus \mat{S})^{-1} (\mat{I}_n \otimes \mat{s}) \right]. \end{split} \end{equation} In the above, $\mat{s} = -\mat{S} \mat{e}$, $\mat{I}_n$ is the $n \times n$ identity matrix, $\otimes$ denotes the usual Kronecker product between two matrices and $\oplus$ is the Kronecker sum defined as \begin{equation} \label{eq:kronecker-sum} \mat{A}_{m \times m} \oplus \mat{B}_{n \times n} = \mat{A} \otimes \mat{I}_n + \mat{B} \otimes \mat{I}_m. \end{equation} Function \code{ruin} of \pkg{actuar} returns a function object of class \code{"ruin"} to compute the probability of ruin for any initial surplus $u$. In all cases except the exponential/exponential model where \eqref{eq:ruin-cramer-lundberg} is used, the output object calls function \code{pphtype} to compute the ruin probabilities. Some thought went into the interface of \code{ruin}. Obviously, all models can be specified using phase-type distributions, but the authors wanted users to have easy access to the most common models involving exponential and Erlang distributions. Hence, one first states the claim amount and interarrival times models with any combination of \code{"exponential"}, \code{"Erlang"} and \code{"phase-type"}. Then, one passes the parameters of each model using lists with components named after the corresponding parameters of \code{dexp}, \code{dgamma} and \code{dphtype}. If a component \code{"weights"} is found in a list, the model is a mixture of exponential or Erlang (mixtures of phase-type are not supported). Every component of the parameter lists is recycled as needed. The following examples should make the matter clearer. (All examples use $c = 1$, the default value in \code{ruin}.) First, for the exponential/exponential model, one has <>= psi <- ruin(claims = "e", par.claims = list(rate = 5), wait = "e", par.wait = list(rate = 3)) psi psi(0:10) @ Second, for a mixture of two exponentials claim amount model and exponential interarrival times, the simplest call to \code{ruin} is <>= op <- options(width=50) @ <>= ruin(claims = "e", par.claims = list(rate = c(3, 7), weights = 0.5), wait = "e", par.wait = list(rate = 3)) @ Finally, one will obtain a function to compute ruin probabilities in a model with phase-type claim amounts and mixture of exponentials interarrival times with <>= prob <- c(0.5614, 0.4386) rates <- matrix(c(-8.64, 0.101, 1.997, -1.095), 2, 2) ruin(claims = "p", par.claims = list(prob = prob, rates = rates), wait = "e", par.wait = list(rate = c(5, 1), weights = c(0.4, 0.6))) @ To ease plotting of the probability of ruin function, the package provides a method of \code{plot} for objects returned by \code{ruin} that is a simple wrapper for \code{curve} (see \autoref{fig:prob-ruin}): <>= psi <- ruin(claims = "p", par.claims = list(prob = prob, rates = rates), wait = "e", par.wait = list(rate = c(5, 1), weights = c(0.4, 0.6))) plot(psi, from = 0, to = 50) @ <>= options(op) @ \begin{figure}[t] \centering <>= plot(psi, from = 0, to = 50) @ \caption{Graphic of the probability of ruin as a function of the initial surplus $u$} \label{fig:prob-ruin} \end{figure} \section{Approximation to the probability of ruin} \label{sec:beekman} When the model for the aggregate claim process \eqref{eq:definition-S(t)} does not fit nicely into the framework of the previous section, one can compute ruin probabilities using the so-called Beekman's convolution formula \citep{Beekman_68,BeekmanFormula_EAS}. Let the surplus process and the aggregate claim amount process be defined as in \eqref{eq:definition-surplus} and \eqref{eq:definition-S(t)}, respectively, and let $\{N(t)\}$ be a Poisson process with mean $\lambda$. As before, claim amounts $C_1, C_2, \dots$ are independent and identically distributed with cdf $P(\cdot)$ and mean $\mu = \E{C_1}$. Then the infinite time probability of ruin is given by \begin{equation} \label{eq:beekman:prob-ruin} \psi(u) = 1 - F(u), \end{equation} where $F(\cdot)$ is Compound~Geometric$(p, H)$ with \begin{equation} \label{eq:beekman:p} p = 1 - \frac{\lambda \mu}{c} \end{equation} and \begin{equation} \label{eq:beekman:H} H(x) = \int_0^x \frac{1 - P(y)}{\mu}\, dy. \end{equation} In other words, we have (compare with \eqref{eq:cdf-S}): \begin{equation} \label{eq:beekman:prob-ruin-long} \psi(u) = 1 - \sum_{n = 0}^\infty H^{*n}(u) p (1 - p)^n. \end{equation} In most practical situations, numerical evaluation of \eqref{eq:beekman:prob-ruin-long} is done using Panjer's recursive formula. This usually requires discretization of $H(\cdot)$. In such circumstances, Beekman's formula yields approximate ruin probabilities. For example, let claim amounts have a Pareto$(5, 4)$ distribution, that is \begin{displaymath} P(x) = 1 - \left( \frac{4}{4 + x} \right)^5 \end{displaymath} and $\mu = 1$. Then \begin{align*} H(x) &= \int_0^x \left( \frac{4}{4 + y} \right)^5 dy \\ &= 1 - \left( \frac{4}{4 + x} \right)^4, \end{align*} or else $H$ is Pareto$(4, 4)$. Furthermore, we determine the premium rate $c$ with the expected value premium principle and a safety loading of 20\%, that is $c = 1.2 \lambda \mu$. Thus, $p = 0.2/1.2 = 1/6$. One can get functions to compute lower bounds and upper bounds for $F(u)$ with functions \code{discretize} and \code{aggregateDist} as follows: <>= f.L <- discretize(ppareto(x, 4, 4), from = 0, to = 200, step = 1, method = "lower") f.U <- discretize(ppareto(x, 4, 4), from = 0, to = 200, step = 1, method = "upper") F.L <- aggregateDist(method = "recursive", model.freq = "geometric", model.sev = f.L, prob = 1/6) F.U <- aggregateDist(method = "recursive", model.freq = "geometric", model.sev = f.U, prob = 1/6) @ Corresponding functions for the probability of ruin $\psi(u)$ lower and upper bounds are (see \autoref{fig:beekman:prob-ruin} for the graphic): <>= psi.L <- function(u) 1 - F.U(u) psi.U <- function(u) 1 - F.L(u) u <- seq(0, 50, by = 5) cbind(lower = psi.L(u), upper = psi.U(u)) curve(psi.L, from = 0, to = 100, col = "blue") curve(psi.U, add = TRUE, col = "green") @ \begin{figure}[t] \centering <>= curve(psi.L, from = 0, to = 100, col = "blue") curve(psi.U, add = TRUE, col = "green") @ \caption{Lower and upper bounds for the probability of ruin as determined using Beekman's convolution formula.} \label{fig:beekman:prob-ruin} \end{figure} One can make the bounds as close as one wishes by reducing the discretization step. \bibliography{actuar} \end{document} %%% Local Variables: %%% mode: noweb %%% TeX-master: t %%% coding: utf-8 %%% End: