fitsde()
functionThe Sim.DiffProc
package implements pseudo-maximum likelihood via the
fitsde()
function. The interface and the output of the
fitsde()
function are made as similar as possible to those
of the standard mle
function in the stats4
package of the basic R system. The main arguments to fitsde
consist:
data
a univariate time series (ts
object).drift
and diffusion
indicate drift and
diffusion coefficient of the SDE, is an expression
of two
variables t
, x
and theta
names of
the parameters, and must be nominated by a vector of
theta = (theta[1], theta[2],..., theta[p])
for reasons of
symbolic derived in approximation methods.start
must be specified as a named list, where the
names of the elements of the list correspond to the names of the
parameters as they appear in the drift
and
diffusion
coefficient.pmle
argument must be a character
string specifying the method to use, can be either: "euler"
Euler pseudo-likelihood, "ozaki"
Ozaki pseudo-likelihood,
"shoji"
Shoji pseudo-likelihood and "kessler"
Kessler pseudo-likelihood.optim.method
select the optimization method
("L-BFGS-B"
is used by default), and further arguments to
pass to optim
function.lower
and upper
bounds on the variables
for the Brent
or L-BFGS-B
method.The functions of type S3 method
(as similar of the
standard mle
function in the stats4
package of
the basic R system for the class fitsde
are the
following:
coef
: which extracts model coefficients from objects
returned by fitsde
.vcov
: returns the variance-covariance matrix of the
parameters of a fitted model objects.logLik
: extract log-likelihood.AIC
: calculating Akaike’s Information Criterion for
fitted model objects.BIC
: calculating Schwarz’s Bayesian Criterion for
fitted model objects.confint
: computes confidence intervals for one or more
parameters in a fitted model objects.The following we explain how to use this function to estimate a SDE with different approximation methods.
Consider a process solution of the general stochastic differential equation: dXt=f(Xt,θ_)dt+g(Xt,θ_)dWt,t≥0,X0=x0, The Euler scheme produces the discretization (Δt→0): Xt+Δt−Xt=f(Xt,θ)Δt+g(Xt,θ)(Wt+Δt−Wt), The increments Xt+Δt−Xt are then independent Gaussian random variables with mean: Ex=f(Xt,θ)Δt, and variance: Vx=g2(Xt,θ)Δt. Therefore the transition density of the process can be written as: pθ(t,y|x)=1√2πtg2(x,θ)exp(−(y−x−f(x,θ)t)22tg2(x,θ)) Then the log-likelihood is: hn(θ|X1,X2,…,Xn)=−12(n∑i=1(Xi−Xi−1−f(Xi−1,θ)Δ)2σ2Δt+nlog(2πσ2Δt))
As an example, we consider the Chan-Karolyi-Longstaff-Sanders (CKLS)
model: dXt=(θ1+θ2Xt)dt+θ3Xθ4tdWt,X0=2 with θ1=1, θ2=2, θ3=0.5 and θ4=0.3. Before calling
fitsde
, we generate sampled data Xti, with Δt=10−4, as following:
R> set.seed(12345, kind = "L'Ecuyer-CMRG")
R> f <- expression( (1+2*x) ) ; g <- expression( 0.5*x^0.3 )
R> sim <- snssde1d(drift=f,diffusion=g,x0=2,N=10^4,Dt=10^-4)
R> mydata <- sim$X
we set the initial values for the optimizer as θ1=θ2=θ3=θ4=1,
and we specify the coefficients drift and diffusion as expressions. you
can use the upper
and lower
limits of the
search region used by the optimizer (here using the default method
"L-BFGS-B"
), and specifying the method to use with
pmle="euler"
.
R> fx <- expression( theta[1]+theta[2]*x ) ## drift coefficient of model
R> gx <- expression( theta[3]*x^theta[4] ) ## diffusion coefficient of model
R> fitmod <- fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1=1, theta2=1,
+ theta3=1,theta4=1),pmle="euler")
R> fitmod
Call:
function (name, ...) .Primitive("call")
Coefficients:
theta1 theta2 theta3 theta4
1.10865 1.86188 0.46448 0.32910
The estimated coefficients are extracted from the output object
fitmod
as follows:
theta1 theta2 theta3 theta4
1.10865 1.86188 0.46448 0.32910
We can use the summary
function to produce result
summaries of output object:
Pseudo maximum likelihood estimation
Method: Euler
Call:
function (name, ...) .Primitive("call")
Coefficients:
Estimate Std. Error
theta1 1.10865 1.461152
theta2 1.86188 0.191992
theta3 0.46448 0.010136
theta4 0.32910 0.010845
-2 log L: -66532
vcov
for variance-covariance matrice, and extract
log-likelihood by logLik
:
theta1 theta2 theta3 theta4
theta1 2.134965381 -0.2309162505 -0.0001393819 0.000021102
theta2 -0.230916251 0.0368608762 -0.0000098306 0.000025981
theta3 -0.000139382 -0.0000098306 0.0001027407 -0.000103992
theta4 0.000021102 0.0000259811 -0.0001039921 0.000117615
[1] 33266
[1] -66524
[1] -66514
Computes confidence intervals for one or more parameters in a fitted SDE:
2.5 % 97.5 %
theta1 -1.75515 3.97246
theta2 1.48558 2.23818
theta3 0.44461 0.48435
theta4 0.30785 0.35036
The second approach we present is the Ozaki method, and it works for homogeneous stochastic differential equations. Consider the stochastic differential equation: dXt=f(Xt,θ_)dt+σdWt,t≥0,X0=x0, where σ is supposed to be constant. We just recall that the transition density for the Ozaki method is Gaussian, we have that: Xt+Δt|Xt=x∼N(Ex,Vx), where: Ex=x+f(x)∂xf(x)(e∂xf(x)Δt−1),andVx=σ2e2KxΔt−12Kx, with: Kx=1Δtlog(1+f(x)x∂xf(x)(e∂xf(x)Δt−1)) It is always possible to transform process Xt with a constant diffusion coefficient using the Lamperti transform.
Now we consider the Vasicek model, with f(x,θ)=θ1(θ2−x) and constant volatility g(x,θ)=θ3, dXt=θ1(θ2−Xt)dt+θ3dWt,X0=5 with θ1=3, θ2=2 and θ3=0.5, we generate sampled data Xti, with Δt=10−2, as following:
R> set.seed(1234, kind = "L'Ecuyer-CMRG")
R> f <- expression( 3*(2-x) ) ; g <- expression( 0.5 )
R> sim <- snssde1d(drift=f,diffusion=g,x0=5,Dt=0.01)
R> HWV <- sim$X
we set the initial values for the optimizer as θ1=θ2=θ3=1, and
we specify the coefficients drift and diffusion as expressions.
Specifying the method to use with pmle="ozaki"
, which can
easily be implemented in R as follows:
R> fx <- expression( theta[1]*(theta[2]- x) ) ## drift coefficient of model
R> gx <- expression( theta[3] ) ## diffusion coefficient of model
R> fitmod <- fitsde(data=HWV,drift=fx,diffusion=gx,start = list(theta1=1,theta2=1,
+ theta3=1),pmle="ozaki")
R> summary(fitmod)
Pseudo maximum likelihood estimation
Method: Ozaki
Call:
function (name, ...) .Primitive("call")
Coefficients:
Estimate Std. Error
theta1 3.37769 0.398502
theta2 1.99511 0.049140
theta3 0.50465 0.011286
-2 log L: -3135.6
If you want to have confidence intervals θ1 and θ2 parameters only, using the
command confint
as following:
2.5 % 97.5 %
theta1 2.5966 4.1587
theta2 1.8988 2.0914
An extension of the method to Ozaki the more general case where the drift is allowed to depend on the time variable t, and also the diffusion coefficient can be varied is the method Shoji and Ozaki. Consider the stochastic differential equation: dXt=f(t,Xt,θ_)dt+g(Xt,θ_)dWt,t≥0,X0=x0, the transition density for the Shoji-Ozaki method is Gaussian, we have that: Xt+Δt|Xt=x∼N(A(t,x)x,B2(t,x)), where: A(t,x)=1+f(t,x)xLt(eLtΔt−1)+MtxL2t(eLtΔt−1−LtΔt),B(t,x)=g(x)√e2LtΔt−12Lt, with: Lt=∂xf(t,x)andMt=g2(x)2∂xxf(t,x)+∂tf(t,x).
As an example, we consider the following model: dXt=a(t)Xtdt+θ2XtdWt,X0=10 with: a(t)=θ1t, and we generate sampled data Xti, with θ1=−2, θ2=0.2 and time step Δt=10−3, as following:
R> set.seed(1234, kind = "L'Ecuyer-CMRG")
R> f <- expression(-2*x*t) ; g <- expression(0.2*x)
R> sim <- snssde1d(drift=f,diffusion=g,N=1000,Dt=0.001,x0=10)
R> mydata <- sim$X
we set the initial values for the optimizer as θ1=θ2=1, and we specify
the method to use with pmle="shoji"
:
R> fx <- expression( theta[1]*x*t ) ## drift coefficient of model
R> gx <- expression( theta[2]*x ) ## diffusion coefficient of model
R> fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
+ theta2=1),pmle="shoji",lower=c(-3,0),upper=c(-1,1))
R> summary(fitmod)
Pseudo maximum likelihood estimation
Method: Shoji
Call:
function (name, ...) .Primitive("call")
Coefficients:
Estimate Std. Error
theta1 -2.01886 0.3502511
theta2 0.20209 0.0045209
-2 log L: -3444.3
Kessler (1997) proposed to use a higher-order Ito-Taylor expansion to approximate the mean and variance in a conditional Gaussian density. Consider the stochastic differential equation dXt=f(Xt,θ_)dt+g(Xt,θ_)dWt , the transition density by Kessler method is: Xt+Δt|Xt=x∼N(Ex,Vx), where: Ex=x+f(t,x)Δt+(f(t,x)∂xf(t,x)+12g2(t,x)∂xxg(t,x))(Δt)22,Vx=x2+(2f(t,x)x+g2(t,x))Δt+(2f(t,x)(∂xf(t,x)x+f(t,x)+g(t,x)∂xg(t,x))+g2(t,x)(∂xxf(t,x)x+2∂xf(t,x)+∂xg2(t,x)+g(t,x)∂xxg(t,x)))(Δt)22−E2x.
We consider the following Hull-White (extended Vasicek) model: dXt=a(t)(b(t)−Xt)dt+σ(t)dWt,X0=2 with: a(t)=θ1t and b(t)=θ2√t, the volatility depends on time: σ(t)=θ3t. We generate sampled data of Xt, with θ1=3, θ2=1 and θ3=0.3, time step Δt=10−3, as following:
R> set.seed(1234, kind = "L'Ecuyer-CMRG")
R> f <- expression(3*t*(sqrt(t)-x)) ; g <- expression(0.3*t)
R> sim <- snssde1d(drift=f,diffusion=g,M=1,N=1000,x0=2,Dt=0.001)
R> mydata <- sim$X
we set the initial values for the optimizer as θ1=θ2=θ3=1, and
we specify the method to use with pmle="kessler"
:
R> fx <- expression( theta[1]*t* ( theta[2]*sqrt(t) - x ) ) ## drift coefficient of model
R> gx <- expression( theta[3]*t ) ## diffusion coefficient of model
R> fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
+ theta2=1,theta3=1),pmle="kessler")
R> summary(fitmod)
Pseudo maximum likelihood estimation
Method: Kessler
Call:
function (name, ...) .Primitive("call")
Coefficients:
Estimate Std. Error
theta1 3.17595 0.3881997
theta2 1.02941 0.1694476
theta3 0.30318 0.0067834
-2 log L: -8443.1
fitsde()
in practiceLet the following models: dXt=θ1Xtdt+θ2Xθ3tdWt,(true model)dXt=(θ1+θ2Xt)dt+θ3Xθ4tdWt,(competing model 1)dXt=(θ1+θ2Xt)dt+θ3√XtdWt,(competing model 2)dXt=θ1dt+θ2Xθ3tdWt,(competing model 3) We generate data from true model with parameters θ_=(2,0.3,0.5), initial value X0=2 and Δt=10−4, as following:
R> set.seed(1234, kind = "L'Ecuyer-CMRG")
R> f <- expression( 2*x )
R> g <- expression( 0.3*x^0.5 )
R> sim <- snssde1d(drift=f,diffusion=g,M=1,N=10000,x0=2,Dt=0.0001)
R> mydata <- sim$X
We test the performance of the AIC statistics for the four competing models, we can proceed as follows:
R> ## True model
R> fx <- expression( theta[1]*x )
R> gx <- expression( theta[2]*x^theta[3] )
R> truemod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
+ theta2=1,theta3=1),pmle="euler")
R> ## competing model 1
R> fx1 <- expression( theta[1]+theta[2]*x )
R> gx1 <- expression( theta[3]*x^theta[4] )
R> mod1 <- fitsde(data=mydata,drift=fx1,diffusion=gx1,start = list(theta1=1,
+ theta2=1,theta3=1,theta4=1),pmle="euler")
R> ## competing model 2
R> fx2 <- expression( theta[1]+theta[2]*x )
R> gx2 <- expression( theta[3]*sqrt(x) )
R> mod2 <- fitsde(data=mydata,drift=fx2,diffusion=gx2,start = list(theta1=1,
+ theta2=1,theta3=1),pmle="euler")
R> ## competing model 3
R> fx3 <- expression( theta[1] )
R> gx3 <- expression( theta[2]*x^theta[3] )
R> mod3 <- fitsde(data=mydata,drift=fx3,diffusion=gx3,start = list(theta1=1,
+ theta2=1,theta3=1),pmle="euler")
R> ## Computes AIC
R> AIC <- c(AIC(truemod),AIC(mod1),AIC(mod2),AIC(mod3))
R> Test <- data.frame(AIC,row.names = c("True mod","Comp mod1","Comp mod2","Comp mod3"))
R> Bestmod <- rownames(Test)[which.min(Test[,1])]
R> Bestmod
[1] "True mod"
the estimates under the different models:
R> Theta1 <- c(coef(truemod)[[1]],coef(mod1)[[1]],coef(mod2)[[1]],coef(mod3)[[1]])
R> Theta2 <- c(coef(truemod)[[2]],coef(mod1)[[2]],coef(mod2)[[2]],coef(mod3)[[2]])
R> Theta3 <- c(coef(truemod)[[3]],coef(mod1)[[3]],coef(mod2)[[3]],coef(mod3)[[3]])
R> Theta4 <- c("",round(coef(mod1)[[4]],5),"","")
R> Parms <- data.frame(Theta1,Theta2,Theta3,Theta4,row.names = c("True mod",
+ "Comp mod1","Comp mod2","Comp mod3"))
R> Parms
Theta1 Theta2 Theta3 Theta4
True mod 1.95740 0.31178 0.47507
Comp mod1 1.11075 1.76450 0.31183 0.47503
Comp mod2 -0.89167 2.10938 0.29958
Comp mod3 8.33085 0.31075 0.47937
We make use of real data of the U.S. Interest Rates monthly form 06/1964 to 12/1989 (see Figure 1) available in package Ecdat, and we estimate the parameters θ_=(θ1,θ2,θ3,θ4) of CKLS model. These data have been analyzed by Brouste et all (2014) with yuima package, here we confirm the results of the estimates by several approximation methods.
R> data(Irates)
R> rates <- Irates[, "r1"]
R> X <- window(rates, start = 1964.471, end = 1989.333)
R> plot(X)
we can now use all previous methods by fitsde
function
to estimate the parameters of CKLS model as follows:
R> fx <- expression( theta[1]+theta[2]*x ) ## drift coefficient of CKLS model
R> gx <- expression( theta[3]*x^theta[4] ) ## diffusion coefficient of CKLS model
R> pmle <- eval(formals(fitsde.default)$pmle)
R> fitres <- lapply(1:4, function(i) fitsde(X,drift=fx,diffusion=gx,pmle=pmle[i],
+ start = list(theta1=1,theta2=1,theta3=1,theta4=1)))
R> Coef <- data.frame(do.call("cbind",lapply(1:4,function(i) coef(fitres[[i]]))))
R> Info <- data.frame(do.call("rbind",lapply(1:4,function(i) logLik(fitres[[i]]))),
+ do.call("rbind",lapply(1:4,function(i) AIC(fitres[[i]]))),
+ do.call("rbind",lapply(1:4,function(i) BIC(fitres[[i]]))),
+ row.names=pmle)
R> names(Coef) <- c(pmle)
R> names(Info) <- c("logLik","AIC","BIC")
R> Coef
euler kessler ozaki shoji
theta1 2.07695 2.14335 2.11532 2.10150
theta2 -0.26319 -0.27434 -0.26905 -0.26647
theta3 0.13022 0.12598 0.12652 0.13167
theta4 1.45132 1.46917 1.46491 1.45131
logLik AIC BIC
euler -237.88 483.76 487.15
kessler -237.78 483.57 486.96
ozaki -237.84 483.67 487.07
shoji -237.88 483.76 487.15
In Figure 2 we reports the sample mean of the solution of the CKLS model with the estimated parameters and real data, their empirical 95% confidence bands (from the 2.5th to the 97.5th percentile), we can proceed as follows:
R> set.seed(1234, kind = "L'Ecuyer-CMRG")
R> f <- expression( (2.076-0.263*x) )
R> g <- expression( 0.130*x^1.451 )
R> mod <- snssde1d(drift=f,diffusion=g,x0=X[1],M=500, N=length(X),t0=1964.471, T=1989.333)
R> mod
Itô Sde 1D:
| dX(t) = (2.076 - 0.263 * X(t)) * dt + 0.13 * X(t)^1.451 * dW(t)
Method:
| Euler scheme with order 0.5
Summary:
| Size of process | N = 299.
| Number of simulation | M = 500.
| Initial value | x0 = 3.317.
| Time of process | t in [1964.5,1989.3].
| Discretization | Dt = 0.08343.
R> plot(mod,type="n",ylim=c(0,30))
R> lines(X,col=4,lwd=2)
R> lines(time(mod),apply(mod$X,1,mean),col=2,lwd=2)
R> lines(time(mod),apply(mod$X,1,bconfint,level=0.95)[1,],col=5,lwd=2)
R> lines(time(mod),apply(mod$X,1,bconfint,level=0.95)[2,],col=5,lwd=2)
R> legend("topleft",c("real data","mean path",paste("bound of", 95,"% confidence")),inset = .01,col=c(4,2,5),lwd=2,cex=0.8)
snssdekd()
&
dsdekd()
& rsdekd()
- Monte-Carlo
Simulation and Analysis of Stochastic Differential Equations.bridgesdekd()
&
dsdekd()
& rsdekd()
- Constructs and
Analysis of Bridges Stochastic Differential Equations.fptsdekd()
&
dfptsdekd()
- Monte-Carlo Simulation and Kernel Density
Estimation of First passage time.MCM.sde()
&
MEM.sde()
- Parallel Monte-Carlo and Moment Equations for
SDEs.TEX.sde()
- Converting
Sim.DiffProc Objects to LaTeX.fitsde()
- Parametric Estimation
of 1-D Stochastic Differential Equation.Brouste A, Fukasawa M, Hino H, Iacus SM, Kamatani K, Koike Y, Masuda H, Nomura R,Ogihara T, Shimuzu Y, Uchida M, Yoshida N (2014). The YUIMA Project: A ComputationalFramework for Simulation and Inference of Stochastic Differential Equations.” Journal of Statistical Software, 57(4), 1-51. URL https://www.jstatsoft.org/v57/i04.
Guidoum AC, Boukhetala K (2020). “Performing Parallel Monte Carlo and Moment Equations Methods for Itô and Stratonovich Stochastic Differential Systems: R Package Sim.DiffProc”. Journal of Statistical Software, 96(2), 1–82. https://doi.org/10.18637/jss.v096.i02
Iacus SM (2008). Simulation and Inference for Stochastic Differential Equations: With R Examples. Springer-Verlag, New York.
Department of Mathematics and Computer Science, Faculty of Sciences and Technology, University of Tamanghasset, Algeria, E-mail (acguidoum@univ-tam.dz)↩︎
Faculty of Mathematics, University of Science and Technology Houari Boumediene, BP 32 El-Alia, U.S.T.H.B, Algeria, E-mail (kboukhetala@usthb.dz)↩︎