library(calculus)
Orthogonal coordinates are a special but extremely common case of curvilinear coordinates where the coordinate surfaces all meet at right angles. The chief advantage of non-Cartesian coordinates is that they can be chosen to match the symmetry of the problem. For example, spherical coordinates are the most common curvilinear coordinate systems and are used in Earth sciences, cartography, quantum mechanics, relativity, and engineering.1 These coordinates may be derived from a set of Cartesian coordinates by using a transformation that is locally invertible (a one-to-one map) at each point. This means that one can convert a point given in a Cartesian coordinate system to its curvilinear coordinates and back. Differential operators such as the gradient, divergence, curl, and Laplacian can be transformed from one coordinate system to another via the usage of scale factors.2 The package implements these operators in Cartesian, polar, spherical, cylindrical, parabolic coordinates, and supports arbitrary orthogonal coordinates systems defined by custom scale factors.
Curvilinear coordinates \((q_1, q_2, q_3)\) | Transformation from cartesian \((x, y, z)\) | Scale factors |
---|---|---|
Spherical polar coordinates \((r,\theta ,\phi )\) | \({\begin{aligned}x&=r\sin \theta \cos \phi \\y&=r\sin \theta \sin \phi \\z&=r\cos \theta \end{aligned}}\) | \({\begin{aligned}h_{1}&=1\\h_{2}&=r\\h_{3}&=r\sin \theta \end{aligned}}\) |
Cylindrical polar coordinates \((r,\phi ,z)\) | \({\begin{aligned}x&=r\cos \phi \\y&=r\sin \phi \\z&=z\end{aligned}}\) | \({\begin{aligned}h_{1}&=h_{3}=1\\h_{2}&=r\end{aligned}}\) |
Parabolic coordinates \((u,v,\phi )\) | \({\begin{aligned}x&=uv\cos \phi \\y&=uv\sin \phi \\z&={\frac {1}{2}}(u^{2}-v^{2})\end{aligned}}\) | \({\begin{aligned}h_{1}&=h_{2}={\sqrt {u^{2}+v^{2}}}\\h_{3}&=uv\end{aligned}}\) |
Parabolic cylindrical coordinates \((u,v,z)\) | \({\begin{aligned}x&={\frac {1}{2}}(u^{2}-v^{2})\\y&=uv\\z&=z\end{aligned}}\) | \({\begin{aligned}h_{1}&=h_{2}={\sqrt {u^{2}+v^{2}}}\\h_{3}&=1\end{aligned}}\) |
The gradient of a scalar-valued function \(F\) is the vector \((\nabla F)_i\) whose components are the partial derivatives of \(F\) with respect to each variable \(i\). In arbitrary orthogonal coordinate systems, the gradient is expressed in terms of the scale factors \(h_i\) as follows:
\[(\nabla F)_i = \frac{1}{h_i}\partial_iF\]
The function gradient
implements the symbolic and numeric gradient of functions
,
expressions
and characters
. In Cartesian
coordinates:
gradient("x*y*z", var = c("x", "y", "z"))
#> [1] "y * z" "x * z" "x * y"
and in spherical coordinates:
gradient("x*y*z", var = c("x","y","z"), coordinates = "spherical")
#> [1] "1/1 * (y * z)" "1/x * (x * z)" "1/(x*sin(y)) * (x * y)"
To support arbitrary orthogonal coordinate systems, it is possible to
pass custom scale factors to the argument coordinates
. For
instance, the following call is equivalent to the previous example in
spherical coordinates where the scale factors are now explicitly
specified:
gradient("x*y*z", var = c("x","y","z"), coordinates = c(1,"x","x*sin(y)"))
#> [1] "1/(1) * (y * z)" "1/(x) * (x * z)" "1/(x*sin(y)) * (x * y)"
Numerical methods are applied when working with
functions
with the same sintax introduced for derivatives.
<- function(x, y, z) x*y*z
f gradient(f, var = c(x = 1, y = pi/2, z = 0), coordinates = "spherical")
#> [1] 0.000000 0.000000 1.570796
or in vectorized form:
<- function(x) x[1]*x[2]*x[3]
f gradient(f, var = c(1, pi/2, 0), coordinates = "spherical")
#> [1] 0.000000 0.000000 1.570796
When the function \(F\) is a tensor-valued function \(F_{d_1,\dots,d_n}\), the gradient is computed for each scalar component.
\[(\nabla F_{d_1,\dots,d_n})_i = \frac{1}{h_i}\partial_iF_{d_1,\dots,d_n}\]
In particular, this reduces to the Jacobian matrix for vector-valued functions \(F_{d_1}\):
<- function(x) c(prod(x), sum(x))
f gradient(f, var = c(3, 2, 1))
#> [,1] [,2] [,3]
#> [1,] 2 3 6
#> [2,] 1 1 1
that may be expressed in arbitrary orthogonal coordinate systems.
<- function(x) c(prod(x), sum(x))
f gradient(f, var = c(3, 2, 1), coordinates = "cylindrical")
#> [,1] [,2] [,3]
#> [1,] 2 1.0000000 6
#> [2,] 1 0.3333333 1
The function jacobian
is a wrapper for gradient
that always returns the Jacobian as a matrix, even in the case of
unidimensional scalar-valued functions.
<- function(x) x^2
f jacobian(f, var = c(1))
#> [,1]
#> [1,] 2
In Cartesian coordinates, the Hessian of a scalar-valued function \(F\) is the square matrix of second-order partial derivatives:
\[(H(F))_{ij} = \partial_{ij}F\]
It might be tempting to apply the definition of the Hessian as the Jacobian of the gradient to write it in terms of the scale factors. However, this results in a Hessian matrix that is not symmetric and ignores the distinction between vector and covectors in tensor analysis. The generalization to arbitrary coordinate system is not implemented and only Cartesian coordinates are supported:
<- function(x, y, z) x*y*z
f hessian(f, var = c(x = 3, y = 2, z = 1))
#> [,1] [,2] [,3]
#> [1,] -1.918057e-11 1.000000e+00 2.000000e+00
#> [2,] 1.000000e+00 -4.315627e-11 3.000000e+00
#> [3,] 2.000000e+00 3.000000e+00 -1.726251e-10
When the function \(F\) is a
tensor-valued function \(F_{d_1,\dots,d_n}\), the hessian
is computed for each scalar component.
\[(H(F_{d_1,\dots,d_n}))_{ij} = \partial_{ij}F_{d_1,\dots,d_n}\]
In this case, the function returns an array
of Hessian
matrices:
<- function(x, y, z) c(x*y*z, x+y+z)
f <- hessian(f, var = c(x = 3, y = 2, z = 1)) h
that can be extracted with the corresponding indices.
1,,]
h[#> [,1] [,2] [,3]
#> [1,] -1.918057e-11 1.000000e+00 2.000000e+00
#> [2,] 1.000000e+00 -4.315627e-11 3.000000e+00
#> [3,] 2.000000e+00 3.000000e+00 -1.726251e-10
2,,]
h[#> [,1] [,2] [,3]
#> [1,] -4.242789e-12 2.932800e-12 -8.683390e-12
#> [2,] 2.932800e-12 -7.066143e-11 2.825347e-12
#> [3,] -8.683390e-12 2.825347e-12 -1.848667e-10
The divergence of a vector-valued function \(F_i\) produces a scalar value \(\nabla \cdot F\) representing the volume density of the outward flux of the vector field from an infinitesimal volume around a given point.3 In terms of scale factors, it is expressed as follows:
\[\nabla \cdot F = \frac{1}{J}\sum_i\partial_i\Biggl(\frac{J}{h_i}F_i\Biggl)\]
where \(J=\prod_ih_i\). When \(F\) is an array
of
vector-valued functions \(F_{d_1,\dots,d_n,i}\), the divergence
is computed for each vector:
\[(\nabla \cdot F)_{d_1,\dots,d_n} = \frac{1}{J}\sum_i\partial_i\Biggl(\frac{J}{h_i}F_{d_1,\dots,d_n,i}\Biggl)=\frac{1}{J}\sum_i\partial_i(Jh_i^{-1})F_{d_1,\dots,d_n,i}+Jh_i^{-1}\partial_i(F_{d_1,\dots,d_n,i})\]
where the last equality is preferable in practice as the derivatives of the scale factor can be computed symbolically and the computation of the derivatives of \(F\) is more efficient than the direct computation of \(\partial_i\bigl(\frac{J}{h_i}F_{d_1,\dots,d_n,i}\bigl)\) via finite differences. In Cartesian coordinates:
<- c("x^2", "y^2", "z^2")
f divergence(f, var = c("x","y","z"))
#> [1] "2 * x + 2 * y + 2 * z"
In polar coordinates:
<- c("sqrt(r)/10", "sqrt(r)")
f divergence(f, var = c("r","phi"), coordinates = "polar")
#> [1] "(0.5 * r^-0.5/10 * r + (sqrt(r)/10)) / (1*r)"
And for tensors of vector-valued functions:
<- matrix(c("x^2","y^2","z^2","x","y","z"), nrow = 2, byrow = TRUE)
f divergence(f, var = c("x","y","z"))
#> [1] "2 * x + 2 * y + 2 * z" "1 + 1 + 1"
The same syntax holds for functions
where numerical
methods are automatically applied:
<- function(x,y,z) matrix(c(x^2,y^2,z^2,x,y,z), nrow = 2, byrow = TRUE)
f divergence(f, var = c(x = 0, y = 0, z = 0))
#> [1] 3.004132e-21 3.000000e+00
The curl of a vector-valued function \(F_i\) at a point is represented by a vector whose length and direction denote the magnitude and axis of the maximum circulation.4 In 2 dimensions, the curl is written in terms of the scale factors \(h\) and the Levi-Civita symbol \(\epsilon\) as follows:
\[\nabla \times F = \frac{1}{h_1h_2}\sum_{ij}\epsilon_{ij}\partial_i\Bigl(h_jF_j\Bigl)= \frac{1}{h_1h_2}\Biggl(\partial_1\Bigl(h_2F_2\Bigl)-\partial_2\Bigl(h_1F_1\Bigl)\Biggl)\]
In 3 dimensions:
\[(\nabla \times F)_k = \frac{h_k}{J}\sum_{ij}\epsilon_{ijk}\partial_i\Bigl(h_jF_j\Bigl)\]
where \(J=\prod_i h_i\). This
suggests to implement the curl
in \(m+2\) dimensions in such a way
that the formula reduces correctly to the previous cases:
\[(\nabla \times F)_{k_1\dots k_m} = \frac{h_{k_1}\cdots h_{k_m}}{J}\sum_{ij}\epsilon_{ijk_1\dots k_m}\partial_i\Bigl(h_jF_j\Bigl)\]
And in particular, when \(F\) is an
array
of vector-valued functions \(F_{d_1,\dots,d_n,i}\) the curl
is computed for each vector:
\[ \begin{split} (\nabla \times F)_{d_1\dots d_n,k_1\dots k_m} & =\\ &=\frac{h_{k_1}\cdots h_{k_m}}{J}\sum_{ij}\epsilon_{ijk_1\dots k_m}\partial_i\Bigl(h_jF_{d_1\dots d_n,j}\Bigl) \\ &=\sum_{ij}\frac{1}{h_ih_j}\epsilon_{ijk_1\dots k_m}\partial_i\Bigl(h_jF_{d_1\dots d_n,j}\Bigl) \\ &=\sum_{ij}\frac{1}{h_ih_j}\epsilon_{ijk_1\dots k_m}\Bigl(\partial_i(h_j)F_{d_1\dots d_n,j}+h_j\partial_i(F_{d_1\dots d_n,j})\Bigl) \end{split} \]
where the last equality is preferable in practice as the derivatives of the scale factor can be computed symbolically and the computation of the derivatives of \(F\) is more efficient than the direct computation of \(\partial_i\bigl(h_jF_{d_1\dots d_n,j}\bigl)\) via finite differences. In 2-dimensional Cartesian coordinates:
<- c("x^3*y^2","x")
f curl(f, var = c("x","y"))
#> [1] "(1) * 1 + (x^3 * (2 * y)) * -1"
In 3 dimensions, for an irrotational vector field:
<- c("x","-y","z")
f curl(f, var = c("x","y","z"))
#> [1] "0" "0" "0"
And for tensors of vector-valued functions:
<- matrix(c("x","-y","z","x^3*y^2","x","0"), nrow = 2, byrow = TRUE)
f curl(f, var = c("x","y","z"))
#> [,1] [,2] [,3]
#> [1,] "0" "0" "0"
#> [2,] "0" "0" "(1) * 1 + (x^3 * (2 * y)) * -1"
The same syntax holds for functions
where numerical
methods are automatically applied and for arbitrary orthogonal
coordinate systems as shown in the previous sections.
The Laplacian is a differential operator given by the divergence of the gradient of a scalar-valued function \(F\), resulting in a scalar value giving the flux density of the gradient flow of a function. The Laplacian occurs in differential equations that describe many physical phenomena, such as electric and gravitational potentials, the diffusion equation for heat and fluid flow, wave propagation, and quantum mechanics.5 In terms of the scale factor, the operator is written as:
\[\nabla^2F = \frac{1}{J}\sum_i\partial_i\Biggl(\frac{J}{h_i^2}\partial_iF\Biggl)\]
where \(J=\prod_ih_i\). When the
function \(F\) is a tensor-valued
function \(F_{d_1,\dots,d_n}\), the laplacian
is computed for each scalar component:
\[(\nabla^2F)_{d_1\dots d_n} = \frac{1}{J}\sum_i\partial_i\Biggl(\frac{J}{h_i^2}\partial_iF_{d_1\dots d_n}\Biggl)=\frac{1}{J}\sum_i\partial_i\Bigl(Jh_i^{-2}\Bigl)\partial_iF_{d_1\dots d_n}+Jh_i^{-2}\partial^2_iF_{d_1\dots d_n}\]
where the last equality is preferable in practice as the derivatives of the scale factor can be computed symbolically and the computation of the derivatives of \(F\) is more efficient than the direct computation of \(\partial_i\bigl(\frac{J}{h_i^2}\partial_iF\bigl)\) via finite differences. In Cartesian coordinates:
<- "x^3+y^3+z^3"
f laplacian(f, var = c("x","y","z"))
#> [1] "3 * (2 * x) + 3 * (2 * y) + 3 * (2 * z)"
And for tensors of scalar-valued functions:
<- array(c("x^3+y^3+z^3", "x^2+y^2+z^2", "y^2", "z*x^2"), dim = c(2,2))
f laplacian(f, var = c("x","y","z"))
#> [,1] [,2]
#> [1,] "3 * (2 * x) + 3 * (2 * y) + 3 * (2 * z)" "2"
#> [2,] "2 + 2 + 2" "z * 2"
The same syntax holds for functions
where numerical
methods are automatically applied and for arbitrary orthogonal
coordinate systems as shown in the previous sections.
Guidotti E (2022). “calculus: High-Dimensional Numerical and Symbolic Calculus in R.” Journal of Statistical Software, 104(5), 1-37. doi:10.18637/jss.v104.i05
A BibTeX entry for LaTeX users is
@Article{calculus,
title = {{calculus}: High-Dimensional Numerical and Symbolic Calculus in {R}},
author = {Emanuele Guidotti},
journal = {Journal of Statistical Software},
year = {2022},
volume = {104},
number = {5},
pages = {1--37},
doi = {10.18637/jss.v104.i05},
}