luajr
luajr allows you to run Lua code from R.
Lua is a lightweight, simple, and fast scripting language that is used in a variety of settings. The standard Lua interpreter is already reasonably fast, but there is also a just-in-time compiler for Lua called LuaJIT that is even faster. luajr uses LuaJIT.
This is not a guide to Lua or LuaJIT; it is a quick-start guide to luajr for people who already know how to program in Lua. See the Lua web site for resources related to coding in Lua.
lua() and lua_shell()
To get a feel for luajr or to run “one-off” Lua code from your R project, use lua() and lua_shell().
When you pass a character string to lua(), it is run as Lua code:
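A minimal sketch of this, assuming luajr is installed and attached; the Lua snippet itself is only an illustration:

```r
library(luajr)

# The string is executed as a Lua chunk; whatever the chunk returns
# is converted to an R value and handed back.
lua("return 'Hello ' .. 'world!'")
#> [1] "Hello world!"
```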
Assignments to global variables will persist between calls to lua():
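For instance, the following sketch (with an illustrative variable name, my_animal) sets a global in one call and reads it back in the next:

```r
library(luajr)

lua("my_animal = 'walrus'")   # assign a global in the default Lua state
lua("return my_animal")       # a later call still sees that global
#> [1] "walrus"
```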
This is because luajr maintains a “default Lua state” which holds all global variables. This default Lua state is opened the first time a package function is used. You can create your own, separate Lua states, or reset the default Lua state (see Lua states, below).
Assignments to local variables will not persist between calls to lua():
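A small sketch of that behaviour follows. The first line repeats the global assignment so that the block runs on its own; the variable name my_animal is illustrative:

```r
library(luajr)

lua("my_animal = 'walrus'")         # global: lives in the default Lua state
lua("local my_animal = 'donkey'")   # local: exists only while this chunk runs
lua("return my_animal")             # so this call sees the global again
#> [1] "walrus"
```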
In this case, the final call returns "walrus" because the local variable my_animal goes out of scope as soon as the call to lua() that declared it ends, so the following call to lua() refers back to the global variable my_animal from before.
You can include more than one statement in the code run by lua():
lua("local my_veg = 'potato'; local my_dish = my_veg .. ' pie'; return my_dish")
#> [1] "potato pie"
You can also use the filename argument to lua() to load and run a Lua source file, instead of running the contents of a string.
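As a rough sketch, assuming that lua() can be called with only the filename argument supplied, a script written to a temporary file (the file and its contents are made up for illustration) can be run like this:

```r
library(luajr)

# Write a one-line Lua script to a temporary file.
script = tempfile(fileext = ".lua")
writeLines("return 6 * 7", script)

# Run the file rather than a code string.
lua(filename = script)
#> [1] 42
```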
Call lua_shell() to open an interactive Lua shell at the R prompt. This can be helpful for debugging or for testing Lua statements.
lua_func()
The key piece of functionality for luajr is probably lua_func(). This allows you to call Lua functions from R.
The first argument to lua_func(), func, is a string that should evaluate to a Lua function. lua_func() then returns an R function that can be used to call that Lua function from R. For example, you can use lua_func() to access an existing Lua function from R:
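A minimal sketch, wrapping Lua's built-in type function; the R name luatype is just an illustrative choice:

```r
library(luajr)

luatype = lua_func("type")   # R wrapper around Lua's built-in type()
luatype(TRUE)                # TRUE arrives in Lua as the boolean true
#> [1] "boolean"
```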
Here, "type" is just referring to the built-in Lua function type which returns a string describing the Lua type of the value passed to it. You can also use lua_func() to refer to a previously defined function in the default Lua state:
lua("function squared(x) return x^2 end")
lua("return squared(4)")
#> [1] 16
sq = lua_func("squared")
sq(8)
#> [1] 64
Or you can use lua_func() to define an anonymous Lua function:
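For example, a sketch with an illustrative doubling function:

```r
library(luajr)

# The string is a Lua function expression with no name of its own.
timestwo = lua_func("function(x) return 2*x end")
timestwo(123)
#> [1] 246
```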
Under the hood, lua_func() just takes its first parameter (a string), adds "return " to the front of it, executes it as Lua code, and registers the result as the function.
The second argument to lua_func(), argcode, is also very important. argcode determines how the arguments passed to the function from R are translated into Lua values for use inside the function.
The permissible arg codes are:
- 's': simplest Lua type (the default)
- 'a': array type
- '1' to '9': same as 's', but require that the argument has the given length (1 to 9)
- 'v': pass by value
- 'r': pass by reference

The kinds of R values that can be passed to Lua functions, and their behaviour under different arg codes, are summarized in the following table:
R type | Example R value | arg code ‘s’ | ‘a’ | ‘1’ | ‘2’ | ‘v’ | ‘r’ |
---|---|---|---|---|---|---|---|
NULL | NULL | nil | nil | nil | nil | nil | nil |
logical(1) | TRUE | true | {true} | true | error | luajr.logical({true}) | luajr.logical_r({true}) |
integer(1) | 1L | 1 | {1} | 1 | error | luajr.integer({1}) | luajr.integer_r({1}) |
numeric(1) | 3.14159 | 3.14159 | {3.14159} | 3.14159 | error | luajr.numeric({3.14159}) | luajr.numeric_r({3.14159}) |
character(1) | "howdy" | "howdy" | {"howdy"} | "howdy" | error | luajr.character({"howdy"}) | luajr.character_r({"howdy"}) |
logical(nn) | c(TRUE, FALSE) | {true, false} | {true, false} | error | {true, false} | luajr.logical({true, false}) | luajr.logical_r({true, false}) |
integer(nn) | 1:2 | {1, 2} | {1, 2} | error | {1, 2} | luajr.integer({1, 2}) | luajr.integer_r({1, 2}) |
numeric(nn) | exp(0:1) | {1, 2.71828...} | {1, 2.71828...} | error | {1, 2.71828...} | luajr.numeric({1, 2.71828...}) | luajr.numeric_r({1, 2.71828...}) |
character(nn) | letters[1:2] | {"a", "b"} | {"a", "b"} | error | {"a", "b"} | luajr.character({"a", "b"}) | luajr.character_r({"a", "b"}) |
list(...) | list(1, b='b') | {[1]=1, b='b'} | error | error | error | x = luajr.list(); x[1] = luajr.numeric({1}); x.b = luajr.character({'b'}); return x | x = luajr.list(); x[1] = luajr.numeric_r({1}); x.b = luajr.character_r({'b'}); return x |
external pointer | lua_open() | userdata:… | userdata:… | userdata:… | userdata:… | userdata:… | userdata:… |
Above, nn stands for an integer that is greater than 1; in the examples, it stands specifically for 2.
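To make two of the table's columns concrete, here is a hedged sketch using illustrative function names (len_a, only_one). It assumes, as the table suggests, that 'a' always delivers a plain Lua table and that a length mismatch under '1' surfaces as an ordinary R error:

```r
library(luajr)

# Arg code 'a': even a length-1 vector arrives as a Lua table,
# so Lua's length operator # can be applied to it.
len_a = lua_func("function(x) return #x end", "a")
len_a(5)            # 5 arrives as {5}
#> [1] 1
len_a(c(5, 6, 7))   # arrives as {5, 6, 7}
#> [1] 3

# Arg code '1': behaves like 's', but insists on a length-1 argument.
only_one = lua_func("function(x) return x end", "1")
only_one(10)
#> [1] 10
result = try(only_one(c(10, 20)), silent = TRUE)   # length 2: rejected
inherits(result, "try-error")
#> [1] TRUE
```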
There should be one character in argcode for every argument of the function, but the string is “recycled” when there are more arguments passed than characters in the argcode string. So, for example, just passing "s" as argcode means all parameters will be passed as the simplest Lua type, while if argcode is "sr", then the first argument has arg code "s", the second argument has arg code "r", the third argument has arg code "s", etc.
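The recycling rule can be probed with a sketch like the one below. The function name describe is hypothetical, and the output is deliberately not shown: the first and third arguments should report as "number" (arg code 's'), while what Lua's type() reports for the by-reference second argument depends on how luajr implements its reference types.

```r
library(luajr)

# argcode "sr" with three arguments is recycled to 's', 'r', 's'.
describe = lua_func(
    "function(a, b, c) return type(a) .. ' ' .. type(b) .. ' ' .. type(c) end",
    "sr")

# Returns one string holding the Lua type of each argument in turn.
describe(1, 2, 3)
```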
When a vector (logical vector, integer vector, numeric vector, or character vector) is passed from R to Lua by reference, modifications made to the elements of that passed-in vector persist back in the R calling frame. For example:
values = c(1.0, 2.0, 3.0)
keep = lua_func("function(x) x[1] = 999 end", "v") # passed by value
keep(values)
print(values)
#> [1] 1 2 3
change = lua_func("function(x) x[1] = 999 end", "r") # passed by reference
change(values)
print(values)
#> [1] 999 2 3
Vectors can never be resized by reference; only their already-existing elements can be changed by reference.
Lists are always passed by value, but their vector elements can be either passed by value or by reference depending on the arg code. In order to make changes to a list, it needs to be returned to the calling function. For example:
x = list(1)
f1 = lua_func("function(x) x[1][1] = 999; x.a = 42; end", "v")
f1(x)
print(x)
#> [[1]]
#> [1] 1
f2 = lua_func("function(x) x[1][1] = 999; x.a = 42; end", "r")
f2(x)
print(x)
#> [[1]]
#> [1] 999
Note that in the examples above, we modify x[1][1], not just x[1]. That is because, even with arg code 'r', only the vector elements of a list are passed by reference, so we need to modify specific elements of those vectors if we want to make changes by reference. Setting x[1] = 999 would only reassign an element of the list x, and a list itself cannot be passed by reference, so this would have no effect on the R object passed in. By contrast, setting x[1][1] = 999 modifies the passed-in vector x[1], which can be passed in by reference, so this does change the R object that is passed in.
To modify a list, we have to return its new value from the function:
x = list(1)
f3 = lua_func("function(x) x[1][1] = 999; x.a = 42; return x; end", "v")
x = f3(x)
print(x)
#> [[1]]
#> [1] 999
#>
#> $a
#> [1] 42
f4 = lua_func("function(x) x[1] = luajr.numeric({888, 999}); return x; end", "v")
x = f4(x)
print(x)
#> [[1]]
#> [1] 888 999
#>
#> $a
#> [1] 42
Here, because we are returning a modified value, we can add elements to the list (f3) and change whole entries of the list (f4).
In the same way that using R packages cpp11 or Rcpp allows you to bridge your R code with C++ code that runs faster than R, you can use luajr to bridge your R code with Lua code that runs faster than R.
In general, C/C++ code runs about 5-1,000 times faster than the equivalent R code.
In my experience, luajr code often presents a similar speedup, about the same as C/C++ code, or in the worst case, maybe half as fast. Sometimes luajr code can be faster than C/C++, though usually it isn’t quite as good.
Why use luajr then? Rcpp and cpp11 require a C++ toolchain (e.g. gcc, clang, etc.) and long compilation times, whereas luajr doesn’t. This means that luajr is usable when a C++ compiler isn’t available, or when compilation times are prohibitive or an annoyance.
In the following section, we look at two aspects of benchmarking. In the first example, we compare different ways of passing vectors to Lua functions relative to R. In the second example, we compare more fundamentally the difference in running a whole algorithm in Lua versus R.
For reasonably long vectors, passing by reference ('r') is faster than passing by value ('v'), which is faster than passing by simplify ('s', 'a', or '1'–'9'). But for relatively short vectors, passing by simplify can avoid some overhead, and for very simple operations on very short vectors, it might not be worth using Lua at all.
To illustrate the above points, here is some code that takes a numeric vector and calculates the sum of squares, comparing a pure R function with three alternatives written in Lua, respectively passing the vector by reference, by value, and by simplify.
v1 = rnorm(1e1)
v4 = rnorm(1e4)
v7 = rnorm(1e7)
lua("sum2 = function(x) local s = 0; for i=1,#x do s = s + x[i]*x[i] end; return s end")
sum2 = function(x) sum(x*x)
sum2_r = lua_func("sum2", "r")
sum2_v = lua_func("sum2", "v")
sum2_s = lua_func("sum2", "s")
# Comparing the results of each function:
sum2(v1) # Pure R version
#> [1] 6.83011
sum2_r(v1) # luajr pass-by-reference
#> [1] 6.83011
sum2_v(v1) # luajr pass-by-value
#> [1] 6.83011
sum2_s(v1) # luajr pass-by-simplify
#> [1] 6.83011
The time taken to sum v1, v4 and v7, depending upon the function kind, on a 2019-era MacBook Pro is summarised in this table:
Call | Number of summands | Method | Median runtime |
---|---|---|---|
sum2(v1) | 10 | Pure R | 1 µs |
sum2_r(v1) | 10 | luajr pass by reference | 6 µs |
sum2_v(v1) | 10 | luajr pass by value | 7 µs |
sum2_s(v1) | 10 | luajr pass by simplify | 4 µs |
sum2(v4) | 10,000 | Pure R | 63 µs |
sum2_r(v4) | 10,000 | luajr pass by reference | 18 µs |
sum2_v(v4) | 10,000 | luajr pass by value | 93 µs |
sum2_s(v4) | 10,000 | luajr pass by simplify | 78 µs |
sum2(v7) | 10,000,000 | Pure R | 49,000 µs |
sum2_r(v7) | 10,000,000 | luajr pass by reference | 13,000 µs |
sum2_v(v7) | 10,000,000 | luajr pass by value | 96,000 µs |
sum2_s(v7) | 10,000,000 | luajr pass by simplify | 124,000 µs |
When the vector is 10 elements long, the R version wins handily.
When the vector is 10,000 elements long, the pass-by-reference Lua version is fastest, but the other methods are all comparable.
When the vector is 10,000,000 elements long, the story is similar.
This is an example where luajr doesn’t add that much speed—the function is relatively short, and there is a certain amount of overhead in invoking it, while the R function doesn’t have the overhead of transferring control between languages.
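The text does not say how the runtimes above were measured. As one possible approach, the sketch below uses only base R's system.time() and a hypothetical helper, time_per_call_us, to estimate the mean (not median) time per call; a dedicated benchmarking package would give finer-grained statistics.

```r
library(luajr)

# Hypothetical helper, not part of luajr: mean time per call in
# microseconds, estimated by timing n repeated calls in a loop.
time_per_call_us = function(f, x, n = 100) {
    elapsed = system.time(for (i in seq_len(n)) f(x))[["elapsed"]]
    elapsed / n * 1e6
}

# Same sum-of-squares functions as in the comparison above.
lua("sum2 = function(x) local s = 0; for i=1,#x do s = s + x[i]*x[i] end; return s end")
sum2_R = function(x) sum(x * x)
sum2_r = lua_func("sum2", "r")

v4 = rnorm(1e4)
time_per_call_us(sum2_R, v4)   # pure R
time_per_call_us(sum2_r, v4)   # luajr, pass by reference
```

Timings will vary by machine, so no expected output is shown.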
Consider the following:
logistic_map_R = function(x0, burn, iter, A)
{
    result_x = numeric(length(A) * iter)
    j = 1
    for (a in A) {
        x = x0
        for (i in 1:burn) {
            x = a * x * (1 - x)
        }
        for (i in 1:iter) {
            result_x[j] = x
            x = a * x * (1 - x)
            j = j + 1
        }
    }
    return (list2DF(list(a = rep(A, each = iter), x = result_x)))
}
logistic_map_L = lua_func(
"function(x0, burn, iter, A)
    local dflen = #A * iter
    local result = luajr.dataframe()
    result.a = luajr.numeric_r(dflen, 0)
    result.x = luajr.numeric_r(dflen, 0)
    local j = 1
    for k,a in pairs(A) do
        local x = x0
        for i = 1, burn do
            x = a * x * (1 - x)
        end
        for i = 1, iter do
            result.a[j] = a
            result.x[j] = x
            x = a * x * (1 - x)
            j = j + 1
        end
    end
    return result
end", "sssr")
Here we are comparing two different versions (R versus Lua) of running a parameter sweep of the logistic map, a chaotic dynamical system popularized by Bob May in a 1976 Nature article. The output looks like this:
logistic_map = logistic_map_L(0.5, 50, 100, 200:385/100)
plot(logistic_map$a, logistic_map$x, pch = ".")
The times taken by each function on a 2019-era MacBook Pro are as follows:
Call | Method | Median runtime |
---|---|---|
logistic_map_R(0.5, 50, 100, 200:385/100) | R function | 1900 µs |
logistic_map_L(0.5, 50, 100, 200:385/100) | Lua function | 200 µs |
The version written in Lua is around 10 times faster than the version in R.
The speedup was much more notable in an earlier test where the R version first created the data frame and then performed the iteration, i.e. with the line result$x[j] = x instead of result_x[j] = x. The median runtime for that R version was two orders of magnitude slower; the extra overhead associated with data.frame methods was pointed out by Tim Taylor.
lua_open(), lua_reset()
All the functions mentioned above (lua(), lua_shell(), and lua_func()) can also take an argument L that specifies a particular Lua state that the function operates in.
When L = NULL (the default) the functions operate on the default Lua state.
But you can also open alternative Lua states using lua_open(), and then by passing the result as the parameter L, specify that the function operates in that specific state. For example:
L1 = lua_open()
lua("a = 2")
lua("a = 4", L = L1)
lua("return a")
#> [1] 2
lua("return a", L = L1)
#> [1] 4
There is no lua_close in luajr because Lua states are closed automatically when they are garbage collected in R.
lua_reset() resets the default Lua state:
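A short sketch of the effect, using an illustrative global named a:

```r
library(luajr)

lua("a = 2")       # define a global in the default Lua state
lua_reset()        # start over with a fresh default state
lua("return a")    # a is undefined now, so Lua returns nil
#> NULL
```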
To reset a non-default Lua state L returned by lua_open(), just do L = lua_open() again. The memory previously used by L will be cleaned up at the next garbage collection.
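In code, this might look like the following sketch, where the state name L1 and the global a are illustrative:

```r
library(luajr)

L1 = lua_open()
lua("a = 4", L = L1)       # define a global in this state
L1 = lua_open()            # "reset" by replacing it with a new state
lua("return a", L = L1)    # the new state has no global a
#> NULL
```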
For notes on the luajr Lua module – which contains functions and types for interacting with R from Lua code – see vignette("luajr-module").