In a recent blog post I used something like this (for use in a call to optim()):
obj_fun_expr <- expression((x^4 + 4 * x^2 * y^2 - 12 * x^2 * a + 144 * x^2 -
    48 * x * y * a + 144 * x * y - 4320 * x + 36 * y^2 - 2160 *
    y + 180 * a^2 + 32400)/16)
f_expr <- function(par) {
  x <- par[1]
  y <- par[2]
  a <- par[3]
  val <- eval(obj_fun_expr, list(x = x, y = y, a = a))
  return(val)
}
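As a rough illustration of how a function of this shape might be handed to optim(), here is a minimal sketch. The objective is a hypothetical toy quadratic (toy_expr, f_toy), not the expression above, and the starting values and the BFGS method are assumptions chosen only so that the example has an obvious minimum.

```r
# Hypothetical toy objective (not the one from the post); minimum at x = 1, y = 2, a = 3.
toy_expr <- expression((x - 1)^2 + (y - 2)^2 + (a - 3)^2)

f_toy <- function(par) {
  # Evaluate the stored expression with the parameters bound by name.
  eval(toy_expr, list(x = par[1], y = par[2], a = par[3]))
}

# Starting values and method are assumptions made for illustration.
res <- optim(par = c(0, 0, 0), fn = f_toy, method = "BFGS")
res$par
```

optim() calls the function many times during the search, which is why the per-call cost of eval() is worth a look.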
I have been wondering what the cost of eval(expr, ...) is compared to writing the expression directly in the function body, like this:
f_lang <- function(par) {
  x <- par[1]
  y <- par[2]
  a <- par[3]
  val <- (x^4 + 4 * x^2 * y^2 - 12 * x^2 * a + 144 * x^2 -
    48 * x * y * a + 144 * x * y - 4320 * x + 36 * y^2 - 2160 *
    y + 180 * a^2 + 32400)/16
  return(val)
}
Another option is to fill out the function body using the expression:
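f_lang_expr <- function(par) {
}
# Build the body from text; paste0() coerces the expression to its character form
f_lines <- parse(text = c(
  "x <- par[1]",
  "y <- par[2]",
  "a <- par[3]",
  paste0("val <- ", obj_fun_expr),
  "return(val)"))
# Wrap the parsed lines in a `{` call and install it as the function body
body(f_lang_expr) <- as.call(c(as.name("{"), f_lines))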
f_lang_expr
## function (par)
## {
##     x <- par[1]
##     y <- par[2]
##     a <- par[3]
##     val <- (x^4 + 4 * x^2 * y^2 - 12 * x^2 * a + 144 * x^2 -
##         48 * x * y * a + 144 * x * y - 4320 * x + 36 * y^2 -
##         2160 * y + 180 * a^2 + 32400)/16
##     return(val)
## }
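The same kind of body can also be assembled without going through text. The following is a sketch of one alternative, assuming bquote() is used to splice the call stored in an expression into a braced body; the toy expression toy_expr and the name f_spliced are hypothetical and kept small so the result can be checked by hand.

```r
# Hypothetical toy expression, short enough to verify manually.
toy_expr <- expression((x^2 + y * a)/2)

f_spliced <- function(par) {
}
# toy_expr[[1]] is the call held inside the expression vector; .() splices it in.
body(f_spliced) <- bquote({
  x <- par[1]
  y <- par[2]
  a <- par[3]
  val <- .(toy_expr[[1]])
  return(val)
})

f_spliced(c(2, 3, 4))  # (2^2 + 3 * 4)/2 = 8
```

This avoids the round trip through character strings and parse(), at the price of being a little less obvious to read.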
Typically you would evaluate such a function many times (e.g. 1,000), for instance when plotting or optimising it.
So what is the difference between them?
First we ensure that they give the same answers:
xs <- seq(1, 100, length.out = 1000)
res_f_expr <- lapply(xs, function(x) f_expr(c(x, 1, 1)))
res_f_lang <- lapply(xs, function(x) f_lang(c(x, 1, 1)))
res_f_lang_expr <- lapply(xs, function(x) f_lang_expr(c(x, 1, 1)))
all.equal(res_f_expr, res_f_lang)
## [1] TRUE
all.equal(res_f_expr, res_f_lang_expr)
## [1] TRUE
Now, we can calculate the difference in run times:
library(microbenchmark)
m <- microbenchmark(
  f_expr = lapply(xs, function(x) f_expr(c(x, 1, 1))),
  f_lang = lapply(xs, function(x) f_lang(c(x, 1, 1))),
  f_lang_expr = lapply(xs, function(x) f_lang_expr(c(x, 1, 1))),
  times = 10
)
print(m, unit = "s") # seconds
## Unit: seconds
##         expr         min          lq        mean      median          uq         max neval cld
##       f_expr 0.001826758 0.001929865 0.002141647 0.001966037 0.002257127 0.003082509    10   a
##       f_lang 0.001795386 0.001847062 0.002253064 0.001899274 0.002141826 0.004323419    10   a
##  f_lang_expr 0.001799808 0.001874676 0.002043506 0.001894780 0.001937986 0.003174221    10   a
On my computer there is not really any difference: the median time for 1,000 evaluations is roughly 1.9 to 2.0 milliseconds for all three versions. Normally, I would expect a small difference, but as seen it is not something to worry about (at least not to begin with). Remember: “[…]premature optimization is the root of all evil[…]”.
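If microbenchmark is not available, the per-call cost can be estimated roughly with base R alone. This is only a sketch under assumptions: the toy expression, the function names g_eval and g_direct, and the repetition count n are all hypothetical, and system.time() is much coarser than microbenchmark.

```r
# Hypothetical toy expression; the post's objective could be used the same way.
toy_expr <- expression((x^2 + y * a)/2)

# Variant 1: evaluate the stored expression on every call.
g_eval <- function(par) eval(toy_expr, list(x = par[1], y = par[2], a = par[3]))

# Variant 2: the same arithmetic written directly in the body.
g_direct <- function(par) {
  x <- par[1]
  y <- par[2]
  a <- par[3]
  (x^2 + y * a)/2
}

n <- 1e5  # number of repetitions, an arbitrary choice
p <- c(2, 3, 4)
t_eval <- system.time(for (i in seq_len(n)) g_eval(p))[["elapsed"]]
t_direct <- system.time(for (i in seq_len(n)) g_direct(p))[["elapsed"]]

# Approximate cost per call in microseconds; values vary by machine and R version.
c(eval = t_eval, direct = t_direct) / n * 1e6
```

Calling the functions directly in a loop leaves out the anonymous wrapper used with lapply() above, so whatever difference eval() makes is less likely to be masked by that overhead.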