R script: random.R
It is possible to define normal random variables in the EQUATION
block instead of the DEFINITION
block of the Mlxtran script, using the syntax
x ~ normal(mu, sigma)
where mu is the mean and sigma the standard deviation of the normal random variable x.
Then x can be used in the EQUATION
block as any other variable.
Remark: For simulation, we can either draw random variables in the block EQUATION
, or define its probability distribution in the block DEFINITION
. This is not the case for other tasks, such estimation, where an explicit description of the probability distributions is required by the estimation algotihms. Then, this definition should be provided in the block DEFINITION
.
Consider the following model \[ y_j = (a + b\, t_j)e^{e_j} \] where \(e_j \iid {\cal N}(0, s^2)\).
When this model is used for simulation only, it is possible to define the sequence of residual errors \((e_j)\) directly in the block EQUATION
:
myModel1 = inlineModel("
[LONGITUDINAL]
input = {a, b, s}
EQUATION:
f = a + b*t
e ~ normal(0,s)
y = f*exp(e)
")
f <- list(name='f',time=seq(0, 100, by=1))
y <- list(name='y',time=seq(5, 100, by=10))
e <- list(name='e',time=seq(5, 100, by=10))
p <- c(a=10, b=0.5, s=0.2)
s <- list(seed=12345)
res <- simulx(model = myModel1,
parameter = p,
settings = s,
output = list(f, y, e))
We can then plot both sequences \((y_j)\) and \((e_j)\)
library(gridExtra)
pl1 <- ggplot() +
geom_point(aes(x=time, y=y), data=res$y, color="red", size=3) +
geom_line(aes(x=time, y=f), data=res$f, size=1)
pl2 <- ggplot(aes(x=time, y=e), data=res$e) + geom_point(color="red", size=3) +
geom_hline(yintercept = 0)
grid.arrange(pl1, pl2, nrow=1)
print(head(res$y))
## time y
## 1 5 12.70690
## 2 15 14.72222
## 3 25 16.84884
## 4 35 29.41867
## 5 45 28.16719
## 6 55 47.63037
Remark: In this example, it would be equivalent to define the probability distribution of \((y_j)\) in the block DEFINITION
:
myModel2 = inlineModel("
[LONGITUDINAL]
input = {a, b, s}
EQUATION:
f = a + b*t
DEFINITION:
y = {distribution=lognormal, prediction=f, sd=s}
")
res <- simulx(model = myModel2,
parameter = p,
settings = s,
output = list(f, y))
print(head(res$y))
## time y
## 1 5 12.70690
## 2 15 14.72222
## 3 25 16.84884
## 4 35 29.41867
## 5 45 28.16719
## 6 55 47.63037
The results are the same because the same seed is used with both models.
Random normal variables can also be defined in the bloc EQUATION
of Section [INDIVIDUAL]
myModel3 = inlineModel("
[LONGITUDINAL]
input = {a, b, s}
EQUATION:
f = a + b*t
e ~ normal(0,s)
y = f*exp(e)
[INDIVIDUAL]
input = {a_pop, omega_a, b_pop, omega_b}
EQUATION:
a ~ normal(a_pop,omega_a)
b ~ normal(b_pop,omega_b)
")
i <- list(name=c('a', 'b'))
res <- simulx(model = myModel3,
parameter = c(a_pop=10, omega_a=1, b_pop=0.5, omega_b=0.1, s=0.2),
settings = list(seed=12345),
output = list(y, i))
print(res$parameter)
## a b
## 1 10.08208 0.4135785
print(head(res$y))
## time y
## 1 5 14.40448
## 2 15 12.69401
## 3 25 24.52023
## 4 35 21.36594
## 5 45 30.87609
## 6 55 38.49637
This is equivalent to defining the probability distributions in the 2 blocs DEFINITION
:
myModel4 = inlineModel("
[LONGITUDINAL]
input = {a, b, s}
EQUATION:
f = a + b*t
DEFINITION:
y = {distribution=lognormal, prediction=f, sd=s}
[INDIVIDUAL]
input = {a_pop, omega_a, b_pop, omega_b}
DEFINITION:
a = {distribution=normal, mean=a_pop, sd=omega_a}
b = {distribution=normal, mean=b_pop, sd=omega_b}
")
res <- simulx(model = myModel4,
parameter = c(a_pop=10, omega_a=1, b_pop=0.5, omega_b=0.1, s=0.2),
settings = list(seed=12345),
output = list(y, i))
print(res$parameter)
## a b
## 1 10.08208 0.4135785
print(head(res$y))
## time y
## 1 5 14.40448
## 2 15 12.69401
## 3 25 24.52023
## 4 35 21.36594
## 5 45 30.87609
## 6 55 38.49637