R scripts: groupII.R

Mlxtran codes: model/groupII1.txt ; model/groupII2.txt


$$ \newcommand{\esp}[1]{\mathbb{E}\left(#1\right)} \newcommand{\var}[1]{\mbox{Var}\left(#1\right)} \newcommand{\deriv}[1]{\dot{#1}(t)} \newcommand{\prob}[1]{ \mathbb{P}\!(#1)} \newcommand{\eqdef}{\mathop{=}\limits^{\mathrm{def}}} \newcommand{\by}{\boldsymbol{y}} \newcommand{\bc}{\boldsymbol{c}} \newcommand{\bpsi}{\boldsymbol{\psi}} \def\pmacro{\texttt{p}} \def\like{{\cal L}} \def\llike{{\cal LL}} \def\logit{{\rm logit}} \def\probit{{\rm probit}} \def\one{{\rm 1\!I}} \def\iid{\mathop{\sim}_{\rm i.i.d.}} \def\simh0{\mathop{\sim}_{H_0}} \def\df{\texttt{df}} \def\res{e} \def\xomega{x} \newcommand{\argmin}[1]{{\rm arg}\min_{#1}} \newcommand{\argmax}[1]{{\rm arg}\max_{#1}} \newcommand{\Rset}{\mbox{$\mathbb{R}$}} \def\param{\theta} \def\setparam{\Theta} \def\xnew{x_{\rm new}} \def\fnew{f_{\rm new}} \def\ynew{y_{\rm new}} \def\nnew{n_{\rm new}} \def\enew{e_{\rm new}} \def\Xnew{X_{\rm new}} \def\hfnew{\widehat{\fnew}} \def\degree{m} \def\nbeta{d} \newcommand{\limite}[1]{\mathop{\longrightarrow}\limits_{#1}} \def\ka{k{\scriptstyle a}} \def\ska{k{\scriptscriptstyle a}} \def\kel{k{\scriptstyle e}} \def\skel{k{\scriptscriptstyle e}} \def\cl{C{\small l}} \def\Tlag{T\hspace{-0.1em}{\scriptstyle lag}} \def\sTlag{T\hspace{-0.07em}{\scriptscriptstyle lag}} \def\Tk{T\hspace{-0.1em}{\scriptstyle k0}} \def\sTk{T\hspace{-0.07em}{\scriptscriptstyle k0}} \def\thalf{t{\scriptstyle 1/2}} \newcommand{\Dphi}[1]{\partial_\pphi #1} \def\asigma{a} \def\pphi{\psi} \newcommand{\stheta}{{\theta^\star}} \newcommand{\htheta}{{\widehat{\theta}}}$$

1 Introduction

It is possible to define groups for several purposes.

  1. We can define several groups (or arms) with different values of the parameters, different values of the regression variables, different treatments and/or different outputs (see the examples of the previous article)

  2. We can create groups that consist of several replicates of the longitudinal data, several individuals sharing or not the same covariates, and/or several populations.

  3. We can combine these features and create groups of different sizes, with different levels of randomization and different parameters, treatments, outputs, … (see the example below).

Definition of the size of a group is unambiguous when there is a unique section [LONGITUDINAL] in the model: size=N means that we want to generate \(N\) replicates of the longitudinal data defined in section [LONGITUDINAL].

Problem arise when there are several sections. Consider for example a model with sections [INDIVIDUAL] and [LONGITUDINAL]:

[LONGITUDINAL]
.
.
[INDIVIDUAL]
.
.

Then, size=N turns out to be ambiguous: does it mean that \(N\) replicates should be generated using a single set of simulated individual parameters, or does it mean that \(N\) vectors of individual parameters and one replicate per individual should be generated?

To avoid any possible confusion, we propose to explicity define the levels of randomization and provide the number of simulations per level (i.e. per section of the Mlxtran model).

For instance,

group = list(size=c(1,5), level=c('individual','longitudinal'))

means that

group = list(size=c(5,1), level=c('individual','longitudinal'))

means that

group = list(size=c(5,3), level=c('individual','longitudinal'))

means that


If you are really lazy, some defaults are available. Consider for example a model with sections [COVARIATE], [INDIVIDUAL] and [LONGITUDINAL] (in this order). Then,

  1. the default size for each level is 1. Then,
group = list(size=5, level='longitudinal')

is equivalent to

group = list(size=c(1,1,5), level=c('covariate','individual','longitudinal'))

and

group = list(size=5, level='covariate')

is equivalent to

group = list(size=c(5,1,1), level=c('covariate','individual','longitudinal'))
  1. Levels defined in the Mlxtran code are used if they are not provided as an element of group. Then,
group = list(size=c(3,1,2))

is equivalent to

group = list(size=c(3,1,2), level=c('covariate','individual','longitudinal'))

In such case, the length of vector size should be exactly the number of sections of the Mlxtran code.


2 Examples

2.1 Example 1

Let us start start with the model for longitudinal data implemented in groupII1.txt

[LONGITUDINAL]
input = {V, k, a}
EQUATION:
C = pkmodel(V,k)
DEFINITION:
y = {distribution=lognormal, prediction=C, sd=a}

We will use the input argument group of simulix for simulating 5 replicates of the same model.

adm <- list(time=seq(0,to=66,by=12), amount=100)
C <- list(name='C', time=seq(0, 100, by=1))
y <- list(name='y', time=seq(0, 100, by=12))
p <- c(V=10, k=0.2, a=0.1)
g <- list(size=5, level='longitudinal')

res1 <- simulx(model     = "model/groupII1.txt",
               treatment = adm,
               parameter = p, 
               output    = list(C,y), 
               group     = g,
               settings  = list(seed = 12345))

Here, res1 is a list of 2 data frames, C and y

names(res1)
## [1] "C"         "y"         "group"     "treatment"
names(res1$C)
## [1] "id"   "time" "C"
names(res1$y)
## [1] "id"   "time" "y"

Let us plot the predicted concentration \(C\) (wich is the same for the 5 replicates), and the 5 replicates of simulated data \((y_j)\):

print(ggplot() + geom_line(data=res1$C, aes(x=time, y=C), colour="grey", size=0.75) +
        geom_point(data=res1$y, aes(x=time, y=y, colour=id), size=3))

Remark: since there is only one group in this example, it would be equivalent to define the treatment, the parameter values and the outputs as elements of group, instead of input arguments of simulx:

g <- list(treatment=adm, parameter=p, output=list(C,y), size=5, level='longitudinal')

res3 <- simulx(model     = "model/groupII1.txt",
               group     = g,
               settings  = list(seed = 12345))

print(ggplot() + geom_line(data=res3$C, aes(x=time, y=C), colour="grey", size=0.75) +
        geom_point(data=res3$y, aes(x=time, y=y, colour=id), size=3))

With the same model, we can now define three groups with different sizes and different treatments. The three groups share the same parameter values and the same outputs.

adm1 <- list(time=seq(0,to=66,by=6), amount=50)
adm2 <- list(time=seq(0,to=66,by=12), amount=100)
adm3 <- list(time=seq(0,to=66,by=18), amount=150)
g1 <- list(treatment=adm1, size=2, level='longitudinal')
g2 <- list(treatment=adm2, size=3, level='longitudinal')
g3 <- list(treatment=adm3, size=4, level='longitudinal')

C <- list(name='C', time=seq(0, 100, by=1))
y <- list(name='y', time=seq(0, 100, by=12))
p <- c(V=10, k=0.2, a=0.1)


res4 <- simulx(model     = "model/groupII1.txt", 
               parameter = p, 
               output    = list(C,y), 
               group     = list(g1,g2,g3))

print(ggplot() + geom_line(data=res4$C, aes(x=time, y=C), colour="grey", size=0.75) +
        geom_point(data=res4$y, aes(x=time, y=y, colour=id), size=3)+ 
        facet_grid(. ~ group))


2.2 Example 2

Model groupII2.txt now includes a section [INDIVIDUAL]

[LONGITUDINAL]
input = {V, k, a}
EQUATION:
C = pkmodel(V,k)
DEFINITION:
y = {distribution=lognormal, prediction=C, sd=a}

[INDIVIDUAL]
input = {V_pop, omega_V, w}
EQUATION:
Vpred=V_pop*(w/70)
DEFINITION:
V = {distribution=lognormal, prediction=Vpred, sd=omega_V}

We can then use this model for simulating six individuals with different individual parameter values by setting level=‘individual’.

adm <- list(time=seq(0,66,by=12), amount=100)
y <- list(name="y", time=seq(18, 80, by=6))
C <- list(name="C", time=seq(0,100, by=0.5))
V <- list(name="V")
p <- c(V_pop=10, omega_V=0.3, w=50, k=0.2, a=0.2)

g <- list(size  = c(6,1), 
          level = c('individual','longitudinal'))

res5 <- simulx(model    = "model/groupII2.txt", 
              output    = list(C,y,V),
              parameter = p,
              treatment = adm,
              group     = g,
              settings  = list(seed=123456))

print(res5$parameter)
##   id         V
## 1  1 15.002045
## 2  2  8.384035
## 3  3  6.063110
## 4  4  9.524546
## 5  5  5.797582
## 6  6  9.945060
print(ggplot()  +
  geom_line(data=res5$C, aes(x=time, y=C), colour="black", size=0.5) +
  geom_point(data=res5$y, aes(x=time, y=y), colour="red", size=3)+ 
    facet_wrap( ~ id))

We can also use group for defining two groups with different treatments, different parameter values, and different sizes.

adm1 <- list(time=seq(0,66,by=6),  amount=50)
adm2 <- list(time=seq(0,66,by=12), amount=100)
y <- list(name="y", time=seq(30, 90, by=6))
C <- list(name="C", time=seq(0,100, by=0.5))
V <- list(name="V")
p1 <- c(V_pop=10, omega_V=0.3, w=50, k=0.2, a=0.2)
p2 <- c(V_pop=20, omega_V=0.3, w=75, k=0.1, a=0.2)

g1 <- list(treatment = adm1, 
           parameter = p1, 
           size      = c(3,1), 
           level     = c('individual','longitudinal'))
g2 <- list(treatment = adm2, 
           parameter = p2, 
           size      = c(2,1), 
           level     = c('individual','longitudinal'))

res6 <- simulx(model    = "model/groupII2.txt", 
               output   = list(C,y,V),
               group    = list(g1,g2),
               settings = list(seed=123123))

print(res6$parameter)
##   id group         V
## 1  1     1  7.139040
## 2  2     1 15.198435
## 3  3     1  7.430626
## 4  4     2 17.141255
## 5  5     2 32.046976
print(ggplot()  +
        geom_line( data=res6$C, aes(x=time, y=C, colour=id), size=0.5) +
        geom_point(data=res6$y, aes(x=time, y=y, colour=id), size=3) + 
        facet_grid(. ~ group))