2
votes

I am trying to run a mixed-effects model in julia (R is too slow for my data), but I keep getting this error.

I have installed DataArrays, DataFrames , MixedModels, and RDatasets packages and am following the tutorial here --> http://dmbates.github.io/MixedModels.jl/latest/man/fitting/#Fitting-linear-mixed-effects-models-1

These are my steps:

using DataArrays, DataFrames , MixedModels, RDatasets 

I get these warnings

WARNING: Method definition ==(Base.Nullable{S}, Base.Nullable{T}) in module Base at nullable.jl:238 overwritten in module NullableArrays at /home/home/.julia/v0.6/NullableArrays/src/operators.jl:128. WARNING: Method definition model_response(DataFrames.ModelFrame) in module DataFrames at /home/home/.julia/v0.6/DataFrames/src/statsmodels/formula.jl:352 overwritten in module MixedModels at /home/home/.julia/v0.6/MixedModels/src/pls.jl:65. WARNING: Method definition ==(Base.Nullable{S}, Base.Nullable{T}) in module Base at nullable.jl:238 overwritten in module NullableArrays at /home/home/.julia/v0.6/NullableArrays/src/operators.jl:128. WARNING: Method definition ==(Base.Nullable{S}, Base.Nullable{T}) in module Base at nullable.jl:238 overwritten in module NullableArrays at /home/home/.julia/v0.6/NullableArrays/src/operators.jl:128. WARNING: Method definition model_response(DataFrames.ModelFrame) in module DataFrames at /home/home/.julia/v0.6/DataFrames/src/statsmodels/formula.jl:352 overwritten in module MixedModels at /home/home/.julia/v0.6/MixedModels/src/pls.jl:65. WARNING: Method definition model_response(DataFrames.ModelFrame) in module DataFrames at /home/home/.julia/v0.6/DataFrames/src/statsmodels/formula.jl:352 overwritten in module MixedModels at /home/home/.julia/v0.6/MixedModels/src/pls.jl:65.

I get an R dataset from lme4 package (used in the tutorial)

inst = dataset("lme4", "InstEval")

julia> head(inst)
6×7 DataFrames.DataFrame
│ Row │ S   │ D      │ Studage │ Lectage │ Service │ Dept │ Y │
├─────┼─────┼────────┼─────────┼─────────┼─────────┼──────┼───┤
│ 1   │ "1" │ "1002" │ "2"     │ "2"     │ "0"     │ "2"  │ 5 │
│ 2   │ "1" │ "1050" │ "2"     │ "1"     │ "1"     │ "6"  │ 2 │
│ 3   │ "1" │ "1582" │ "2"     │ "2"     │ "0"     │ "2"  │ 5 │
│ 4   │ "1" │ "2050" │ "2"     │ "2"     │ "1"     │ "3"  │ 3 │
│ 5   │ "2" │ "115"  │ "2"     │ "1"     │ "0"     │ "5"  │ 2 │
│ 6   │ "2" │ "756"  │ "2"     │ "1"     │ "0"     │ "5"  │ 4 │

I run the model as shown in the tutorial

m2 = fit!(lmm(y ~ 1 + dept*service + (1|s) + (1|d), inst))

and get

ERROR: UndefVarError: y not defined
  Stacktrace:
  [1] macro expansion at ./REPL.jl:97 [inlined]
  [2] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73

The same thing happens when I try it with my own data loaded using "readtable" from DataFrames package

I am running julia 0.6.0 and all packages are freshly installed. My system is arch linux 4.11.7-1 with all the latest packages. Julia installs without a problem, but some packages give warnings (see above).

1
Without me knowing much, nor trying anything... I notice the difference in formula and Dataframe column header capitalization. What happens if you use this instead? m2 = fit!(lmm(Y ~ 1 + Dept*Service + (1|S) + (1|D), inst))rickhg12hs
Same thing --> ERROR: UndefVarError: Y not defined I just installed julia on friend's mac via brew, and it gives the same error.mike

1 Answers

2
votes

have a go with the @formula macro:

julia> fit!(lmm(@formula(Y ~ (1 | Dept)), inst), true)
f_1: 250160.38873 [1.0]
f_2: 250175.99074 [1.75]
f_3: 250123.06531 [0.25]
f_4: 250602.3424 [0.0]
f_5: 250137.66303 [0.4375]
f_6: 250129.76244 [0.325]
f_7: 250125.94066 [0.280268]
f_8: 250121.15016 [0.23125]
f_9: 250119.12389 [0.2125]
f_10: 250114.7257 [0.175]
f_11: 250105.61264 [0.1]
f_12: 250602.3424 [0.0]
f_13: 250107.52714 [0.118027]
f_14: 250106.36924 [0.107778]
f_15: 250105.04638 [0.0925]
f_16: 250104.72722 [0.085]
f_17: 250104.93086 [0.0749222]
f_18: 250104.70046 [0.0831588]
f_19: 250104.70849 [0.0839088]
f_20: 250104.69659 [0.0824088]
f_21: 250104.69632 [0.0822501]
f_22: 250104.69625 [0.0821409]
f_23: 250104.69625 [0.0820659]
f_24: 250104.69624 [0.0821118]
f_25: 250104.69624 [0.0821193]
f_26: 250104.69624 [0.082111]
f_27: 250104.69624 [0.0821118]
f_28: 250104.69624 [0.0821117]
f_29: 250104.69624 [0.0821118]
f_30: 250104.69624 [0.0821118]
Linear mixed model fit by maximum likelihood
 Formula: Y ~ 1 | Dept
     logLik        -2 logLik          AIC             BIC       
 -1.25052348×10⁵  2.50104696×10⁵  2.50110696×10⁵  2.50138308×10⁵

Variance components:
              Column     Variance   Std.Dev.  
 Dept     (Intercept)  0.011897242 0.10907448
 Residual              1.764556375 1.32836605
 Number of obs: 73421; levels of grouping factors: 14

  Fixed-effects parameters:
             Estimate Std.Error z value P(>|z|)
(Intercept)   3.21373  0.029632 108.455  <1e-99

The warnings are the usual "Julia (and the package ecosystem) is still in flux" messages. But I wonder whether the docs always keep pace with the code.