3
votes

I am reading "Scala for the Impatient" and in 8.10 there is an example:

class Animal {
  val range: Int = 10
  val env: Array[Int] = new Array[Int](range)

}

class Ant extends Animal {
  override val range: Int = 2
}

The author explains why the env ends up being an empty Array[Int]:

[..]

  3. The Animal constructor, in order to initialize the env array, calls the range() getter.

  4. That method is overridden to yield the (as yet uninitialized) range field of the Ant class.

  5. The range method returns 0. (That is the initial value of all integer fields when an object is allocated.)

  6. env is set to an array of length 0.

  7. The Ant constructor continues, setting its range field to 2. [..]

I don't understand the 4th step, and therefore the following steps are not clear either. The range() method is overridden to return 2, so why isn't the range already set to 2 in the 4th step?

Does it work this way? When a val is overridden, it becomes uninitialised, and all vals that depend on the overridden val are affected as well. Is that correct? If so, why is the behaviour different with def, as outlined here? Why is a def defined before the constructor call and a val after?

2
Probably relevant: stackoverflow.com/questions/49595018/… . This was trending only a few hours ago. - Andrey Tyukin

2 Answers

5
votes

After your comment, I decided to actually look at what exactly is written in the book. After reading the explanation, I decided that I cannot express it any clearer. So instead, I propose to take a look at the completely desugared code, for which there was no place in the short book.

Save this as a Scala script:

class Animal {
  val range: Int = 10
  val env: Array[Int] = new Array[Int](range)
}

class Ant extends Animal {
  override val range: Int = 2
}

val ant = new Ant
println(ant.range)
println(ant.env.size)

and then run it using the -print option:

> scala -nc -print yourScript.scala

you should see something like this:

class anon$1$Animal extends Object {
  private[this] val range: Int = _;
  <stable> <accessor> def range(): Int = anon$1$Animal.this.range;
  private[this] val env: Array[Int] = _;
  <stable> <accessor> def env(): Array[Int] = anon$1$Animal.this.env;
  <synthetic> <paramaccessor> <artifact> protected val $outer: <$anon: Object> = _;
  <synthetic> <stable> <artifact> def $outer(): <$anon: Object> = anon$1$Animal.this.$outer;
  def <init>($outer: <$anon: Object>): <$anon: Object> = {
    if ($outer.eq(null))
      throw null
    else
      anon$1$Animal.this.$outer = $outer;
    anon$1$Animal.super.<init>();
    anon$1$Animal.this.range = 10;
    anon$1$Animal.this.env = new Array[Int](anon$1$Animal.this.range());
    ()
  }
};
class anon$1$Ant extends <$anon: Object> {
  private[this] val range: Int = _;
  override <stable> <accessor> def range(): Int = anon$1$Ant.this.range;
  <synthetic> <stable> <artifact> def $outer(): <$anon: Object> = anon$1$Ant.this.$outer;
  def <init>($outer: <$anon: Object>): <$anon: anon$1$Animal> = {
    anon$1$Ant.super.<init>($outer);
    anon$1$Ant.this.range = 2;
    ()
  }
}

This is the desugared code as it is seen by the compiler in later stages of compilation. It's a bit hard to read, but what is important are these declarations:

  // in Animal:
  private[this] val range: Int = _;
  <stable> <accessor> def range(): Int = anon$1$Animal.this.range;

  // in Ant:
  private[this] val range: Int = _;
  override <stable> <accessor> def range(): Int = 
    anon$1$Ant.this.range;

and also the statement in the initializer of Animal:

  anon$1$Animal.this.env = new Array[Int](anon$1$Animal.this.range())

What you can see here is that there are actually two different variables range: one is Animal.this.range and the other is Ant.this.range. Moreover, there are completely separate defs which are also called range in the desugared code: these are the getters which are generated automatically for vals.

The first variable is indeed initialized in Animal and set to 10:

    anon$1$Animal.this.range = 10;

However, this does not matter, because the env is initialized using the getter range(), which is overridden to return Ant.this.range. The variable Ant.this.range is assigned the value 2 once, but after the initializer of Animal has completed. During the initialization of Animal, the variable Ant.this.range holds the default value 0, hence the counter-intuitive result.
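To make the "two separate variables" point observable at runtime, here is a minimal sketch that reads both backing fields through Java reflection, bypassing the overridden getter. The wrapper object TwoFieldsDemo and the helper rawRangeField are hypothetical names introduced for illustration. The sketch assumes that the compiler names each backing field "range", as the -print output above suggests; that is an implementation detail, not a guarantee.

```scala
object TwoFieldsDemo {
  class Animal {
    val range: Int = 10
    val env: Array[Int] = new Array[Int](range)
  }

  class Ant extends Animal {
    override val range: Int = 2
  }

  // Hypothetical helper: reads the private field named "range" that is
  // declared directly on `cls`, without going through the getter.
  // Assumption: the backing field carries the same name as the val.
  def rawRangeField(cls: Class[_], instance: AnyRef): Int = {
    val field = cls.getDeclaredField("range")
    field.setAccessible(true)
    field.getInt(instance)
  }

  def main(args: Array[String]): Unit = {
    val ant = new Ant
    println(rawRangeField(classOf[Animal], ant)) // field declared in Animal
    println(rawRangeField(classOf[Ant], ant))    // field declared in Ant
    println(ant.range)                           // getter, dispatches to Ant
    println(ant.env.length)                      // computed during Animal's init
  }
}
```

Under that naming assumption, this should print 10, 2, 2 and 0: the field in Animal really does hold 10, but nothing ever reads it, because every call to range goes through the getter overridden in Ant.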

If you simplify the desugared code a little bit, you obtain a compilable and readable example that behaves in the same way:

class Animal {
  private[this] var _Animal_range: Int = 0
  def range: Int = _Animal_range

  _Animal_range = 10
  val env: Array[Int] = new Array[Int](range)
}

class Ant extends Animal {
  private[this] var _Ant_range: Int = 0
  override def range: Int = _Ant_range

  _Ant_range = 2
}

val ant = new Ant
println(ant.range)
println(ant.env.size)

Here, the same happens:

  1. _Animal_range is allocated with default value 0
  2. _Ant_range is allocated with default value 0
  3. Animal base class begins initialization
  4. _Animal_range is initialized with value 10
  5. To initialize env, the getter range is invoked. It is overridden in the Ant class and returns _Ant_range, which is still 0
  6. env is set to an empty array
  7. Animal base class finishes initialization
  8. Ant begins initialization
  9. Only now does it set _Ant_range to 2.

This is why both code snippets print 2 and 0.
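The order of the steps above can also be observed directly by recording a message at each stage of construction. The following is only a sketch: the wrapper object InitOrderTrace and its log buffer are hypothetical additions, while the two classes are otherwise the ones from the question.

```scala
import scala.collection.mutable.ArrayBuffer

object InitOrderTrace {
  // Collects messages in the order in which the constructors emit them.
  val log: ArrayBuffer[String] = ArrayBuffer.empty[String]

  class Animal {
    val range: Int = 10
    // `range` here calls the getter, which Ant overrides.
    log += s"Animal init: range getter returns $range"
    val env: Array[Int] = new Array[Int](range)
    log += s"Animal init: env.length = ${env.length}"
  }

  class Ant extends Animal {
    override val range: Int = 2
    log += s"Ant init (after assignment): range = $range"
  }

  def main(args: Array[String]): Unit = {
    val ant = new Ant
    log.foreach(println)
    println(s"final: range = ${ant.range}, env.length = ${ant.env.length}")
  }
}
```

Running main prints the Animal messages first, with range reported as 0 and env.length as 0, then the Ant message with range = 2, and finally range = 2, env.length = 0.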

Hope that helps.

1
votes

A def is evaluated each time it is called, whereas a val is stored in a field. In this case the val's field still holds zero because it hasn't been initialised yet, whereas a def is evaluated on the spot and therefore can never return an uninitialised value.
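As a sketch of the def case: if range is declared as a def in both classes, there is no backing field that could be read too early. Note that this changes the original example, because a def cannot override a val, so Animal has to declare range as a def as well. The wrapper object DefRangeDemo is a hypothetical name.

```scala
object DefRangeDemo {
  class Animal {
    // A def has no backing field: the body runs on every call.
    def range: Int = 10
    val env: Array[Int] = new Array[Int](range)
  }

  class Ant extends Animal {
    // Returns 2 even while the Animal constructor is still running.
    override def range: Int = 2
  }

  def main(args: Array[String]): Unit = {
    val ant = new Ant
    println(ant.range)      // 2
    println(ant.env.length) // 2
  }
}
```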

It’s only because the val here is overridden that it isn’t initialised in time, because classes must be initialised starting at the top of the inheritance hierarchy and working down. If env were a def you’d also be fine, because it wouldn’t be evaluated until called, by which time all the vals would be initialised. Of course, that way you’d get a new array every time you called env, so you might prefer a lazy val, which is initialised the first time it’s accessed and then stays the same and is kept in memory.
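A minimal sketch of the lazy val variant, keeping range as an overridden val and changing only env. The wrapper object LazyEnvDemo is a hypothetical name.

```scala
object LazyEnvDemo {
  class Animal {
    val range: Int = 10
    // Not computed during construction; evaluated once on first access.
    lazy val env: Array[Int] = new Array[Int](range)
  }

  class Ant extends Animal {
    override val range: Int = 2
  }

  def main(args: Array[String]): Unit = {
    val ant = new Ant
    println(ant.env.length)     // 2: both constructors finished before first access
    println(ant.env eq ant.env) // true: the same array instance every time
  }
}
```

The trade-off is that env is only safe once it is first accessed after construction has completed. If something inside the Animal constructor touched env, it would still see range as 0.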