31
votes

Coming from a Java background, I'm used to the common practice of dealing with collections: obviously there would be exceptions but usually code would look like:

public class MyClass {
  private Set<String> mySet;

  public void init() {
    Set<String> s = new LinkedHashSet<String>();
    s.add("Hello");
    s.add("World"); 
    mySet = Collections.unmodifiableSet(s);
  }
}

I have to confess that I'm a bit befuddled by the plethora of options in Scala. There are:

  • scala.List (and Seq)
  • scala.collection.Set (and Map)
  • scala.collection.immutable.Set (and Map, Stack but not List)
  • scala.collection.mutable.Set (and Map, Buffer but not List)
  • scala.collection.jcl

So questions!

  1. Why are List and Seq defined in package scala and not scala.collection (even though implementations of Seq are in the collection sub-packages)?
  2. What is the standard mechanism for initializing a collection and then freezing it (which in Java is achieved by wrapping in an unmodifiable)?
  3. Why are some collection types (e.g. MultiMap) only defined as mutable? (There is no immutable MultiMap)?

I've read Daniel Spiewak's excellent series on Scala collections and am still puzzled by how one would actually use them in practice. The following seems slightly unwieldy due to the enforced full package declarations:

class MyScala {
  var mySet: scala.collection.Set[String] = null 

  def init(): Unit = {
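     val s = scala.collection.mutable.Set.empty[String]
     // += adds in place; on Scala 2.8 and later, s + "Hello" builds a new set and leaves s unchanged
     s += "Hello"
     s += "World"
     // a Set is not a Seq, so convert it before expanding it as varargs
     mySet = scala.collection.immutable.Set(s.toSeq : _ *)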
  }
}

Although arguably this is more correct than the Java version, as the immutable collection cannot change (whereas in the Java case the underlying collection could still be altered underneath the unmodifiable wrapper).
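
A minimal sketch of the wrapper behaviour described above, written in Scala against java.util. The object and method names (WrapperIsAView, sizeThroughWrapper, wrapperRejectsWrites) are made up for illustration:

```scala
import java.util.{Collections, LinkedHashSet}

object WrapperIsAView {
  // Returns the size seen through the wrapper after the backing set grows.
  def sizeThroughWrapper(): Int = {
    val backing = new LinkedHashSet[String]()
    backing.add("Hello")
    val wrapped: java.util.Set[String] = Collections.unmodifiableSet[String](backing)
    backing.add("World") // still allowed: the wrapper is only a read-only view
    wrapped.size()
  }

  // Returns true if a write attempted through the wrapper is rejected.
  def wrapperRejectsWrites(): Boolean = {
    val wrapped: java.util.Set[String] =
      Collections.unmodifiableSet[String](new LinkedHashSet[String]())
    try { wrapped.add("Hello"); false }
    catch { case _: UnsupportedOperationException => true }
  }

  def main(args: Array[String]): Unit = {
    println(sizeThroughWrapper())   // 2: the change to the backing set shows through
    println(wrapperRejectsWrites()) // true
  }
}
```

So the wrapper only protects against writes made through the wrapper itself; anyone still holding the backing set can change what readers see.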

4 Answers

26
votes

Why are List and Seq defined in package scala and not scala.collection (even though implementations of Seq are in the collection sub-packages)?

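Because they are deemed so generally useful that they are made available in every program without an import: everything in package scala is imported automatically, and the default Set and Map are exposed in the same way through aliases in scala.Predef.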
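
As a small illustration (the object name DefaultNames is made up), the short names can be used with no import at all, and they are interchangeable with the fully qualified immutable types:

```scala
object DefaultNames {
  // Each pair of declarations compiles only because the short name and the
  // fully qualified name refer to the same type.
  val xs: List[Int] = List(1, 2, 3)
  val sameList: scala.collection.immutable.List[Int] = xs

  val s: Set[String] = Set("Hello", "World")
  val sameSet: scala.collection.immutable.Set[String] = s

  val m: Map[String, Int] = Map("Hello" -> 5)
  val sameMap: scala.collection.immutable.Map[String, Int] = m

  def main(args: Array[String]): Unit = {
    println(sameList eq xs) // true: same object, just viewed under the longer name
  }
}
```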

What is the standard mechanism for initializing a collection and then freezing it (which in Java is achieved by wrapping in an unmodifiable)?

Java doesn't have a mechanism for freezing a collection. It only has an idiom for wrapping the (still modifiable) collection in a wrapper that throws an exception when someone tries to modify it through the wrapper. The proper idiom in Scala is to copy the mutable collection into an immutable one, for example by passing its elements as varargs with : _*
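
A minimal sketch of that copy idiom, assuming a reasonably recent Scala (2.8 or later, where toSeq and toSet are available on sets). The names FreezeByCopy, frozenViaVarargs and copyIsIndependent are hypothetical:

```scala
import scala.collection.mutable

object FreezeByCopy {
  // Fill a mutable set, then copy its elements into an immutable Set.
  def frozenViaVarargs(): Set[String] = {
    val builder = mutable.Set.empty[String]
    builder += "Hello"
    builder += "World"
    // A Set is not a Seq, so convert it before expanding it as varargs.
    Set(builder.toSeq: _*)
  }

  // The copy is a snapshot: later changes to the builder do not reach it.
  def copyIsIndependent(): Boolean = {
    val builder = mutable.Set("Hello")
    val frozen = builder.toSet // shorter alternative to the varargs form
    builder += "World"
    frozen == Set("Hello")
  }

  def main(args: Array[String]): Unit = {
    println(frozenViaVarargs())
    println(copyIsIndependent())
  }
}
```

Both forms copy the elements, so the result is independent of the mutable set it was built from.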

Why are some collection types (e.g. MultiMap) only defined as mutable? (There is no immutable MultiMap)?

The team/community just hasn't gotten there yet. The 2.7 branch saw a bunch of additions and 2.8 is expected to have a bunch more.
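
In the meantime, one possible workaround is to model an immutable multimap as a plain Map from keys to immutable Sets. This is only a sketch: the MultiMap alias and the add helper below are made up for illustration and are not part of the standard library.

```scala
object ImmutableMultiMap {
  // Hypothetical alias: a multimap is a Map whose values are Sets.
  type MultiMap[K, V] = Map[K, Set[V]]

  // Returns a new multimap with value added under key; the input is untouched.
  def add[K, V](m: MultiMap[K, V], key: K, value: V): MultiMap[K, V] =
    m.updated(key, m.getOrElse(key, Set.empty[V]) + value)

  def main(args: Array[String]): Unit = {
    val empty: MultiMap[String, Int] = Map.empty
    val filled = add(add(empty, "a", 1), "a", 2)
    println(filled("a").size) // 2
    println(empty.isEmpty)    // true: the original map was never modified
  }
}
```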

The following seems slightly unwieldy due to the enforced full package declarations:

Scala allows import aliases so it's always less verbose than Java in this regard (see for example java.util.Date and java.sql.Date - using both forces one to be fully qualified)

import scala.collection.{Set => ISet}
import scala.collection.mutable.{Set => MSet}

class MyScala {
  var mySet: ISet[String] = null 

  def init(): Unit = {
     val s = MSet.empty[String]
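     // += adds in place; on Scala 2.8 and later, s + "Hello" builds a new set and leaves s unchanged
     s += "Hello"
     s += "World"
     // a Set is not a Seq, so convert it before expanding it as varargs
     mySet = Set(s.toSeq : _ *)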
  }
}

Of course, you'd really just write init as def init() { mySet = Set("Hello", "World") } and save all the trouble, or better yet initialize the field directly in the class body (which is the constructor): var mySet: ISet[String] = Set("Hello", "World")
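
The java.util.Date / java.sql.Date clash mentioned earlier can be handled with the same renaming syntax. A short sketch, where the alias names UtilDate and SqlDate and the helper toSql are arbitrary choices:

```scala
import java.util.{Date => UtilDate}
import java.sql.{Date => SqlDate}

object DateAliases {
  // Both Date classes are usable side by side under short, distinct names.
  def toSql(d: UtilDate): SqlDate = new SqlDate(d.getTime)

  def main(args: Array[String]): Unit = {
    val now = new UtilDate()
    println(toSql(now).getTime == now.getTime) // true
  }
}
```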

7
votes

Mutable collections are useful occasionally (though I agree that you should always look at the immutable ones first). If using them, I tend to write

import scala.collection.mutable

at the top of the file, and (for example):

val cache = new mutable.HashMap[String, Int]

in my code. It means you only have to write “mutable.HashMap”, not “scala.collection.mutable.HashMap”. As the commenter above mentioned, you could rename the class in the import (e.g., “import scala.collection.mutable.{HashMap => MMap}”), but:

  1. I prefer not to mangle the names, so that it’s clearer what classes I’m using, and
  2. I use ‘mutable’ rarely enough that having “mutable.ClassName” in my source is not an undue burden.
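
To make the package-prefix style concrete, here is a small sketch of such a cache in use. The object LengthCache and its methods are invented for the example:

```scala
import scala.collection.mutable

object LengthCache {
  // The "mutable." prefix makes it obvious at the use site that this map changes.
  private val cache = new mutable.HashMap[String, Int]

  // Computes the length once per word and reuses the stored value afterwards.
  def length(word: String): Int = cache.getOrElseUpdate(word, word.length)

  def cachedEntries: Int = cache.size

  def main(args: Array[String]): Unit = {
    println(length("Hello")) // 5
    println(length("Hello")) // 5, served from the cache
    println(cachedEntries)   // 1
  }
}
```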

(Also, can I echo the ‘avoid nulls’ comment too. It makes code so much more robust and comprehensible. I find that I don’t even have to use Option as much as you’d expect either.)

4
votes

A couple of random thoughts:

  1. I never use null; I use Option, which lets me raise a decent error when the value is missing. This practice has gotten rid of a ton of NullPointerException opportunities, and forces people to write decent errors.
  2. Try to avoid looking into the "mutable" stuff unless you really need it.

So, my basic take on your scala example, where you have to initialize the set later, is

class MyScala {
  private var lateBoundSet: Option[Set[String]] = None
  // fails with a descriptive message instead of a NullPointerException
  def mySet: Set[String] = lateBoundSet.getOrElse(sys.error("You didn't call init!"))

  def init(): Unit = {
     lateBoundSet = Some(Set("Hello", "World"))
  }
}

I've been on a tear recently around the office. "null is evil!"
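
The same habit helps at the boundary with Java APIs that return null. A rough sketch, assuming a java.util.Properties lookup as the null-returning call; the object OptionInsteadOfNull and the "greeting" key are made up:

```scala
import java.util.Properties

object OptionInsteadOfNull {
  // Option(x) yields None when x is null, so the null never escapes this method.
  def lookup(props: Properties, key: String): Option[String] =
    Option(props.getProperty(key))

  // Callers have to say what happens when the value is missing.
  def greeting(props: Properties): String =
    lookup(props, "greeting").map(_.toUpperCase).getOrElse("no greeting configured")

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    println(greeting(props)) // no greeting configured
    props.setProperty("greeting", "hello")
    println(greeting(props)) // HELLO
  }
}
```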

3
votes

Note that there might be some inconsistencies in the Scala collections API in the current version; for Scala 2.8 (to be released later in 2009), the collections API is being overhauled to make it more consistent and more flexible.

See this article on the Scala website: http://www.scala-lang.org/node/2060

To add to Tristan Juricek's example with a lateBoundSet: Scala has a built-in mechanism for lazy initialization, using the "lazy" keyword:

class MyClass {
    lazy val mySet = Set("Hello", "World")
}

By doing this, mySet will be initialized on first use, instead of immediately when creating a new MyClass instance.
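
A quick way to observe this is to count how often the initializer runs. In the sketch below, LazyHolder and initCount are made-up names added purely to make the timing visible:

```scala
class LazyHolder {
  // Counts how many times the lazy initializer body has executed.
  var initCount = 0

  lazy val mySet: Set[String] = {
    initCount += 1
    Set("Hello", "World")
  }
}

object LazyDemo {
  def main(args: Array[String]): Unit = {
    val holder = new LazyHolder
    println(holder.initCount)  // 0: nothing has been initialized yet
    println(holder.mySet.size) // 2: the first access runs the initializer
    println(holder.mySet.size) // 2: the second access reuses the stored value
    println(holder.initCount)  // 1
  }
}
```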