Trying to get a deeper understanding of the Scala type system, I found this (old) presentation by Martin Odersky:

https://www.youtube.com/watch?v=ecekSCX3B4Q&t=3747s

Roughly at [1:00:00] in this video, Martin explains that parameterized types in Scala are actually only syntactic sugar, and that you can completely rewrite your code by replacing them with abstract types. As a side effect of this translation, we get a strict interpretation of "type variance". Wow. It sounded very nice, and the whole story matched some issues I had found in my code, so I immediately started experimenting. But guess what: this conversion does not work as expected. Or I am doing something wrong.

This is a very small piece of code I was using to isolate the problem:

import java.net.URL

trait MessagingClient[BrokerLocation] {
  def connect(broker: BrokerLocation): Unit
  def sendMessage(targetNodeAddress: Long, msg: Any): Unit
}

class KafkaMessaging extends MessagingClient[URL] {
  override def connect(broker: URL): Unit = ???
  override def sendMessage(targetNodeAddress: Long, msg: Any): Unit = ???
}

class ClusterNode[BrokerLocation](messagingClient: MessagingClient[BrokerLocation]) {
  def startNode(brokerLocation: BrokerLocation): Unit = {
    messagingClient.connect(brokerLocation)
  }
}

object Test {
  def main(args: Array[String]): Unit = {
    val messagingClient = new KafkaMessaging
    val clusterNode = new ClusterNode[URL](messagingClient)
    val brokerLocation = new URL("http://1.2.3.4:666")
    clusterNode.startNode(brokerLocation)
  }
}

The code above compiles without issues. Now, this was my first attempt to eliminate parameterized types:

import java.net.URL

trait MessagingClient {
  type BrokerLocation
  def connect(broker: BrokerLocation): Unit
  def sendMessage(targetNodeAddress: Long, msg: Any): Unit
}

class KafkaMessaging extends MessagingClient {
  override type BrokerLocation = URL
  override def connect(broker: URL): Unit = ???
  override def sendMessage(targetNodeAddress: Long, msg: Any): Unit = ???
}

class ClusterNode(val messagingClient: MessagingClient) {
  def startNode(brokerLocation: messagingClient.BrokerLocation): Unit = {
    messagingClient.connect(brokerLocation)
  }
}

object Test {
  def main(args: Array[String]): Unit = {
    val messagingClient = new KafkaMessaging
    val clusterNode = new ClusterNode(messagingClient)
    val brokerLocation = new URL("http://1.2.3.4:666")
    clusterNode.startNode(brokerLocation)
  }
}

This attempt does not work, however. The type checker seems to have all the information it needs to approve the typing, but it nevertheless complains about the line:

clusterNode.startNode(brokerLocation)

Having failed with this attempt, I decided to be even more rigorous in the conversion, i.e. to introduce an abstract type in every class that was previously parameterized. Surprisingly, this attempt also fails to compile:

import java.net.URL

trait MessagingClient {
  type BrokerLocation
  def connect(broker: BrokerLocation): Unit
  def sendMessage(targetNodeAddress: Long, msg: Any): Unit
}

class KafkaMessaging extends MessagingClient {
  override type BrokerLocation = URL
  override def connect(broker: URL): Unit = ???
  override def sendMessage(targetNodeAddress: Long, msg: Any): Unit = ???
}

class ClusterNode(val messagingClient: MessagingClient) {
  type BrokerLocation = messagingClient.BrokerLocation
  def startNode(brokerLocation: BrokerLocation): Unit = {
    messagingClient.connect(brokerLocation)
  }
}

object Test {
  def main(args: Array[String]): Unit = {
    val messagingClient: MessagingClient = new KafkaMessaging
    val clusterNode = new ClusterNode(messagingClient) {type BrokerLocation = URL}
    val brokerLocation = new URL("http://1.2.3.4:666")
    clusterNode.startNode(brokerLocation)
  }
}

Now, where is the mistake coming from? I was also trying to find the "root" explanation of the whole equivalence between parameterized types and abstract types, but I could hardly find it in the Scala language specification. Maybe some of you have already managed to investigate this problem.

EDIT (added as a follow-up to Andrey's long investigation; a rather long comment that does not change the original question but throws some extra light on the problem)

Thanks again, Andrey, for your extensive research on the subject. I also spent some time analyzing what I know and following your hints.

Technical issues first: I pasted together the latest version of your solution ('Cluster3'), and unfortunately it does NOT compile. It took some time to improve your idea a little to fix the problem. Look:

import java.net.URL

abstract class MessagingClient {
  type BrokerLocation
  def connect(b: BrokerLocation): Unit
}

class KafkaMessaging extends MessagingClient {
  override type BrokerLocation = URL
  override def connect(broker: URL): Unit = ???
}

abstract class ClusterNode3 {
  val msgClient: MessagingClient
  type BrokerLocation = msgClient.BrokerLocation
  def connect(i: BrokerLocation): Unit =
    msgClient.connect(i)
}

object AndreySolution {
  def main(args: Array[String]): Unit = {
    val messagingClient = new KafkaMessaging

    //your original solution - unfortunately leads to compilation error in last line
    //val clusterNode = wrapMessagingClientIntoClusterNode3_param(messagingClient)

    //naive attempt to solve the problem by adding type annotation - does not help, really
    //val clusterNode: ClusterNode3 {type BrokerLocation = URL} = wrapMessagingClientIntoClusterNode3_param(messagingClient)

    //... but this actually works
    //val clusterNode: ClusterNode3 {val msgClient: KafkaMessaging} = new ClusterNode3 {val msgClient = messagingClient}

    //...and this also works - using the 'smarter' wrapping approach
    val clusterNode = wrapMessagingClient_byWojciech(messagingClient)

    val brokerLocation = new URL("http://1.2.3.4:666")
    clusterNode.connect(brokerLocation)
  }

  def wrapMessagingClientIntoClusterNode3_param[I](p: MessagingClient { type BrokerLocation = I} ): ClusterNode3 { type BrokerLocation = I } =
    new ClusterNode3 {
      val msgClient = p
    }

  def wrapMessagingClient_byWojciech[T <: MessagingClient](p: T): ClusterNode3 {val msgClient: T} = new ClusterNode3 {val msgClient = p}

}

Now, some final thoughts. I recognize two separate (but connected) issues here:

  • Issue 1 (= my original question): Can parameterized types be understood as syntactic sugar for abstract types? I am still not sure about it, especially because our solution code apparently falls back to using parameterized types, and this usage feels crucial (which is rather funny, actually).
  • Issue 2: Combining path-dependent types and abstract types in the same source code can be trickier than one would expect. After spending several hours trying to find my way through this, I still feel like I am "walking on ice". So far I have not arrived at a clear recipe for which coding patterns are safe and which are not when I mix abstract types with paths.

Please have a look at this piece of code (pretty random example taken from my long marathon of experiments):

object Fancy {
  val fooWithInt = new Foo {type A = Int; val numberA = 1; type B = String; val numberB = "42"}
  val boxWithFooWithInt = new Box {type C = Int; val secret1 = fooWithInt; val secret2 = "bingo"}
  val surprise: Int = boxWithFooWithInt.secret1.numberA
  val secret: String = boxWithFooWithInt.secret2
}

trait Foo {
  type A
  type B
  val numberA: A
  val numberB: B
}

trait Box {
  type C
  type D = secret1.B
  val secret1: Foo {type A = C}
  val secret2: D
}

From a human point of view, the types in this code are correct. Now try to guess: will the Scala compiler be happy or not?

So, it turns out that IntelliJ screams that the types are wrong, but the Scala compiler confirms that everything is fine this time. To me it is still something of a lottery how far the compiler's reasoning can reach when solving the set of type equations in path-dependent type theory.

Most likely, to get a definitive answer to the problems I am facing, one would have to investigate the actual type theory that the current Scala compiler implements. I am now also super curious how Dotty will handle my examples (maybe I can find time to test them on the Dotty beta).

1 Answer

Notice that a declaration like

class Foo[A](arg: Bar[A])

ensures that the first type parameter of Foo and the first type parameter of Bar are the same. In your original code, this line

class ClusterNode[BrokerLocation](messagingClient: MessagingClient[BrokerLocation])

makes sure that ClusterNode and the injected messagingClient agree on the same type of BrokerLocation, and moreover, the type of the BrokerLocation is visible from the outside as part of the ClusterNode[BrokerLocation] type.

In your first attempt, ClusterNode does not even have an abstract type member, so the information about the type members of the messagingClient gets lost immediately.

In your second attempt, what you've written roughly corresponds to

class ClusterNode(val messagingClient: MessagingClient[_ <: Any]) {
  // once the type parameter of `messagingClient` is forgotten,
  // use `BrokerLocation` typedef to publish the absent 
  // information about `messagingClient`s type parameter.
  type BrokerLocation = Any
}

in the type-parameter language. That is, the connection between the type of messagingClient and the type of ClusterNode gets lost again.
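The loss of information can be reproduced without any ClusterNode at all. Here is a minimal sketch, assuming Scala 2.13; the WideningDemo object and the connectedTo field are hypothetical additions that only make the effect observable. As soon as a KafkaMessaging is viewed as a plain MessagingClient, its BrokerLocation becomes opaque to the compiler:

```scala
import java.net.URL

object WideningDemo {
  trait MessagingClient {
    type BrokerLocation
    def connect(broker: BrokerLocation): Unit
  }

  class KafkaMessaging extends MessagingClient {
    type BrokerLocation = URL
    // hypothetical bookkeeping, not part of the original code
    var connectedTo: List[URL] = Nil
    def connect(broker: URL): Unit = { connectedTo = broker :: connectedTo }
  }

  val url = new URL("http://1.2.3.4:666")

  // Static type KafkaMessaging: kafka.BrokerLocation is known to be URL.
  val kafka = new KafkaMessaging
  kafka.connect(url)

  // Static type MessagingClient: widened.BrokerLocation is abstract.
  val widened: MessagingClient = kafka
  // widened.connect(url) // rejected: a URL is not known to be a widened.BrokerLocation
}
```

Both of your attempts store the client in a val of static type MessagingClient, so they end up in the same situation as the widened value here.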

Now consider this code:

import java.net.URL

trait MessagingClient {
  type BrokerLocation
  def connect(broker: BrokerLocation): Unit
  def sendMessage(targetNodeAddress: Long, msg: Any): Unit
}

class KafkaMessaging extends MessagingClient {
  type BrokerLocation = URL
  override def connect(broker: URL): Unit = ???
  override def sendMessage(targetNodeAddress: Long, msg: Any): Unit = ???
}

abstract class ClusterNode { self =>
  type BrokerLocation // type is declared

  // (*) Coherence is enforced
  val messagingClient: MessagingClient { type BrokerLocation = self.BrokerLocation }

  def startNode(brokerLocation: BrokerLocation): Unit = {
    messagingClient.connect(brokerLocation)
  }
}

object Test {
  def main(args: Array[String]): Unit = {
    val msgCl = new KafkaMessaging
    // (**)
    val clusterNode = new ClusterNode {
      type BrokerLocation = URL
      val messagingClient = msgCl
    }
    val brokerLocation = new URL("http://1.2.3.4:666")
    clusterNode.startNode(brokerLocation)
  }
}

Look at the code around the line marked with (*). In the line of code right above it, we declare that ClusterNode has some abstract type member called BrokerLocation, and in the line below it, we enforce that the messagingClient is a MessagingClient with a compatible abstract member type. Once you do it like this, the abstract type member does not get lost, and the code in the lines after (**) compiles as expected, even though the instantiation is much more cumbersome.

EDIT: Apparently it's still not exactly clear why the "second attempt" loses the information about the BrokerLocation in MessagingClient.

The overall picture is this: there is some type BrokerLocation, and this type should be consistent in three different places:

  • As seen from the outside of the ClusterNode
  • As type member of ClusterNode
  • As type member of the wrapped MessagingClient

The ClusterNode should somehow make sure that the BrokerLocation is known outside (step1: outside-to-cn), so that we can pass an instance of BrokerLocation from main.

Furthermore, one should somehow establish that the BrokerLocation in ClusterNode is the same as in MessagingClient (step2: cn-to-mc). In the question, this chain was broken in two different ways:

  • step1 broken, step2 broken
  • step1 broken, step2 ok

In the first part of the posting, I have simply proposed a variant in which the entire chain worked, but I also replaced the second step:

  • step1 ok, step2 again ok (but implemented differently)

In the comment below, you assumed that something was wrong with step2 in your second proposal. This is not the case: your second attempt is not the combination "step1 ok, step2 broken". The problem is the constructor as it is written in your code. It is the constructor of ClusterNode that loses type information, thereby breaking step1.

I want to keep the EDIT compilable, so let's repeat the definitions from your second attempt:

abstract class MessagingClient {
  type BrokerLocation
  def connect(b: BrokerLocation): Unit
}

class ClusterNode(val mc: MessagingClient) {
  type BrokerLocation = mc.BrokerLocation
  def startNode(b: BrokerLocation) = mc.connect(b)
}

Forget all member types and methods that are declared in the body of ClusterNode for a moment, and just look at the constructor of the ClusterNode. What does the signature of the ClusterNode constructor tell you? It tells you that it accepts any kind of MessagingClient, regardless of the type of BrokerLocation. Nothing in the declaration of ClusterNode prevents us from writing this:

def wrapMessagingClientIntoClusterNode(p: MessagingClient)
: ClusterNode = new ClusterNode(p)

Just look at the first line of the declaration. Nothing in this method signature prevents you from passing any kind of messaging client, and no type information flows from the first line to the second line.

How can the compiler possibly recover anything useful about the type of the BrokerLocation in the MessagingClient passed to the constructor of the ClusterNode? It cannot recover the type. And it won't. The type information is lost as soon as the constructor is invoked. No amount of type declarations and methods inside the ClusterNode body can recover this information.
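To see that the information is lost only at compile time, here is a small sketch, assuming Scala 2.13; the ConstructorDemo object and the connectedTo field are hypothetical and exist only for illustration. The node still wraps a KafkaMessaging at run time, but the only way to call startNode is to cast the argument:

```scala
import java.net.URL

object ConstructorDemo {
  abstract class MessagingClient {
    type BrokerLocation
    def connect(b: BrokerLocation): Unit
  }

  class KafkaMessaging extends MessagingClient {
    type BrokerLocation = URL
    // hypothetical bookkeeping, not part of the original code
    var connectedTo: List[URL] = Nil
    def connect(b: URL): Unit = { connectedTo = b :: connectedTo }
  }

  class ClusterNode(val mc: MessagingClient) {
    type BrokerLocation = mc.BrokerLocation
    def startNode(b: BrokerLocation): Unit = mc.connect(b)
  }

  val kafka = new KafkaMessaging
  val url = new URL("http://1.2.3.4:666")

  // The static type of node is just ClusterNode,
  // and node.mc has the static type MessagingClient.
  val node = new ClusterNode(kafka)

  // node.startNode(url) // rejected: a URL is not known to be a node.BrokerLocation

  // An unchecked cast gets past the type checker; it happens to be safe here
  // only because we know that node wraps a KafkaMessaging.
  node.startNode(url.asInstanceOf[node.BrokerLocation])
}
```

The cast is shown only to make the point; it is not a recommendation.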

Can we somehow attach more information to a ClusterNode, so that the type is preserved? Well, yes, but we would have to attach the following rather cumbersome construction from the outside:

def wrapMessagingClientIntoClusterNode_param[I]
(p: MessagingClient { type BrokerLocation = I } )
: ClusterNode { type BrokerLocation = I } = 
  (new ClusterNode(p)).asInstanceOf[ClusterNode { type BrokerLocation = I }]

Notice that we cannot omit the asInstanceOf part here, because the constructor itself loses the necessary type information, just like the first version of the wrapMessagingClientIntoClusterNode method.

Now compare it to the following definition:

abstract class ClusterNode2 { self =>
  type BrokerLocation
  val msgClient: MessagingClient { 
    type BrokerLocation = self.BrokerLocation 
  }
  def startNode(i: BrokerLocation): Unit = 
    msgClient.connect(i)
}

What happens if you try to write a method wrapMessagingClientIntoClusterNode, in the same way as above?

def wrapMessagingClientIntoClusterNode2(p: MessagingClient)
: ClusterNode2 = new ClusterNode2 {
  type BrokerLocation = p.BrokerLocation
  val msgClient = p
}

That's more or less just as useless as the wrapMessagingClientIntoClusterNode method. It also loses the type information, because we are essentially asking for an existential type MessagingClient[_] on the left-hand side. But this time, we can correct it more easily:

def wrapMessagingClientIntoClusterNode2_param[I]
(p: MessagingClient {type BrokerLocation = I})
: ClusterNode2 { type BrokerLocation = I } = 
  new ClusterNode2 {
    type BrokerLocation = I
    val msgClient = p
  }

Again, we introduce a type parameter I which can tunnel the type information from the argument part to the return-type part. It compiles, but this time without an asInstanceOf, because we specify the BrokerLocation type member first, before it is forgotten by the constructor.
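For completeness, this is how the ClusterNode2 variant might be used from the outside. It is a sketch assuming Scala 2.13; the Wrap2Demo object and the connectedTo field are hypothetical, and the type argument is passed explicitly as [URL] so that the example does not depend on what the compiler manages to infer:

```scala
import java.net.URL

object Wrap2Demo {
  abstract class MessagingClient {
    type BrokerLocation
    def connect(b: BrokerLocation): Unit
  }

  class KafkaMessaging extends MessagingClient {
    type BrokerLocation = URL
    // hypothetical bookkeeping, not part of the original code
    var connectedTo: List[URL] = Nil
    def connect(b: URL): Unit = { connectedTo = b :: connectedTo }
  }

  abstract class ClusterNode2 { self =>
    type BrokerLocation
    val msgClient: MessagingClient { type BrokerLocation = self.BrokerLocation }
    def startNode(i: BrokerLocation): Unit = msgClient.connect(i)
  }

  def wrapMessagingClientIntoClusterNode2_param[I](
      p: MessagingClient { type BrokerLocation = I }
  ): ClusterNode2 { type BrokerLocation = I } =
    new ClusterNode2 {
      type BrokerLocation = I
      val msgClient = p
    }

  val kafka = new KafkaMessaging

  // The static type of node is ClusterNode2 { type BrokerLocation = URL }.
  val node = wrapMessagingClientIntoClusterNode2_param[URL](kafka)

  // No cast needed: node.BrokerLocation is known to be URL.
  node.startNode(new URL("http://1.2.3.4:666"))
}
```

Because BrokerLocation is abstract in ClusterNode2, the refinement in the return type is what the call site sees, so startNode accepts a URL directly.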

Now, consider a third version, which is closer to what you've tried in your second attempt:

abstract class ClusterNode3 { self =>
  val msgClient: MessagingClient
  type BrokerLocation = msgClient.BrokerLocation
  def connect(i: BrokerLocation): Unit =
    msgClient.connect(i)
}

You can make the same mistake with it:

def wrapMessagingClientIntoClusterNode3(p: MessagingClient)
: ClusterNode3 = new ClusterNode3 {
  val msgClient = p
}

But you can also do the right thing, and preserve the types:

def wrapMessagingClientIntoClusterNode3_param[I]
(p: MessagingClient { type BrokerLocation = I} )
: ClusterNode3 { type BrokerLocation = I } =
  new ClusterNode3 {
    val msgClient = p
  }

This is almost the same as what you did, just without the constructor.

Whether you take ClusterNode2 or ClusterNode3 is not that important; the switch from ClusterNode3 to ClusterNode2 was not that crucial in the first part of my answer. The elimination of the "evil" type-erasing constructor was actually more important, because it repaired step1.

I'll try to summarize it: in your second attempt, the compiler knows that the BrokerLocation type members in MessagingClient and ClusterNode are the same, but it does not know what the type is.
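This summary can be checked directly by asking the compiler for type-equality evidence. Again a sketch assuming Scala 2.13; the EvidenceDemo object and the value names are hypothetical, and =:= is the standard-library type-equality witness:

```scala
import java.net.URL

object EvidenceDemo {
  abstract class MessagingClient {
    type BrokerLocation
    def connect(b: BrokerLocation): Unit
  }

  class KafkaMessaging extends MessagingClient {
    type BrokerLocation = URL
    def connect(b: URL): Unit = ()
  }

  class ClusterNode(val mc: MessagingClient) {
    type BrokerLocation = mc.BrokerLocation
    def startNode(b: BrokerLocation): Unit = mc.connect(b)
  }

  val kafka = new KafkaMessaging
  val node = new ClusterNode(kafka)

  // step2 holds: the node and its client agree on BrokerLocation.
  val sameAsClient = implicitly[node.BrokerLocation =:= node.mc.BrokerLocation]

  // On its own, the Kafka client still knows that its BrokerLocation is URL.
  val kafkaIsUrl = implicitly[kafka.BrokerLocation =:= URL]

  // step1 is broken: seen through the node, the concrete type is gone.
  // implicitly[node.BrokerLocation =:= URL] // rejected: no such evidence can be found
}
```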

I hope the role of the constructor is a little bit clearer now.