
Using Scala... I can't figure out how to use polymorphism in a way that mixes type bound and covariance.

In a nutshell, I think I need something like this type signature... but if you follow along with my dummy example, you'll see why I get here... and maybe I'm wrong.

def func[+T <: U](func: Seq[T] => T)(iter: Iterator[String]): Map[String, String] = ???

but this approach yields...

>> error: ']' expected but identifier found

Here's a dummy example that demonstrates what I'm trying to do... I could sidestep this by just making it work only with the base trait Record... but I'd like to get it working with polymorphism baked in for other reasons in the real code.

setup

// underlying trait to hold key and value
trait Record {
  def k: String 
  def v: String
  def isDefined: Boolean
}

// companion object with apply method
object Record {
  def apply(s: String): Record = s.split(",") match {
    case Array(k,v) => new ValidRecord(k,v).asInstanceOf[Record]
    case _          => EmptyRecord.asInstanceOf[Record]
  }
}

// singleton for empty records
object EmptyRecord extends Record {
  val k = ""
  val v = ""
  val isDefined = false
}

// class for actual data
class ValidRecord(val k: String, val v: String) extends Record {
  val isDefined = true
}

polymorphic function

note - going from Iterator to Seq here looks questionable... I'm reading in a file from src/main/resources... it comes in as an Iterator... and I ultimately need to get it into a Map, so .toSeq and .groupBy seem like logical steps... it's only maybe 100MB and a million or so records, so this works fine... but if there's a smarter way to get from start to end, I'm open to that critique as well.

def iter_2_map[T <: Record](func: Seq[T] => T)(iter: Iterator[String]): Map[String, String] = {
  iter                               // iterator of raw data
  .map(Record.apply)                 // Iterator[Record]
  .toSeq                             // gives .groupBy() method
  .groupBy(_.k)                      // Map[k -> Seq[Record]]; one Seq of records per k
  .mapValues(func) // <<< ERROR HERE //function to reduce Seq[Record] to 1 Record
  .filter(_._2.isDefined)            // get rid of empty results
  .mapValues(_.v)                    // target of Map is just v
}

error

found   : Seq[T] => T
required: Seq[Record] => ?
          .mapValues(func)
                     ^

If I break down all those steps and declare types at every relevant step... the error changes to this...

found   : Seq[T] => T
required: Seq[Record] => Record
          .mapValues(func)
                     ^

So here's where I get stumped. I think making T covariant solves this... T is a declared subtype of Record, but maybe it's not recognizing Seq[T] as <: Seq[Record]?

But making this change yields the error at the top...

def iter_2_map[+T <% Record](func: Seq[T] => T)(iter: Iterator[String]): Map[String, String] = {
  ???
}

back to this...

>> error: ']' expected but identifier found

Am I even on the right track?

It is not clear whether your trait Record should be sealed. In other words, are EmptyRecord and ValidRecord the only expected subclasses, or do you want to create more specific types with more fields and other parsing logic? So it is not clear what T you have in mind when you write T <: Record. On the one hand, you explicitly call Record.apply, so you can't get anything besides those two. On the other hand, Record.apply can return either of them, so you can't narrow the type any further than Record. To sum up: show us how you are going to use your iter_2_map. – SergGr

1 Answer


You are using + incorrectly. It is only used on the type parameters of classes and traits, to declare that the type is covariant in that parameter. It does not make sense on a method's type parameter, and it would not help here anyway: Seq[T] actually is a subtype of Seq[Record], because Seq is covariant, but functions are contravariant in their argument type, so Function1[Seq[T], T] is a supertype of Function1[Seq[Record], T], not a subtype. Here is why:

After .groupBy(_.k) you have a Map[String, Seq[Record]]. You then call .mapValues(func) on it, passing a function that takes a Seq[T]. This cannot work, because the values it will be applied to are Seq[Record], not Seq[T].

Imagine that Record is Animal, T is Dog, and func is makeBark. Now you are trying to pass it a bunch of animals, some of which are Cats, some Birds, and some, maybe, Fish. You can't make them all bark, can you?
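To make the analogy concrete, here is a minimal sketch of how the two variance rules interact. The names Animal, Dog, Cat, makeBark and listNames are made up for illustration and are not part of the question's code.

```scala
object VarianceDemo {
  // Hypothetical hierarchy standing in for Record and its subtypes.
  trait Animal { def name: String }
  final case class Dog(name: String) extends Animal {
    def bark: String = s"$name says woof"
  }
  final case class Cat(name: String) extends Animal

  // Only knows how to handle dogs (plays the role of Seq[T] => T).
  val makeBark: Seq[Dog] => String =
    dogs => dogs.map(_.bark).mkString("; ")

  // Handles any animal (plays the role of Seq[Record] => ...).
  val listNames: Seq[Animal] => String =
    animals => animals.map(_.name).mkString(", ")

  // Seq is covariant: a Seq[Dog] is accepted where a Seq[Animal] is expected.
  val dogs: Seq[Dog] = Seq(Dog("Rex"))
  val animals: Seq[Animal] = dogs

  // Function1 is contravariant in its argument: a function on Seq[Animal]
  // is accepted where a function on Seq[Dog] is expected.
  val onDogs: Seq[Dog] => String = listNames

  // The other direction is rejected by the compiler, which is the
  // situation in the question:
  // val onAnimals: Seq[Animal] => String = makeBark   // type mismatch
}
```

The commented-out line is the analogue of passing a Seq[T] => T to .mapValues on a Map[String, Seq[Record]].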

You could just declare your reducer function to accept the Record sequence rather than T:

def iter_2_map[T <: Record](func: Seq[Record] => T)(iter: Iterator[String]): Map[String, String] = ???

This will compile, but it doesn't seem very useful to you, because you appear to expect your func to be able to return both EmptyRecord and ValidRecord, and not just T (since you are filtering the empties out afterwards). So it seems you don't need the type parameter at all:

def iter_2_map(func: Seq[Record] => Record)(iter: Iterator[String]): Map[String, String] = ???
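For completeness, here is a sketch of how the whole thing might look with that last signature. The types are copied from the question, minus the redundant asInstanceOf casts. The keepLast reducer and the sample input are assumptions added purely for illustration. I also swapped .mapValues for plain .map, since .mapValues is deprecated in Scala 2.13 and returns a lazy view there rather than a Map.

```scala
object IterToMap {
  // Types from the question.
  trait Record { def k: String; def v: String; def isDefined: Boolean }
  object EmptyRecord extends Record { val k = ""; val v = ""; val isDefined = false }
  class ValidRecord(val k: String, val v: String) extends Record { val isDefined = true }

  object Record {
    def apply(s: String): Record = s.split(",") match {
      case Array(k, v) => new ValidRecord(k, v)
      case _           => EmptyRecord
    }
  }

  // No type parameter: the reducer takes and returns plain Records.
  def iter_2_map(func: Seq[Record] => Record)(iter: Iterator[String]): Map[String, String] =
    iter
      .map(Record.apply)                               // Iterator[Record]
      .toSeq                                           // gives .groupBy
      .groupBy(_.k)                                    // Map[k -> Seq[Record]]
      .map { case (k, records) => k -> func(records) } // reduce each group to one Record
      .filter { case (_, r) => r.isDefined }           // drop empty results
      .map { case (k, r) => k -> r.v }                 // keep only v

  // Hypothetical reducer: keep the last record seen for each key.
  val keepLast: Seq[Record] => Record =
    records => records.lastOption.getOrElse(EmptyRecord)

  // Made-up input; "garbage" has no comma, so it parses to EmptyRecord.
  val result: Map[String, String] =
    iter_2_map(keepLast)(Iterator("a,1", "b,2", "a,3", "garbage"))
}
```

With this input, the unparseable line ends up grouped under the empty key and is removed by the isDefined filter, leaving one value per real key.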