I have a big, flat, denormalized CSV file that contains multiple objects on a single row, like this:

a1, a2, a3, b1, b2, b3 ...
...

and I have objects:

case class A(a1: Int, a2: String, a3: Float)
case class B...
...

The legacy approach is to write a complicated adapter for extracting each class. I recently came across some talks about shapeless, and I believe I can solve this with generic programming using shapeless.

There is even a CSV parser example among the shapeless examples, which is perfect. My plan is:

  1. Parse each CSV row into a List[String].
  2. Filter the List[String] using the object's field information.
  3. Feed the filtered List[String] to the CSV parser example.
  4. That way I can extract multiple objects per row from the CSV file.
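
Steps 1 and 2 could be sketched roughly as follows in plain Scala. This assumes the file has a header row whose column names match the case class field names, which is not stated above; `parseLine` and `select` are hypothetical helper names, the sample values are made up, and the naive split does not handle quoted fields.

```scala
// Split one CSV line into trimmed cells (naive: quoted fields are not handled).
def parseLine(line: String): List[String] =
  line.split(",", -1).map(_.trim).toList

// Keep only the cells whose header name is listed in `wanted`, in `wanted` order.
// Throws NoSuchElementException if a wanted column is missing from the header.
def select(header: List[String], row: List[String], wanted: List[String]): List[String] = {
  val byName = header.zip(row).toMap
  wanted.map(byName)
}

val header = parseLine("a1, a2, a3, b1, b2, b3")
val row    = parseLine("1, foo, 2.5, 7, bar, 0.1")

// The cells belonging to A, re-joined so they can be fed to a CSV converter.
val aCells = select(header, row, List("a1", "a2", "a3"))
val aLine  = aCells.mkString(",")
```

The open question is then how to obtain the `wanted` list from the case class itself instead of writing it by hand.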

The problems I have:

  1. I am still on Scala 2.10. I seem to have configured the compiler plugin correctly (e.g. mvn clean install works properly), but IntelliJ occasionally fails to compile and throws exceptions.

    <groupId>org.scala-lang.plugins</groupId>
    <artifactId>macro-paradise_2.10</artifactId>
    <version>2.0.0-SNAPSHOT</version>
    
  2. This code is from the shapeless example:

    implicit def deriveHConsOption[V, T <: HList](
        implicit
        scv: Lazy[CSVConverter[V]],
        sct: Lazy[CSVConverter[T]]
      ):CSVConverter[Option[V] :: T] = new CSVConverter[Option[V] :: T] {
        override def from(s: String): Try[shapeless.::[Option[V], T]] = s.span(_ != ',') 
    

However, I get the following compiler error:

Error:(70, 28) wrong number of type arguments for ::, should be 1
):CSVConverter[Option[V] :: T] = new CSVConverter[Option[V] :: T] { ^

  3. My attempt to filter the CSV using shapeless:

    // code to filter and extract one object
    def extractCSVColumnsAndParse[T] = {
      val labl = LabelledGeneric[T]
      val keys = Keys[labl.Repr].apply
      val keyNames = keys.toList.map(_.name)
    }
    

However, it seems T can only be a concrete class type:

Error:(86, 35) could not find implicit value for parameter lgen: shapeless.LabelledGeneric[T]
val labl = LabelledGeneric[T]

For #2, you need to import shapeless.::, otherwise it uses the List constructor, which only has one type parameter. (Cyrille Corpet)
Could you give some context about where you're using your snippet from #3? Is it in a method you want to apply multiple times with concrete types? (Cyrille Corpet)
@CyrilleCorpet Right, that solves #2. For #3, I was hoping I could write one function and call it multiple times to parse different objects out. (zinking)

1 Answer

I can't say much about your IntelliJ problem, but here are answers to the other two:

  • 2: :: is scala.collection.immutable.:: (the List constructor, which takes a single type parameter), unless you explicitly import shapeless.:: (the HList constructor, which takes two).

  • 3: While you're dealing with generic types, you cannot assume anything about them. They might be Int, Any, MyCaseClass, Nothing, ... So the compiler cannot find a LabelledGeneric for them (indeed, what would be the LabelledGeneric for Any?). Therefore, you must explicitly tell every generic method that an instance of LabelledGeneric[T] exists for your type, by taking it as an implicit parameter (or as a context bound, which is the same thing under the hood). For instance, you could do

    // alternatively, def extractCSVColumnsAndParse[T: LabelledGeneric]
    def extractCSVColumnsAndParse[T](implicit ev: LabelledGeneric[T]) = {
      val labl = LabelledGeneric[T]
      val keys = Keys[labl.Repr].apply
      val keyNames = keys.toList.map(_.name)
      ...
    }
    

    Then, when you call it with a concrete case class:

    extractCSVColumnsAndParse[MyCaseClass]  //no need to pass the parameter, it is already in scope
    

    The "magic" of shapeless is that it generates your implicit ev: LabelledGeneric[MyCaseClass] for you, but it can only do so for specific types (using macros), so you have to tell the compiler that it exists, if you're dealing with generic types.
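
Going back to point 2 for a moment, here is a minimal sketch of what the import changes. The value names are made up for the illustration.

```scala
import shapeless.{::, HNil}

// With the import above, `::` in type position is shapeless's HList cons,
// which takes two type arguments: the head type and the tail HList type.
val row: Int :: String :: HNil = 1 :: "a" :: HNil

val first: Int = row.head
val second: String = row.tail.head
```

Without the import, `::` in a type position resolves to the List cons class, hence the "should be 1" in the error message.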

EDIT

After that, you get an error with val keys, because the type parameter for Keys must be an HList, so you have to enforce this in some way, because LabelledGeneric[T]#Repr is not necessarily an HList. And, you also need to provide an implicit Keys[Repr], for the same reason as with LabelledGeneric.

def extractCSVColumnsAndParse[T, Repr <: HList](implicit labl: LabelledGeneric.Aux[T, Repr], K: Keys[Repr]) = {
  val keys = K()
  val keyNames = keys.toList.map(_.name)
  ...
}

However, this makes it harder to call with a specific case class, since you can no longer write extractCSVColumnsAndParse[MyCaseClass]. This is because a Scala method has only one list of type parameters, so you must either give all of them or none.

A convoluted way to avoid this is the following pattern, assuming your method will actually have some parameter (say, the List[String] from the csv file, or the csv file path):

def extractCSVColumnsAndParse[T] = new Extractor[T]

class Extractor[T] {
  def apply[Repr <: HList](csv: List[String])(implicit labl: LabelledGeneric.Aux[T, Repr], K: Keys[Repr]) = {
    ... // put the logic here
  }
}

Now you can call it using

extractCSVColumnsAndParse[MyCaseClass](csv)

This pattern allows you to specify only the first type parameter, the second one being inferred at compile time.
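
Putting the pieces together, here is a sketch of that pattern which picks the columns of one case class out of a row. It rests on a few assumptions: a shapeless 2.3-style API, a header row whose names match the field names, and the hypothetical names `selectColumns` and `ColumnSelector` (a narrower variant of the Extractor above). As far as I can tell, calling toList on the keys inside a generic method also needs a ToTraversable instance in scope, so the sketch asks for one and fixes the output type of Keys with Keys.Aux.

```scala
import shapeless._
import shapeless.ops.hlist.ToTraversable
import shapeless.ops.record.Keys

// The case class A from the question.
case class A(a1: Int, a2: String, a3: Float)

class ColumnSelector[T] {
  // Repr and K are inferred from the implicits; the caller only writes T.
  def apply[Repr <: HList, K <: HList](header: List[String], row: List[String])(implicit
      labl: LabelledGeneric.Aux[T, Repr],          // the record representation of T
      keys: Keys.Aux[Repr, K],                     // the field-name symbols of that record
      toL: ToTraversable.Aux[K, List, Symbol]      // lets us turn the keys HList into a List
  ): List[String] = {
    val byName = header.zip(row).toMap
    // Throws NoSuchElementException if a field has no matching column.
    keys().toList.map(k => byName(k.name))
  }
}

def selectColumns[T] = new ColumnSelector[T]

val header = List("a1", "a2", "a3", "b1", "b2", "b3")
val row    = List("1", "foo", "2.5", "7", "bar", "0.1")

// Only the first type parameter is written out.
val aCells = selectColumns[A](header, row)
```

The resulting cells can then be joined with commas and handed to the CSVConverter from the shapeless example.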