17
votes

I'm using Scala 2.8.0 and trying to read a pipe-delimited file, as in the code snippet below:

object Main {
  def main(args: Array[String]) :Unit = {
    if (args.length > 0) {
      val lines = scala.io.Source.fromPath("QUICK!LRU-2009-11-15.psv")
      for (line <- lines)
        print(line)
    }
  }
}

Here's the error:

Exception in thread "main" java.nio.charset.UnmappableCharacterException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:261)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:319)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
    at java.io.InputStreamReader.read(InputStreamReader.java:167)
    at java.io.BufferedReader.fill(BufferedReader.java:136)
    at java.io.BufferedReader.read(BufferedReader.java:157)
    at scala.io.BufferedSource$$anonfun$1$$anonfun$apply$1.apply(BufferedSource.scala:29)
    at scala.io.BufferedSource$$anonfun$1$$anonfun$apply$1.apply(BufferedSource.scala:29)
    at scala.io.Codec.wrap(Codec.scala:65)
    at scala.io.BufferedSource$$anonfun$1.apply(BufferedSource.scala:29)
    at scala.io.BufferedSource$$anonfun$1.apply(BufferedSource.scala:29)
    at scala.collection.Iterator$$anon$14.next(Iterator.scala:149)
    at scala.collection.Iterator$$anon$2.next(Iterator.scala:745)
    at scala.collection.Iterator$$anon$2.head(Iterator.scala:732)
    at scala.collection.Iterator$$anon$24.hasNext(Iterator.scala:405)
    at scala.collection.Iterator$$anon$20.hasNext(Iterator.scala:320)
    at scala.io.Source.hasNext(Source.scala:209)
    at scala.collection.Iterator$class.foreach(Iterator.scala:534)
    at scala.io.Source.foreach(Source.scala:143)
    ...
    at infillreports.Main$.main(Main.scala:8)
    at infillreports.Main.main(Main.scala)
Java Result: 1

4 Answers

26
votes
object Main {
  def main(args: Array[String]) :Unit = {
    if (args.length > 0) {
      val lines = scala.io.Source.fromPath("QUICK!LRU-2009-11-15.psv")("UTF-8")
      for (line <-lines)
        print(line)
    }
  }
}
7
votes

I was struggling with this same issue and this answer helped me. I wanted to expand on seh's comment about why this works. The answer lies in the method signature:

def fromFile(file: JFile)(implicit codec: Codec): BufferedSource

It takes an implicit codec parameter. Yet in the example a String is provided, not a Codec. A second translation takes place behind the scenes: the companion object of the Codec class provides an implicit conversion from String (string2codec, as far as I can tell from the library source), which simply calls its apply method:

def apply(encoding: String): Codec

So the compiler has done some work for us, and the call is effectively: val lines = Source.fromFile(someFile)(Codec("UTF-8"))

Given that the codec parameter is implicit, if you are calling this method several times you can also define an implicit Codec value in the scope of your calls:

implicit val codec = Codec("UTF-8")
val lines = Source.fromFile(someFile)
val moreLines = Source.fromFile(someOtherFile)

I hope I got that right (I'm still a Scala n00b, getting to grips with it, so feel free to correct where needed).
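
To make the two call styles above concrete, here is a minimal sketch that reads the same file once with an explicit Codec and once with an implicit one in scope. The object name CodecDemo, the helper latin1TempFile, and the choice of ISO-8859-1 as the file's encoding are all assumptions made for illustration; none of them come from the question.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.{Codec, Source}

object CodecDemo {
  // Hypothetical helper: writes text to a temp file encoded as ISO-8859-1.
  def latin1TempFile(text: String): java.io.File = {
    val path = Files.createTempFile("codec-demo", ".psv")
    Files.write(path, text.getBytes(StandardCharsets.ISO_8859_1))
    path.toFile
  }

  // The codec is passed explicitly in the second argument list.
  def readExplicit(file: java.io.File): String = {
    val source = Source.fromFile(file)(Codec("ISO-8859-1"))
    try source.mkString finally source.close()
  }

  // The compiler fills in the second argument list from the implicit in scope.
  def readImplicit(file: java.io.File): String = {
    implicit val codec: Codec = Codec("ISO-8859-1")
    val source = Source.fromFile(file)
    try source.mkString finally source.close()
  }

  def main(args: Array[String]): Unit = {
    val file = latin1TempFile("caf\u00e9|1|2")
    println(readExplicit(file))
    println(readImplicit(file))
    file.delete()
  }
}
```

Both methods should return the same text; the only difference is who supplies the Codec argument.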

4
votes

To add to Daniel C. Sobral's answer, you can also try something like this:

val products = Source.fromFile("products.txt")("UTF-8").getLines().toList;

for(p <- products){
        println("product :" + p);
}
1
votes

This may be a more generic solution:

import java.nio.charset.CodingErrorAction
import scala.io.Codec

implicit val codec = Codec("UTF-8")
codec.onMalformedInput(CodingErrorAction.REPLACE)
codec.onUnmappableCharacter(CodingErrorAction.REPLACE)

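With these two settings, bytes that cannot be decoded are replaced with a substitution character instead of throwing an exception, so malformed data in the file no longer aborts the read.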
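
As a rough illustration of what the REPLACE action does, the sketch below writes a file containing one byte that is not valid UTF-8 and reads it back with a lenient codec. The names LenientRead and readLenient and the sample bytes are assumptions for the example, not part of the original answer.

```scala
import java.nio.charset.CodingErrorAction
import java.nio.file.Files
import scala.io.{Codec, Source}

object LenientRead {
  // Reads a file as UTF-8, substituting a replacement character
  // for any bytes that cannot be decoded instead of throwing.
  def readLenient(file: java.io.File): String = {
    val codec: Codec = Codec("UTF-8")
      .onMalformedInput(CodingErrorAction.REPLACE)
      .onUnmappableCharacter(CodingErrorAction.REPLACE)
    val source = Source.fromFile(file)(codec)
    try source.mkString finally source.close()
  }

  def main(args: Array[String]): Unit = {
    val path = Files.createTempFile("lenient-demo", ".psv")
    // 0xE9 is "e acute" in ISO-8859-1 but an invalid sequence here in UTF-8.
    val bytes = "caf".getBytes("US-ASCII") ++ Array(0xE9.toByte) ++ "|1".getBytes("US-ASCII")
    Files.write(path, bytes)
    println(readLenient(path.toFile))
    Files.delete(path)
  }
}
```

Note that this hides the problem rather than fixing it: the offending characters are lost. If the file's real encoding is known, passing that encoding is usually the better choice.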