In the code snippet below I am reading a JSON file with a structure similar to this one:
{ "c7254865-87b5-4d34-a7bd-6ba6c9dbab14": "72119c87-7fce-4e17-9770-fcfab04328f5"}
{ "36c18403-1707-48c4-8f19-3b2e705007d4": "72119c87-7fce-4e17-9770-fcfab04328f5"}
{ "34a71a88-ae2d-4304-a1db-01c54fc6e4d8": "72119c87-7fce-4e17-9770-fcfab04328f5"}
Each line contains a single key-value pair, which should then be added to a map in Scala. This is the Scala code I used for this purpose:
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FSDataInputStream, FileSystem, Path}
import scala.collection.mutable
import scala.io.Source
import scala.util.parsing.json.JSON

val fs = FileSystem.get(new Configuration())

def readFile(location: String): mutable.HashMap[String, String] = {
  val path: Path = new Path(location)
  val dataInputStream: FSDataInputStream = fs.open(path)
  val m = new mutable.HashMap[String, String]()
  for (line <- Source.fromInputStream(dataInputStream).getLines) {
    // parseFull returns Option[Any]; .get will throw if a line fails to parse
    val parsed: Option[Any] = JSON.parseFull(line)
    m ++= parsed.get.asInstanceOf[Map[String, String]]
  }
  m
}
Surely there must be a more elegant way to do this in Scala. In particular, it should be possible to get rid of the mutable map and fold the lines directly into an immutable Map. How can I do that?
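For context, this is roughly the direction I was imagining, but I'm not sure it's idiomatic. To keep the sketch self-contained I replaced the Hadoop input stream with a plain Iterator[String], and stood in for the JSON parsing with a regex (a hypothetical simplification that only works because each line holds exactly one "key": "value" pair):

```scala
object Sketch {
  // Hypothetical stand-in for JSON.parseFull: each input line is assumed to
  // contain exactly one string-to-string pair, e.g. { "k": "v" }
  private val Pair = """\{\s*"([^"]+)"\s*:\s*"([^"]+)"\s*\}""".r

  // Build the immutable Map directly from the line iterator: lines that do
  // not match the expected shape are silently dropped by collect.
  def readLines(lines: Iterator[String]): Map[String, String] =
    lines.collect { case Pair(k, v) => k -> v }.toMap
}
```

Is replacing the for-loop with something like this collect/toMap pipeline the right way to go, or is there a cleaner approach?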