I have code which looks roughly like this:

import io.circe.Json
import io.circe.parser.parse

val json: Json = parse("""
[
{
"id": 1,
"type": "Contacts",
"admin": false,
"cookies": 3
},
{
"id": 2,
"type": "Apples",
"admin": false,
"cookies": 6
},
{
"id": 3,
"type": "Contacts",
"admin": true,
"cookies": 19
}
]
""").getOrElse(Json.Null)

I'm using Scala with circe and Cats, and the parse call succeeds.

I want to return a List in which each top-level object with "type": "Contacts" appears in its entirety.

Something like: List[String] = ["{"id": 1,"type": "Contacts","admin": false,"cookies": 3}","{"id": 3,"type": "Contacts","admin": true,"cookies": 19}"]

The background is that I have large JSON files on disk. I need to pick out the subset of objects whose type field matches a certain value, in this case "Contacts", and separate them from the rest of the JSON file. I'm not looking to modify the file; I want to grep for matching objects and process them accordingly.

Thank you.

1 Answer

The most straightforward way to accomplish this kind of thing is to decode the document into either a List[Json] or List[JsonObject] value. For example, given your definition of json:

import io.circe.JsonObject

val Right(docs) = json.as[List[JsonObject]]

And then you can query based on the type:

scala> val contacts = docs.filter(_("type").contains(Json.fromString("Contacts")))
contacts: List[io.circe.JsonObject] = List(object[id -> 1,type -> "Contacts",admin -> false,cookies -> 3], object[id -> 3,type -> "Contacts",admin -> true,cookies -> 19])

scala> contacts.map(Json.fromJsonObject).map(_.noSpaces).foreach(println)
{"id":1,"type":"Contacts","admin":false,"cookies":3}
{"id":3,"type":"Contacts","admin":true,"cookies":19}

Given your use case, circe-optics seems unlikely to be a good fit (see my answer here for some discussion of why filtering with arbitrary predicates is awkward with Monocle's Traversal).

It may be worth looking into circe-fs2 or circe-iteratee, though, if you're interested in parsing and filtering large JSON files without loading the entire contents of the file into memory. In both cases the principle would be the same as in the List[JsonObject] code just above—you decode your big JSON array into a stream of JsonObject values, which you can query however you want.
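
As a rough illustration of the streaming variant, here is a minimal sketch with circe-fs2. It assumes cats-effect's IO and the `stringArrayParser` and `decoder` pipes from `io.circe.fs2`; `StreamingContacts` and `contactLines` are hypothetical names, and how you obtain the stream of text chunks from your file depends on your fs2 version, so that part is left out.

```scala
import cats.effect.IO
import fs2.Stream
import io.circe.{Json, JsonObject}
import io.circe.fs2.{decoder, stringArrayParser}

object StreamingContacts {
  // Hypothetical helper: `chunks` stands in for the text of one large JSON
  // array, read incrementally (for example from a file).
  def contactLines(chunks: Stream[IO, String]): Stream[IO, String] =
    chunks
      .through(stringArrayParser[IO])   // emits each element of the top-level array as Json
      .through(decoder[IO, JsonObject]) // fails the stream if an element is not a JSON object
      .filter(_("type").contains(Json.fromString("Contacts")))
      .map(obj => Json.fromJsonObject(obj).noSpaces)
}
```

The filter is the same predicate as in the List[JsonObject] version; the difference is that elements are parsed and discarded as the input arrives, so only the matching objects need to be retained downstream.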