How to decode a JSON null into an empty collection

Question

Suppose I have a Scala case class like this:

case class Stuff(id: String, values: List[String])

And I want to be able to decode the following JSON values into it:

{ "id": "foo", "values": ["a", "b", "c"] }
{ "id": "bar", "values": [] }
{ "id": "qux", "values": null }

In Circe the decoder you get from generic derivation works for the first two cases, but not the third:

scala> decode[Stuff]("""{ "id": "foo", "values": ["a", "b", "c"] }""")
res0: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(a, b, c)))

scala> decode[Stuff]("""{ "id": "foo", "values": [] }""")
res1: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List()))

scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
res2: Either[io.circe.Error,Stuff] = Left(DecodingFailure(C[A], List(DownField(values))))

How can I make my decoder work for this case, preferably without the boilerplate of a fully hand-written definition?

Travis Brown · Accepted Answer · 2019-09-13T08:28:53

Preprocessing with cursors

The most straightforward way to solve this problem is to use semi-automatic derivation and preprocess the JSON input with prepare. For example:

import io.circe.{Decoder, Json}, io.circe.generic.semiauto._, io.circe.jawn.decode

case class Stuff(id: String, values: List[String])

def nullToNil(value: Json): Json = if (value.isNull) Json.arr() else value

implicit val decodeStuff: Decoder[Stuff] = deriveDecoder[Stuff].prepare(
  _.downField("values").withFocus(nullToNil).up
)

And then:

scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
res0: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List()))

It's a little more verbose than simply using deriveDecoder, but it still lets you avoid the boilerplate of writing out all your case class members, and if you only have a few case classes with members that need this treatment, it's not too bad.
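To see what the preparation step does before the derived decoder runs, here is a minimal sketch that applies the same three cursor moves to a raw Json value. The sample input and the names input, rewritten and expected are made up for illustration; nullToNil is the helper defined above.

```scala
import io.circe.Json

// Same helper as in the answer above.
def nullToNil(value: Json): Json = if (value.isNull) Json.arr() else value

// Hypothetical sample input, built directly so that no parser is needed.
val input: Json = Json.obj(
  "id" -> Json.fromString("foo"),
  "values" -> Json.Null
)

// The same cursor steps that are passed to `prepare`:
// move to the field, rewrite its value, move back up to the object.
val rewritten: Option[Json] =
  input.hcursor.downField("values").withFocus(nullToNil).up.focus

// The document the derived decoder ends up seeing.
val expected: Json = Json.obj(
  "id" -> Json.fromString("foo"),
  "values" -> Json.arr()
)
```

Here rewritten should equal Some(expected): the null has been replaced by an empty array, and the derived decoder never sees the original value.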

Handling missing fields

If you additionally want to handle cases where the field is missing entirely, you need an extra step:

implicit val decodeStuff: Decoder[Stuff] = deriveDecoder[Stuff].prepare { c =>
  val field = c.downField("values")

  if (field.failed) {
    c.withFocus(_.mapObject(_.add("values", Json.arr())))
  } else field.withFocus(nullToNil).up
}

And then:

scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
res1: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List()))

scala> decode[Stuff]("""{ "id": "foo" }""")
res2: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List()))
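The reason the missing-field case needs its own branch can be checked directly on cursors. This is a small sketch with made-up inputs (withNull, missing) and made-up result names; it only uses the cursor operations that already appear in the decoder above.

```scala
import io.circe.Json

// Hypothetical inputs: one with an explicit null, one with no "values" key.
val withNull: Json = Json.obj("id" -> Json.fromString("foo"), "values" -> Json.Null)
val missing: Json = Json.obj("id" -> Json.fromString("foo"))

// A null value is still a value, so moving to the field succeeds.
val nullFailed: Boolean = withNull.hcursor.downField("values").failed

// With no such key the move fails, so there is no focus to rewrite.
val missingFailed: Boolean = missing.hcursor.downField("values").failed

// The fallback branch adds the key to the enclosing object instead.
val patched: Option[Json] =
  missing.hcursor.withFocus(_.mapObject(_.add("values", Json.arr()))).focus
```

Only the second cursor is failed, which is why the decoder falls back to adding the field to the parent object rather than rewriting a focus that does not exist.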

With this preparation step, the decoder accepts the same inputs it would if the member type were Option[List[String]], and it produces an empty list where that decoder would produce None.
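For comparison, here is a sketch of the Option-based alternative. OptStuff is a hypothetical variant of Stuff introduced only for this comparison; it is not part of the question.

```scala
import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder
import io.circe.jawn.decode

// Hypothetical variant of Stuff with an optional member, for comparison only.
case class OptStuff(id: String, values: Option[List[String]])

implicit val decodeOptStuff: Decoder[OptStuff] = deriveDecoder[OptStuff]

// Both a null value and a missing field decode to None.
val fromNull = decode[OptStuff]("""{ "id": "foo", "values": null }""")
val fromMissing = decode[OptStuff]("""{ "id": "foo" }""")

// Collapsing None to Nil then has to happen wherever the value is used.
val values: List[String] = fromNull.map(_.values.getOrElse(Nil)).getOrElse(Nil)
```

The derived decoder needs no customization in this version, but the empty-collection default moves out of the decoder and into every use site.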

Bundling this up

You can make this more convenient with a helper method like the following:

import io.circe.{ACursor, Decoder, Json}
import io.circe.generic.decoding.DerivedDecoder

def deriveCustomDecoder[A: DerivedDecoder](fieldsToFix: String*): Decoder[A] = {
  val preparation = fieldsToFix.foldLeft[ACursor => ACursor](identity) {
    case (acc, fieldName) =>
      acc.andThen { c =>
        val field = c.downField(fieldName)

        if (field.failed) {
          c.withFocus(_.mapObject(_.add(fieldName, Json.arr())))
        } else field.withFocus(nullToNil).up
      }
  }

  implicitly[DerivedDecoder[A]].prepare(preparation)
}

Which you can use like this:

case class Stuff(id: String, values: Seq[String], other: Seq[Boolean])

implicit val decodeStuff: Decoder[Stuff] = deriveCustomDecoder("values", "other")

And then:

scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
res1: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))

scala> decode[Stuff]("""{ "id": "foo" }""")
res2: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))

scala> decode[Stuff]("""{ "id": "foo", "other": [true] }""")
res3: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List(true)))

scala> decode[Stuff]("""{ "id": "foo", "other": null }""")
res4: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))

This gets you 95% of the way back to the ease of use of semi-automatic derivation, but if that's not enough…

The nuclear option

If you have a lot of case classes with members that need this treatment and you don't want to have to modify them all, you can take the more extreme approach of modifying the behavior of the Decoder for Seq everywhere:

import io.circe.Decoder

implicit def decodeSeq[A: Decoder]: Decoder[Seq[A]] =
  Decoder.decodeOption(Decoder.decodeSeq[A]).map(_.toSeq.flatten)

Then if you have a case class like this:

case class Stuff(id: String, values: Seq[String], other: Seq[Boolean])

The derived decoder will just do what you want automatically:

scala> import io.circe.generic.auto._, io.circe.jawn.decode
import io.circe.generic.auto._
import io.circe.jawn.decode

scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
res0: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))

scala> decode[Stuff]("""{ "id": "foo" }""")
res1: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))

scala> decode[Stuff]("""{ "id": "foo", "other": [true] }""")
res2: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List(true)))

scala> decode[Stuff]("""{ "id": "foo", "other": null }""")
res3: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))

I'd strongly recommend sticking to the more explicit version above, though, since relying on changing the behavior of the Decoder for Seq puts you in a position where you have to be very careful about what implicits are in scope where.
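As a rough illustration of how much depends on scope, the sketch below keeps the override in a hypothetical holder object (NullAsEmpty, a name made up here) so that it has to be imported explicitly. It also assumes, as is the case in current Circe versions, that Decoder is invariant, so the override applies to members typed as Seq but not to members typed as List.

```scala
import io.circe.{Decoder, Json}

// Hypothetical holder object, so the override must be imported explicitly.
object NullAsEmpty {
  implicit def decodeSeq[A: Decoder]: Decoder[Seq[A]] =
    Decoder.decodeOption(Decoder.decodeSeq[A]).map(_.toSeq.flatten)
}

// Without the import, the default instance rejects null.
val defaultResult = Decoder[Seq[String]].decodeJson(Json.Null)

// Inside this block the imported instance takes precedence for Seq.
val overridden = {
  import NullAsEmpty._
  Decoder[Seq[String]].decodeJson(Json.Null)
}

// A List member is not covered: Decoder[Seq[A]] is not a Decoder[List[A]],
// so the default List decoder is still selected and null still fails.
val listResult = {
  import NullAsEmpty._
  Decoder[List[String]].decodeJson(Json.Null)
}
```

The same JSON therefore decodes differently depending on which imports are visible where a decoder is derived, and on whether a member happens to be declared as Seq or List.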

This question comes up often enough that we may provide specific support for people who need null mapped to empty collections in a future release of Circe.
