2
votes

I have a table that is described in a JSON file, and I want to build a PCollection from it to use as a side input later on.

PCollection<KV<Boolean, Map<String, Object>>> pC = p_jsonstring
    .apply("create ...", MapElements.via((String input) -> {
      try {
        ObjectMapper mapper = new ObjectMapper();
        Map<String, Object> mytable =
            mapper.readValue(input, new TypeReference<Map<String, Object>>(){});
        Boolean key = (Boolean) mytable.get("mykey");
        return KV.of(key, mytable);
      } catch (IOException e) {
        e.printStackTrace();
        return null;
      }
    }).withOutputType(new TypeDescriptor<KV<Boolean, Map<String, Object>>>() {}));

When I run it, I get the following error message:

SEVERE: Unable to return a default Coder for create KV../Map.out [PCollection]. Correct one of the following root causes: No Coder has been manually specified; you may do so using .setCoder(). Inferring a Coder from the CoderRegistry failed: Unable to provide a default Coder for org.apache.beam.sdk.values.KV<java.lang.Boolean, java.util.Map<java.lang.String, java.lang.Object>>. Correct one of the following root causes: Building a Coder using a registered CoderFactory failed: Cannot provide coder for parameterized type org.apache.beam.sdk.values.KV<java.lang.Boolean, java.util.Map<java.lang.String, java.lang.Object>>: Unable to provide a default Coder for java.util.Map<java.lang.String, java.lang.Object>. Correct one of the following root causes: Building a Coder using a registered CoderFactory failed: Cannot provide coder for parameterized type java.util.Map<java.lang.String, java.lang.Object>: Unable to provide a default Coder for java.lang.Object. Correct one of the following root causes: Building a Coder using a registered CoderFactory failed: Cannot provide coder based on value with class java.lang.Object: No CoderFactory has been registered for the class. Building a Coder from the @DefaultCoder annotation failed: Class java.lang.Object does not have a @DefaultCoder annotation.

I think the issue is mainly caused by Object in Map<String, Object>, but in my case the type of each map value is only known at runtime, when the JSON string is read from the file. A value can be a string, a number, or a boolean.

Any suggestions?

1
Did you find a solution for this? (user944849)

1 Answer

0
votes

I think the built-in TypeDescriptors.kvs should work here as your output type. You could also keep the values as Strings in the Map (that is, Map<String, String> instead of Map<String, Object>) and deserialize each one only when you actually process it; the error shows that Beam cannot provide a default Coder for java.lang.Object, whereas String has one. If you would rather deserialize at this step, consider defining a Schema for the deserialized object and using a Row as your value class. You can generate a Coder from that Schema.
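
To illustrate the "keep the values as Strings and deserialize later" idea, here is a minimal sketch in plain Java with no Beam or Jackson dependency. The class name LazyJsonValues and the helpers toStringValues and decode are hypothetical names made up for this example. It assumes the values are only strings, numbers, or booleans, as stated in the question, and the decoding is a heuristic: once stringified, a JSON string "42" and a JSON number 42 can no longer be told apart.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Beam or Jackson.
final class LazyJsonValues {

    // Turn every parsed value into its string form, so the map type becomes
    // Map<String, String> and no longer mentions java.lang.Object.
    static Map<String, String> toStringValues(Map<String, Object> parsed) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : parsed.entrySet()) {
            out.put(e.getKey(), String.valueOf(e.getValue()));
        }
        return out;
    }

    // Heuristic decoding at the point of use: Boolean, then Long, then Double,
    // otherwise the raw String. Assumption: the original type does not need to
    // be preserved exactly (a string "42" comes back as a Long).
    static Object decode(String raw) {
        if ("true".equals(raw) || "false".equals(raw)) {
            return Boolean.valueOf(raw);
        }
        try {
            return Long.valueOf(raw);
        } catch (NumberFormatException notALong) {
            // Not an integer; try a floating-point number next.
        }
        try {
            return Double.valueOf(raw);
        } catch (NumberFormatException notADouble) {
            return raw;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the map that mapper.readValue(...) would return.
        Map<String, Object> parsed = new LinkedHashMap<>();
        parsed.put("mykey", true);
        parsed.put("count", 3);
        parsed.put("name", "abc");

        Map<String, String> flat = toStringValues(parsed);
        for (Map.Entry<String, String> e : flat.entrySet()) {
            Object value = decode(e.getValue());
            System.out.println(e.getKey() + " -> " + value.getClass().getSimpleName());
        }
    }
}
```

In the pipeline, toStringValues would run inside the mapping function, so the element type becomes KV<Boolean, Map<String, String>>, and decode would be called wherever the side input is read. If the exact original types matter, storing the raw JSON text of each value instead of String.valueOf output would avoid the ambiguity noted above.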