My Kafka topic receives data in this format (coming from collectd):
[{"values":[100.000080140372],"dstypes":["derive"],"dsnames":["value"],"time":1529970061.145,"interval":10.000,"host":"k5.orch","plugin":"cpu","plugin_instance":"23","type":"cpu","type_instance":"idle","meta":{"network:received":true}}]
Each record is a mix of arrays, numbers, strings and a nested object, and the whole thing is wrapped in a JSON array. As a result I'm having a heck of a time using KSQL to do anything with this data.
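To make the shape concrete, here is a small Python sketch that parses the sample message above and reports the type of the top-level value and of each field. The helper name describe_shape is just something I made up for illustration; it assumes the payload is a non-empty JSON array of objects, as in the sample.

```python
import json

# Sample message copied verbatim from the topic output above.
raw = '[{"values":[100.000080140372],"dstypes":["derive"],"dsnames":["value"],"time":1529970061.145,"interval":10.000,"host":"k5.orch","plugin":"cpu","plugin_instance":"23","type":"cpu","type_instance":"idle","meta":{"network:received":true}}]'


def describe_shape(payload: str) -> dict:
    """Return the Python type name of each field in the first record.

    Assumes the payload is a non-empty JSON array of objects.
    """
    records = json.loads(payload)
    return {key: type(value).__name__ for key, value in records[0].items()}


# The top-level value is a list (a JSON array), not an object.
top_level = type(json.loads(raw)).__name__
shape = describe_shape(raw)

if __name__ == "__main__":
    print("top level:", top_level)
    for field, type_name in shape.items():
        print(f"{field}: {type_name}")
```

The point is that the record I care about only appears one level down, inside the array.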
When I create a 'default' stream as
create stream cd_temp with (kafka_topic='ctd_test', value_format='json');
I get this result:
ksql> describe cd_temp;
Field | Type
-------------------------------------
ROWTIME | BIGINT (system)
ROWKEY | VARCHAR(STRING) (system)
-------------------------------------
Any SELECT returns only ROWTIME and an 8-digit hex value for ROWKEY.
I've spent some time trying to extract the JSON fields, to no avail. What concerns me is this:
ksql> print 'ctd_test' from beginning;
Format:JSON
com.fasterxml.jackson.databind.node.ArrayNode cannot be cast to com.fasterxml.jackson.databind.node.ObjectNode
Is it possible that this topic can't be used in KSQL? Is there a technique for unpacking the outer array to get to the interesting bits inside?
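To show what I mean by "unpacking", here is a minimal Python sketch of the kind of step I imagine sitting between collectd's topic and KSQL. It only covers the string handling and deliberately leaves out the Kafka consumer and producer. The function name unpack_outer_array is hypothetical, and I'm assuming each message is either a JSON array of objects or a single JSON object.

```python
import json
from typing import List


def unpack_outer_array(payload: str) -> List[str]:
    """Split one message into one JSON object string per record.

    Assumption: the payload is either a JSON array of objects
    (what collectd sends here) or a single JSON object.
    """
    parsed = json.loads(payload)

    # A bare object is passed through unchanged as a single record.
    if isinstance(parsed, dict):
        return [json.dumps(parsed)]

    if not isinstance(parsed, list):
        raise ValueError("expected a JSON array or object")

    # Each array element becomes its own message, so the top-level
    # value of every output message is an object, not an array.
    return [json.dumps(item) for item in parsed]


if __name__ == "__main__":
    sample = '[{"host":"k5.orch","plugin":"cpu","values":[100.000080140372]}]'
    for record in unpack_outer_array(sample):
        print(record)
```

If each resulting string were written to a second topic, every message there would have a JSON object at the top level, which is the shape the ObjectNode cast in the error above seems to expect. I'd prefer to avoid an extra hop like this if KSQL can handle the array itself.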