
I have a JSON message like this:

{"destination.port":"443","network.packets":"4464","event.end":"2019-07-19T07:47:22.000Z","source.address":"1.2.2.3","message":"OK","server.address":"ip-1-2-2-3.ec2.internal","event.action":"ACCEPT","event.module":"S3bucket","source.port":"56448","network.protocol":"6","cloud.account.id":"512889038796","event.type":"AWS_VPC_log","organization.id":"DeloitteFusion","destination.address":"1.2.2.3","network.bytes":"178584","event.start":"2019-07-19T07:46:22.000Z","event.kind":"2","host.id":"eni-0c5e3a6282a912997","timestamp":"2019-07-19T07:51:52.584Z","srckey_val":"167772160_184549375","srckey_rev":"15072019_1541"}

I'm creating a KSQL stream as:

create stream vpc_log ("destination.port" integer, "network.packets" integer, "event.end" varchar, "source.address" varchar, message varchar, "server.address" varchar, "event.action" varchar, "event.module" varchar, "source.port" integer, "network.protocol" integer, "cloud.account.id" bigint, "event.type" varchar, "organization.id" varchar, "destination.address" varchar, "network.bytes" integer, "event.start" varchar, "event.kind" integer, "host.id" varchar, timestamp varchar, srckey_val varchar, srckey_rev varchar) WITH (KAFKA_TOPIC='client_data_parsed', VALUE_FORMAT='JSON');

Running select * from vpc_log; threw the following error: Caused by: Cannot create field because of field name duplication address. (Presumably source.address, destination.address, and server.address all collide on the trailing address part.)

So I modified the stream definition to rename the colliding fields:

create stream vpc_log ("destination.port2" integer, "network.packets" integer, "event.end" varchar, "source.addres" varchar, message varchar, "server.adress" varchar, "event.action" varchar, "event.module" varchar, "source.port1" integer, "network.protocol" integer, "cloud.account.id2" bigint, "event.type" varchar, "organization.id" varchar, "destination.adres" varchar, "network.bytes" integer, "event.start" varchar, "event.kind" integer, "host.id1" varchar, timestamp varchar, srckey_val varchar, srckey_rev varchar) WITH (KAFKA_TOPIC='client_data_parsed', VALUE_FORMAT='JSON');

The output of select * is:

1563522879847 | null | null | null | null | null | OK | null | null | null | null | null | null | null | null | null | null | null | null | null | 2019-07-19T07:51:52.584Z | 167772160_184549375 | 15072019_1541

The values for all dotted (.) fields/keys are null. I'm referring to this link, where no solution is provided. Help me understand.

I also tried escaping the dot:

create stream vpc_log ("DESTINATION\.PORT1" INTEGER, "NETWORK\.PACKETS" INTEGER, "EVENT\.END" VARCHAR, "SOURCE\.ADDRESS1" VARCHAR, MESSAGE VARCHAR, "SERVER\.ADDRESS2" VARCHAR, "EVENT\.ACTION" VARCHAR, "EVENT\.MODULE" VARCHAR, "SOURCE\.PORT2" INTEGER,"NETWORK\.PROTOCOL" INTEGER,"CLOUD\.ACCOUNT\.ID" BIGINT,"EVENT\.TYPE" VARCHAR,"ORGANIZATION\.ID1" VARCHAR,"DESTINATION\.ADDRESS3" VARCHAR,"NETWORK\.BYTES" INTEGER,"EVENT\.START" VARCHAR,"EVENT\.KIND" INTEGER,"HOST\.ID2" VARCHAR,TIMESTAMP VARCHAR,SRCKEY_VAL VARCHAR,SRCKEY_REV VARCHAR)WITH (KAFKA_TOPIC='client_data_parsed',VALUE_FORMAT='JSON');

Still the same issue.

I could use keys without any dots as a quick workaround (sketched below), but I would like to understand how KSQL handles dots in field names.
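For completeness, the no-dot workaround I have in mind looks roughly like the sketch below. It assumes the producer is changed to publish the same data with underscores instead of dots in the keys, to a hypothetical topic named client_data_parsed_nodots (both the renaming step and the topic name are my own placeholders, not something KSQL does for you):

-- Sketch of the no-dot workaround: the JSON keys are assumed to be
-- renamed upstream (e.g. destination.port -> destination_port) and
-- written to a hypothetical topic client_data_parsed_nodots.
create stream vpc_log_nodots (
  destination_port integer, network_packets integer, event_end varchar,
  source_address varchar, message varchar, server_address varchar,
  event_action varchar, event_module varchar, source_port integer,
  network_protocol integer, cloud_account_id bigint, event_type varchar,
  organization_id varchar, destination_address varchar, network_bytes integer,
  event_start varchar, event_kind integer, host_id varchar,
  timestamp varchar, srckey_val varchar, srckey_rev varchar
) WITH (KAFKA_TOPIC='client_data_parsed_nodots', VALUE_FORMAT='JSON');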

That issue was closed by suggesting a workaround: ensure the JSON fields don't have periods... The issue wasn't actually fixed. – OneCricketeer

1 Answer


I got this answered by the KSQL developers in the Slack community; it might help someone. KSQL doesn't officially support dots in field names, but a workaround is to escape the period. This works with KSQL v5.3.0, installed from here.

create stream vpc_log ("DESTINATION\.PORT" INTEGER,"NETWORK\.PACKETS" INTEGER,"EVENT\.END" VARCHAR,"SOURCE\.ADDRESS" VARCHAR,MESSAGE VARCHAR,"SERVER\.ADDRESS" VARCHAR,"EVENT\.ACTION" VARCHAR,"EVENT\.MODULE" VARCHAR,"SOURCE\.PORT" INTEGER,"NETWORK\.PROTOCOL" INTEGER,"CLOUD\.ACCOUNT\.ID" BIGINT,"EVENT\.TYPE" VARCHAR,"ORGANIZATION\.ID" VARCHAR,"DESTINATION\.ADDRESS" VARCHAR,"NETWORK\.BYTES" INTEGER,"EVENT\.START" VARCHAR,"EVENT\.KIND" INTEGER,"HOST\.ID" VARCHAR,TIMESTAMP VARCHAR,SRCKEY_VAL VARCHAR,SRCKEY_REV VARCHAR) WITH (KAFKA_TOPIC='client_data_parsed',

VALUE_FORMAT='JSON');
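With the stream created that way, the escaped identifiers should be referenced with the same quoting-plus-backslash form on the query side. A minimal sketch, listing only a few of the columns:

-- Query the escaped columns using the same "NAME\.PART" form as in the CREATE STREAM.
select "SOURCE\.ADDRESS", "DESTINATION\.PORT", "EVENT\.ACTION" from vpc_log;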

Thanks Robin Moffatt.