Has anyone tried this code before?

XmlSource<String> source = XmlSource.<String>from("gs://balajee_test/sample_3.xml")
              .withRootElement("book")
              .withRecordElement("author")
              .withRecordElement("title")
              .withRecordElement("genre")
              .withRecordElement("price")
              .withRecordElement("description")
              .withRecordClass(XMLFormatter.class);

PCollection<String> output = p.apply(Read.from(source));

https://beam.apache.org/documentation/sdks/javadoc/0.4.0/org/apache/beam/sdk/io/XmlSource.html

org.apache.beam.sdk.io.xml.XmlSource

I believe I'm using the correct 'XmlSource' class, but the method 'from("gs://balajee_test/sample_3.xml")' does not resolve and I get a compilation error. The error message is:

The method from(String) is undefined for the type XmlSource

This question may be too silly, but I really need to resolve it so that I can read an XML file stored in a GCS bucket.

What version of the SDK are you using? Those docs are for a very old version (0.4.0). The current version is 2.1. If you are using a newer version of the SDK, you want this: beam.apache.org/documentation/sdks/javadoc/2.1.0. Also, you have two 'from' calls with the same name. Is that intended? That could be the issue if you are on an old SDK. - Lara Schmidt
I'm sorry. Those two 'from' calls were not intended. While posting this query, I copied the same 'from' statement twice by mistake. I'm using version 2.0. Is this the issue? - Balajee Venkatesh
Yeah, the API for the XML source changed in version 2.0. I'd try the documentation for that specific version: beam.apache.org/documentation/sdks/javadoc/2.0.0. You can find the XML source by looking for org.apache.beam.sdk.io.xml. - Lara Schmidt

1 Answer

From the comments, it seems the SDK in use is 2.0, which has a new way to define a read from XML. Check the documentation for that version to see how to set up the read.

SDK IO documentation (for 2.0.0) can be found here: beam.apache.org/documentation/sdks/javadoc/2.0.0
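
One more thing that may help when moving to the 2.0 API: in the org.apache.beam.sdk.io.xml package the read is configured through XmlIO.<T>read() with from, withRootElement, withRecordElement and withRecordClass, and as far as I can tell it expects a single root element (the outermost tag of the file) and a single record element (the tag that repeats once per record). Child tags such as author or title would then be fields of the record class rather than separate withRecordElement calls. The sketch below is not Beam code. It uses only the JDK's StAX parser to show what "root element" and "record element" mean. The sample XML, the tag names "catalog" and "book", and the class name RecordElementSketch are all assumptions made for illustration, since the layout of sample_3.xml is not shown in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Beam: illustrates the root/record model only.
public class RecordElementSketch {

    // Collects the <title> text of every record element found inside the root element.
    static List<String> titles(String xml, String rootElement, String recordElement) {
        List<String> titles = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            boolean inRoot = false;
            boolean inRecord = false;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals(rootElement)) {
                        inRoot = true;                        // outermost tag of the file
                    } else if (inRoot && name.equals(recordElement)) {
                        inRecord = true;                      // one occurrence per record
                    } else if (inRecord && name.equals("title")) {
                        titles.add(reader.getElementText());  // a field of the record
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals(recordElement)) {
                        inRecord = false;
                    } else if (name.equals(rootElement)) {
                        inRoot = false;
                    }
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        // Assumed layout: <catalog> is the root, <book> repeats once per record.
        String xml = "<catalog>"
                + "<book><author>A. Author</author><title>First</title></book>"
                + "<book><author>B. Author</author><title>Second</title></book>"
                + "</catalog>";
        System.out.println(titles(xml, "catalog", "book"));
    }
}
```

If your file follows a layout like this, the Beam read would name the outer tag once in withRootElement and the repeating tag once in withRecordElement, with the record class mapping the child tags. Please check this against the 2.0.0 Javadoc linked above before relying on it.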