I cannot manage to read multiline JSON input files in an Apache Beam pipeline (written in Python).
I understand that ReadFromText with a JSON coder reads JSONL files (one JSON object per line), but how do I handle files in the following format?
[{
"name": "name1",
"value": "val1"
},
{
"name": "name2",
"value": "val2"
}]
I came across the FileSystem module, whose open() function lets me read the entire file rather than line by line, but according to the documentation it returns a file handle.
What should I do with that handle afterwards? This may not be the right approach, so any ideas are welcome.
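
For reference, here is a rough sketch of what I imagine doing with the handle, assuming it behaves like a regular binary file object: parse the whole array with the standard json module and emit one element per record. The helper name parse_json_array is hypothetical, the io.BytesIO object only stands in for the real handle, and the Beam wiring mentioned in the comments is an assumption I have not verified.

```python
import io
import json


def parse_json_array(handle):
    """Yield each element of a JSON array read from an open file handle.

    Hypothetical helper: `handle` is assumed to be a file-like object,
    such as the one returned by open().
    """
    # json.load reads the whole stream at once, so the file must fit in memory.
    records = json.load(handle)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    for record in records:
        yield record


# Stand-in for the real file handle, using the sample data from above.
# In a pipeline, a function that opens the file and calls this helper could
# presumably be passed to a FlatMap step (assumption, not tested with Beam).
sample = io.BytesIO(
    b'[{"name": "name1", "value": "val1"}, {"name": "name2", "value": "val2"}]'
)
records = list(parse_json_array(sample))
```

If something like this is reasonable, each file would yield one element per JSON object, which is the same shape a JSONL reader would produce.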