In the Apache Beam Python SDK, I often see the '>>' operator in pipeline code, for example:
https://beam.apache.org/documentation/programming-guide/#pipeline-io
lines = p | 'ReadFromText' >> beam.io.ReadFromText('path/to/input-*.csv')
What does this mean?
>> is the bitwise right shift operator in Python. When the left operand does not implement it (a str does not), Python falls back to the reflected dunder (double underscore) method __rrshift__() on the right operand.
The Apache Beam Python SDK defines __rrshift__() on the PTransform class so that a name can be attached to a transform. It's just special syntax. In your example, "ReadFromText" is the name of the transform. Because >> binds more tightly than |, the transform is named first and then applied to p with |.
Reference: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/ptransform.py#L445
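To see the mechanism without Beam installed, here is a minimal sketch using toy classes. Transform and ReadLines are hypothetical stand-ins I made up for illustration, not Beam's actual implementation; the real PTransform is more involved.

```python
class Transform:
    """Toy stand-in for a Beam PTransform (hypothetical, not Beam's code)."""

    def __init__(self, label=None):
        # Default the label to the class name when none is given.
        self.label = label or type(self).__name__

    def __rrshift__(self, label):
        # Called for `label >> transform`, because str does not
        # implement `>>` and Python falls back to the right operand.
        self.label = label
        return self

    def __ror__(self, left):
        # Called for `left | transform` when the left operand
        # does not implement `|` itself.
        return f"{left} -> {self.label}"


class ReadLines(Transform):
    pass


# `>>` attaches the name to the transform.
named = "ReadFromText" >> ReadLines()
print(named.label)

# `>>` binds more tightly than `|`, so this parses as
# "pipeline" | ("ReadFromText" >> ReadLines()).
result = "pipeline" | "ReadFromText" >> ReadLines()
print(result)
```

The toy version mutates the transform in place to stay short. The point is only that the string on the left of >> ends up as the transform's name before | applies it.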