0
votes

I have a big bag of tuples, each containing a constant but unknown number of integers (over 200). Is there any way to sum the corresponding elements of these tuples?

For example SUM_TUPLES({(1, 0, 1), (2, 1, 0)}) should return (3, 1, 1).

I wrote my UDF in Python, but since the bag and the tuples are really huge, I get a GC limit exceeded error.

def SUM_TUPLES(tuple_bag):
    if not tuple_bag:
        return []
    result = len(iter(tuple_bag).next())*[0]
    for tup in tuple_bag:
        for i, ele in enumerate(tup):
            result[i] += ele
    return result
1
Does your data contain some other column apart from the bags? Or do you have only one column with the bags you want to add? - Balduz
The data has 4 columns: the first 3 are chararrays and the last one is a tuple. I group the data by the first 3 columns and want to sum the values of the grouped tuples (the group operation puts them in a bag). - Piotr Dabkowski

1 Answer

1
votes

Already answered here

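# map(operator.add, *tuple_bag) only works for exactly two tuples, because
# operator.add takes two arguments; zip + sum handles any number of tuples.
tuple_bag = (1, 0, 1), (2, 1, 0)
tuple(map(sum, zip(*tuple_bag)))  # (3, 1, 1)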
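
Note that unpacking with `*tuple_bag` passes every tuple in the bag as a separate argument, so the whole bag has to be in memory at once. For a very large bag, a running total that reads one tuple at a time may be gentler on memory. Below is a minimal sketch of that idea in plain Python. The name `sum_tuples` is made up for illustration, the built-in `next()` assumes Python 2.6 or later, and I have not checked whether this avoids the GC error inside a Pig UDF.

```python
def sum_tuples(tuple_bag):
    """Element-wise sum over any iterable of equal-length tuples."""
    it = iter(tuple_bag)
    try:
        # The first tuple seeds the running totals.
        totals = list(next(it))
    except StopIteration:
        # Empty bag: nothing to sum.
        return ()
    # Consume the remaining tuples one at a time.
    for tup in it:
        for i, value in enumerate(tup):
            totals[i] += value
    return tuple(totals)


print(sum_tuples([(1, 0, 1), (2, 1, 0)]))        # (3, 1, 1)
print(sum_tuples((n, 2 * n) for n in range(4)))  # (6, 12)
```

Only one tuple plus the list of totals is held at a time, and it works on generators as well as lists.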