0
votes

I am new to pentaho data integration. How can i concatenate distinct string values ?

bse_id values
100     A1
100     A1
100     A2
150     A1
150     B1
150     C1
150     C1

putput should be

bse_id values
100    A1,A2
150    A1,B1,C1

In Mysql, i can use select bse_id,group_concat(distinct values) from table group by 1;

In SPOON, i have tried group_by step and memory group_by both are resulting in duplicate values. I'm getting output as

 bse_id values
    100    A1,A1,A2
    150    A1,B1,C1,C1

Please help me in removing the duplicates.

2

2 Answers

1
votes

You need to have 2 Group by Steps:

Try the following three steps after input:

  1. Step: Sort by BOTH - 'bsi_id' and 'values'
  2. Step: Group by BOTH - 'bsi_id' and 'values' (no aggregation here)
  3. Step: Group by 'bsi_id'; aggregate 'values' with Type "Concatenate strings separated by ,"

Output is:

bse_id; values
100; A1, A2
150; A1, B1, C1

This should work fine. Bye

0
votes

You can do this easily with a Group by step. Be sure the input to the step is sorted on the bse_id field, then select values as the subject of an aggregate field and set the type to 'Concatenate strings separated by,'. That should give you exactly what you want.