5
votes

Brand new to Pentaho (and a newbie SO poster so look out!)

I'd like to use Kettle/PDI to transform data coming in from an RDBMS from this (for example):

Question1   Question2   Question3   Question4
1/1/13      123.00      Test 1      Test 1.1
1/2/13      124.00      Test 2      Test 1.2
1/3/13      125.00      Test 3      Test 1.3
1/4/13      126.00      Test 4      Test 1.4
1/5/13      127.00      Test 5      Test 1.5

to this:

QuestionName AnswerDate AnswerNumber AnswerString
Question1    1/1/13     
Question1    1/2/13     
Question1    1/3/13     
Question1    1/4/13     
Question1    1/5/13     
Question2               123.00
Question2               124.00
Question2               125.00
Question2               126.00
Question2               127.00
Question3                             Test 1
Question3                             Test 2
Question3                             Test 3
Question3                             Test 4
Question3                             Test 5
Question4                             Test 1.1
Question4                             Test 1.2
Question4                             Test 1.3
Question4                             Test 1.4
Question4                             Test 1.5

As hopefully reflected above, there should be an "Answer<FieldDataType>" column for each available datatype in the original table. Is this possible with PDI? If so, can someone provide me with some pointers? I've tried using the Row Normaliser step to pivot the table and assign the new fields, but am probably not doing things quite right (or there is a bug [PDI 4.4]).

Check the samples for that step. It does work, but every time I have to use it, I have to check the sample because for some reason configuring it isn't intuitive! – Codek
Are you comfortable with (and allowed to use) a JavaScript step in PDI? – KLDavenport
Kevin D, yes to both. I guess with the JS step I could filter by type and add fields based on the type? If you have an example, I'd greatly appreciate it. I'm not opposed to using JS, but I wasn't sure if there was a more "native" way to do it. – jriggins
I came across this issue a while ago; I wonder if you are hitting the same thing. If so, vote for the JIRA! jira.pentaho.com/browse/PDI-9467 – Codek
I think that I am. At any rate, I voted! Thanks for the heads-up. – jriggins

3 Answers

0
votes

I accomplished this by using a scripting step to write one output row per column in the input row, containing the column name and its value. From there, I went to a Regex Evaluation step and used multiple capture groups to map the value types to the type-specific columns in the stream. I messed around with the Row Normaliser for a while, but couldn't get it to do exactly what I wanted. The performance loss of using a scripting step was negligible.
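For reference, a minimal sketch of that scripting step, assuming PDI's Modified Java Script Value step; the output field names (QuestionName, Answer) are illustrative and would be declared in the step's Fields tab:

trans_Status = SKIP_TRANSFORMATION;  // suppress the original wide row

// Emit one (column name, value) pair per input column.
var inputMeta = getInputRowMeta();
for (var i = 0; i < inputMeta.size(); i++) {
    var outRow = createRowCopy(2);                    // two output fields
    outRow[0] = inputMeta.getValueMeta(i).getName();  // column name
    outRow[1] = (row[i] == null) ? null : '' + row[i];  // value, stringified for the regex step
    putRow(outRow);
}

A downstream Regex Evaluation step can then match the stringified value (date pattern, number pattern, or anything else) with capture groups and populate the type-specific Answer columns.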

0
votes

Use a Modified Java Script Value step:

trans_Status = SKIP_TRANSFORMATION;  // suppress the original wide row

// Output fields (declare them in the Fields tab):
// [0] QuestionName, [1] AnswerDate, [2] AnswerNumber, [3] AnswerString
var row1 = createRowCopy(4);
var row2 = createRowCopy(4);
var row3 = createRowCopy(4);
var row4 = createRowCopy(4);

row1[0] = 'Question1';  row1[1] = Question1;  // date -> AnswerDate
row2[0] = 'Question2';  row2[2] = Question2;  // number -> AnswerNumber
row3[0] = 'Question3';  row3[3] = Question3;  // string -> AnswerString
row4[0] = 'Question4';  row4[3] = Question4;  // string -> AnswerString

// createRowCopy() copies the input values, so blank out the unused slots.
row1[2] = row1[3] = null;
row2[1] = row2[3] = null;
row3[1] = row3[2] = null;
row4[1] = row4[2] = null;

putRow(row1);
putRow(row2);
putRow(row3);
putRow(row4);

Don't forget to add the four output fields (QuestionName, AnswerDate, AnswerNumber, AnswerString) in the step's Fields tab.

0
votes

The Row Normalizer is very sensitive to the order in which you specify the de-normalization mappings.

I had a sparse matrix input and discovered the following rules:

  1. The Type values must be grouped together, like with like
  2. The "new field" assignments must appear in the same order within each Type grouping
  3. The Type groups must be arranged with the most populous first and the least populous last

Thus, in the example given, specifying

 Fieldname  Type   new field
 Question1  date   AnswerDate
 Question2  number AnswerNumber
 Question3  string AnswerString
 Question4  string AnswerString

will work better than the following, which splits the two string mappings apart and so violates rule 1:

 Fieldname  Type   new field
 Question1  date   AnswerDate
 Question3  string AnswerString
 Question2  number AnswerNumber
 Question4  string AnswerString