Pentaho - Having multiple Copy rows to result results in Get rows from result empty

Question

I'm trying to process some data and store it in a datawarehouse. For doing it, I wanted to store dimensions in one transformation and fact (only have one) in another transformation. So I can use a job for execute the first one, copy rows to result and get them into the second transformation.

In the first transformation, I read some Excel file and separate this data into some streams. It is data from a baptism, so I have one stream for the person, another one for parents, another one for sponsors, and so on... At the end of each stream, I insert data into database and return PK autogenerated (it is an id autoincrement).

In the second one, I only have Get rows from result and want to set them into a txt file (just for see it is been done correctly). The problem is that the file is created but it is empty. I suppose that if I let fields in Get rows from result empty, it gets all fields.

What am I doing wrong?

At the end what I want is to have one Copy rows to result at the end of each stream in the first transformation and get all this data in the second one.

enter image description here

In "Insert Pare Padrina" I return id_pare_padrina which is autogenerated, and the same with "Insert Mare Padrina" (I have more streams which I also have to include them into result). This transformation is not executed per row because I need values of other rows.

Thank you!

Lyd, did you get solution to your problem? Would you mind posting same? Thanks — Akshay Lokur

Rishu Shrivastava Rishu Shrivastava · Accepted Answer · 2015-04-14T18:00:14

In order to pass the data from the first transformation to the second transformation, you need to set certain parameters like:

1. First of all, in the transformation settings of the second transformation (at the Job Level), check on the items as image below:

enter image description here

Copy Previous results to parameters will ensure that all the results/data in the "Copy Rows to Result" step is getting properly passed to the next level.

Execute for every input row : will execute the second transformation for every rows in the first transformation file. This is optional based on your requirement.

2. In the same transformation settings, define the "Parameters" in the Parameters tabs. Check the image below:

enter image description here

Here, NAME is the parameter i have defined. So when you are using the "Get rows from result", you can define these parameter names.

3. Instead of using "Get rows from result", you can alternately use "Get Variables" step to fetch all the variables coming from the previous step. All you need to do is to define the parameter names inside the ktr file (CTRL + T). (Actually i have practically implemented in that fashion and it worked for me.)

4. Since "copy rows to result" step uses heap memory, defining multiple instances of this step might exhaust the memory space quickly and your code might fall in trouble. Ideally use a single instance of this step.

But if your data interation is only one row, best option would be to use "set variables" step.

I assume you might have missed some of these sections in the job.

You can read more on copy rows to result in here.

Hope it helps :)

Pentaho - Having multiple Copy rows to result results in Get rows from result empty

1 Answers