0
votes

I have a transformation in Pentaho Kettle called Test.

This ETL process should load three different tables of a single database, each of which has its source in a different table of another database.

To do this I use three Table input steps. Each one connects to a Value Mapper step, then a Select values step, then a Data Validator, an Add sequence step, and finally a Table output.

In summary, I have six steps per table load. While editing the final steps I found something I would like to fix: the fields from the previous table loads are carried over.

For example, the table A load has the field bank_id. That field does not exist in the second table, but in the Table output step of the second load I can still select it, even though I do not want it.

Is there any option to hide the fields from the previous loads? That way I would avoid easy mistakes, especially when the tables have fields with the same name.

Thank you

EDIT

[Screenshot of the transformation]

2
If you have 3 separate streams, you should not see fields from the first table in the second output step. Did you combine the streams somewhere and/or copy the first table output to create the second? - Cyrus
Yes, I duplicated the steps, but then changed what I needed, such as the table and the select statement. - mrc
It's very strange then that you would see fields from another stream. Can you include a screenshot of the transformation? - Cyrus
Actually, it isn't a good idea to use several streams in the same transformation. Data between the streams starts to get mixed up. Usually the effect is barely noticeable, but with huge data volumes it starts to appear. I can confirm this behavior, and I follow a simple rule: a single stream per transformation. - simar
@Cyrus I have added the screenshot. Thank you. - mrc

2 Answers

2
votes

The screenshot clarifies the situation immensely, so now the answer is simple:

Delete the diagonal hops (arrows) between the rows.

Transformations in PDI don't have a single starting or ending point, so you don't need to connect all the steps in a single line. Having three separate streams is just fine.
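To see why an extra hop makes unwanted fields show up, here is a toy Python model, not PDI code. It assumes that a step can see every field produced by any step upstream of it along the hops. The step names (`a_in`, `a_out`, `b_in`, `b_out`), the field lists, and the `visible_fields` helper are all hypothetical, made up for illustration.

```python
# Toy model of how hops decide which fields a step can see.
# Step names and fields are hypothetical; this is not PDI code.

def visible_fields(step, hops, produced):
    """Collect the fields produced by `step` and by everything upstream of it."""
    seen, stack, fields = set(), [step], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        fields.update(produced.get(current, []))
        # Follow hops backwards: every source step feeding `current`.
        stack.extend(src for src, dst in hops if dst == current)
    return sorted(fields)

# Fields introduced by each input step (assumed for the example).
produced = {
    "a_in": ["bank_id", "name"],
    "b_in": ["branch_id", "name"],
}

# Two independent streams: a_in -> a_out and b_in -> b_out.
separate_hops = [("a_in", "a_out"), ("b_in", "b_out")]
# Same streams plus one diagonal hop chaining stream A into stream B.
chained_hops = separate_hops + [("a_out", "b_in")]

fields_separate = visible_fields("b_out", separate_hops, produced)
fields_chained = visible_fields("b_out", chained_hops, produced)

print(fields_separate)
print(fields_chained)
```

In this model `bank_id` only appears at `b_out` when the diagonal hop exists, which matches the symptom in the question.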

All steps in a transformation start in parallel, then wait and process rows as they come in (or in the case of input steps, start reading data and generating rows into their output hop). That means your three streams will execute in parallel following their own hops from input to output.
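The parallel behaviour can be sketched in plain Python. This is only a conceptual model under stated assumptions, not how PDI is implemented: each step is a thread, each hop is a queue, and the names `input_step`, `output_step`, `run_streams` and the sample rows are invented for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a stream

def input_step(rows, out_hop):
    """Read source rows and push them into the outgoing hop."""
    for row in rows:
        out_hop.put(row)
    out_hop.put(DONE)

def output_step(in_hop, target):
    """Wait for rows on the incoming hop and write them to the target."""
    while True:
        row = in_hop.get()
        if row is DONE:
            break
        target.append(row)

def run_streams(sources):
    """Start every step of every stream at once, each in its own thread."""
    targets = {name: [] for name in sources}
    threads = []
    for name, rows in sources.items():
        hop = queue.Queue()  # the hop between this stream's two steps
        threads.append(threading.Thread(target=input_step, args=(rows, hop)))
        threads.append(threading.Thread(target=output_step, args=(hop, targets[name])))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return targets

loaded = run_streams({
    "table_a": [{"bank_id": 1}, {"bank_id": 2}],
    "table_b": [{"branch_id": 10}],
})
print(loaded)
```

Because each stream has its own hop, rows from `table_a` never reach the output of `table_b`, even though all steps run at the same time.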

0
votes

Add a Select Values step. I often add filter steps to "clean" the flow.