4
votes

I thought this would be an easy task, but since I am new to PDI, I could not find out so far which transform to choose to accomplish the following:

I am using Pentaho Data Integration (former Kettle), Community Edition, to map/copy values from one table ('tasksA') of one database 'A' to another table 'tasksB' in another database B. tasksA has a column 'description' and I want to copy these values to the column 'taskName' in 'tasksB'. Furthermore, I have to copy each value of 'description' several times, since in 'tasksB', there are multiple lines for each value in 'taskName'.

Maybe this would be possible by direct SQL, but I wanted to try whether I can define this more readable with PDI, especially because in the next step I will have to extend it to other tables involved.

So I have to tell which value of 'description' has to be mapped onto which value of 'taskName' and that in every tuple containing this value (well, sounds like a WHERE clause...) in the column 'taskName' it should be replaced.

My first experiments with the 'Table input' and 'Table output' steps did not work when I simply drew a hop between them and modifying the 'database fields' tab of the 'Table output' step, which generated 'drop column' statements in the resulting SQL which is not what I want. I don't want to modify the schema, just copy the values.

Would be great if someone could point me to the right steps/transforms needed, I worked through the first examples from the Pentaho Wiki and have got the 'Pentaho Kettle Solutions' book of Casters et al. but could find out how to do solve this. Many thanks in advance for any help.

2

2 Answers

0
votes

If I got this right, you should use the Table Input connected to a "Insert/Update" step.

On the Insert/Update step you need to inform the keys from tasksA where should be looked up on tasksB. Then define which fields on tasksB should be updated: description (as stream field) -> taskName (as the table field).

Keep in mind that if this key is not found, a row will be inserted on tasksB. If it is not what you plan, you'll need to build something like: Table Input -> Database Lookup -> Filter Rows -> Insert/Update

0
votes

@RFVoltolini has a good answer. Alternatively you could go

Table Input -> Update

And connect the error output to something else like a Text file output.