1
votes

I am using Kettle (Pentaho Data Integration). I need to skip inserting an IdN value from the first table (first database) if it already exists in the second table (second database).

In my Kettle transformation I have a Table Input step, which returns some rows.

I need to select the distinct values of the IdN column from those returned rows and insert them (all distinct ids) into the second table of the second database, so that I avoid duplicating ids.

I tried this code in an "Execute SQL statements" step:

INSERT INTO {SecondTable}
SELECT DISTINCT (IdN) 

But it returned this error:

Invalid column name 'IdN'.

What confuses me is that I have entered IdN under "Field name to be used as argument" in the "Execute SQL" step of my transformation, and IdN is a valid column name returned by the previous Table Input step.

2
Where is the FROM part of your SELECT? - JotaPardo
I'm getting the data from the previous "Table Input" step. I don't know how I could specify "from the rows returned by the previous SQL". - luka032
I think you should still use the table in the FROM - JotaPardo
Yeah, I would, but there's a problem: I'm limited to one connection, and I have two different databases that I need to transfer data between. So if I select FROM the first one, I lose the option to INSERT INTO the second one. - luka032
Have you tried the Insert/Update step? - Brian.D.Myers

2 Answers

1
votes

I succeeded using the "Unique rows" step in Kettle. It does exactly what I wanted; you just specify the columns whose values should be checked for duplicates.
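
Note that "Unique rows" only removes duplicates within the incoming stream (and, as far as I know, it expects the rows to be sorted on the compared fields). If ids that are already present in the second table also have to be skipped, a per-row statement in the Execute SQL step from the question might look like the sketch below. It assumes the step is set to execute once per input row, that each `?` is filled from the argument fields in the order they are listed (so IdN would be listed twice), and that `SecondTable` has a column named `IdN`; none of this is confirmed by the thread.

    -- Sketch only: SecondTable and its IdN column are assumed names.
    -- Each ? is expected to be replaced by the IdN value of the current row.
    INSERT INTO SecondTable (IdN)
    SELECT ?
    WHERE NOT EXISTS (
        SELECT 1 FROM SecondTable WHERE IdN = ?  -- skip ids already stored
    );

If IdN is a string and the step substitutes values as plain text rather than bind parameters, the placeholders may need to be wrapped in quotes.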

0
votes

To be sure I am understanding: you are trying to write a query that selects the IdN column from one database into a table in another database. If so, the query should look like this:

    INSERT INTO db1.dbo.tableName1 (colname)
    SELECT DISTINCT colname FROM db2.dbo.tablename2
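
The query above removes duplicates within the source but would still re-insert ids that the target already holds. A possible variant that also skips existing values is sketched here. It keeps the placeholder names from the query above and assumes both databases live on the same SQL Server instance, so a single connection can reach both through three-part names.

    -- Sketch only: db1, db2, tableName1, tablename2 and colname are placeholders.
    INSERT INTO db1.dbo.tableName1 (colname)
    SELECT DISTINCT src.colname
    FROM db2.dbo.tablename2 AS src
    WHERE NOT EXISTS (
        SELECT 1
        FROM db1.dbo.tableName1 AS dst
        WHERE dst.colname = src.colname  -- value is already in the target
    );

If the two databases are on different servers, this would not work as written and the rows would have to flow through Kettle steps instead.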