0
votes

I am using Pentaho DI to insert data into a fact table. The source table I am populating it from contains 10,000 records and grows daily.

Say the source table contains 10,000 records and 200 new records are added, so I need to run the ktr. When I run the ktr file, it truncates all 10,000 rows from the fact table and starts inserting all 10,200 records again.

To avoid this, I unchecked the truncate option in the Table Output step, made one key unique in the fact table, and checked the "Ignore insert errors" option. Now it works and inserts only the 200 new records, but it takes the same execution time.

I also tried the Stream Lookup step in the ktr, but the execution time did not change.

Can anyone please help me solve this problem?

Thanks in advance.

1
How much time are we talking about? 10K rows doesn't sound like that much. And have you tried the Merge Rows Diff step? - Brian.D.Myers
I have 250,000 records with 24 fields in the table, and it takes almost 8 hours to complete the task. I tried indexing earlier and that reduced the time somewhat, but I have not yet tried the Merge Rows Diff step. - Sathishkumar S
8 rows per second sounds ridiculously slow to me. What kind of processing are you doing on those rows? Or are they just insanely wide? In the meantime, I'd check out the Combination Lookup Update step, and the Merge Rows Diff step if you need to capture all of the inserts, updates, and deletes. See if those will work for you. - Brian.D.Myers
Brian.D.Myers, thank you so much, the Merge Rows step really helped me solve the problem. In my case it is not only insertion: I also use 8 stages of lookups. I have now added the Merge Rows step with a Switch/Case step before the lookup steps, and amazingly the work is done within 6 minutes. - Sathishkumar S
So it seems that's the answer... - Brian.D.Myers

1 Answer

0
votes

If you need to capture all of Inserts, Updates, and Deletes, the Merge Rows Diff step followed by a Synchronize after Merge step will do this, and typically will do it very quickly.
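
To make the idea concrete, here is a rough Python sketch of the kind of comparison a Merge Rows Diff step performs. It is only an illustration, not Pentaho's actual code: the function name `merge_rows_diff`, the sample rows, and the field names `id` and `amount` are all made up for this example. It assumes both streams are already sorted on the key, and tags each row as identical, changed, new, or deleted.

```python
from typing import Any, Dict, Iterator, List, Tuple

Row = Dict[str, Any]


def merge_rows_diff(reference: List[Row], compare: List[Row],
                    key: str, value_fields: List[str]) -> Iterator[Tuple[str, Row]]:
    """Yield (flag, row) pairs. Both inputs must be sorted by `key`.

    Hypothetical helper for illustration only: `reference` plays the role
    of the existing fact table, `compare` the current source rows.
    """
    i = j = 0
    while i < len(reference) and j < len(compare):
        ref, new = reference[i], compare[j]
        if ref[key] == new[key]:
            # Same key on both sides: compare the value fields.
            same = all(ref[f] == new[f] for f in value_fields)
            yield ("identical" if same else "changed", new)
            i += 1
            j += 1
        elif ref[key] < new[key]:
            # Key exists only in the reference stream.
            yield ("deleted", ref)
            i += 1
        else:
            # Key exists only in the compare stream.
            yield ("new", new)
            j += 1
    # Whatever is left on either side has no counterpart.
    for ref in reference[i:]:
        yield ("deleted", ref)
    for new in compare[j:]:
        yield ("new", new)


# Made-up sample data: existing fact rows vs. current source rows.
fact_rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}, {"id": 3, "amount": 30}]
source_rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}, {"id": 4, "amount": 40}]

flagged = list(merge_rows_diff(fact_rows, source_rows, "id", ["amount"]))

# Drop identical rows before any expensive work (the filtering role that
# a Switch/Case step on the flag field would play).
to_process = [(flag, row) for flag, row in flagged if flag != "identical"]
print(to_process)
```

The point of the sketch is that unchanged rows are dropped before any lookups or database writes happen, so the expensive steps only see the handful of rows that actually differ.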