I have an ETL requirement:
I need to fetch around 20,000 records from a table and process each record separately. (The processing of each record involves a couple of steps, such as creating a table for that record and inserting some data into it.) As a prototype I implemented this with two jobs (with corresponding transformations). Instead of creating a table, I just created a simple empty file per record. But even this simple case doesn't work smoothly. (When I do create a table for each record, Kettle exits after 5000 records.)
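To make the per-record work concrete, here is a minimal sketch of what each record's processing does, written in Python with sqlite3 purely for illustration (the real job uses Kettle's SQL script step; the table and column names here are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
record_ids = [1, 2, 3]  # stand-in for the ~20,000 ids fetched from the source table

for rid in record_ids:
    table = f"record_{rid}"
    # One new table per record, then seed it with that record's data --
    # this mirrors what the Kettle SQL script step does for each row.
    conn.execute(f"CREATE TABLE {table} (id INTEGER, value TEXT)")
    conn.execute(f"INSERT INTO {table} (id, value) VALUES (?, ?)", (rid, "data"))
conn.commit()

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)  # → ['record_1', 'record_2', 'record_3']
```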
When I run this, Kettle slows down and then hangs after 2000-3000 files; processing does eventually complete after a long time, but Kettle seems to stall at some point. Is my design approach right? When I replace the write-to-file with the actual requirement, creating a new table (through the SQL script step) for each id and inserting data into it, Kettle exits after 5000 records. What do I need to do to make the flow work? Increase the Java memory (Xmx is already at 2 GB)? Is there any other configuration I can change? Or is there another way? Extra time isn't a constraint, but the flow should work.
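For reference, this is how I currently raise the heap. In Kettle 3.2 the JVM options live on the OPT line of the launch scripts (spoon.sh / kitchen.sh); a sketch, assuming a Linux install and that the machine has the RAM for a larger heap:

```shell
# In spoon.sh (or kitchen.sh for command-line runs), edit the OPT line.
# My current setting is -Xmx2048m; bumping it would look like:
OPT="-Xmx4096m -cp $CLASSPATH -Djava.library.path=$LIBPATH"
```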
My initial guess was that, since we are not storing any data, at least the prototype should work smoothly. I am using Kettle 3.2.