1
votes

My question is, how can I run 2 different "Execute SQL script" steps, so that they both run in one same session, not in 2 different?
In my transformation, I need to run one sql script at the very beginning of transformation process, to permit modifications on table. Then, I extract data from one database, manipulate them, and based on results I need to call some procedure (using another sql script) to update status of transformation in. I can't update status, if the first sql is not executed, and permission not granted.

Searches give me only 2 results:

  1. Select "unique connections" in transformations settings. This is not applicable, because it's available only in older versions of kettle.
  2. Select "make the transformation database transactional" - actually same thing as first one (new name of older property in new kettle). Seems like this is not actually what I want, because it doesn't work as I expect.

I'm working with Oracle database and kettle 5.1.0.

UPDATE:

enter image description here

In this picture, I have 2 "Execute SQL script" steps in sequence, blocked by appropriate steps. Seems like this scripts executed in 2 different database connections (sessions), so second one doesn't see the effect of first one.

2
Your picture is no longer available through this link. You can email it to me and I will insert it into your question. Could you explain what does not work as I expect means exactly?Marcus Rickert
I thought after running transformation using block steps, they'll run in sequence and SQL scripts will run in one session. Image is available through link @MarcusRickertNur4I

2 Answers

2
votes

Main point was at "Execute for each row?" option of "Execute SQL statement" step. This option slightly changes the behaviour of the step. Pentaho wiki for this step says "Select this option to execute the SQL for each incoming row. In this case paramters can be used. When this option is unchecked, the SQL statement is executed at the step initialization phase.". Exactly what I needed. But I'm wondering, why developers put 2 different functions into one option?

1
votes

I've gone through the same discovery process myself a while ago. Turns out, what you need is to define a job that runs two transformations. Within a transformations you can't guarantee sequential execution. A downstream step might start running before its "upstream" completed. If your transformation relies on external effects (such as modifications to an external SQL databases), you won't be able to achieve what you need in one transformation. Jobs allow you to completely control the order of the transformations they trigger.

These are the steps you need to take:

  1. Create a new job
  2. add a "start" (triangle) step
  3. add a "transformation" step. point it to a transformation whose only task is to enable write access in your db tables
  4. add a 2nd transformation step, point it to the transformation that manipulates the data and finally uploads it.
  5. connect using arrows the start to the 1st, then 1st to 2nd transformations.
  6. run the job :)