0
votes

I use Pentaho Spoon for Data Integration (aka, PDI). My internal client wants reports written to excel files, which is fine, but he also wants to see the queries I use in separate tabs in the workbook. Is there a way to use Spoon to export the actual text of the query itself - not the results - to a .xlsx?

In other words, if one query is

SELECT Muppets, Peanuts, Sprites FROM eighties_shows.childrens_cartoons

Then I want to have an excel file that shows

SELECT Muppets, Peanuts, Sprites FROM eighties_shows.childrens_cartoons

2

2 Answers

0
votes

You can use PDI Table Input to get the data from a variable. Try to define a variable which will hold all the column names (and another variable containing the table name) and try to pass this variable to the Table Input Step. For e.g: In the Table Input Step define:

Select ${COLUMN VARIABLE} FROM ${TABLE NAME VARIABLE}

This will execute your query. Now for Generating the Query in your excel, use a "Modified Java Script" step to recreate the same structure. The JS Code snip is given below:

LOOP i=1 till getInputRowMeta().size()

   var fields =getInputRowMeta().getValueMeta(i) /* This will give you the list of Input Columns along with the datatype e.g. Muppets String(100) */

END LOOP

Remove the datatype from your fields variable and create a final variable having the concatenation of

var final_query="SELECT" + cleaned fields variable from the above JS + "FROM" + table name;

Hence you generate a query which you can easily output in your excel output.

This is slightly lengthy process. But i am not able to get any other solution other than generating query using the method above. Hope this solution is good for you !! :)

0
votes

Most modern databases support dynamic sql. If you are using SQL Server look into sp_executesql. All this means is that your SQL code is stored as a string and executed that way.

Now to using dynamic sql in pentaho....

Set up a job that creates the excel file for output and that also gets the sql either passed in on the job call or from the filesystem. Now pass the sql to a transformation within the job that will be responsible for filling the excel file. Write the sql to whatever sheet you wish.

Hope this helps