0
votes

I am struggling with something in pentaho and I am not entirely sure if pentaho will be able to handle this problem. I will try explain as best I can.

So I have a column in my fact sales called reference number, which I must use to lookup an ID from the dimension table and return the ID. But if the first column i did my lookup on in the dimension returns a null , I need to check the same field from fact table lookup in another column in dimension and then another column again.

Is there a way in Pentaho where i can ask it to go through a process of 3 different lookups and return the id if a match in one of those 3 columns exist into the same column in fact sales.

I'm using MySQL as my database

2

2 Answers

0
votes

This seems to be somewhat a basic task for Pentaho Data Integration.

You could do this manually by performing three Database lookup (or) Dimension lookup/update (depending on type of your dimension) which will store every lookup result in a different field.

Then, use a Modified Java Script Value to perform null coalescing - choosing first non-null value and finally if you need a Select Values script to remove three columns with lookup results that are no longer needed.

Below is a screen with a simplified case but I'm sure you can follow the logic behind it and implement it in your case scenario as I've mentioned steps that you could use to achieve the task:

enter image description here

0
votes

It would be far faster to use a filter step. If the looked up first value is null filter on the null to break the null stream to a second lookup and the found data to your "found" step. Rinse and repeat till you have what you want.

Then use a multiway Merge Join to stitch your dataset back together. The merge join step might not even be necessary to be honest if the resulting streams are all identical which you can achieve with some select steps if they're not. No need to look everything up at once and this really is not taking advantage of the parallel processing at all to look it all up once and evaluate.

Does that help?