1
votes

I have two datasets: one with thousands of rows that has information about clients (one row per client) and more than 100 variables; and another one which is the result of a Machine Learning process that has some important values of the top 10 most important client's variables (one row per variable).

I want to create a chart that displays the variable in the first dataset that has the highest value in the second dataset. The variables in the second dataset are:

[Dataset]: is a variable that is the same for all the dataset, and is used to fix a calculation (string).

[Variables]: is a list with the names of the most important variables (string).

[Correlation With Target]: is the correlation between the variable in that row and a target variable (float).

This are the calculations that I have made. The first calculated field is created in the second dataset, and the other two calculated fields are created in the first dataset.

Highest Correlation:

IF ABS([Correlation With Target])=={ FIXED  [Dataset]:MAX(ABS([Correlation With Target]))} THEN [Variables] ELSE null END

Variable Number:

CASE ATTR([Sheet1 (Variable Dataset)].[Highest Correlation])

WHEN "Borrower Age" THEN 1

WHEN "Credit score - Borrower" THEN 2

WHEN "Monthly Disposable Income" THEN 3

WHEN "Loan Term" THEN 4

WHEN "LTV" THEN 5

WHEN "Monthly Interest Rate" THEN 6

WHEN "Outstanding Principal Balance" THEN 7

WHEN "Years at Address" THEN 8

WHEN "Years in Employment" THEN 9

END

Correlation Graph:

IF {[Highest Correlation]=1} THEN [Borrower Age]

ELSEIF {[Highest Correlation]=2} THEN [Credit Score]

ELSEIF {[Highest Correlation]=3} THEN [Income]

ELSEIF {[Highest Correlation]=4} THEN [Loan Term]

ELSEIF {[Highest Correlation]=5} THEN [LTV]

ELSEIF {[Highest Correlation]=6} THEN [Interest Rate]

ELSEIF {[Highest Correlation]=7} THEN [Outstanding Principal Balance]

ELSEIF {[Highest Correlation]=8} THEN [Years at Address]

ELSEIF {[Highest Correlation]=9} THEN [Years in Employment]

END

The problem is that in the 3rd calculation, when calling the [Highest Correlation] field, it throws this error:

"All fields in a level of detail expression must came from the same datasource"

What would be the turn around for this problem?

PS: I can't share the workbooks or data, but I will answer any question related to it so I can help you to help me.

1
Hi Pedro, have you considered a cross database join? In general LOD expressions will be challenging if you are trying to use blending. Joining on the other hand may be an easier way to achieve your results.smb

1 Answers

0
votes

I think the problem is that field [Income] comes from a different datasource.

If it is posible you post Datasources and Fields?