3
votes

I am running a Python script in the Power Query editor of Power BI to transform and work with my data. After these computations, I want to return the dataset and another table to the Power Query editor. Am I correct that this second table needs to be a Pandas Dataframe?

Only when converting the second table from a numpy.ndarray to a pandas.DataFrame, does Power BI recognize it as tabular output (result after script). Is this the expected behaviour? Can other Python objects than a Pandas Dataframe be passed down to Power BI?

1

1 Answers

1
votes

Am I correct that this second table needs to be a Pandas Dataframe?

yes

Is this the expected behaviour?

yes, but why?

You can construct a lot of variable types using Python, and it would be a lot to demand that Power BI should recognize all of them. And instead of letting Power BI recognize some variable types, it seems that the developers for the sake of simplicity has decided to draw the line at DataFrames. Personally, I think it's a wise decision. This way, you'll know it's not a datatype issue if anything goes wrong.


Some details:

Go to the Power Query Editor and insert an empty table using Enter Data > OK. Then insert the script below using Transform > Run Python Script:

# 'dataset' holds the input data for this script
import numpy as np
import pandas as pd
var1 = np.random.randint(5, size=(2, 4))
var2 = pd.DataFrame(np.random.randint(5, size=(2, 4)))
var3 = 3
var4 = pd.DataFrame([type(var3)])
var5 = pd.Series([type(var3)])

This snippet constructs 5 variables of the following types:

print(type(var1))
<class 'numpy.ndarray'>

print(type(var2))
<class 'pandas.core.frame.DataFrame'>

print(type(var3))
<class 'int'>

print(type(var4))
<class 'pandas.core.frame.DataFrame'>

print(type(var5))
<class 'pandas.core.series.Series'>

And just to be specific, I've not run the print() commands in PowerBI, but in Spyder. Now, If you click OK and execute the first snippet in The Power Query Editor, you'll be presented with a table showing which variable will be available to you under Applied Steps:

enter image description here

dataset is constructed by defult upon inserting the Python snippet, while var2 and var4 is constructed in the code. And all are dataframes. Even var5 which is a pandas Series is not available for further editing.

I hope this helps. If not, then don't hesitate to let me know!

Edit:

Regarding:

After these computations, I want to return the dataset and another table to the Power Query editor.

You can load any table and edit it with Python. If you'd like to keep one version of the table, and do further editing on another table, you should take a look at Edit python script used as Data entry in Power BI