I am trying to pass data into a Python script in Data Lake Analytics.
I've stripped this back to show the error clearly. I understand the Python isn't actually doing anything... :-)
I have a very simple table:
@FormattedCasinoData =
    SELECT int.Parse(UserID) AS [UserID],
           int.Parse(ModelID) AS [ModelID],
           float.Parse(Value) AS [Value]
    FROM @CasinoData
    WHERE UserID != "UserID"
    ORDER BY UserID
    FETCH 1000 ROWS;
So the table format is int, int, float.
When I try to run this:
REFERENCE ASSEMBLY [ExtPython];
DECLARE @myScript = @"
def usqlml_main(df):
    return df
";
@pythonOutput =
    REDUCE @FormattedCasinoData ON [UserID]
    PRODUCE [UserID] int, [ModelID] int, [Value] float
    USING new Extension.Python.Reducer(pyScript:@myScript);
OUTPUT @pythonOutput
    TO @"adl://mydatalake.azuredatalakestore.net/myFolder/PythonOutput20171208.csv"
    USING Outputters.Csv();
I get the following error:
"Python returned dataframe schema (System.Int32, System.Int32, System.Double) does match U-SQL schema (System.Int32, System.Int32, System.Single)"
Any idea why the U-SQL schema is expecting System.Single for the third column, when I have explicitly declared "float" in the PRODUCE clause?
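One thing that may be relevant: as far as I understand, U-SQL uses C# type names, where `float` is the 32-bit System.Single and `double` is the 64-bit System.Double, while a plain Python float is 64-bit. The sketch below only illustrates that width difference using the standard library. The helper names `roundtrip_single` and `roundtrip_double` are made up for illustration, and this is not how the Python extension actually converts types.

```python
import struct

def roundtrip_single(x: float) -> float:
    """Pack x as a 32-bit IEEE 754 value (like System.Single) and unpack it."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def roundtrip_double(x: float) -> float:
    """Pack x as a 64-bit IEEE 754 value (like System.Double) and unpack it."""
    return struct.unpack("<d", struct.pack("<d", x))[0]

# Storage sizes in bytes of the two floating-point formats.
single_bytes = struct.calcsize("<f")
double_bytes = struct.calcsize("<d")

value = 0.1  # a Python float, stored as a 64-bit double
print("single:", single_bytes, "bytes; double:", double_bytes, "bytes")
print("survives 64-bit round trip:", roundtrip_double(value) == value)
print("survives 32-bit round trip:", roundtrip_single(value) == value)
```

If this is the cause, then declaring the third column as `double` (both in the `SELECT` and in the `PRODUCE` clause) might make the two schemas agree, but I have not confirmed that.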
Thanks!