The easiest, at least to my brain, mechanism for using IronPython from the confines of SSIS would be to invoke the external process and dump to a file and then use that as a source for a dataflow.
That said, I was able to host an IronPython app from C# and use the returned data to populate the output buffers and interact with that data in the pipeline. I've only had one machine to perform this on so I'm listing everything I recall doing until the package went green.
Prerequisites
This article set me down the path of how to make this work. Hosting IronPython in a C# 4.0 program I would strongly urge you to create a C#/VB.NET console app and get your IronPython integration working there first as SSIS is going to add an additional layer to everything.
There may be the ability to host older versions of IronPython within C# without requiring the 4.0 framework but that's far beyond the realm of my competency. What I can say is that to use the 4.0 framework, you are looking at SQL Server 2012. A 2008 package can target up to the 3.5 framework (default is 2.0).
Global Assembly Cache, GAC for short. It is a special place in Windows where signed assemblies can live. SSIS may be able to use assemblies that aren't in the GAC, but I've not had luck doing so. This case was no different. My Console app worked fine but when I copied that code into SSIS, it'd tank with Could not load file or assembly 'Microsoft.Scripting...
error messages. Blessedly, IronPython-2.7.2.1 (and probably previous versions) are strongly signed dlls. That means you can and must add them into the GAC.
In your Visual Studio directory, look for the Visual Studio Command Prompt (2010).
Assuming your IronPython installation folder is C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1
you would type cd C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1
Then I registered the following 3 assemblies
C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1>gacutil -if Microsoft.Dynamic.dll
Microsoft (R) .NET Global Assembly Cache Utility. Version 4.0.30319.1
Copyright (c) Microsoft Corporation. All rights reserved.
Assembly successfully added to the cache
C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1>gacutil -if IronPython.dll
Microsoft (R) .NET Global Assembly Cache Utility. Version 4.0.30319.1
Copyright (c) Microsoft Corporation. All rights reserved.
Assembly successfully added to the cache
C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1>gacutil -if Microsoft.Scripting.dll
Microsoft (R) .NET Global Assembly Cache Utility. Version 4.0.30319.1
Copyright (c) Microsoft Corporation. All rights reserved.
Assembly successfully added to the cache
My SSIS project, I had set the Run64bitRuntime to False but in retesting, it does not matter. The default it True and that seems to work fine.
Python script - I don't have enough of a background to make the integration between C# and .NET DLR languages more graceful. It'd have been nice to supply a string or something containing the script I wanted to execute and perhaps that's what a script block is about but I don't have time to investigate. So, this solution requires a script file sitting out somewhere on disk. I had trouble with the imports working from a hosted script (no module named X exceptions). Undoubtedly there's some magic with class paths and all that stuff that needs to provided to the host to make it work well. That's probably a different SO question btw.
Set up
I have a file sitting at C:\ssisdata\simplePy.py
# could not get a simple import to work from hosted
# works fine from "not hosted"
#import os
def GetIPData():
#os.listdir(r'C:\\')
return range(0,100)
After adding a script task to the Data Flow, I configured it to have a single column on the output buffer (wstr 1000). I then used this as my source code.
using System;
using System.Collections.Generic;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using IronPython.Hosting;
using Microsoft.Scripting.Hosting;
/// <summary>
/// Attempt to use IP script as a source
/// http://blogs.msdn.com/b/charlie/archive/2009/10/25/hosting-ironpython-in-a-c-4-0-program.aspx
/// </summary>
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
/// <summary>
/// Create data rows and fill those buckets
/// </summary>
public override void CreateNewOutputRows()
{
foreach (var item in this.GetData())
{
Output0Buffer.AddRow();
Output0Buffer.Content = item;
}
}
/// <summary>
/// I've written plenty of code, but I'm quite certain this is some of the ugliest.
/// There certainly must be more graceful means of
/// * feeding your source code to the ironpython run-time than a file
/// * processing the output of the code the method call
/// * sucking less at life
/// </summary>
/// <returns>A list of strings</returns>
public List<string> GetData()
{
List<string> output = null;
var ipy = Python.CreateRuntime();
dynamic test = ipy.UseFile(@"C:\ssisdata\simplePy.py");
output = new List<string>();
var pythonData = test.GetIPData();
foreach (var item in pythonData)
{
output.Add(item.ToString());
}
return output;
}
}
Quick shot of what my references look like
Click the run button and great success
python myscript.py
work? Also, if it's an Iron* variant, you could probably invoke it from a Script task. Probably need to bump the .NET level of the project up from 2.0 but I think it should work based on my half-understood followings of the IronPython mailing list. – billinkc