For uninteresting reasons, I have to use JRuby on a particular project where we also want to use Amazon Simple Workflow (SWF). I don't have a choice in the JRuby department, so please don't say "use MRI".

The first problem I ran into is that JRuby doesn't support forking, and SWF activity workers love to fork. After hacking through the SWF Ruby libraries, I figured out how to attach a logger and how to prevent forking, which was tremendously helpful:

AWS::Flow::ActivityWorker.new(
  swf.client, domain, "my_tasklist", MyActivities
) do |options|
  options.logger = Logger.new("logs/swf_logger.log")
  options.use_forking = false
end

This prevented forking, but now I'm hitting more exceptions deep in the SWF source code, having to do with Fibers and a context that doesn't exist:

Error in the poller, exception: AWS::Flow::Core::NoContextException: AWS::Flow::Core::NoContextException stacktrace:
"aws-flow-2.4.0/lib/aws/flow/implementation.rb:38:in 'task'",
"aws-flow-2.4.0/lib/aws/decider/task_poller.rb:292:in 'respond_activity_task_failed'",
"aws-flow-2.4.0/lib/aws/decider/task_poller.rb:204:in 'respond_activity_task_failed_with_retry'",
"aws-flow-2.4.0/lib/aws/decider/task_poller.rb:335:in 'process_single_task'",
"aws-flow-2.4.0/lib/aws/decider/task_poller.rb:388:in 'poll_and_process_single_task'",
"aws-flow-2.4.0/lib/aws/decider/worker.rb:447:in 'run_once'",
"aws-flow-2.4.0/lib/aws/decider/worker.rb:419:in 'start'",
"org/jruby/RubyKernel.java:1501:in `loop'",
"aws-flow-2.4.0/lib/aws/decider/worker.rb:417:in 'start'",
"/Users/trcull/dev/etl/flow/etl_runner.rb:28:in 'start_workers'"

This is the SWF code at that line:

  # @param [Future] future
  #   Unused; defaults to **nil**.
  #
  # @param block
  #   The block of code to be executed when the task is run.
  #
  # @raise [NoContextException]
  #   If the current fiber does not respond to `Fiber.__context__`.
  #
  # @return [Future]
  #   The tasks result, which is a {Future}.
  #
  def task(future = nil, &block)
    fiber = ::Fiber.current
    raise NoContextException unless fiber.respond_to? :__context__
    context = fiber.__context__
    t = Task.new(nil, &block)
    task_context = TaskContext.new(:parent => context.get_closest_containing_scope, :task => t)
    context << t
    t.result
  end
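
The guard can be reproduced outside the library to see when it trips. In the sketch below, `NoContextError`, `current_context`, and `with_context` are hypothetical stand-ins, not aws-flow APIs; the assumption is that the framework normally attaches `__context__` to fibers it creates itself, so a call made from a plain thread or the root fiber fails the `respond_to?` check.

```ruby
require 'fiber'

# Hypothetical stand-in for AWS::Flow::Core::NoContextException.
class NoContextError < StandardError; end

# Mirrors the guard quoted above: the current fiber must expose __context__.
def current_context
  fiber = Fiber.current
  unless fiber.respond_to?(:__context__)
    raise NoContextError, "current fiber has no __context__"
  end
  fiber.__context__
end

# Runs the block inside a new fiber that carries `context` as a singleton method.
# (Assumption: this imitates what the framework does for its own fibers.)
def with_context(context)
  Fiber.new do
    Fiber.current.define_singleton_method(:__context__) { context }
    yield
  end.resume
end

# Outside any prepared fiber the guard raises; inside one it returns the context.
begin
  current_context
rescue NoContextError => e
  puts "outside: #{e.message}"
end
puts "inside: #{with_context(:demo) { current_context }.inspect}"
```

If that assumption holds, the trace above suggests the failure-reporting path called `task` directly from the poller loop, outside any such fiber.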

I fear this is another flavor of the same forking problem and also fear that I'm facing a long road of slogging through SWF source code and working around problems until I finally hit a wall I can't work around.

So, my question is: has anyone actually gotten JRuby and SWF to work together? If so, is there a list of steps and workarounds somewhere that I can be pointed to? Googling for "SWF and jRuby" hasn't turned up anything so far, and I'm already a day and a half into this task.

3 Answers

0
votes

I think the issue might be that aws-flow-ruby doesn't support Ruby 2.0. I found this PDF dated Jan 22, 2015.

1.2.1 Tested Ruby Runtimes

The AWS Flow Framework for Ruby has been tested with the official Ruby 1.9 runtime, also known as YARV. Other versions of the Ruby runtime may work, but are unsupported.

0
votes

I have a partial answer to my own question. The answer to "Can SWF be made to work on JRuby?" is "Yes...ish."

I was, indeed, able to get a workflow working end-to-end (and even make calls to a database via JDBC, the original reason I had to do this). So, that's the "yes" part of the answer. Yes, SWF can be made to work on JRuby.

Here's the "ish" part of the answer.

The stack trace I posted above is the result of SWF trying to raise an ActivityTaskFailedException due to a problem in some of my activity code. That part is my fault. What's not my fault is that the superclass of ActivityTaskFailedException has this code in it:

def initialize(reason = "Something went wrong in Flow",
               details = "But this indicates that it got corrupted getting out")
  super(reason)
  @reason = reason
  @details = details
  details = details.message if details.is_a? Exception
  self.set_backtrace(details)
end

When your activity raises an exception, the "details" variable you see above is filled with a String. MRI is perfectly happy to take a String as an argument to set_backtrace(), but JRuby is not: it raises an exception saying that "details" must be an Array of Strings. That exception blows through all the nice error-catching logic of the SWF library and into the code that's trying to do incompatible things with the Fiber library. That code then raises a follow-on exception and kills the activity worker thread entirely.
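
To check how a given runtime treats this, a small probe like the one below can help. It is only a sketch: `accepts_string_backtrace?` is a hypothetical helper name, and rescuing `TypeError` (with `ArgumentError` as a fallback) is an assumption about which error class the runtime raises.

```ruby
# Hypothetical probe: does this runtime accept a bare String in set_backtrace?
def accepts_string_backtrace?
  StandardError.new("probe").set_backtrace("a single backtrace line")
  true
rescue TypeError, ArgumentError
  # Assumption: a runtime that insists on an Array of Strings raises one of these.
  false
end

# An Array of Strings is the form the error message described above asks for.
error = StandardError.new("activity failed")
error.set_backtrace(["a single backtrace line"])

puts "String accepted: #{accepts_string_backtrace?}"
puts "Array form stored: #{error.backtrace.inspect}"
```

Running it under both MRI and JRuby should show whether the difference described here applies to the versions in use.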

So, you can run SWF on JRuby as long as your activity and workflow code never, ever raises exceptions, because otherwise those exceptions will kill your worker threads (which is not the intended behavior of SWF workers). The workers are designed instead to communicate the exception back to SWF in a nice, trackable, recoverable fashion. But the SWF code that does that communicating is itself incompatible with JRuby.

To get past this problem, I monkey-patched AWS::Flow::FlowException like so:

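  # Reopen the library class; load this after aws-flow itself has been loaded.
  class AWS::Flow::FlowException
    def initialize(reason = "Something went wrong in Flow",
                   details = "But this indicates that it got corrupted getting out")
      super(reason)
      @reason = reason
      @details = details
      details = details.message if details.is_a? Exception
      # JRuby requires an Array of Strings here, so wrap a bare String.
      details = [details] if details.is_a? String
      self.set_backtrace(details)
    end
  end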
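
The coercion in that patch can also be tried in isolation, without loading aws-flow. The sketch below uses a hypothetical helper, `backtrace_lines`, that goes slightly further than the patch (it also handles `nil`, Arrays, and multi-line Strings); none of these names come from the SWF library.

```ruby
# Hypothetical helper: coerce whatever arrives as `details` into an Array of Strings.
def backtrace_lines(details)
  case details
  when nil       then []
  when Exception then details.backtrace || [details.message]
  when Array     then details.map(&:to_s)
  else                details.to_s.lines.map(&:chomp)
  end
end

# Usage: the result is always safe to hand to set_backtrace.
error = StandardError.new("activity failed")
error.set_backtrace(backtrace_lines("details as a plain String"))
puts error.backtrace.inspect
```

Splitting multi-line Strings is a design choice of this sketch, not something the original patch does; wrapping the whole String in a one-element Array, as the patch does, is equally valid.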

Hope that helps someone in the same situation as me.

0
votes

I'm using JFlow; it lets you start SWF Flow activity workers with JRuby.