I'm using Rails 2.3.14 here, but it's possible the problem I'm having is not confined to this particular version. This is all association and eager loading functionality that is still around in Rails 3, and has been around since well before 2.3.14. I'm in the process of upgrading from Rails 2.3.8, where I was not having the problem described below.

The code below is a mock-up based on a much more complex production system. The class/module scheme I'm outlining is set up like this for a reason. I'm actually including a few more details in this mock-up than are necessary to demonstrate the problem, to hopefully make the overall structure of the system clearer.

Suppose I have several domain objects that can be "driven", including vehicles (cars/trucks) and golf balls. For each of these things, I have an ActiveRecord class:

class Vehicle < ActiveRecord::Base
end

class Car < Vehicle
    include Driveable
end

class Truck < Vehicle
    include Driveable
end

class GolfBall < ActiveRecord::Base
    include Driveable
end

First note that GolfBall is a top-level model class and the corresponding database table is golf_balls. On the other hand, Car and Truck are sub-classes of Vehicle. The database for vehicles is set up with STI, so Cars and Trucks both correspond to the vehicles table (with a type column differentiator).

Secondly, note that I'm including a Driveable module on all the bottom-level domain objects (Car, Truck, GolfBall), and it looks like this (in the actual system this module does a lot more too, including setting things up based on the specific including domain object):

module Driveable
    def self.included(base)
        base.class_eval do
            has_one :driver, :as => :driveable, :class_name => "#{self.name}Driver", :dependent => :destroy
        end
    end
end

So each of these things can have a Driver, and it's using a :class_name based on the including class name (e.g. including class Car results in a has_one with :class_name => "CarDriver"), because each of these referenced classes (CarDriver, etc...) contains specific business logic that is necessary for the association's use.
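To make the naming convention concrete, here is a minimal plain-Ruby sketch of the same included hook, with no ActiveRecord involved. The driver_class_name method is a hypothetical stand-in for the string passed to :class_name; it is not part of the real module.

```ruby
# Plain-Ruby sketch (no Rails): shows how the included hook derives a
# per-class driver name. driver_class_name is a hypothetical stand-in
# for the :class_name option passed to has_one.
module Driveable
  def self.included(base)
    base.class_eval do
      # Inside class_eval, self is the including class.
      @driver_class_name = "#{self.name}Driver"

      def self.driver_class_name
        @driver_class_name
      end
    end
  end
end

class Vehicle; end

class Car < Vehicle
  include Driveable
end

class GolfBall
  include Driveable
end

puts Car.driver_class_name      # CarDriver
puts GolfBall.driver_class_name # GolfBallDriver
```

Because the name is computed when the module is included, each including class gets its own value, which is why Car ends up associated with CarDriver and GolfBall with GolfBallDriver.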

There is a top-level Driver class which sets up the polymorphic association, and then a similar subclass hierarchy as above for domain object drivers:

class Driver < ActiveRecord::Base
    belongs_to :driveable, :polymorphic => true
end

class VehicleDriver < Driver
end

class CarDriver < VehicleDriver
end

class TruckDriver < VehicleDriver
end

class GolfBallDriver < Driver
end

This is based on a single database table drivers, using STI for all subclasses.

With this system in place, I create a new Car (stored in @car below) and associate it with a newly-created CarDriver like this (it's split up into these particular sequential steps in this mock-up to mirror the way the actual system works):

@car = Car.create
CarDriver.create(:driveable => @car)

This created a database row in the vehicles table like this:

 id   type   ... 
-----------------
 1    Car    ...

And a row in the drivers table like this:

 id   driveable_id    driveable_type   type         ...
--------------------------------------------------------
 1    1               Vehicle          CarDriver    ...

Vehicle is the driveable_type as opposed to Car because vehicles are STI. So far so good. Now I open up a rails console and execute a simple command to get a Car instance:

>> @car = Car.find(:last)
=> #<Car id: 1, type: "Car", ...>

According to the log, here is the query that was executed:

Car Load (1.0ms)
SELECT * FROM `vehicles`
WHERE ( `vehicles`.`type` = 'Car' )
ORDER BY vehicles.id DESC
LIMIT 1

Then I get the CarDriver:

>> @car.driver
=> #<CarDriver id: 1, driveable_id: 1, driveable_type: "Vehicle", type: "CarDriver", ...>

This caused the following query to be executed:

CarDriver Load (0.7ms)
SELECT * FROM `drivers`
WHERE (`drivers`.driveable_id = 1 AND `drivers`.driveable_type = 'Vehicle') AND (`drivers`.`type` = 'CarDriver' )
LIMIT 1

If I try to use eager loading, however, I get different results. From a fresh console session, I run:

>> @car = Car.find(:last, :include => :driver)
=> #<Car id: 1, type: "Car", ...>
>> @car.driver
=> nil

This results in nil for the driver. Checking the logs, the first statement executes the following queries (the regular query and the eager loading query):

Car Load (1.0ms)
SELECT * FROM `vehicles`
WHERE ( `vehicles`.`type` = 'Car' )
ORDER BY vehicles.id DESC
LIMIT 1

CarDriver Load (0.8ms)
SELECT * FROM `drivers`
WHERE (`drivers`.driveable_id = 1 AND `drivers`.driveable_type = 'Car') AND (`drivers`.`type` = 'CarDriver' )
LIMIT 1

As you can see, in the eager loading case, the Car query is identical to the one above, but the CarDriver query is different. It's mostly the same, except that for drivers.driveable_type it is looking for Car where it should be looking for the STI base class name, Vehicle, as it does in the non-eager loading case.

Any idea how to fix this?

1 Answer

After reading the rails source code for hours, I'm pretty sure I figured this out (it's possible I'm mis-reading something, of course). It seems to be a bug in the Rails 2.3.x branch that was introduced in Rails 2.3.9 and never fixed. It started with this Rails 2.3 bug report:

Query built wrong for preloading associations nested under polymorphic belongs_to

And indeed this was a valid bug, and it was fixed by this commit in Rails 2.3.9:

Fix eager loading of polymorphic has_one associations nested

However, this commit inadvertently broke non-nested polymorphic eager loading on STI classes, and they didn't have a test for this scenario, so automated tests did not catch it. The relevant diff is here:

-          conditions = "#{reflection.klass.quoted_table_name}.#{connection.quote_column_name "#{interface}_id"} #{in_or_equals_for_ids(ids)} and #{reflection.klass.quoted_table_name}.#{connection.quote_column_name "#{interface}_type"} = '#{self.base_class.sti_name}'"
+          parent_type = if reflection.active_record.abstract_class?
+            self.base_class.sti_name
+          else
+            reflection.active_record.sti_name
+          end
+
+          conditions = "#{reflection.klass.quoted_table_name}.#{connection.quote_column_name "#{interface}_id"} #{in_or_equals_for_ids(ids)} and #{reflection.klass.quoted_table_name}.#{connection.quote_column_name "#{interface}_type"} = '#{parent_type}'"

From this diff you can see that prior to 2.3.9, eager-loaded polymorphic associations always used self.base_class.sti_name as the parent_type (the value matched against the polymorphic type column). Following the example in the question, the type column is driveable_type, and since Car.base_class is Vehicle, the query would correctly match on driveable_type = 'Vehicle'.

Starting with the above commit, however, it only does that if the association-holder is an abstract_class. This appears to be a straight-up bug; if the association-holder is either abstract or an STI subclass, the parent_type should be set to self.base_class.sti_name.
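The difference between the three behaviors can be sketched in plain Ruby. This is only a simplified model, not the Rails source: FakeRecord and the parent_type_* helpers are hypothetical names, and the model assumes the non-nested case where self and reflection.active_record are the same class.

```ruby
# Plain-Ruby model of the type lookup; FakeRecord and the three
# parent_type_* helpers are hypothetical names for illustration only.
class FakeRecord
  class << self
    attr_accessor :abstract_class

    def abstract_class?
      abstract_class == true
    end

    # Simplified: walk up until the direct parent is the root class.
    def base_class
      superclass == FakeRecord ? self : superclass.base_class
    end

    def sti_name
      name
    end
  end
end

class Vehicle < FakeRecord; end
class Car < Vehicle; end

# Before 2.3.9: always the STI base class.
def parent_type_old(klass)
  klass.base_class.sti_name
end

# After the commit, per the diff above: base class only if abstract.
def parent_type_239(klass)
  klass.abstract_class? ? klass.base_class.sti_name : klass.sti_name
end

# One possible correction: also use the base class for STI subclasses.
def parent_type_fixed(klass)
  if klass.abstract_class? || klass != klass.base_class
    klass.base_class.sti_name
  else
    klass.sti_name
  end
end

puts parent_type_old(Car)   # Vehicle
puts parent_type_239(Car)   # Car
puts parent_type_fixed(Car) # Vehicle
```

In this model the post-commit logic yields Car, which matches the driveable_type = 'Car' condition seen in the eager loading query from the question.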

Fortunately, there is a simple way to work around this. While the docs don't mention it anywhere, it seems this can be solved by setting abstract_class to true on the STI subclasses that are the targets of the polymorphic belongs_to association (the classes declaring the has_one with :as). In the example in the question, that would be:

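class Car < Vehicle
    # self. is required; a bare assignment would only create a local variable
    self.abstract_class = true
    include Driveable
end

class Truck < Vehicle
    # self. is required; a bare assignment would only create a local variable
    self.abstract_class = true
    include Driveable
end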

Based on a full-text search of the Rails source for abstract_class, setting it doesn't seem likely to have any unintended or undesired consequences, and in an initial test with the real app, it seems to have solved the problem.
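One Ruby detail matters when applying this workaround: inside a class body, an assignment without an explicit receiver creates a local variable and never calls the class-level setter. The sketch below shows this in plain Ruby; Base is a hypothetical stand-in with an abstract_class accessor, not ActiveRecord::Base itself.

```ruby
# Plain-Ruby illustration; Base is a hypothetical stand-in that exposes an
# abstract_class accessor similar in spirit to ActiveRecord's.
class Base
  class << self
    attr_accessor :abstract_class

    def abstract_class?
      abstract_class == true
    end
  end
end

class WithoutSelf < Base
  abstract_class = true       # assigns a local variable, setter is not called
end

class WithSelf < Base
  self.abstract_class = true  # calls the class-level setter
end

puts WithoutSelf.abstract_class? # false
puts WithSelf.abstract_class?    # true
```

So the setting only takes effect when it is written with the explicit self. receiver.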