I'm working with a Rails 3 application to allow people to apply for grants and such. We're using Elasticsearch/Tire as a search engine.
Documents (e.g., grant proposals) are composed of many answers of varying types, like contact information or essays. In ActiveRecord (and relational databases in general), you can't specify a polymorphic "has_many" relation directly, so instead:
class Document < ActiveRecord::Base
  has_many :answerings
end

class Answering < ActiveRecord::Base
  belongs_to :document
  belongs_to :question
  belongs_to :payload, :polymorphic => true
end
"Payloads" are models for individual answer types: contacts, narratives, multiple choice, and so on. (These models are namespaced under "Answerable.")
class Answerable::Narrative < ActiveRecord::Base
  has_one :answering, :as => :payload
  validates_presence_of :narrative_content
end

class Answerable::Contact < ActiveRecord::Base
  has_one :answering, :as => :payload
  validates_presence_of :fname, :lname, :city, :state, :zip...
end
Conceptually, an answer is composed of an answering (which functions like a join table and stores metadata common to all answers) and an answerable (which stores the actual content of the answer). This works great for writing data. Search and retrieval, not so much.
I want to use Tire/ES to expose a saner representation of my data for searching and reading. In a normal Tire setup, I'd wind up with (a) an index for answerings and (b) separate indices for narratives, contacts, multiple choices, and so on. Instead, I'd like to store just Documents and Answers, possibly as parent/child. The Answers index would merge data from Answerings (id, question_id, updated_at...) and Answerables (fname, lname, email...). That way, I can search Answers from a single index and filter by type, question_id, document_id, etc. Updates would be triggered from Answering, and each answering would then pull in information from its answerable. I'm using RABL to template my search engine inputs, so that part is easy enough.
Answering.find(123).to_indexed_json # let's say it's a narrative
=> { id: 123, question_id: 10, document_id: 24, updated_at: ..., updated_by: [email protected], narrative_content: "Back in the day, when I was a teenager, before I had...", answerable_type: "narrative" }
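To make the merge concrete, here is a minimal sketch of what I have in mind, written in plain Ruby so it runs without Rails. The `Struct` classes are stand-ins for the real ActiveRecord models, and `indexed_attributes` and `to_indexed_hash` are hypothetical helper names I made up for illustration. In the real app the RABL template would do this job.

```ruby
require 'json'

# Stand-in for Answerable::Narrative (hypothetical, not the real AR model).
Narrative = Struct.new(:narrative_content) do
  # Fields this answer type contributes to the merged search document.
  def indexed_attributes
    { narrative_content: narrative_content }
  end
end

# Stand-in for the Answering model (hypothetical, not the real AR model).
Answering = Struct.new(:id, :question_id, :document_id, :payload) do
  # Merge the answering's metadata with the payload's content into one flat hash.
  def to_indexed_hash
    {
      id: id,
      question_id: question_id,
      document_id: document_id,
      # Derive a short type label from the payload's class name.
      answerable_type: payload.class.name.split('::').last.downcase
    }.merge(payload.indexed_attributes)
  end

  def to_indexed_json
    to_indexed_hash.to_json
  end
end

answering = Answering.new(123, 10, 24, Narrative.new("Back in the day..."))
puts answering.to_indexed_json
```

Each answerable type would define its own `indexed_attributes`, so the Answering never needs to know which fields a given payload carries.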
So, I have a couple of questions.
- The goal is to provide a single-query solution for all answers, regardless of underlying (answerable) type. I've never set something like this up before. Does this seem like a sane approach to the problem? Can you foresee wrinkles I can't? Alternatives/suggestions/etc. are welcome.
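For reference, this is roughly the kind of single query I'd like to be able to run against the merged Answers index. It's only a sketch: `answers_query` is a hypothetical helper, the request body is built as a plain Ruby hash rather than through Tire's DSL, and it assumes an Elasticsearch version that supports the `filtered` query with a `bool` filter.

```ruby
require 'json'

# Build a search request body: a full-text query plus exact-match term filters.
# Hypothetical helper; assumes the "filtered" query of older ES versions.
def answers_query(text, filters = {})
  term_filters = filters.map { |field, value| { term: { field => value } } }
  {
    query: {
      filtered: {
        query: { query_string: { query: text } },
        filter: { bool: { must: term_filters } }
      }
    }
  }
end

# Search narratives belonging to document 24 for the word "teenager".
body = answers_query("teenager", answerable_type: "narrative", document_id: 24)
puts JSON.pretty_generate(body)
```

The point is that the same request shape would work for any answerable type; only the filter values change.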
The tricky part, as I see it, is mapping. My plan is to put explicit mappings in the Answering model for the fields that need indexing options, and just let the default mappings take care of the rest:
mapping do
  indexes :question_id, :index => :not_analyzed
  indexes :document_id, :index => :not_analyzed
  indexes :narrative_content, :analyzer => :snowball
  indexes :junk_collection_total, :index => :not_analyzed
  indexes :some_other_crazy_field, :index [...]
If I don't specify a mapping for some field (say, "fname"), will Tire/ES fall back on dynamic mapping? Or should I explicitly map every field that will be used?
Thanks in advance. Please let me know if I can be more specific.