0
votes

I'm trying to query an Ecto table with append-only semantics, so I'd like the most recent version of a complete row for a given ID. The technique is described here, but in short: I want to JOIN a table on itself with a subquery that fetches the most recent time for an ID. In SQL this would look like:

SELECT r.*
FROM rules AS r
JOIN (
  SELECT id, MAX(inserted_at) AS inserted_at FROM rules GROUP BY id
) AS recent_rules
ON (
  recent_rules.id = r.id
  AND recent_rules.inserted_at = r.inserted_at)

I'm having trouble expessing this in Ecto. I tried something like this:

maxes =
  from(m in Rule,
    select: {m.id, max(m.inserted_at)},
    group_by: m.id)

from(r in Rule,
  join: m in ^maxes, on: r.id == m.id and r.inserted_at == m.inserted_at)

But trying to run this, I hit a restriction:

queries in joins can only have where conditions in query

suggesting maxes must just be a SELECT _ FROM _ WHERE form.

If I try switching maxes and Rule in the JOIN:

maxes =
  from(m in Rule,
    select: {m.id, max(m.inserted_at)},
    group_by: m.id)

from(m in maxes,
  join: r in Rule, on: r.id == m.id and r.inserted_at == m.inserted_at)

then I'm not able to SELECT the whole row, just id and MAX(inserted_at).

Does anyone know how to do this JOIN? Or a better way to query append-only in Ecto? Thanks ????

1

1 Answers

4
votes

Doing m in ^maxes is not running a subquery but either query composition (if in a from) or converting the query to a join (in a join). In both cases, you are changing the same query. Given your initial query, I believe you want subqueries.

Also note that a subquery requires the select to return a map, so we can refer to the fields later on. Something along these lines should work:

maxes =
  from(m in Rule,
    select: %{id: m.id, inserted_at: max(m.inserted_at)},
    group_by: m.id)

from(r in Rule,
  join: m in ^subquery(maxes), on: r.id == m.id and r.inserted_at == m.inserted_at)

PS: I have pushed a commit to Ecto that clarifies the error message in cases like yours.

invalid query was interpolated in a join.
If you want to pass a query to a join, you must either:

  1. Make sure the query only has `where` conditions (which will be converted to ON clauses)
  2. Or wrap the query in a subquery by calling subquery(query)