
I'm surprised to find that query results in Datomic are not lazy, even though entities are.

Is there an obvious rationale for this choice that I am missing? It seems reasonable that someone might want to (map some-fn (take 100 query-result-containing-millions)), but this would force the evaluation of the entire set of entity ids, no?

Is there a way to get a lazy seq (of entity-ids) directly back from the query, or do they always have to be loaded into memory first, with laziness only available through the entity?

No obvious rationale that I can see, but this looks like a subtle issue. According to the docs, every intermediate step in a query must fit in memory even though the underlying data set need not. This seems to be the reason that the result also must fit in memory. I'm guessing it has something to do with how datalog works or how they implement it. Whether that requirement and the lack of a lazy query API are related is unclear, but that's my best guess. Note also that even the aggregation features are still marked as beta. - Michael Victor Zink
As for your second question, datoms and seek-datoms are the closest you'll get: they provide lazy access to raw datoms. - Michael Victor Zink

1 Answer


You can use the datomic.api/datoms fn to get lazy access to raw datoms, and from those to the entity ids you are after.

Note that you have to specify the index when calling datoms, and the indexes available to you depend on the attribute you're interested in. E.g. the :avet index is only available if your attribute has :db/index (or :db/unique) set in the schema, and the :vaet index is only available if your attribute is of type :db.type/ref.

We use something like this at work (note: the attribute, ref-attr, must be of :db.type/ref for this to work):

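(require '[datomic.api :as d])

(defn datoms-by-ref-value
  "Returns a lazy seq of all the datoms in the database matching the
  given reference attribute value."
  [db ref-attr value]
  ;; :vaet is keyed by value, then attribute, so the components are
  ;; given in that order.
  (d/datoms db :vaet value ref-attr))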
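
To get back to the original question (mapping over the first 100 results without realizing millions), a minimal sketch along these lines may help. The helper name first-entity-ids is hypothetical, not part of Datomic. It assumes only that each datom supports keyword lookup of :e, which Datomic datoms do, so plain maps can stand in for datoms when trying it out at the REPL.

```clojure
(defn first-entity-ids
  "Returns a lazy seq of the entity ids of the first n datoms.
  `datoms` can be anything seqable whose items answer to :e,
  e.g. the result of d/datoms (hypothetical helper, not a Datomic fn)."
  [n datoms]
  (->> datoms
       (map :e)     ; pull the entity id out of each datom
       (take n)))   ; stop consuming after n items

;; Plain maps standing in for datoms:
(first-entity-ids 2 [{:e 1} {:e 2} {:e 3}])
;; => (1 2)

;; Against a real database this would look something like:
;; (first-entity-ids 100 (datoms-by-ref-value db ref-attr value))
```

Because map and take are lazy, only about as many datoms as you consume should be pulled from the index (Clojure may realize them in small chunks rather than strictly one at a time).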

The datoms documentation is a bit sparse, but with some trial and error you can probably work out what you need. There's a post by August Lilleaas about using the :avet index (which requires an index on the attribute in the Datomic schema) that I found somewhat helpful.
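
As a rough summary of the index rules mentioned above, here is a small sketch that works out which indexes should cover a given attribute. The function available-indexes is a hypothetical helper, and it assumes you pass it the attribute's schema map as you would write it when installing the attribute. It also assumes, as I understand it, that :db/unique implies :avet coverage in the same way :db/index does.

```clojure
(defn available-indexes
  "Given an attribute's schema map, returns the set of index names
  that should contain datoms for it (hypothetical helper)."
  [attr-schema]
  (cond-> #{:eavt :aevt}                          ; these cover every datom
    (or (:db/index attr-schema)
        (:db/unique attr-schema))
    (conj :avet)                                  ; needs :db/index or :db/unique

    (= :db.type/ref (:db/valueType attr-schema))
    (conj :vaet)))                                ; ref attributes only

(contains? (available-indexes {:db/valueType :db.type/ref}) :vaet)
;; => true
(contains? (available-indexes {:db/valueType :db.type/string}) :avet)
;; => false
```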