1
votes

Are Postgres SELECT DISTINCT queries deterministic?

Will SELECT DISTINCT somecolumn FROM sometable return the same result (including order) if the table (and entire database) goes unchanged?

In the Select Query Documentation the Description section notes:

If the ORDER BY clause is specified, the returned rows are sorted in the specified order. If ORDER BY is not given, the rows are returned in whatever order the system finds fastest to produce.

In the DISTINCT ON clause section they add:

Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.

Generally, is this still true when the database goes un-changed?

1
The only (really: the only) way to get a deterministic and stable order of the rows is ORDER BY. There is no exception. - a_horse_with_no_name
Since you mentioned return the same result (including order) the answer is no, there is no guarantee if the results will be same (including order) for two executions of the query. - Salman A

1 Answers

3
votes

This answer assumes that the expressions in the select are deterministic. Otherwise, the question seems trivial.

The ordering is not specified, so it could change between runs of the query -- or on a different system. However, the result set should be the same.

Your second quote from the documentation is for distinct on. That is not-deterministic, unless you are using a stable sort.

Note: You might get non-deterministic results if you are using a case-insensitive collation. The built-in collations are case-sensitive; and case insensitivity means that the original expressions are not deterministic.