Which is the least expensive aggregate function in the absence of ANY()

10

votes

I usally use MAX() or MIN() if a DBMS hasn't an ANY() aggregate function.

Is there something less expensive in mySQL and MS-SQL?

why do not exists an ANY() aggregate function? – Luis Siquot

Why would an ANY() aggregate function exist? What would its specification be? – Jonathan Leffler

What do you mean with ANY()? It should return one random row? – ypercubeᵀᴹ

any(expresion) : returns any value of expresion within the group. in order to have good performance with trivial dependant columns in cases where all are the same or just any is good enaugh – Luis Siquot

No ANY aggregate in ANSI-92 SQL contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt – gbn

1

votes

MySQL does not need an ANY() aggregate.

if I do a

SELECT field1, field2, SUM(field3) FROM table1 
GROUP BY field2

Microsofts T-SQL will complain but

MySQL will just silently execute

SELECT whatever(field1), field2, SUM(....

Which of course is way faster than SELECT max(field1), field2, SUM(.... GROUP BY field2

MySQL supports ANY, but SELECT ANY(field1) ... doesn't work because ANY is an operator similar to IN and ALL.
see: http://dev.mysql.com/doc/refman/5.0/en/any-in-some-subqueries.html

I love MySQL

3

votes

There is no ANY aggregate in ANSI SQL-92

There is the ANY qualifier to match SOME and ALL

TSQL - SOME | ANY why are they same with different names? (SO)
ALL, ANY, and SOME: The Three Stooges (SQL Server Pedia)

MIN and MAX are proper aggregates... completely unrelated and shouldn't be compared

Edit:

Only MySQL has this ambiguity of an "ANY" aggregate: SQL Server, Sybase, Oracle, PostGres do not. See Do all columns in a SELECT list have to appear in a GROUP BY clause

1

votes

MIN and MAX are equally (in)expensive.

0

votes

Fine, let's rephrase the original question. Instead of ANY, which is ambiguous and not SQL standard, it would be great if EXISTS-aggregate function was provided.

On several occasions I have used "count(outer.column) > 0" to imitate the "exists(outer.column)", which would be arguably cheaper since counting is discarded anyway.

0

votes

As of MySQL 5.7 (released October 2015), there is actually a function for this! any_value(col) explicitly meets this need - see the documentation for details.

However, it's important to note that it still appears to not guarantee a short-circuit and may still perform a full scan, so the goal of higher efficiency may not be met. If MySQL is intelligent about it, there is some chance, though testing would be wise. There are two circumstances in which it may be helpful anyway:

If the value to be aggregated is particularly large and unindexed, such that any comparisons would be expensive.
If the intention in the code would be made less clear by leveraging max or another placeholder aggregate. For example, a max may imply there was some reason for its existence rather than getting "anything" from the grouping, but any_value would explicitly define the intention without additional commenting.

Toy sample:

select
col, 
any_value(a.val) as any_val
from (
    select 'blue' as col, 3 as val
    union all
    select 'blue' as col, 2 as val
    union all
    select 'green' as col, 1 as val
) as a
group by col

Which is the least expensive aggregate function in the absence of ANY()

5 Answers