61
votes

In MySQL, you can select X random rows with the following statement:

SELECT * FROM table ORDER BY RAND() LIMIT X

This does not, however, work in SQLite. Is there an equivalent?

6

6 Answers

86
votes

For a much better performance use:

SELECT * FROM table WHERE id IN (SELECT id FROM table ORDER BY RANDOM() LIMIT x)

SQL engines first load projected fields of rows to memory then sort them, here we just do a random sort on id field of each row which is in memory because it's indexed, then separate X of them, and find the whole row using these X ids.

So this consume less RAM and CPU as table grows!

60
votes
SELECT * FROM table ORDER BY RANDOM() LIMIT X
9
votes
SELECT * FROM table ORDER BY RANDOM() LIMIT 1
3
votes

All answers here are based on ORDER BY. This is very inefficient (i.e. unusable) for large sets because you will evaluate RANDOM() for each record, and then ORDER BY which is a resource expensive operation.

An other approach is to place abs(CAST(random() AS REAL))/9223372036854775808 < 0.5 in the WHERE clause to get in this case for example 0.5 hit chance.

SELECT *
FROM table
WHERE abs(CAST(random() AS REAL))/9223372036854775808 < 0.5

The large number is the maximum absolute number that random() can produce. The abs() is because it is signed. Result is a uniformly distributed random variable between 0 and 1.

This has its drawbacks. You can not guarantee a result and if the threshold is large compared to the table, the selected data will be skewed towards the start of the table. But in some carefully designed situations, it can be a feasible option.

2
votes

The accepted answer works, but requires a full table scan per query. This will get slower and slower as your table grows large, making it risky for queries that are triggered by end-users.

The following solution takes advantage of indexes to run in O(log(N)) time.

SELECT * FROM table
WHERE rowid > (
  ABS(RANDOM()) % (SELECT max(rowid) FROM table)
)
LIMIT 1;

To break it down

  • SELECT max(rowid) FROM table - Returns the largest valid rowid for the table. SQLite is able to use the index on rowid to run this efficiently.
  • ABS(RANDOM()) % ... - Return a random number between 0 and max(rowid) - 1). SQLite's random function generates a number between -9223372036854775808 and +9223372036854775807. The ABS makes sure its positive, and the modulus operator gates it between max(rowid) - 1.
  • rowid > ... - Rather than using =, use > in case the random number generated corresponds to a deleted row. Using strictly greater than ensures that we return a row with a row id between 1 (greater than 0) and max(rowid) (great than max(rowid) - 1). SQLite uses the primary key index to efficiently return this result as well.

This also works for queries with WHERE clauses. Apply the WHERE clause to both the output and the SELECT max(rowid) subquery. I'm not sure which conditions this will run efficiently, however.

Note: This was derived from an answer in a similar question.

1
votes

This one solves the negative RANDOM integers, and keeps good performance on large datasets:

SELECT * FROM table LIMIT 1 OFFSET abs(random() % (select count(*) from table));

where:
abs(random() % n ) Gives you a positive integer in range(0,n)