MySQL: How to speed up insert and group by query of very large table (74m rows)

Question

I'm trying to run the below query and it's taking hours and hours. We've got a dedicated server for the queries (not running on localhost).

It's an InnoDB table with around 74 million rows. I've indexed the two columns involved in the grouping (TRAN_URN, UCI) in a hope to speed up the query.

insert into data.urn_uci_lookup (TRAN_URN, UCI, `Count`)
select  TRAN_URN,UCI, count(*) as `Count`
from data.diablo18
group by TRAN_URN, UCI

Is this inefficient for some reason? How can I improve it?

EDIT: Here is the EXPLAIN plan

id select_type table type possible_keys key key_len ref rows Extra

1 SIMPLE diablo18 ALL \N \N \N \N 74631102 Using temporary; Using filesort

Cheers, Lucas

Hey Jim, how exactly do I use this? Never used EXPLAIN PLAN before... — Lucas
do your EXPLAIN EXTENDED just on the select portion of your query. — gview
The biggest question is the cardinality of the index, or conversely, the numbers you expect in the count(*) column. If you expect very low (i.e. <3) results for most rows in count(*) then the query is going to be slow no matter what you do. How many distinct values of (TRAN_URN,UCI) do you have? — Jim Garrison

gview gview · Accepted Answer · 2014-03-24T05:27:05

This query is going to read the entire 74 million rows. It is also going to recreate much of the table in a new table, depending on how many groups you have.

We don't have enough information about your server or data set to do much but make educated guesses.

You want to look into your innodb configuration, especially in regards to how much memory you have allocated to it (should be almost the entirety of the server's available RAM, the more the better) less what's needed for basic overhead, as described in https://dev.mysql.com/doc/refman/5.5/en/innodb-buffer-pool.html.

Your server io subsystem may be the bottleneck. If IO is slow, the server may just be stuck trying to keep up with the required reads/writes of this query. Setting up a high performance database server is much more complicated than installing the mysql server on a "dedicated" machine.

MySQL: How to speed up insert and group by query of very large table (74m rows)

1 Answers