9
votes

I have a very large table with a primary key of BINARY(20).

The table has around 17 million rows. Every hour a cron job tries to insert as many as 50,000 new entries into this table with the ON_DUPLICATE_KEY_UPDATE syntax.

Each insert in the cronjob is with 1,000 values (multiple insert). How can I get the number of rows inserted into the table from this query? I cannot do a row count before and after as there are around 17million rows and the query is too expensive.

In the manual mysql says for a row inserted the affected number of rows is 1 and for an updated field it is 2, meaning in my 1000 INSERT ON DUPLICATE KEY UPDATE query I could have affected rows ranging from 1000 - 2000, but I have no way of telling how many records were inserted from this number?

How can I overcome this?

Thanks

2

2 Answers

10
votes

The number of inserts would be 2000 minus the number of affected rows. More generally:

(numberOfValuesInInsert * 2) - mysql_affected_rows()

EDIT:

As tomas points out, The MySQL docs actually say:

With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values.

[emphasis mine]

Consequently, if setting an existing row to the same values is a possibility, it's impossible to tell how many rows were updated vs. inserted, since two inserts would be indistinguishable from one update with different values + one update with the same values.

7
votes

When Your job does an Insert of 1000 , some are pure Inserts and some are Updates as you have the ON_DUPLICATE_KEY_UPDATE . Thus you get the first equation

(1) Inserts + Updates = No of rows Inserted( in this case 1000)

I take a simple example where you get a value of 1350 for the my_sql_affected_rows . since for an Insert a value of 1 and for update a value of 2 aggregates to my_sql_affected_rows . I get the following equation .

(2) Inserts + 2 * Updates = my_sql_affected_rows (in this case 1350) .

Subtract (2) - (1) . You get

(3) Updates = my_sql_affected_rows - No of rows Inserted

Updates = 1350 - 1000 ( in this example ).

Updates = 350 .

Substitute value of Updates in equation (1) , you get

Inserts = 650

Thus to get the number of Updates , you only need to use equation (3) directly .