Let's imagine an environment: there are one db client and one db server. The db client could be a Java program or other, etc; the db server can be mysql, oracle, etc.
The requirement is inserting a large amount of records into one table on the db server.
The simplest way is having a loop inside which the client inserts one record each time until all records are inserted. This is single threaded sequential inserts.
There is another multi-threaded concurrent inserts way that let client fire up multiple threads simultaneously each of which inserts one record into the table. Intuitively, because those records are independent and there is an assumption that modern db servers come with RAID in which the concurrent IO is well-supported, they seem to be able to gain practical and true concurrency for multiple inserts, therefore, this way could improve performance, compared to the above method.
However, as soon as I dived into more details, it turns out that's probably not the case. This link -- Multi threaded insert using ORM? says inserts on the same table require a lock for every single write on the whole table. Therefore, every insert just blocks another following insert, eventually, this way is just another type of sequential multiple inserts, no performance gain at all.
My question are as follows:
- Why do most DBs treat multi threaded inserts on same table like this way?
- Why is the insert lock on the whole table compulsory?
- Are multiple threaded updates treated similarly as multi threaded inserts?
Despite it seems that the best way to deal with a large amount inserts is to have batch insert enabled, I'm still very curious of the grounds of locking whole table while inserting happens.
Thanks in advance!
=====================================================================
After plenty of readings and research, it shows that my questions were actually wrong. The real thing is one insert doesn't block another insert at the same time.(at least it is true for Oracle).