It's basically just a question of preference. You can do all database queries with one or two tables (using UNION). So I'd prefer two tables to have a cleaner model structure. Image you'd want so save an income entry:
- One Table: You always have to set the type
- Two Tables: You just have to choose the right model
But I can image one database query that could(!) be faster with using only one table:
- If you want to
ORDER
both types let's say by date.
And there's another point where one table is better, but that doesn't apply to your model:
- If there's an infinite number of types. Or: If the number of types can change.
For everything else two separate tables are better. Concerning query performance:
- If the tables get really huge and you for example want to retrieve all income entries, it's of course faster to look up those entries in a table with 300000 entries than in a table with 600000 entries.
With a deeper look at the DBMS there's another reason for using two tables:
- Table locking. Some database engines lock whole tables for write operations. Thus only half of the data would get locked and the other half can still be accessed at the same time.
I will have a look at the ORDER
thing with two tables. Maybe I'm wrong and the performance impact isn't even there.
Results:
I've created three simple tables (using MySQL):
inc
: id
(int, PK), money
(int, not null)
exp
: id
(int, PK), money
(int, not null)
combi
: id
(int, PK), type
(tinyint, index, not null), money
(not null)
And then filled the tables with random data:
money
: from 1 to 10000
type
: from 1 to 2
inc
: 100000 entries
exp
: 100000 entries
combi
: 200000 entries
Run these queries:
SELECT id, money
FROM combi
WHERE money > 5000
ORDER BY money
LIMIT 200000;
0,1 sec ... without index: 0,1 sec
SELECT * FROM (
SELECT id, money FROM inc WHERE money > 5000
UNION
SELECT id, money FROM exp WHERE money > 5000
) a
ORDER BY money LIMIT 200000;
0,16 sec
SELECT id, money
FROM combi
WHERE money > 5000 && type = 1
ORDER BY money
LIMIT 200000;
0,14 sec ... without index: 0,085 sec
SELECT id, money
FROM inc
WHERE money > 5000
ORDER BY money
LIMIT 200000;
0,04 sec
And you can see the expected results:
- when you need income and expenses in one query, then one table is faster
- when you need only income OR expenses, then two tables are faster
But what I don't understand: Why is the query with type = 1
so much slower? I thought using index would make it nearly equal fast?