1
votes

I'm new to Rails and am trying to figure out how to create models to track income and expenses in my app. Should I:

1) Create one model and database table called Finance, and then set a field called "type" to either income or expense, then continue with description, amount, date?

2) Or should I create two models and two tables called Income and Expenses, each with description, amount, and date?

I intend to use this data to allow photographers to track income and expenses related to their business. So for example when the photographer books an appointment they can associate income and expenses with that appointment. They can also see a report which shows monthly income, expenses, and profit.

2

2 Answers

1
votes

It's basically just a question of preference. You can do all database queries with one or two tables (using UNION). So I'd prefer two tables to have a cleaner model structure. Image you'd want so save an income entry:

  • One Table: You always have to set the type
  • Two Tables: You just have to choose the right model

But I can image one database query that could(!) be faster with using only one table:

  • If you want to ORDER both types let's say by date.

And there's another point where one table is better, but that doesn't apply to your model:

  • If there's an infinite number of types. Or: If the number of types can change.

For everything else two separate tables are better. Concerning query performance:

  • If the tables get really huge and you for example want to retrieve all income entries, it's of course faster to look up those entries in a table with 300000 entries than in a table with 600000 entries.

With a deeper look at the DBMS there's another reason for using two tables:

  • Table locking. Some database engines lock whole tables for write operations. Thus only half of the data would get locked and the other half can still be accessed at the same time.

I will have a look at the ORDER thing with two tables. Maybe I'm wrong and the performance impact isn't even there.

Results:

I've created three simple tables (using MySQL):

  • inc: id (int, PK), money (int, not null)
  • exp: id (int, PK), money (int, not null)
  • combi: id (int, PK), type (tinyint, index, not null), money (not null)

And then filled the tables with random data:

  • money: from 1 to 10000
  • type: from 1 to 2
  • inc: 100000 entries
  • exp: 100000 entries
  • combi: 200000 entries

Run these queries:

SELECT id, money
FROM combi
WHERE money > 5000
ORDER BY money
LIMIT 200000;

0,1 sec ... without index: 0,1 sec

SELECT * FROM (
    SELECT id, money FROM inc WHERE money > 5000
    UNION
    SELECT id, money FROM exp WHERE money > 5000
) a
ORDER BY money LIMIT 200000;

0,16 sec

SELECT id, money
FROM combi
WHERE money > 5000 && type = 1
ORDER BY money
LIMIT 200000;

0,14 sec ... without index: 0,085 sec

SELECT id, money
FROM inc
WHERE money > 5000
ORDER BY money
LIMIT 200000;

0,04 sec

And you can see the expected results:

  • when you need income and expenses in one query, then one table is faster
  • when you need only income OR expenses, then two tables are faster

But what I don't understand: Why is the query with type = 1 so much slower? I thought using index would make it nearly equal fast?

2
votes

I would say go with one table and use STI (i.e use the type field).. Both income and expenses are inherently the same thing, just the "direction" of the operation is different. So to me it makes sense to use the same data model, with exceptions hidden in specific subtypes.

Now as for the issues mentioned in the other answer:

  • Ordering both items at the same time becomes easy with one table. It will be painful with two.
  • When indexing your table properly, it doesn't matter if its one or two tables. When creating an index on the type column, the cardinality of the records is the same as it would be in two tables, thus not really being that much different in terms of performance. Aggregation will be easier and faster with one table as well.
  • Table locking is not an issue, unless you use some kind of a you database (like MyISAM), which you should not be doing.