2
votes

My post table has the columns: id, user_id, title, message, date

My comment table has the columns: id, user_id, post_id, message, date

Since these 2 tables are so similar, would it be better to have a single table with the columns:

id, user_id, parent_id, type, title, message, date

and every row that is a parent post will have an empty parent_id value and every row that is a child post will have an empty title value. The type will either be 1 for parent posts or 2 for child posts.

Edit: I will also use a rating feature and I ask the same question again, 2 specific post_rating and comment_rating tables, or a generic rating tables. In case of the ratings table, the structure will be exactly the same, so no nullable fields. It's also unlikely that the rating table(s) will change, since it only knows about the owner of the rating, the actual rating, and the targeted post/comment/etc.

2
yeah, i'm asking if it should be normalized or denormalized? i know normalization is generally adviced, but for this situation denormalization might make more sense.user1989781
Perhaps keep things normalized until there's a good reason not to...jspcal

2 Answers

6
votes

I would not do that for a few reasons:

  1. There is a child/parent relation between those two and for relations like this you have foreign keys and seperate tables.
  2. Never use the same table only because the naming of the column is the same. Dont try to save space here, space does not cost a thing but your time understanding it later or changing it does. Having to explain your db design (which you had to do) results from an unclear design. So keep it clean and do not try to be smart to save some bits or not to have to make a new table ;)
  3. A change to this table always affects both elements, not only one. If you want to add a thumbnail image link to the comment, you have to touch the posts, too. That results in more testing that everything still works.
  4. Having null values in your table is ok, but not ideal. It certainly is annoying when you try to use this table in a programming environment where you have to check for nulls
  5. You could put them into one table if they inherit from the same type. For example you can have a "Text" element and lets say a comment and a post is a text. But again, the relation from point 1 still stands and this will result in an ugly design again.

Regaring the rating system: You can just make a normal table with ratings and the comments and posts table will get a FK relationship to the ratings table. Like the one you are having for the user right now. However, you also can make two separate tables, to be more flexible later, and it can make the joins a little bit faster as the rating tables will not get as big. But this is more personal preference than anything else. I would certainly make a separate table if the rating systems will drift from each other, but in this case I also would go for one.

1
votes

Using a single table would allow you to have a deeper nested heirarchy. For example:

title
    comment
    comment2
        comment_to_comment
    comment3
        comment_to_comment2
        comment_to_comment3

So it's an additional capability that doesn't exist with the 2 table setup.

"I would only do it if you need that capability" because the queries will be more complicated. It would also be beneficial to add a (root_id) column to such a table to indicate the node with the title for comments with a nesting level greater than 1.

You could also use one table for the data and one table for the parent/child hierarchy (this is just an example, you may need to adjust the syntax).

create table element (     
    id        serial  not null primary key,
    data      integer not null
);

create table heirarchy (
    id        serial  not null primary key,
    id_root   integer not null references element(id),
    id_parent integer not null references element(id),
    id_child  integer not null references element(id)
);

insert into element (data) values (100);
insert into element (data) values (101);
insert into element (data) values (102);
insert into element (data) values (103);
insert into element (data) values (104);
insert into element (data) values (105);
insert into element (data) values (106);
insert into element (data) values (107);
insert into element (data) values (108);

select id, data from element;

1 | 100
2 | 101
3 | 102
4 | 103
5 | 104
6 | 105
7 | 106
8 | 107
9 | 108

insert into heirarchy (id_root, id_parent, id_child) values (3, 3, 4);
insert into heirarchy (id_root, id_parent, id_child) values (3, 3, 5);
insert into heirarchy (id_root, id_parent, id_child) values (3, 4, 1);
insert into heirarchy (id_root, id_parent, id_child) values (3, 4, 2);
insert into heirarchy (id_root, id_parent, id_child) values (3, 1, 9);
insert into heirarchy (id_root, id_parent, id_child) values (6, 6, 7);
insert into heirarchy (id_root, id_parent, id_child) values (6, 6, 8);

This kind of thing isn't trivial and you also can setup triggers to prevent circular relationships.