0
votes

This problem I'm facing may sound trivial to you but I'm wondering, beside ENUM type, what can one use to store numerical values for category, subcategory and multiple tags in a single column?

Scenario:

  • a DB entry needs to have category (id=4), subcategory(id=43) and tags(id=78,id=85,id=98,id=112, etc.)
  • tbl_entry has a column "categories" or similar to that, with the aim to hold those values (4,43,7,85,98,112,...)
  • Is ENUM really suitable and recommended way of storing it?

    However, I am thinking about leaving out such a column completely and apply somewhat Wordpress-like solution - create taxonomies (cat, subcat, tag), and then in another table called relationships put the ID of my entry to be paired with correct taxonomies. The issue with this solution is that 50,000 entries would produce so many relationship rows (1 for cat, 1 for subcat, but up to 20 for tags).

    Can you elaborate at the proposed ways and/or suggest new one(s)?

    EDITS

    Below is a suggestion to avoid storing multiple values in a single column at all costs. Acknowledged and appreciated!

    I'm thinking about creating 3 separate tables for cat, subcat, tags. Additionally, this way I may be able to employ Martin Fowler's pattern "Data Mapper" with "Table inheritance" for handling this aspect of my app (categorization of entries).

    2
    You probably should read "SQL Antipatterns" book: pragprog.com/book/bksqla/sql-antipatternstereško

    2 Answers

    1
    votes

    NEVER EVER store multiple values in a single column, if you want to access them seperately - this is an accident waiting to happen!

    Just forget the multi-id column, however it might be implemented: Think of what would be needed to find all rows, that have some tag: You would need to textually parse all rows' multi-vlue field, throwing away most of them: Every query becomes a full table scan with calculation - worst case.

    The canonical way to do this, which is to have a join table with the relationships, might produce lots of records, but those are accessible in a furiously fast way, as they in fact are just indices.

    Breaking this up into a a categories join table and a tags join table might or might not speed things up even further, depending on how often your business logic needs only one of those.

    1
    votes

    Why you need to store it in the same column, make different tables for each as Category, Subcategory and Tags with one to one, one to many relationships resp.
    Use Indexes/FKs and JOINs to get records