4
votes

I've searched about this and tried to make it myself over the last few days but I just couldn't. The closest I've gotten within my search was this answer, also on Stack Overflow: EAV Select query from spreaded value tables

So here I am, turning myself to the Internet!

So, I have a database which makes use of the EAV (Entity-Attribute-Value) model. But here's the catch: the actual entities aren't directly connected to the other EAV tables. Let me be more specific; say that there is a Person and a Site tables, and that they only have their primary keys: person_id and site_id, respectively.

Because the attributes (called "properties" on my schema) of those entities (i.e. the Person and the Site) must be dynamic, they must all be stored outside their respective tables, i.e. the EAV tables. The following is the EAV part of the database schema (I'm not sure if it's completely correct, so please let me know if you have any suggestion). -- http://i.stack.imgur.com/EN3dy.png

The EAV part of the schema basically has the following tables:

  • property
  • property_value_varchar
  • property_value_text
  • property_value_number
  • property_value_boolean
  • property_value_datetime
  • entity_tables

Ok, so, since the Entities aren't "directly connected" to the EAV part, I'm using the entity_tables table just as a reference to the actual tables, so, with the example above, the entity_tables table should look something like this:

---------------------------------------
|entity_table_id | entity_table_name  |
|      1         |       person       |
|      2         |       site         |
|      .         |        .           |
|      .         |        .           |
---------------------------------------

The property table is the one that actually holds the different properties that any entity can hold, say "PERSON_FIRST_NAME" or "LOCATION_NAME", or anything else.

The property_value_* tables are all exactly the same except on the datatype of the property_value. These are the ones that hold the actual value of each Entity's object's property, which get mapped by the entity_table_id and entity_object_id.

Let me give you a possible instance of the database, for clarity:

Person table
-------------
| person_id |
|     1     |
|     2     |
-------------

Site table
-----------
| site_id |
|    1    |
|    2    |
-----------

entity_tables table
---------------------------------------
|entity_table_id | entity_table_name  |
|      1         |       person       |
|      2         |       site         |
---------------------------------------

property table
-------------------------------------
| property_id |    property_code    |
|      1      |   PERSON_FIRST_NAME |
|      2      |   PERSON_LAST_NAME  |
|      3      |   PERSON_BIRTH_DATE |
|      4      |   SITE_NAME         |
|      5      |   SITE_PHONE_NR_1   |
|      6      |   SITE_PHONE_NR_2   |
|      7      |   SITE_LATITUDE     |
|      8      |   SITE_LONGITUDE    |
|      9      |   SITE_CITY         |
|     10      |   SITE_COUNTRY      |
|     11      |   SITE_ZIP_CODE     |
-------------------------------------

property_value_varchar table
-----------------------------------------------------------------------------------------
| property_value_id | property_id | entity_table_id | entity_object_id | property_value |
|         1         |      1      |        1        |         1        |     Richard    |
|         2         |      2      |        1        |         1        |     Hammer     |
|         3         |      1      |        1        |         2        |     Bruce      |
|         4         |      2      |        1        |         2        |     Heaton     |
|         5         |      4      |        2        |         1        |     BatCave    |
|         6         |      5      |        2        |         1        |  +49123456789  |
|         7         |      4      |        2        |         2        |   BigCompany   |
|         8         |      5      |        2        |         2        |    987654321   |
|         9         |      6      |        2        |         2        |    147852369   |
|        10         |      9      |        2        |         2        |      Berlin    |
|        11         |     10      |        2        |         2        |     Germany    |
|        12         |     11      |        2        |         2        |      14167     |
-----------------------------------------------------------------------------------------

property_value_datetime table
-----------------------------------------------------------------------------------------
| property_value_id | property_id | entity_table_id | entity_object_id | property_value |
|         1         |      3      |        1        |         1        |   1985-05-31   |
-----------------------------------------------------------------------------------------

property_value_number table
-----------------------------------------------------------------------------------------
| property_value_id | property_id | entity_table_id | entity_object_id | property_value |
|         1         |      7      |        2        |         1        |    1.402636    |
|         2         |      8      |        2        |         1        |    7.273922    |
-----------------------------------------------------------------------------------------

(property_value_text and property_value_boolean tables are empty)

As you could see, not all objects of each entity have necessarily the same properties (attributes). The domain is really loose like that.

So, now like so many people before me, I'm not sure how to retrieve all of this information in a readable way, namely how can I get all of the information concerning the records of the Person table or of the Site table?

Namely, how can I get something like this:

 Person table view
----------------------------------------------------
| Person ID |     Property code   | Property value |
|     1     |   PERSON_FIRST_NAME |    Richard     |
|     1     |   PERSON_LAST_NAME  |    Hammer      |
|     1     |   PERSON_BIRTH_DATE |   1985-05-31   |
|     2     |   PERSON_FIRST_NAME |    Bruce       |
|     2     |   PERSON_LAST_NAME  |    Heaton      |
----------------------------------------------------

 Site table view
------------------------------------------------
| Site ID | Property code    |  Property value |
|    1    |  SITE_NAME       |  Batcave        |
|    1    |  SITE_PHONE_NR_1 |  +49123456789   |
|    1    |  SITE_LATITUDE   |  1.402636       |
|    1    |  SITE_LONGITUDE  |  7.273922       |
|    2    |  SITE_NAME       |  BigCompany     |
|    2    |  SITE_PHONE_NR_1 |  987654321      |
|    2    |  SITE_PHONE_NR_2 |  147852369      |
|    2    |  SITE_CITY       |  Berlin         |
|    2    |  SITE_COUNTRY    |  Germany        |
|    2    |  SITE_ZIP_CODE   |  14167          |
------------------------------------------------

Or even like this, if it is easier:

Person table view
------------------------------------------------------------------------
| Person ID | PERSON_FIRST_NAME | PERSON_LAST_NAME | PERSON_BIRTH_DATE |
|     1     |      Richard      |       Hammer     |     1985-05-31    |
|     2     |       Bruce       |       Heaton     |                   |
------------------------------------------------------------------------

Site table view
----------------------------------------------------------------------------------------------------------------------------------------
| Site ID | SITE_NAME  | SITE_PHONE_NR_1 | SITE_PHONE_NR_2 | SITE_LATITUDE | SITE_LONGITUDE | SITE_CITY | SITE_COUNTRY | SITE_ZIP_CODE |
|    1    | Batcave    |   +49123456789  |                 |   1.402636    |    7.273922    |           |              |               |
|    2    | BigCompany |    987654321    |   147852369     |               |                |  Berlin   |   Germany    |     14167     |
----------------------------------------------------------------------------------------------------------------------------------------

I realize this can be quite confusing. Please let me know how else I can help you help me, like more information or better explaining of some part.

I also don't expect 1 SQL query (per entity) to do the trick. I realize that more than 1 query is likely and that it/they would very likely need to be "assembled" by PHP (for instance), in order to really make it dynamic. So even if someone could even just explain me how I could get all of this info just for the hypothetical properties (attribute) that I have above, I would already be immensely grateful!

Thank you for any help!

1
Unfortunately, EAV is generally an antipattern. It makes the kinds of queries you want very cumbersome to write -- you wind up having to "hard-code" a lot of the stuff that you have carefully avoided "hard-coding" into a schema, losing any benefit. In particular, the type of a SELECT query column cannot depend on the contents of a table -- it must be known at query time, or discoverable from the type of a table column or expression, so the only way to write a query that would produce output like your first example would be to join on every result_value_... table and merge... - j_random_hacker
... the results into a single field with COALESCE() or similar. But then the type of the column will necessarily become some lowest-common-denominator type (probably unbounded VARCHAR), so you can't perform most sensible operations on them (e.g. you could not add 5 to each value, even if you knew they were originally numeric, without first explicitly converting to a numeric type). Some RDBMSs (perhaps MySQL, I dunno) have "pivoting" extensions that might allow rows to be converted into columns as per your later snippets, but this is TTBOMK not standard SQL. - j_random_hacker
If you went with EAV in the first place to allow users to write highly customisable queries, my suggestion would be to redesign your DB schema in the usual way (i.e. with a fixed set of hardcoded columns), and offer the user query flexibility by allowing them to choose fields, tables etc. from lists that you extract from the DB via its special metadata system tables (all DBs offer this, e.g. via INFORMATION_SCHEMA) and then build a customised SELECT command from that. - j_random_hacker
OTOH if you went with EAV to allow users the ultimate flexibility in adding columns to a table, you'll have to accept that this flexibility means that the DB "knows very little" about that data, so it won't be possible to do much more with it than treat it as blobs of arbitrary text or binary data. (E.g. as you'll find, trying to allow such "columns" to several different types is a nightmare, and forget trying to index by such a column.) Finally, EAV is essentially this: en.wikipedia.org/wiki/Inner-platform_effect - j_random_hacker
If you want to allow these users total flexibility, then you're essentially writing a generic front-end to a RDBMS -- and I'm sure that such things exist already. But I think you probably want to only offer them some flexibility. In that case I'd suggest setting up fixed tables (and indices, etc.) in your DB schema for the main business entities that won't change. If you want to offer the user the ability to add extra properties to the rows of these tables, you could create ALTER TABLE commands from their input, but I think a better (safer) solution would be to... - j_random_hacker

1 Answers

1
votes

This was a fun question! This can be handled with dynamic sql. In the code below, the schema has been recreated with temp tables. The code could be turned into a stored procedure that takes an entity_table_id as a parameter and then selects the entity_object_id as entity_table_name + 'id' followed by every property_value as columns with the corresponding property_code as the headings.

-- load EAV tables
if object_id('tempdb..#entity_tables') is not null
    drop table #entity_tables
create table #entity_tables(entity_table_id int,entity_table_name varchar(255))
insert into #entity_tables values
    (1,'person'),
    (2,'site')
if object_id('tempdb..#property') is not null
    drop table #property
create table #property(property_id int,property_code varchar(255))
insert into #property values
    (1,'PERSON_FIRST_NAME'),
    (2,'PERSON_LAST_NAME'),
    (3,'PERSON_BIRTH_DATE'),
    (4,'SITE_NAME'),
    (5,'SITE_PHONE_NR_1'),
    (6,'SITE_PHONE_NR_2'),
    (7,'SITE_LATITUDE'),
    (8,'SITE_LONGITUDE'),
    (9,'SITE_CITY'),
    (10,'SITE_COUNTRY'),
    (11,'SITE_ZIP_CODE')
if object_id('tempdb..#property_value_varchar') is not null
    drop table #property_value_varchar
create table #property_value_varchar(property_value_id int,property_id int,entity_table_id int,entity_object_id int,property_value varchar(255))
insert into #property_value_varchar values
    (1,1,1,1,'Richard'),
    (2,2,1,1,'Hammer'),
    (3,1,1,2,'Bruce'),
    (4,2,1,2,'Heaton'),
    (5,4,2,1,'BatCave'),
    (6,5,2,1,'+49123456789'),
    (7,4,2,2,'BigCompany'),
    (8,5,2,2,'987654321'),
    (9,6,2,2,'147852369'),
    (10,9,2,2,'Berlin'),
    (11,10,2,2,'Germany'),
    (12,11,2,2,'14167')
if object_id('tempdb..#property_value_datetime') is not null
    drop table #property_value_datetime
create table #property_value_datetime(property_value_id int,property_id int,entity_table_id int,entity_object_id int,property_value datetime)
insert into #property_value_datetime values
(1,3,1,1,'1985-05-31')
if object_id('tempdb..#property_value_number') is not null
    drop table #property_value_number
create table #property_value_number(property_value_id int,property_id int,entity_table_id int,entity_object_id int,property_value float)
insert into #property_value_number values
(1,7,2,1,1.402636),
(2,8,2,1,7.273922)

-- create dynamic sql to get all data conditioned on #entity_tables.table_id value
declare @tableid int,@sql varchar(max)
set @tableid = 1 -- this could be passed as a parameter

-- get pivot code with #ColumnList# placeholders to be added below
select @sql = 'select entity_object_id ' + entity_table_name + 'id,
    #ColumnListCast#
from    (
        select
            e.entity_table_name,
            pv.entity_object_id,
            pv.property_value,
            p.property_code
        from #entity_tables e
            inner join  (
                        select entity_table_id,entity_object_id,property_id,property_value from #property_value_varchar union all
                        select entity_table_id,entity_object_id,property_id,cast(property_value as varchar(255)) from #property_value_datetime union all
                        select entity_table_id,entity_object_id,property_id,cast(property_value as varchar(255)) from #property_value_number
                        ) pv
                on pv.entity_table_id = e.entity_table_id
            inner join #property p
                on p.property_id = pv.property_id
        where e.entity_table_id = ' + cast(@tableid as varchar(5)) + '
        ) p
    pivot   (
            max(property_value)
            for property_code in    (
                                    #ColumnList#
                                    )
            ) piv' from #entity_tables where entity_table_id = @tableid

-- get column list with cast version for diffferent data types
declare @ColumnList varchar(max),
        @ColumnListCast nvarchar(max)
set @ColumnList = ''
set @ColumnListCast = ''
select  @ColumnList = @ColumnList + '[' + p.property_code + ']' + case row_number() over(order by p.property_id desc) when 1 then '' else ',' end,
        @ColumnListCast = @ColumnListCast + 'cast([' + p.property_code + '] as ' + t.CastValue + ') [' + p.property_code + ']' + case row_number() over(order by p.property_id desc) when 1 then '' else ',' end
from #property p
    inner join  (
                select property_id,'varchar(255)' CastValue from #property_value_varchar where entity_table_id = @tableid union
                select property_id,'datetime' CastValue from #property_value_datetime where entity_table_id = @tableid union
                select property_id,'float' CastValue from #property_value_number where entity_table_id = @tableid
                ) t
        on t.property_id = p.property_id
order by p.property_id

set @sql = replace(replace(@sql,'#ColumnList#',@ColumnList),'#ColumnListCast#',@ColumnListCast)

exec(@sql)