5
votes

I have the following scenario in a Postgres 9.3 database:

  • Tables B and C reference Table A.
  • Table C has an optional field that references table B.

I would like to ensure that for each row of table C that references table B, c.b.a = c.a. That is, if C has a reference to B, both rows should point at the same row in table A.

  • I could refactor table C so that if c.b is specified, c.a is null but that would make queries joining tables A and C awkward.
  • I might also be able to make table B's primary key include its reference to table A and then make table C's foreign key to table B include table C's reference to table A but I think this adjustment would be too awkward to justify the benefit.
  • I think this can be done with a trigger that runs before insert/update on table C and rejects operations that violate the specified constraint.

Is there a better way to enforce data integrity in this situation?

3
You'd have to use a trigger and appropriate locking. - Craig Ringer
This is a faq, but to search you need to express your problem/goal/question in a concise & complete statement (many times in different ways). - philipxy

3 Answers

3
votes

There is a very simple, bullet-proof solution. Works for Postgres 9.3 - when the original question was asked. Works for the current Postgres 13 - when the question in the bounty was added:

Would like information on if this is possible to achieve without database triggers

FOREIGN KEY constraints can span multiple columns. Just include the ID of table A in the FK constraint from table C to table B. This enforces that linked rows in B and C always point to the same row in A. Like:

CREATE TABLE a (
  a_id int PRIMARY KEY
);

CREATE TABLE b (
  b_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, UNIQUE (a_id, b_id)  -- redundant, but required for FK
);

CREATE TABLE c (
  c_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, b_id int
, CONSTRAINT fk_simple_and_safe_solution
  FOREIGN KEY (a_id, b_id) REFERENCES b(a_id, b_id)  -- THIS !
);

Minimal sample data:

INSERT INTO a(a_id) VALUES
  (1)
, (2);

INSERT INTO b(b_id, a_id) VALUES
  (1, 1)
, (2, 2);

INSERT INTO c(c_id, a_id, b_id) VALUES
  (1, 1, NULL)  -- allowed
, (2, 2, 2);    -- allowed

Disallowed as requested:

INSERT INTO c(c_id, a_id, b_id) VALUES (3,2,1);
ERROR:  insert or update on table "c" violates foreign key constraint "fk_simple_and_safe_solution"
DETAIL:  Key (a_id, b_id)=(2, 1) is not present in table "b".

db<>fiddle here

The default MATCH SIMPLE behavior of FK constraints works like this (quoting the manual):

MATCH SIMPLE allows any of the foreign key columns to be null; if any of them are null, the row is not required to have a match in the referenced table.

So NULL values in c(b_id) are still allowed (as requested: "optional field"). The FK constraint is "disabled" for this special case.

We need the logically redundant UNIQUE constraint on b(a_id, b_id) to allow the FK reference to it. But by making it out to be on (a_id, b_id) instead of (b_id, a_id), it is also useful in its own right, providing a useful index on b(a_id) to support the other FK constraint, among other things. See:

(An additional index on c(a_id) is typically useful accordingly.)

Further reading:

2
votes

I ended up creating a trigger as follows:

create function "check C.A = C.B.A"()
returns trigger
as $$
begin
    if NEW.b is not null then
        if NEW.a != (select a from B where id = NEW.b) then
            raise exception 'a != b.a';
        end if;
    end if;
    return NEW;
end;
$$
language plpgsql;

create trigger "ensure C.A = C.B.A"
before insert or update on C
for each row
execute procedure "check C.A = C.B.A"();
2
votes
Would like information on if this is possible to achieve without database triggers

Yes, it is possible. The mechanism is called ASSERTION and it is defined in SQL-92 Standard(though it is not implemented by any major RDBMS).

In short it allows to create multiple-row constraints or multi-table check constraints.


As for PostgreSQL it could be emulated by using view with WITH CHECK OPTION and performing operation on view instead of base table.

WITH CHECK OPTION

This option controls the behavior of automatically updatable views. When this option is specified, INSERT and UPDATE commands on the view will be checked to ensure that new rows satisfy the view-defining condition (that is, the new rows are checked to ensure that they are visible through the view). If they are not, the update will be rejected.

Example:

CREATE TABLE a(id INT PRIMARY KEY, cola VARCHAR(10));

CREATE TABLE b(id INT PRIMARY KEY, colb VARCHAR(10), a_id INT REFERENCES a(id) NOT NULL);

CREATE TABLE c(id INT PRIMARY KEY, colc VARCHAR(10),
                a_id INT REFERENCES a(id) NOT NULL,
                b_id INT REFERENCES b(id));

Sample inserts:

INSERT INTO a(id, cola) VALUES (1, 'A');
INSERT INTO a(id, cola) VALUES (2, 'A2');
INSERT INTO b(id, colb, a_id) VALUES (12, 'B', 1);
INSERT INTO c(id, colc, a_id) VALUES (15, 'C', 2);

Violating the condition(connecting C with B different a_id on both tables)

UPDATE c SET b_id = 12 WHERE id = 15;;
-- no issues whatsover

Creating view:

CREATE VIEW view_c
AS
SELECT *
FROM c
WHERE NOT EXISTS(SELECT 1 
                 FROM b
                 WHERE c.b_id = b.id
                   AND c.a_id != b.a_id) -- here is the clue, we want a_id to be the same
WITH CHECK OPTION ;                  

Trying update second time(error):

UPDATE view_c SET b_id = 12 WHERE id = 15;
--ERROR:  new row violates check option for view "view_c"
--DETAIL:  Failing row contains (15, C, 2, 12). 

Trying brand new inserts with incorrect data(also errors)

INSERT INTO b(id, colb, a_id) VALUES (20, 'B2', 2);

INSERT INTO view_c(id, colc, a_id, b_id) VALUES (30, 'C2', 1, 20);
--ERROR:  new row violates check option for view "view_c"
--DETAIL:  Failing row contains (30, C2, 1, 20)

db<>fiddle demo