1
votes

I have two tables, adv_institution and institution. institution has 5000+ rows, while adv_institution has 1400+

I want to use Oracle MERGE to back-fill records to adv_institution from institution. These two tables have about four fields tin common which I can use to back-fill.

Here is my entire MERGE statement

merge into
  adv_institution to_t
using (
  select
    uni.*,
    adv_c.country_cd as con_code_text
  from
    (
      select 
        institution_cd, 
        name,
        institution_status,
        country_cd
      from 
        institution uni
      where
        uni.institution_status = 'ACTIVE' and
        uni.country_cd is not null
      group by
        institution_cd,
        name,
        institution_status,
        country_cd
      order by
        name  
    ) uni,
    country_cd c_cd,
    adv_country adv_c
  where
    uni.country_cd = c_cd.country_cd and
    c_cd.description = adv_c.country_cd
) from_t
on
(
  to_t.VENDOR_INSTITUTION_CD = from_t.INSTITUTION_CD or
  to_t.INSTITUTION_CD = from_t.NAME
)
WHEN NOT MATCHED THEN INSERT (
  to_t.INSTITUTION_CD,
  to_t.INSTITUTION_NAME,
  to_t.SHORT_NAME,
  to_t.COUNTRY_CD,

  to_t.NOTE,
  to_t.UNIT_TERMINOLOGY,
  to_t.COURSE_TERMINOLOGY,
  to_t.CLOSED_IND,

  to_t.UPDATE_WHO,
  to_t.UPDATE_ON,
  to_t.CALLISTA_INSTITUTION_CD
)
VALUES (
  from_t.NAME,
  from_t.NAME,
  '',
  from_t.con_code_text,

  '',
  'UNIT',
  'COURSE',
  'N',

  'MYUSER',
  SYSDATE,
  from_t.institution_cd
);

The error I got is

Error report - ORA-00001: unique constraint (MYUSER.ADI_PK) violated

ADI_PK means adv_institution.institution_cd is a primary key and it must be unique.

That is because in WHEN NOT MATCHED THEN INSERT there is an insert statement. I insert from_t.NAME into to_t.INSTITUTION_CD.

It looks like from_t.NAME has the same value at least twice, when inserting into to_t.INSTITUTION_CD

But I did a group statement to make sure from_t.NAME is unique:

(
      select 
        institution_cd, 
        name,
        institution_status,
        country_cd
      from 
        institution uni
      where
        uni.institution_status = 'ACTIVE' and
        uni.country_cd is not null
      group by
        institution_cd,
        name,
        institution_status,
        country_cd
      order by
        name  
    ) uni

I am not sure I understand the issue correctly. I tried all I can, but still no luck.

2

2 Answers

0
votes

I think your main issue is with group by.

Please consider below example:

 desc temp_inventory;
 Name                  Type
 --------------------- -----------
 WAREHOUSE_NO          NUMBER(2)
 ITEM_NO               NUMBER(10)
 ITEM_QUANTITY         NUMBER(10)

WAREHOUSE_NO    ITEM_NO    ITEM_QUANTITY
1               1000       100
1               2000       200
1               2000       300

If i write a query where I want warehouse_no to be unique:

select warehouse_no,item_quantity 
from temp_inventory 
group by warehouse_no,item_quantity

Its going to return the same 3 rows.. instead i want to group by..

select warehouse_no,sum(item_quantity)
from temp_inventory 
group by warehouse_no

which will make the warehouse_no unique in this situation !

Also in cases where you have VARCHAR2 columns, you can use MAX, MIN on them as aggregate functions along with group by to make a unique key in the query.

Example:

Select object_type, min(object_name) 
from user_objects group by object_type;

which will make the object_type unique & return only 1 corresponding object name for it.

So note that if there are duplicate's, in the end some records will be eliminated based on the aggregate function.

0
votes

"But I did a group statement to make sure from_t.NAME is unique:"

But your query does not do that. It produces a set of distinct combinations of (institution_cd,name,institution_status,country_cd). Clearly such a set could contain multiple recurrences of name, one for each different value of country_cd. As you have four elements in your key you are virtually guaranteeing that your set will have multiple occurrences of name.

You compound this with the or in the ON conditions, which means you trigger the UNMATCHED logic if to_t.VENDOR_INSTITUTION_CD = from_t.INSTITUTION_CD even though there is already a record in the target table where to_t.INSTITUTION_CD = from_t.NAME.

The problem is that the MERGE statement is atomic. The set of records coming from the USING subquery must contain unique keys. When Oracle finds a second occurrence of the same name in the result set it doesn't say, I've already merged one of those, let's skip it. It has to hurl ORA-00001 because there is no way for Oracle to know which record is apply, which combination of (institution_cd,name,institution_status,country_cd) is the correct one.

To solve this you need to change the USING query to produce a result set with unique keys. It's your data model, you understand its business rules, so you're in the position to rewrite it properly. But maybe something like this:

  select 
    name,
    max(institution_cd) as institution_cd,
    institution_status,
    max(country_cd) as country_cd
  from (
    institution uni
  where
    uni.institution_status = 'ACTIVE' and
    uni.country_cd is not null
  group by
    name,
    institution_status
  order by
    name  
) uni

Then you can simplify the MERGE ON clause to:

on
(
  to_t.INSTITUTION_CD = from_t.NAME
)

The use of MAX() in the subquery is an inelegant kludge. I hope you can apply better business rules.