0
votes

I have a oracle table called table1 that has a foreign key from table2. table2 has 2 columns: id, name table2.name has a lot of duplicates that need to be sorted out, so I grouped table2 by name and kept only 1 id for each name. However, table1.table2_id uses a lot of the duplicated fields.

How can I change all the table1.table2_id fields so that there are no duplicate names?

Currently: table1:

id blabla table2_id
1 row 1001
2 row 1002
3 row 1003
4 row 1004
5 row 1004
6 row 1005

table2:

id name
1001 Bob
1002 Bob
1003 Bob
1004 Jack
1005 Jack

Desired Result: table1:

id blabla table2_id
1 row 1001
2 row 1001
3 row 1001
4 row 1004
5 row 1004
6 row 1004

table2:

id name
1001 Bob
1004 Jack

So imo, I would need to:

  1. update all table1.table2_id to top 1 table2.id grouped by name
  2. delete duplicate rows from table2

And as there are thousands of duplicate fields in table2, I cannot use case for this..

The solution I am trying returns 1 id for each name, but I am failing at the part where I can update the table.

update table1 d
set d.table2_id  = 
(select
b.id
from table2 b
where b.name in (select min(id) keep(dense_rank first order by id) id
     , name
  from table2
 group by name)
order by b.id ASC
)
where d.table2_id in ( SELECT b.id FROM table2 b WHERE b.name in (select min(id) keep(dense_rank first order by id) id
     , name
  from table2
 group by name));

Please help :)

2
Which dbms are you actually using? - jarlh
Judging from the keep(...) structure, you are using Oracle. If so, pls only tag your question with Oracle, not with several random database products! - Shadow

2 Answers

1
votes

Here are syntactically correct Oracle statements

update table1 t1
set t1.table2_id = (
  select new_id
  from (
    select id, min(id) over (partition by name) new_id
    from table2
  ) d
  where d.id = t1.table2_id
);
delete from table2
where id not in (
  select min(id) from table2 group by name
);

The analytic function min(id) over (partition by name) is used here so that you can have all original ids together with their new ids (the min ids from the set where the name is the same).

0
votes

here is one way:

update table1 d
set d.table2_id = x.id_tokeep
from 
(
select name , id , min(id) over (partition by name ) as id_tokeep
from table2
) x
 where x.id = d.table2_id

delete from table2
where id not in ( select min(id) from table2 group by name)