293
votes

What's the simplest SQL statement that will return the duplicate values for a given column and the count of their occurrences in an Oracle database table?

For example: I have a JOBS table with the column JOB_NUMBER. How can I find out if I have any duplicate JOB_NUMBERs, and how many times they're duplicated?

13

13 Answers

642
votes

Aggregate the column by COUNT, then use a HAVING clause to find values that appear greater than one time.

SELECT column_name, COUNT(column_name)
FROM table_name
GROUP BY column_name
HAVING COUNT(column_name) > 1;
59
votes

Another way:

SELECT *
FROM TABLE A
WHERE EXISTS (
  SELECT 1 FROM TABLE
  WHERE COLUMN_NAME = A.COLUMN_NAME
  AND ROWID < A.ROWID
)

Works fine (quick enough) when there is index on column_name. And it's better way to delete or update duplicate rows.

35
votes

Simplest I can think of:

select job_number, count(*)
from jobs
group by job_number
having count(*) > 1;
17
votes

You don't need to even have the count in the returned columns if you don't need to know the actual number of duplicates. e.g.

SELECT column_name
FROM table
GROUP BY column_name
HAVING COUNT(*) > 1
7
votes

How about:

SELECT <column>, count(*)
FROM <table>
GROUP BY <column> HAVING COUNT(*) > 1;

To answer the example above, it would look like:

SELECT job_number, count(*)
FROM jobs
GROUP BY job_number HAVING COUNT(*) > 1;
6
votes

In case where multiple columns identify unique row (e.g relations table ) there you can use following

Use row id e.g. emp_dept(empid, deptid, startdate, enddate) suppose empid and deptid are unique and identify row in that case

select oed.empid, count(oed.empid) 
from emp_dept oed 
where exists ( select * 
               from  emp_dept ied 
                where oed.rowid <> ied.rowid and 
                       ied.empid = oed.empid and 
                      ied.deptid = oed.deptid )  
        group by oed.empid having count(oed.empid) > 1 order by count(oed.empid);

and if such table has primary key then use primary key instead of rowid, e.g id is pk then

select oed.empid, count(oed.empid) 
from emp_dept oed 
where exists ( select * 
               from  emp_dept ied 
                where oed.id <> ied.id and 
                       ied.empid = oed.empid and 
                      ied.deptid = oed.deptid )  
        group by oed.empid having count(oed.empid) > 1 order by count(oed.empid);
4
votes

Doing

select count(j1.job_number), j1.job_number, j1.id, j2.id
from   jobs j1 join jobs j2 on (j1.job_numer = j2.job_number)
where  j1.id != j2.id
group by j1.job_number

will give you the duplicated rows' ids.

4
votes
SELECT   SocialSecurity_Number, Count(*) no_of_rows
FROM     SocialSecurity 
GROUP BY SocialSecurity_Number
HAVING   Count(*) > 1
Order by Count(*) desc 
3
votes

I usually use Oracle Analytic function ROW_NUMBER().

Say you want to check the duplicates you have regarding a unique index or primary key built on columns (c1, c2, c3). Then you will go this way, bringing up ROWID s of rows where the number of lines brought by ROW_NUMBER() is >1:

Select * From Table_With_Duplicates
      Where Rowid In
                    (Select Rowid
                       From (Select Rowid,
                                    ROW_NUMBER() Over (
                                            Partition By c1 || c2 || c3
                                            Order By c1 || c2 || c3
                                        ) nbLines
                               From Table_With_Duplicates) t2
                      Where nbLines > 1)
2
votes

I know its an old thread but this may help some one.

If you need to print other columns of the table while checking for duplicate use below:

select * from table where column_name in
(select ing.column_name from table ing group by ing.column_name having count(*) > 1)
order by column_name desc;

also can add some additional filters in the where clause if needed.

1
votes

Here is an SQL request to do that:

select column_name, count(1)
from table
group by column_name
having count (column_name) > 1;
0
votes

1. solution

select * from emp
    where rowid not in
    (select max(rowid) from emp group by empno);
-1
votes

Also u can try something like this to list all duplicate values in a table say reqitem

SELECT count(poid) 
FROM poitem 
WHERE poid = 50 
AND rownum < any (SELECT count(*)  FROM poitem WHERE poid = 50) 
GROUP BY poid 
MINUS
SELECT count(poid) 
FROM poitem 
WHERE poid in (50)
GROUP BY poid 
HAVING count(poid) > 1;