You can change your inner query to look for other rows with the same last_receipt
value but a different ID (assuming ID is unique); if another row exists then that is equivalent to your count returning greater than one. But you can't simply test two CLOB values for equality, you need to use dbms_lob.compare
:
select ID
from your_table t1
where exists (
select null from your_table t2
where dbms_lob.compare(t2.LATEST_RECEIPT, t1.LATEST_RECEIPT) = 0
and t2.ID != t1.ID
);
Applying the row number filter is tricker, as you also can't use a CLOB in the analytic partition by
clause. As André Schild suggested, you can use a hash; here passing the integer value 3, which is the equivalent of dbms_crypto.hash_sh1
(though in theory that could change in a future release!):
select id from (
select ID, ROW_NUMBER() over (partition by dbms_crypto.hash(LATEST_RECEIPT, 3)
order by ID) rownumber
from your_table t1
where exists (
select null from your_table t2
where dbms_lob.compare(t2.LATEST_RECEIPT, t1.LATEST_RECEIPT) = 0
and t2.ID != t1.ID
)
)
where rownumber > 1;
It is of course possible to get a hash collision, and if that happened - you had two latest_receipt
values which both appeared more than once and both hashed to the same value - then you could get too many rows back. That seems pretty unlikely, but it's something to consider.
So rather than ordering you can only look for rows which have the same lastest_receipt
and a lower ID:
select ID
from your_table t1
where exists (
select null from your_table t2
where dbms_lob.compare(t2.LATEST_RECEIPT, t1.LATEST_RECEIPT) = 0
and t2.ID < t1.ID
);
Again that assumes ID is unique. If it isn't then you could still use rowid
instead, but you would have less control over which rows were found - the lowest rowid
isn't necessarily the lowest ID. Presumably you're using this to dine rows to delete. If you actually don't mind which row you keep and which you delete then you could still do:
and t2.ROWID < t1.ROWID
But since you are currently ordering that probably isn't acceptable, and hashing might be preferable, despite the small risk.
GROUP BY
and other agregate functions. I think you will have to convert it back into aVARCHAR(much-longer)
data type – André SchildLAST_RECEIPT
text? (If yes, then you probably would need to add another column with a hash key of the text and then use this for the group by condition) – André Schild