I have two tables in Redshift: t1 and t2. t2 already contains ~300 000 000 records, and t1 contains ~10 000 000 records. I need to delete all records from t1 that are already present in t2, based on the id field.
To do this, I plan to run one of the following queries:
DELETE FROM t1 WHERE id IN(SELECT id FROM t2);
or
DELETE FROM t1 USING t2 WHERE t1.id = t2.id;
or
DELETE FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t1.id = t2.id);
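As far as I understand, the three variants can be compared without modifying any data by prefixing each with EXPLAIN, which in Redshift prints the query plan without running the statement. A minimal sketch, assuming the same t1 and t2 tables as above (the plan output depends on the tables' distribution and sort keys, so none is shown here):

```sql
-- EXPLAIN only displays the execution plan; no rows are deleted.

-- Variant 1: IN with a subquery
EXPLAIN
DELETE FROM t1 WHERE id IN (SELECT id FROM t2);

-- Variant 2: DELETE ... USING (join-style delete)
EXPLAIN
DELETE FROM t1 USING t2 WHERE t1.id = t2.id;

-- Variant 3: correlated EXISTS
EXPLAIN
DELETE FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t1.id = t2.id);
```

If all three plans show the same join strategy and similar cost estimates, the choice between them probably matters less than how the id columns are distributed and sorted.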
Before I run this on real data, I'd like to ask: is it a good idea to use such queries in Redshift from a performance point of view, or are there other (better) techniques for this case?