0
votes

I'm trying to get the document keys for all entries that have duplicate values for a particular field.

I have this query to find the duplicates:

SELECT t1.somefield, t1.fieldcount FROM
(SELECT somefield, fieldcount FROM `mybucket` WHERE `somefield` IS NOT MISSING GROUP BY somefield
LETTING fieldcount = COUNT(somefield) ) as t1
WHERE t1.fieldcount > 1;

and need to get the META().id for each result. Can I do this in a single query?

1

1 Answers

0
votes

USE ARRAY_AGG(META().id)

SELECT d.somefield, cnt, docids
FROM mybucket AS d
WHERE d.somefield IS NOT NULL
GROUP BY d.somefield
LETTING cnt = COUNT(1), docids = ARRAY_AGG(META(d).id)
HAVING cnt > 1;

OR

SELECT d.somefield, ARRAY_LENGTH(docids) AS cnt, docids
FROM mybucket AS d
WHERE d.somefield IS NOT NULL
GROUP BY d.somefield
LETTING docids = ARRAY_AGG(META(d).id)
HAVING ARRAY_LENGTH(docids) > 1;