I have table of polygons (thousands), and table of points (millions). Both tables have GIST indexes on geometry columns. Important this is, polygons do not overlap, so every point is contained by exactly one polygon. I want to generate table with this relation (polygon_id + point_id).
Trivial solution of course is
SELECT a.polygon_id, p.point_id
FROM my_polygons a
JOIN my_points p ON ST_Contains(a.geom, p.geom)
This works, but I think it is unnecessary slow, since it matches every polygon with every point - it does not know that every point can belong to one polygon only.
Is there any way to speed things up?
I tried looping for every polygon, selecting points by ST_Contains, but only those not already in the result table:
CREATE TABLE polygon2point (polygon_id uuid, point_id uuid);
DO $$DECLARE r record;
BEGIN
FOR r IN SELECT polygon_id, geom
FROM my_polygon
LOOP
INSERT INTO polygon2point (polygon_id, point_id)
SELECT r.polygon_id, p.point_id
FROM my_points p
LEFT JOIN polygon2point t ON p.point_id = t.point_id
WHERE t.point_id IS NULL AND ST_Contains(r.geom, p.geom);
END LOOP;
END$$;
This even slower than trivial JOIN approach. Any ideas?
my_points
when you callST_Contains
? What is "unnecessarily slow", can you give us the actual times? And there are only thousands (how many?) of polygons, and each contains only one point (which I assume is just a row) from the points table, is that correct? Maybe an EXPLAIN ANALYZE would be useful. – 404create table
statements for both tables including allcreate index
statements. According to the manualST_Contains()
should be able to use GIST indexes on those columns. And as 404 already said: the execution plan generated usingexplain (analyze, buffers)
would be helpful as well – a_horse_with_no_name