1
votes

I'm working on a query using the PostGIS extension that implements a 'spatial join' work. Running the query took an incredibly long time and failed in the end. The query is as follows:

CREATE INDEX buffer_table_geom_idx ON buffer_table USING GIST (geom);
CREATE INDEX point_table_geom_idx ON point_table USING GIST (geom);

SELECT
    point_table.*,
    buffer_table.something
FROM
    point_table
    LEFT JOIN buffer_table ON ST_Intersects (buffer_table.geom, point_table.geom);

where the point_table stands for a table that contains over 10 million rows of point records; the buffer_table stands for a table that contains only one multi-polygon geometry.

I would want to know if there is anything wrong with my code and ways to adjust. Thanks in advance.

1
Can you add the query plan? -> explain analyse Also, could you add some sample data? - Jim Jones
@JimJones Hi, Jim. I run the 'EXPLAIN ANALYSE' query in PgAdmin4. However, the query is unable to be run. The interface keeps showing 'Waiting for the query to complete'. - user12949097

1 Answers

2
votes

With a LEFT JOIN you're going through every single record of point_table and therefore ignoring the index. Try this and see the difference:

SELECT point_table.*
FROM point_table
JOIN buffer_table ON ST_Contains(buffer_table.geom, point_table.geom);

Divide and conquer with ST_SubDivide

Considering the size of your multipolygon (see comments), it might be interesting to divide it into smaller pieces, so that the number of vertices for each containment/intersection calculation also gets reduced, consequently making the query less expensive.

First divide the large geometry into smaller pieces and store in another table (you can also use a CTE/Subquery)

CREATE TABLE buffer_table_divided AS
SELECT ST_SubDivide(geom) AS geom FROM buffer_table

CREATE INDEX buffer_table_geom_divided_idx ON buffer_table_divided USING GIST (geom);

.. and perform your query once again against this new table:

SELECT point_table.*
FROM point_table
JOIN buffer_table_divided d ON ST_Contains (d.geom, point_table.geom);

Demo: db<>fiddle