4
votes

I'm using PostGIS with Postgresql in order to be able to locate entries within some radius by coordinates stored in location column Geometry/Point SRID: 4326. Here are two queries that I'm experimenting with:

First one with distance in meters and use_spheroid=true

EXPLAIN ANALYZE SELECT count(*) FROM "cars" WHERE ST_DWithin(location, ST_SetSRID(ST_MakePoint(20, -30), 4326), 105000, true) LIMIT 1000;
                                                                                                                                                                                             QUERY PLAN                                                                                                                                                                                              
--------------
 Limit  (cost=11884.28..11884.30 rows=1 width=8) (actual time=18.843..18.844 rows=1 loops=1)
   ->  Aggregate  (cost=11884.28..11884.30 rows=1 width=8) (actual time=18.842..18.843 rows=1 loops=1)
         ->  Seq Scan on cars  (cost=0.00..11883.33 rows=381 width=0) (actual time=0.486..18.827 rows=38 loops=1)
               Filter: (((location)::geography && '0101000020E610000000000000000034400000000000003EC0'::geography) AND ('0101000020E610000000000000000034400000000000003EC0'::geography && _st_expand((location)::geography, '105000'::double precision)) AND _st_dwithin((location)::geography, '0101000020E610000000000000000034400000000000003EC0'::geography, '105000'::double precision, true))
               Rows Removed by Filter: 28549
 Planning time: 0.166 ms
 Execution time: 18.878 ms
(7 rows)

Second, I assume, accepts distance in degrees and use_spheroid is false by default. CORRECTION: Turned out this still uses use_spheroid=true, but function signature matching this call expects geometries and SRID units, which is degrees for 4326.

EXPLAIN ANALYZE SELECT count(*) FROM "cars" WHERE ST_DWithin(location, ST_SetSRID(ST_MakePoint(20, -30), 4326), 1) LIMIT 1000;
                                                                                                                          QUERY PLAN                                                                                                                          
-----------------------------
 Limit  (cost=145.30..145.31 rows=1 width=8) (actual time=0.154..0.155 rows=1 loops=1)
   ->  Aggregate  (cost=145.30..145.31 rows=1 width=8) (actual time=0.154..0.154 rows=1 loops=1)
         ->  Bitmap Heap Scan on cars  (cost=4.59..145.29 rows=3 width=0) (actual time=0.050..0.147 rows=37 loops=1)
               Recheck Cond: (location && '0103000020E6100000010000000500000000000000000033400000000000003FC000000000000033400000000000003DC000000000000035400000000000003DC000000000000035400000000000003FC000000000000033400000000000003FC0'::geometry)
               Filter: (('0101000020E610000000000000000034400000000000003EC0'::geometry && st_expand(location, '1'::double precision)) AND _st_dwithin(location, '0101000020E610000000000000000034400000000000003EC0'::geometry, '1'::double precision))
               Rows Removed by Filter: 11
               Heap Blocks: exact=47
               ->  Bitmap Index Scan on cars_location_index  (cost=0.00..4.59 rows=42 width=0) (actual time=0.037..0.037 rows=48 loops=1)
                     Index Cond: (location && '0103000020E6100000010000000500000000000000000033400000000000003FC000000000000033400000000000003DC000000000000035400000000000003DC000000000000035400000000000003FC000000000000033400000000000003FC0'::geometry)
 Planning time: 0.280 ms
 Execution time: 0.188 ms
(11 rows)

Both queries return similar results (+/- because of precision). However first one runs 100 times slower. Also setting use_spheroid to false doesn't guarantee using of index, it falls back to Seq Scan either when the distance to small (<0.4) or too big (>45). Is this how it's supposed to be or am I doing something wrong?

ADDITION: After some more experiments I changed column types to Geography.Point and now it always uses index. Problem seems to be solved, but I'm still confused with that behavior that I observed with Geometry type.

1

1 Answers

5
votes

The ST_DWithin documentation states that the first function signature accepts geography types not geometry types :

boolean ST_DWithin(geography gg1, geography gg2, double precision distance_meters, boolean use_spheroid);

since (location,ST_SetSRID(ST_MakePoint(20, -30), 4326)) are all geometries it would make sense that the execution of the function is messy. And i think your second function was working properly because what you were executing was this signature:

boolean ST_DWithin(geometry g1, geometry g2, double precision distance_of_srid);

And as you stated that switching the column types to Geography instead of Geometry would solve the issue as that would get you the correct execution of :

boolean ST_DWithin(geography gg1, geography gg2, double precision distance_meters);

boolean ST_DWithin(geography gg1, geography gg2, double precision distance_meters, boolean use_spheroid);

Hope this helps a bit.

Edit:

Found this part in the documentation that states that upon data entry

Standard geometry type data will autocast to geography if it is of SRID 4326

this could explain why Postgres accepted your first invocation of the ST_DWithin() as postgis apparently would cast it to geography, and that also explains why the execution takes longer and neglects the index as each casting would result in a new object that is not indexed in your original column.