2
votes

I have a table with 50 million rows. One column, u_sphinx, is very important; its possible values are 1, 2, and 3. Currently all rows have the value 3, but when I check for new rows (u_sphinx = 1) the query is very slow. What could be wrong? Could the index be broken? Server: Debian, 8GB, 4x Intel(R) Xeon(R) CPU E3-1220 V2 @ 3.10GHz

Table structure:

base=> \d u_user
Table "public.u_user"
         Column          |       Type        |                       Modifiers                       
 u_ip                    | character varying | 
 u_agent                 | text              | 
 u_agent_js              | text              | 
 u_resolution_id         | integer           | 
 u_os                    | character varying | 
 u_os_id                 | smallint          | 
 u_platform              | character varying | 
 u_language              | character varying | 
 u_language_id           | smallint          | 
 u_language_js           | character varying | 
 u_cookie                | smallint          | 
 u_java                  | smallint          | 
 u_color_depth           | integer           | 
 u_flash                 | character varying | 
 u_charset               | character varying | 
 u_doctype               | character varying | 
 u_compat_mode           | character varying | 
 u_sex                   | character varying | 
 u_age                   | character varying | 
 u_theme                 | character varying | 
 u_behave                | character varying | 
 u_targeting             | character varying | 
 u_resolution            | character varying | 
 u_user_hash             | bigint            | 
 u_tech_hash             | character varying | 
 u_last_target_data_time | integer           | 
 u_last_target_prof_time | integer           | 
 u_id                    | bigint            | not null default nextval('u_user_u_id_seq'::regclass)
 u_sphinx                | smallint          | not null default 1::smallint
Indexes:
    "u_user_u_id_pk" PRIMARY KEY, btree (u_id)
    "u_user_hash_index" btree (u_user_hash)
    "u_user_u_sphinx_ind" btree (u_sphinx)

Slow query:

base=> explain analyze SELECT u_id FROM u_user WHERE u_sphinx = 1 LIMIT 1;
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.15 rows=1 width=8) (actual time=485146.252..485146.252 rows=0 loops=1)
   ->  Seq Scan on u_user  (cost=0.00..3023707.80 rows=19848860 width=8) (actual time=485146.249..485146.249 rows=0 loops=1)
         Filter: (u_sphinx = 1)
         Rows Removed by Filter: 23170476
 Total runtime: 485160.241 ms
(5 rows)

Solved:

After adding a partial index:

base=> explain analyze SELECT u_id FROM u_user WHERE u_sphinx = 1 LIMIT 1;
                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.27..4.28 rows=1 width=8) (actual time=0.063..0.063 rows=0 loops=1)
   ->  Index Scan using u_user_u_sphinx_index_1 on u_user  (cost=0.27..4.28 rows=1 width=8) (actual time=0.061..0.061 rows=0 loops=1)
         Index Cond: (u_sphinx = 1)
 Total runtime: 0.106 ms

Thanks to @Kouber Saparev.


2 Answers

2
votes

Try making a partial index.

CREATE INDEX u_user_u_sphinx_idx ON u_user (u_sphinx) WHERE u_sphinx = 1;
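
On a table this size, building the index can take a while and a plain CREATE INDEX blocks writes while it runs. A minimal sketch of a non-blocking build followed by a check of the plan, assuming a PostgreSQL version that supports CONCURRENTLY; the index name u_user_u_sphinx_new_idx is hypothetical and only used for illustration:

```sql
-- Build the partial index without blocking inserts/updates on u_user.
-- Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY u_user_u_sphinx_new_idx  -- hypothetical name
    ON u_user (u_sphinx)
    WHERE u_sphinx = 1;

-- Check whether the planner now chooses the partial index
-- instead of a sequential scan.
EXPLAIN ANALYZE
SELECT u_id FROM u_user WHERE u_sphinx = 1 LIMIT 1;
```

Because the index only contains rows where u_sphinx = 1, it stays very small as long as such rows are rare, which is why the lookup is cheap even when no matching row exists.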
1
votes

Your query plan suggests the planner believes the value 1 is very common in the table (it estimates 19,848,860 matching rows). Under that assumption, it expects to find a matching row by reading a disk page or two sequentially, which looks cheaper than the overhead of traversing an index and then fetching a row from a random disk page.

This could be an indication that you forgot to run ANALYZE on the table, so the planner does not have proper statistics:

ANALYZE u_user;
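
To see whether stale statistics are really the cause, it may help to look at what the planner currently knows. A rough sketch using the standard pg_stat_user_tables and pg_stats views, assuming the table lives in the public schema as shown in the question:

```sql
-- When were statistics last gathered for the table, manually or by autovacuum?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'u_user';

-- What the planner believes about the distribution of u_sphinx values.
SELECT most_common_vals, most_common_freqs
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'u_user'
  AND attname    = 'u_sphinx';
```

If most_common_vals still lists 1 with a high frequency even though every row now holds 3, the statistics are out of date and the row estimate in the slow plan follows from that.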