
I am writing a web app that uses eight Postgres queries on a quite small database (circa 300 records). There is only one query that is repeatedly unsatisfactory. It seems strange to me that the bad query, which basically looks up a single record by an indexed ID number, collects various fields, and then performs a single join, is slower than my search function, which could conceivably go through all 300 records (and also performs a join). Am I misunderstanding queries or is there perhaps a method of indexing that would speed up the situation?

    SELECT id, title, datepub, description, comments,
           price, publisher, placepub, numberofimages, name
    FROM book
    LEFT JOIN author
      ON book.author = author.auth_id
    WHERE id = 11003;

Here is how Postico created the table:

     CREATE TABLE author (
       auth_id integer NOT NULL,
       name character varying(250)
     );

     CREATE SEQUENCE author_id_seq
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
     NO MAXVALUE
     CACHE 1;

     ALTER TABLE author_id_seq OWNER TO evanbates;
     ALTER SEQUENCE author_id_seq OWNED BY author.auth_id;

     ALTER TABLE ONLY author ALTER COLUMN auth_id SET DEFAULT  
     nextval('author_id_seq'::regclass);

     ALTER TABLE ONLY author
     ADD CONSTRAINT author_pkey PRIMARY KEY (auth_id);

I created the index on author with:

     CREATE UNIQUE INDEX author_pkey ON author USING btree (auth_id)

I created the book table with:

     CREATE TABLE book (
     id integer NOT NULL,
     author integer,
     title text,
     subtitle text,
     datepub text,
     editiontext character varying(100),
     description text,
     comments text,
     bindingtext character varying(50),
     first boolean,
     dj boolean,
     signedtext boolean,
     isbn character varying(25),
     subject1 integer,
     subject2 integer,
     subject3 integer,
     subject4 integer,
     price numeric,
     location character varying(50),
     paid money,
     keyword1 text,
     bookcondition character varying(50),
     featured boolean,
     illustrator character varying(100),
     quantity smallint,
     dateadded date DEFAULT ('now'::text)::date,
     publisher character varying(100),
     placepub character varying(100),
     numberofimages smallint,
     imgurl text,
     hook character varying(75),
     display smallint DEFAULT '1'::smallint
     );


     CREATE SEQUENCE book_id_seq
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
     NO MAXVALUE
     CACHE 1;


     ALTER TABLE book_id_seq OWNER TO evanbates;
     ALTER SEQUENCE book_id_seq OWNED BY book.id;

     ALTER TABLE ONLY book ALTER COLUMN id SET DEFAULT       
     nextval('book_id_seq'::regclass);

     ALTER TABLE ONLY book
     ADD CONSTRAINT book_pkey PRIMARY KEY (id);

     ALTER TABLE ONLY book
     ADD CONSTRAINT book_author_fkey FOREIGN KEY (author) REFERENCES     
     author(auth_id);

     ALTER TABLE ONLY book
     ADD CONSTRAINT book_subject1_fkey FOREIGN KEY (subject1) REFERENCES    
     subject(subj_id);

     ALTER TABLE ONLY book
     ADD CONSTRAINT book_subject2_fkey FOREIGN KEY (subject2) REFERENCES   
     subject(subj_id);

     ALTER TABLE ONLY book
     ADD CONSTRAINT book_subject3_fkey FOREIGN KEY (subject3) REFERENCES    
     subject(subj_id);

     ALTER TABLE ONLY book
     ADD CONSTRAINT book_subject4_fkey FOREIGN KEY (subject4) REFERENCES   
     subject(subj_id);

I have two indexes on book:

    CREATE UNIQUE INDEX book_pkey ON book USING btree (id)

    and

    CREATE INDEX first_idx ON book USING gin (title gin_trgm_ops)

My EXPLAIN ANALYZE is:

    Hash Right Join  (cost=8.18..15.35 rows=1 width=426) (actual time=0.089..0.093 rows=1 loops=1)
      Hash Cond: (author.auth_id = book.author)
      ->  Seq Scan on author  (cost=0.00..6.03 rows=303 width=33) (actual time=0.005..0.031 rows=304 loops=1)
      ->  Hash  (cost=8.17..8.17 rows=1 width=401) (actual time=0.013..0.013 rows=1 loops=1)
            Buckets: 1024  Batches: 1  Memory Usage: 9kB
            ->  Index Scan using book_pkey on book  (cost=0.15..8.17 rows=1 width=401) (actual time=0.008..0.008 rows=1 loops=1)
                  Index Cond: (id = 11003)
    Planning time: 0.213 ms
    Execution time: 0.126 ms
    (9 rows)

After adding CREATE INDEX book_author_idx ON book USING btree (author) (see below), my EXPLAIN ANALYZE is:

   Hash Right Join  (cost=8.18..15.35 rows=1 width=426) (actual  time=0.089..0.093 rows=1 loops=1)
     Hash Cond: (author.auth_id = book.author)
     ->  Seq Scan on author  (cost=0.00..6.03 rows=303 width=33) (actual time=0.005..0.035 rows=304 loops=1)
     ->  Hash  (cost=8.17..8.17 rows=1 width=401) (actual time=0.012..0.012 rows=1 loops=1)
           Buckets: 1024  Batches: 1  Memory Usage: 9kB
           ->  Index Scan using book_pkey on book  (cost=0.15..8.17 rows=1 width=401) (actual time=0.008..0.009 rows=1 loops=1)
                Index Cond: (id = 11003)
   Planning time: 0.223 ms
   Execution time: 0.127 ms
   (9 rows)
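For a closer look at where the time goes, the same plan can also be requested with buffer statistics (standard PostgreSQL `EXPLAIN` options; the query is the one shown above):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, title, datepub, description, comments,
       price, publisher, placepub, numberofimages, name
FROM book
LEFT JOIN author
  ON book.author = author.auth_id
WHERE id = 11003;
```

With only around 300 rows in author, the planner will often prefer a sequential scan plus hash join over an index lookup even when a matching index exists, which is consistent with the plan not changing after book_author_idx was added.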
Please show your DDL & read How to Ask and minimal reproducible example. – philipxy

0.1 seconds is unsatisfactory? How fast do you need that to be? And please make sure you preserve the formatting of the execution plan. The indentation of the plan steps is important information when reading the plan. – a_horse_with_no_name

Fixed the EA formatting. It's not really the actual time that bugs me; I just feel like it should be faster than it is. I found it productive to run my queries through explain.depesz.com. I am going to remove the joins for some of them. – eabates

1 Answer


Try running the query directly in PostgreSQL using pgAdmin. Also include

  EXPLAIN ANALYZE
  <YourQuery>

so we can see the plan created by the optimizer. Check for an index scan (good) rather than a full sequential scan (bad).

My guess is that you have an index for id but not for auth_id, so the LEFT JOIN is slow. Make sure both join columns have an index.
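Given the schema in the question, this suggestion amounts to indexing the foreign-key column on the book side, since auth_id is already covered by the author_pkey index:

```sql
-- author.auth_id is already indexed via its primary key.
-- PostgreSQL does not index foreign-key columns automatically,
-- so book.author needs an explicit index:
CREATE INDEX book_author_idx ON book USING btree (author);
```

(The question's later edit shows this index being created, with no change in the plan for this particular single-row lookup.)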