On greenplum, I have a big table called fact_table which is partitioned by RANGE(day_bucket). Why is so slow for the following query:
select max(day_bucket) from fact_table where day_bucket >= '2011-09-11 00:00:00' and day_bucket < '2011-12-14'.
I suppose it should just look at the head of each partition and return the result immediately, since each partition of the same day_bucket column. But greenplum did a FULL scan to calculate the result. Anybody can explain to me the reason?
Update:
Thanks for answering my question, but it doesn't help with your tip. Greenplum always do a full scan, even though I create the table with PARTITION BY LIST(day_bucket):
CREATE TABLE fact_table (
id character varying(25) NOT NULL,
day_bucket timestamp without time zone NOT NULL,
)
WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=6) DISTRIBUTED BY (user_id) PARTITION BY LIST(day_bucket)
(
PARTITION p20120101 VALUES ('2012-01-01 00:00:00'::timestamp without time zone) WITH (tablename='fact_table_1_prt_p20120101', appendonly=true, orientation=column, compresstype=zlib, compresslevel=6 ),
PARTITION p20120102 VALUES ('2012-01-02 00:00:00'::timestamp without time zone) WITH (tablename='fact_table_1_prt_p20120102', appendonly=true, orientation=column, compresstype=zlib, compresslevel=6 ),
PARTITION p20120103 VALUES ('2012-01-03 00:00:00'::timestamp without time zone) WITH (tablename='fact_table_1_prt_p20120103', appendonly=true, orientation=column, compresstype=zlib, compresslevel=6 ),
PARTITION p20120104 VALUES ('2012-01-04 00:00:00'::timestamp without time zone) WITH (tablename='fact_table_1_prt_p20120104', appendonly=true, orientation=column, compresstype=zlib, compresslevel=6 ),
.....
Explain command shows that it always do a full scan:
-> Append-only Columnar Scan on mytestlist_1_prt_p20120102 mytestlist (cost=0.00..34.95 rows=1 width=8) Filter: day_bucket >= '2012-01-02 00:00:00'::timestamp without time zone AND day_bucket Append-only Columnar Scan on mytestlist_1_prt_p20120103 mytestlist (cost=0.00..39.61 rows=1 width=8) Filter: day_bucket >= '2012-01-02 00:00:00'::timestamp without time zone AND day_bucket