Teradata optimizer wrongly estimates row number then accessing to table through view with union

Question

Let's say I have three tables: t1(it has about 1 billion rows, fact table) and t2(empty table, 0 rows). and t0 (dimension table), all of them have properly collected statistics. In addition there is view v0:

REPLACE VIEW v0
AS SELECT * from t1
   union
   SELECT * from t2;

Let's look to these three queries:

1) Select * from t1 inner t0 join on t1.id = t0.id; -- Optimizer correctly estimates 1 bln rows

2) Select * from t2 inner t0 join on t1.id = t0.id; -- Optimizer correctly estimates 0 row

3) Select * from v0 inner t0 join on v0.id = t0.id;  -- Optimizer locks t1 and t2 for read, the correctly estimated, that it will get 1 bln rows from t1, but for no clear reasons estimated same number 1 bln from table t2.

What is going on here? Is is it the bug or a feature?

PS. Original query, that pretty big to show here, didn't finished in 35 minutes. After leaving just t1 - successfully finished in 15 minutes.

TD Release: 15.10.03.07
TD Version: 15.10.03.09

(1) For the optimizer it is (2) It is indication that you didn't describe the execution plan correctly — David דודו Markovitz

dnoeth dnoeth · Accepted Answer · 2017-02-15T21:22:29

It's not the same number for the 2nd Select, it's the overall number of rows in spool after the 2nd Select, which is 1 billion plus 0.

And you query was running slowly because you used a UNION which defaults to DISTINCT, running this on a billion rows is really expensive.

Better switch to UNION ALL instead.

Teradata optimizer wrongly estimates row number then accessing to table through view with union

1 Answers