2
votes

The following is the set up and test scripts. The test script suppose to scan table T4 only. However, it starts to scan both tables T1 and T4 when they have more than 10000 rows.

create table T1 (A varchar(5) check  ((A='S4' or A='S3' or A='S2' or A='S1' or A='FS' or A='FM' or A='FBL' or A='ES' or A='EBL' or A='BL'))
                ,DateX date
                ,id char(6)
                ,DateY date
                ,primary key clustered (A, DateX, id))
create table T4 (A varchar(5) check ((A='S1780' OR A='C1780' OR A='B1780'))
                ,DateX date
                ,id char(6)
                ,DateY date
                ,primary key clustered (A, DateX, id));
-- Insert some values
go
create view dbo.tall
as
select * from    dbo.T1
union all
select * from    dbo.T4

Test code:

declare @A table (A varchar(5) primary key (A));
insert  @A
values  ('S1780'), ('C1780'), ('B1780');

with    a as (select    *
              from      tall
              where     A in (select    *
                              from      @A)
             ),
        sd
          as (select    A, max(DateY) DateY
              from      a
              group by  A
             ),
        filter24m 
        -- Un-comment the lines in this CTE will make the scanning T1 occur with even less row count
          as (select    id, a.A --, sd.DateY
              from      a
                        join sd on a.A = sd.A
              --where     DateX between dateadd(mm, 1, sd.DateY) and dateadd(mm, 24 + 1, sd.DateY) --
              --group by  id, a.A, sd.DateY
              --having    count(*) = 24
             )
    --
             select *
             from   filter24m

Bad execution (test when T1 has 100 rows and T4 has 10000 rows):

Table 'T4'. Scan count 2, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#1B3A42B1'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'T1'. Scan count 3, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
  |--Concatenation
       |--Nested Loops(Inner Join, OUTER REFERENCES:([workdb].[dbo].[T1].[A]))
       |    |--Nested Loops(Inner Join, OUTER REFERENCES:([A]))
       |    |    |--Clustered Index Seek(OBJECT:(@A), SEEK:([A] >= 'B1780' AND [A] <= 'S4') ORDERED FORWARD)
       |    |    |--Stream Aggregate(DEFINE:([workdb].[dbo].[T1].[A]=ANY([workdb].[dbo].[T1].[A])))
       |    |         |--Nested Loops(Inner Join, OUTER REFERENCES:([workdb].[dbo].[T1].[A]))
       |    |              |--Stream Aggregate(DEFINE:([workdb].[dbo].[T1].[A]=ANY([workdb].[dbo].[T1].[A])))
       |    |              |    |--Clustered Index Seek(OBJECT:([workdb].[dbo].[T1].[PK__T1__EE1DD21123AC6823]), SEEK:([workdb].[dbo].[T1].[A]=[A]) ORDERED FORWARD)
       |    |              |--Clustered Index Seek(OBJECT:(@A), SEEK:([A]=[workdb].[dbo].[T1].[A]),  WHERE:([A]>='B1780' AND [A]<='S4') ORDERED FORWARD)
       |    |--Clustered Index Seek(OBJECT:([workdb].[dbo].[T1].[PK__T1__EE1DD21123AC6823]), SEEK:([workdb].[dbo].[T1].[A]=[workdb].[dbo].[T1].[A]) ORDERED FORWARD)
       |--Merge Join(Inner Join, MERGE:([workdb].[dbo].[T4].[A])=([workdb].[dbo].[T4].[A]), RESIDUAL:([workdb].[dbo].[T4].[A]=[workdb].[dbo].[T4].[A]))
            |--Nested Loops(Inner Join, OUTER REFERENCES:([workdb].[dbo].[T4].[A]))
            |    |--Nested Loops(Inner Join, OUTER REFERENCES:([workdb].[dbo].[T4].[A]))
            |    |    |--Stream Aggregate(GROUP BY:([workdb].[dbo].[T4].[A]))
            |    |    |    |--Clustered Index Scan(OBJECT:([workdb].[dbo].[T4].[PK__T4__EE1DD21128711D40]), ORDERED FORWARD)
            |    |    |--Clustered Index Seek(OBJECT:(@A), SEEK:([A]=[workdb].[dbo].[T4].[A]),  WHERE:([A]>='B1780' AND [A]<='S4') ORDERED FORWARD)
            |    |--Clustered Index Seek(OBJECT:(@A), SEEK:([A]=[workdb].[dbo].[T4].[A]),  WHERE:([A]>='B1780' AND [A]<='S4') ORDERED FORWARD)
            |--Clustered Index Scan(OBJECT:([workdb].[dbo].[T4].[PK__T4__EE1DD21128711D40]), ORDERED FORWARD)

Good (test when both tables have only 100 rows):

Table '#1DE1A532'. Scan count 101, logical reads 202, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'T4'. Scan count 103, logical reads 206, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
  |--Nested Loops(Inner Join, OUTER REFERENCES:([Union1006]))
       |--Nested Loops(Inner Join, OUTER REFERENCES:([A]))
       |    |--Clustered Index Seek(OBJECT:(@A), SEEK:([A] >= 'B1780' AND [A] <= 'S4') ORDERED FORWARD)
       |    |--Concatenation
       |         |--Filter(WHERE:(STARTUP EXPR([A]='BL' OR [A]='EBL' OR [A]='ES' OR [A]='FBL' OR [A]='FM' OR [A]='FS' OR [A]='S1' OR [A]='S2' OR [A]='S3' OR [A]='S4')))
       |         |    |--Clustered Index Seek(OBJECT:([workdb].[dbo].[T1].[PK__T1__EE1DD21123AC6823]), SEEK:([workdb].[dbo].[T1].[A]=[A]) ORDERED FORWARD)
       |         |--Filter(WHERE:(STARTUP EXPR([A]='B1780' OR [A]='C1780' OR [A]='S1780')))
       |              |--Clustered Index Seek(OBJECT:([workdb].[dbo].[T4].[PK__T4__EE1DD21128711D40]), SEEK:([workdb].[dbo].[T4].[A]=[A]) ORDERED FORWARD)
       |--Top(TOP EXPRESSION:((1)))
            |--Nested Loops(Inner Join, WHERE:([Union1019]=[A]))
                 |--Concatenation
                 |    |--Filter(WHERE:(STARTUP EXPR([Union1006]='B1780' OR [Union1006]='C1780' OR [Union1006]='S1780')))
                 |    |    |--Clustered Index Seek(OBJECT:([workdb].[dbo].[T4].[PK__T4__EE1DD21128711D40]), SEEK:([workdb].[dbo].[T4].[A]=[Union1006]) ORDERED FORWARD)
                 |    |--Filter(WHERE:(STARTUP EXPR([Union1006]='BL' OR [Union1006]='EBL' OR [Union1006]='ES' OR [Union1006]='FBL' OR [Union1006]='FM' OR [Union1006]='FS' OR [Union1006]='S1' OR [Union1006]='S2' OR [Union1006]='S3' OR [Union1006]='S4')))
                 |         |--Clustered Index Seek(OBJECT:([workdb].[dbo].[T1].[PK__T1__EE1DD21123AC6823]), SEEK:([workdb].[dbo].[T1].[A]=[Union1006]) ORDERED FORWARD)
                 |--Clustered Index Seek(OBJECT:(@A), SEEK:([A] >= 'B1780' AND [A] <= 'S4') ORDERED FORWARD)

Good execution plan in xml:

https://docs.google.com/file/d/0B6OXmuJYfpRcTE9Pd0xpSEhEQy04eWZqa2lKejM5YkdPRHFr/edit?usp=docslist_api

Bad execution plan in xml: https://docs.google.com/file/d/0B6OXmuJYfpRcU2ZUVFdtLUcxQk83TVFSNUFoZEYtbVdaWU4w/edit?usp=docslist_api

1
Please post good and bad plans.usr
The plan and IO statistics are posted.ca9163d9
Could you upload the XML plans somewhere (pastebin)? I am really untrained in interpreting these text plans.usr
The xml plans are attached at the bottom of the question.ca9163d9
Looking at these execution plans, the "bad" one (with 10k rows) is only scanning T4, but you stated it is scanning both. Are you sure that you have that right or did you was your question phrased incorrectly? Scanning vs seeks will depend upon your indexes and the distribution of your data. I mocked up data which created seeks because it was not enough to tip it over to a scan. If you want to get rid of the scans altogether, think about creating an index on DateY and other columns.Mark Wojciechowicz

1 Answers

0
votes

You are expecting that T1 will not be scanned because the view is partitioned. However, the table variable is causing scanning. The statement to select from the table variable is parsed separately from the statement to create and insert values into the table variable -- so the optimizer is unaware of the values.

If you use the literal values rather than the (SELECT * FROM @a), it will not reference T1 and it will simply scan T4. Though, oddly, this is at a higher cost and worse performing.

Explore creating an index on DateY with included columns to optimize performance.