
I ran a test on a query to compare performance, and I found that the query without a clustered index is faster. Why?

The query is below

select A.col1, B.col2, B.col3 from table1 A inner join table2 B on A.col1 = B.col1

The performance is

Hash Match (no index on either table, or an index on only one)

(913271 row(s) affected)
Table 'Table B'. Scan count 5, logical reads 18681, physical reads 193, read-ahead reads 18681, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table B'. Scan count 5, logical reads 57798, physical reads 4, read-ahead reads 57798, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times: CPU time = 3665 ms, elapsed time = 9391 ms.

Total time: 9 sec


Merge Join (both tables have a unique nonclustered index)

(913271 row(s) affected)
Table 'Table B'. Scan count 1, logical reads 18723, physical reads 6, read-ahead reads 18727, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table B'. Scan count 1, logical reads 56811, physical reads 21, read-ahead reads 56921, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times: CPU time = 1466 ms, elapsed time = 14881 ms.

Total time: 14 sec


Update statistics, rebuild indexes, and try again. Note that in both cases you're using an index scan; you're not really using the index as an index. Also, the first query is parallelized, while the second isn't. – Luaan
Please post the actual execution plan XML using Pastebin or some link. – TheGameiswar

2 Answers


The relative performance of the two index types depends heavily on the distribution of values in the tables involved. Both index types favor situations that let them avoid reading new blocks, re-use blocks that are already cached, profitably exploit "read-ahead" strategies, and so on. But their practical ability to do this depends on the data, as well as on the particular operation(s) being performed.

Part of your application design should be a pragmatic examination of whether one approach (if any) is clearly superior to the other in your situation. There is really no de facto winner; if there were, the other index type would have been abandoned long ago.

A single, isolated resource-consumption test is not sufficient: you must consider all angles, including the time required to perform various operations (not just one), the influence of data volume, and so on.

1
votes

I feel there are two questions here. To understand what a merge join versus a hash join is, simply consult the MSDN documentation. The second question, as I see it, is that you're misusing SQL Server while trying to understand the difference between query plans.

I'll answer the second one (Google "Hash Join" to answer the first). Query performance depends on the data types of the joining columns and on how many rows fit in each page.

BUT! The big issue here is that you are dumping the whole table (the parallelism icon in your plan tips me off to a bad query). SQL Server is simply trying to find the fastest way to swim through the whole thing and pump out the data. My question to you: does your app really mean to dump all the data? Or will there be more to the JOIN or WHERE clause? You are attempting to optimize for an unrealistic query.
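For example, a more realistic query would filter rather than return every row. A sketch, reusing the question's placeholder table and column names (the range predicate is purely illustrative):

```sql
-- With a selective predicate, the optimizer can consider an index seek
-- instead of scanning both tables end to end.
SELECT A.col1, B.col2, B.col3
FROM table1 AS A
INNER JOIN table2 AS B
    ON A.col1 = B.col1
WHERE A.col1 BETWEEN 1000 AND 2000;  -- example range; use your real workload's filter
```

Tuning against a query shaped like your actual workload will produce very different (and more useful) plans than tuning a full-table dump.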

What you are seeing is a different query plan being generated based on the statistics (histograms) of the data distribution. For some reason, SQL Server "thinks" that plan is best. There is apparently skew in the data, and SQL Server believes it is faster (on average) to scan the whole table to get the job done (lower I/O cost). If the data set is small, SQL Server believes it is faster to dump the data than to chase indexes. Or no available index looks helpful; in this case, since you're dumping all the data, SQL Server will most likely prefer the clustered index if present, and in some cases the narrowest index, because its I/O will be smallest.

When indexes are missing (a heap table), SQL Server has nothing to go on; it dumps the tables and does the work blind. Try creating a clustered index on your primary key. Although, in this specific case, it probably won't help, because you're dumping all the data.
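A minimal sketch of that suggestion, assuming table1's col1 is its key (the constraint name is hypothetical; the table and column names are the question's placeholders):

```sql
-- Declaring the primary key as clustered turns the heap into an ordered
-- structure the optimizer can use (CLUSTERED is also the default for a PK
-- on a table that has no clustered index yet).
ALTER TABLE table1
    ADD CONSTRAINT PK_table1 PRIMARY KEY CLUSTERED (col1);
```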

Other things to consider: are the values in col1 unique (1:1 or 1:*), or is the relationship n:n? You need to declare this when creating the table (with a unique index or a primary key). This information is a way for SQL Server to "learn" something about the future data. Everything you do is a way to communicate your intentions to SQL Server so that it can do the right thing.
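For instance, if col1 in table2 really is unique, declaring that (the index name here is hypothetical) tells the optimizer that each outer row can match at most one inner row:

```sql
-- A unique index documents the 1:* relationship for the optimizer,
-- which influences join-strategy and cardinality estimates.
CREATE UNIQUE NONCLUSTERED INDEX UX_table2_col1
    ON table2 (col1);
```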

For now, I'd keep defining your tables and indexes (only as needed; start with the primary key), write some real queries (maybe add an index), and then look at your results. Otherwise you are prematurely optimizing.

The general rule for indexes is: as few as possible, with as few columns in them as possible, thus maximizing their usage. Indexes are structures that must be updated whenever data is added to or modified in the table, so more indexes, with lots of columns in them, will ultimately slow you down. You want only as many as you need, and no more. See also: the story of the three bears.

And another rule for SQL Server: have a clustered index. Heap tables are considered "bad." Yes, there are arguments for having a heap, but IMHO that is a 400-level discussion. To get started, declare a primary key and cluster on it.

Good luck.