What the heck is Adaptive Query Processing, you ask? Adaptive Query Processing is a set of improvements to query processing introduced in SQL Server 2017 and Azure SQL Database. Adaptive Query Processing improves upon query processing by dynamically modifying operators during run time. There are currently three flavors of Adaptive Query Processing: Batch Mode Memory Grant Feedback, Batch Mode Adaptive Joins and Interleaved Execution. This blog post will focus on Adaptive Joins.
Batch Mode Adaptive Joins
Simply put, Batch Mode Adaptive Joins will choose between a nested lookup and hash join once the first input has been scanned. During normal execution, SQL Server parses and analyzes the query and builds a best-case execution plan. Within these plans are different operations based on join and search predicates, with the best operations chosen based on the cardinality of those predicates. Often, join inputs can fluctuate between values. Sometimes a few rows are joined and sometimes many rows are joined. Table join operations can be either nested loop, merge or hash joins. Each has its specific uses and SQL Server chooses the appropriate join based on those cardinality estimates. If the plan is generated using a specific join type, and the number of join rows changes drastically, the join type selected can cause performance to suffer. This is where Batch Mode Adaptive Joins come in.
How they work
Batch Mode Adaptive Joins work by first scanning the input before deciding the join type. A row threshold is set and the join type is based on where the number of input rows falls based on that threshold. If the number of rows is below the threshold, a nested loop join is used. Conversely, a hash join is used when the number of rows is greater than the threshold. Simple enough? A quick example will help demonstrate:
This first image shows an adaptive join in use where the rows fall below the threshold. A nested loop join is used. The threshold in this case is 10 rows.
Once the number of rows exceeds the threshold, the adaptive join switches to a hash join.
How this helps us
Anytime the optimizer can create an appropriate plan, we benefit. SQL Server chooses a good enough plan. But it doesn’t always choose the best plan because often the time spent generating the best plan is longer than the time it takes to return results. The following images show some benefits. While the example results are only a small number of rows, you can understand how performance scales with larger data sets. The first image shows I/O based on forcing a loop join versus the hash join.
Letting SQL Server choose the join reduces our I/O substantially
Requirements for Batch Mode Adaptive Joins
First and foremost, the database must be either in 140 compatibility mode or be an Azure SQL Database. Secondly, the join must be eligible to use both the nested loop and hash join algorithms. Finally, the join must be executed in Batch Mode. To execute in Batch Mode, the query must access a Columnstore index or the join itself must use a Columnstore Index.
Who will benefit the most from Batch Mode Adaptive Joins?
Any query that satisfies the requirements above can benefit from Batch Mode Adaptive Joins. Queries that fluctuate between small and large join sets will benefit the most. Batch Mode Adaptive Joins are simple to understand, and the benefits are easily obtained, assuming the requirements are met.
In the next post, we’ll cover Batch Mode Memory Grant Feedback. If you’re interested in SQL Server, take a look at this collection of our blog posts, including a few of my own personal posts. If you have any other questions around SQL Server, we would love to help you out. Please don’t hesitate to contact us!
SQL Server Architect