SyntaxHighlighter

Friday, December 20, 2013

CONVERT_IMPLICIT in a JOIN or Subquery

Implicit conversion can be a big problem in SQL Server that can slow down your queries by orders of magnitude -- even with relatively low row counts.

First of all, what is implicit conversion? When two values are compared in an expression, the data types should match. If they don't match, SQL Server will implicitly (without being told) convert one data type into the other so that they match. After they match, then SQL Server can make the comparison.

Implicit conversion can be identified and corrected in many cases and it's most easily done in the query itself with explicit CAST and CONVERT statements. But what happens when joining two tables together with keys of different data types?

When sets are overlapped (with a JOIN or with a subquery in an IN operator), each set has a key which correlates them together. If those keys are the same data type (but not simply compatible data types), then indexes which exist on the keys can be directly used to speed up the correlation. However, if the two data types are not the same, SQL Server must implicitly convert the data types and it can no longer use those indexes. So basically, all that work you did to make those indexes finely tuned and optimized gets thrown out the window because they will never be used.

Look at this snippet of an execution plan:


Above, an implicit conversion is happening inside the COMPUTE SCALAR nodes in this execution plan. You can see that almost half a million rows are being scanned from Table3. Notice that the compile time is almost 4 seconds. Client statistics of 10 runs revealed that this query takes an average of 2892ms to run.

COMPUTE SCALAR is actually converting nvarchar(15) in Table3 to char(10), and it's doing it 500,000 times! Is all that work really necessary?! Not at all.

Look what happens when I change the data type of the base table Table3 to char(10). Note that SQL Server will complain about this conversion because it could result in a loss of data, but in my case, I verified that it would not have.


Above, first notice that the anonymizer has changed Table3 to Table4. Furthermore, there is no implicit conversion and the indexes can and in fact are being seeked. Instead of scanning 500,000 rows, you now see it's seeking 112 and 7. A tiny change has radically altered the dynamics of this query. Notice other stats, like the fact that it ran in almost 1/40th (3%!) of the time. Client statistics confirm the speedup.

Your mileage may vary, of course, but always look out for CONVERT_IMPLICIT!


No comments:

Post a Comment