Thursday, October 17, 2013

Forensic Query Pathology 201: Exceptionally Bad Query Plans

This one is rather difficult to explain without a diagram, so here it is: 

Here is a SELECT query. You can see that it gets a fair number of rows from the base tables/clustered indexes -- approximately 6 million. But then something odd happens. The filter takes out approximately 5 million rows and leaves the query with only 1 row. You are probably saying to yourself, "That's perfectly normal. You've clearly written a WHERE clause that filters out all the rows."

That's reasonable.

Next, the same SELECT query wrapped by an INSERT DML statement:

Interesting. So the query optimizer believes we are inserting a single row into this table, so it chooses nested loop joins for the foreign keys -- and that makes a lot of sense because with only one row, the best join would be a nested loop.

All's well that ends well right? This INSERT statement takes only four minutes to run because the SELECT grabs 6 million rows, filters it down, and inserts only a single row.


Here's why. It turns out the original query plan was bunk. It turns out that the Filter in the original query plan completely messed up its estimate. It was supposed to estimate 6 million rows, because that's how many actually passed through the filter. And since 6 million rows in any sort of nested join usually end up multiplying with the number of rows in the join tables (at least in this case), we actually had something on the order of 3.6 trillion rows to loop through ... So in fact, the query was taking dozens of hours to insert those records.

I am really beginning to hate the SQL Server query optimizer.

So now you're asking yourself if updating the statistics helped, and the answer is no. We have so many hundreds of millions of rows in the base tables that inserting a fraction of a percent during ETL per day makes the stats engine completely underestimate them to the point where they are zero. This is a big problem for Microsoft's query planner and an even bigger problem for us. The actual solution was to dematerialize the query (in this case I used a temp table) and then the estimate for insertion was 100% accurate and the proper merge joins/hash joins were selected by the query planner.

Wednesday, October 16, 2013

Installing BIDS 2008 R2 with Visual Studio 2008 -- SP1 is definitely Required

A few days ago the hard drive in my work laptop crashed. Yay!

So I went to MicroCenter and got myself a Samsung 840EVO. Yay! (No, the company didn't pay for it because they're cheap like that -- but $100 for my sanity is an easy price to pay -- the time I saved with an SSD install and the peppiness of the system would easily repay itself manifold in terms of my pay, but try to convince a centralized IT department of that).

I downloaded Visual Studio 2008 from MSDN. Then I went to install SQL Server 2008 R2 -- at least the BIDS (Business Intelligence Development Studio, later known as SQL Server Data Tools in SQL Server 2012) portion of it (I have to work in older reports sometimes). But BIDS wouldn't install because it didn't like that Visual Studio 2008 SP1 wasn't installed. Huh? Didn't I get SP1 with my MSDN download? The answer is no!

I poked around for a while and realized after some knuckle dragging that Windows Update doesn't update VS2008 and MSDN doesn't offer a slipstreamed SP1 version. You actually have to manually download VS 2008 SP1 and manually install it. The horror!

Here's what the error messages looked like:

Previous releases of Microsoft Visual Studio 2008 Failed
A previous release of Microsoft Visual Studio 2008 is installed on this computer. Upgrade Microsoft Visual Studio 2008 to the SP1 before installing SQL Server 2008. Visual Studio SP1 can be found at
Installing Service Pack 1 for Visual Studio 2008

Now that I've installed SP1, all is well. But keep in mind it's not an automatic process so you have to do it manually. PS, uninstalling Visual Studio does not fix the problem as this MSKB article describes. The only way to fix it is to install a working copy of VS and then install SP1 on top of it and then install BIDS.

Oh -- and spend the money and get yourself an SSD. For this article I uninstalled and reinstalled VS2008 3 times (including the service pack) in less than an hour! You'll never go back to a spindle again for your primary OS drive -- I guarantee it.