Problem Description / Requirement
A customer is reporting a deterioration in the performance of all queries after the software vendor updated the database objects with a T-SQL script. After the vendor completed the updates, the customer discovered that the query row estimates were no longer correct.
The customer identified the problem using a simple query and is requesting support in determining the cause of the faulty "row estimates".
SELECT c_custkey,
c_mktsegment,
c_nationkey,
c_name,
c_address,
c_phone,
c_acctbal,
c_comment
FROM dbo.customers
WHERE c_nationkey = 1
ORDER BY
c_name;
Run the above query in your challenge database and examine the actual execution plan. Compare the estimated number of rows with the actual number of rows returned.
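One way to capture the actual plan together with the results (a sketch; any client that returns showplan XML, such as SSMS, will work) is:

```sql
-- Return the actual execution plan as XML alongside the result set
SET STATISTICS XML ON;

SELECT c_custkey, c_mktsegment, c_nationkey, c_name,
       c_address, c_phone, c_acctbal, c_comment
FROM dbo.customers
WHERE c_nationkey = 1
ORDER BY c_name;

SET STATISTICS XML OFF;
```

In the returned plan, compare the EstimateRows and ActualRows attributes on the data-access and Sort operators.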
Problem
All queries executed against the database return incorrect row estimates. These faulty estimates cause the system database TEMPDB to regularly reach its capacity limits. The vendor cannot find the cause of the error and recommends allocating more memory to the TEMPDB system database.
Note
Pay particular attention to the estimated number of rows and the SORT spill resulting from the miscalculation.
Challenge
- Find and correct the cause of the massive miscalculation.
- Explain to the customer the basis for the estimate and how Microsoft SQL Server calculates the estimated value.
Condition
- None. You can do anything to improve the query (add indexes, tables, partitions, …).
Technical Information
The challenge creates one table for demo purposes:
- [dbo].[customers] with 50,000 rows
- nonclustered index: nix_customers_c_nationkey
How to start the challenge
When you are connected to the SQL Server instance, run the following commands:
| EXECUTE sqlchallenge.dbo.create_challenge 'list'; | gives you a list of open challenges |
| EXECUTE sqlchallenge.dbo.create_challenge 'SQL Challenge #9 - estimates going wrong'; | creates the challenge for you |
Customer expectations
The query should generate an optimal execution plan and consume an appropriate amount of memory for the SORT operator.
The query itself must not be changed because it is generated by the software's framework.
Connect to Microsoft SQL Server
| SQL Server: | sqlchallenges.germanywestcentral.cloudapp.azure.com |
| Port: | 1433 |
| Version: | 16.0.4165.4 |
| Edition: | Developer Edition |
| Authentication Type: | SQL Server Authentication |
| Login Name: | sqlchallenge |
| Password: | sqlchallenge#2025! |
| Encryption | Mandatory |
| Trust Server Certificate: | True |


Hi Uwe,
I only spent a few minutes finding one way to speed things up a bit. Maybe this was cheating, though…
I just enabled auto create stats and auto update stats for the DB —
one of the settings I would assume every DB has turned on unless it was turned off for specific reasons.
Running the query again now results in a table scan, but with a proper estimate of 16,870 rows (c_nationkey = 1).
So the problem was: due to the changes, the app vendor probably dropped all auto-created statistics, but maybe accidentally also the index stats.
In combination with auto create stats off and auto update stats off, the optimizer was somehow "blind" and could only use a row count (I guess).
I’d like to dig a little bit deeper but I somehow broke the demo for me because I tried to cleanup things for replay 😀
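For reference, a sketch of the settings change Dirk describes (using ALTER DATABASE CURRENT so no database name has to be assumed):

```sql
-- Re-enable automatic statistics handling on the challenge database
ALTER DATABASE CURRENT SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE CURRENT SET AUTO_UPDATE_STATISTICS ON;

-- Verify the settings took effect
SELECT name, is_auto_create_stats_on, is_auto_update_stats_on
FROM sys.databases
WHERE database_id = DB_ID();
```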
Hi Dirk,
perfectly explained. That is exactly what happens inside the installation/update script.
All indexes/statistics were cleaned up, and the vendor changed the database settings, too (maybe for maintenance), but forgot to reactivate them after the update.
Maybe you also know the formula behind the wrong estimates 🙂
Thank you very much for participating in this challenge!
Hi Uwe,
got it 🙂
but I won't share it here, so others still have some fun with it.
I'll drop it via LinkedIn.
I created a covering index and ran UPDATE STATISTICS WITH FULLSCAN on the customers table.
CREATE NONCLUSTERED INDEX [idx_challenge] ON [dbo].[customers]
(
    [c_nationkey] ASC,
    [c_name] ASC
)
INCLUDE ([c_custkey],[c_mktsegment],[c_address],[c_phone],[c_acctbal],[c_comment])…
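The UPDATE STATISTICS step mentioned above would look like this (a sketch; it rebuilds every statistics object on the table from all rows rather than a sample):

```sql
-- Refresh all statistics on dbo.customers with a full scan of the table
UPDATE STATISTICS dbo.customers WITH FULLSCAN;
```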
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(16870 rows affected)
Table 'customers'. Scan count 1, logical reads 397, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
It's not possible to upload an image here, but the index seek operator shows low costs.
But I sent a fix for one specific scenario only; it isn't valid as a solution for the whole problem.
Ignore the comment above.
Dear Joseph,
thank you very much for taking the challenge. Your approach seems OK. But did you check the estimated rows in the execution plan? Are these estimates correct after implementing the index?
Dear Uwe, yes.
Estimated Operator Cost: 0.309513 (100%)
Estimated I/O Cost: 0.290979
Estimated Subtree Cost: 0.309513
Estimated CPU Cost: 0.018714
Estimated Number of Executions: 1
Estimated Number of Rows to be Read: 16870
Estimated Number of Rows for All Executions: 16870
Estimated Number of Rows Per Execution: 16870
Estimated Row Size: 151 B
Ordered: True
Dirk's answer seems OK for the whole scenario.
Cheers.
If we don’t want to change the database configuration and only want to address the wrong estimate:
CREATE STATISTICS statx ON dbo.customers (c_nationkey);
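To check what the new statistics object contains, its header, density vector, and histogram can be inspected (statx is the name chosen above):

```sql
-- Show the histogram the optimizer will use for c_nationkey estimates
DBCC SHOW_STATISTICS ('dbo.customers', statx);
```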
Dear Peter,
thank you for accepting the challenge. You are correct! Stats must be created, or the database settings for automatic creation of stats must be enabled.