Problem Description / Requirement
The marketing department of db Berater GmbH is planning a private event for its 10 TOP customers. The marketing department provides the following guidelines for selecting customers:
- Only customers with at least 3 orders in the last quarter of 2022 may be considered
- The 10 customers with the most orders in the last quarter of 2022 should be considered
- If more customers have had the same number of orders in the last quarter of 2022, they should also be invited – regardless of sales.
- The total amount of sales must consider only the last 3 orders from 2022.
The image on the left shows the marketing department’s blueprint with all the relevant attributes the team needs.
Challenge
Create the required query to meet the needs of the Marketing department. This query will be used in the Marketing department in a dashboard that is updated every 5 minutes. It is important that the query runs as quickly as possible. The department’s requirement is a runtime of less than 5 seconds!
Category
- Writing efficient queries
- Level 200
Technical information
When creating the challenge, two tables are created as heaps.
Table Name | Number of Rows | Storage Space |
dbo.customers | 1.607.957 | 219 MB |
dbo.orders | 2.275.781 | 296 MB |
Solution
The key to the requirements was the use of “WITH TIES” and either the window function ROW_NUMBER() or CROSS APPLY as the operator.
Connect to Microsoft SQL Server
SQL Server: | sqlchallenges.germanywestcentral.cloudapp.azure.com |
Port: | 1433 |
Version: |
16.0.4165.4
|
Edition: | Developer Edition |
Authentication Type: | SQL Server Authentication |
Login Name: | sqlchallenge |
Password: | sqlchallenge#2025! |
Encryption | Mandatory |
Trust Server Certificate: | True |
Improve Your Skills as Query Tuner for SQL Server
During this workshop, you’ll gain the expertise and confidence needed to diagnose and fine-tune slow-performing queries, resulting in improved database responsiveness and end-user satisfaction. Whether you’re an experienced database administrator, developer, or data analyst, this workshop will equip you with the knowledge and practical skills to tackle the most challenging query optimization tasks.
Improve Your Skills as a Database Developer for SQL Server
In this workshop, we will cover a wide range of topics and techniques that are essential for modern database developers working with SQL Server. From fundamental concepts to advanced best practices, you’ll gain the knowledge and practical experience needed to excel in your role.
How about this result stats (while I hope I understood the criteria right):
(13 rows affected)
Total execution time: 00:00:00.893
Wow – that’s impressive. Good job 🙂
13 rows is correct and the execution time sound pretty awesome for the machine.
These are my own stats about the execution after writing and optimizing
SQL Server Execution Times:
CPU time = 500 ms, elapsed time = 269 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
Thanks Uwe, so I think I understood the challenge… 😉
Using ChatGPT 😃
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(13 rows affected)
SQL Server Execution Times:
CPU time = 1157 ms, elapsed time = 587 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
There’s a missing index hint that could improve the query cost by 47.9283%.
but I didn’t create!
Hello.
First of all, I’d like to say that this website is a fantastic idea!
How about:
select TOP 10 WITH TIES
t1.c_custkey,
count(t2.o_orderkey) AS CON,
(select sum(o_totalprice) from (
select top 3 o_totalprice from dbo.orders x
where t1.c_custkey = x.o_custkey and x.o_OrderDate BETWEEN ‘2022-01-01’ AND ‘2022-12-31’
order by x.o_orderdate desc
) x ) total_amount
from dbo.customers t1 join dbo.orders t2
on t1.c_custkey = t2.o_custkey
where t2.o_OrderDate BETWEEN ‘2022-10-01’ AND ‘2022-12-31’
GROUP BY t1.c_custkey
HAVING COUNT(t2.o_orderkey) > 3
ORDER BY count(t2.o_orderkey) DESC
SQL Server Execution Times:
CPU time = 298 ms, elapsed time = 159 ms.
Could anything be improved here?
Cool Idea. Is working with an index on dbo.orders (o_custkey, o_orderdate) INCLUDE (o_totalprice) like a charm.
Hi Seba, it looks like your query is missing the customer name. Adding it will consume more CPU but you can work around this by separating the query into 3 parts.
Part 1 is to get the top 10 plus ties from the orders table only based on number of orders for last quarter of 2022.
Part 2 is to join or cross apply to the orders table again based on the last 3 orders of 2022 to get the total_amount. This you have already done.
Part 3 is to join to the customer table to get the name. Best to join after the initial aggregation to avoid aggregating over the name and consuming CPU.
/*
CREATE NONCLUSTERED INDEX ix_orders ON orders (o_orderdate, o_custkey) INCLUDE (o_totalprice);
CREATE NONCLUSTERED INDEX ix_customers ON customers (c_custkey) INCLUDE (c_name)
*/
SQL Server parse and compile time:
CPU time = 23 ms, elapsed time = 23 ms.
SQL Server Execution Times:
CPU time = 173 ms, elapsed time = 96 ms.
Total reads: 4,975
13 rows , starting with customer Customer#000131146 and ending with customer Customer#000817556.
But I do not know if creating indexes is part of the solution?