Advanced T-SQL: Best way to find the last 3 orders a customer made

INTRODUCTION AND PROBLEM STATEMENT

Assuming we have two tables in our database, Customers and orders, our goal is to find the last three orders the customer made.

The query should output 3 columns: custid, orderid and orderdate.

You can also find a script that creates and populates a database containing these tables (and much more) created by Itzik Ben-Gan at this link: http://tsql.solidq.com/SourceCodes/TSQLFundamentals2012.zip

SOLUTIONS

This is a somewhat tricky query to write, but there are several ways to implement it.

1. Correlated subquery

select C.custid, O.orderid, O.orderdate

from Sales.Customers as C

join Sales.Orders as O

on C.custid = O.custid

where O.orderid in

(select orderid

from Sales.Orders as O2

where O2.custid = C.custid

order by orderdate desc

offset 0 rows fetch first 3 rows only)

order by custid asc, orderdate desc

2. Using a Window function

select result.custid, result.orderid, result.orderdate from (

select C.custid, O.orderid, O.orderdate,

ROW_NUMBER() over (partition by C.custid order by O.orderdate DESC) as [rank]

from Sales.Customers as C

join Sales.Orders as O

on C.custid = O.custid

) as result

where result.rank <= 3

order by result.custid asc, result.orderdate desc

3. Using the Apply operator

select C.custid, O.orderid, O.orderdate

from Sales.Customers as C

cross apply

(select orderid, empid, orderdate

from sales.Orders as OT

where OT.custid = C.custid

order by orderdate desc, orderid desc

OFFSET 0 ROWS FETCH FIRST 3 ROWS ONLY) as O

order by C.custid asc, O.orderdate desc

 

Can you guess which of the 3 solutions is the fastest? 

 

ANALYSIS OF SOLUTIONS

 

I will not analyze the execution plans, but only the execution time.

To make an accurate benchmark, we have to do a couple of things first.

  1. Enable time statistics
  2. Clean the query cache before running each query

We can do that with the following snippet


CHECKPOINT;

GO;

DBCC DROPCLEANBUFFERS; 

GO

DBCC FREEPROCCACHE;

GO

SET STATISTICS TIME ON;

GO

Now the result we get are following

Query 1 – Correlated subquery

  1. Parse and compile elapsed time = 22ms
  2. Execution elapsed time = 27ms

Query 2 – Window function

  1. Parse and compile elapsed time = 19ms
  2. Execution elapsed time = 2ms

Query 3 – Cross apply operator

  1. Parse and compile elapsed time = 18ms
  2. Execution elapsed time = 9ms

 

Also, just a quick analysis of the Live Query Statistics shows a similar story with the correlated subquery using 89% of all resources, Window function using 1% and the apply operator using 10%.

 

Advertisements

About dorianbg

A Data Engineer based in London, United Kingdom
This entry was posted in Data Engineering, SQL Server, T-SQL. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s