Optimizing an SQL Server query
- Posted in
OK, so I played a little with SQL and I found an interesting flow for analysing queries. It uses the SET STATISTICS PROFILE functionality, but the results of this are usually hard to read and handle in any meaningful way. There are applications that help out with this, but this blog post is trying to show you a method that doesn't need any extra software (for when you are working for a paranoid company that doesn't allow you to install what you need to do your work, for example).
This works in the query itself, so no need of any extra tool except SQL Server Management Studio and Excel:
- Add SET STATISTICS PROFILE OFF at the start of the query (because you don’t need to profile the setup)
- Add SET STATISTICS PROFILE ON just before the SELECT that you want to optimize
- Clear cache and stats - this is optional, but good practice. There are multiple ways of doing this and it depends on your environment and preferences, so I am not covering this here.
- Execute the query -> In the query results you will get the results of the query, but also the profiling statistics of the query execution, also in table form
- Copy the entire statistics table with headers and insert it into a new Excel sheet
- Add a new column right after Parent, call it IsLeaf
- Fill the IsLeaf column with a formula to see if the value in NodeId exists in the Parent column
- Write "=COUNTIF($F$2:$F$10000,E2)=0" as the first value of the column
- Keep pressing Shift, then press End and Down arrow (and release Shift) – you should have the entire column selected
- Press Ctrl-D
- Select the header row of the table
- Click on "Sort and Filter"
- Select "Filter"
- Click on a random cell, click on "Sort and Filter" again
- Click on "Custom sort"
- Select TotalTreeSubcost and "From largest to smallest"
- Now click on the filter on the IsLeaf column and filter on value TRUE (only the leaves)
You should now have the rows of the final tree branch nodes, ordered descending by the cost to the query.
Here you can look at the IO cost, CPU cost and Rows columns to find the places you need to work on. These values need to be as small as possible.
I hope this helps.
Be the first to post a comment