Chapter 4. Detecting Inefficient Queries

Inefficient queries exist in every system. They impact performance in many ways, most notably by increasing I/O load, CPU usage and blocking in the system. It is essential to detect and optimize them. We’ll start with detection in this chapter, then move on to optimization strategies in subsequent chapters.

This chapter discusses inefficient queries and their potential impact on your system and provides guidelines for detecting them, starting with an approach that uses plan cache-based execution statistics. Next, I will talk about Extended Events and SQL Traces, and then cover Query Store. I’ll wrap up the chapter by sharing a few thoughts on third-party monitoring tools.

The Impact of Inefficient Queries

During my career as a database engineer, I have yet to see a system that wouldn’t benefit from query optimization. I’m sure they exist – after all, no one calls me in to look at perfectly healthy systems. Nevertheless, those are few and far between, and there are always opportunities to improve and optimize.

Not every company prioritizes query optimization, though. It’s a time-consuming and tedious process, and in many cases it’s cheaper, given the benefits of speeding up development and time-to-market, to throw hardware at the problem than to invest hours in performance tuning.

At some point, however, that approach leads to scalability issues. Poorly optimized queries impact systems from many angles, but perhaps the most obvious is disk performance. If the I/O subsystem cannot keep up with the load of large scans, the performance of your entire system will suffer.

You can mask this problem, to a degree, by adding more memory to the server. This increases the size of the buffer pool and allows SQL Server to cache more data, reducing physical I/O. As amount of data in the system grows over time, however, this approach may become impractical or even impossible—especially in non-Enterprise editions of SQL Server that restrict the maximum buffer pool size.

Another effect to watch for is that nonoptimized queries burn CPU on the servers. The more data you process, the more CPU resources you consume. A server might spend just a few microseconds per logical read and in-memory data-page scan, but that quickly adds up as the number of reads increases.

Again, you can mask this by adding more CPUs to the server. (Note, however, that you will need to pay for additional licenses. In non-Enterprise editions, expect a cap on the number of CPUs.) Moreover, adding CPUs may not always solve the problem – nonoptimized queries will still contribute to blocking in the system. While there are ways to reduce blocking without performing query tuning, this can change system behavior and has performance implications.

The bottom line is: When you troubleshoot a system, always analyze whether queries in the system are poorly optimized. Once you’ve done that, estimate the impact of those inefficient queries.

While query optimization always benefits a system, it is not always simple, nor does it always provide the best ROI for your efforts. More often than not, you will at least need to tune some queries.

To put things in perspective: I perform query tuning when I see high disk throughput, blocking, or high CPU load in the system. However, I may initially focus my efforts elsewhere if data is cached in the buffer pool and the CPU load is acceptable. I have to be careful and think about data growth, though – it is possible that active data will one day outgrow the buffer pool, which could lead to sudden and serious performance issues.

Fortunately, query optimization does not require an all-or-nothing approach! You can achieve dramatic performance improvements by optimizing just a handful of frequently executed queries. Let’s look at a few methods of how we can detect them.

Plan-Cache-Based Execution Statistics

In most cases, SQL Server caches and reuses execution plans for queries. For each plan in the cache, it also maintains execution statistics, including the number of times the query ran, cumulative CPU time, and I/O load. You can use this information to quickly pinpoint the most resource-intensive queries for optimization. (We will discuss plan caching in more details in Chapter 6.)

Analyzing plan cache-based execution statistics is not the most comprehensive detection technique; it has quite a few limitations. Nevertheless, it is very easy to use and, in many cases, good enough. It works in all versions of SQL Server and it is always present in the system. You don’t need to set up any additional monitoring to collect the data.

Note

The code calls the sys.dm_exec_query_plan function for each cached plan in the system. This is a CPU-intensive operation, so remove it if your server is CPU-bound. You may also need to comment some of the columns in the statement, depending on the version and patching level of your SQL Server instance.

You can get execution statistics using the sys.dm_exec_query_stats view (as shown in Listing 4-1). The query there is a bit simplistic, but it demonstrates the view in action and shows you the list of metrics exposed in the view. We will use it to build a more sophisticated version of the code later in the chapter.

Example 4-1. Using the sys.dm_exec_query_stats view
SELECT TOP 50
    qs.creation_time AS [Cached Time]
    ,qs.last_execution_time AS [Last Exec Time]
    ,SUBSTRING(qt.text, (qs.statement_start_offset/2)+1,
    ((
        CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(qt.text)
            ELSE qs.statement_end_offset
        END - qs.statement_start_offset)/2)+1) AS SQL
    ,qp.query_plan AS [Query Plan]
    ,qs.execution_count AS [Exec Cnt]
    ,CONVERT(DECIMAL(10,5),
        IIF(datediff(second,qs.creation_time, qs.last_execution_time) = 0,
            NULL,
            1.0 * qs.execution_count / 
                datediff(second,qs.creation_time, qs.last_execution_time)
        )
    ) AS [Exec Per Second]
    ,(qs.total_logical_reads + qs.total_logical_writes) / 
        qs.execution_count AS [Avg IO]
    ,(qs.total_worker_time / qs.execution_count / 1000) 
        AS [Avg CPU(ms)]
    ,qs.total_logical_reads AS [Total Reads]
    ,qs.last_logical_reads AS [Last Reads]
    ,qs.total_logical_writes AS [Total Writes]
    ,qs.last_logical_writes AS [Last Writes]
    ,qs.total_worker_time / 1000 AS [Total Worker Time]
    ,qs.last_worker_time / 1000 AS [Last Worker Time]
    ,qs.total_elapsed_time / 1000 AS [Total Elapsed Time]
    ,qs.last_elapsed_time / 1000 AS [Last Elapsed Time]
    ,qs.total_rows AS [Total Rows] 
    ,qs.last_rows AS [Last Rows] 
    ,qs.total_rows / qs.execution_count AS [Avg Rows]
    ,qs.total_physical_reads AS [Total Physical Reads]
    ,qs.last_physical_reads AS [Last Physical Reads]
    ,qs.total_physical_reads / qs.execution_count 
        AS [Avg Physical Reads]
    ,qs.total_grant_kb AS [Total Grant KB]
    ,qs.last_grant_kb AS [Last Grant KB]
    ,(qs.total_grant_kb / qs.execution_count) 
        AS [Avg Grant KB] 
    ,qs.total_used_grant_kb AS [Total Used Grant KB]
    ,qs.last_used_grant_kb AS [Last Used Grant KB]
    ,(qs.total_used_grant_kb / qs.execution_count) 
        AS [Avg Used Grant KB] 
    ,qs.total_ideal_grant_kb AS [Total Ideal Grant KB]
    ,qs.last_ideal_grant_kb AS [Last Ideal Grant KB]
    ,(qs.total_ideal_grant_kb / qs.execution_count) 
        AS [Avg Ideal Grant KB] 
    ,qs.total_columnstore_segment_reads
        AS [Total CSI Segments Read]
    ,qs.last_columnstore_segment_reads 
        AS [Last CSI Segments Read]
    ,(qs.total_columnstore_segment_reads / qs.execution_count)
        AS [AVG CSI Segments Read]
    ,qs.max_dop AS [Max DOP]
    ,qs.total_spills AS [Total Spills]
    ,qs.last_spills AS [Last Spills]
    ,(qs.total_spills / qs.execution_count) AS [Avg Spills]
FROM 
    sys.dm_exec_query_stats qs WITH (NOLOCK)
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
        CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
ORDER BY
    [Avg IO] DESC
OPTION (RECOMPILE, MAXDOP 1);

You will likely sort data differently based on your tuning goals: by I/O when you need to reduce disk load; by CPU on CPU-bound systems, and so on.

Figure 4-1 shows a partial output of the query from one of the servers. As you can see, it is very easy to choose queries to optimize based on frequency of query executions and resource consumption data in the output.

Figure 4-1. Fig 4-1 Partial output from sys.dm_exec_query_stats view

The execution plans you get in the output do not include actual execution metrics. In this respect, they are similar to estimated execution plans. You’ll need to take this into consideration during optimization (we’ll talk more about that in chapter 5).

There are several other important limitations to remember.

First and foremost, you won’t see any data for the queries that do not have execution plans cached. You may miss some infrequently executed queries with plans evicted from the cache. Usually, this is not a problem – infrequently executed queries rarely need to be optimized at the beginning of tuning.

There is another possibility, however. SQL Server won’t cache execution plans if you are using statement-level recompile or executing stored procedures with a RECOMPILE clause. You need to capture those queries using Query Store or Extended Events, which we will discuss later in the chapter.

The second problem is related to how long plans stay cached. This varies by plan, which may skew the results when you sort data by total metrics. For example, a query with lower average CPU time may show a higher total number of executions and CPU time than a query with higher average CPU time, depending on the time when both plans were cached.

You can look at the creation_time and last_execution_time columns, which show the last time when plans were cached and executed, respectively. I usually look at the data sorted based on both total and average metrics, taking the frequency of executions into consideration.

The final problem is more complicated: it is possible to get multiple results for the same or similar queries. This can happen with ad-hoc workloads, with clients that have different SET settings in their sessions, when users run the same queries with slightly different formatting, or in many other cases.

Fortunately, you can address that problem by using two columns, query_hash and query_plan_hash, both exposed in the sys.dm_exec_query_stats view. The same values in those columns would indicate similar queries and execution plans. You can use those columns to aggregate data.

Warning

The DBCC FREEPROCCACHE statement clears the plan cache to reduce the size of the output in the demo. Do not run it on production servers!

Let me demonstrate with a simple example. Listing 4-2 runs three queries and then examines the content of the plan cache. The first two are the same—they just have different formatting. The third one is different.

Example 4-2. Query_hash and query_plan_hash in action
DBCC FREEPROCCACHE -- Do not run in production!
GO
SELECT /*V1*/ TOP 1 object_id FROM sys.objects WHERE object_id = 1;
GO
SELECT /*V2*/ TOP 1 object_id 
FROM sys.objects
WHERE object_id = 1;
GO
SELECT COUNT(*) FROM sys.objects  
GO
SELECT 
    qs.query_hash, qs.query_plan_hash, qs.sql_handle, qs.plan_handle, 
    SUBSTRING(qt.text, (qs.statement_start_offset/2)+1,
    ((
        CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(qt.text)
            ELSE qs.statement_end_offset
        END - qs.statement_start_offset)/2)+1
    ) as SQL
FROM 
    sys.dm_exec_query_stats qs 
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
ORDER BY query_hash
OPTION (MAXDOP 1, RECOMPILE);

You can see the results in Figure 4-2. There are three execution plans in the output. The last two rows have the same query_hash and query_plan_hash and different sql_handle and plan_handle values.

Figure 4-2. Fig, 4-2. Multiple plans with the same query_hash and query_plan_hash

Listing 4-3 provides a more sophisticated version of the script from Listing 4-1 by aggregating statistics from similar queries. The statement and execution plans are picked up randomly from the first query in each group, so factor that into your analysis.

Example 4-3. Using the sys.dm_exec_query_stats view with query_hash aggregation
;WITH Data
AS
(
    SELECT TOP 50
        qs.query_hash
        ,COUNT(*) as [Plan Count]
        ,MIN(qs.creation_time) AS [Cached Time]
        ,MAX(qs.last_execution_time) AS [Last Exec Time]
        ,SUM(qs.execution_count) AS [Exec Cnt]
        ,SUM(qs.total_logical_reads) AS [Total Reads]
        ,SUM(qs.total_logical_writes) AS [Total Writes]
        ,SUM(qs.total_worker_time / 1000) AS [Total Worker Time]
        ,SUM(qs.total_elapsed_time / 1000) AS [Total Elapsed Time]
        ,SUM(qs.total_rows) AS [Total Rows] 
        ,SUM(qs.total_physical_reads) AS [Total Physical Reads]
        ,SUM(qs.total_grant_kb) AS [Total Grant KB]
        ,SUM(qs.total_used_grant_kb) AS [Total Used Grant KB]
        ,SUM(qs.total_ideal_grant_kb) AS [Total Ideal Grant KB]
        ,SUM(qs.total_columnstore_segment_reads)
            AS [Total CSI Segments Read]
        ,MAX(qs.max_dop) AS [Max DOP]
        ,SUM(qs.total_spills) AS [Total Spills]
    FROM 
        sys.dm_exec_query_stats qs WITH (NOLOCK)
    GROUP BY
        qs.query_hash
    ORDER BY
        SUM((qs.total_logical_reads + qs.total_logical_writes) /
            qs.execution_count) DESC
)
SELECT 
    d.[Cached Time]
    ,d.[Last Exec Time]
    ,d.[Plan Count]
    ,sql_plan.SQL
    ,sql_plan.[Query Plan]
    ,d.[Exec Cnt]
    ,CONVERT(DECIMAL(10,5),
        IIF(datediff(second,d.[Cached Time], d.[Last Exec Time]) = 0,
            NULL,
            1.0 * d.[Exec Cnt] / 
                datediff(second,d.[Cached Time], d.[Last Exec Time])
        )
    ) AS [Exec Per Second]
    ,(d.[Total Reads] + d.[Total Writes]) / d.[Exec Cnt] AS [Avg IO]
    ,(d.[Total Worker Time] / d.[Exec Cnt] / 1000) AS [Avg CPU(ms)]
    ,d.[Total Reads]
    ,d.[Total Writes]
    ,d.[Total Worker Time]
    ,d.[Total Elapsed Time]
    ,d.[Total Rows] 
    ,d.[Total Rows] / d.[Exec Cnt] AS [Avg Rows]
    ,d.[Total Physical Reads]
    ,d.[Total Physical Reads] / d.[Exec Cnt] AS [Avg Physical Reads]
    ,d.[Total Grant KB]
    ,d.[Total Grant KB] / d.[Exec Cnt] AS [Avg Grant KB] 
    ,d.[Total Used Grant KB]
    ,d.[Total Used Grant KB] / d.[Exec Cnt] AS [Avg Used Grant KB] 
    ,d.[Total Ideal Grant KB]
    ,d.[Total Ideal Grant KB] / d.[Exec Cnt] AS [Avg Ideal Grant KB] 
    ,d.[Total CSI Segments Read]
    ,d.[Total CSI Segments Read] / d.[Exec Cnt] AS [AVG CSI Segments Read]
    ,d.[Max DOP]
    ,d.[Total Spills]
    ,d.[Total Spills] / d.[Exec Cnt] AS [Avg Spills]
FROM 
    Data d
        CROSS APPLY
        (
            SELECT TOP 1
                SUBSTRING(qt.text, (qs.statement_start_offset/2)+1,
                ((
                    CASE qs.statement_end_offset
                        WHEN -1 THEN DATALENGTH(qt.text)
                        ELSE qs.statement_end_offset
                    END - qs.statement_start_offset)/2)+1
                ) AS SQL
                ,qp.query_plan AS [Query Plan]
            FROM
                sys.dm_exec_query_stats qs 
                    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
                    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
            WHERE
                qs.query_hash = d.query_hash AND ISNULL(qt.text,'') <> ''
        ) sql_plan
ORDER BY
     [Avg IO] DESC
OPTION (RECOMPILE, MAXDOP 1);

Starting with SQL Server 2008, you can get execution statistics for stored procedures through the sys.dm_exec_procedure_stats view. You can use the code from Listing 4-4 to do that. As with the sys.dm_exec_query_stats view, you can sort data by various execution metrics, depending on your optimization strategy.

Example 4-4. Using the sys.dm_exec_procedure_stats view
SELECT TOP 50
    DB_NAME(ps.database_id) AS [DB]
    ,OBJECT_NAME(ps.object_id, ps.database_id) AS [Proc Name]
    ,ps.type_desc AS [Type]
    ,ps.cached_time AS [Cached Time]
    ,ps.last_execution_time AS [Last Exec Time]
    ,qp.query_plan AS [Plan]
    ,ps.execution_count AS [Exec Count]
    ,CONVERT(DECIMAL(10,5),
        IIF(datediff(second,ps.cached_time, ps.last_execution_time) = 0,
            NULL,
            1.0 * ps.execution_count / 
                datediff(second,ps.cached_time, ps.last_execution_time)
        )
    ) AS [Exec Per Second]
    ,(ps.total_logical_reads + ps.total_logical_writes) / 
        ps.execution_count AS [Avg IO]
    ,(ps.total_worker_time / ps.execution_count / 1000) 
        AS [Avg CPU(ms)]
    ,ps.total_logical_reads AS [Total Reads]
    ,ps.last_logical_reads AS [Last Reads]
    ,ps.total_logical_writes AS [Total Writes]
    ,ps.last_logical_writes AS [Last Writes]
    ,ps.total_worker_time / 1000 AS [Total Worker Time]
    ,ps.last_worker_time / 1000 AS [Last Worker Time]
    ,ps.total_elapsed_time / 1000 AS [Total Elapsed Time]
    ,ps.last_elapsed_time / 1000 AS [Last Elapsed Time]
    ,ps.total_physical_reads AS [Total Physical Reads]
    ,ps.last_physical_reads AS [Last Physical Reads]
    ,ps.total_physical_reads / ps.execution_count AS [Avg Physical Reads]
    ,ps.total_spills AS [Total Spills]
    ,ps.last_spills AS [Last Spills]
    ,(ps.total_spills / ps.execution_count) AS [Avg Spills]
FROM 
    sys.dm_exec_procedure_stats ps WITH (NOLOCK) 
        CROSS APPLY sys.dm_exec_query_plan(ps.plan_handle) qp
ORDER BY
     [Avg IO] DESC
OPTION (RECOMPILE, MAXDOP 1);

Figure 4-3 shows partial output of the code.

Figure 4-3. Fig. 4-3 Partial output of sys.dm_exec_procedure_stats view

As you can see in the output, you can get execution plans for the stored procedures. Internally, the execution plans of stored procedures and other T-SQL modules are just collections of each statement’s individual plan. In some cases—for example, when a stored procedure involves dynamic SQL—the script will not return a plan in the output.

Listing 4-5 helps to address this. You can use it to get cached execution plans and their statistics (for stored procedure statements that have plans cached).

Example 4-5. Getting execution plan and statistics for stored procedure statements
SELECT 
    qs.creation_time AS [Cached Time]
    ,qs.last_execution_time AS [Last Exec Time]
    ,SUBSTRING(qt.text, (qs.statement_start_offset/2)+1,
    ((
        CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(qt.text)
            ELSE qs.statement_end_offset
        END - qs.statement_start_offset)/2)+1) AS SQL
    ,qp.query_plan AS [Query Plan]
    ,CONVERT(DECIMAL(10,5),
        IIF(datediff(second,qs.creation_time, qs.last_execution_time) = 0,
            NULL,
            1.0 * qs.execution_count / 
                datediff(second,qs.creation_time, qs.last_execution_time)
        )
    ) AS [Exec Per Second]
    ,(qs.total_logical_reads + qs.total_logical_writes) / 
        qs.execution_count AS [Avg IO]
    ,(qs.total_worker_time / qs.execution_count / 1000) 
        AS [Avg CPU(ms)]
    ,qs.total_logical_reads AS [Total Reads]
    ,qs.last_logical_reads AS [Last Reads]
    ,qs.total_logical_writes AS [Total Writes]
    ,qs.last_logical_writes AS [Last Writes]
    ,qs.total_worker_time / 1000 AS [Total Worker Time]
    ,qs.last_worker_time / 1000 AS [Last Worker Time]
    ,qs.total_elapsed_time / 1000 AS [Total Elapsed Time]
    ,qs.last_elapsed_time / 1000 AS [Last Elapsed Time]
    ,qs.total_rows AS [Total Rows] 
    ,qs.last_rows AS [Last Rows] 
    ,qs.total_rows / qs.execution_count AS [Avg Rows]
    ,qs.total_physical_reads AS [Total Physical Reads]
    ,qs.last_physical_reads AS [Last Physical Reads]
    ,qs.total_physical_reads / qs.execution_count 
        AS [Avg Physical Reads]
    ,qs.total_grant_kb AS [Total Grant KB]
    ,qs.last_grant_kb AS [Last Grant KB]
    ,(qs.total_grant_kb / qs.execution_count) 
        AS [Avg Grant KB] 
    ,qs.total_used_grant_kb AS [Total Used Grant KB]
    ,qs.last_used_grant_kb AS [Last Used Grant KB]
    ,(qs.total_used_grant_kb / qs.execution_count) 
        AS [Avg Used Grant KB] 
    ,qs.total_ideal_grant_kb AS [Total Ideal Grant KB]
    ,qs.last_ideal_grant_kb AS [Last Ideal Grant KB]
    ,(qs.total_ideal_grant_kb / qs.execution_count) 
        AS [Avg Ideal Grant KB] 
    ,qs.total_columnstore_segment_reads
        AS [Total CSI Segments Read]
    ,qs.last_columnstore_segment_reads 
        AS [Last CSI Segments Read]
    ,(qs.total_columnstore_segment_reads / qs.execution_count)
        AS [AVG CSI Segments Read]
    ,qs.max_dop AS [Max DOP]
    ,qs.total_spills AS [Total Spills]
    ,qs.last_spills AS [Last Spills]
    ,(qs.total_spills / qs.execution_count) AS [Avg Spills]
FROM 
    sys.dm_exec_query_stats qs WITH (NOLOCK)
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
        CROSS APPLY sys.dm_exec_text_query_plan
        (qs.plan_handle,qs.statement_start_offset,qs.statement_end_offset) qp
WHERE 
    OBJECT_NAME(qt.objectid, qt.dbid) = <SP Name>       
ORDER BY 
    qs.statement_start_offset, qs.statement_end_offset
OPTION (RECOMPILE, MAXDOP 1);

Finally, starting with SQL Server 2016, you can get execution statistics for triggers and scalar user-defined functions, using sys.dm_exec_trigger_stats and sys.dm_exec_function_stats, respectively. Listing 4-6 shows the code to do that.

Example 4-6. Getting execution statistics for user-defined functions and triggers
SELECT TOP 50
    DB_NAME(fs.database_id) AS [DB]
    ,OBJECT_NAME(fs.object_id, fs.database_id) AS [Function]
    ,fs.type_desc AS [Type]
    ,fs.cached_time AS [Cached Time]
    ,fs.last_execution_time AS [Last Exec Time]
    ,qp.query_plan AS [Plan]
    ,fs.execution_count AS [Exec Count]
    ,CONVERT(DECIMAL(10,5),
        IIF(datediff(second,fs.cached_time, fs.last_execution_time) = 0,
            NULL,
            1.0 * fs.execution_count / 
                datediff(second,fs.cached_time, fs.last_execution_time)
        )
    ) AS [Exec Per Second]
    ,(fs.total_logical_reads + fs.total_logical_writes) / 
        fs.execution_count AS [Avg IO]
    ,(fs.total_worker_time / fs.execution_count / 1000) AS [Avg CPU(ms)]
    ,fs.total_logical_reads AS [Total Reads]
    ,fs.last_logical_reads AS [Last Reads]
    ,fs.total_logical_writes AS [Total Writes]
    ,fs.last_logical_writes AS [Last Writes]
    ,fs.total_worker_time / 1000 AS [Total Worker Time]
    ,fs.last_worker_time / 1000 AS [Last Worker Time]
    ,fs.total_elapsed_time / 1000 AS [Total Elapsed Time]
    ,fs.last_elapsed_time / 1000 AS [Last Elapsed Time]
    ,fs.total_physical_reads AS [Total Physical Reads]
    ,fs.last_physical_reads AS [Last Physical Reads]
    ,fs.total_physical_reads / fs.execution_count AS [Avg Physical Reads]
FROM 
    sys.dm_exec_function_stats fs WITH (NOLOCK) 
        CROSS APPLY sys.dm_exec_query_plan(fs.plan_handle) qp
ORDER BY
     [Avg IO] DESC
OPTION (RECOMPILE, MAXDOP 1);
SELECT TOP 50
    DB_NAME(ts.database_id) AS [DB]
    ,OBJECT_NAME(ts.object_id, ts.database_id) AS [Function]
    ,ts.type_desc AS [Type]
    ,ts.cached_time AS [Cached Time]
    ,ts.last_execution_time AS [Last Exec Time]
    ,qp.query_plan AS [Plan]
    ,ts.execution_count AS [Exec Count]
    ,CONVERT(DECIMAL(10,5),
        IIF(datediff(second,ts.cached_time, ts.last_execution_time) = 0,
            NULL,
            1.0 * ts.execution_count / 
                datediff(second,ts.cached_time, ts.last_execution_time)
        )
    ) AS [Exec Per Second]
    ,(ts.total_logical_reads + ts.total_logical_writes) / 
        ts.execution_count AS [Avg IO]
    ,(ts.total_worker_time / ts.execution_count / 1000) AS [Avg CPU(ms)]
    ,ts.total_logical_reads AS [Total Reads]
    ,ts.last_logical_reads AS [Last Reads]
    ,ts.total_logical_writes AS [Total Writes]
    ,ts.last_logical_writes AS [Last Writes]
    ,ts.total_worker_time / 1000 AS [Total Worker Time]
    ,ts.last_worker_time / 1000 AS [Last Worker Time]
    ,ts.total_elapsed_time / 1000 AS [Total Elapsed Time]
    ,ts.last_elapsed_time / 1000 AS [Last Elapsed Time]
    ,ts.total_physical_reads AS [Total Physical Reads]
    ,ts.last_physical_reads AS [Last Physical Reads]
    ,ts.total_physical_reads / ts.execution_count AS [Avg Physical Reads]
FROM 
    sys.dm_exec_trigger_stats ts WITH (NOLOCK) 
        CROSS APPLY sys.dm_exec_query_plan(ts.plan_handle) qp
ORDER BY
     [Avg IO] DESC
OPTION (RECOMPILE, MAXDOP 1);

Troubleshooting based on plan cache-based execution statistics has several limitations, and you may miss some queries. Nevertheless, it is a great starting point. Most importantly, the data is collected automatically and you can access it immediately, without setting up additional monitoring tools.

SQL Traces and Extended Events

I am sure that every SQL Server engineer is aware of SQL Traces and Extended Events. They allow you to capture various events in a system for analysis and troubleshooting in real time. You can also use them to capture long-running and expensive queries, including those that don’t cache execution plans and are therefore missed by the sys.dm_exec_query_stats view.

I’d like to start this section with a warning, though: Do not use SQL Traces and xEvents for this purpose unless it is absolutely necessary. Capturing executed statements is an expensive operation that may introduce significant performance overhead in busy systems. (You saw one such example in Chapter 1.)

It does not matter how much data you collect. You can exclude most statements from the output by filtering out queries with low resource consumption. But SQL Server will still have to capture all statements to evaluate, filter, and discard unnecessary events.

Don’t collect unnecessary information in events you are collecting or in xEvent actions you are capturing. Some actions—for example, callstack—are expensive and lead to a serious performance hit when enabled.

I do not want to beat a dead horse, but I have no choice: use Extended Events instead of SQL Traces. They are lighter and introduce less overhead in the system. Choose an in-memory ring_buffer target and allow event loss in configuration when possible.

Table 4-1 shows several Extended and SQL Trace Events that can be used to detect inefficient queries.

Table 4-1. Extended and SQL Trace Events to detect inefficient queries
SQL Trace Event xEvent Comments
SQL:StmtStarting sqlserver.sql_statement_starting Fired when statement starts the execution.
SQL:StmtCompleted sqlserver. sql_statement_completed Fired when statement finishes the execution.
SP:StmtStarting s qlserver.sp_statement_starting Fired when SQL statement within T-SQL module (stored procedure, user-defined function, etc.) starts the execution.
SP:StmtCompleted
sqlserver.sp_statement_completed
Fired when SQL statement within T-SQL module completes the execution.
RPC:Starting sqlserver.rpc_starting Fired when remote procedure call (RPC) is starting. Those calls are parameterized SQL requests, such as calls of stored procedures or parameterized batches, sent from applications. Many client libraries will run queries via sp_executesql calls, which can be captured by that event.
RPC:Completed sqlserver.rpc_completed Fired when RPC completes.
SP:Starting sqlserver.module_start Fired when T-SQL module starts execution.
SP:Complerted sqlserver.module_end
Fired when T-SQL module completes the execution.
Error:Attention sqlserver.attention Occurs when client terminates query execution due to timeout or connection loss.

Usually, when I need to capture inefficient queries, I set up an xEvents session capturing sqlserver.rpc_completed, sqlserver.sql_completed, and sqlserver.attention events and filtering data by execution metrics, such as cpu_time or logical_reads. As part of the event, I capture several actions, most notably sqlserver.sql_text and client information.

Listing 4-7 shows code to capture queries that consume more than 3,000ms of CPU time or produce more than 10,000 logical reads or writes. This code will work in SQL Server 2012 and above; it may require small modifications in SQL Server 2008 due to the different way it works with the file target.

Example 4-7. Capturing CPU- and I/O intensive queries
CREATE EVENT SESSION [Expensive Queries] 
ON SERVER
ADD EVENT
    sqlserver.sql_statement_completed
    (
        ACTION
        (
            sqlserver.client_app_name
            ,sqlserver.client_hostname
            ,sqlserver.database_id
            ,sqlserver.plan_handle
            ,sqlserver.sql_text
            ,sqlserver.username
        )    
        WHERE
        (
            (
                cpu_time >= 3000000 or -- Time in microseconds
                logical_reads >= 10000 or
                writes >= 10000
            ) AND
            sqlserver.is_system = 0 
        )
    ),
ADD EVENT
    sqlserver.rpc_completed
    (
        ACTION
        (
            sqlserver.client_app_name
            ,sqlserver.client_hostname
            ,sqlserver.database_id
            ,sqlserver.plan_handle
            ,sqlserver.sql_text
            ,sqlserver.username
        )
        WHERE
        (
            (
                cpu_time >= 3000000 or
                logical_reads >= 10000 or
                writes >= 10000
            ) AND
            sqlserver.is_system = 0 
        )
    )
 ADD TARGET 
    package0.event_file
    (
        SET FILENAME = 'c:\ExtEvents\Expensive Queries.xel'
    )
WITH
    (
        event_retention_mode=allow_single_event_loss
        ,max_dispatch_latency=30 seconds
    );

You can parse the captured results with the code from Listing 4-8.

Example 4-8. Parsing collected xEvent data
;WITH TargetData(Data, File_Name, File_Offset)
AS
(
  SELECT CONVERT(xml,event_data) AS Data, file_name, file_offset 
  FROM 
    sys.fn_xe_file_target_read_file
      ('c:\extevents\Expensive Queries*.xel',NULL,NULL,NULL)
)
,EventInfo([Event],[Event Time],[DB],[Statement],[SQL],[User Name]
    ,[Client],[App],[CPU Time],[Duration],[Logical Reads]
    ,[Physical Reads],[Writes],[Rows],[PlanHandle]
    ,File_Name,File_Offset)
as
(
  SELECT
    Data.value('/event[1]/@name','sysname') AS [Event]
    ,Data.value('/event[1]/@timestamp','datetime') AS [Event Time]
    ,Data.value('((/event[1]/data[@name="database_id"]/value/text())[1])','INT') 
        AS [DB]
    ,Data.value('((/event[1]/data[@name="statement"]/value/text())[1])'
        ,'nvarchar(max)') AS [Statement]
    ,Data.value('((/event[1]/data[@name="sql_text"]/value/text())[1])'
        ,'nvarchar(max)') AS [SQL]
    ,Data.value('((/event[1]/data[@name="username"]/value/text())[1])'
        ,'nvarchar(255)') AS [User Name]
    ,Data.value('((/event[1]/data[@name="client_hostname"]/value/text())[1])'
        ,'nvarchar(255)') AS [Client]
    ,Data.value('((/event[1]/data[@name="client_app_name"]/value/text())[1])'
        ,'nvarchar(255)') AS [App]
    ,Data.value('((/event[1]/data[@name="cpu_time"]/value/text())[1])'
        ,'bigint') AS [CPU Time]
    ,Data.value('((/event[1]/data[@name="duration"]/value/text())[1])'
        ,'bigint') AS [Duration]
    ,Data.value('((/event[1]/data[@name="logical_reads"]/value/text())[1])'
        ,'int') AS [Logical Reads]
    ,Data.value('((/event[1]/data[@name="physical_reads"]/value/text())[1])'
        ,'int') AS [Physical Reads]
    ,Data.value('((/event[1]/data[@name="writes"]/value/text())[1])'
        ,'int') AS [Writes]
    ,Data.value('((/event[1]/data[@name="row_count"]/value/text())[1])'
        ,'int') AS [Rows]
    ,Data.value(
        'xs:hexBinary(((/event[1]/action[@name="plan_handle"]/value/text())[1]))'
            ,'varbinary(64)') AS [PlanHandle]
    ,File_Name
    ,File_Offset
  FROM 
    TargetData 
)
SELECT 
  ei.*, qp.Query_Plan
FROM 
  EventInfo ei 
    OUTER APPLY sys.dm_exec_query_plan(ei.PlanHandle) qp
OPTION (MAXDOP 1, RECOMPILE);

When you work with SQL Traces and xEvents, you have to deal with raw data. You’ll need to aggregate it to determine which queries introduce the most cumulative impact.

Again: beware of the overhead that xEvents and SQL Traces introduce in systems. Do not create and run those sessions permanently. In many cases you can get enough troubleshooting data by enabling the session or trace for just a few minutes.

For more extensive examples on how to work with different xEvents targets, see the code repository that accompanies this book. You can also read more about Extended Events in my book Pro SQL Server Internals.

Query Store

So far in this chapter, we have discussed two approaches to detecting inefficient queries. Both have limitations. Plan-cache-based data may miss some queries; SQL Traces and xEvents require you to perform complex analysis of the output and may have significant performance overhead in busy systems.

The Query Store, introduced in SQL Server 2016, helps to address those limitations. You can think of it as something like the flight data recorders (or “black boxes”) in airplane cockpits, but for SQL Server. When the Query Store is enabled, SQL Server captures and persists runtime statistics and execution plans of the queries in the database. It shows how the execution plans perform and how they evolve over time. Finally, it allows you to force specific execution plans to queries addressing parameter-sniffing issues, which we will discuss in Chapter 6.

Note

The Query Store is disabled by default in the on-premises version of SQL Server. It is enabled by default in Azure SQL Databases and Azure SQL Managed Instances.

The Query Store is fully integrated into the query processing pipeline, as illustrated by the high-level diagram in Figure 4-4.

Figure 4-4. Fig. 4-4. Query processing pipeline

When a query needs to be executed, SQL Server looks up the execution plan from the plan cache. If it finds a plan, SQL Server checks if the query needs to be recompiled (due to statistics updates or other factors), if a new forced plan has been created, and if an old forced plan has been dropped from the Query Store.

During the compilation, SQL Server checks if the query has a forced plan available. When that happens, the query essentially gets compiled with the forced plan, much like when the USE PLAN hint is used. If the resulting plan is valid, it is stored in the plan cache for reuse.

If the forced plan is no longer valid (for example, when a user drops an index referenced in the forced plan), SQL Server does not fail the query. Instead, it compiles the query again without the forced plan and without caching it afterwards. The Query Store, on the other hand, persists both plans, marking the forced plan as invalid. All of that happens without affecting the applications.

Despite its tight integration with the query processing pipeline and various internal optimizations, Query Store still adds overhead to the system. Just how much overhead depends on two main factors: the number of compilations and the data collection settings.

The more compilations SQL Server performs, the more load the Query Store must handle. In particular, the Query Store may not work very well in systems that have a very heavy, ad-hoc, non-parameterized workload.

Query Store’s configurations allow you to specify if you want to capture all queries or just expensive ones, along with aggregation intervals and data retention settings. If you collect more data and/or use smaller aggregation intervals, you’ll have more overhead.

The overhead introduced by the Query Store is usually relatively small. However, it may be significant in some cases. For example, I’ve been using the Query Store to troubleshoot the performance of one process that consists of a very large number of small ad-hoc queries. I captured all queries in the system using QUERY_CAPTURE_MODE=ALL mode, collecting almost 10GB of data in the Query Store. The process took 8 hours to complete with the Query Store enabled, comparing to 2.5 hours without it.

Nevertheless, I suggest enabling Query Store if your system can handle the overhead. Some SQL Server features, such as Intelligent Query Processing, rely on Query Store data and will benefit from it.

Note

Monitor QDS* waits when you enable Query Store. Excessive QDS* waits may be a sign of higher Query Store overhead in the system. Ignore QDS_PERSIST_TASK_MAIN_LOOP_SLEEP and QDS_ASYNC_QUEUE waits – they are benign.

You can work with the Query Store in two ways – through the graphics UI in SSMS or by querying data management views directly. Let’s look at the UI first.

Query Store SSMS Reports

After you enable the Query Store in the database, you’ll see a Query Store folder in the Object Explorer (Figure 4-5). The number of reports in the folder will depend on the versions of SQL Server and SSMS in your system. The rest of this section will walk you through the seven reports shown in Figure 4-5.

Figure 4-5. Fig. 4-5. Query Store reports in SSMS

Regressed Queries

This report, shown in Figure 4-6, shows queries whose performance has regressed overtime. You can configure the time frame and regression criteria (such as disk operations, CPU consumption, and number of executions) for analysis.

Figure 4-6. Fig 4-6. Regressed Queries report

Choose the query in the graph on the top left. The top right portion of the report illustrates collected execution plans for the selected query. You can click on the dots, which represent different execution plans, and see the plans at the bottom. You can also compare different execution plans.

The Force Plan button allows you to force a selected plan for the query. It calls the sys.sp_query_store_force_plan stored procedure internally. Similarly, the Unforce Plan button removes a forced plan by calling the sys.sp_query_store_unforce_plan stored procedure.

The Regressed Queries report is a great tool for troubleshooting issues related to parameter sniffing, which we will discuss in Chapter 6, and fixing them quickly by forcing specific execution plans.

Top Resource Consuming Queries

This report (Figure 4-7) allows you to detect the most resource-intensive queries in the system. While it works similarly to the data provided by sys.dm_exec_query_stats view, it does not depend on the plan cache. You can customize the metrics used for data sorting and the time interval.

Figure 4-7. Fig. 4-7. Top Resource Consuming Queries report

Overall Resource Consumption

This report shows you the workload’s statistics and resource usage over the time intervals you specify. It will allow you to detect and analyze spikes in resource usage and drill down to the queries that introduce such spikes. Figure 4-8 shows the output of the report.

Figure 4-8. Fig. 4-8. Overall Resource Consumption report

Queries With High Variation

This report allows you pinpoint queries with high performance variation. You can use it to detect anomalies in the workload, along with possible performance regressions. (For the sake of space, I’ll skip the screenshots here.)

Queries With Forced Plan

This report shows you the queries that have an execution plan forced in the system.

Query Wait Statistics

This report allows you to detect queries with high waits. The data is grouped by several categories (such as CPU, disk, and blocking), depending on wait type. You can see details on wait mapping in the Microsoft Documentation.

Tracked Queries

Finally, the Tracked Queries report allows you to monitor execution plans and statistics for individual queries. It provides similar information to the Regressed Queries and Top Resource Consuming Queries reports, at the scope of individual queries.

These reports will give you a large amount of data for analysis. However, in some cases, you’ll want to use T-SQL and work with the Query Store data directly. Let’s look at how you can accomplish that.

Working with Query Store DMVs

The Query Store data management views (DMVs) are highly normalized, as shown in Figure 4-9. Execution statistics are tracked for each execution plan and grouped by collection intervals, which are defined by the INTERVAL_LENGTH_MINUTES setting.

As I’ve noted, the smaller the intervals you use, the more data will be collected and persisted in the Query Store. The same applies to the system workload: an excessive number of ad-hoc queries may balloon the Query Store’s size. Keep this in mind when you configure the Query Store in your system.

Figure 4-9. Fig. 4-9. Query Store DMVs

You can logically separate DMVs into two categories: plan store and runtime statistics. The former ones include the following views:

sys.query_store_query

The sys.query_store_query view provides information about queries and their compilation statistics, and last execution time.

sys.query_store_query_text

The sys.query_store_query_text view shows information about query text.

sys.query_context_setting

The sys.query_context_setting view contains information about context settings associated with the query. It includes SET options, default schema for the session, language, and other attributes. SQL Server may generate and cache separate execution plans for the same query when those settings are different.

sys.query_store_plan

The sys.query_store_plan view provides information about query execution plans. The is_forced_plan column indicates whether the plan is forced. The last_force_failure_reason tells you why a forced plan was not applied to the query.

As you can see, each query can have multiple entries in the sys.query_store_query and sys.query_store_plan views. This will vary based on your session context options, recompilations, and other factors.

Three other views represent runtime statistics:

sys.query_store_runtime_stats_interval

The sys.query_store_runtime_stats_interval view contains information about statistics collection intervals.

sys.query_store_runtime_stats

The sys.query_store_runtime_stats view references the sys.query_store_plan view and contains information about runtime statistics for a specific plan during a particular sys.query_store_runtime_stats_interval interval. It provides information about execution count, CPU time and call durations, logical and physical I/O statistics, transaction log usage, degree of parallelism, memory grant size, and a few other useful metrics.

sys.query_store_wait_stats

Starting with SQL Server 2017, you can get information about query waits with the sys.query_store_wait_stats view. The data is collected for each plan and time interval and grouped by several wait categories, including CPU, memory, and blocking.

Let’s look at a few scenarios for working with Query Store data.

Listing 4-9 provides code that returns information about the system’s 50 most I/O-intensive queries. Because the Query Store persists execution statistics over time intervals, you’ll need to aggregate data from multiple sys.query_store_runtime_stats rows. The output will include data for all intervals that ended within the last 24 hours, grouped by queries and their execution plans.

Example 4-9. Getting information about expensive queries from Query Store
SELECT TOP 50 
  q.query_id, qt.query_sql_text, qp.plan_id, qp.query_plan
  ,SUM(rs.count_executions) AS [Execution Cnt]
  ,CONVERT(INT,SUM(rs.count_executions * 
    (rs.avg_logical_io_reads + avg_logical_io_writes)) / 
      SUM(rs.count_executions)) AS [Avg IO]
  ,CONVERT(INT,SUM(rs.count_executions * 
    (rs.avg_logical_io_reads + avg_logical_io_writes))) AS [Total IO]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_cpu_time) /
    SUM(rs.count_executions)) AS [Avg CPU]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_cpu_time)) AS [Total CPU]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_duration) / 
    SUM(rs.count_executions)) AS [Avg Duration]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_duration)) 
    AS [Total Duration]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_physical_io_reads) / 
    SUM(rs.count_executions)) AS [Avg Physical Reads]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_physical_io_reads)) 
    AS [Total Physical Reads]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_query_max_used_memory) / 
    SUM(rs.count_executions)) AS [Avg Memory Grant Pages]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_query_max_used_memory)) 
    AS [Total Memory Grant Pages]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_rowcount) /
    SUM(rs.count_executions)) AS [Avg Rows]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_rowcount)) AS [Total Rows]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_dop) /
    SUM(rs.count_executions)) AS [Avg DOP]
  ,CONVERT(INT,SUM(rs.count_executions * rs.avg_dop)) AS [Total DOP]
FROM 
  sys.query_store_query q WITH (NOLOCK)
    JOIN sys.query_store_plan qp WITH (NOLOCK) ON
      q.query_id = qp.query_id
    JOIN sys.query_store_query_text qt WITH (NOLOCK) ON
      q.query_text_id = qt.query_text_id
    JOIN sys.query_store_runtime_stats rs WITH (NOLOCK) ON
      qp.plan_id = rs.plan_id 
    JOIN sys.query_store_runtime_stats_interval rsi WITH (NOLOCK) ON
      rs.runtime_stats_interval_id = rsi.runtime_stats_interval_id
WHERE
  rsi.end_time >= DATEADD(DAY,-1,GETDATE())
GROUP BY
  q.query_id, qt.query_sql_text, qp.plan_id, qp.query_plan
ORDER BY 
  [Avg IO] DESC
OPTION (MAXDOP 1, RECOMPILE);

Obviously, you can sort data by different criteria than average I/O. You can also add predicates to the WHERE and/or HAVING clauses of the query to narrow down the results. For example, you can filter by DOP columns if you want to detect queries that use parallelism in an OLTP environment and fine-tune the Cost Threshold for Parallelism setting.

Another example is for detecting queries that balloon the plan cache. The code in Listing 4-10 provides information about queries that generate multiple execution plans due to different context settings. The two most common reasons for this are sessions that use different SET options and queries that reference objects without schema names.

Example 4-10. Queries with different context settings
SELECT 
    q.query_id, qt.query_sql_text
    ,COUNT(DISTINCT q.context_settings_id) AS [Context Setting Cnt]
    ,COUNT(DISTINCT qp.plan_id) AS [Plan Count]
FROM 
    sys.query_store_query q WITH (NOLOCK)
        JOIN sys.query_store_query_text qt WITH (NOLOCK) ON
            q.query_text_id = qt.query_text_id
        JOIN sys.query_store_plan qp WITH (NOLOCK) ON
            q.query_id = qp.query_id
GROUP BY
    q.query_id, qt.query_sql_text
HAVING
    COUNT(DISTINCT q.context_settings_id) > 1
ORDER BY 
    COUNT(DISTINCT q.context_settings_id)
OPTION (MAXDOP 1, RECOMPILE);

Listing 4-11 shows you how to find similar queries based on query_hash value. Usually, those queries belong to a non-parameterized ad-hoc workload in the system. You can parameterize those queries in the code. If that’s not possible, consider using forced parameterization, which we will discuss in Chapter 6.

Example 4-11. Detecting queries with duplicated query_hash value
SELECT TOP 100 
    q.query_hash
    ,COUNT(*) AS [Query Count]
    ,AVG(rs.count_executions) AS [Avg Exec Count]
FROM 
    sys.query_store_query q WITH (NOLOCK)
        JOIN sys.query_store_plan qp WITH (NOLOCK) ON
            q.query_id = qp.query_id
        JOIN sys.query_store_runtime_stats rs WITH (NOLOCK) ON
            qp.plan_id = rs.plan_id 
GROUP BY 
    q.query_hash
HAVING 
    COUNT(*) > 1
ORDER BY 
     [Avg Exec Count] ASC, [Query Count] DESC
OPTION(MAXDOP 1, RECOMPILE);

You can view additional examples in the book’s code repository.

As you can see, the possibilities are endless. Use the Query Store if you can afford its overhead in your system.

Third-Party Tools

As you’ve now seen, SQL Server provides a very rich and extensive set of tools to locate inefficient queries. Nevertheless, you may also benefit from monitoring tools developed by other vendors. Most will provide you with a list of most resource-intensive queries for analysis and optimization. Many will also give you the baseline, which you can use to analyze trends and detect regressed queries.

I am not going to discuss specific tools; instead, I want to offer you a few tips for choosing and using these tools.

The key to using any tool is understanding it. Research how it works and analyze its limitations and what data it may miss. For example, if a tool gets data by polling the sys.dm_exec_requests view on schedule, it may miss a big portion of small but frequently executed queries that run in between polls. Alternatively, if a tool determines inefficient queries by session waits, the results will greatly depend on your system’s workload, the amount of data cached in the buffer pool, and many other factors.

Depending on your specific needs, these limitations might be acceptable. Remember the Pareto principle (also known as the “80/20 rule”): you don’t need to optimize all inefficient queries in the system. Nevertheless, you may benefit from a holistic view and from multiple perspectives. For example, it is very easy to cross-check a tool’s list of inefficient queries against the plan-cache-based execution statistics for a more complete list.

There is another important reason to understand your tool, though: estimating the amount of overhead it could introduce. Some DMVs are very expensive to run. For example, if a tool calls the sys.dm_exec_query_plan function during each sys.dm_exec_requests poll, it may lead to a measurable increase in overhead in busy systems. It is also not uncommon for tools to create traces and xEvent sessions without your knowledge.

In the end, choose the approach that best allows you to pinpoint inefficient queries and that works best with your system. Remember that query optimization will help in any system.

Summary

Inefficient queries impact SQL Server’s performance and can overload the disk subsystem. Even in systems that have enough memory to cache data in the buffer pool, those queries burn CPU, increase blocking, and affect the customer experience.

SQL Server keeps track of execution metrics for each cached plan and exposes them through the sys.dm_exec_query_stats view. You can also get execution statistics for stored procedures, triggers, and scalar user-defined functions with sys.dm_exec_procedure_stats, sys.dm_exec_trigger_stats, and sys.dm_exec_function_stats views, respectively.

Your plan-cache-based execution statistics will not track runtime execution metrics in execution plans, nor will it include queries that do not have plans cached. Make sure to factor this to your analysis and query-tuning process.

You can capture inefficient queries in real time with Extended Events and SQL Traces. Both approaches introduce overhead, especially in busy systems. They also provide raw data, which you’ll need to process and aggregate for further analysis.

In SQL Server 2016 and above, you can utilize the Query Store. This is a great tool that does not depend on the plan cache and allows you to quickly pinpoint plan regressions. The Query Store adds some overhead; this may be acceptable in many cases, but monitor it when you enable the feature.

Finally, I discussed how you can use third-party monitoring tools to find inefficient queries. Remember to research how a tool works and understand its limitation and overhead.

In the next chapter, we will discuss a few common techniques that you can use to optimize inefficient queries.

Troubleshooting Checklist

  • Get the list of inefficient queries from the sys.dm_exec_query_stats view. Sort the data according to your troubleshooting strategy (CPU, I/O, and so forth).

  • Detect the most expensive stored procedures with the sys.dm_exec_procedure_stats view.

  • Consider enabling the Query Store in your system and analyzing the data you collect. (This may or may not be feasible if you already use external monitoring tools.)

  • Analyze data from third-party monitoring tools and cross-check it with SQL Server data.

  • Analyze the overhead that inefficient queries introduce in the system. Correlate queries’ resource consumption with wait statistics and server load.

  • Optimize queries if you determine this is needed.

Get SQL Server Advanced Troubleshooting and Performance Tuning now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.