Table Performance
Index Fragmentation
Just as files on a hard drive become fragmented, the data and indexes within SQL Server also become fragmented, which can degrade query performance. As such, regular index defragmentation or rebuilding should be part of any database maintenance plan.
The command ‘DBCC SHOWCONTIG (<table name>) WITH ALL_INDEXES’ shows the fragmentation level of every index in a table. For example, executing ‘DBCC SHOWCONTIG (User_Defined_Events) WITH ALL_INDEXES’ in Query Analyzer will show the following results for each index in the table User_Defined_Events.
DBCC SHOWCONTIG scanning 'User_Defined_Events' table...
Table: 'User_Defined_Events' (779149821); index ID: 1, database ID: 7
TABLE level scan performed.
- Pages Scanned................................: 66458
- Extents Scanned..............................: 8378
- Extent Switches..............................: 8400
- Avg. Pages per Extent........................: 7.9
- Scan Density [Best Count:Actual Count].......: 98.89% [8308:8401]
- Logical Scan Fragmentation ..................: 0.04%
- Extent Scan Fragmentation ...................: 96.60%
- Avg. Bytes Free per Page.....................: 2307.0
- Avg. Page Density (full).....................: 71.50%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
The primary fields to look at are Scan Density and Logical Scan Fragmentation. Scan Density should be greater than 90% and Logical Scan Fragmentation should be less than 10%. If either value falls outside those ranges, defragmenting or rebuilding the index should be considered.
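Rather than reading the text output by hand, the same data can be captured in relational form using the ‘WITH TABLERESULTS’ option of ‘DBCC SHOWCONTIG’ and then filtered against the thresholds above. The following is a sketch only; the column list matches the SQL Server 2000 TABLERESULTS output and should be verified against the server version in use:

CREATE TABLE #frag (
    ObjectName sysname, ObjectId int, IndexName sysname, IndexId int,
    [Level] int, Pages int, [Rows] int,
    MinimumRecordSize int, MaximumRecordSize int, AverageRecordSize float,
    ForwardedRecords int, Extents int, ExtentSwitches int,
    AverageFreeBytes float, AveragePageDensity float,
    ScanDensity float, BestCount int, ActualCount int,
    LogicalFragmentation float, ExtentFragmentation float
)
INSERT INTO #frag
    EXEC ('DBCC SHOWCONTIG (User_Defined_Events) WITH TABLERESULTS, ALL_INDEXES')
-- Report only the indexes outside the recommended ranges
SELECT IndexName, IndexId, ScanDensity, LogicalFragmentation
FROM #frag
WHERE ScanDensity < 90 OR LogicalFragmentation > 10
DROP TABLE #frag

This makes it easy to check every index in one pass and act only on the ones that need attention.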
Rebuilding an index using the command ‘DBCC DBREINDEX’ tends to produce better results but requires a full lock on the table, while defragmenting an index using the command ‘DBCC INDEXDEFRAG’ can run in the background without blocking ongoing activity in the table. As such, it is better to start with defragmentation and only consider rebuilding the index if defragmentation proves ineffective.
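As an illustration of the rebuild path, ‘DBCC DBREINDEX’ takes the table name, an index name, and a fill factor. Passing an empty string for the index name rebuilds every index on the table in one operation (the fill factor of 90 below is an example value, not a recommendation):

DBCC DBREINDEX ('User_Defined_Events', '', 90)

Remember that this takes a full table lock for the duration, so it should only be run during a maintenance window.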
‘DBCC INDEXDEFRAG’ accepts three arguments: database name, table name, and index ID. The index ID can be read from the ‘DBCC SHOWCONTIG’ output described above, and the command must be run individually for each index in a table. For example,
DBCC INDEXDEFRAG (GBDB, User_Defined_Events, 1)
DBCC INDEXDEFRAG (GBDB, User_Defined_Events, 2)
and so on for each remaining index.
You should always start with the clustered index (which always has an index ID of 1), as defragmenting it affects the other indexes as well.
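On a table with many indexes, the per-index calls above can be generated from the sysindexes system table instead of typed by hand. The following is a sketch only; it assumes the SQL Server 2000 sysindexes catalog and the example GBDB database name, and it prints the commands for review rather than executing them:

DECLARE @name sysname, @indid smallint
DECLARE idx CURSOR FOR
    SELECT name, indid FROM sysindexes
    WHERE id = OBJECT_ID('User_Defined_Events')
      AND indid BETWEEN 1 AND 250   -- skip the heap (0) and text/image pages (255)
      AND INDEXPROPERTY(id, name, 'IsStatistics') = 0   -- skip auto-created statistics
    ORDER BY indid                  -- clustered index (indid = 1) comes first
OPEN idx
FETCH NEXT FROM idx INTO @name, @indid
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT 'DBCC INDEXDEFRAG (GBDB, User_Defined_Events, ' + CAST(@indid AS varchar(10)) + ')'
    FETCH NEXT FROM idx INTO @name, @indid
END
CLOSE idx
DEALLOCATE idx

Reviewing the printed commands before running them keeps the maintenance window under manual control.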
Table Statistics
Whenever a query against an individual table performs significantly poorly or picks the wrong index, the first thing to try is updating the table statistics. The query optimizer in SQL Server bases its choice of indexes and execution plans on the current table statistics. If the statistics are not representative of the actual data in the table, the query optimizer may make bad decisions.
By default, SQL Server updates table statistics automatically and periodically, but the automatic update is not always accurate. Furthermore, the default statistics are calculated from a small sample of the rows in a table, which may not always give the best result. SQL Server samples a small subset for performance reasons: calculating statistics can be a time-consuming and resource-intensive task, depending on the size of the table. Manually updating the statistics from the full set of rows, or even from a larger percentage of rows (for example, 50%), may improve query performance. However, care should be taken to run the statistics update during low-utilization periods.
The ‘DBCC SHOW_STATISTICS’ command shows the date of the last update and the number of rows sampled for the calculations, as well as the current statistics themselves. For example, executing the following in Query Analyzer,

DBCC SHOW_STATISTICS (User_Defined_Events, UserDefinedEvents_IDX_EventId)

will return the following:
Statistics for INDEX 'UserDefinedEvents_IDX_EventId'.

Updated              Rows     Rows Sampled  Steps  Density        Average key length
-------------------  -------  ------------  -----  -------------  ------------------
Feb 21 2006 11:50PM  2106647  2106647       174    4.7497781E-7   16.069065
The sample percentage used in the last update can be calculated from the ‘Rows Sampled’ and ‘Rows’ fields; in the output above, 2106647 of 2106647 rows means the last update sampled 100% of the table. This percentage can be used to gauge whether the statistics should be recalculated with a larger sample size.
The ‘Density’ number can also be useful, but really only when compared to the density of another index or of the same index after updating the statistics. A lower density number is better, so a successful update of the statistics should result in a lower density number. The query optimizer has a very complex selection process but, generally, when faced with a choice of indexes, it will pick the one with the lowest density number.
The ‘UPDATE STATISTICS’ command can be used to manually update table statistics. For example,

UPDATE STATISTICS User_Defined_Events WITH FULLSCAN

Executing the above statement in Query Analyzer will recalculate the statistics based on all the rows in the table (which is what the ‘FULLSCAN’ option requests). Once complete, DBCC SHOW_STATISTICS will show the current date for ‘Updated’, the ‘Rows’ and ‘Rows Sampled’ columns should have the same value, and the ‘Density’ number may be lower. However, this is a resource-intensive operation: on a table with 2 million rows, it could take up to 5 minutes to complete, depending on table width, SQL Server configuration, and hardware. It would be wise to test the update with a lower sample percentage first to gauge the impact before committing to the ‘FULLSCAN’ option. For example,

UPDATE STATISTICS User_Defined_Events WITH SAMPLE 25 PERCENT
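When only a single index is suspect, the statistics for that index alone can be updated by naming it after the table, which is cheaper than recalculating statistics for the whole table. The index name below is the example one shown in the SHOW_STATISTICS output above:

UPDATE STATISTICS User_Defined_Events UserDefinedEvents_IDX_EventId WITH FULLSCAN

This narrows the cost of a full scan to one index while still giving the optimizer fresh numbers for the problem query.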