How SQL Server Index Statistics works - Part 1How SQL Server Index Statistics works - Part 1

Selectivity of a Query and how it affects indexing

I discussed the selectivity of a query. Today I will further extend our discussion and will drill down how index Statistics works. I will continue it in my next article where we will see some real world examples but first we need to understand the concept.
Index statistics is one of the most fundamental part of query plan optimization in SQL Server. Better understanding of index statistics helps you optimize and fine tune your queries. So if you have 1000000 rows and you are searching for a range of values of an indexed column then how SQL Server forecasts how many number of rows will match your criteria before even searching for any row? Here is the actual process:
As we all know that when we index a column, SQL Server does two things:
  1. Sorts the values of the column in an internal data structure called “Index” This data structure contains the sorted value and a pointer to its respective row in the table.
  2. Generates a histogram of values.
The histogram shows the distribution of values for the sorted column. For example:
if we have following values in an integer column 1,2,2,3,4,1,4,5,4 then a typical histogram will be similar to this
Value
Rows matched
1
2
2
2
3
1
4
3
5
1
So suppose if you are searching for all rows having column values between 3 and 5 then by looking at the above histogram table you can estimate that you will get total 1+3+1=5 rows.  If your table contains just 9 rows in total then 5 rows means approximately 55% of total rows will match the criteria. Based on this calculation you may elect to directly scan the table instead of first reading the index, fetching the pointers of the matching rows and then read the actual rows. Although, it is a very simple example and there are other factors involve in deciding whether to use index or not but this example demonstrates the concept in very simple terms.
In my next article, I will further explain this concept with the help of real world examples.

.................................................................................................................................................................


Explain the concepts of indexing XML data in SQL Server 2005 - March 12, 2009 at 14:00 PM by Amit Satpute

Explain the concepts of indexing XML data in SQL Server 2005 

  • SQL Server 2005 supports four different types of XML indexes.
  • The primary XML index on an XML column is a clustered index on an internal table known as the node table. It cannot be used directly from the T-SQL statements. The primary XML index is a B+tree.
  • The primary XML index contains one row for each node in the XML instance.
  • SQL Server 2005 executes a SQL query that contains an XML data type method.
  • When an SQL-with-XML query is executed against a table containing an XML data type column, the query must process every XML instance in every row. At the top level, there are two ways that such a query can be executed
  • Select the rows in the base table (that is, the relational table that contains the XML data type column) that qualify first, and then process each XML instance using XQuery. This is known as top-down query processing.
  • Process all XML instances using the XQuery first, and then join the rows that qualify to the base table. This is known as bottom-up query processing. 

0 comments:

Post a Comment