SQL Simplified: How does a non clustered index looks like ?

Through this blog I am going to show you how a non clustered index looks like in a heap table as well as a cluster table.

USEMaster

GO

-- Create a database NLCStruct

CREATE DATABASE NLCStruct;

USE NLCStruct

GO

Create a heap table such a way that each page has 2 rows. The col1 is identity and will start from 1 and increment by 1

CREATE TABLE Heap(col1 int identity, col2 varchar(4000));

Now that the heap is created, let's insert few rows in this table and see the how the pages allocated looks like

INSERT INTO Heap VALUES (REPLICATE('a',4000))

GO 8

-- So the rows in the pages are distributed like this

Page 80 = 50 (hex) = 2 rows (col1 = 1 and col1 = 2)
Page 90 = 5A (hex) = 2 rows (col1 = 3 and col1 = 4)
Page 93 = 5D (hex) = 2 rows (col1 = 5 and col1 = 6)
Page 94 = 5E (hex) = 2 rows (col1 = 7 and col1 = 8)

DBCC IND(NLCStruct, Heap, 1)

We now create a non clustered index on this table and see the page allocated to that index using DBCC IND. The page is 109

CREATE NONCLUSTERED INDEX idx_Heap ON Heap(col1);

DBCC IND(NLCStruct, Heap, 2)

Now lets see what does the index page 109 holds for us. This can be explored using the DBCC PAGE command as shown below

DBCC TRACEON(3604)

DBCC PAGE(NLCStruct, 1, 109, 3)

Lets explore this output from DBCC PAGE. We will focus only on the column HEAP RID (key)

Row 1 => 0x5000000001000000 => 0x50000000 is page number in hex (80 decimal) => 0100 = 1000 (byte reversed) = 0001 (read reverse) - is file id 01 => 0000 = 0000 (byte reversed) = 0000 (read reverse) - is row number 0
Row 2 => 0x5000000001000100 => 0x50000000 is page number in hex (80 decimal) => 0100 = 1000 (byte reversed) = 0001 (read reverse) - is file id 01 => 0100 = 1000 (byte reversed) = 0001 (read reverse) - is row number 1
Row 3 => 0x5A00000001000000 => 0x5A000000 is page number in hex (90 decimal) => 0100 = 1000 (byte reversed) = 0001 (read reverse) - is file id 01 => 0000 = 0000 (byte reversed) = 0000 (read reverse) - is row number 0
Row 4 => 0x5A00000001000100 => 0x5A000000 is page number in hex (90 decimal) => 0100 = 1000 (byte reversed) = 0001 (read reverse) - is file id 01 => 0100 = 1000 (byte reversed) = 0001 (read reverse) - is row number 1

Similarly you can decode the rest of the data. This is what the BOL also says "The row locators in nonclustered index rows are either a pointer to a row or are a clustered index key for a row, as described in the following". "If the table is a heap, which means it does not have a clustered index, the row locator is a pointer to the row. The pointer is built from the file identifier (ID), page number, and number of the row on the page. The whole pointer is known as a Row ID (RID)."

Lets explain the same thing for a clustered index by creating a cluster table such a way that each page has 2 rows. The col1 is identity and will start from 1 and increment by 1. Also col1 has a primary key and since there is no other index the primary key will be the clustered index also.

CREATE TABLE Cluster(col1 int identity primary key, col2 varchar(4000));

Now that the cluster is created, let's insert few rows in this table and see the how the pages allocated look like. There will be 8 rows in the table with col1 = 1 to col1 = 8

INSERT INTO Cluster VALUES (REPLICATE('a',4000))

GO 8

-- So the rows in the data pages are distributed like this

Page 120 = 78 (hex) = 2 rows (col1 = 1 and col1 = 2)
Page 127 = 7F (hex) = 2 rows (col1 = 3 and col1 = 4)
Page 148 = 94 (hex) = 2 rows (col1 = 5 and col1 = 6)
Page 151 = 97 (hex) = 2 rows (col1 = 7 and col1 = 8)

DBCC IND(NLCStruct, Cluster, 1)

We now create a non clustered index on this table and see the page allocated to that index using DBCC IND. The page is 173

CREATE NONCLUSTERED INDEX idx_Cluster ON Cluster(col1);

DBCC IND(NLCStruct, Cluster, 2)

Now lets see what does the index page 109 holds for us. This can be explored using the DBCC PAGE command as shown below

DBCC TRACEON(3604)

DBCC PAGE(NLCStruct, 1, 173, 3)

I don't think there is much to explain here. We can clearly see that each row in the non clustered index leaf node has the corresponding cluster key against it. One thing I would like to point out is that in this case we create a unique clustered index, however if there was a non unique clustered index created then BOL says "If the clustered index is not a unique index, SQL Server makes any duplicate keys unique by adding an internally generated value called a uniqueifier. This four-byte value is not visible to users. It is only added when required to make the clustered key unique for use in nonclustered indexes. SQL Server retrieves the data row by searching the clustered index using the clustered index key stored in the leaf row of the nonclustered index."

SQL Simplified

Search This Blog

Thursday, January 12, 2012

How does a non clustered index looks like ?

Followers

Blog Archive

About Me