0

Partitioning

Posted by Danielle Smith on 12:18 in
Good afternoon everyone! Today I will be tackling the basics of Table and Index Partitioning, something which I haven't personally used in the working environment however it will be incredibly useful in the future in order to maintain and speed up data imports from one data table to another.

What is Partitioning?

Partitioning tables and indexes means that you can split them by rows into smaller chunks. This means that fewer operations will need to be processed when performing an INSERT, UPDATE or DELETE operation while you're importing data.

If you decide not to partition your tables, you will end up placing new data into a table and letting SQL Server handle the indexes, which can cause a lot of fragmentation. You can remove fragmentation by using the REBUILD command however this can take a long time and you're better off just dropping the indexes before the import and then creating them again after the import. However this is still a waste of time as you're duplicating your work, plus there's quite a heavy query cost to make all this happen. Therefore, this is where Partitioning really can help.

Note that although Partitioning does aid query performance, it's not as beneficial as Indexes and they should really be used in the first instance.

In order to be able to partition your tables and their corresponding indexes, you will need a Partition Function and a Partition Scheme (which I will explain in more detail below).

What is a Partition Function?

A Partition Function defines the dividing point between partitions. This is capped to a maximum of 999 values and you can decide whether to define your function as LEFT or RIGHT, depending on whether you want it to be less than or equal to (<=) or simply less than (<).

See below for examples of a partition function using LEFT and RIGHT:









The LEFT partition function will bring back partition ranges of:

Partition 1: <= 500
Partition 2: > 500 AND <= 1000
Partition 3: > 1000 AND <= 1500
Partition 4: > 1500

While the RIGHT partition function will bring back partition ranges of: 

Partition 1: < 500
Partition 2: >= 500 AND < 1000
Partition 3: >= 1000 AND < 1500
Partition 4: >= 1500

Make sure you choose your partition function carefully!

What is a Partition Scheme? 

A Partition Scheme defines a map between different parts of a table. This will allow you to store different parts of a table on different storage devices (so data rows that were created before a particular time period can be stored in a separate location so you will only ever be dealing with the most up to date of data).

In the example of a Partition Scheme shown below, the scheme is named and is related to the Partition Function you just created. This partition scheme maps all partitions to the primary file group:




However you can create your own file group and use it for multiple partitions in different places like in the example below: 



Steps to Creating a Partitioned Table 

The following steps show how to create a partitioned table:

1. Create your Partition Function:





2. Create your Partition Scheme: 




3. Create your new table that you will input your data into on the Partition Scheme: 









4. Create an index on this new table: 




5. INSERT the data from your old table into the partitioned table: 




Now may been a good time for you to query against the sys.partitions catalogue (taken from MCTS 70-433 exam book, page 239.):



And the query returns the following: 











The sys.partitions catalogue contains a row for each partition of all tables and indexes in the database and their corresponding information. For the purpose of this example, I have decided to bring back index_id, partition_number and rows, plus I have set a WHERE clause to make sure it's only bringing back data from the ItemsPartitioned table: 

  • index_id shows the ID of the index which this partition belongs:
    • 0 = Heap
    • 1 = Clustered Index
    • >= 2 = Non-Clustered Index
  • partition_number shows the number of each corresponding partition in the index or heap. 
  • rows shows the (approximate) number of rows in each partition. 

6. CREATE a new temporary table: 








7. INSERT the TOP 300 items from the Item Table into the New Item Table:




8. CREATE an index (with any other constraints):



9. ADD a CHECK constraint to the table:  




10. SWITCH places between the temporary table and Partition 5 (which is empty):





11. Drop your temporary table as it's not required anymore!



Now run an identical query as previously against the sys.partitions catalogue.





And the query returns the following:












As you can see, the TOP 300 rows have now been placed in the partitioned table. And that's all there is to it! Now any queries that are run against that partitioned column will return quicker.

What Next...

Next week, I will be introducing XML, which I found rather daunting when I first looked at it but actually when you get your head around it, it's really not that bad at all! During the course of next week I will also be discussing how to use the XML data type and also what SQLCLR is. Stay tuned for that!

As A Final Word...

Thank you for reading today's blog post! If you have any questions/comments/feedback, please leave them in the comments section below and I will get back to you as soon as I can. Alternatively, please like my SQL Genius Facebook Page and leave a message on there.

Have a great weekend everyone! :)

0 Comments

Post a Comment

Please post any feedback or comments here...

Copyright © 2009 SQL Genius - Personal Development of a Junior All rights reserved. Theme by Laptop Geek. | Bloggerized by FalconHive.