Saturday 5 July 2014

A simple Indexing strategy

Indexing is very important to database performance.  But not everyone knows or cares how to do it right.  There are increasing options for indexes with every version, and code developers aren't going to be familiar with all the options and will end up make sub-optimal choices.

I like to recommend a simple 3 phase indexing strategy.  This is not aimed at experts or systems with high levels of optimization necessary.  These are general guidelines.  This is for those who are building databases, need them to perform, but do not understand SQL Server indexing.  The main idea behind them is that whist it might not be the most optimal indexing, it will perform adequately and not require additional tuning.

Phase 1 - Clustering key selection
I recommend every table having a clustered index.  You don't always have to choose this, because SQL Server will make your primary key the clustering key unless you specify otherwise.
(I also recommend every table should have a primary key, but that's a topic for another post)

The selection of a clustering key should exhibit all of the following principles:
  • Narrow - The combined size of the key should be kept to an absolute minimum
  • Static - None of the key values for a row should ever change
  • Unique - This makes the clustering key highly selective.
  • Ever Increasing - This reduces fragmentation, pages splits and creates a positive hot-spot.
  • Non-Null - This removed the need for a null bitmap

My recommendation is that every table has a surrogate key of integer type (int, smallint, etc) that is an IDENTITY field and is the primary key.  This meats all of the above principles and creates a suitable table structure.

Phase 2 - Index Foreign keys
I recommend creating a non-clustered index for every foreign key.  These are often useful for joins and play a very important role with deletes.

Phase 3 - Index Search Fields
Surrogate keys and foreign keys are not very useful to applications looking to list customers with a last-name of 'smith', or get the order lines using order number 'O43923'.  So we need to create indexes that are for the purpose of our workloads.

I can't tell you want these are, but start with business keys and natural keys.  This are usually unique within a table and have meaning outside of the database.  So things like invoice number and order number.  Also customer number and product code.

Also look for searchable fields, these are things you will put in your where clauses.  This phase is often based on the statements that you write (or that your ORM produces.)

Finally
There is much more to indexing that these simple 3 phases.  But by following this simple strategy, your off to a good start.

I hope be publishing more detailed posts about indexing in future.

No comments:

Post a Comment