Indexing



  • To check performance of a well written query on a table like this you need to consider the following things:

    - What is my clustered index (primary key is irrelevant.)
    - How fragmented are the values of my clustered index.
    - What are my non-clustered indexes.
    - Do I have statistics on this table? (auto or manually created)
    - Are my statistics up to date?
    - How full are my index pages?

    Clustered index
    In my experience, a compound clustered index (index with more then 1 field) is only valid when 99% of the queries use all the fields in your compound index as parameters.

    Index value fragmentation.
    The more fragmented the values are of the first field in your clustered index the better (It's counter intuitive for most people.). It will make the index tree very flat and reaching the data pages will require minimal disk reads. (you can see disk reads by using 'set statistics io on'. Your aim is to get physical reads as low as possible. But note that after running the query the data will be cashed and physical io will be zero in case of a rerun.)

    Non clustered indexes.
    An important thing to know about non clustered indexes is that the leaf pages of the nonclusterd index are the clustered index pages. So the field(s) from your clustered index are implicitly included in the non clustered index.

    Statistics
    Statistics drive what query plan that the query governor will produce. Force update statistics and procedure recompilation daily to optimise queries. When you manually create statistics, always create separate statistics per column. Compound statistics are 99% of the time worthless.

    Fill Factor
    The fuller the index pages are, the faster a select query will run, and the bigger risk there is that an insert will be slower. You have to find the right balance.

     

    To fix your problem I would look if the field your are filtering on is included in an index. If it is in a compound index, look if it is the first field in the compound index. If both questions are answered no, create an index on this field. if it is the only field you ever query on, make it a clustered index.

    44 million records is a big table, but not huge. SQL server can handle a lot more. But you have to know what you are doing. With small tables you can get away with poor indexes and badly written queries, with big tables like this you can't hide the truth.

    It's one of the reasons why I plead for developing on huge tables. (10 times as big as the max data you expect). If you do a bad job, it will show immediately and not 2 years down the road when nobody can remember how something was written.

     



  • intended as reply for the sql server post, but I kinda messed it up :/


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.