SQL Server: Weird behavior with CONTAINSTABLE, ISABOUT and weighted terms

I came across some weird behaviour in an ISABOUT query in SQL Server that involves the WEIGHT keyword and the final rank of the results. I want to describe that behaviour here, just in case someone has a good explanation for it!

This post assumes some basic knowledge of querying with full-text search.

The following bullets sum up the behaviour. Notice how the order of the results is reversed as the weight value goes down!

  • weight(1): RANK of KEY 1 is 249
    (results order 1,2,3)
  • weight(0.8): RANK of KEY 1 is 321
    (weight down => rank up, results order 1,2,3)
  • weight(0.2): RANK of KEY 1 is 998
    (weight down => rank up, results order 1,2,3)
  • weight(0.17): RANK of KEY 1 is 802
    (weight down => rank down, results order 2,3,1)
  • weight(0.16): RANK of KEY 1 is 935
    (weight down => rank up, results order 2,1,3)
  • weight(0.01): RANK of KEY 1 is 50
    (weight down => rank down, results order 3,2,1)

As you can see, from 0.2 to 0.17 the rank of KEY 1 decreases and the result order gets messed up! Below that the order keeps changing, and at 0.01 it is completely inverted (the weight values that reproduce this behaviour depend on the terms, the columns searched, etc.).

Microsoft states that the actual value of RANK is meaningless, but I am sure the order of the results isn’t!

Reproducing the behaviour

These are the exact queries that I used to reproduce this behaviour:
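
As a rough illustration of the kind of query involved, the Python sketch below assembles a CONTAINSTABLE statement with an ISABOUT clause. The table `Articles`, the column `Body`, the search terms, and the helper names are all placeholders chosen for the example, not the ones behind the results that follow.

```python
def isabout_condition(weighted_terms):
    """Build an ISABOUT search condition from {term: weight} pairs."""
    parts = []
    for term, weight in weighted_terms.items():
        # WEIGHT accepts values between 0.0 and 1.0.
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for {term!r} must be between 0.0 and 1.0")
        # This sketch does not attempt to escape quotes inside a term.
        if '"' in term:
            raise ValueError(f"term {term!r} must not contain double quotes")
        parts.append(f'"{term}" WEIGHT({weight:.2f})')
    return "ISABOUT(" + ", ".join(parts) + ")"


def containstable_query(table, column, weighted_terms):
    """Return T-SQL text that lists KEY and RANK, best match first.

    `table` and `column` are inserted as-is, so they must come from
    trusted code, never from user input.
    """
    # Single quotes are doubled because the condition sits in a T-SQL string literal.
    condition = isabout_condition(weighted_terms).replace("'", "''")
    return (
        "SELECT [KEY], [RANK]\n"
        f"FROM CONTAINSTABLE({table}, {column}, '{condition}')\n"
        "ORDER BY [RANK] DESC;"
    )


if __name__ == "__main__":
    # Hypothetical sweep over the weights discussed in this post.
    for weight in (1, 0.8, 0.2, 0.17, 0.16, 0.01):
        print(containstable_query("Articles", "Body", {"first term": weight, "second": 1}))
        print()
```

Running the generated statements one by one and comparing the KEY order is enough to check whether a given set of terms shows the same effect.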

QUERY 1 (weight 1): (Initial ranking)

Results

   KEY   RANK
    1    249
    2    156
    3    114

QUERY 2 (weight 0.8): (Ranking increases, initial order is preserved)

Results

   KEY    RANK
    1     321
    2     201
    3     146

QUERY 3 (weight 0.2): (Ranking increases, initial order is preserved)

Results

   KEY   RANK
    1    998
    2    877
    3    692

QUERY 4 (weight 0.17): (Ranking decreases, best match is now last; the inverted behaviour for these terms begins at 0.17)

Results

   KEY   RANK
    2    960
    3    958
    1    802

QUERY 5 (weight 0.16): (Ranking increases, best match is now second)

Results

   KEY   RANK
    2    978
    1    935
    3    841

QUERY 6 (weight 0.01): (Ranking decreases, best match is last again)

Results

 
   KEY   RANK
    3    105
    2     77
    1     50
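
To make the order changes easier to spot, the ranks reported above can be compared against the weight 1 baseline. The following Python sketch does that with the numbers copied from the result tables; the function names are illustrative only.

```python
# RANK values copied from the result tables above: weight -> {KEY: RANK}
RANKS_BY_WEIGHT = {
    1.0: {1: 249, 2: 156, 3: 114},
    0.8: {1: 321, 2: 201, 3: 146},
    0.2: {1: 998, 2: 877, 3: 692},
    0.17: {2: 960, 3: 958, 1: 802},
    0.16: {2: 978, 1: 935, 3: 841},
    0.01: {3: 105, 2: 77, 1: 50},
}


def result_order(ranks):
    """Return the KEYs sorted by RANK, highest first."""
    return sorted(ranks, key=ranks.get, reverse=True)


def weights_with_changed_order(ranks_by_weight, baseline_weight=1.0):
    """List the weights whose result order differs from the baseline order."""
    baseline = result_order(ranks_by_weight[baseline_weight])
    return [
        weight
        for weight, ranks in ranks_by_weight.items()
        if result_order(ranks) != baseline
    ]


if __name__ == "__main__":
    for weight, ranks in RANKS_BY_WEIGHT.items():
        print(weight, result_order(ranks))
    print("order differs from baseline at:", weights_with_changed_order(RANKS_BY_WEIGHT))
```

With the numbers above, the order differs from the baseline at weights 0.17, 0.16 and 0.01.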

This of course causes major problems when you use a custom “word-breaker”, creating something like this:

For now, until a better solution is found, I have changed the custom word-breaker's algorithm to always use weights above 0.2!
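
One possible way to keep every weight at or above such a threshold is to rescale the weights linearly instead of clamping them, so that terms keep their relative importance. This is only a sketch in Python under that assumption; `rescale_weights` and its `floor` parameter are hypothetical names, and the 0.2 default is simply the value that still preserved the order in the tests above.

```python
def rescale_weights(weights, floor=0.2):
    """Map weights from [0, 1] linearly onto [floor, 1].

    A weight of 0 becomes exactly `floor`; a weight of 1 stays 1.
    Raise `floor` slightly if the weights must stay strictly above the threshold.
    """
    if not 0.0 <= floor < 1.0:
        raise ValueError("floor must be at least 0 and below 1")
    rescaled = {}
    for term, weight in weights.items():
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for {term!r} must be between 0.0 and 1.0")
        # Round to two decimals to avoid float noise such as 0.6000000000000001.
        rescaled[term] = round(floor + (1.0 - floor) * weight, 2)
    return rescaled


if __name__ == "__main__":
    # Hypothetical weights as a word-breaker might assign them.
    print(rescale_weights({"exact": 1.0, "stemmed": 0.5, "fuzzy": 0.0}))
```

Since the threshold depends on the terms and columns searched, the `floor` value would need to be checked again for each data set.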