Friday, April 27, 2012

A Real-life Story about Floating Point Value Comparison

I got a problem today in our software. In general, when we have two input double values like v1= 800 and v2= 750, it's working correctly while if v1=1000 and v2 = 900, it's not working as expected.

The logic is based on the comparison result of the two values. While for both two cases above we have v1 > v2, the results are not the same, why?

After tracking down to where the comparison is made, I found out that it's essentially written like this:
if (FormatToString(v1) > FormatToString(v2)) DoSomething();

I'm pretty sure, at the first look to this line, it's not doing what whoever wrote it wanted to do. The above line is actually doing a lexical comparison of the two strings converted from two double values.

Take a look at how the operator > is overloaded for string type in the basic_string.h :

  /**
   *  @brief  Test if string follows string.
   *  @param lhs  First string.
   *  @param rhs  Second string.
   *  @return  True if @a lhs follows @a rhs.  False otherwise.
   */
  template
    inline bool
    operator>(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
          const basic_string<_CharT, _Traits, _Alloc>& __rhs)
    { return __lhs.compare(__rhs) > 0; }


Notice that in the comments, it's stated as "Test if string follows string"  and the implementation is using the basic_string::compare() method which by default will do a lexical comparison like this : it returns 0 if all the characters in the compared contents compare equal, a negative value if the first character that does not match compares to less (the character's ASCII value) in the object than in the comparing string, and a positive value in the opposite case.

So come to our example:
"800" > "750" : Ture, but
"1000" > "900" : False because '1' < '9'.

Fix it is simple, but the question is why the guy wrote this ? A simple mistake (and has been there since several years without being detected and corrected) ? I don't know exactly but tracking up in the code I found some clue that might explain.

Some lines before this buggy line, I found a similar comparison but it's only testing the equality of two double values and there, the double values are formatted into strings then compared for equality:

if (FormatToString(v1) != FormatToString(v2)) DoSomething();


This works fine and hides all the double value maths complexity behind the FormatToString method who knows exactly how many digits should be considered.

I guess later on, there comes another need to test not only for equality but also to know which value is larger then the guy took a simple copy-paste-modifiy way which results in this bug.


The the question comes back to how to compare two double values ? This is a hard question to answer, read this paper and you will know why. However, in our case, a really simple solution does exist. Guess what?

Just patch the shorter one of the two strings with leading '0's, then the lexical comparison will be an equivalent to the value comparison!

"1000" > "900"  : False, but
"1000" > "0900": True !


No comments: