Floats : equality compare

hutou · May 3, 2021, 8:25pm

The comparison of 2 floating-point numbers with == is not reliable, as everyone knows.
Nevertheless, it is sometimes complicated to do without it.

Wouldn’t it be a good idea to integrate in the stdlib an “approximate” comparison function, such as the one defined here, and even to overload the =~ and !~ operators for this purpose?

Thoughts?

asterite · May 3, 2021, 8:39pm

I think this is mostly needed in specs, and there’s a matcher for that already:

https://crystal-lang.org/api/1.0.0/Spec/Expectations.html#be_close(expected,delta)-instance-method

Or were you talking about using it in regular code? What’s the use case?

hutou · May 3, 2021, 9:25pm

Yes, I mean for regular code, not for tests: something that does what the == operator exists for, but more reliably, without having to look for workarounds (with < or >) or change the code logic. In my current development project, the use of floats cannot be avoided.

According to what I read, be_close just compares the difference of the 2 values with a delta, which is quite far from more elaborate functions like the one referenced in the link of my post or here.

For the time being, I will port the Java almost_equal function to Crystal for my own use.
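The gap between a fixed-delta check and a more elaborate one can be sketched as follows. This is only an illustration: `close_abs?` and `close_rel?` are hypothetical helpers, not stdlib methods, and `close_abs?` assumes that be_close reduces to an absolute-difference test as described above.

```crystal
# Hypothetical helper: absolute check, the difference must not exceed a fixed delta.
def close_abs?(a : Float64, b : Float64, delta : Float64) : Bool
  (a - b).abs <= delta
end

# Hypothetical helper: relative check, the tolerance scales with the operands.
def close_rel?(a : Float64, b : Float64, rtol : Float64) : Bool
  (a - b).abs <= rtol * Math.max(a.abs, b.abs)
end

big_a = 1e16
big_b = 1e16 + 2.0 # the next representable Float64 after 1e16

puts close_abs?(big_a, big_b, 1e-9) # false: the gap of 2.0 exceeds the delta
puts close_rel?(big_a, big_b, 1e-9) # true: the gap is tiny relative to 1e16

small_a = 1e-12
small_b = 2e-12

puts close_abs?(small_a, small_b, 1e-9) # true, although one value is twice the other
puts close_rel?(small_a, small_b, 1e-9) # false
```

Neither check is right everywhere, which is why the more elaborate functions combine an absolute and a relative tolerance.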

straight-shoota · May 3, 2021, 10:15pm

Comparison between two floating-point values is actually exact and reliable. What is imprecise is the conversion between the floating-point format and its decimal representation.

Outside of specs which test algorithms to return specific values expressed as literals, this should rarely be an issue.
Can you tell us about your use case?
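A small sketch of that point, using plain Float64 literals: == compares the stored binary values exactly, so any surprise comes from the values that the decimal literals and the arithmetic actually produce.

```crystal
a = 0.1 + 0.2
b = 0.3

puts a                        # 0.30000000000000004
puts a == b                   # false: the two stored values differ
puts a == 0.30000000000000004 # true: same stored value, so == matches exactly
puts 0.5 + 0.25 == 0.75       # true: all three values are exactly representable
```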

hutou · May 4, 2021, 8:29am

Reading the mentioned posts and also this blog, I was wondering about the reliability of the test x == 0.0 when x is the result of a series of calculations on a data history (which is my use case).
If I understand correctly, this test returns false whenever x is not exactly equal to 0.0, hence my interest in a test of approximate equality for the case where, for example, x has a value very close to 0.0.
But thinking about it, a test like if -epsilon < x < epsilon, with epsilon = 0.0000000001 for example, will do just as well!
Thanks for your comments
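The epsilon test described above could look like this minimal sketch. The helper name `near_zero?` and the constant `ZERO_TOLERANCE` are made up for the example, and the tolerance value is the one suggested in the post, not a recommendation.

```crystal
# Assumed tolerance, taken from the post above; choose one that suits your data.
ZERO_TOLERANCE = 1e-10

# Hypothetical helper: true when x lies strictly between -epsilon and epsilon.
def near_zero?(x : Float64, epsilon : Float64 = ZERO_TOLERANCE) : Bool
  x.abs < epsilon
end

x = 0.1 + 0.2 - 0.3 # mathematically zero, but rounding leaves a tiny residue

puts x == 0.0      # false
puts near_zero?(x) # true
```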

RespiteSage · May 4, 2021, 2:29pm

It looks like you’ve already figured out a good solution, but I wanted to weigh in with something maybe obvious just because it wasn’t explicitly stated: the problem with using =~ and !~ is that they use two values (the receiver on the left side and the argument on the right), but what you want is something that uses three values (the receiver, the argument, and an epsilon). Your two links and the solution you came to (-epsilon < x < epsilon) all use an epsilon, and that value should be implementation-dependent.

For example, if I’m working on values for some GIS system, I might use an epsilon of 1e-6 for latitude and longitude, but if I want a consistent epsilon when I’m considering values in kilometers I’d want to use 1e-4, since both values come out to around 10 cm. Any standard library implementation of a closeness method would need to take an epsilon, which unfortunately rules out graceful use of operators like =~.

I suspect you already know this, but I wanted to make sure that future readers have an explicit explanation.
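To make the three-value shape concrete, here is a minimal sketch. `close?` and both constants are hypothetical names, and the tolerances are the roughly-10-cm figures from the GIS example, assumed here for illustration only.

```crystal
# Hypothetical helper: the tolerance is an explicit third value.
def close?(a : Float64, b : Float64, epsilon : Float64) : Bool
  (a - b).abs < epsilon
end

# Assumed tolerances, both meant to correspond to roughly 10 cm.
DEGREES_EPSILON    = 1e-6
KILOMETERS_EPSILON = 1e-4

lat_a = 48.858370
lat_b = 48.8583705
puts close?(lat_a, lat_b, DEGREES_EPSILON) # true: about 5e-7 degrees apart

km_a = 12.34567
km_b = 12.34572
puts close?(km_a, km_b, KILOMETERS_EPSILON) # true: about 5 cm apart
puts close?(km_a, km_b, DEGREES_EPSILON)    # false: wrong epsilon for this unit
```

A binary operator has no slot for that third value, so it would have to fall back on a default that cannot suit both units.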

rogerdpack · May 13, 2021, 7:16pm

Seems reasonable, to me, to have a compare with a delta in the stdlib. Or maybe it’s not useful?

rogerdpack · June 7, 2021, 9:49pm

I’d almost be in favor of removing equality comparison between float and anything else. It’s dangerous! :)

Didactic.Drunk · June 8, 2021, 2:17am

Why not both? Remove equality comparisons and add

struct Float64
  # The default epsilon is only a placeholder; each float type would pick its own.
  def approximately(b, epsilon = Float64::EPSILON)
    (self - b).abs < epsilon
  end
end

Unity via Mathf.Approximately and Julia via the ≈ operator have built-in approximate comparison functions.

elder-n00b · June 10, 2021, 8:57am

I think it could be useful but, as pointed out, no one size will fit all use cases.

Even within one project, I expect that some comparisons will need different tolerances than others, perhaps even calculated at run-time. Plus there’s the issue of different float sizes and other numeric types.

That said, some other languages do have such a function and operator (Julia, for example, has isapprox and ≈).

Of the ones that do, not all agree on the formula, arguments and default values used.

A quick translation of the SO c++ code (with different argument names and defaults):

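@[AlwaysInline]
def nearly_equal(a : F, b : F, rtol : F = F::EPSILON * 16, atol : F = F::MIN_POSITIVE) forall F
  # defaults are arbitrary, `rtol` in particular
  return true if a == b # also covers equal infinities
  diff = (a - b).abs
  # clamp the norm so that a sum overflowing to Infinity cannot inflate the tolerance
  norm = Math.min((a + b).abs, F::MAX)
  # close if within the absolute or the relative tolerance, whichever is larger
  diff < Math.max(atol, rtol * norm)
end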

the =~ operator (!~ is already in Object):

struct Float
  def =~(other : self)
    nearly_equal(self, other)
  end
end

(and why not)

struct Float
  def =~(range : Range)
    self.in? range
  end
end

Hint: to use the same tolerance for a series of comparisons, set it in a Tuple and splat it in place:

tol = {..., ...}; nearly_equal(a,b,*tol)
tol = {rtol: ..., atol: ...}; nearly_equal(a,b,**tol)
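A concrete version of that hint, as a sketch: `close?` below is a hypothetical stand-in with explicit `rtol` and `atol` parameters, and the tolerance values are arbitrary.

```crystal
# Hypothetical stand-in for a comparison that takes both tolerances explicitly.
def close?(a : Float64, b : Float64, rtol : Float64, atol : Float64) : Bool
  (a - b).abs <= Math.max(atol, rtol * Math.max(a.abs, b.abs))
end

positional = {1e-9, 1e-12}             # Tuple, expanded with a single splat
named      = {rtol: 1e-9, atol: 1e-12} # NamedTuple, expanded with a double splat

puts close?(1.0, 1.0 + 1e-12, *positional) # true
puts close?(1.0, 1.0 + 1e-12, **named)     # true
puts close?(1.0, 1.001, **named)           # false
```

The NamedTuple form makes it harder to swap the two tolerances by accident.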

Or maybe:

def nearly_equal_fn(*tol : F) forall F
  ->(a : F, b : F){::nearly_equal(a,b,*tol)}
end
# neq = nearly_equal_fn(rtol,atol)
# neq.call(a,b) # => true|false -- ugly syntax though
# neq[a,b] # somewhat better, hijacking `Proc#[]`

Or something like this one, which “looks” nicer but has problems:

struct NEqTolerance(F)
  property rtol, atol # optional
  def initialize(@rtol : F , @atol : F)
  end
  def nearly_equal(a, b)
    ::nearly_equal(a, b, @rtol, @atol)
  end
end
# neq = NEqTolerance.new(rtol,atol)
# neq.nearly_equal(a,b) # => true|false

def with_tolerance(rtol : F, atol : F) forall F
  with NEqTolerance(F).new(rtol, atol) yield
end
# with_tolerance(rtol,atol) do
#   nearly_equal(a,b) # => true|false
# end

NOTES:

Assumes all values are of the same type.

Tested with Float32 and Float64 but in theory works with any numeric type defining a MAX constant, plus EPSILON and MIN_POSITIVE for the defaults (this requirement can be easily satisfied or removed). For instance, BigFloat can be coerced by defining a few arbitrary constants. But it would be more sensible to have an overload for those cases.

Sometimes it will complain if the type cannot be inferred.

The =~ operator is unaffected by with_tolerance.

Beware of with_tolerance, by the way: it behaves oddly. Say you call with_tolerance with Float32 values; which nearly_equal is going to be called when comparing Float64 values inside the block?

Not really well tested but passes basic sanity checks.

Not really well profiled but seems comparable with a regular float == in all its shapes.

This is quite “quick and dirty”, and I’m new to Crystal. I bet there are better ways to do it (probably with macros), but I wanted to keep it short. It may be a starting point, but if it is to become a library or end up in the standard library, it should be more solid and handle, for instance, approximately comparing a float to an exact value (a BigRational, for instance).

If it is interesting, I can upload the whole repo with some tests, but it needs review before being used.

elder-n00b · June 13, 2021, 11:24pm

A little better:

  struct ::Float

    # what are sensible defaults for `Float32` and `Float64`?
    # what are sensible defaults for `BigFloat` and `BigRational`?
    macro default_rtol
      {% if @type.has_constant?(:EPSILON) %}
      {{ @type.id }}::EPSILON * 16 #? pulled from thin air
      {% else %}
      {{ @type.id }}.new(0) 
      {% end %}
    end
    macro default_atol
      {% if @type.has_constant?(:MIN_POSITIVE) %}
      {{ @type.id }}::MIN_POSITIVE #? this makes all denormals zeroish, but is it wise?
      {% else %}
      {{ @type.id }}.new(0)
      {% end %}
    end

    @[AlwaysInline] #?
    def nearly_equals?(other, rtol = default_rtol, atol = default_atol)
      return true if self == other
      diff = (self-other).abs
      norm = (self+other).abs
      {% if @type.has_constant?(:MAX) %}
      norm = Math.min(norm, {{ @type.constant(:MAX) }})
      {% end %}
      diff < Math.max(atol, rtol * norm)
    end

    @[AlwaysInline] #?
    def =~ (other)
      self.nearly_equals?(other)
    end

    @[AlwaysInline] #?
    def =~ (range : Range)
      self.in? range
    end

  end

And forget with_tolerance; it is more trouble than it is worth.
The top-level function form can of course be defined in terms of the method if desired.

This works with all floats regardless of what constants they define; just adapt the two macros if necessary.

By the way, the comparison is symmetric: a =~ b gives the same result as b =~ a (not as obvious as it would seem).
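Symmetry is not automatic. As a sketch with two hypothetical helpers and a deliberately exaggerated tolerance: a check that scales the tolerance by only one operand can change its answer when the arguments are swapped, while scaling by (a + b).abs, as the code above does, cannot.

```crystal
# Hypothetical asymmetric check: the tolerance scales with the second operand only.
def asymmetric_close?(a : Float64, b : Float64, rtol : Float64) : Bool
  (a - b).abs < rtol * b.abs
end

# Hypothetical symmetric check: the tolerance scales with (a + b).abs.
def symmetric_close?(a : Float64, b : Float64, rtol : Float64) : Bool
  (a - b).abs < rtol * (a + b).abs
end

puts asymmetric_close?(1.0, 2.0, 0.75) # true:  1.0 < 1.5
puts asymmetric_close?(2.0, 1.0, 0.75) # false: 1.0 < 0.75 fails
puts symmetric_close?(1.0, 2.0, 0.75)  # true:  1.0 < 2.25
puts symmetric_close?(2.0, 1.0, 0.75)  # true:  same values either way
```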

Take a look at this Julia forum discussion for some math, interesting considerations, and arguments in favor and against having such a function (keeping in mind it’s quite a different language with a different target, and their isapprox formula is a bit different too).
Especially relevant, the problem of normalization and the case of comparing to zero.
It seems to me that the best reason for having the function (not necessarily in standard library) is that most people would do it wrong and/or… pick default tolerances out of thin air like I just did.
I hereby admit my understanding of the problem is not up to the task.
