Integer sqrt in stdlib

My proposed isqrt produces the integer squareroot for all integer types, not floating point.

2 Likes

Hey @asterite, I think the fundamental problem here is Crystal’s implementation of the GMP library.

Check out the following.

require "big"
require "benchmark"

module BitLength
  def bit_length()
    x = self < 0 ? ~self : self

    if x.is_a?(Int::Primitive)
      self.class.new(sizeof(self) * 8 - x.leading_zeros_count)
    else
      Math.log2(self).ceil.to_i
    end
  end
end

struct Int; include BitLength end

n = 10 ** 9
puts; puts "n = (10 ** 9) # => 30"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10_u64 ** 19
puts; puts "n = (10 ** 19) # => 64"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 21
puts; puts "n = (10 ** 21) # => 70"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 30
puts; puts "n = (10 ** 38) # => 100"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 50
puts; puts "n = (10 ** 50) # => 167"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 200
puts; puts "n = (10 ** 200) # => 667"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 308
puts; puts "n = (10 ** 308) # => 1024"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

It produces this output, with all the input values creating valid results.

n = (10 ** 9) # => 30
  bit_length 436.29M (  2.29ns) (± 3.52%)   0.0B/op        fastest
to_s(2).size  11.09M ( 90.19ns) (± 6.20%)  48.0B/op  39.35× slower

n = (10 ** 19) # => 64
  bit_length 435.76M (  2.29ns) (± 3.71%)   0.0B/op        fastest
to_s(2).size   9.61M (104.02ns) (± 5.63%)  80.0B/op  45.33× slower

n = (10 ** 21) # => 70
  bit_length 108.68M (  9.20ns) (± 3.70%)  0.0B/op        fastest
to_s(2).size   3.00M (333.32ns) (± 4.80%)  176B/op  36.22× slower

n = (10 ** 38) # => 100
  bit_length 106.59M (  9.38ns) (± 7.24%)  0.0B/op        fastest
to_s(2).size   2.33M (429.16ns) (± 4.79%)  240B/op  45.75× slower

n = (10 ** 50) # => 167
  bit_length 108.22M (  9.24ns) (± 5.51%)  0.0B/op        fastest
to_s(2).size   1.50M (665.18ns) (± 4.75%)  368B/op  71.99× slower

n = (10 ** 200) # => 667
  bit_length 107.04M (  9.34ns) (± 6.26%)    0.0B/op         fastest
to_s(2).size 428.15k (  2.34”s) (± 5.89%)  1.44kB/op  249.99× slower

n = (10 ** 308) # => 1024
  bit_length 114.35M (  8.75ns) (± 6.29%)    0.0B/op         fastest
to_s(2).size 278.45k (  3.59”s) (± 5.36%)  2.63kB/op  410.65× slower

The Crystal version of Math.log2(n) bombs starting with (10 ** 309), and larger.

However, the Ruby implementation of Math.log2(n) works for large n.

2.7.0 :172 > Math.log2(10**100000).ceil
 => 332193 
2.7.0 :173 > (10**100000).bit_length 
 => 332193 
2.7.0 :174 > (10**100000).to_s(2).size
 => 332193 
2.7.0 :175 

This says to me Crystal’s implementation of GMP is severely deficient, and fixing it will cure a host of ills, and provide the same usefulness and benefits of Ruby.

Sorry, I don’t follow. In which way it’s severely deficient? Are you talking about the bit_length producing wrong results or something else?

Thanks for the feedback. Here’s a PR that fixes and improves everything: https://github.com/crystal-lang/crystal/pull/8931

1 Like

Yes, because Math in Crystal was coded with just primitive types in mind. That you can pass BigInt to it is a mistake. What it does is to convert the given parameter to float with to_f, then call log2 on that. That’s why for big int a lot of precision is lost.

I guess for BigInt we need to provide a better overload. It can be done, but it’s a different issue (feel free to report that too).

1 Like

First, thanks for your persistence to getting this to work correctly. :+1:

A suggestion, add to the spec file as an additional sanity check:
(10.to_big_i ** 3010).bit_length.should eq(10000)

The deficiency between Ruby and Crystal I was alluding to is between the operation of Math.log2(n).
I assumed it was due to how they bind to the GMP C library, and maybe that is causing other (unseen) problems at a fundamental level with other functions. (Does Crystal’s version now match Ruby’s?)

But I’m glad you found the proper GMP function for this method.
Thanks again.

It seems there are no logarithmic functions in GMP


There’s a different library, MPFR, which seems to have those functions. But porting big float over would be pretty time consuming.

From this: https://en.wikipedia.org/wiki/GNU_MPFR
it looks like this would be an essential library to use for accurate numerical processing.

I wish I had the skills to help with some of this, but C programming is not my forte.

Crystal has progressed significantly since just a couple of years ago, and is still young (compared to Ruby, Python, etc) so things will come with time.

I appreciate all the work, time, and thinking all the devs are putting into it, and can’t wait to see what 1.0 will be.

3 Likes

Sounds like a good opportunity for a community shard.

As I said, I wouldn’t mind add Int.sqrt to the standard library. We already have the code provided by @jzakiya, and there’s a GMP function for that.

@jzakiya Would you mind sending a PR for that?

I think we need to discuss the API for the method before implementation, e.g.:

  • use the Newton method(s) implementation I provided for sqr|nth roots or the GMP library
  • if so, incorporate the new bit_length method
  • are people good with the name isqrt, or other?
  • use as an Int instance method (Ruby’s uses Integer.sqrt as a class method)?
  • how to deal with negative integers (return nil, raise error and abort|return, crash program)?

These aren’t that hard to decide, but as I say, the divinity is in the details.

That’s already done and will be available in the next release

I think having Math.isqrt is good. That way all “math” methods are kept in the Math module.

Raise an ArgumentError. Same as Math.sqrt (which isn’t like that in Crystal but should be).

3 Likes

I like being able to do n.isqrt just like now you can do x.gcd(y) vs gcd(x, y).

For me this is more natural/intuitive and how math is done/used by real people (vs devs :grin:).

Also, code reads/looks more natural/pretty when chaining methods like:

n.isqrt.times {..} or x.isqrt.gcd y

I fought for having it an instance method in Ruby too, but the history behind having Math.sqrt won.

Yeah
 we can consider removing the Math module and moving all the methods to the corresponding number types. But I’m not sure. Having all of those methods duplicated in all types will generate a lot of extra code which I’m not sure is worth it.

Nim (and other languages) use forms of dynamic dispatch

So you can write either something like x.gcd(y) or gcd(x, y) and the compiler knows they’re the same things. This allows for writing more OOPish type code, and facilitate pretty method chaining.

I don’t know if Crystal allows this. Maybe some form of aliasing?

However implemented, there are real productivity/comprehension benefits being able to write code that looks like this.

I believe it’s called “Uniform Function Call Syntax”:

Yeh, yeh, yeh, that’s it, the right name for it (though Nim also does use dynamic dispatch).

But I’m not a compiler/language writer, so I leave it to you gurus to figure out how/when to do/use it. :blush:

If the below code is acceptable I’ll submit a PR.
Where (what file) should it go into?

  def isqrt(n = self)          # Newton's method for Integer squareroot
    raise ArgumentError.new "Input must be non-negative integer" if n < 0
    return n if n < 2
    b = n.bit_legth            # replaces n.to_s(2).size
    one = typeof(self).new(1)  # value 1 of type self
    x = one << ((b - 1) >> 1) | n >> ((b >> 1) + 1)
    while (t = n // x) < x
      x = (x + t) >> 1 
    end
    x                          # ouput same Integer type as input
  end
2 Likes

I think it should go in math.cr The definition should go inside module Math:

module Math
  def isqrt(n : Int)          # Newton's method for Integer squareroot
    raise ArgumentError.new "Input must be non-negative integer" if n < 0
    return n if n < 2
    b = n.bit_legth            # replaces n.to_s(2).size
    one = typeof(n).new(1)  # value 1 of type self
    x = one << ((b - 1) >> 1) | n >> ((b >> 1) + 1)
    while (t = n // x) < x
      x = (x + t) >> 1 
    end
    x                          # ouput same Integer type as input
  end
end
3 Likes

In Ruby when you include Math inside your code you can write, e.g:

sqrt(x) or cos(x) vs Math.sqrt(x) or Math.cos(x)

You can do the same in Crystal, right?

Then I can chain as isqrt(x).gcd(y),
though still not as consistent/pretty as: x.isqrt.gcd(y)