Integer sqrt in stdlib

jzakiya · March 23, 2020, 5:31am

My proposed isqrt produces the integer squareroot for all integer types, not floating point.

jzakiya · March 23, 2020, 5:17pm

Hey @asterite, I think the fundamental problem here is Crystal’s implementation of the GMP library.

Check out the following.

require "big"
require "benchmark"

module BitLength
  def bit_length()
    x = self < 0 ? ~self : self

    if x.is_a?(Int::Primitive)
      self.class.new(sizeof(self) * 8 - x.leading_zeros_count)
    else
      Math.log2(self).ceil.to_i
    end
  end
end

struct Int; include BitLength end

n = 10 ** 9
puts; puts "n = (10 ** 9) # => 30"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10_u64 ** 19
puts; puts "n = (10 ** 19) # => 64"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 21
puts; puts "n = (10 ** 21) # => 70"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 30
puts; puts "n = (10 ** 38) # => 100"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 50
puts; puts "n = (10 ** 50) # => 167"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 200
puts; puts "n = (10 ** 200) # => 667"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

n = 10.to_big_i ** 308
puts; puts "n = (10 ** 308) # => 1024"
Benchmark.ips do |x|
  x.report("bit_length") do
    n.bit_length
  end
  x.report("to_s(2).size") do
    n.to_s(2).size
  end
end

It produces this output, with all the input values creating valid results.

n = (10 ** 9) # => 30
  bit_length 436.29M (  2.29ns) (± 3.52%)   0.0B/op        fastest
to_s(2).size  11.09M ( 90.19ns) (± 6.20%)  48.0B/op  39.35× slower

n = (10 ** 19) # => 64
  bit_length 435.76M (  2.29ns) (± 3.71%)   0.0B/op        fastest
to_s(2).size   9.61M (104.02ns) (± 5.63%)  80.0B/op  45.33× slower

n = (10 ** 21) # => 70
  bit_length 108.68M (  9.20ns) (± 3.70%)  0.0B/op        fastest
to_s(2).size   3.00M (333.32ns) (± 4.80%)  176B/op  36.22× slower

n = (10 ** 38) # => 100
  bit_length 106.59M (  9.38ns) (± 7.24%)  0.0B/op        fastest
to_s(2).size   2.33M (429.16ns) (± 4.79%)  240B/op  45.75× slower

n = (10 ** 50) # => 167
  bit_length 108.22M (  9.24ns) (± 5.51%)  0.0B/op        fastest
to_s(2).size   1.50M (665.18ns) (± 4.75%)  368B/op  71.99× slower

n = (10 ** 200) # => 667
  bit_length 107.04M (  9.34ns) (± 6.26%)    0.0B/op         fastest
to_s(2).size 428.15k (  2.34µs) (± 5.89%)  1.44kB/op  249.99× slower

n = (10 ** 308) # => 1024
  bit_length 114.35M (  8.75ns) (± 6.29%)    0.0B/op         fastest
to_s(2).size 278.45k (  3.59µs) (± 5.36%)  2.63kB/op  410.65× slower

The Crystal version of Math.log2(n) bombs starting with (10 ** 309), and larger.

However, the Ruby implementation of Math.log2(n) works for large n.

2.7.0 :172 > Math.log2(10**100000).ceil
 => 332193 
2.7.0 :173 > (10**100000).bit_length 
 => 332193 
2.7.0 :174 > (10**100000).to_s(2).size
 => 332193 
2.7.0 :175

This says to me Crystal’s implementation of GMP is severely deficient, and fixing it will cure a host of ills, and provide the same usefulness and benefits of Ruby.

asterite · March 23, 2020, 5:21pm

Sorry, I don’t follow. In which way it’s severely deficient? Are you talking about the bit_length producing wrong results or something else?

asterite · March 23, 2020, 5:38pm

Thanks for the feedback. Here’s a PR that fixes and improves everything: https://github.com/crystal-lang/crystal/pull/8931

asterite · March 23, 2020, 5:40pm

Yes, because Math in Crystal was coded with just primitive types in mind. That you can pass BigInt to it is a mistake. What it does is to convert the given parameter to float with to_f, then call log2 on that. That’s why for big int a lot of precision is lost.

I guess for BigInt we need to provide a better overload. It can be done, but it’s a different issue (feel free to report that too).

jzakiya · March 23, 2020, 7:47pm

First, thanks for your persistence to getting this to work correctly.

A suggestion, add to the spec file as an additional sanity check:
(10.to_big_i ** 3010).bit_length.should eq(10000)

The deficiency between Ruby and Crystal I was alluding to is between the operation of Math.log2(n).
I assumed it was due to how they bind to the GMP C library, and maybe that is causing other (unseen) problems at a fundamental level with other functions. (Does Crystal’s version now match Ruby’s?)

But I’m glad you found the proper GMP function for this method.
Thanks again.

asterite · March 23, 2020, 8:44pm

It seems there are no logarithmic functions in GMP…

There’s a different library, MPFR, which seems to have those functions. But porting big float over would be pretty time consuming.

jzakiya · March 23, 2020, 9:51pm

From this: https://en.wikipedia.org/wiki/GNU_MPFR
it looks like this would be an essential library to use for accurate numerical processing.

I wish I had the skills to help with some of this, but C programming is not my forte.

Crystal has progressed significantly since just a couple of years ago, and is still young (compared to Ruby, Python, etc) so things will come with time.

I appreciate all the work, time, and thinking all the devs are putting into it, and can’t wait to see what 1.0 will be.

RespiteSage · March 24, 2020, 2:51pm

Sounds like a good opportunity for a community shard.

asterite · March 24, 2020, 3:40pm

As I said, I wouldn’t mind add Int.sqrt to the standard library. We already have the code provided by @jzakiya, and there’s a GMP function for that.

@jzakiya Would you mind sending a PR for that?

jzakiya · March 24, 2020, 4:02pm

I think we need to discuss the API for the method before implementation, e.g.:

use the Newton method(s) implementation I provided for sqr|nth roots or the GMP library
if so, incorporate the new bit_length method
are people good with the name isqrt, or other?
use as an Int instance method (Ruby’s uses Integer.sqrt as a class method)?
how to deal with negative integers (return nil, raise error and abort|return, crash program)?

These aren’t that hard to decide, but as I say, the divinity is in the details.

asterite · March 24, 2020, 4:26pm

That’s already done and will be available in the next release

I think having Math.isqrt is good. That way all “math” methods are kept in the Math module.

Raise an ArgumentError. Same as Math.sqrt (which isn’t like that in Crystal but should be).

jzakiya · March 24, 2020, 4:42pm

I like being able to do n.isqrt just like now you can do x.gcd(y) vs gcd(x, y).

For me this is more natural/intuitive and how math is done/used by real people (vs devs ).

Also, code reads/looks more natural/pretty when chaining methods like:

n.isqrt.times {..} or x.isqrt.gcd y

I fought for having it an instance method in Ruby too, but the history behind having Math.sqrt won.

asterite · March 24, 2020, 4:55pm

Yeah… we can consider removing the Math module and moving all the methods to the corresponding number types. But I’m not sure. Having all of those methods duplicated in all types will generate a lot of extra code which I’m not sure is worth it.

jzakiya · March 24, 2020, 5:04pm

Nim (and other languages) use forms of dynamic dispatch

So you can write either something like x.gcd(y) or gcd(x, y) and the compiler knows they’re the same things. This allows for writing more OOPish type code, and facilitate pretty method chaining.

I don’t know if Crystal allows this. Maybe some form of aliasing?

However implemented, there are real productivity/comprehension benefits being able to write code that looks like this.

asterite · March 24, 2020, 5:24pm

I believe it’s called “Uniform Function Call Syntax”:

jzakiya · March 24, 2020, 5:28pm

Yeh, yeh, yeh, that’s it, the right name for it (though Nim also does use dynamic dispatch).

But I’m not a compiler/language writer, so I leave it to you gurus to figure out how/when to do/use it.

jzakiya · March 25, 2020, 11:29pm

If the below code is acceptable I’ll submit a PR.
Where (what file) should it go into?

  def isqrt(n = self)          # Newton's method for Integer squareroot
    raise ArgumentError.new "Input must be non-negative integer" if n < 0
    return n if n < 2
    b = n.bit_legth            # replaces n.to_s(2).size
    one = typeof(self).new(1)  # value 1 of type self
    x = one << ((b - 1) >> 1) | n >> ((b >> 1) + 1)
    while (t = n // x) < x
      x = (x + t) >> 1 
    end
    x                          # ouput same Integer type as input
  end

asterite · March 25, 2020, 11:59pm

I think it should go in math.cr The definition should go inside module Math:

module Math
  def isqrt(n : Int)          # Newton's method for Integer squareroot
    raise ArgumentError.new "Input must be non-negative integer" if n < 0
    return n if n < 2
    b = n.bit_legth            # replaces n.to_s(2).size
    one = typeof(n).new(1)  # value 1 of type self
    x = one << ((b - 1) >> 1) | n >> ((b >> 1) + 1)
    while (t = n // x) < x
      x = (x + t) >> 1 
    end
    x                          # ouput same Integer type as input
  end
end

jzakiya · March 26, 2020, 3:27am

In Ruby when you include Math inside your code you can write, e.g:

sqrt(x) or cos(x) vs Math.sqrt(x) or Math.cos(x)

You can do the same in Crystal, right?

Then I can chain as isqrt(x).gcd(y),
though still not as consistent/pretty as: x.isqrt.gcd(y)

Topic		Replies	Views
Integer sqrt not in std lib yet Crystal Contrib	1	306	July 18, 2021
Ruby -> Crystal: n.to_s[0].to_i Help & Support	11	738	March 12, 2020
Hack to convert [Big]Ints to any Int type Help & Support	0	211	May 8, 2023
BigInt powmod(b, e, m) or equivalent	1	456	March 3, 2019
Proposal: Support for arbitrary sized integers Crystal Contrib	5	569	February 25, 2021

Integer sqrt in stdlib

Related topics