BigInt is really really slow when run on linux, Or just Boehm GC issue?

It seems using GC.malloc_atomic produces a significant different. Well, it goes from 26 seconds to 23 seconds in my machine.

Could you please send a patch or diff?

i can test it on my linux laptop.

I guess it would be this change

diff --git a/src/big/lib_gmp.cr b/src/big/lib_gmp.cr
index 446f3ea5a..1791f00c4 100644
--- a/src/big/lib_gmp.cr
+++ b/src/big/lib_gmp.cr
@@ -240,7 +240,7 @@ lib LibGMP
 end

 LibGMP.set_memory_functions(
-  ->(size) { GC.malloc(size) },
+  ->(size) { GC.malloc_atomic(size) },
   ->(ptr, old_size, new_size) { GC.realloc(ptr, new_size) },
   ->(ptr, size) { GC.free(ptr) }
 )
1 Like

Fiber or thread? I was under the impression that GC happens outside the main thread. At least, I remember seeing GC threads outside the main thread when profiling.

Regardless of implementation, though, the GC doesn’t need you to relinquish the CPU to collect. For example, this app never uses more than 2.3MB of RAM despite generating tons of garbage and never yielding the CPU:

strings = [] of String

1_000_000_000.times do |i|
  string = i.to_s
  if rand < 0.1
    strings << string
    strings.shift if strings.size > 100
  end
end

That’s not true. The garbage collector works like this: it reserves some space for memory. When you call malloc, if there’s free space, it gives you that. Otherwise, it runs the GC to see if there’s memory that can be free. If so, it frees it and uses that space for the requested memory. Otherwise it requests more memory.

From their manual:

GMP may use allocated blocks to hold pointers to other allocated blocks. This will limit the assumptions a conservative garbage collection scheme can make.

So no, malloc_atomic is unsafe and will cause those internally pointed blocks to be collected.

2 Likes

I test on following code use a clear linux env

require "big"

def fib(n)
  if n <= 1
    BigInt.new(1)
  else
    fib(n - 1) + fib(n - 2)
  end
end

time = Time.local
puts fib(BigInt.new(42))
puts Time.local - time
 ╰─ $ cr version
Crystal 1.6.0-dev [f2599414a] (2022-09-25)

LLVM: 14.0.6
Default target: x86_64-pc-linux-gnu

Following is the result before patch:

 ╰─ $ cr run --release 1.cr 
433494437
00:01:37.712517060

Following is the result after apply the following malloc_atomic patch

 ╰─ $ cr run --release 1.cr 
433494437
00:01:35.808699703

There is no big different.

1 Like

Oh, yeah what I wrote is completely wrong :man_facepalming: I was writing it on the go and apparently mixed something up. Sorry. I need to be more careful and not write such nonsense.

1 Like

Hi, i don’t know, just ask, how to know this code use more than 2.3MB of RAM? Could you please paste the code?

I was using macOS Activity Monitor. You could also use top or htop.

1 Like

Disabling the GC actually ends up using more time because of the sheer amount of dynamic allocations involved.

GMP is meant to achieve peak performance when the values are mutable, which BigInt doesn’t support right now. A recursive implementation that stays true to GMP’s spirit might look like:

struct BigInt
  def add!(other : self) : self
    LibGMP.add(self, self, other)
    self
  end

  def add!(other : Int32) : self
    if other >= 0
      LibGMP.add_ui(self, self, other)
    else
      LibGMP.sub_ui(self, self, -other)
    end
    self
  end
end

ONE = 1.to_big_i

def fib2(ret, n) : Nil
  if n.value <= 1
    ret.value.add!(ONE)
  else
    n.value.add!(-1)
    fib2(ret, n)
    n.value.add!(-1)
    fib2(ret, n)
    n.value.add!(2)
  end
end

def fib2(n : BigInt) : BigInt
  ret = 0.to_big_i
  fib2(pointerof(ret), pointerof(n))
  ret
end

On my M2, fib2(42.to_big_i) takes 6.42 seconds whereas fib(42.to_big_i) takes 24.7, a speedup over 280%. The new version also allocates exactly 3 BigInts: the argument, the return value, and ONE. But it is hard to say if fib2 is idiomatic arithmetic code anymore, and the standard library’s BigInt API is specifically designed so that users don’t have to worry about writing (and more importantly, reading) this kind of code.

Also, for a fairer comparison try using a really large constant for fib(1)'s value on Ruby (something larger than 2 ** 63).

1 Like

See also: Add mutating methods to BigInt by Exilor · Pull Request #9825 · crystal-lang/crystal · GitHub

This seems to be a recurring Add mutating methods to BigInt by Exilor · Pull Request #9825 · crystal-lang/crystal · GitHub so I guess we eventually do something about it.