It seems using GC.malloc_atomic produces a significant different. Well, it goes from 26 seconds to 23 seconds in my machine.
Could you please send a patch or diff?
i can test it on my linux laptop.
I guess it would be this change
diff --git a/src/big/lib_gmp.cr b/src/big/lib_gmp.cr
index 446f3ea5a..1791f00c4 100644
--- a/src/big/lib_gmp.cr
+++ b/src/big/lib_gmp.cr
@@ -240,7 +240,7 @@ lib LibGMP
end
LibGMP.set_memory_functions(
- ->(size) { GC.malloc(size) },
+ ->(size) { GC.malloc_atomic(size) },
->(ptr, old_size, new_size) { GC.realloc(ptr, new_size) },
->(ptr, size) { GC.free(ptr) }
)
Fiber or thread? I was under the impression that GC happens outside the main thread. At least, I remember seeing GC threads outside the main thread when profiling.
Regardless of implementation, though, the GC doesnât need you to relinquish the CPU to collect. For example, this app never uses more than 2.3MB of RAM despite generating tons of garbage and never yielding the CPU:
strings = [] of String
1_000_000_000.times do |i|
string = i.to_s
if rand < 0.1
strings << string
strings.shift if strings.size > 100
end
end
Thatâs not true. The garbage collector works like this: it reserves some space for memory. When you call malloc, if thereâs free space, it gives you that. Otherwise, it runs the GC to see if thereâs memory that can be free. If so, it frees it and uses that space for the requested memory. Otherwise it requests more memory.
From their manual:
GMP may use allocated blocks to hold pointers to other allocated blocks. This will limit the assumptions a conservative garbage collection scheme can make.
So no, malloc_atomic
is unsafe and will cause those internally pointed blocks to be collected.
I test on following code use a clear linux env
require "big"
def fib(n)
if n <= 1
BigInt.new(1)
else
fib(n - 1) + fib(n - 2)
end
end
time = Time.local
puts fib(BigInt.new(42))
puts Time.local - time
â°â $ cr version
Crystal 1.6.0-dev [f2599414a] (2022-09-25)
LLVM: 14.0.6
Default target: x86_64-pc-linux-gnu
Following is the result before patch:
â°â $ cr run --release 1.cr
433494437
00:01:37.712517060
Following is the result after apply the following malloc_atomic patch
â°â $ cr run --release 1.cr
433494437
00:01:35.808699703
There is no big different.
Oh, yeah what I wrote is completely wrong I was writing it on the go and apparently mixed something up. Sorry. I need to be more careful and not write such nonsense.
Hi, i donât know, just ask, how to know this code use more than 2.3MB of RAM? Could you please paste the code?
I was using macOS Activity Monitor. You could also use top
or htop
.
Disabling the GC actually ends up using more time because of the sheer amount of dynamic allocations involved.
GMP is meant to achieve peak performance when the values are mutable, which BigInt
doesnât support right now. A recursive implementation that stays true to GMPâs spirit might look like:
struct BigInt
def add!(other : self) : self
LibGMP.add(self, self, other)
self
end
def add!(other : Int32) : self
if other >= 0
LibGMP.add_ui(self, self, other)
else
LibGMP.sub_ui(self, self, -other)
end
self
end
end
ONE = 1.to_big_i
def fib2(ret, n) : Nil
if n.value <= 1
ret.value.add!(ONE)
else
n.value.add!(-1)
fib2(ret, n)
n.value.add!(-1)
fib2(ret, n)
n.value.add!(2)
end
end
def fib2(n : BigInt) : BigInt
ret = 0.to_big_i
fib2(pointerof(ret), pointerof(n))
ret
end
On my M2, fib2(42.to_big_i)
takes 6.42 seconds whereas fib(42.to_big_i)
takes 24.7, a speedup over 280%. The new version also allocates exactly 3 BigInt
s: the argument, the return value, and ONE
. But it is hard to say if fib2
is idiomatic arithmetic code anymore, and the standard libraryâs BigInt
API is specifically designed so that users donât have to worry about writing (and more importantly, reading) this kind of code.
Also, for a fairer comparison try using a really large constant for fib(1)
's value on Ruby (something larger than 2 ** 63
).
See also: Add mutating methods to BigInt by Exilor ¡ Pull Request #9825 ¡ crystal-lang/crystal ¡ GitHub
This seems to be a recurring Add mutating methods to BigInt by Exilor ¡ Pull Request #9825 ¡ crystal-lang/crystal ¡ GitHub so I guess we eventually do something about it.