I’m messing around with implementing and comparing different prime number generators, including prime sieves, after becoming interested in @jzakiya’s Sieve of Zakiya. I haven’t gotten as far as complex sieves like that yet, though. I’m currently working on implementing varieties of the Sieve of Eratosthenes, and I’m stumped by my Segmented Sieve of Eratosthenes using much more memory than my standard Sieve of Eratosthenes implementation.
Here are my implementations:
def sieve_of_eratosthenes(upto n)
possible_primes = BitArray.new n + 1, initial: true
possible_primes[0] = false
possible_primes[1] = false
(2..).take_while { |i| i <= Math.sqrt(n) }.each do |i|
if possible_primes[i]?
(i**2).upto(n).step(i).each do |j|
possible_primes[j] = false
end
end
end
possible_primes.each_index.select { |index| possible_primes[index] }.to_a
end
def segmented_sieve_of_eratosthenes(upto n)
delta = Math.sqrt(n).to_i
known_primes = sieve_of_eratosthenes upto: delta
possible_primes = BitArray.new delta
(delta + 1).upto(n).step(delta).each do |segment_min|
segment_max = segment_min + delta - 1
prime_divisor_limit = Math.sqrt(segment_max)
relevant_prime_divisors = known_primes.each.take_while { |prime| prime <= prime_divisor_limit }
possible_primes.fill true
pre_segment_min = segment_min - 1
relevant_prime_divisors.each do |p|
starting_index = p - (pre_segment_min % p) - 1
starting_index.upto(delta - 1).step(p).each do |j|
possible_primes[j] = false
end
end
possible_primes.each_index do |index|
if possible_primes[index]
known_primes << (segment_min + index)
end
end
end
known_primes
end
The segmented sieve is using around 5 times as much memory as the basic version*. I initially noticed a difference in memory consumption when using Benchmark.ips
, and since then I’ve just been checking with GC.stats.total_bytes
at the end of a run (switching implementations and recompiling to compare).
I’m not sure whether adding GC.stats.total_bytes
checks around the code is a valid way to check what’s causing memory allocation, but when I did that it seemed like the if possible_primes[index]
line toward the bottom of the segmented sieve implementation was allocating a bunch of memory, which seems… wrong.
So basically I’m wondering if anyone else has an idea of why I’m seeing more memory usage in the segmented sieve than in the basic one. If you have a critique of the algorithm re runtime performance or correctness, I guess that’s fine, but it’s not really what I’m looking for.
If it matters, I’m running Crystal 1.2.2 with LLVM 10.0.0 on OpenSUSE Tumbleweed.
* 36.7MiB vs 7.27MiB for n = 10,000,000, 94.9kiB vs 26.5kiB for n = 10,000