Segmented Sieve Memory Usage

RespiteSage · December 23, 2021, 3:59am

I’m messing around with implementing and comparing different prime number generators, including prime sieves, after becoming interested in @jzakiya’s Sieve of Zakiya. I haven’t gotten as far as complex sieves like that yet, though. I’m currently working on implementing varieties of the Sieve of Eratosthenes, and I’m stumped by my Segmented Sieve of Eratosthenes using much more memory than my standard Sieve of Eratosthenes implementation.

Here are my implementations:

def sieve_of_eratosthenes(upto n)
  possible_primes = BitArray.new n + 1, initial: true
  possible_primes[0] = false
  possible_primes[1] = false

  (2..).take_while { |i| i <= Math.sqrt(n) }.each do |i|
    if possible_primes[i]?
      (i**2).upto(n).step(i).each do |j|
        possible_primes[j] = false
      end
    end
  end

  possible_primes.each_index.select { |index| possible_primes[index] }.to_a
end

def segmented_sieve_of_eratosthenes(upto n)
  delta = Math.sqrt(n).to_i

  known_primes = sieve_of_eratosthenes upto: delta

  possible_primes = BitArray.new delta
  (delta + 1).upto(n).step(delta).each do |segment_min|
    segment_max = segment_min + delta - 1
    prime_divisor_limit = Math.sqrt(segment_max)
    relevant_prime_divisors = known_primes.each.take_while { |prime| prime <= prime_divisor_limit }
    possible_primes.fill true

    pre_segment_min = segment_min - 1
    relevant_prime_divisors.each do |p|
      starting_index = p - (pre_segment_min % p) - 1
      starting_index.upto(delta - 1).step(p).each do |j|
        possible_primes[j] = false
      end
    end

    possible_primes.each_index do |index|
      if possible_primes[index]
        known_primes << (segment_min + index)
      end
    end
  end

  known_primes
end

The segmented sieve is using around 5 times as much memory as the basic version*. I initially noticed a difference in memory consumption when using Benchmark.ips, and since then I’ve just been checking with GC.stats.total_bytes at the end of a run (switching implementations and recompiling to compare).

I’m not sure whether adding GC.stats.total_bytes checks around the code is a valid way to check what’s causing memory allocation, but when I did that it seemed like the if possible_primes[index] line toward the bottom of the segmented sieve implementation was allocating a bunch of memory, which seems… wrong.

So basically I’m wondering if anyone else has an idea of why I’m seeing more memory usage in the segmented sieve than in the basic one. If you have a critique of the algorithm re runtime performance or correctness, I guess that’s fine, but it’s not really what I’m looking for.

If it matters, I’m running Crystal 1.2.2 with LLVM 10.0.0 on OpenSUSE Tumbleweed.

* 36.7MiB vs 7.27MiB for n = 10,000,000, 94.9kiB vs 26.5kiB for n = 10,000

rogerdpack · December 23, 2021, 7:03am

Seems like a reasonable way to test. Maybe you can check total_bytes within the code to see where it spikes?

RespiteSage · December 23, 2021, 3:49pm

Yeah, here’s the code with #total_bytes checks throughout and the code that creates that output: gist. The non-segmented version takes only 10.1 KiB in this case, while the segmented version takes 146 KiB. I did intentionally choose a small n in order to lessen the amount of output to sift through, but the results have been similar for larger n.

EDIT: I goofed with my #total_bytes output and was creating a bunch of strings which were taking memory. I’ve updated the gist to reflect a better way of outputting that value without ruining the usefulness of it. The difference in memory consumption is not nearly so much now, but the segmented version is still using more memory: 11.7 KiB vs the aforementioned 10.1 KiB for the non-segmented version. The values I gave in the original post for larger n are still valid.

rogerdpack · December 24, 2021, 6:24am

I’m not quite sure how to debug it. Maybe add similar debugs to the original and look for differences? Does GC.collect help?

Topic		Replies	Views
Twin Primes Segmented Sieve of Zakiya (SSoZ) Explained News	16	613	June 29, 2022
Kudos to Crystal! News	12	2147	December 29, 2021
Why so big differences? Help & Support	16	380	January 23, 2024
Ensuring Memory Alignment Help & Support	4	143	February 2, 2024
Crystal vs Go: Notes and Reflections Learning Resources	3	7197	March 3, 2021

Segmented Sieve Memory Usage

Related topics