There is a way to optimize this program?

You can make it use much less memory by only creating as many fibers as there are threads, sending the batches through a Channel(Int64...Int64) and have each fiber loop to process the next batch. It may even be a bit faster, especially if the channel buffer size is the number of batches.

Notice that by creating as many fibers as batch numbers, you get faster… because you end up creating less fibers. With only a fixed set of fibers, it would be the opposite: small batches (e.g. 1000) are faster than larger ones.

require "wait_group"

def collatz(seed : Int64)
  steps = 0_i64

  while seed > 1
    while seed % 2 == 0
      steps &+= 1
      seed //= 2
    end

    if seed > 1
      steps &+= 1
      seed = seed &* 3 &+ 1
    end
  end

  steps
end

n = ENV.fetch("N", "1_000_000_000").to_i64(underscore: true)
b = ENV.fetch("B", "1_000").to_i64(underscore: true)
w = ENV.fetch("CRYSTAL_WORKERS", "4").to_i

channel = Channel({Int64, Int64}).new((n // b + 1).to_i32)
wg = WaitGroup.new(w)

w.times do |i|
  spawn(name: "WORKER-#{i}") do
    while r = channel.receive?
      (r[0]...r[1]).each do |seed|
        steps = collatz(seed)

        if seed % 1_000_000 == 0
          print "Seed: #{seed} Steps: #{steps}\r"
        end
      end
    end
    wg.done
  end
end

start = Time.measure do
  s = n // b
  r0 = 0_i64

  s.times do
    r1 = r0 &+ b
    channel.send({r0, r1})
    r0 = r1
  end

  if n - b &* s > 0
    channel.send({r0, n})
  end

  channel.close
  wg.wait
end

puts "\ncollatz took: #{start}"

As for timings, on my old i7 1472HQ, the Go example takes 6m30s, the direct Crystal port 5m30s and the above 1m30s (8 threads).

5 Likes