Arithmetic overflow when trying to benchmark some digest

I randomly came across this benchmark of some Ruby code ("ruby Digest::* benchmark" on GitHub)
and I got curious to see what that would look like in Crystal.

Here’s my attempt at the same benchmark:

require "digest"
require "benchmark"

SRC = File.read("/dev/urandom")[0, 4096]

Benchmark.bm do |bm|
  bm.report("MD5") { 100000.times { Digest::MD5.hexdigest(SRC) } }
  bm.report("SHA1") { 100000.times { Digest::SHA1.hexdigest(SRC) } }
  bm.report("SHA256") { 100000.times { Digest::SHA256.hexdigest(SRC) } }
end

But when I run this, I get an arithmetic overflow error:

Unhandled exception: Arithmetic overflow (OverflowError)
  from Math@Math::pw2ceil<Int32>:Int32
  from /usr/local/Cellar/crystal/0.36.1_2/src/string/builder.cr:124:5 in 'write'
  from /usr/local/Cellar/crystal/0.36.1_2/src/io.cr:1120:7 in '__crystal_main'
  from /usr/local/Cellar/crystal/0.36.1_2/src/crystal/main.cr:110:5 in 'main'

I didn’t want to just post a bug in case I was doing something weird in my code example.

❯ crystal -v
Crystal 0.36.1 (2021-02-02)

LLVM: 11.0.1
Default target: x86_64-apple-macosx

File.read("/dev/urandom") reads from /dev/urandom to the end. But there’s no end. You can always read more from /dev/urandom.

You might use something like this instead:

SRC = String.build do |io|
  File.open("/dev/urandom") do |file|
    IO.copy(file, io, 4096)
  end
end

Maybe we should add a limit parameter to File.read.

And the error message could be improved, i.e. rescue OverflowError and raise IO::Error instead in String::Builder#write (also in similar places).

Ah. Ok thanks! That makes sense. Yeah, a limit option on read would be pretty cool. Want me to open an issue on that?

You can do something like File.open(filename, &.gets(limit)), there’s no need to change anything.

Nice! That’s handy. Yeah, I’m all for using a built-in way if one already exists. I guess it’s just a matter of knowing how to do those things.

Although not directly related to this specific context (as the file is infinite), it is possible to pass a file path instead of the actual data. E.g.

Digest::SHA256.hexdigest &.file "./foo.txt"

Which can be handy for larger files.
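
As a rough sketch of that idea (the temporary file and its contents are made up for illustration), hashing by path should give the same result as hashing the file's contents read into memory:

```crystal
require "digest"

# Write a small temporary file so the sketch is self-contained.
# The prefix "digest-example" and the contents are arbitrary.
file = File.tempfile("digest-example") do |f|
  f.print "hello\nworld\n"
end

# Hash by path: the digest reads the file itself.
from_path = Digest::SHA256.hexdigest &.file(file.path)

# Hash by data: the whole file is loaded into a String first.
from_data = Digest::SHA256.hexdigest(File.read(file.path))

file.delete

puts from_path == from_data
```

The path-based form avoids building one big String up front, which is presumably why it is handy for larger files.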

gets(limit) stops at line break. So that’s not a generic alternative.
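
For a version that ignores line breaks, one option is to read a fixed number of bytes into a buffer with IO#read_fully. A minimal sketch, assuming a hypothetical helper named read_prefix (it is not part of the standard library):

```crystal
# Hypothetical helper: read exactly `limit` bytes from the start of a file,
# regardless of any line breaks in the data.
def read_prefix(path : String, limit : Int32) : Bytes
  buffer = Bytes.new(limit)
  File.open(path) do |file|
    # read_fully raises IO::EOFError if fewer than `limit` bytes are available.
    file.read_fully(buffer)
  end
  buffer
end

SRC = read_prefix("/dev/urandom", 4096)
puts SRC.size
```

As far as I can tell, the Digest hexdigest methods take Bytes as well as a String, so SRC can be passed to them directly.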
