Crystal string build failure

I was reading a post that mentioned this language comparison, "Evaluating Language Performance when Handling Extensive String Constructions". The Crystal program fails for large buffers and I can't see why.

How large are your large buffers? Are you hitting the Int32::MAX size limit on strings?

Not my code. I saw a blog post about a speed comparison of buffer building and clicked the link.
It fails at 200,000,000 iterations, when the text size reaches 2.13 GB. It is also not as fast as the other implementations.

Language  Time       Notes
Rust      00:01:625
Java      00:01:77   Failed to save
C#        00:02:89   Failed to save
Nim       00:03:89
C++       00:04:14
C         00:04:41
Perl      00:04:497
Crystal   00:05:90

require "time"

def test(num : Int64)
  
  print "  Crystal"
    
    start_time = Time.monotonic

    s = String.build do |str|
    
    (1..num).each do |i|
      str << " C #{i}"
    end

    elapsed_time = Time.monotonic - start_time
    
    mins = elapsed_time.total_minutes.floor
    secs = elapsed_time.total_seconds.floor % 60
    millis = elapsed_time.total_milliseconds % 1000
    
    puts "  #{mins}:#{secs}:#{millis} Iter #{num} Len #{str.bytesize} "
     
 
    File.write("out/crystal_output.txt", str.to_s) rescue puts "Error saving string to file."

  end
end

num = ARGV[0].to_i
test(num)

The code does not actually work in the first place:

  Crystal  0.0:2.0:662.8273509999999 Iter 50000000 Len 538888897 
Unhandled exception: Can only invoke 'to_s' once on String::Builder (Exception)
  from src/string/builder.cr:101:5 in 'to_s'
  from usr/test.cr:96:1 in '__crystal_main'
  from src/crystal/main.cr:118:5 in 'main'
  from /lib/x86_64-linux-gnu/libc.so.6 in '??'
  from /lib/x86_64-linux-gnu/libc.so.6 in '__libc_start_main'
  from ./test in '_start'
  from ???

Even if it worked, it would indeed fail to save because String only handles byte sizes up to around Int32::MAX.
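
To see where the limit bites, the output size can be computed without building the string. This is only a rough sketch: the helper name `expected_bytesize` is made up for illustration, and it assumes each entry is the 3-byte prefix " C " followed by the decimal digits of i.

```crystal
# Hypothetical helper: bytes produced by appending " C " and i for i in 1..num,
# computed by counting digits instead of building the string.
def expected_bytesize(num : UInt64) : UInt64
  total = 3_u64 * num # every entry starts with the 3-byte prefix " C "
  lower = 1_u64       # smallest number with the current digit count
  digits = 1_u64
  while lower <= num
    upper = Math.min(num, lower * 10 - 1) # largest number with this digit count
    total += (upper - lower + 1) * digits
    lower *= 10
    digits += 1
  end
  total
end

puts expected_bytesize(50_000_000_u64)  # 538888897, the Len printed in the run above
puts expected_bytesize(200_000_000_u64) # 2288888898
puts(expected_bytesize(200_000_000_u64) > Int32::MAX) # true
```

So the 200,000,000 run needs about 2.13 GiB, which is above Int32::MAX (2147483647), while the 50,000,000 run stays well below it.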

Yeah the Crystal code is broken. And the comparison is completely flawed.
Some implementations (including Crystal’s) construct an intermediary string in the loop, others don’t (Java and Rust for example).

The difference in Crystal is that str << " C #{i}" builds an intermediary string before appending it to str, while str << " C " << i appends both parts directly, so there are fewer allocations.
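
A small side-by-side sketch of the two forms, using a tiny loop of 5 iterations just to show that they produce the same bytes (it does not measure anything):

```crystal
# Variant 1: interpolation creates a temporary String per iteration, then appends it.
interpolated = String.build do |str|
  (1..5).each { |i| str << " C #{i}" }
end

# Variant 2: chained appends write the literal and the integer straight into the builder.
chained = String.build do |str|
  (1..5).each { |i| str << " C " << i }
end

puts interpolated            # " C 1 C 2 C 3 C 4 C 5"
puts interpolated == chained # true
```

The result is identical; only the amount of intermediate work differs.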

# crystal build -o ..\exe\cr_strbld.exe --release cr_strbld.cr

require "time"

def test(num : Int64)
  
  print "  Crystal"
    
    start_time = Time.monotonic

    s = String.build do |str|
    
    (1..num).each do |i|
      str << " C #{i}"
    end

    elapsed_time = Time.monotonic - start_time
    
    mins = elapsed_time.total_minutes.floor
    secs = elapsed_time.total_seconds.floor % 60
    millis = elapsed_time.total_milliseconds % 1000
    
    puts "  #{mins}:#{secs}:#{millis} Iter #{num} Len #{str.bytesize} "
     
 
    File.write("out/crystal_output.txt", str.to_s) rescue puts "Error saving string to file."

  end
end

num = ARGV[0].to_i
test(num)

I got it to compile on my Windows machine: I cloned the repo and compiled it. To make it the same as Rust, how would you get rid of the allocations, or is that even possible? Every interpolation creates a String.

It’s a simple change:

-      str << " C #{i}"
+      str << " C " << i

Now we don’t have an intermediate string " C #{i}" anymore, " C " and i are appended directly to str.
This should cut the execution time almost in half.

Sorry, it is broken but I'm not sure why. Can someone explain the error and how to fix it?
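
Going by the stack trace and the indentation (this is my reading, not something the benchmark's author confirmed): the `end` after the loop only closes the `each` block, so the timing code and `File.write` run inside the `String.build` block, where `str` is still the `String::Builder`. Calling `str.to_s` there finalizes the builder, and `String.build` calls `to_s` again once the block returns, which raises. A minimal sketch that reproduces the message, followed by the fixed shape:

```crystal
# Reproduce: calling to_s on the builder inside the block finalizes it,
# so the to_s that String.build performs afterwards raises.
message = begin
  String.build do |str|
    str << " C " << 1
    str.to_s
  end
rescue ex
  ex.message
end
puts message # Can only invoke 'to_s' once on String::Builder

# Fix: close the block right after the loop and use the returned String.
s = String.build do |str|
  (1..3).each { |i| str << " C " << i }
end
puts s.bytesize # 12
```

Everything that needs the finished text (timing output, `File.write`) then goes after the block and uses `s`.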

I made the optimization @straight-shoota mentioned and converted the counter variable to a UInt64, and now it performs much closer to the Rust code, which also has that same optimization. That is down from 5.357s to 1.344s on 100M. I also made some other edits to make it more idiomatic Crystal code.

➜  langs_string_build_test git:(main) ✗ for lang in c rs cr; do; exe/${lang}_strbld.exe 100000000; done
  C  0:5:5115 Iter 100000000: Len 1088888898
  Rust  00:01:211 iter 100000000 len 1088888898
  Crystal  00:01:344 iter 100000000 len 1088888898

Assuming the ratio of the Rust implementation’s performance on my machine vs theirs also applies to the Crystal implementation, that puts it at roughly 1.803s on their machine, making it the second-fastest unbroken implementation for the 100M table.
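
Spelled out, that estimate is just a scaling by the Rust ratio. A quick sketch, assuming the 1.625 s Rust time from the table quoted earlier in the thread is the figure being scaled (the variable names are only for illustration):

```crystal
rust_reference = 1.625 # seconds, Rust row of the table quoted earlier (assumed to be the one scaled)
rust_local = 1.211     # seconds, Rust in the run above
crystal_local = 1.344  # seconds, Crystal in the run above

# Scale the local Crystal time by how much slower Rust ran on the reference machine.
estimate = crystal_local * rust_reference / rust_local
puts estimate.round(3) # roughly 1.803
```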

The code:

# crystal build -o ..\exe\cr_strbld.exe --release cr_strbld.cr

require "time"

def test(num : UInt64)
  print "  Crystal"

  start_time = Time.monotonic

  s = String.build do |str|
    (1u64..num).each do |i|
      str << " C " << i
    end
  end

  elapsed_time = Time.monotonic - start_time

  mins = elapsed_time.total_minutes.floor.to_i
  secs = elapsed_time.total_seconds.floor % 60
  millis = elapsed_time.total_milliseconds % 1000

  puts "  %02d:%02d:%03d iter #{num} len #{s.bytesize} " % {mins, secs, millis}

  begin
    File.write("out/crystal_output.txt", s)
  rescue ex
    puts "Error saving string to file: #{ex}"
  end
end

num = ARGV[0].to_u64
test(num)

I have a feeling the performance difference may be in stringifying the integer. I haven’t checked, though.

One thing I found hilarious was the project’s readme says:

To ensure a fair comparison, I avoided using any optimization tricks or advanced features that could enhance the programs’ performance, relying solely on the default language features.

And then it has this optimization only applied to the Rust implementation, which just happens to be the most performant in their benchmarks:

diff --git a/src/rs_strbld.rs b/src/rs_strbld.rs
index 3e6c2da..084b501 100644
--- a/src/rs_strbld.rs
+++ b/src/rs_strbld.rs
@@ -3,6 +3,7 @@ use std::fs;
 use std::time::Instant;
 use std::env;
 use std::io::{self, Write};
+use std::fmt::Write as _;
 
 fn format_time(time_ms: u128) -> String {
   let milliseconds = time_ms % 1000;
@@ -22,7 +23,10 @@ pub fn test(num: i64) {
      
     for _ in 1..=num {
         i += 1;
-        s.push_str(&format!(" R {}", i)); 
+        //s.push_str(&format!(" R {}", i)); 
+        s.push_str(" R "); 
+        write!(s, "{}", i).unwrap();
+

That’s interesting.
It’s impossible for one person to fully understand the details of 20 programming languages, so comparing languages is always hard. Such comparisons only make sense after community members of each language review and make optimizations. A very human way to go about it.

Rust is not the only one. At least Java avoids the intermediary allocation as well: s.append(" J ").append(i);.
It’s almost as fast as the Rust implementation.

Create a PR there

Anyway, there is no drawback in letting new users know that Crystal is faster.

I tried to change the code to work with bigger numbers, as some of the other languages' implementations do, but the following code does not work.

require "time"

class Iter
  include Iterator(UInt64)

  def initialize(@num : UInt64)
    @produced = 0
  end

  def next
    if @produced < @num
      @produced &+= 1
      @produced
    else
      stop
    end
  end
end

num = 200_000_000_u64

file = File.open("crystal_output.txt", "w")

def test(iter : Iter)
  number = 0

  str = String.build do |io|
    loop do
      number = iter.next

      io << " C " << number
    end
  end

  {str, ""}
rescue ex : IO::EOFError
  pp!({str, number})
  {str, number}
end

iter = Iter.new(num)

begin
  loop do
    test(iter).each do |e|
      file << e
    end
  end
end

file.close

When the String buffer reaches its maximum size (Int32::MAX), an IO::EOFError is raised. When that happens, I expected String.build to still build and return the string, but str is nil instead.

I can't figure out another way to achieve this, other than calling to_s on every i and counting the bytes so that each String.build stays under the limit. But that takes us back to the initial question: we don't want to create a new string for every number.

Any idea?
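
One possible way around it, sketched under the assumption that the goal is only to get the text into the file: skip the intermediate String and append straight to the File, which is itself an IO. As for the nil, the exception propagates out of String.build before its result is ever assigned, so str has no value in the rescue. The path and the small `num` below are placeholders for illustration.

```crystal
# Sketch: append directly to the file instead of building one huge String.
# The path and iteration count are placeholders for illustration.
path = File.tempname("crystal_output", ".txt")
num = 1_000_u64

File.open(path, "w") do |file|
  (1_u64..num).each do |i|
    file << " C " << i # IO#<< writes the literal and the integer without a temporary String
  end
end

size = File.size(path)
puts size # 5893 bytes for num = 1_000
File.delete(path)
```

Since nothing has to fit into a single String, the Int32::MAX limit never comes into play. If the chunks really are needed as Strings, the same loop could instead check the builder's bytesize against a budget below the limit and start a new String.build when it is reached.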

It does, but the Java implementation had it from the beginning. They went back and optimized the Rust implementation later. I was linking to a diff with an optimization in it.