I was reading a post that mentioned this language comparison, "Evaluating Language Performance when Handling Extensive String Constructions". The Crystal program fails for large buffers and I can’t see why.
How large are your large buffers? Are you hitting the Int32::MAX size limit on strings?
Not my code. I saw a blog post about a speed comparison on buffer building and clicked the link.
It fails at 200,000,000 iterations, where the text size is 2.13 GB. It is also not as fast as the other implementations:
Language | Time (mm:ss:ms) | Notes |
---|---|---|
Rust | 00:01:625 | |
Java | 00:01:77 | Failed to save |
C# | 00:02:89 | Failed to save |
Nim | 00:03:89 | |
C++ | 00:04:14 | |
C | 00:04:41 | |
Perl | 00:04:497 | |
Crystal | 00:05:90 | |
    require "time"
    def test(num : Int64)
      print " Crystal"
      start_time = Time.monotonic
      s = String.build do |str|
        (1..num).each do |i|
          str << " C #{i}"
        end
        elapsed_time = Time.monotonic - start_time
        mins = elapsed_time.total_minutes.floor
        secs = elapsed_time.total_seconds.floor % 60
        millis = elapsed_time.total_milliseconds % 1000
        puts " #{mins}:#{secs}:#{millis} Iter #{num} Len #{str.bytesize} "
        File.write("out/crystal_output.txt", str.to_s) rescue puts "Error saving string to file."
      end
    end
    num = ARGV[0].to_i
    test(num)
The code does not actually work in the first place:
Crystal 0.0:2.0:662.8273509999999 Iter 50000000 Len 538888897
Unhandled exception: Can only invoke 'to_s' once on String::Builder (Exception)
from src/string/builder.cr:101:5 in 'to_s'
from usr/test.cr:96:1 in '__crystal_main'
from src/crystal/main.cr:118:5 in 'main'
from /lib/x86_64-linux-gnu/libc.so.6 in '??'
from /lib/x86_64-linux-gnu/libc.so.6 in '__libc_start_main'
from ./test in '_start'
from ???
Even if it worked, it would indeed fail to save because String only handles byte sizes up to around Int32::MAX.
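As a rough check against that limit, the output size can be computed without building the string at all. The sketch below assumes the benchmark's format (" C " followed by the decimal counter); `expected_bytesize` is a hypothetical helper written for this illustration, not part of the benchmark.

```crystal
# Total bytes of " C 1 C 2 ... C n", computed arithmetically.
# Each item contributes 3 bytes for " C " plus the number of decimal digits.
def expected_bytesize(n : UInt64) : UInt64
  total = 3_u64 * n
  digits = 1_u64
  low = 1_u64
  while low <= n
    # Numbers in low..high all have the same digit count.
    high = Math.min(low * 10 - 1, n)
    total += (high - low + 1) * digits
    low *= 10
    digits += 1
  end
  total
end

puts expected_bytesize(50_000_000_u64)  # 538888897, the Len reported in the run above
puts expected_bytesize(200_000_000_u64) # 2288888898
puts Int32::MAX                         # 2147483647
```

Under these assumptions the 200,000,000 iteration run needs about 2.29 billion bytes, which is past Int32::MAX, while the 50,000,000 run stays well below it.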
Yeah the Crystal code is broken. And the comparison is completely flawed.
Some implementations (including Crystal’s) construct an intermediary string in the loop, others don’t (Java and Rust for example).
The difference in Crystal is that str << " C #{i}" builds an intermediary string before appending it to str, while str << " C " << i does not: it appends directly, so there are fewer allocations.
    # crystal build -o ..\exe\cr_strbld.exe --release cr_strbld.cr
    require "time"
    def test(num : Int64)
      print " Crystal"
      start_time = Time.monotonic
      s = String.build do |str|
        (1..num).each do |i|
          str << " C #{i}"
        end
        elapsed_time = Time.monotonic - start_time
        mins = elapsed_time.total_minutes.floor
        secs = elapsed_time.total_seconds.floor % 60
        millis = elapsed_time.total_milliseconds % 1000
        puts " #{mins}:#{secs}:#{millis} Iter #{num} Len #{str.bytesize} "
        File.write("out/crystal_output.txt", str.to_s) rescue puts "Error saving string to file."
      end
    end
    num = ARGV[0].to_i
    test(num)
I got it to compile on my Windows machine: I cloned the repo and compiled it. To make it the same as Rust, how would you get rid of the allocations, or is that even possible? Every interpolation creates a new String.
It’s a simple change:
- str << " C #{i}"
+ str << " C " << i
Now we don’t have an intermediate string " C #{i}" anymore; " C " and i are appended directly to str.
This should cut the execution time almost in half.
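To check that claim on your own machine, here is a minimal sketch that compares the two variants with the standard library's Benchmark module. The helper names `build_interpolated` and `build_direct` and the iteration counts are made up for this example, and the numbers you get will depend on your machine and on building with --release.

```crystal
require "benchmark"

# Interpolation: every " C #{i}" allocates a temporary String before appending.
def build_interpolated(n : Int32) : String
  String.build do |str|
    (1..n).each { |i| str << " C #{i}" }
  end
end

# Direct append: the literal and the integer are written straight into the builder.
def build_direct(n : Int32) : String
  String.build do |str|
    (1..n).each { |i| str << " C " << i }
  end
end

# Both variants must produce identical text.
raise "outputs differ" unless build_interpolated(1_000) == build_direct(1_000)

Benchmark.ips do |x|
  x.report("interpolation") { build_interpolated(100_000) }
  x.report("direct append") { build_direct(100_000) }
end
```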
Sorry - it is broken but not sure why. Can someone explain the error and how to fix it?
I made the optimization @straight-shoota mentioned and converted the counter variable to a UInt64, and now it’s performing much closer to the Rust code, which also has that same optimization. Down from 5.357s to 1.344s on 100M. I also made some other edits to make it more idiomatic Crystal code.
➜ langs_string_build_test git:(main) ✗ for lang in c rs cr; do; exe/${lang}_strbld.exe 100000000; done
C 0:5:5115 Iter 100000000: Len 1088888898
Rust 00:01:211 iter 100000000 len 1088888898
Crystal 00:01:344 iter 100000000 len 1088888898
Assuming the ratio of the Rust implementation’s performance on my machine vs theirs also applies to the Crystal implementation, that puts it at roughly 1.803s on their machine, making it the second-fastest unbroken implementation for the 100M table.
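That estimate is a simple proportion. As a sketch, assuming the 00:01:625 Rust time quoted earlier in the thread is the one measured on the benchmark author's machine (variable names are illustrative only):

```crystal
rust_theirs  = 1.625 # seconds; assumed to be Rust's time on the benchmark author's machine
rust_mine    = 1.211 # seconds; Rust on my machine, 100M iterations
crystal_mine = 1.344 # seconds; Crystal on my machine, 100M iterations

# Scale my Crystal time by how much slower their machine ran the Rust version.
estimate = crystal_mine * (rust_theirs / rust_mine)
puts estimate.round(3) # 1.803
```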
The code:
    # crystal build -o ..\exe\cr_strbld.exe --release cr_strbld.cr
    require "time"
    def test(num : UInt64)
      print " Crystal"
      start_time = Time.monotonic
      s = String.build do |str|
        (1u64..num).each do |i|
          str << " C " << i
        end
      end
      elapsed_time = Time.monotonic - start_time
      mins = elapsed_time.total_minutes.floor.to_i
      secs = elapsed_time.total_seconds.floor % 60
      millis = elapsed_time.total_milliseconds % 1000
      puts " %02d:%02d:%03d iter #{num} len #{s.bytesize} " % {mins, secs, millis}
      begin
        File.write("out/crystal_output.txt", s)
      rescue ex
        puts "Error saving string to file: #{ex}"
      end
    end
    num = ARGV[0].to_u64
    test(num)
I have a feeling the performance difference may be in stringifying the integer. I haven’t checked, though.
One thing I found hilarious was the project’s readme says:
To ensure a fair comparison, I avoided using any optimization tricks or advanced features that could enhance the programs’ performance, relying solely on the default language features.
And then it has this optimization only applied to the Rust implementation, which just happens to be the most performant in their benchmarks:
diff --git a/src/rs_strbld.rs b/src/rs_strbld.rs
index 3e6c2da..084b501 100644
--- a/src/rs_strbld.rs
+++ b/src/rs_strbld.rs
@@ -3,6 +3,7 @@ use std::fs;
use std::time::Instant;
use std::env;
use std::io::{self, Write};
+use std::fmt::Write as _;
fn format_time(time_ms: u128) -> String {
let milliseconds = time_ms % 1000;
@@ -22,7 +23,10 @@ pub fn test(num: i64) {
for _ in 1..=num {
i += 1;
- s.push_str(&format!(" R {}", i));
+ //s.push_str(&format!(" R {}", i));
+ s.push_str(" R ");
+ write!(s, "{}", i).unwrap();
+
That’s interesting.
It’s impossible for one person to fully understand the details of 20 programming languages, so comparing languages is always hard. Such comparisons only make sense after community members of each language review and make optimizations. A very human way to go about it.
Rust is not the only one. At least Java avoids the intermediary allocation as well: s.append(" J ").append(i);
It’s almost as fast as the Rust implementation.
Create a PR there
Anyway, there is no drawback to letting new users know that Crystal is faster.
I tried to change the code so that it works with bigger numbers, as the implementations in some of the other languages do, but the following code does not work.
    require "time"
    class Iter
      include Iterator(UInt64)
      def initialize(@num : UInt64)
        @produced = 0
      end
      def next
        if @produced < @num
          @produced &+= 1
          @produced
        else
          stop
        end
      end
    end
    num = 200_000_000_u64
    file = File.open("crystal_output.txt", "w")
    def test(iter : Iter)
      number = 0
      str = String.build do |io|
        loop do
          number = iter.next
          io << " C " << number
        end
      end
      {str, ""}
    rescue ex : IO::EOFError
      pp!({str, number})
      {str, number}
    end
    iter = Iter.new(num)
    begin
      loop do
        test(iter).each do |e|
          file << e
        end
      end
    end
    file.close
When the String buffer reaches its maximum size (Int32::MAX), an IO::EOFError exception is raised. When that happens, I expected String.build to still build and return the string, but it returns nil instead.
I can’t figure out another way to achieve this, other than calling to_s on every i and counting the bytes so that each String.build stays under the limit. But that falls back to the initial problem: we don’t want to create a new string every time …
Any idea?
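One possible way around the limit, sketched here on the assumption that the goal is only to get the text into a file: skip String.build and append to the File directly. File is a buffered IO, so the same << calls work and no single String ever has to hold the whole output. `write_sequence` is a hypothetical helper and the file name is just the one from the post above. Note that this is not a drop-in for the benchmark, since disk I/O now happens inside the loop.

```crystal
# Writes " C 1 C 2 ... C n" to any IO without building an intermediate String.
def write_sequence(io : IO, num : UInt64) : Nil
  (1_u64..num).each do |i|
    io << " C " << i
  end
end

# Iteration count from the command line; defaults to a small value for a quick try.
num = (ARGV[0]? || "1000").to_u64

File.open("crystal_output.txt", "w") do |file|
  write_sequence(file, num)
end
```

Passing 200000000 as the argument writes the full output, which is larger than Int32::MAX bytes, in one pass.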
It does, but the Java implementation had it from the beginning. They went back and optimized the Rust implementation later. I was linking to a diff with an optimization in it.