A blog article on performant vs idiomatic code (using Crystal examples)

straight-shoota · January 6, 2024, 11:11pm

An idea for further optimizations: If you have long segments of characters that don’t get encoded (i.e. they’re copied 1:1 from the original string), it would be more efficient to copy them as a big batch instead of writing every single character.
For example, encoding the string `abababababababab` (either as the entire input or as a segment between two curly encodings) could be a single memcpy of 16 bytes instead of 16 individual writes of one byte each.

This would require going one level lower and using `Char::Reader` directly instead of `String#each_char`, so that you can keep track of byte indices.
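To make the idea concrete, here is a rough sketch of what batch copying with `Char::Reader` could look like. The encoding rule is made up for illustration (every non-ASCII character becomes its hex codepoint in curly braces) and is not the one from the article; `encode_batched` is a hypothetical name as well. The point is only the bookkeeping: remember the byte index where the current unencoded run starts, and flush the whole run with one write whenever a character needs encoding.

```crystal
# Hypothetical encoder, for illustration only: non-ASCII characters are
# written as `{hex codepoint}`, everything else is copied unchanged.
def encode_batched(input : String, io : IO) : Nil
  bytes = input.to_slice
  last_copy_at = 0 # byte index where the pending unencoded run starts
  reader = Char::Reader.new(input)

  while reader.has_next?
    char = reader.current_char
    unless char.ascii?
      # Flush the pending run of unencoded bytes with a single write.
      io.write_string(bytes[last_copy_at, reader.pos - last_copy_at])
      io << '{'
      char.ord.to_s(io, 16)
      io << '}'
      # The next run starts right after the bytes of the encoded character.
      last_copy_at = reader.pos + reader.current_char_width
    end
    reader.next_char
  end

  # Flush whatever is left after the last encoded character.
  io.write_string(bytes[last_copy_at, bytes.size - last_copy_at])
end

# Convenience overload that returns a new String.
def encode_batched(input : String) : String
  String.build { |io| encode_batched(input, io) }
end

puts encode_batched("abababababababab")
puts encode_batched("aé b")
```

With this shape, an input without any encoded characters results in exactly one write of the whole byte slice. Whether that actually beats per-character writes for a given workload would need to be benchmarked.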

This mechanism is used in some places in the stdlib, by the way, for example in `HTML.escape`:

https://github.com/crystal-lang/crystal/blob/7df0b2e9d0b1c735ebc9d30e903fa037f297538c/src/html.cr#L43-L64
      
    # Same as `escape(String, IO)` but accepts `Bytes` instead of `String`.
    #
    # The slice is assumed to be valid UTF-8.
    def self.escape(string : Bytes, io : IO) : Nil
      last_copy_at = 0
      string.each_with_index do |byte, index|
        str = case byte
              when '&'  then "&amp;"
              when '<'  then "&lt;"
              when '>'  then "&gt;"
              when '"'  then "&quot;"
              when '\'' then "&#39;"
              else
                next
              end

        io.write_string(string[last_copy_at, index &- last_copy_at])
        last_copy_at = index &+ 1
        io << str
      end

This file has been truncated.
