Performance issues with the JSON parser

Has anyone ever run into performance issues with the JSON parser?
Thought I’d ask here whether this is a known problem before investigating further (and I don’t know if time permits).

I need to parse a rather large payload of JSON data.
At first I tried JSON.parse(io). Then I switched to MyModel.from_json(io) using structs. I also tried with records and classes.
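
Roughly, the two approaches looked like this (a minimal sketch; MyModel here is a stand-in with made-up fields, not my real type):

require "json"

# Stand-in model for illustration; the real one has different fields.
struct MyModel
  include JSON::Serializable

  property name : String
  property values : Array(Int32)
end

# Untyped: builds a JSON::Any tree
any = File.open("payload.json") { |io| JSON.parse(io) }

# Typed: deserializes straight into the struct
model = File.open("payload.json") { |io| MyModel.from_json(io) }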

In the default build, it was always in the range of 2.5 to 3 seconds, give or take.
It was admittedly faster in a “--release” build, but still felt slow.

Going by what Time Profiler in Xcode Instruments (a 25 GB install, what the heck) showed me, the bottleneck seemed to be somewhere in JSON reading from an IO, or in IO reading chars. (I don’t really know how to interpret the profile.)

Turns out, the parsing is indeed faster when parsing from a string.
Here’s what I found (built with “--release”):

  • from a File: 0.54 s
  • from an IO::Memory: 0.45 s
  • from a String: 0.20 s

By comparison, parsing the same thing in Ruby using the built-in JSON parser takes some 0.11 s (totally unscientific benchmark, for now). And that’s not even using one of the faster parsers, such as Oj.

I’d be curious to know:

  1. How big is the file?
  2. What is the timing when using JSON.parse vs. MyType.from_json?

How large is your JSON payload? 200 ms to parse JSON from a string and 450 ms to parse from an IO sounds like a whole lot of JSON. On my laptop, reading an 11 MB JSON payload from a string takes ~45 ms.

Is it feasible to use a different serialization format for what you’re trying to do or are you stuck with JSON? Using MessagePack, I’m seeing parse times as low as 2ms for the equivalent payload.
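
For reference, a round-trip with the msgpack-crystal shard looks roughly like this (a sketch; it assumes the msgpack shard is in your shard.yml, and the Point type is made up for illustration):

require "msgpack"

struct Point
  include MessagePack::Serializable

  property x : Int32
  property y : Int32

  def initialize(@x, @y)
  end
end

bytes = Point.new(1, 2).to_msgpack # Bytes
point = Point.from_msgpack(bytes)  # back to a Point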

The JSON data is some 34 MB.

Just checked: There’s basically no difference between JSON.parse(json_str) and MyType.from_json(json_str). They both take around 0.21 s.

Crystal 1.11.2, LLVM 17.0.6, on aarch64-apple-darwin, FWIW.

And I’m stuck with JSON.

> There’s basically no difference between JSON.parse(json_str) and MyType.from_json(json_str). They both take around 0.21 s.

With the laptop on battery, that is (for all numbers so far).
With the power connected, both take around 0.16 or 0.17 s.
And then in Ruby it’s 0.09 s.

How did you parse the JSON from a file?

Like this?

MyJson.from_json File.read(path)

Or like this?

File.open(path, "r") { |io| MyJson.from_json(io) }

  • From a File, it was basically like this:

    file = File.open(path, "rb")
    
    t0 = Time.utc
    result = MyType.from_json(file)
    t1 = Time.utc
    puts "Parsing JSON took #{ t1 - t0 }."
    
  • From a String:

    file = File.open(path, "rb")
    bytes = Bytes.new(file.size)
    file.read_fully?(bytes)
    str = String.new(bytes)
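    # (equivalent one-liner: str = File.read(path))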
    
    t0 = Time.utc
    result = MyType.from_json(str)
    t1 = Time.utc
    puts "Parsing JSON took #{ t1 - t0 }."
    
1 Like

Tip: better to use Time.monotonic or Time.measure(&) for an accurate duration measurement.
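
For example, reusing file and MyType from the snippets above (a minimal sketch):

elapsed = Time.measure do
  result = MyType.from_json(file)
end
puts "Parsing JSON took #{elapsed.total_seconds}s"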

2 Likes

I did some testing on my machine, and it certainly seems like the parsing is slower than I remember when testing GeoJSON parsing a while back. I could be misremembering, though, and I can’t find the benchmarks I made (on pretty large files) at the moment.

Here’s my code:

require "json"
require "random"
require "file"
require "time"

struct Inner
  include JSON::Serializable

  property inner_name : String
  property numbers : Array(Int32)

  def initialize(
    @inner_name : String,
    @numbers : Array(Int32) = Array(Int32).new
  )
  end
end

struct Middle
  include JSON::Serializable

  property middle_name : String
  property inner_values : Array(Inner)

  def initialize(
    @middle_name : String,
    @inner_values : Array(Inner) = Array(Inner).new
  )
  end
end

struct Outer
  include JSON::Serializable

  property outer_name : String
  property middle_values : Array(Middle)

  def initialize(
    @outer_name : String,
    @middle_values : Array(Middle) = Array(Middle).new
  )
  end
end

def create_structure(scale_factor : Int32, rng : Random, numbers_range : Range(Int32, Int32)) : Array(Outer)
  Array(Outer).new(scale_factor) {
    Outer.new(
      rng.base64,
      Array(Middle).new(scale_factor) {
        Middle.new(
          rng.base64,
          Array(Inner).new(scale_factor) {
            Inner.new(
              rng.base64,
              Array(Int32).new(scale_factor) { rng.rand(numbers_range) }
            )
          }
        )
      }
    )
  }
end

def count(structure : Array(Outer))
  structure.sum { |outer|
    outer.middle_values.sum { |middle|
      middle.inner_values.sum { |inner|
        0_u64 + inner.numbers.size
      }
    }
  }
end

def puts_seconds_elapsed(label, start_time, end_time)
  puts "#{label}: #{(end_time - start_time).total_seconds}s"
end

scale_factor = 100
if ARGV.size > 0 && (first_arg_int = ARGV.first.to_i?)
  scale_factor = first_arg_int
end

filename = "big_json.json"
rng = Random.new(seed: scale_factor)
numbers_range = (1000..9999)

do_write = true

if do_write
  structure = create_structure scale_factor, rng, numbers_range

  begin
    start_time = Time.monotonic
    structure_json = structure.to_json
    end_time = Time.monotonic

    puts_seconds_elapsed "serialization to string", start_time, end_time

    file = File.open filename, "w"
    start_time = Time.monotonic
    file << structure_json
    end_time = Time.monotonic
    file.close

    puts_seconds_elapsed "file write from string", start_time, end_time
  end

  file = File.open filename, "w"
  start_time = Time.monotonic
  structure.to_json file
  end_time = Time.monotonic
  file.close

  puts_seconds_elapsed "file write with serialization", start_time, end_time
end

begin
  start_time = Time.monotonic
  file_contents = File.read filename
  end_time = Time.monotonic

  puts_seconds_elapsed "file read to string", start_time, end_time

  start_time = Time.monotonic
  structure_from_file = Array(Outer).from_json file_contents
  end_time = Time.monotonic

  puts_seconds_elapsed "parsing from string", start_time, end_time

  # just to make sure the compiler doesn't elide anything
  File.write File::NULL, count(structure_from_file)
end

file = File.open(filename, "r")
start_time = Time.monotonic
structure_from_file = Array(Outer).from_json file
end_time = Time.monotonic
file.close

puts_seconds_elapsed "parsing from file", start_time, end_time

# just to make sure the compiler doesn't elide anything
File.write File::NULL, count(structure_from_file)

Notes on the Code
  • I tried out different File buffering settings, but it didn’t seem to make any difference, even in the “write with serialization” and “parsing from file” cases.
  • The begin...end blocks are an attempt to create variable scopes to help manage memory usage, but I don’t know if that actually works.
  • I tried to make the serializable structures as simple as possible (to make it easier to review) while still exhibiting nesting, since real-world JSON tends to be heavily nested.
  • I made basically no attempt to optimize create_structure or count because they’re not what I was trying to benchmark.

Example produced JSON, with scale factor 2, after formatting with jq:
[
  {
    "outer_name": "qetdD4TLe9Ijt+J9Z+dlYg==",
    "middle_values": [
      {
        "middle_name": "T81VaTBy7EJ+2r4G2fATSA==",
        "inner_values": [
          {
            "inner_name": "epJ0N7wZSPdM/UZJzuTAvA==",
            "numbers": [
              9016,
              9814
            ]
          },
          {
            "inner_name": "gC1J8zsb6sXhl9i6A67Apw==",
            "numbers": [
              2739,
              3830
            ]
          }
        ]
      },
      {
        "middle_name": "0I9taSfFVEFJNkUbOPnJxA==",
        "inner_values": [
          {
            "inner_name": "bBlzJ6IPbI53SC+4LLIjAg==",
            "numbers": [
              1986,
              5623
            ]
          },
          {
            "inner_name": "x9Z5bWIal4qRClJfeMw2fg==",
            "numbers": [
              8853,
              7967
            ]
          }
        ]
      }
    ]
  },
  {
    "outer_name": "siItIL1Wb72iq3N/bqYoYQ==",
    "middle_values": [
      {
        "middle_name": "K/j4cgIgOXpV1juImq15uQ==",
        "inner_values": [
          {
            "inner_name": "os64AVLIAuYGuhKhBaxZDw==",
            "numbers": [
              9189,
              1888
            ]
          },
          {
            "inner_name": "CguKhvwLFKCG8WkAtlTUWA==",
            "numbers": [
              9455,
              9214
            ]
          }
        ]
      },
      {
        "middle_name": "uIYmymvfO2Y2k8wQXjCB6Q==",
        "inner_values": [
          {
            "inner_name": "RUZapun49A2gzOHArkubNA==",
            "numbers": [
              6706,
              3441
            ]
          },
          {
            "inner_name": "Fc+DBmjHNtxcevNweLKyQQ==",
            "numbers": [
              5703,
              9299
            ]
          }
        ]
      }
    ]
  }
]

And here’s the output I’m getting on my machine:

Scale Factor 10 (112 kB file)

serialization to string: 0.003119035s
file write from string: 0.0001233s
file write with serialization: 0.001344965s
file read to string: 0.000161006s
parsing from string: 0.002256683s
parsing from file: 0.005329564s

Scale Factor 50 (37 MB file)

serialization to string: 0.407213263s
file write from string: 0.036222296s
file write with serialization: 0.428914678s
file read to string: 0.029467402s
parsing from string: 0.910451311s
parsing from file: 1.886663833s

Scale Factor 100 (529 MB file)

serialization to string: 6.845525082s
file write from string: 0.533993881s
file write with serialization: 7.428256482s
file read to string: 0.199061933s
parsing from string: 15.233927115s
parsing from file: 29.497250194s
1 Like

FWIW: I get a 6 % improvement by simplifying how codepoints are counted in a String:

require "string"

class String
  def size : Int32
    if @length > 0 || @bytesize == 0
      return @length
    end
    # original:
    #@length = each_byte_index_and_char_index { }
    # new:
    @length = utf8_len
  end
  
  protected def utf8_len : Int32
    # 0b10...... is a continuation byte.
    # Counting continuation bytes is faster than counting leading bytes
    # if the string has more leading bytes than continuation bytes, i.e.
    # mostly ASCII.
    count_continuation_bytes = to_slice.count { |byte|
      byte & 0b11000000 == 0b10000000
    }
    bytesize - count_continuation_bytes
  end
end

I’m also wondering if .from_json could be improved by operating on bytes instead of chars.

Or if it’s possible not to care about UTF-8/UTF-16/… initially, i.e. first just locate the hierarchy of objects (basically ‘{’/‘}’, unless inside a string), then parallelize the parsing depth-first. Not exactly low-hanging fruit, though.
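
A rough sketch of that first structure-locating pass (my own illustration, not stdlib code; it assumes well-formed JSON and only records top-level object/array boundaries):

def top_level_spans(bytes : Bytes) : Array(Range(Int32, Int32))
  spans = [] of Range(Int32, Int32)
  depth = 0
  in_string = false
  escaped = false
  start = 0
  bytes.each_with_index do |b, i|
    if in_string
      # Skip over string contents, minding escapes and the closing quote.
      if escaped
        escaped = false
      elsif b == '\\'.ord
        escaped = true
      elsif b == '"'.ord
        in_string = false
      end
      next
    end
    case b
    when '"'.ord
      in_string = true
    when '{'.ord, '['.ord
      start = i if depth == 0
      depth += 1
    when '}'.ord, ']'.ord
      depth -= 1
      spans << (start..i) if depth == 0
    end
  end
  spans
end

Each top-level span could then, in principle, be handed to its own fiber for the actual parsing.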

Counting continuation bytes like this only works on the premise that the string is valid UTF-8. Crystal’s String type expects to be UTF-8 encoded, but it does not enforce it. String data read from an external source may contain bytes that are not valid UTF-8 encodings, and those need to be handled properly.

2 Likes

Yeah, that could be an easy optimization. The JSON structure itself is entirely ASCII, so there’s no need to handle multi-byte encodings; they can just pass through transparently as payload. Some other parsing algorithms in the stdlib already work on this principle.

1 Like

Parallelizing sounds interesting. I fear it would be quite complex, though :thinking:

I’m glad to hear I’m not alone. :grinning:

I see.
For JSON I would probably happily assume that the data is valid UTF-8.
If it isn’t and something explodes: :person_shrugging::grinning:
Or maybe only then fall back to the more robust implementation. OK, probably not a good idea.
And while 6 % is nice, it’s far from 50 %.

IME the lowest-hanging fruit for performance optimization in Crystal is usually reducing heap allocations. I have no doubt that the JSON parser could make fewer allocations.

In a few cursory checks (I don’t have the energy to do much more than that atm), it currently allocates about 3-4x the JSON payload size — even more if your JSON::Serializable types’ properties have union types or use_json_discriminator. But I just had a quick scan through the lexer and parser code and nothing’s jumping out at me.
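
(If you want to eyeball that yourself, here’s a crude sketch using the GC’s cumulative allocation counter; it overcounts slightly if a collection runs mid-parse:)

before = GC.stats.total_bytes
result = MyType.from_json(json_str)
after = GC.stats.total_bytes
puts "allocated ~#{(after - before) / (1024.0 * 1024.0)} MiB"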

I found a way to improve the performance and the memory used: Optimize JSON parsing a bit by asterite · Pull Request #14366 · crystal-lang/crystal · GitHub

10 Likes

Nice! I ran your benchmark against an empty JSON document {} just to make sure that was OK, and it looks like parsing from an IO gets faster there too!

# Before
JSON.parse (string)   9.43M (106.02ns) (± 0.40%)  640B/op        fastest
    JSON.parse (IO) 588.70k (  1.70µs) (± 0.25%)  608B/op  16.02× slower

# After
JSON.parse (string)   9.38M (106.64ns) (± 2.78%)  625B/op        fastest
    JSON.parse (IO) 764.24k (  1.31µs) (± 0.61%)  625B/op  12.27× slower
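
(For the curious, a minimal version of that kind of Benchmark.ips comparison; the actual benchmark from the PR may differ:)

require "benchmark"
require "json"

json = "{}"

Benchmark.ips do |x|
  x.report("JSON.parse (string)") { JSON.parse(json) }
  x.report("JSON.parse (IO)") { JSON.parse(IO::Memory.new(json)) }
end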

Nice work :clap:

2 Likes

Wait, what? Shouldn’t Crystal at least be on par with Ruby, unless Ruby’s making some unacceptable shortcuts?

I didn’t look at what the JSON lexer/parser does in Ruby, but here are my benchmarks: