Performance issues with the JSON parser

I found out that I can significantly improve performance by feeding less data to the JSON lexer/parser. Who would have thought! :slight_smile:

I tried to simplify the JSON using String#gsub to remove some properties that I don’t need (most of the time), such as some long strings (several kilobytes each).
String#gsub is expensive, but even then, it improved the total time:

  • Simplifying: 0.060 s,
  • parsing: 0.060 s,
  • total: 0.120 s.
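A minimal Ruby sketch of the idea (the original is Crystal, but `String#gsub` works the same way). The `"blob"` property name and the sample payload are hypothetical, and the naive regex assumes the value contains no escaped quotes:

```ruby
require 'json'

# Hypothetical payload: one large string property ("blob", several
# kilobytes) that we don't need most of the time.
raw = '{"id":1,"blob":"' + "x" * 10_000 + '","name":"test"}'

# Strip the unwanted property with a regex before parsing.
# NOTE: this simple pattern breaks if the value contains escaped
# quotes; real data would need a more careful expression.
simplified = raw.gsub(/"blob":"[^"]*",?/, "")

doc = JSON.parse(simplified)
doc.key?("blob")  # => false
```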

Now I’m using some custom code to do some search/replace operations on Bytes objects, before even turning the Bytes into a String.
Really feels like a hack, but now I’m at:

  • Simplifying: 0.015 s,
  • parsing: 0.032 s,
  • total: 0.047 s.
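A rough Ruby sketch of the byte-level approach: locate the unwanted property by raw offsets and splice it out, with no regex and no JSON machinery involved (in Crystal this would operate on `Bytes` directly; here a binary-encoded string stands in). The `"blob"` key, the helper name, and the no-escaped-quotes assumption are all hypothetical:

```ruby
# Cut a single string-valued property out of a raw JSON buffer by
# byte offsets. Assumes the value contains no escaped quotes.
def strip_property(bytes, key)
  marker = %("#{key}":")
  start = bytes.index(marker)
  return bytes unless start
  # Find the closing quote of the value.
  value_end = bytes.index('"', start + marker.bytesize)
  stop = value_end + 1
  stop += 1 if bytes.getbyte(stop) == ','.ord  # also drop a trailing comma
  bytes[0...start] + bytes[stop...bytes.bytesize]
end

raw = ('{"id":1,"blob":"' + "x" * 10_000 + '","name":"test"}').b
simplified = strip_property(raw, "blob")
# simplified is now '{"id":1,"name":"test"}' and ready to parse
```

The win comes from scanning the buffer once with plain substring searches instead of running a regex engine over several kilobytes.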

That’s much better than the 0.210 s total it was originally.

Can you share the JSON you are parsing, and what you need from it?

No, it’s not necessarily faster. Ruby’s JSON parser is implemented in C (ruby/ext/json/parser at master · ruby/ruby · GitHub), so it compiles down to machine code just like Crystal’s, and both should have the same performance potential. Ruby’s implementation just appears to be more optimized.

I remember the same situation when manipulating large numbers: Ruby is faster there too.

Not yet. I need to craft some sample data for the specs.

I don’t want to hijack your pull request on GitHub too much, so I’m responding here.

Oj seems MIT licensed to me. This isn’t legal advice, and I don’t know about your licensing requirements, but I don’t think getting inspired by MIT licensed code will be a problem, will it?

Incidentally, maybe SimdJson is worth a look too?
