Wow! I finally found a way to show the difference in speed between JSON::Any and static types.

At first, my benchmark code looked like this:

require "json"
require "benchmark"

struct Message
  JSON.mapping(
    cmd: String,
    message: Hash(String, String),
    extra: Int32
  )
end

JSON_DATA = %({"cmd": "LOGIN", "extra": 123, "message": {"hi": "[1, 2, 3]", "username": "george", "password": "muffin"}})

Benchmark.ips do |x|
  x.report("JSON.parse with JSON::Any") {
    message = JSON.parse(JSON_DATA)
    # We usually have to invoke type checking, so let's add some
    if message["extra"].as_i == 123
    end
  }
  x.report("from_json with struct & JSON.mapping") {
    message = Message.from_json(JSON_DATA)

    # statically typed as Int32?
    if message.extra == 123
    end
  }
end

I noticed a problem. The results did not show a big difference. JSON.parse ends up being only about 1.06x slower in the run below. In fact, sometimes the JSON.parse test was faster, which was weird.

          JSON.parse with JSON::Any 519.58k (  1.92µs) (± 3.15%)  1648 B/op   1.06× slower
from_json with struct & JSON.mapping 552.71k (  1.81µs) (± 4.04%)  1408 B/op        fastest

That got me thinking! Eventually, I added in a loop (which makes more sense because if it’s an incoming command from the client to the gameserver, it will be called more often):

New Code:

require "json"
require "benchmark"

TimesToLoop = 250

struct Message
  JSON.mapping(
    cmd: String,
    message: Hash(String, String),
    extra: Int32
  )
end

JSON_DATA = %({"cmd": "LOGIN", "extra": 123, "message": {"hi": "[1, 2, 3]", "username": "george", "password": "muffin"}})

Benchmark.ips do |x|
  x.report("JSON.parse with JSON::Any") {
    message = JSON.parse(JSON_DATA)
    # We usually have to invoke type checking.. so let's add one
    TimesToLoop.times do
      if message["extra"].as_i == 123
      end
    end
  }
  x.report("from_json with struct & JSON.mapping") {
    message = Message.from_json(JSON_DATA)

    TimesToLoop.times do
      # statically typed as Int32?
      if message.extra == 123
      end
    end
  }
end

Output:

           JSON.parse with JSON::Any 157.83k (  6.34µs) (± 0.86%)  1649 B/op   3.57× slower
from_json with struct & JSON.mapping 564.13k (  1.77µs) (± 2.12%)  1408 B/op        fastest

Over 3.5 times SLOWER!

Actually… shouldn’t the LOOP include the from_json and JSON.parse as well? Oops! Gonna do more testing! This might be even a bigger difference than I thought

edit:

Wait!

require "json"
require "benchmark"

TimesToLoop = 250

struct Message
  JSON.mapping(
    cmd: String,
    message: Hash(String, String),
    extra: Int32
  )
end

JSON_DATA = %({"cmd": "LOGIN", "extra": 123, "message": {"hi": "[1, 2, 3]", "username": "george", "password": "muffin"}})

Benchmark.ips do |x|
  x.report("JSON.parse with JSON::Any") {
    TimesToLoop.times do
      message = JSON.parse(JSON_DATA)
      # We usually have to invoke type checking.. so let's add some
      if message["extra"].as_i == 123
      end
    end
  }
  x.report("from_json with struct & JSON.mapping") {
    TimesToLoop.times do
      message = Message.from_json(JSON_DATA)

      # statically typed as Int32?
      if message.extra == 123
      end
    end
  }
end

Outputs

           JSON.parse with JSON::Any   2.09k (478.01µs) (± 0.79%)  412107 B/op        fastest
from_json with struct & JSON.mapping   2.07k (482.05µs) (± 0.90%)  352158 B/op   1.01× slower

What the heck? The from_json is acting as the bottleneck that ruins the speed of the struct?
With no from_json, it’s 3.5 times faster. Thus, it ruins performance because from_json has to be called. FFS :/ LOL

2.09k * 250 ≈ 522k parses per second, which is in line with the first results (about 520k to 550k). JSON.parse has almost the same speed as .from_json because they both use JSON::PullParser. But .from_json additionally converts each value to the declared type, so it can come out a tiny bit slower.
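To make that normalization explicit, here is a small sketch of the arithmetic. It only uses figures copied from the outputs above; the variable names are mine.

```crystal
# Figures copied from the benchmark outputs above.
looped_ips = 2_090.0   # iterations/s when each iteration parses 250 times
parses_per_run = 250   # TimesToLoop
single_ips = 519_580.0 # iterations/s from the first benchmark (one parse each)

# Each looped iteration contains 250 parses, so scale up to parses per second.
effective_parses = looped_ips * parses_per_run

puts "looped: #{effective_parses} parses/s"
puts "single: #{single_ips} parses/s"
puts "ratio:  #{(effective_parses / single_ips).round(3)}"
```

A ratio close to 1 means the loop did not change the per-parse cost at all; it only multiplied the work done per reported iteration.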

Interesting.
I’ve heard that using JSON.parse is slower than a struct w/ JSON.mapping. I can’t seem to find any real world benchmarks to support that. I thought I did in my first post, but now we’re back to square one.

Unless… my benchmark code is wrong, which could be a possibility

require "json"
require "benchmark"

TimesToLoop = 250 # note that it's not needed - Benchmark.ips will call a proc many times already to calculate average and deviation.

struct Message
  JSON.mapping(
    cmd: String,
    message: Hash(String, String),
    extra: Int32
  )
end

JSON_DATA = %({"cmd": "LOGIN", "extra": 123, "message": {"hi": "[1, 2, 3]", "username": "george", "password": "muffin"}})

Benchmark.ips do |x|
  x.report("JSON.parse with JSON::Any") {
    TimesToLoop.times do
      message_json = JSON.parse(JSON_DATA)
    end
  }
  message_json = JSON.parse(JSON_DATA)
  x.report("access JSon::Any") {
    # We usually have to invoke type checking.. so let's add some
    if message_json["extra"].as_i == 123
    end
  }
  x.report("from_json with struct & JSON.mapping") {
    TimesToLoop.times do
      message_struct = Message.from_json(JSON_DATA)
    end
  }
  message_struct = Message.from_json(JSON_DATA)
  x.report("access struct") {
    # statically typed as Int32?
    if message_struct.extra == 123
    end
  }
end

results

           JSON.parse with JSON::Any    2.7k (369.84µs) (± 2.88%)  412014 B/op  269013.49× slower
                    access JSon::Any  84.08M ( 11.89ns) (± 1.89%)       0 B/op       8.65× slower
from_json with struct & JSON.mapping   2.87k (348.23µs) (± 2.35%)  352010 B/op  253296.14× slower
                       access struct 727.38M (  1.37ns) (± 2.88%)       0 B/op            fastest

So access to a struct is faster than access to Any, but both take negligible time compared to the initial JSON parsing. Just don’t use JSON for client-server interaction (unless of course your client side is JS).

Something is definitely not right. If a developer is manually typing out the structure of the JSON and its types, it should yield a much larger performance gain than 1 to 3% (with highs of 6%).

Otherwise, structs don’t make sense in this case. Let’s use JSON.parse with JSON::Any, and cast type checking methods everywhere!

JSON.mapping and JSON.parse both use JSON::PullParser to parse the string into some structure: an actual object and a JSON::Any, respectively.

So parsing the string is going to be roughly equally performant. However, I think a factor you’re not seeing is that with JSON.parse the type check has to happen on every access, which is why accessing it is slower than with JSON.mapping.

JSON.mapping only has to do the type conversion once since you told it the types of each property ahead of time.
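A minimal sketch of that difference, assuming a cut-down payload. The `Login` struct is hypothetical and uses JSON::Serializable (as in the later benchmark in this thread) rather than JSON.mapping.

```crystal
require "json"

# Hypothetical payload type, declared only for this sketch.
struct Login
  include JSON::Serializable

  getter cmd : String
  getter extra : Int32
end

data = %({"cmd": "LOGIN", "extra": 123})

# JSON::Any: every read is a hash lookup plus a runtime cast
# (as_i raises if the stored value is not an integer).
any = JSON.parse(data)
a = any["extra"].as_i
b = any["extra"].as_i # the lookup and the cast happen again

# Typed struct: the value is converted to Int32 once, inside from_json.
login = Login.from_json(data)
c = login.extra # plain field read, no runtime type check
d = login.extra

puts a + b + c + d
```

Both paths pay roughly the same cost to get through the JSON text; the difference only shows up on the reads that follow.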

But why would that be better? It would be like 9x slower based on @konovod’s benchmark, again because the conversions to T have to happen on every access, versus once with JSON.mapping.

LOL. Not sure how he’s getting 9x slower. See this code in my Github Issue. Using JSON.parse is roughly 1 to 4% slower, sometimes 6%. Hell, sometimes it comes out faster.

from_json basically acts like a rogue equilibrium and ruins the performance of accessing properties with a struct. This should not be happening.

Please try parsing a struct with many fields, all of primitive types (string, int, etc.), with nested objects that are themselves composed of primitive types and so on. Don’t have a field of type Hash, because JSON.parse essentially parses to Array and Hash.

Try a benchmark with that and you might see a bigger difference in performance.

I went ahead and did a benchmark with 4 structs using JSON::Serializable, each with some primitive data types and a nested object.

require "benchmark"
require "json"

struct CorsObject
  include JSON::Serializable

  getter str : String
  getter int : Int32
  getter int_64 : Int64
  getter float : Float64
  getter bool : Bool
end

struct CorsConfig
  include JSON::Serializable

  getter cors_object : CorsObject

  getter str : String
  getter int : Int32
  getter int_64 : Int64
  getter float : Float64
  getter bool : Bool
end

struct RouteConfig
  include JSON::Serializable

  getter cors : CorsConfig

  getter str : String
  getter int : Int32
  getter int_64 : Int64
  getter float : Float64
  getter bool : Bool
end

struct Config
  include JSON::Serializable

  getter routing : RouteConfig

  getter str : String
  getter int : Int32
  getter int_64 : Int64
  getter float : Float64
  getter bool : Bool
end

json_str = <<-JSON
{
  "str": "config_string",
  "int": 1,
  "int_64": 111,
  "float": 1.11,
  "bool": true,
  "routing": {
    "str": "routing_string",
    "int": 2,
    "int_64": 222,
    "float": 2.22,
    "bool": false,
    "cors": {
      "str": "cors_string",
      "int": 3,
      "int_64": 3,
      "float": 3.33,
      "bool": false,
      "cors_object": {
        "str": "cors_object_string",
        "int": 4,
        "int_64": 444,
        "float": 4.44,
        "bool": true
      }
    }
  }
}
JSON

json_config = Config.from_json json_str
json_parse_config = JSON.parse json_str

puts "Just parsing the structure"
Benchmark.ips do |x|
  x.report("from_json") do
    Config.from_json json_str
  end
  x.report("JSON.parse") do
    JSON.parse json_str
  end
end

puts

puts "Parse the structure and read a nested value"
Benchmark.ips do |x|
  x.report("from_json") do
    config = Config.from_json json_str
    config.routing.cors.float
  end
  x.report("JSON.parse") do
    config = JSON.parse json_str
    config["routing"]["cors"]["float"].as_f
  end
end

puts

puts "Just access already parsed data"
Benchmark.ips do |x|
  x.report("from_json") do
    json_config.routing.cors.float
  end
  x.report("JSON.parse") do
    json_parse_config["routing"]["cors"]["float"].as_f
  end
end

The results

Just parsing the structure
from_json 309.73k ( 3.23µs) (± 2.96%) 1728 B/op fastest
JSON.parse 264.19k ( 3.79µs) (± 2.65%) 3697 B/op 1.17× slower

Parse the structure and read a nested value
from_json 306.01k ( 3.27µs) (± 4.53%) 1728 B/op fastest
JSON.parse 259.74k ( 3.85µs) (± 2.80%) 3697 B/op 1.18× slower

Just access already parsed data
from_json 884.2M ( 1.13ns) (± 2.47%) 0 B/op fastest
JSON.parse 23.16M ( 43.17ns) (± 1.96%) 0 B/op 38.17× slower

So from this, what can we tell?

  1. from_json is slightly faster in parsing the string into an object.
  2. from_json is slightly faster in parsing the string, then reading a value from it
  3. from_json is substantially faster in reading values after the initial parsing.

In conclusion, from_json is faster in every way. JSON.parse comes close in initial parsing of the string, probably due to them both using similar parsing methods. However, once the string is parsed into an object, in this case structs, reading values from it is much faster.

That’s some good benchmark code, thanks!

Yes, exactly. Reading from a struct is insanely faster. But when you do from_json and access the properties in the same step, performance is bound to how fast from_json can parse JSON. At that point your code isn’t utilizing the power of a struct; it’s limited by from_json's speed.

Which… completely nullifies the entire point of using a struct in the first place.

Not really. How often in real code are you going to parse the same JSON string? Once.

It’s hindered mostly because it has to parse JSON. See https://github.com/crystal-lang/crystal/issues/7609#issuecomment-478354495 As the benchmark shows, each is roughly similar when it comes to doing the initial deserialization of the JSON data.

The benefit of the struct is:

  1. You get type safety.
  2. It’s faster after the initial parsing.
  3. You can add methods and stuff to it since it’s an object.
  4. Inheritance/Generics
  5. etc
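A rough sketch of the first and third points. The `Command` struct and its `login?` method are hypothetical, made up for illustration only.

```crystal
require "json"

# Hypothetical command type for illustration.
struct Command
  include JSON::Serializable

  getter cmd : String
  getter extra : Int32

  # Methods live next to the data they describe.
  def login? : Bool
    cmd == "LOGIN"
  end
end

good = Command.from_json(%({"cmd": "LOGIN", "extra": 123}))
puts good.login?

# A payload with the wrong type is rejected once, at the parsing boundary,
# instead of at every access site.
begin
  Command.from_json(%({"cmd": "LOGIN", "extra": "oops"}))
rescue ex : JSON::ParseException
  puts "rejected: #{ex.message}"
end
```

With JSON::Any the same mistake would only surface later, when some handler calls `as_i` on the bad value.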

Yeah, those are the benefits I want, and why I wanted to see if it would be worth it to remove all my type checking code for JSON::Any and use a struct instead of JSON.parse.

In my case, which is a completely valid real-world use case, it’s every incoming command from the client->server.

Modifying hundreds of methods for a 1 to 4% improvement isn’t worth it. If it was for a 3.5 times improvement (see my OP), I would have already done it :)

But it’s not even about that, struct gets bound to from_json’s speed. In fact, it gets bound to to_json’s speed as well. Which I forgot to mention.

Of course. If you use a struct but then you read a 10 terabytes file, your program will be bound to the reading. Here the program is bound to parsing JSON, which means decoding the string that contains the JSON data, independently of where you end up storing that info. Using a struct won’t make your code magically faster.

So I guess I don’t understand what’s your complaint or problem. If you want to use JSON.parse and you don’t care about type safety or the small performance optimizations, nobody is forcing you to use from_json. That’s why JSON.parse is there.

Let me just ask this. Is this intended / normal? Or could it be an issue?

As I said in my latest post on GitHub, I thought the entire point of using JSON.mapping is to get the speed of a struct. Since the developer is statically typing the structure. This doesn’t happen, for my use case at least. In my use case, the struct now becomes bound to from_json or to_json’s speed. Completely nullifying the entire point of using a struct in the first place?

The entire points are: a bit more efficient parsing, type safety and less memory consumed. from_json achieves those goals. Using a struct or a class is an orthogonal issue.

Struct/Class also allows you to work in an OOP manner, while with JSON.parse you’re basically just working with big hash representations of your JSON.

If you need more performance, using a different format, like MessagePack or something similar, could reduce the time spent parsing, which would lead to an increase in performance.

EDIT: https://github.com/crystal-community/msgpack-crystal But of course your client would have to send the data in that format as well.

What’s the best for client-server communication (for sending whole object trees), for example, if both sides are written in Crystal? Thanks!

IMHO it’s cannon. Basically, it just dumps the raw data from all fields without any format specifiers, as fast as it can.
Of course it has downsides: no classes (only structs/arrays/hashes), and the receiver has to read the data into the same structure, but I think it’s the right model for a client-server game.

Unfortunately cannon hasn’t been updated for 2 years