How to dig for a dynamic JSON value?

Hello,

I have a JSON::Any value and would like to #dig? into it using some user input.

require "json"
data = JSON.parse %[{"data": {"food": [{"name": "pizza", "cost": 5}]}}]
lookup = "data.food.0.name".split "."


data.dig? *lookup # Error: argument to splat must be a tuple, not Array(String)

t = Tuple(String).from(lookup) # Expected array of size 1 but one of size 4 was given. 

So I can’t pass the array to #dig, and I can’t make a tuple either because I don’t know how big the array is. What to do?

This is my best try to solve your stated problem:

require "json"
data = JSON.parse %[{"data": {"food": [{"name": "pizza", "cost": 5}]}}]
lookup = ["data", "food", 0, "name"]

d = lookup.reduce(data.as_h) { |hash, key| hash[key].as_h? || hash[key] }

p! d # => "pizza"
p! typeof(d) # => (Hash(String, JSON::Any) | JSON::Any)

Run it with carc.in

Some notes:

  • I changed lookup to an array literal because the 0 needed to be an Int32, not a String. If you need to parse a lookup yourself, I’ll leave solving that part to you.
  • #reduce is incredibly useful, but my solution feels a bit hacky. I hope someone else has a better solution, but I wanted to get you something you could use, since you haven’t gotten any replies yet.
  • As my code demonstrates, any way you do this you’re going to end up with a Union type you’ll need to handle. Of course, if you’re able to have some sort of guarantees about the JSON you’re handling (anywhere from a schema to just knowing the depth) or the lookup path then you might be able to get the right value. But even if you always have a {"key": {"key": [{"key": value, ...},...],...},...}-type structure, you’re still going to run into the issue from your example: the actual value could be a String or an Int64 (or whatever other value type). If you have questions about narrowing Union types (which is one of the most common and important things that people need to learn when first using Crystal), I’m happy to do what I can to help.
  • This isn’t a note on my code, but on yours: Tuple(T...) takes the same number of generic arguments as the number of elements. So the type of {1, "two", 3.0} is Tuple(Int32, String, Float64) and the type of {1, 2, 3, 4, 5} is Tuple(Int32, Int32, Int32, Int32, Int32).
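
For the lookup-parsing part left open in the first note, here is a minimal sketch. It assumes the path is dot-separated and that any segment which parses as an integer is meant as an array index; parse_lookup is a hypothetical helper name, not part of the standard library.

```crystal
require "json"

# Hypothetical helper: split a dotted path and convert segments that
# parse as integers into Int32 indexes; everything else stays a String.
def parse_lookup(path : String) : Array(String | Int32)
  path.split('.').map { |segment| segment.to_i? || segment }
end

data = JSON.parse %[{"data": {"food": [{"name": "pizza", "cost": 5}]}}]
lookup = parse_lookup("data.food.0.name")

p! lookup
p! lookup.reduce(data) { |node, key| node[key] }
```

One caveat with this assumption: an object key that happens to look like a number (for example "0") would be treated as an array index, so real input may need a stricter rule.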

There’s still the question of why there isn’t a #dig implementation that uses an Enumerable or Array or whatever. I’ll have to defer that one to someone else, because I don’t know.

Thanks so much! I love you guys. reduce may be a bit hacky but sure is cleaner than my first attempt at iterating through the hash.

What does the .as_h? give you? I thought JSON::Any had a []? method? So you could just always use the hash[key] and still be able to lookup on the next pass? I obviously haven’t tried this, will do so after lunch!

I’d love to learn more about narrowing Union types. If you can point to any previous discussions, that would be some good reading. :nerd_face:

Right you are! d = lookup.reduce(data) { |hash, key| hash[key] } works the same way. I don’t use JSON::Any that much, so I’m happy to learn new things about it.

Here’s the official language reference documentation about union types.

Here’s another example, integrating what you just pointed out about JSON::Any:

require "json"
data = JSON.parse %[{"data": {"food": [{"name": "pizza", "cost": 5}]}}]
lookup = ["data", "food", 0, "name"]

d = lookup.reduce(data) { |hash, key| hash[key] }
d = d.raw

p! d # => "pizza"
p! typeof(d) # => (Array(JSON::Any) | Bool | Float64 | Hash(String, JSON::Any) | Int64 | String | Nil)

Run it with carc.in

That last typeof shows that the compile-time type of d is JSON::Any::Type (docs).

JSON::Any provides some methods for narrowing the type. For example, if you knew for absolute certain that d (before you call #raw) was a string value, you could just narrow it with JSON::Any#as_s, which raises if the value is not a String. However, you’re unlikely to be completely certain. In those cases, what you want to do depends on how you want to use the value.
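
As a small illustration of those narrowing methods, here is a sketch using the nil-returning variants JSON::Any#as_s? and JSON::Any#as_i64?. The item document is made up for the example.

```crystal
require "json"

item = JSON.parse %[{"name": "pizza", "cost": 5}]

name = item["name"].as_s?   # String or nil
cost = item["cost"].as_i64? # Int64 or nil
oops = item["name"].as_i64? # nil, because the value is a String

if name && cost
  # Inside this branch the compiler knows name is a String and cost is an Int64.
  puts "#{name} costs #{cost}"
end
```

The variants without the question mark (#as_s, #as_i64) raise instead of returning nil, which suits cases where a wrong type really is a bug.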

If all you need is to output the value, you can just call #to_s on it. If the value is a String, that won’t even use any extra memory.

If you want to store the value, you’ll probably have to narrow the type eventually, and you can choose to do it before you store it, after you store it, or some combination.

As an example of the last one, if I knew that I’d end up with either a String or an Int64 (JSON integers are parsed as Int64, as the typeof output above shows), I could use case to narrow to String | Int64:

... # previous code

case d
when String
  value = d.as(String)
when Int64
  value = d.as(Int64)
else
  raise "Unexpected type in JSON lookup!"
end

p! value # => "pizza"
p! typeof(value) # => (Int64 | String)

Then I could store value in an object or something; presumably I’d need to further narrow to either String or Int64 when actually doing something with the value.

(Note that the raise is important because it tells the compiler that the type couldn’t possibly be anything other than what is in the when clauses. You could alternatively use in instead of when, but you’d have to handle every single type in the union.)

If you want to completely narrow the type, you need to do whatever different logic you have for each type in the case:

case d
when String
  # do something when it's a String
when Int64
  # do another thing when it's an Int64
else
  raise "Unexpected type in JSON lookup!"
end
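
For comparison, here is a sketch of the exhaustive in form mentioned in the note above. The describe method is a hypothetical name; the point is only that every member of JSON::Any::Type needs its own branch, and the compiler rejects the case if one is missing.

```crystal
require "json"

# Hypothetical helper: names the JSON kind of a raw value.
# With `in`, no `else` is needed, but every type in the union must be covered.
def describe(raw : JSON::Any::Type) : String
  case raw
  in String                  then "string"
  in Int64                   then "integer"
  in Float64                 then "float"
  in Bool                    then "boolean"
  in Nil                     then "null"
  in Array(JSON::Any)        then "array"
  in Hash(String, JSON::Any) then "object"
  end
end

puts describe(JSON.parse(%["pizza"]).raw)
puts describe(JSON.parse("5").raw)
```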

Handling types is a huge topic in Crystal, and I’m by no means a master of the language, but I hope that this provides something to start with.

The main reason there’s no #dig that takes an Array or Enumerable is that dig is type-safe. If you do:

a = [{"a" => ['a', 'b', 'c']}]
value = a.dig(0, "a", 1)
puts typeof(value)
puts value

You get:

Char
'b'

Here’s a simple implementation of a dynamic dig:

module Indexable
  def dig(values : Enumerable)
    obj = self
    values.each do |value|
      if obj.is_a?(Array) && value.is_a?(Int32)
        obj = obj[value]
      elsif obj.is_a?(Hash)
        obj = obj[value]
      else
        raise "OH NO!"
      end
    end
    obj
  end
end

Then if you do this:

a = [{"a" => ['a', 'b', 'c']}]
value = a.dig([0, "a", 1])
puts typeof(value)
puts value

You get this:

(Array(Char) | Array(Hash(String, Array(Char))) | Char | Hash(String, Array(Char)))
b

The type you get is basically useless: it’s a giant union of all the types you could encounter along the way.

That said, dynamic dig for JSON::Any makes total sense, because any value in the chain is itself another JSON::Any.
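
As a rough sketch of what that could look like (not an existing standard-library method; dig_path? is a hypothetical name), the version below walks a path of String keys and Int32 indexes and returns nil as soon as a step is missing or the node is not the right kind of container:

```crystal
require "json"

# Hypothetical helper: nil-safe dynamic dig over a JSON::Any.
def dig_path?(json : JSON::Any, path : Array(String | Int32)) : JSON::Any?
  node = json
  path.each do |key|
    found = case key
            in String
              hash = node.as_h?
              hash ? hash[key]? : nil
            in Int32
              array = node.as_a?
              array ? array[key]? : nil
            end
    return nil if found.nil?
    node = found
  end
  node
end

data = JSON.parse %[{"data": {"food": [{"name": "pizza", "cost": 5}]}}]

p! dig_path?(data, ["data", "food", 0, "name"])
p! dig_path?(data, ["data", "drinks", 0] of String | Int32)
```

Because every step yields another JSON::Any, the return type stays JSON::Any? regardless of how long the path is, which is exactly why the giant-union problem does not appear here.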

Feel free to send a feature request!
