Is there a better way to create a hash from a colon and comma delimited string?

Been fiddling around with this for quite some time. Still not sure if I’m doing it correctly, but it does seem to work: Playground: https://play.crystal-lang.org/#/r/6dp2/edit

Code:

data = "0:123,4:599,3:6912"

def colon_comma_to_hash(data)
  test = Hash(Int8, Int64).new
  split = data.split(",")

  split.each do |vv|
    socket_data = vv.split ":"
    test[socket_data[0].to_i8] = socket_data[1].to_i64
  end
  test
end

def colon_comma_to_s(hash)
  hash.to_s.gsub("_i8 =>", ":").gsub('{', "").gsub("_i64", "").gsub(" ", "").chomp('}')
end

test = colon_comma_to_hash(data)

puts test
puts colon_comma_to_s(test)
 

Output:

{0_i8 => 123_i64, 4_i8 => 599_i64, 3_i8 => 6912_i64}
0:123,4:599,3:6912

I feel like the gsub chaining is a bit hacky, although using chomp felt pretty good.

Any improvements to this would be greatly appreciated.

edit: I wanted to use JSON for this, but I don’t want to store redundant " around keys in my items table, and the JSON spec requires them. I also tried from_yaml, but got an error when trying to compile. I don’t know whether I would need to install a YAML dependency on the VPS after a static compile, which would be super annoying. Static compiling already works great, and I’m not going to break it just for YAML support.

https://carc.in/#/r/6duk

data = "0:123,4:599,3:6912"
hash = data.split(',').map(&.split(':').map(&.to_i)).to_h
pp! hash # => {0 => 123, 4 => 599, 3 => 6912}

If I’m understanding you correctly, just split the string into an array of two-element arrays, then use to_h to convert it to a hash.
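
If the Int8 keys and Int64 values from the original question need to be preserved, the same split-then-to_h idea can be sketched with the block form of to_h. The method name sockets_from_s is made up for illustration, and the input is assumed to be well formed (every pair contains exactly one colon and both sides are valid integers):

```crystal
# Hypothetical helper, not from the thread: builds a Hash(Int8, Int64)
# from a string such as "0:123,4:599,3:6912".
def sockets_from_s(data : String) : Hash(Int8, Int64)
  data.split(',').to_h do |pair|
    # Raises if the pair has no ':' or if either side is not a valid integer.
    key, value = pair.split(':', 2)
    {key.to_i8, value.to_i64}
  end
end

hash = sockets_from_s("0:123,4:599,3:6912")
pp! hash
pp! typeof(hash)
```

Returning a tuple from the block lets to_h infer the key and value types, so no separate Hash(Int8, Int64).new is needed.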


Thanks @konovod, very cool!

And for the way back: hash.to_a.map(&.join(':')).join(',') https://carc.in/#/r/6dun
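
As a quick sanity check, the two one-liners can be combined into a round trip. This is only a sketch: it assumes the input has no spaces or duplicate keys, and it relies on Crystal hashes keeping insertion order. The second serialization, using string interpolation, is an alternative spelling added here for comparison, not something proposed in the thread:

```crystal
data = "0:123,4:599,3:6912"

# String to hash, as in the reply above.
hash = data.split(',').map(&.split(':').map(&.to_i)).to_h

# Hash back to string, using join on each key/value tuple.
back = hash.to_a.map(&.join(':')).join(',')

# Equivalent spelling with interpolation (an alternative, assumed for illustration).
back_interpolated = hash.map { |key, value| "#{key}:#{value}" }.join(',')

puts back
puts back == data
puts back_interpolated == data
```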


Also, if you have lots of data, you can avoid most intermediate allocations if you trade in some verbosity:

data = "0:123,4:599,3:6912"
hash = Hash(Int8, Int64).new
data.split(',') do |pair| 
  key, value = pair.split(':', 2)
  hash[key.to_i8] = value.to_i64
end
pp! hash


data = String.build do |io|
  first = true
  hash.each do |key, value|
    io << ',' unless first
    io << key << ':' << value
    first = false
  end
end

puts data

https://carc.in/#/r/6dus

There’s some optimization potential left while deserializing: you could get rid of the inner split and tokenize the string by hand, but that gets quite a bit more verbose.
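
To give an idea of what tokenizing by hand could look like, here is a rough sketch that walks the string one character at a time and accumulates the numbers itself, so no substrings are created. The method name parse_sockets_by_hand is invented for this example, and it assumes non-negative decimal numbers with no whitespace or signs:

```crystal
# Hypothetical helper: parses "k:v,k:v" without split.
# Assumes non-negative decimal digits only; anything else raises.
def parse_sockets_by_hand(data : String) : Hash(Int8, Int64)
  hash = Hash(Int8, Int64).new
  key = 0_i64
  value = 0_i64
  in_value = false # true once the ':' of the current pair has been seen
  pending = false  # true while a pair has been started but not yet stored

  data.each_char do |char|
    case char
    when ':'
      in_value = true
    when ','
      hash[key.to_i8] = value # to_i8 raises if the key does not fit in Int8
      key = 0_i64
      value = 0_i64
      in_value = false
      pending = false
    else
      digit = char.to_i # raises ArgumentError for non-digit characters
      if in_value
        value = value * 10 + digit
      else
        key = key * 10 + digit
      end
      pending = true
    end
  end

  # Store the last pair, which is not followed by a comma.
  hash[key.to_i8] = value if pending
  hash
end

pp! parse_sockets_by_hand("0:123,4:599,3:6912")
```

Whether this is actually faster than the split-based version would need to be measured; for six sockets per item the difference is unlikely to matter.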

For the serialization part, the String.build is just an example; you could write to an IO (like a socket or stdout) directly, avoiding the big intermediate string.
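
A minimal sketch of that idea: the same loop as in the String.build example, moved into a method that accepts any IO. The name write_sockets is hypothetical, and IO::Memory is used here only so the result can be inspected; a socket or STDOUT could be passed instead:

```crystal
# Hypothetical helper: writes "k:v,k:v" straight to the given IO.
def write_sockets(io : IO, hash : Hash(Int8, Int64)) : Nil
  first = true
  hash.each do |key, value|
    io << ',' unless first
    io << key << ':' << value
    first = false
  end
end

hash = {0_i8 => 123_i64, 4_i8 => 599_i64, 3_i8 => 6912_i64}

# Write directly to standard output, no intermediate string.
write_sockets(STDOUT, hash)
STDOUT.puts

# Or collect into an in-memory IO.
buffer = IO::Memory.new
write_sockets(buffer, hash)
puts buffer.to_s
```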

Eventually I’d like to write a string tokenizer for the standard library, similar to the one in Go (bufio.Scanner), which doesn’t use regex.


And if you really want to stick with gsub, you can simplify it to:
hash.to_s[1...-5].gsub("_i8 => ", ':').gsub("_i64, ", ',')
where the [1...-5] strips one character from the start and five from the end. Using a three-dot range (up to, but not including, the end) makes this clearer in my opinion.
However, I’d still go with one of the previous suggestions, because this relies on the format returned by Hash#to_s and is probably slower.

Wow, so map, split and join can reduce my code to a few lines. @jhass Yeah, items can only have a maximum of 6 sockets, and the item id is a bigInt, so I don’t think I’ll have to deal with that much data. However, players can have quite a few items, especially if they are loading items from their stash. So at this point I’ll probably go the map, split and join way, and if something comes up later down the road, I’ll come back to this thread :O.

With that said, however, these methods are really only used when an item’s sockets, links, etc. are being modified.

Thanks again, you guys! Very cool to see how to do things properly compared to what I originally thought.