Having trouble parsing YAML into a programmatically usable format

I have decided to migrate a small package-managing program I was developing in Ruby over to Crystal for performance reasons, and I’m having some hiccups with the YAML/JSON libs.

My program takes a search query and pulls down results from an HTTP RPC interface. The results are in JSON, but I would much rather use YAML because it is easier for me to read. I can already convert the results to YAML and write them to a file or stdout, but I’m struggling to get the YAML formatted the way I’d like it to be.

---
- Description: A CLI system information tool written in BASH that supports displaying images.
  Name: neofetch-git
  URL: https://github.com/dylanaraps/neofetch
  URLPath: /cgit/aur.git/snapshot/neofetch-git.tar.gz
  Version: 3.2.0.r28.g2eca41d-1
- Description: some other package xyz
  Name: packagename

This is an example of the output I’m working with.

I used a YAML::Serializable class to format them a bit differently, like so:

---
Name: neofetch-git
Description: A CLI system information tool written in BASH that supports displaying
  images.
Version: 3.2.0.r28.g2eca41d-1
URLPath: /cgit/aur.git/snapshot/neofetch-git.tar.gz
NumVotes: 30
URL: https://github.com/dylanaraps/neofetch
---

But I am unable to read multiple entries of this from a file. It will only see the first one and then exit. I imagine it is because of the --- at the beginning and end of the entry.

What I want my results to be is something along the lines of this:

---
- Package: packagename1
  PkgInfo: ["Name": name, "Version": version, "Description": desc, # so on & so forth ]
- Package: packagename2
  PkgInfo: ["Name": name, "Version": version, "Description": desc, # so on & so forth ]
- Package: packagename3
  PkgInfo: ["Name": name, "Version": version, "Description": desc, # so on & so forth ]

I know that this is a hash that I am trying to make, but it seems the YAML objects have very few methods, or perhaps I am just using the wrong ones to access those values programmatically. I know this has been a long question, but I assume that anyone who reads through a post this long about YAML formatting and access probably sees a decent amount of value in it being readable, elegant, and accessible. If anyone has any ideas or can point me in the direction of something that may help, I would really appreciate it. Thank you for coming to my TED talk.
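For the multi-document file described above, one possible approach is a sketch like the following. It assumes the entries are separated by `---` lines and uses `YAML.parse_all`, which returns one `YAML::Any` per document; values are then read with `[]` and `as_s`. The second entry's version string is a made-up placeholder.

```crystal
require "yaml"

# Two YAML documents in one stream, separated by `---`.
# The second Version value is a placeholder for illustration only.
stream = <<-YAML
  ---
  Name: neofetch-git
  Version: 3.2.0.r28.g2eca41d-1
  ---
  Name: packagename
  Version: 1.0-1
  YAML

# parse_all reads every document in the stream, not just the first.
docs = YAML.parse_all(stream)

docs.each do |doc|
  # YAML::Any#[] looks up a key; #as_s casts the value to String.
  puts "#{doc["Name"].as_s} #{doc["Version"].as_s}"
end
```

`YAML.parse` on the same stream would only return the first document, which may be why only one entry showed up.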

I think the key difference here is that the second file maps to a single object, whereas in the example before it, the - in front of Description denotes an array of objects. To get the results you want, you’d need something like:

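require "yaml"

# Inner mapping that holds the details of one package.
class PkgInfo
  include YAML::Serializable

  property name : String
  property version : String
end

# Outer entry: one per package.
class Package
  include YAML::Serializable

  property name : String
  property info : PkgInfo
end

# array_of_packages is an Array(Package) built elsewhere.
array_of_packages.to_yaml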

I.e. an array of objects that represent the outer structure of what you want, then another package info type to get the PkgInfo object within it.
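A minimal sketch of that idea, assuming the flat entries are parsed first and then wrapped. The type name `Entry`, its explicit `initialize`, and the trimmed sample data are illustrative assumptions, not something from the original program.

```crystal
require "yaml"

# Inner type; keys follow the sample data in the question.
struct PkgInfo
  include YAML::Serializable

  @[YAML::Field(key: "Name")]
  getter name : String
  @[YAML::Field(key: "Version")]
  getter version : String?
  @[YAML::Field(key: "Description")]
  getter description : String?
end

# Hypothetical outer wrapper that adds the Package key.
struct Entry
  include YAML::Serializable

  @[YAML::Field(key: "Package")]
  getter package : String
  @[YAML::Field(key: "PkgInfo")]
  getter info : PkgInfo

  # Extra constructor so entries can be built in code, not only from YAML.
  def initialize(@package : String, @info : PkgInfo)
  end
end

flat = Array(PkgInfo).from_yaml(<<-YAML)
  - Name: neofetch-git
    Version: 3.2.0.r28.g2eca41d-1
    Description: A CLI system information tool
  - Name: packagename
    Description: some other package xyz
  YAML

# Wrap each parsed package, using its name as the Package value.
entries = flat.map { |info| Entry.new(info.name, info) }
puts entries.to_yaml
```

The same `Array(Entry).from_yaml` call should read that output back in, so the file stays a single document containing an array.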

@wreedb Is this what you’re looking for?

require "yaml"
require "uri/yaml"

struct Package
  include YAML::Serializable

  @[YAML::Field(key: "Description")]
  getter description : String
  @[YAML::Field(key: "Name")]
  getter name : String
  @[YAML::Field(key: "URL")]
  getter url : URI?
  @[YAML::Field(key: "URLPath")]
  getter url_path : String?
  @[YAML::Field(key: "Version")]
  getter version : String?
end

pp Array(Package).from_yaml(<<-YAML)
---
- Description: A CLI system information tool written in BASH that supports displaying images.
  Name: neofetch-git
  URL: https://github.com/dylanaraps/neofetch
  URLPath: /cgit/aur.git/snapshot/neofetch-git.tar.gz
  Version: 3.2.0.r28.g2eca41d-1
- Description: some other package xyz
  Name: packagename
YAML

This seems like it could work for me. The question I still have is that the Package: packagename entry in the desired output doesn’t actually exist in the data I’m feeding into the parser; it’s just something that I want to be there. Is it possible to insert such a field without it being present in the parsed data?

This seems very close to the serialization class that I currently have, although the (<<-YAML) part in the from_yaml call is foreign to me. What does adding that on the end do? Is the - after the << specifying the output format?

That’s called a heredoc. They are useful for nontrivial strings, like YAML or SQL.
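A small illustration, with made-up sample text: the heredoc is just a String literal, so `<<-YAML` in an argument list passes the following lines as that argument. The `-` is part of the heredoc syntax, and the word after it is only a delimiter name, not an output format.

```crystal
require "yaml"

# Everything between <<-TEXT and the closing TEXT is one String.
# Leading whitespace up to the closing delimiter's indentation is removed.
text = <<-TEXT
  Name: neofetch-git
  Version: 3.2.0
  TEXT

# The heredoc can also start inside a call; its body follows on the next lines.
doc = YAML.parse(<<-YAML)
  Name: neofetch-git
  YAML

puts text
puts doc["Name"].as_s
```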

Sure. You can ignore that property on deserialization, make it nilable, and fill it in later, assuming the value comes from an external source. Another option is to tap into the after_initialize hook of the deserialization process and fill it in from some data point in the parsed data. It’s hard to say which is best, but you have some options.
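Here is a rough sketch of the second option, assuming the Package value should simply mirror Name. The property is nilable because the incoming data lacks it, and `after_initialize` (the hook `YAML::Serializable` runs after deserializing) fills it in. The field choice and sample data are illustrative.

```crystal
require "yaml"

struct Package
  include YAML::Serializable

  # Not present in the incoming data, so it is nilable and starts as nil.
  @[YAML::Field(key: "Package")]
  getter package : String?

  @[YAML::Field(key: "Name")]
  getter name : String

  @[YAML::Field(key: "Version")]
  getter version : String?

  # Called once the instance has been deserialized from YAML.
  # Assumption: the Package value is derived from Name.
  def after_initialize
    @package ||= @name
  end
end

packages = Array(Package).from_yaml(<<-YAML)
  - Name: neofetch-git
    Version: 3.2.0.r28.g2eca41d-1
  - Name: packagename
  YAML

# Package is now filled in and is emitted when serializing again.
puts packages.to_yaml
```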