Official Tree Sitter

Hey keidax! I really appreciate all the work you’ve put into the tree sitter implementation. In case you didn’t know, Hugo put together a test script and your tree sitter implementation can already parse ~26% of the standard library without errors, which is honestly really amazing.

The stdlib parser code is also what I’ve been relying on to figure out the syntax and semantics of the language. At one point I did start trying to define most of the language in EBNF, but there’s still a lot to do there, and EBNF can’t express everything.

As far as macros go, I agree with just treating everything inside macro blocks as opaque code. This would mean limited visibility inside macro blocks when editing them, but for me that’s just the tradeoff to be made when utilizing macros.

I also think it would be a good idea to treat getter/setter/property/record as their own node types, rather than macro calls, as these are the most common uses for macros. This can be explored later though.

I’m really excited for the future of Crystal tooling!

3 Likes

That would be ideal. These edge cases are pretty rare, and on those occasions a utility parser should be able to recover from the “error”.

1 Like

Parsing macro content doesn’t require executing it.

The way the lexer works is this:

  • it lexes chunks of text normally, except that it also checks for some keywords. If it finds ‘if’, ‘for’ or ‘unless’ at the beginning of a line (just followed by spaces) it increments a counter. When it finds ‘end’ it decrements the counter.
  • it also checks for string literals like %q(...) to know that finding ‘end’ inside that isn’t actually an end
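
A rough model of that counter, written in Python rather than taken from the compiler, may make the behaviour easier to see. Everything here is an assumption for illustration: the name closing_end_line is hypothetical, only the three keywords listed above are tracked, "beginning of a line" is read as "preceded only by whitespace", and interpolations and %q(...) literals are handled only when they fit on one line with no nesting. Ordinary "..." strings are not skipped at all.

```python
import re
from typing import Optional

# Block openers only count when they start a line (after optional whitespace).
OPENER_RE = re.compile(r"\s*(if|for|unless)\b")
# Every standalone `end` word is a candidate closer.
END_RE = re.compile(r"\bend\b")


def strip_ignored(body: str) -> str:
    """Remove spans that the counter must not look inside."""
    # {{ ... }} interpolations are ignored (single-line only in this sketch).
    body = re.sub(r"\{\{.*?\}\}", "", body)
    # %q(...) literals are ignored (no nested parentheses in this sketch).
    body = re.sub(r"%q\([^)\n]*\)", "", body)
    return body


def closing_end_line(body: str) -> Optional[int]:
    """Return the 1-based line of the `end` that closes the macro.

    `body` is the text that follows the `macro name` line.
    None means no closing `end` was found, i.e. the macro is unterminated.
    """
    depth = 0
    for lineno, line in enumerate(strip_ignored(body).splitlines(), start=1):
        if OPENER_RE.match(line):
            depth += 1
        for _ in END_RE.finditer(line):
            if depth == 0:
                return lineno  # this `end` belongs to the macro itself
            depth -= 1  # this `end` closes an inner if/for/unless
    return None


# An `if` whose `end` only exists inside an interpolation: the counter never
# sees that `end`, so the last line closes the `if` and the macro stays open.
unterminated = '''\
  if true
    puts "in macro"
  {{["e", "n", "d"].join("").id}}
end
'''

# Producing the `if` from an interpolation as well keeps the counter at zero.
balanced = unterminated.replace("if true", '{{"if".id}} true')

# An `end` inside %q(...) is not counted, so the macro closes on line 2.
quoted = '''\
  puts %q(end)
end
'''

print(closing_end_line(unterminated))  # None
print(closing_end_line(balanced))      # 4
print(closing_end_line(quoted))        # 2
```

The point of the sketch is only that the counter works on the raw text: whatever an interpolation would expand to plays no part in deciding where the macro ends.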

The snippet you gave works because the macro necessarily ends when {% end %} is found. If you define it like this:

macro foo
  if true
    puts "in macro"
  {{["e", "n", "d"].join("").id}}
end

the code gives a syntax error saying the macro is unterminated. That’s because there’s an ‘if’ without a matching ‘end’. The actual content of {{...}} is completely ignored by the lexer.

(To make the above case work you could also do {{"if".id}}.)

That said, Crystal should probably error in your snippet saying that it couldn’t find the ‘end’ for that if.

6 Likes

IMO Crystal macros can be opaque nodes in tree-sitter, at least for now, so that edge cases don’t disturb highlighting.

OTOH maybe this is an ultra edge case, since as Asterite said, it fails to parse.

$ crystal eval
macro foo
  if true
    puts "in macro"
  {{["e", "n", "d"].join("").id}}
end
syntax error in eval:1
Error: unterminated macro

And I don’t see why someone would write a macro that generates if/end from macro variables… If an edge case is accepted by the compiler but never used in practice, I think tree-sitter failing to parse it isn’t a big deal.

Reading keidax’s scanner over the last few days, I was wondering whether it would be possible to move some things from the scanner into grammar.js, like operator precedence… however, I’m not aware of the use cases where it would depend on whitespace.

2 Likes