With the recent archiving of the language-crystal repository, which housed the language grammars, I think this would be the best time to get started on an official tree-sitter for Crystal under the crystal-lang-tools organisation. This has been a long-requested tool for the Crystal community and would help with Crystal tooling support. We could also house new language configurations in the repository, such as the query files and syntax highlighting. Several attempts have been made already, but none of them is complete and they see very little maintenance. This would be a good opportunity for people to collaborate and build a better, more usable tree-sitter.
I’ve been following keidax’s tree-sitter implementation for a while; it seems to be the most complete one so far. Personally, I would love this, as an active Neovim user who already uses tree-sitter for every other language.
How close to completion would you say Keidax’s implementation is? We could probably use it as a reference or even copy parts of it for the official tree sitter.
I’m not going to pretend I’m an expert; I think we should just ask keidax.
This grammar is still under construction.
Syntax highlighting has not been implemented yet, and many syntax features are incomplete.
Looks like it potentially still has a lot of work before being completed.
From what I’ve tried of keidax’s implementation, the parser seems to work well for relatively simple files but fails on larger, more complex files and sometimes crashes in the C scanner, which seems to be handwritten and is about 2000 lines of code (vs. about 1000 lines for tree-sitter-ruby and none for will’s tree-sitter-crystal implementation).
The test files give us some idea of the syntax features currently implemented, and they seem extensive enough that we could certainly use them as a reference.
I tested out Keidax’s implementation recently and apart from the lack of syntax highlighting, it worked pretty well. Perhaps we could get some movement on this to get the tree sitter up to speed.
I’m doing the same. I also wrote experimental bindings for the tree-sitter API in the meantime, so people can use tree-sitter in Crystal applications. However, the API is far from stable or even complete, since so far it has been built only around my own needs.
Could we get a repo set up in the organisation for this? I’d love to get started on this soon, even if that’s just via issues and PRs.
Are you just looking to create an empty repository? If so I can create one.
Yes, from there we can figure out a good starting point for the tree sitter.
up up.
keidax’s parser isn’t perfect, but I can already use it in nvim for syntax highlighting.
This is really cool!
I’ve forked that repo to crystal-lang-tools/tree-sitter-crystal to replace the old version (moved here). From there I plan on updating the Zed plugin to use this version - macro highlighting may be broken but that’s the case right now regardless (due to using ruby’s tree sitter).
Nice, I’ll point the PRs to it and add it to nvim-tree-sitter soon.
Related to this: I forked the tree-sitter-slim parser and turned it into tree-sitter-slang. It’s not yet published, since so far I’ve only renamed the parser.
I actually merged your PRs already, as I wanted the CI improvements and to see where parser issues were happening. Thank you for those!
Happy to create crystal-lang-tools/tree-sitter-slang or similar. Do any changes need to be made beyond renaming the parser?
Probably some small changes, but the syntax highlighting already works. I can publish it next week, since I have no time to touch the computer on weekends due to cute babies demanding all my attention.
Hey all, I’m the creator of keidax/tree-sitter-crystal! Just found this thread.
I’ve worked on my version of the Crystal tree-sitter grammar on and off since 2021. I’ve gone through periods of motivation, and periods where I lost interest. But it’s great seeing renewed interest in this project, and I’d love to help!
Since active community development is happening at the crystal-lang-tools repo, I will focus my efforts there. Eventually I’d hope to archive my repo in favor of the community version.
A few random thoughts I wanted to share, based on my experience working with tree-sitter:
- The external scanner code is the biggest liability in the project. It’s also absolutely crucial, since language features like heredocs cannot be scanned in grammar.js alone. I’ve used the scanner for things like implementing precedence for some operators. Crystal operator precedence is whitespace-dependent in certain contexts, and that’s not easy to express in grammar.js.

  I know the C code isn’t the prettiest. Some of the structure is necessary, because the scanner can only return a single symbol at a time. But there’s definitely room for improvement.
- As far as I know the Crystal language isn’t specified in a formal way. I’ve relied on the compiler’s parsing code as the source of truth. This isn’t ideal, because the compiler could change how it handles some edge case without anyone thinking to update tree-sitter-crystal at the same time.
- I believe parsing macro content in all cases is essentially impossible. For example, this is valid Crystal code:

      {%begin%}
      if true
        puts "in macro"
      {{["e", "n", "d"].join("").id}}
      {%end%}

  There’s no way to resolve the end keyword without actually executing the macro code. And doing that would require either building a full Crystal macro interpreter into scanner.c, or calling out to an external tool on each scan. Neither option seems feasible to me. (Tree-sitter grammars are supposed to have no external dependencies besides a C compiler, so relying on the crystal toolchain would be a last resort.)

  I wonder if there’s a way to use conflicts or prec.dynamic to have a fallback macro rule. This fallback would treat everything in the macro as a “macro body” node without distinguishing syntax further. Or maybe just leave it up to tree-sitter’s error tolerance?
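To make the "opaque macro body" fallback from the last point a bit more concrete, here is a minimal standalone C sketch. It is not tree-sitter code and not taken from any existing scanner: the names macro_body_end and opens_block are hypothetical, and the list of block-opening keywords is an assumption rather than something checked against the compiler. It only shows that the extent of a macro body can be found by counting macro control tags, without interpreting anything between them. It deliberately ignores string literals, comments, and escaped tags.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does the keyword at p (length n) open a macro block?
   The keyword list is an assumption for illustration only. */
static int opens_block(const char *p, size_t n) {
    static const char *const openers[] = {"begin", "if", "unless", "for", "verbatim"};
    for (size_t i = 0; i < sizeof openers / sizeof openers[0]; i++) {
        if (strlen(openers[i]) == n && strncmp(p, openers[i], n) == 0) return 1;
    }
    return 0;
}

/* Given the offset just past an opening tag such as {%begin%}, return the
   offset of the "{%" that starts the matching {%end%}, or (size_t)-1 if the
   source ends first. Everything in between, including {{ ... }} expressions,
   is treated as opaque text. */
size_t macro_body_end(const char *src, size_t pos) {
    size_t depth = 1;
    while (src[pos] != '\0') {
        if (src[pos] == '{' && src[pos + 1] == '%') {
            size_t tag = pos;
            pos += 2;
            /* Allow both "{%end%}" and "{% end %}". */
            while (src[pos] == ' ' || src[pos] == '\t') pos++;
            size_t kw = pos;
            while (isalnum((unsigned char)src[pos]) || src[pos] == '_') pos++;
            size_t len = pos - kw;
            if (len == 3 && strncmp(src + kw, "end", 3) == 0) {
                if (--depth == 0) return tag;   /* matching close found */
            } else if (opens_block(src + kw, len)) {
                depth++;                        /* nested macro block */
            }
            /* Other tags ({% else %}, assignments, ...) leave depth unchanged. */
        } else {
            pos++;
        }
    }
    return (size_t)-1;
}
```

For the example above, calling macro_body_end with the offset just past {%begin%} returns the position of the closing {%end%}, and the {{...}} expression that produces "end" is skipped as plain text. In a real external scanner the same counting would have to read characters through the lexer instead of a string, and the scanner would emit a single "macro body" token for the span.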