Formatter vs ToSVisitor

The Crystal compiler contains two tools which turn an AST into source code: The formatter and the to_s visitor.

Their implementations are completely independent, so there is a lot of duplication. I’m wondering if there’s any chance to reduce that.

The formatter is used to format existing source code, the to_s visitor is used whenever the compiler needs to stringify nodes, most notably for macro expansion ({{ node }} is similar to #{node.to_s} in the source string).

The primary difference is that the formatter generally retains formatting of the existing source (for example, whether a method call uses parenthesis or not), whereas the to_s visitor stringifies a more strict formatting, based solely on the AST. The formatter on the other hand has access to original source format.

The to_s visitor still allows some variation, as long as the information is available in the AST.

A bit surprising actually, the to_s visitor does not put line breaks inside do ... end block if they didn’t exist before, while the formatter always reformats do ... end blocks with line breaks.
I suppose this caused by the primary difference in use cases: The formatter is concerned about standardizing a well-formatted source. to_s is generally not meant to follow the original formatting, but it takes care to maintain line breaks, so that line numbers of expanded macro expressions match up.

I tried to see what happens when you pass the to_s result through the formatter. Most of the code is actually well-formatted. The line-breaks of do ... end blocks is a major difference, and there’s an issue with indentation inside binary operators.

And I noticed there are to_s specs for annotations on splat parameters, and the formatter breaks on those (fix Fix format annotations of splat parameter by straight-shoota · Pull Request #16987 · crystal-lang/crystal · GitHub).

We had the idea of moving tools outside the compiler. If we drop the ToSVisitor what needs to change (besides the playground :see_no_evil_monkey:).

I don’t recall if macros produce AST directly or source that needs to be parsed later.

I wonder if replacing means it won’t be able to be extracted…

I don’t think extracting the ToSVisitor is an option. The compiler depends on it for stringifying macro nodes.