How bad is $~?

In my new internationalization system, I am using $~, $1, $2, etc. The idea is that your original string may contain "#{a} arbitrary text #{b}", and your translated string is "#{$1} translated arbitrary text #{$2}" and gets the original interpolated expressions in whatever order you specify. You can also rewrite the expressions if you have to.

$~, $1, $2, etc. are an oddity we inherited via Perl and Ruby, originally inspired by the Bourne shell. They are unique in the language in that they leak out of a function by one scope level, and only one, so they can be used to return multiple values, although tuples are cleaner. You can’t read $~ if it has not been set. They were intended to hold the result of regular expression matches, although it’s better to use the return value of Regex#match.
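For readers who have not used them, here is a minimal sketch of the two styles mentioned above: reading $1 and $2 after a match with =~, and reading the same groups from the value returned by Regex#match. The sample string and the variable names are made up for illustration.

```crystal
text = "2024-05"

# Style 1: =~ sets the magic variables $~, $1, $2 as a side effect.
year_a = ""
month_a = ""
if text =~ /(\d+)-(\d+)/
  year_a = $1
  month_a = $2
end

# Style 2: Regex#match returns the match data explicitly (nil when there is no match).
year_b = ""
month_b = ""
if md = /(\d+)-(\d+)/.match(text)
  year_b = md[1]
  month_b = md[2]
end

# Both styles see the same capture groups.
puts year_a == year_b && month_a == month_b
```

The second style keeps the match data in an ordinary local variable, so nothing depends on which call most recently set $~.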

The way I use it, it does not leak into the scope of the caller of my internationalization functions.

What I want to know is how bad $~ is considered, especially by the language developers. In particular:

  • Does it do bad things to code generation and optimization?

  • Does it cause block captures where they would otherwise be expanded inline?

  • Is it another of the language features @asterite wants to delete, and what are his reasons?




This issue should have the basic answers covered:

I don’t understand what you’re trying to do or how dollar sign variables would help with your i18n use case. Why can’t you use a and b in your translated string?
In any case, I’d advise not to use dollar sign variables for anything but the intended use case of convenient access to regex matchdata.

I don’t want to use a and b in my translated strings because the translators are not programmers, and since the translated strings are in different files from the native language strings, I want to avoid changing code in the translated strings.

I love translation APIs where you can have named placeholders. “%person likes this post” just gives the translator so much more context vs “$1 likes this post”.


Exactly. And translators need to understand the source translation in order to translate it, so what’s the point in having different formats?

On a different note, I’m still not sure what exactly this is about: localization, in my understanding, mostly happens at runtime, but using Crystal expressions like string interpolation would require it to happen at compile time.


The magic variables $~, $1, etc. are fine. I was a bit reluctant at first (even though I implemented them! :-P) but they are usually used next to a match, and they are very convenient.

That said, I also don’t understand why they need to be part of a translation API.

@jhass I am all for giving the translator instructions. But I am starting with Crystal’s existing string syntax and not inventing a new string format. So, I can do something like this:

t("String #{1+1} #{2+2} String", exp=["person", "place"])

@asterite The reason I wanted them for the translation API was that I wanted to use interpolated strings as the translated strings, and I wanted to have a set of local variables that would never collide with my caller, even if I was in a block in the caller’s context. But I was probably getting a bit too clever for my own good.

Maybe this can help you?

@asterite My problem with fresh variables is that they work by letter rather than number, and the alphabet is wildly different across national languages. So, I can’t expect the translator to think that %b is the second argument, etc.
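As a point of comparison, numbered placeholders can also be filled in at runtime without $~ or compile-time interpolation. The sketch below is only an illustration: the helper name fill_placeholders and the %1, %2 placeholder syntax are assumptions for this example, not part of the system described above or of any existing library.

```crystal
# Hypothetical helper: replaces %1, %2, ... with the matching entry of `args`.
# Placeholders with no matching argument are left in the text unchanged.
def fill_placeholders(template : String, args : Array(String)) : String
  template.gsub(/%(\d+)/) do |whole, md|
    n = md[1].to_i?
    if n && n >= 1 && n <= args.size
      args[n - 1]
    else
      whole
    end
  end
end

args = ["Ana", "Lisbon"]

# The source string and a reordered variant share the same arguments.
english = fill_placeholders("%1 moved to %2", args)
reordered = fill_placeholders("To %2 moved %1", args)

puts english
puts reordered
```

The translator only ever sees numbers, and the order of the arguments in the translated text is independent of their order in the source string. The trade-off is that the values are passed in as data rather than written as interpolated expressions.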