String#<=> uses unsafe.memcmp, so it compares raw bytes rather than doing anything language-aware. For UTF-8 strings that amounts to sorting by codepoint value. Sorting for a particular language would require a character-order table for that language, so that "ä" and "a" end up in the proper order relative to each other. A table of characters to ignore when sorting, like "'", would also be necessary.
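To make the two-table idea concrete, here is a rough sketch of what I have in mind. The ORDER and IGNORE tables and the sort_key helper are made up for illustration; they are not real collation data for any language and not part of the standard library.

```crystal
words = ["b", "ä", "oa", "a", "o'b"]

# Default sort compares bytes: "ä" is encoded as 0xC3 0xA4, which is
# greater than any ASCII letter, and "'" (0x27) is less than "a" (0x61).
p words.sort # => ["a", "b", "o'b", "oa", "ä"]

# Hypothetical character-order table (assumption, not real language data).
ORDER = {'a' => 1, 'ä' => 2, 'b' => 3, 'o' => 4}
# Hypothetical table of characters to ignore when sorting.
IGNORE = Set{'\''}

# Build a sort key: drop ignored characters and map the rest to their
# table weight. Characters missing from ORDER sort after all listed
# ones, in codepoint order.
def sort_key(s : String) : Array(Int32)
  key = [] of Int32
  s.each_char do |c|
    next if IGNORE.includes?(c)
    key << (ORDER[c]? || ORDER.size + 1 + c.ord)
  end
  key
end

p words.sort_by { |w| sort_key(w) } # => ["a", "ä", "b", "oa", "o'b"]
```

With the tables applied, "ä" lands next to "a", and "o'b" is compared as if it were "ob", so it sorts after "oa".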
Has anyone done this for Crystal? UTS #10, the Unicode Collation Algorithm, covers Unicode sorting in detail: it is a multi-level sort with weights, a lot more than just two tables, but I don’t know of an implementation.
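As I understand the multi-level part, each character gets several weights, and a lower level is only consulted when all higher levels tie across the whole string. A much simplified sketch of that idea follows. The WEIGHTS table and the multilevel_key helper are hypothetical, and this is nowhere near the full algorithm in UTS #10.

```crystal
# Hypothetical weights per character: {primary, secondary, tertiary}.
# Primary = base letter, secondary = accent, tertiary = case.
# These numbers are invented for illustration, not taken from Unicode data.
WEIGHTS = {
  'a' => {1, 0, 0},
  'A' => {1, 0, 1},
  'ä' => {1, 1, 0},
  'b' => {2, 0, 0},
}

# Collect the weights level by level. Tuples compare element by element,
# so accents only matter when all base letters are equal, and case only
# matters when base letters and accents are equal.
def multilevel_key(s : String)
  primary = [] of Int32
  secondary = [] of Int32
  tertiary = [] of Int32
  s.each_char do |c|
    w = WEIGHTS[c]?
    next unless w # characters without an entry are skipped in this sketch
    primary << w[0]
    secondary << w[1]
    tertiary << w[2]
  end
  {primary, secondary, tertiary}
end

words = ["b", "Ab", "äa", "ab"]
p words.sort_by { |w| multilevel_key(w) } # => ["äa", "ab", "Ab", "b"]
```

Here "äa" sorts before "ab" because the accent is not looked at until the base letters tie, which a single order table cannot express.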
Thanks
Bruce