Do ascii/binary strings exist?

asterite · March 19, 2022, 5:29pm

In retrospect, we probably should have just exposed to_bytes or something like that, returning a slice. Then all methods on String would be character-related, and to get at the bytes you would use that one other method.

straight-shoota · March 21, 2022, 10:13am

Well, I’m very fond of String#byte_slice, which would take quite a bit of work to implement on top of #to_bytes. Unless we added such a method to Slice, which might not be a bad idea (though it fits better on String).

anon69898395 · March 21, 2022, 11:15am

I think the available methods are fine (I’m just slightly unhappy with their names). And I would definitely recommend adjusting the docs a bit (I would be happy to make suggestions, but sadly I don’t feel competent enough). String offers everything I need, but even after looking it up in the docs I wasn’t aware of that.

jhass · March 21, 2022, 12:58pm

I’d still love to see String go back to always being enforced, validated UTF-8 and lose all the byte-wise operations (except internally, for ASCII-only optimizations). Bytes would then become its own type (potentially inheriting from Slice(UInt8)), gain any functions necessary, perhaps gain a literal syntax, and receive whatever special cases in codegen are necessary to make it performant.

anon69898395 · March 21, 2022, 1:16pm

Without in any way implying it should be done this way, but as an honest question: why not handle it the same way Ruby does (by giving the object the information whether it’s supposed to be UTF-8 or just some binary data)? Depending on this information, the methods would react accordingly.

But I am totally fine with the status quo, too, and don’t mind instead using Bytes either.
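
For readers unfamiliar with the Ruby behavior referred to above, here is a minimal sketch of how a Ruby string carries its encoding as a tag and how character-oriented methods react to it. The variable names are only illustrative.

```ruby
# The \u escape guarantees this literal is tagged as UTF-8.
text = "h\u00E9llo"

# Same bytes, retagged as binary data (BINARY is an alias of ASCII-8BIT).
blob = text.dup.force_encoding(Encoding::BINARY)

# length counts characters according to the tag; bytesize ignores it.
text_info = [text.encoding, text.length, text.bytesize]  # UTF-8, 5, 6
blob_info = [blob.encoding, blob.length, blob.bytesize]  # binary, 6, 6

# The underlying bytes are identical...
same_bytes = text.bytes == blob.bytes  # true

# ...yet the two strings do not compare equal, because their encodings
# are incompatible once non-ASCII bytes are involved.
equal = text == blob  # false
```

So the same six bytes behave as five characters or six depending on the tag attached to the object.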

jhass · March 21, 2022, 1:29pm

Because in my personal experience the whole “strings may be text or binary blobs (and have more than one possible encoding)” idea is easily the biggest design flaw in the Ruby standard library. The whole pack/unpack interface is quite cryptic. How the string and IO encodings interact is quite hard to grasp, etc. In other words, I’m not happy that Ruby does it this way either; it’s just conflating two concepts because C does it this way.
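
As a small illustration of the pack/unpack interface mentioned above, the following sketch writes a length-prefixed payload into a string and reads it back. The format (a 16-bit length followed by the payload) is made up for this example; only Array#pack and String#unpack themselves are standard Ruby.

```ruby
payload = "hi"

# "n"  = 16-bit unsigned integer, big-endian
# "a*" = arbitrary string, taking all remaining bytes
packed = [payload.bytesize, payload].pack("na*")

# The binary record lives in an ordinary String object.
record_bytes = packed.bytes  # [0, 2, 104, 105]

# Reading it back uses the same single-letter directives.
length, body = packed.unpack("na*")  # 2, "hi"
```

The directive string is a tiny format language of its own, and the resulting binary record is stored in the same String class that is used for text.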
