Do ASCII/binary strings exist?

In retrospect, we probably should have just exposed to_bytes or something like that, which returns a slice. Then all methods on String would be character-related, and to get at the bytes you would use that single method.


Well, I’m very fond of String#byte_slice, which would take quite a bit to implement on top of #to_bytes. Unless we added such a method to Slice, which might not be a bad idea (but it fits better on String).


I think the available methods are fine (I’m just slightly unhappy with their names). I would definitely recommend adjusting the docs a bit (I would be happy to make suggestions, but sadly I don’t feel competent enough). String offers everything I need, but even after looking it up in the docs I wasn’t aware of that.

I’d still love to see String go back to always being enforced, validated UTF-8 and lose all the byte-wise operations (except internally, for ASCII-only optimizations). Bytes would then become its own type (potentially inheriting from Slice(UInt8)), gain any functions necessary, perhaps gain a literal syntax, and receive any special cases in codegen necessary to make it performant.


Without in any way implying it should be done this way, but as an honest question: why not handle it the same way Ruby does, by giving the object the information whether it’s supposed to be UTF-8 or just some binary data? The methods would then behave accordingly, depending on this information.

But I am totally fine with the status quo, too, and don’t mind using Bytes instead.
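
For readers who don’t know the Ruby behavior being referred to, here is a minimal sketch of how a Ruby string’s encoding tag changes what the same methods do. The variable names are made up for illustration; the methods used (`String#b`, `#length`, `#bytesize`, `#reverse`, `#force_encoding`, `#valid_encoding?`) are standard Ruby.

```ruby
# The same six bytes, tagged two different ways.
text = "h\u00E9llo"   # UTF-8 string: 5 characters, 6 bytes
blob = text.b         # copy of the same bytes, tagged as binary (ASCII-8BIT)

same_bytes   = text.bytes == blob.bytes             # identical byte content
is_binary    = blob.encoding == Encoding::BINARY    # only the tag differs

# Character-oriented methods consult the tag:
text_length  = text.length   # counts UTF-8 characters
blob_length  = blob.length   # counts bytes

# Reversing the text keeps the two-byte character intact;
# reversing the blob flips its two bytes and corrupts it as text.
text_reversed = text.reverse
blob_reversed = blob.reverse.force_encoding("UTF-8")
still_valid   = blob_reversed.valid_encoding?

puts "text: #{text_length} chars, blob: #{blob_length} 'chars'"
puts "reversed blob is valid UTF-8? #{still_valid}"
```

So one class serves both roles, and which role a given object plays is only known at runtime.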

Because in my personal experience the whole “strings may be text or binary blobs (and have more than one possible encoding)” idea is easily the biggest design flaw in the Ruby standard library. The whole pack/unpack interface is quite cryptic, how string and IO encodings interact is quite hard to grasp, etc. In other words, I’m not happy that Ruby does it this way either; it’s just conflating two concepts because C does it this way :)
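
To make those two complaints concrete, here is a small sketch (variable names are hypothetical, the calls are standard Ruby): binary data is built through format-string directives on Array#pack, and the result is still a String whose bytes can be relabelled as text in any encoding.

```ruby
# pack/unpack: "n" is a 16-bit big-endian unsigned integer,
# "N" is a 32-bit big-endian unsigned integer.
header      = [513, 1].pack("nN")
header_bytes = header.bytes            # 513 == 0x0201, so the first two bytes are 2, 1
length, id  = header.unpack("nN")      # round-trips back to the integers

# The packed result is a String tagged as binary, not a separate byte type.
packed_is_binary = header.encoding == Encoding::BINARY

# One raw byte, three interpretations depending on the tag.
raw    = "\xE9".b
latin1 = raw.dup.force_encoding("ISO-8859-1")
utf8   = raw.dup.force_encoding("UTF-8")

latin1_ok = latin1.valid_encoding?     # 0xE9 is a valid Latin-1 character
utf8_ok   = utf8.valid_encoding?       # a lone 0xE9 is a truncated UTF-8 sequence
as_text   = latin1.encode("UTF-8")     # transcodes the Latin-1 character to UTF-8

puts "header bytes: #{header_bytes.inspect}"
puts "latin1 valid: #{latin1_ok}, utf8 valid: #{utf8_ok}"
```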
