The Crystal Programming Language Forum

String.new with Pointer(Char)

I have been writing a few C extensions and have noticed that for some strings I have to use as(Pointer(UInt8)). This is not too annoying and can be extracted into a method. But I was wondering why. This seems to be like an alias and I don't know why this is a problem. It seems like some of the C extensions in the lib return Char* and work. Also, while looking into submitting a patch, it looks like it would change some fundamental APIs. Would it be of benefit to move String to use Char instead of UInt8? It seems like it is the fundamental unit that String is based on.

example: Carcin

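# malloc allocates room for 3 Chars; Pointer(Char).new(3) would only create a pointer to address 3
p = Pointer(Char).malloc(3)
p[0] = 'a'
p[1] = '\0'
p[2] = 'b'
# Does not compile: String.new expects a Pointer(UInt8), not a Pointer(Char)
s = String.new(p, 3)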

I’m not sure I’m reading your comment correctly, but it seems you’re confusing Crystal’s Char type with C’s char. They have the same name, but are fundamentally different types. The latter represents an 8-bit character. It is mostly equivalent to Crystal’s UInt8 (alias LibC::Char = UInt8). Crystal’s Char is 4 bytes wide and represents a Unicode codepoint in UTF-32 encoding.
While a string consists of characters, String uses UTF-8 encoding for space efficiency. That means its code units are 8 bits wide (a single character takes one to four of them), so the data format is compatible with a char pointer in C.
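
To make the size difference concrete, here is a minimal sketch. The sample string is made up for illustration; the sizes follow from Char being a 32-bit codepoint and String storing UTF-8 bytes.

```crystal
# Crystal's Char holds a full Unicode codepoint; LibC::Char is just UInt8.
puts sizeof(Char)       # => 4
puts sizeof(LibC::Char) # => 1

# "h\u00e9llo" is "héllo": five characters, but 'é' needs two UTF-8 bytes.
s = "h\u00e9llo"
puts s.size     # => 5 (characters)
puts s.bytesize # => 6 (bytes, which is what a C char* would see)
```

So the unit a C function sees through a char* is the byte, not the Char.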

Bottom line: The Char type is roughly equivalent to a 32-bit unsigned integer in C (uint32_t or char32_t), not to C’s char. I don’t think there are many C APIs that use that to represent strings. So you should almost never need Pointer(Char) for library bindings.
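
For comparison, a version of the example from the question that should compile, as a sketch: it builds the same three bytes with Pointer(UInt8), which is what String.new accepts. The variable names are only illustrative.

```crystal
# Allocate three bytes instead of three 4-byte Chars.
ptr = Pointer(UInt8).malloc(3)
ptr[0] = 'a'.ord.to_u8
ptr[1] = 0_u8 # embedded NUL byte, kept because the bytesize is given explicitly
ptr[2] = 'b'.ord.to_u8

# String.new(pointer, bytesize) copies exactly `bytesize` bytes.
s = String.new(ptr, 3)
puts s.bytesize # => 3
p s.bytes       # => [97, 0, 98]
```

With a pointer returned from a C function, String.new(ptr) without a size reads up to the first NUL byte instead.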

Yep, that’s already String#to_unsafe. It applies implicitly when passing String to a lib function.
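
A small sketch of that implicit conversion. LibDemo is a hypothetical binding name chosen for this example; it simply declares the C standard library's strlen.

```crystal
# Hypothetical binding for illustration: strlen from the C standard library.
lib LibDemo
  fun strlen(s : LibC::Char*) : LibC::SizeT
end

s = "hello"

# Explicit: to_unsafe returns a pointer to the string's UTF-8 bytes.
puts typeof(s.to_unsafe) # => Pointer(UInt8)

# Implicit: passing the String directly calls to_unsafe for you.
puts LibDemo.strlen(s)           # => 5
puts LibDemo.strlen(s.to_unsafe) # => 5
```

No cast to Pointer(UInt8) is needed on the Crystal side in either call.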


That makes a lot of sense. I was mistaken and I think the compiler was saving me from myself. I will have to look into my C lib to make sure I am doing everything correctly.