Hi there, I’m brand new to Crystal, and looking into it primarily for integrating with C libraries. I’ve put together a couple examples, and I have some questions. Note that it’s not about the implementation (i.e. I know Crystal can puts and upcase), but rather the interaction of data between Crystal and C.
(solved) Q1: How do I link to libraries in a relative path?
Linking against a C library in a relative path, I found that I needed to use __DIR__:
libpath = Path["../hello-world-c/_build"].expand()
linkargs = "-L#{libpath}"
#@[Link(lib: "hello", ldflags: linkargs)] # Error: 'ldflags' link argument must be a String
#@[Link(lib: "hello", ldflags: "-L../hello-world-c/_build")] # ld: error: unable to find library -lhello
@[Link(lib: "hello", ldflags: "-L#{__DIR__}/../hello-world-c/_build")] # SUCCESS
lib LibHello
fun hello_world = helloWorld()
end
LibHello.hello_world()
(maybe solved?) Q2: How do you work with C strings?
In this next example, I have a C library that upcases a string.
I think my upcase example is incorrect, or at least exhibits undefined behavior. I say this because String.new(chars : UInt8*) calls LibC.strlen under the hood, and the UInt8* passed to LibUpcase.upcase isn’t guaranteed to be null-terminated. At least, I don’t see anything under the hood that null-terminates it.
So I think the correct way looks like this:
@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
fun upcase(name : UInt8*) : UInt32
end
str = String.new("hello crystal\0".to_slice)
LibUpcase.upcase(str)
puts str[0..-2]
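To make the strlen point concrete, here is a minimal sketch that needs no external library. It assumes only standard-library behavior: Bytes.new zero-fills its memory, the pointer form of String.new stops at the first null byte, and the slice form copies every byte it is given. The variable names are purely illustrative.

```crystal
# Six zero-initialized bytes: five for "hello" plus one terminating null byte.
buffer = Bytes.new(6)
"hello".to_slice.copy_to(buffer) # buffer[5] stays 0

# Pointer form: scans for the first null byte (strlen-style), so the terminator is excluded.
from_pointer = String.new(buffer.to_unsafe)

# Slice form: copies all bytes of the slice, terminator included.
from_slice = String.new(buffer)

puts from_pointer.bytesize # prints 5
puts from_slice.bytesize   # prints 6
```

So a buffer handed to the pointer form has to be null-terminated, while the slice form keeps whatever terminator is inside the slice as part of the string.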
If you need to pass a string to a library, just pass it. String already has to_unsafe, which returns a pointer, and strings are null-terminated in Crystal.
@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
fun upcase(name : UInt8*) : UInt32
end
str = "hello crystal"
LibUpcase.upcase(str)
puts str
The only problem is that you must be sure the library does not change the length of the string. If it does, things will go bad.
So better use
In general you should use the LibC types to define common types (like Short, Long, etc.), because their sizes can actually differ depending on the system. The LibC types are just aliases that are controlled by a macro in some situations.
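As a rough illustration of declaring a binding with LibC aliases instead of fixed-width types, the sketch below binds strlen from the C standard library, which Crystal programs already link against. LibStrlenSketch is a hypothetical name chosen for this example; it is not part of the upcase library discussed in this thread.

```crystal
lib LibStrlenSketch
  # C prototype: size_t strlen(const char *s);
  # LibC::Char and LibC::SizeT resolve to the right widths for the target platform.
  fun strlen(s : LibC::Char*) : LibC::SizeT
end

# A String argument is converted through String#to_unsafe automatically.
len = LibStrlenSketch.strlen("hello crystal")
puts len # prints 13
```

Under the same convention, the upcase binding would presumably take LibC::Char* and return LibC::UInt, assuming the C function is declared with char * and unsigned int.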
If you need to pass a string to a library, just pass it. String already has to_unsafe, which returns a pointer
Sorry, I realize I left out an important bit of info about why I’m asking.
LibUpcase.upcase(str) modifies str, and the code you shared fails. I assume this is a segfault because the string is a literal and so is compiled into the data portion of the binary:
Invalid memory access (signal 11) at address 0x230eec
[0x26cf36] *Exception::CallStack::print_backtrace:Nil +118 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x25c626] ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil) +310 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x822a9eb6e] pthread_sigmask +1358 in /lib/libthr.so.3
[0x822a9e11f] pthread_setschedparam +2111 in /lib/libthr.so.3
[0x7ffffffff2d3] ???
[0x2e6b1b] upcase +43 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x24c52e] __crystal_main +1070 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x2e69f6] *Crystal::main_user_code<Int32, Pointer(Pointer(UInt8))>:Nil +6 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x2e696a] *Crystal::main<Int32, Pointer(Pointer(UInt8))>:Int32 +58 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x259ad6] main +6 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
*** Error code 11
So I think I need a heap string - just like how you would strdup("hello") in C to get a string that you can modify.
strings are null-terminated in Crystal
Ah, very interesting. Is that documented anywhere? I was poking through the code and thought that this line might be doing it. But I haven’t been able to verify it, and haven’t been able to inspect the underlying bytes (aside from #bytes, which appears to truncate the null byte if there is one).
Strings in Crystal are supposed to be immutable, and using C functions that don’t respect that could bite you pretty hard if they are used together with something that relies on that assumption. So if possible, I’d recommend limiting the use of that category of C functions to instances of Slice rather than String.
That makes sense. I still have the issue of static / dynamic string. So here’s what this looks like, passing a Slice instead of a String. String.new("hello crystal".to_slice).to_slice is a bit weird, but so far String.new("hello crystal".to_slice) is the only way I’ve found to produce a mutable reference.
@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
fun upcase(name : UInt8*) : UInt32
end
str = String.new("hello crystal".to_slice).to_slice
LibUpcase.upcase(str)
puts String.new(str)
yes, checked it myself.
Yes, Strings are supposed to be immutable, so perhaps the string literal was placed in a read-only section.
For your case you can do:
@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
fun upcase(s : UInt8*) : UInt32
end
def wrap_upcase(s : String) : String
  slice = Slice(UInt8).new(s.bytesize + 1) # allocate new (mutable) slice
  s.to_slice.copy_to(slice) # copy String content to it
  slice[s.bytesize] = 0 # null-terminate it :)
  LibUpcase.upcase(slice) # the lib declared above is LibUpcase
  String.new(slice[0, s.bytesize]) # this will perform one more copy (leaving out the trailing null byte)
end
puts wrap_upcase("hello crystal")
Note that this copies the string twice: once into the slice, and once from the slice into the returned string. If performance is critical, you can use some black magic:
@[Link("#{__DIR__}/project1")]
lib Lib1
fun upcase(s : UInt8*) : UInt32
end
def wrap_upcase(s : String) : String
String.new(s.bytesize) do |buffer|
buffer.copy_from(s.to_unsafe, s.bytesize)
Lib1.upcase(buffer)
{s.bytesize, s.size}
end
end
puts wrap_upcase("hello crystal")
But that can actually break if strings aren’t null-terminated. I was sure they are, and this has always worked, but I’ve never checked whether it is documented anywhere.
Alright so taking all these responses into consideration, I wrote a simple CString class. I like how this looks:
@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
fun upcase(name : UInt8*) : UInt32
end
class CString
@bytes : Slice(UInt8)
def initialize(str : String)
@bytes = Slice(UInt8).new(str.size + 1)
str.to_slice.copy_to(@bytes)
@bytes[-1] = 0
end
def to_unsafe
@bytes.to_unsafe
end
def to_s(io)
io << String.new(@bytes)
end
end
str = CString.new("hello crystal")
LibUpcase.upcase(str)
puts str
It’s not safe for UTF, intermediate null characters, etc etc - but as a basic mechanism for encapsulating a mutable string to pass to C, it gets the job done.
There is a bug in your code: str.size + 1 is not correct. String#size returns the number of UTF-8 codepoints. You should use str.bytesize + 1 instead.
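A small sketch of the difference, using an escaped non-ASCII character so the byte count does not depend on how the source file is encoded:

```crystal
ascii = "hello"
accented = "h\u00E9llo" # U+00E9 takes two bytes in UTF-8

puts ascii.size        # prints 5
puts ascii.bytesize    # prints 5
puts accented.size     # prints 5
puts accented.bytesize # prints 6
```

With str.size + 1, the buffer for the second string would be one byte too small to hold its contents plus a terminator.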
Another issue is the implementation of to_s. It’s super inefficient to allocate a new string every time you want to stringify a CString. You can use io.write_slice(@bytes).
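Putting both points together, a revised wrapper might look roughly like the sketch below. CStringSketch is a hypothetical name, and the sketch writes through IO#write, which accepts Bytes, as one way to avoid the intermediate String. The C call is replaced by a direct write through the pointer so the example runs without the upcase library.

```crystal
class CStringSketch
  @bytes : Bytes

  def initialize(str : String)
    # bytesize counts bytes, not characters. Bytes.new zero-fills,
    # so the extra last byte already acts as the null terminator.
    @bytes = Bytes.new(str.bytesize + 1)
    str.to_slice.copy_to(@bytes)
  end

  # Lets an instance be passed directly to a C function expecting UInt8*.
  def to_unsafe : UInt8*
    @bytes.to_unsafe
  end

  # Writes the bytes straight to the IO, leaving out the terminator.
  def to_s(io : IO) : Nil
    io.write(@bytes[0, @bytes.size - 1])
  end
end

cstr = CStringSketch.new("h\u00E9llo")
cstr.to_unsafe[0] = 'H'.ord.to_u8 # stand-in for a C function mutating the buffer
puts cstr # prints Héllo
```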
I’m not sure this type is very useful though. You should be fine with using String as long as you make sure that any instance you pass to the C function is owned only by the code that uses it and there are no other references to it. If that’s the case, it should be no issue to alter the string’s contents.
The benefit of that is that the result is just a regular string.
An alternative would be to construct a new string with the output of the function.
Something like this:
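def upcase(orig)
  String.new(orig.bytesize + 1) do |buffer| # one extra byte of capacity for the terminator
    buffer.copy_from(orig.to_unsafe, orig.bytesize) # buffer is a Pointer(UInt8): copy_from takes a pointer and a count
    buffer[orig.bytesize] = 0_u8 # null-terminate before handing the buffer to C
    LibUpcase.upcase(buffer)
    {orig.bytesize, 0} # bytes written; 0 means the character count is unknown
  end
end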