A couple questions about C integration

Hi there, I’m brand new to Crystal, and looking into it primarily for integrating with C libraries. I’ve put together a couple examples, and I have some questions. Note that it’s not about the implementation (i.e. I know Crystal can puts and upcase), but rather the interaction of data between Crystal and C.

(solved) Q1: How do I link to libraries in a relative path?

Linking against a C library in a relative path, I found that I needed to use __DIR__:

libpath = Path["../hello-world-c/_build"].expand()
linkargs = "-L#{libpath}"

#@[Link(lib: "hello", ldflags: linkargs)] # Error: 'ldflags' link argument must be a String
#@[Link(lib: "hello", ldflags: "-L../hello-world-c/_build")] # ld: error: unable to find library -lhello
@[Link(lib: "hello", ldflags: "-L#{__DIR__}/../hello-world-c/_build")] # SUCCESS
lib LibHello
  fun hello_world = helloWorld()
end

LibHello.hello_world()

(maybe solved?) Q2: How do you work with C strings?

In this next example, I have a C library that upcases a string.

The C library:

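#include <ctype.h>

/* Upcase a null-terminated string in place. */
int upcase(char *s) {
  while (*s) {
    /* toupper expects a value representable as unsigned char */
    *s = (char)toupper((unsigned char)*s);
    s++;
  }
  return 1;
}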

Crystal:

@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
  fun upcase(name : UInt8*) : UInt32
end

str = String.new("hello crystal\0".to_unsafe)
LibUpcase.upcase(str)
puts str

I think that’s as good as I’m going to get… but I’m curious if others would do it differently.

I think my upcase example is incorrect, or at least exhibits undefined behavior. I say this because String.new(chars : UInt8*) calls LibC.strlen under the hood - meaning that the UInt8* passed to LibUpcase.upcase isn’t guaranteed to be null-terminated. At least, I don’t see anything under the hood that null-terminates it.

So I think the correct way looks like this:

@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
  fun upcase(name : UInt8*) : UInt32
end

str = String.new("hello crystal\0".to_slice)
LibUpcase.upcase(str)
puts str[0..-2]

If you need to pass a string to a library, just pass it. String already has to_unsafe, which returns a pointer, and strings are null-terminated in Crystal.

@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
  fun upcase(name : UInt8*) : UInt32
end
str = "hello crystal"
LibUpcase.upcase(str)
puts str

The only problem is that you must be sure the library does not change the length of the string. If it does, things will go bad.
So it is better to use

str = "hello crystal"
ptr = str.to_unsafe
LibUpcase.upcase(ptr)
str = String.new(ptr)
puts str

I would recommend using LibC::Char* as a type instead. It should allow String to be used in its place automatically.

Example:
Usage: raylib-cr/examples/rlgl_solar_system/src/rlgl_solar_system.cr at master · sol-vin/raylib-cr · GitHub
Definition: raylib-cr/src/raylib-cr/raylib.cr at master · sol-vin/raylib-cr · GitHub

In general you should use the LibC types to define common types (like Short, Long, etc.) because they can actually differ depending on the system: the LibC types are just aliases, and in some cases a macro selects what they alias.

From the Crystal Source: crystal/src/lib_c.cr at master · crystal-lang/crystal · GitHub
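
As a small illustration of the LibC types, here is a sketch that binds toupper from the C standard library. The lib name LibCtype is made up for this example; LibC::Int and LibC.strlen come from Crystal's own libc bindings, and no extra @[Link] is assumed to be needed because libc is linked by default.

```crystal
# Hypothetical binding for illustration; `toupper` itself is the libc
# function declared in <ctype.h> as `int toupper(int c)`.
lib LibCtype
  fun toupper(c : LibC::Int) : LibC::Int
end

# LibC::Int follows the platform's C `int`, so there is no width to guess.
upper = LibCtype.toupper('a'.ord)
puts upper.chr # A

# LibC.strlen takes a LibC::Char*. Passing a String works because the
# compiler calls String#to_unsafe on arguments to C functions.
puts LibC.strlen("hello crystal") # 13
```

With the upcase example from this thread, the same idea would mean declaring the parameter as LibC::Char* and the return type as LibC::Int rather than UInt32, since the C function returns int.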

If you need to pass a string to a library, just pass it. String already has to_unsafe, which returns a pointer

Sorry, I realize I left out an important bit of info about why I’m asking.

LibUpcase.upcase(str) modifies str - and the code you shared fails. I assume this is a segfault in Crystal, because the string is a literal and so is compiled into the data portion of the binary:

Invalid memory access (signal 11) at address 0x230eec
[0x26cf36] *Exception::CallStack::print_backtrace:Nil +118 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x25c626] ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil) +310 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x822a9eb6e] pthread_sigmask +1358 in /lib/libthr.so.3
[0x822a9e11f] pthread_setschedparam +2111 in /lib/libthr.so.3
[0x7ffffffff2d3] ???
[0x2e6b1b] upcase +43 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x24c52e] __crystal_main +1070 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x2e69f6] *Crystal::main_user_code<Int32, Pointer(Pointer(UInt8))>:Nil +6 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x2e696a] *Crystal::main<Int32, Pointer(Pointer(UInt8))>:Int32 +58 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
[0x259ad6] main +6 in /usr/home/patmaddox/src/ffi-adventure/upcase/upcase-crystal/_build/upcase
*** Error code 11

So I think I need a heap string - just like how you would strdup("hello") in C to get a string that you can modify.

strings are null-terminated in Crystal

Ah very interesting. Is that documented anywhere? I was poking through the code and thought that this line might be doing it. But I haven’t been able to verify it, and haven’t been able to identify the underlying bytes (aside from #bytes which appears it would truncate the null byte if there is one).

Strings in Crystal are supposed to be immutable, and using C functions that don’t respect that could bite you pretty hard if they are used together with something that relies on that assumption. So if possible I’d recommend limiting the usage of that category of C functions to instances of Slice rather than String.

That makes sense. I still have the issue of static / dynamic string. So here’s what this looks like, passing a Slice instead of a String. String.new("hello crystal".to_slice).to_slice is a bit weird, but so far String.new("hello crystal".to_slice) is the only way I’ve found to produce a mutable reference.

@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
  fun upcase(name : UInt8*) : UInt32
end

str = String.new("hello crystal".to_slice).to_slice
LibUpcase.upcase(str)
puts String.new(str)

yes, checked it myself.
Yes, Strings are supposed to be immutable, so perhaps the string literal was placed in a read-only section.
For your case you can do:

@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
  fun upcase(s : UInt8*) : UInt32
end

def wrap_upcase(s : String) : String
  slice = Slice(UInt8).new(s.bytesize + 1) # allocate new (mutable) slice
  s.to_slice.copy_to(slice) # copy String content to it
  slice[s.bytesize] = 0 # null-terminate it :)
  LibUpcase.upcase(slice)
  String.new(slice[0, s.bytesize]) # one more copy, without the trailing null byte
end

puts wrap_upcase("hello crystal")

Note that this copies the string twice: once into the slice and once from the slice into the returned string. If performance is critical, you can use some black magic:

@[Link("#{__DIR__}/project1")]
lib Lib1
  fun upcase(s : UInt8*) : UInt32
end

def wrap_upcase(s : String) : String
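  String.new(s.bytesize + 1) do |buffer| # one spare byte for the terminator
    buffer.copy_from(s.to_unsafe, s.bytesize)
    buffer[s.bytesize] = 0_u8 # buffer starts uninitialized; C scans for the null byte
    Lib1.upcase(buffer)
    {s.bytesize, s.size}
  end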
end

puts wrap_upcase("hello crystal")

But that can actually break if strings aren’t null-terminated. I was sure they are, and this has always worked, but I’ve never checked whether it is documented anywhere.

Alright so taking all these responses into consideration, I wrote a simple CString class. I like how this looks:

@[Link(lib: "upcase", ldflags: "-L#{__DIR__}/../upcase-c/_build")]
lib LibUpcase
  fun upcase(name : UInt8*) : UInt32
end

class CString
  @bytes : Slice(UInt8)

  def initialize(str : String)
    @bytes = Slice(UInt8).new(str.size + 1)
    str.to_slice.copy_to(@bytes)
    @bytes[-1] = 0
  end

  def to_unsafe
    @bytes.to_unsafe
  end

  def to_s(io)
    io << String.new(@bytes)
  end
end

str = CString.new("hello crystal")
LibUpcase.upcase(str)
puts str

It’s not safe for UTF, intermediate null characters, etc etc - but as a basic mechanism for encapsulating a mutable string to pass to C, it gets the job done.

crystal_lib is my favorite, it saves me so much time in integration work with C libraries.

nice, thanks for sharing that

There is a bug in your code: str.size + 1 is not correct. String#size returns the number of Unicode codepoints, not the number of bytes. You should use str.bytesize + 1 instead.
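
To make the difference concrete, here is a short sketch comparing the two methods on an ASCII string and on a string with one multi-byte character. The sample strings are arbitrary.

```crystal
ascii = "hello"
accented = "héllo" # 'é' takes two bytes in UTF-8

puts ascii.size        # 5
puts ascii.bytesize    # 5
puts accented.size     # 5
puts accented.bytesize # 6

# A buffer sized with `size + 1` would be one byte short for the accented
# string once the trailing null byte is included.
needed = accented.bytesize + 1
puts needed # 7
```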

Another issue is the implementation of to_s. It’s super inefficient to allocate a new string every time you want to stringify a CString. You can use io.write(@bytes).
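
Putting those two points together, a revised CString might look like the sketch below. It sizes the buffer with bytesize and writes the bytes straight to the IO using IO#write, which accepts a Bytes. Leaving the trailing null byte out of the printed output is an extra choice made here, not something stated above.

```crystal
class CString
  @bytes : Bytes

  def initialize(str : String)
    # Bytes.new zero-fills, so the extra byte already is the null terminator.
    @bytes = Bytes.new(str.bytesize + 1)
    str.to_slice.copy_to(@bytes)
  end

  def to_unsafe : UInt8*
    @bytes.to_unsafe
  end

  def to_s(io : IO) : Nil
    # Write the raw bytes without allocating a String, minus the terminator.
    io.write(@bytes[0, @bytes.size - 1])
  end
end

str = CString.new("héllo crystal")
puts str # héllo crystal
```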

I’m not sure this type is very useful though. You should be fine with using String as long as you make sure that any instance you pass to the C function is owned only by the code that uses it and there are no other references to it. If that’s the case, it should be no issue to alter the string’s contents.
The benefit of that is that the result is just a regular string.
An alternative would be to construct a new string with the output of the function.
Something like this:

def upcase(orig)
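  String.new(orig.bytesize + 1) do |buffer| # one spare byte for the terminator
    buffer.copy_from(orig.to_unsafe, orig.bytesize)
    buffer[orig.bytesize] = 0_u8 # the buffer starts out uninitialized
    LibUpcase.upcase(buffer)
    {orig.bytesize, 0} # 0 = character count unknown, computed lazily
  end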
end

EDIT: I see @konovod suggested pretty much the same in A couple questions about C integration - #8 by konovod. This should be totally fine.
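
For anyone who wants to try the String.new(capacity) pattern without building the C library, here is a self-contained sketch. fake_upcase is a hypothetical pure-Crystal stand-in for the C function, and upcase_copy is a made-up name. The sketch asks for one spare byte and writes the terminator itself before the callee scans the buffer, on the assumption that the buffer is not zeroed when the block receives it.

```crystal
# Hypothetical stand-in for the C upcase(): walks a null-terminated buffer
# and upcases ASCII letters in place.
def fake_upcase(s : UInt8*) : Nil
  while s.value != 0
    s.value -= 32 if 97 <= s.value <= 122 # 'a'..'z'
    s += 1
  end
end

def upcase_copy(orig : String) : String
  String.new(orig.bytesize + 1) do |buffer|
    buffer.copy_from(orig.to_unsafe, orig.bytesize)
    # Terminate explicitly so the callee knows where to stop.
    buffer[orig.bytesize] = 0_u8
    fake_upcase(buffer)
    # {bytes used, character count}; 0 lets Crystal compute the count lazily.
    {orig.bytesize, 0}
  end
end

original = "hello crystal"
puts upcase_copy(original) # HELLO CRYSTAL
puts original              # hello crystal
```

The original string is left untouched because only the freshly allocated buffer is handed to the mutating function.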

Crystal strings are always null terminated to ensure C interop.
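
A quick way to check this yourself is to read the byte just past the string's content through its pointer. This is only a sketch that inspects two sample strings; it does not prove anything beyond them.

```crystal
str = "hello"
ptr = str.to_unsafe

# The byte just past the last character is the terminator.
puts ptr[str.bytesize] # 0

# Strings built at runtime are terminated too.
built = String.build { |io| io << "hel" << "lo" }
puts built.to_unsafe[built.bytesize] # 0

# The terminator is not part of the string's visible content.
puts str.bytesize   # 5
puts str.bytes.size # 5
```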