Embeddable / Interoperable with ruby

Keep in mind that you have to ensure that neither Crystal nor Ruby randomly deletes objects from the respective other language.
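
To make the garbage-collection point a bit more concrete, here is a minimal sketch of the Crystal side only. The KeepAlive module and the Greeter class are hypothetical names invented for this illustration (they come neither from Anyolite nor from Ruby), and the sketch assumes Crystal's default conservative GC, which scans the set's storage for pointers:

```crystal
# Hypothetical registry: objects handed to Ruby as raw pointers stay
# referenced from Crystal, so the Crystal GC does not collect them.
module KeepAlive
  @@pinned = Set(Void*).new

  # Pin the object and return an opaque handle for the Ruby side.
  def self.pin(obj : Reference) : Void*
    handle = obj.as(Void*)
    @@pinned << handle
    handle
  end

  # Unpin once Ruby no longer needs the object (e.g. from a free callback).
  def self.unpin(handle : Void*) : Nil
    @@pinned.delete(handle)
  end

  def self.pinned?(handle : Void*) : Bool
    @@pinned.includes?(handle)
  end
end

# Hypothetical class standing in for whatever you expose to Ruby.
class Greeter
  def hello
    "hello from Crystal"
  end
end

handle = KeepAlive.pin(Greeter.new)
puts KeepAlive.pinned?(handle)  # true
puts handle.as(Greeter).hello   # the handle can be cast back to the object
KeepAlive.unpin(handle)
puts KeepAlive.pinned?(handle)  # false
```

The opposite direction (stopping Ruby from collecting objects that only Crystal references) has to be handled through Ruby's own C API and is not shown here.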

I still think that Anyolite might be the faster solution for you here, since you don’t have to write all the Ruby bindings, wrappers and type conversions again. The only disadvantage is that you have to write a modified MRI interpreter with Anyolite (which takes only a handful of lines of code), but you can install regular gems for it and you can include any Crystal library with Anyolite.

If you really want to use your approach, you can still require Anyolite to at least get the infrastructure and wrappers. No need to reinvent the whole wheel.

Hm :thinking: Your approach is almost identical to the one I had - the only difference I can spot is when you call __crystal_init. You do it when you initialise the Ruby extension; I did it whenever I was about to call any Crystal code for the very first time. I don’t see why one should work while the other shouldn’t - but yours obviously does, so I’ll give it another try (also: your approach is far more convenient than tracking and checking whether it has already been initialised, like I did).
EDIT: I just tried your code (first I adjusted mine to call things at the same time as you do, but as this didn’t work either, I directly copy pasted things from here), and it still gives me the same segmentation fault. So I guess it’s actually really related to something else (as the few gems I use which have a C extension, don’t have a problem, I suspect it’s one of the libraries crystal requires which don’t do well with my ruby version).

Irrelevant for this, but it confused me a lot, and maybe you experienced something similar (or can explain why this happened): With all the playing around, I once accidentally picked the wrong c file, which also had a main function. I was sure it would fail (because of duplicate symbols and my crystal file obviously had already a main included) but to my surprise it didn’t fail, it didn’t care about it at all. Do static dylibs handle duplicate symbols differently? And which one would get picked - is there a rule, undefined behaviour, or would it crash?

Similarly: I also tried for a bit to remove symbols. Not that I had any reason to do so. I am new to all the linking stuff, so maybe I would learn something along the way, and in the worst case I would just waste some time. So I played around with the strip tool and removed several sections. To my surprise, even when I chose to remove pretty much everything, with every available option I found, things still worked seemingly fine, and I was able to shave approximately another 20% off my files (even though I had already used things like --no-debug for the Crystal file before). nm -C showed almost no symbols (for the Crystal file there were maybe 6 or 8 left), but the linker accepted the files anyway without any complaints and things still worked just like before. At first I thought the symbols were maybe just not advertised anymore or something like that, but as the file size decreased so significantly, I guess quite a lot of the stuff must really have been removed.

Regarding extern "C": yes and no. In the example, you are absolutely right, it would fully do the job. But when you do it with class methods, you already overstep a line (you can’t declare them both static and extern). You also can’t just declare them outside of the class: you can’t simply write static void Foo::bar(); it needs to be inside of class Foo { … but there you can’t mark it extern "C", and you aren’t allowed to put the whole class into an extern "C" block either. So for those you would have to deal with the name mangling anyway, and then I preferred to do it for everything and have it all the same way.

The following doesn’t work, but I think it’s just that I am messing something up within Crystal (likely with the proc). But maybe you can get it to work.

I think this would be a HUGE improvement:

@[Link(ldflags:"-dynamic -bundle -framework Ruby")]

lib LibRuby

  # next line needs maybe already adjusting
  fun rb_define_global_function(LibC::Char*, VALUE, LibC::Int) : Void
end

alias VALUE= Pointer(Void)

# maybe this is already a main issue
# What I wanted is, that Qnil just points to address 8, 
# did I get this one right? Otherwise it might be just this.
Qnil=Pointer(Void).new(8)

# this seems to work already fine!
fun rubyInit="Init_foobar"
  main 0, Pointer(Pointer(UInt8)).null

  # here the code still works
  puts "Yupp, still working…"

  # TopLevel code is run BEFORE Init_foobar
  # so let's add the ruby methods within the init (but seems not to matter)
  addGlobalMethod "foo", 0 do
    puts "currently you won't see this line :-("
    Qnil
  end
end

# this might need adjusting
def addGlobalMethod(name, x, &b)
  LibRuby.rb_define_global_function name, Box.box(b), x
end

# if you add code here, it still works
puts "Not crashed yet!"

When you require "foobar" in Ruby, you get all the expected output.
But as soon as you call foo within Ruby, you get:

Invalid memory access (signal 10) at address 0x10fd1aea0
[0x10fe33a8b] *Exception::CallStack::print_backtrace:Nil +107 in /private/tmp/crrb/foobar.bundle
[0x10fe18f00] ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil)@/opt/crystal/src/signal.cr:127 +304 in /private/tmp/crrb/foobar.bundle
[0x7ff80b227e2d] _sigtramp +29 in /usr/lib/system/libsystem_platform.dylib

I wouldn’t be surprised if it’s all just caused by my proc. Or by my pointers. Or both.

Adding foo in general, also works fine, ruby will complain about an incorrect amount of args if you use intentionally the wrong amount.

It would be just a bit of static boilerplate, which could be just loaded from a shard (but even there it would be far less code than an average shard comes with).
Otherwise there is almost no overhead (the few Ruby constants and type translators would need to be added, but this would be just once and they could be done in a day or two).

If we get this working, I believe it would result in a huge boost for both Crystal and Ruby.

You have to be cautious with type safety here. For example, the fun line is the reason for your crashes. You need to explicitly pass the information that the function requires a Proc, for example:

lib LibRuby
  alias VALUE = Pointer(Void)
  fun rb_define_global_function(LibC::Char*, Proc(VALUE), LibC::Int) : Void
end

Then you can also drop the boxing in addGlobalMethod (although I needed to specify the Proc type for some reason):

Qnil = Pointer(Void).new(8)

def addGlobalMethod(name, x, &b : Proc(LibRuby::VALUE))
  LibRuby.rb_define_global_function name, b, x
end

Now, the part

addGlobalMethod("foo", 0) do 
  puts "You WILL see this"
  Qnil
end

should totally work.
Note that the definitions of VALUE and Qnil are still not quite faithful to the actual Ruby implementation, but this does not matter here.
Also the specific type of Proc you want to pass to rb_define_global_function might vary depending on the arguments, so you need to find a solution for that.

I can once again only recommend Anyolite here, since it provides all these methods (with rb_define_global_function specifically missing, because you usually want to wrap entire modules anyway - but you can add it without problems) and deals with most of the problems.

What you did can be done easily with:

require "anyolite"

module Bar
  def self.foo
    puts "You WILL see this"
    nil
  end
end

Anyolite::RbInterpreter.create do |rb|
  Anyolite.wrap rb, Bar
  Anyolite.eval "Bar.foo"
  # You could also run a script passed as a command line argument.
  # This way, this program can serve as a full Ruby interpreter.
  # It can also use gems if it was compiled using MRI!
end

Thanks!! I’ll give it a try :star_struck:

Can you maybe roughly explain, why this works? I had it as VALUE/boxed Proc intentionally, because that’s how I understood the ruby source, as it would expect me to provide the pointer to the proc (which I hoped I would do the way I did). I originally just copied the line from ruby’s source and edited it until crystal accepted it, too. I am more than happy if I don’t have to box it (and the same reasoning goes for VALUE and Qnil).

By boxing the Proc, you created a pointer pointing to the Proc (which also is a pointer).

This is acceptable for your fun definition, since it only requires a VALUE, which you defined as an alias for Pointer(Void), but in reality, you’d need the Proc itself instead of the pointer to the Proc.

Essentially, you passed the pointer to a pointer instead of a pointer - which resulted in the segmentation fault.

However, in Crystal, Procs are not Void pointers, so you’d either need to cast them using b.as(Pointer(Void)) or something similar (absolutely unsafe), or take the safe way and explicitly declare the function argument to be a Proc(VALUE) (a Proc with no arguments returning a VALUE).

The boxing is only necessary if you want to store some objects in memory to pass them to a C function explicitly accepting Void pointers.

There are very few use cases for this, one of them being the option to pass information to callbacks (see the Proc(*T, R) documentation for Crystal 1.5.0), but this is not the case here, since the Ruby method callbacks already allow for arbitrary arguments.
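
For contrast, here is a small sketch of the callback case where boxing is the right tool. call_with_data is a plain Crystal method that only stands in for a C function taking a callback plus a user-data pointer; it is made up for this illustration and not part of Ruby's or any other library's API:

```crystal
# Hypothetical stand-in for a C function that accepts a callback and an
# opaque user-data pointer. A real binding would be a `fun` inside a `lib`.
def call_with_data(callback : Proc(Void*, Int32), data : Void*) : Int32
  callback.call(data)
end

offset = 40
# This proc captures `offset`, so it is a closure and cannot be handed
# to C directly as a bare function pointer.
closure = ->(x : Int32) { x + offset }

# Box the closure so it can travel through the Void* parameter.
boxed = Box.box(closure)

# The trampoline captures nothing (typeof is resolved at compile time),
# so it is the kind of proc a C function could accept.
trampoline = ->(data : Void*) {
  Box(typeof(closure)).unbox(data).call(2)
}

puts call_with_data(trampoline, boxed) # 42
```

With a real C library you would also have to keep `boxed` referenced somewhere for as long as the C side might call back, otherwise the GC could collect it.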

Thanks! TIL: I’ve been using boxing absolutely wrong the whole time (I use it whenever I want to get a pointer and Crystal tells me I can’t cast something into a pointer).

Which, if you don’t mind would be my next question:

Why .as and not .cast ? I mean, yes obviously, because your as works, while my cast didn’t work. But… why? :thinking: Or maybe even HOW (as in "How did you even come to choose .as - I didn’t even think of it. I checked the api, typed in the search what I wanted to do, “oh, look there is a cast, let’s take it”)

It was just an example, it doesn’t even work that way (just tested it), since Procs are technically not actually pointers in Crystal (only in C).

The .as method is just Crystal’s way of casting values (see the Object documentation for Crystal 1.5.0), which also works for pointers (there it is comparable to the classic C cast or a reinterpret_cast from C++).

Secret addendum: You could technically circumvent that by using boxing with Box.box(b).as(Pointer(Pointer(Void))).value, but please don’t ever do that, because it goes against everything Crystal stands for (and will likely cause problems as soon as the boxed pointer gets GC’d) :smiley:

TIL so much! Thanks!

I keep coming back to this idea of having a JSON-per-line protocol to allow Ruby code to spawn a Crystal command, and then send JSON RPC messages to the Crystal command via stdin and receive structured data back from the Crystal command via stdout. You could then automatically map the JSON response values into Ruby Value objects to give the illusion that you were working with native Ruby classes. RPC/RMI seems like a quicker way of bridging Crystal to Ruby without trying to shoehorn Crystal’s runtime into Ruby’s runtime.
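
A rough sketch of what the Crystal end of such a JSON-per-line protocol could look like. The message shape (id, method, params, result, error) and the single "add" method are assumptions made up for this example, not an existing protocol or library:

```crystal
require "json"

# Hypothetical dispatcher: takes one request line, returns one response line.
def handle_line(line : String) : String
  request = JSON.parse(line)
  id = request["id"]?
  case request["method"]?.try(&.as_s?)
  when "add"
    # Sum an array of integers passed as "params".
    numbers = request["params"].as_a.map(&.as_i)
    {id: id, result: numbers.sum}.to_json
  else
    {id: id, error: "unknown method"}.to_json
  end
rescue ex
  # Malformed JSON, missing keys or wrong types all end up here.
  {id: nil, error: ex.message}.to_json
end

# One request per input line, one response per output line.
STDIN.each_line do |line|
  STDOUT.puts handle_line(line)
  STDOUT.flush # flush so the Ruby parent sees each reply immediately
end
```

The Ruby side would spawn this binary, write one JSON line to its stdin per call and read one line back from its stdout. Mapping the reply onto Ruby objects is then ordinary Ruby code with no shared runtime involved.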

This entire thread is awesome.

Semi related, if we’re able to compile crystal into a binary that can now be treated as a library that C can import and use, could we also have crystal import the compiled binary the same way?

I’m wondering if this could potentially be used to cut down on compile times, by pre-compiling certain bundles of code and exposing a C-like interface for them. It would limit the types that could be shared between the caller crystal and the callee crystal, but maybe still useful?

I lean towards “no”. It might work in a few very specific edge cases (and with a lot of extra work to get it maybe somewhat working), but normally absolutely not.

Already problematic, but closer to what you are trying to achieve, would be reusing some of the previously built object files.

Closest to “doable” might be to add some feature to a prebuilt application (which would have to be built for that purpose upfront). So you probably could go and somewhat compile an incomplete template and just add specific add-ons to it. (And then also “replace” an older add-on with a newer add-on later). This wouldn’t work in general but in some cases.

But this all is just my guess(!) - I might be absolutely wrong, it’s just what I would expect of what I have seen so far.

The add-on stuff would be preferably in a complete new class, and “old” classes should remain untouched

Awesome idea!

I ran some tests based on @anon69898395’s samples, and I got something “working”
(at least for anything that doesn’t use the Crystal lib).

Assume we have a file1.cr that requires file2.cr, but we don’t want to recompile file2 if we only modify file1.

The file2 could be:

# file2.cr

fun foo(i : LibC::Int) : LibC::Int
  i + 1
end

fun throwDice() : UInt8
  1_u8 + Random::Secure.rand(6)
end

We compile it with crystal build file2.cr -o file2 --cross-compile --target "x86_64-unknown-linux-gnu", which gives us a file2.o.

Then file1 just has to be:

# file1.cr
@[Link(ldflags:"#{__DIR__}/file2.o -Wl,--allow-multiple-definition")]
lib LibB
  fun foo(LibC::Int) : LibC::Int
  fun throwDice() : UInt8
end

puts LibB.foo(1)
puts "Alea iacta est:  #{LibB.throwDice}!\n"

and we can compile it normally with crystal file1.cr.

It doesn’t actually need to go through C.
Note that I used allow-multiple-definition because symbols like __crystal_main, __crystal_malloc64 and others are defined in both file1 and file2. I guess this doesn’t cause problems as long as this set of functions stays the same, but I don’t think that will hold if some constants or types are reopened.

But it’s obviously not that simple: the slightest puts in file2 crashes.
To work around that, I thought of sharing some context between the two files.
For example, in file2 I defined a global context that holds file1’s STDOUT, so that file1 can initialize it through a fun.
Then I redefined puts to use it.

Like that:

# file2.cr
struct Context
  getter stdout = STDOUT
end

module Global
  @@ctx = uninitialized Context
  class_property ctx
end

fun init(ctx : Void*) : Void
  Global.ctx = ctx.as(Context*).value
end

def puts(*object)
  Global.ctx.stdout.puts *object
end


fun foo(i : LibC::Int) : LibC::Int
  i.times do
    puts "foo!"
  end
  puts "foo!!!:"
  i + 1
end

fun throwDice() : UInt8
  1_u8 + Random::Secure.rand(6)
end

file1 just has to initialize file2 by passing its own STDOUT.

# file1.cr
@[Link(ldflags:"#{__DIR__}/file2.o -Wl,--allow-multiple-definition")]
lib LibB
  fun foo(LibC::Int) : LibC::Int
  fun throwDice() : UInt8
  fun init(Void*) : Void
end

struct Context
  getter stdout = STDOUT
end

ctx = Context.new

LibB.init(pointerof(ctx))

puts LibB.foo(1)
puts "Alea iacta est:  #{LibB.throwDice}!\n"

Then it works!

It still breaks if we allocate something with the GC, but maybe it’s possible to do something similar to share the GC and everything else that needs to be shared. Then we would only have to override some methods in the stdlib (a special prelude for sub-files could be defined).

I don’t know if this can really work any further. I’m very skeptical about that allow-multiple-definition. Maybe all of this is wrong. But the thing seems to be possible, at least theoretically.

I investigated such a possibility a year ago.
Disclaimer: I’m a ruby dev, so compilation, linkage and so on is not what I’m experienced in.

As far as I understand, methods in Crystal work like templates in C++: if a method is called with some arguments, a new instantiation is created for the types of those arguments.
Partial compilation and recompilation is therefore possible only at the method level, not the file level. For example:

class Sample
  def self.sample(arg1, arg2)
    arg1+arg2
  end
end

Sample.sample(1,2) # method for (Int32, Int32) is created here

Then, if we add later:

Sample.sample(1.1, 2.2) # method for (Float, Float)

A new method instantiation will be generated. That new instantiation could be compiled into a separate library and linked without recompiling the previous code. If an instantiation is no longer in use, it can be removed by recompiling the old code (for the (Int32, Int32) case) or by removing the library (for the (Float64, Float64) case).

This would require extensive changes to the compiler. Existing binaries, both libraries and executables, contain the whole runtime.
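
The per-type instantiation described above can be observed with typeof. This is only a small illustration reusing the Sample class from this post; the String call is an extra case added for the example:

```crystal
class Sample
  # No type restrictions: the compiler types this method separately
  # for each combination of argument types it is called with.
  def self.sample(arg1, arg2)
    arg1 + arg2
  end
end

puts typeof(Sample.sample(1, 2))     # Int32
puts typeof(Sample.sample(1.1, 2.2)) # Float64
puts typeof(Sample.sample("a", "b")) # String
```

Each of these calls resolves to its own typed version of sample, which is why the unit of reuse would have to be the instantiation rather than the source file.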

Let’s take this code:

def foo(x)
  if x.is_a?(Int32)
    1
  else
    2
  end
end

foo(1 || 'a')
foo('a' || 1)

We can compile it like this:

crystal build --prelude=empty --emit llvm-ir foo.cr

That will give us the LLVM IR for that file, and specifically for what the generated foo function is.

How does the generated code check if x.is_a?(Int32)? The answer is that Crystal gives each type a unique ID: an integer.

Here’s the LLVM IR code for the generated foo function:

define internal i32 @"*foo<(Char | Int32)>:Int32"(%"(Char | Int32)" %x) #0 !dbg !21 {
alloca:
  %x1 = alloca %"(Char | Int32)", align 8, !dbg !22
  br label %entry

entry:                                            ; preds = %alloca
  store %"(Char | Int32)" %x, %"(Char | Int32)"* %x1, align 8, !dbg !22
  %0 = getelementptr inbounds %"(Char | Int32)", %"(Char | Int32)"* %x1, i32 0, i32 0, !dbg !23
  %1 = load i32, i32* %0, align 4, !dbg !23
  %2 = getelementptr inbounds %"(Char | Int32)", %"(Char | Int32)"* %x1, i32 0, i32 1, !dbg !23
  %3 = icmp eq i32 11, %1, !dbg !23
  br i1 %3, label %then, label %else, !dbg !23

then:                                             ; preds = %entry
  br label %exit, !dbg !23

else:                                             ; preds = %entry
  br label %exit, !dbg !23

exit:                                             ; preds = %else, %then
  %4 = phi i32 [ 1, %then ], [ 2, %else ], !dbg !23
  ret i32 %4, !dbg !23
}

It’s hard to grok, but there are these lines:

  %3 = icmp eq i32 11, %1, !dbg !23
  br i1 %3, label %then, label %else, !dbg !23

If “something” is 11, then jump to the “then” label, otherwise jump to the “else” label. That 11 is the ID Crystal assigned to the Int32 type.

Now let’s slightly change the original program:

# This type wasn't there before!
class Foo
end

def foo(x)
  if x.is_a?(Int32)
    1
  else
    2
  end
end

foo(1 || 'a')
foo('a' || 1)

Recompiling and checking the generated LLVM IR, we get this now for foo:

define internal i32 @"*foo<(Char | Int32)>:Int32"(%"(Char | Int32)" %x) #0 !dbg !21 {
alloca:
  %x1 = alloca %"(Char | Int32)", align 8, !dbg !22
  br label %entry

entry:                                            ; preds = %alloca
  store %"(Char | Int32)" %x, %"(Char | Int32)"* %x1, align 8, !dbg !22
  %0 = getelementptr inbounds %"(Char | Int32)", %"(Char | Int32)"* %x1, i32 0, i32 0, !dbg !23
  %1 = load i32, i32* %0, align 4, !dbg !23
  %2 = getelementptr inbounds %"(Char | Int32)", %"(Char | Int32)"* %x1, i32 0, i32 1, !dbg !23
  %3 = icmp eq i32 12, %1, !dbg !23
  br i1 %3, label %then, label %else, !dbg !23

then:                                             ; preds = %entry
  br label %exit, !dbg !23

else:                                             ; preds = %entry
  br label %exit, !dbg !23

exit:                                             ; preds = %else, %then
  %4 = phi i32 [ 1, %then ], [ 2, %else ], !dbg !23
  ret i32 %4, !dbg !23
}

Do you see what the comparison looks like now?

  %3 = icmp eq i32 12, %1, !dbg !23
  br i1 %3, label %then, label %else, !dbg !23

Now the compiler assigned 12 to the type ID of Int32, presumably because types were ordered alphabetically and Foo comes before Int32. Well, I’m not sure if that’s the reason, but what’s important to know is that the type ID of a given type isn’t guaranteed to be the same across different compilations.

If we compile two Crystal programs into shared libraries and load them both at the same time, one of the foo will win and override the other. When doing that, the logic for one of the programs will break, because if we pass an Int32 in that program the check x.is_a?(Int32) will be false, likely leading to a segfault.

This is the main reason creating shared libraries in Crystal isn’t possible right now. Well, it’s possible if you only use one shared library, and that’s it… which in my opinion is not very useful.

So if you run into segfaults or strange behavior when playing with this… you know why! You didn’t do anything wrong: it’s just not supposed to work at all.

If we want this to work, we first have to solve the problem of changing type IDs.
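
If you want to watch the IDs move without reading LLVM IR, something like the following might do. It leans on crystal_type_id, which as far as I know is an undocumented compiler primitive, so treat its availability as an assumption. Widget is a made-up class, and no concrete numbers are shown because they depend on the program and the compiler version:

```crystal
# Hypothetical class, only here so the program contains a user-defined type.
class Widget
end

# Print the type IDs the compiler assigned in THIS compilation.
# Add or remove unrelated classes, recompile, and compare the output.
puts "Int32:  #{1.crystal_type_id}"
puts "Char:   #{'a'.crystal_type_id}"
puts "Widget: #{Widget.new.crystal_type_id}"
```

If the numbers differ between two builds, that is the same effect as the 11 versus 12 in the IR above.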

But wouldn’t it be possible still for well defined interfaces of primitive values? That is, with the exact same restrictions as for C abi.

It’s not useful as a shared library in the usual way (“okay, for our application let’s use this library to deal with X, and this one to deal with something else, and let’s use that one, too…”).

BUT I find it extremely useful at a later stage, when your application is ready to run and be used in production. In most cases you have a fixed set of libraries you want to use, which is known before you install or run your application. At that point you pack all your Crystal code together and compile it into one library (probably containing a dozen somewhat unrelated shards). Especially for such “start and run for months/years” applications, whose libraries very rarely change, I think it’s absolutely amazing.

Well, those functions with primitive types will work well. But if those functions use other functions that don’t use primitive types, and use is_a?, multidispatch or other things that rely on type IDs, and such functions are also used in other Crystal shared libraries, things will break.

What does “pack all your crystal code together” mean? That’s essentially what the compiler does for you.

But if each part is compiled as a single module, wouldn’t each .o be self-consistent, including all the required stuff from the stdlib and such?
