The Crystal Programming Language Forum

Interpreter and use of MCJIT

Ok, this topic is pretty much way over my head, but I’m hoping to get a conversation started at least so it can be brought up with the people that know more about this.

My company uses Crystal in production heavily, and one thing we bring up on a weekly basis is having a native REPL. We have done some research on how this would work, and if it’s even possible. It looks like Julia is using LLVM, and they have a native REPL. One thing these LLVM languages have in common is an interpreted mode.

After doing some more research on other compiled languages that have interpreters, I’ve seen references to MCJIT come up. It looks like there was some talk in 2015 about using it with Crystal, but that was eventually closed out. I guess it is/was used in some parts of the language? Would MCJIT give us the ability to have an interpreted mode?

During the last Crystal live Q&A stream @RX14 had briefly mentioned that an interpreter was a possible goal for the future, but obviously there are a lot more pressing issues to worry about first.

What if the community started working on this? It could be a shard on the side to natively interpret Crystal and give us the ability to have a native REPL. What would this look like? Where would you even start? Are there already classes in Crystal that could handle most of the heavy lifting, or would this need to be written in C (or C++?) directly?

Here are a few other resources I’ve found:
Monkey Lang, which talks about writing an interpreter and a compiler, even with macros.

The problem with a REPL in Crystal is not to do with LLVM or the codegen side, it is purely to do with the type system.

An interpreter would be a complete rewrite of everything in the compiler except the first few stages (lex, parse and normalize). It’s doable. Maybe. Nobody in the core team will have the time to write it, but I’m more than happy to help a community effort.

Many compiled languages such as Rust, Java and Go do not have REPLs. The REPLs that are provided for those are hacks with some limitations. Crystal’s hacky REPLs have many more limitations than most, but we don’t consider a REPL essential. It’s certainly not worth maintaining two implementations of Crystal for.


I think making an interpreter for Crystal would be fantastic! For example Haskell is compiled with ghc but you can run ghci and run it (I think) in interpreted mode. Then you get the best of both worlds: static typing and compilation, and a pleasant REPL to work with.

That said, making a REPL for Crystal might be harder than for other languages. Julia is typed but types have a fixed layout that can’t be reopened or changed. Imagine in a Crystal REPL you could do a = Foo.new, then do require "foo_extension" which adds more instance variables to Foo. How would that work?
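
The layout problem can be sketched by analogy. This is not how the Crystal compiler represents objects; it uses Python's `struct` module as a stand-in for a fixed memory layout, and the names `FOO_V1`, `FOO_V2` and `read_foo` are made up for illustration. The assumption is that Foo starts with one Int32 instance variable and the extension adds a second one.

```python
import struct

# Hypothetical layouts: Foo before and after "foo_extension" reopens it.
FOO_V1 = struct.Struct("<i")   # one 32-bit instance variable
FOO_V2 = struct.Struct("<ii")  # the extension adds a second one


def read_foo(layout, memory):
    """Read an object from raw memory, or return None if the size does not match."""
    try:
        return layout.unpack(memory)
    except struct.error:
        return None


# `a = Foo.new` was allocated while only the old layout existed.
a = FOO_V1.pack(42)

print(read_foo(FOO_V1, a))  # code compiled before the extension still works
print(read_foo(FOO_V2, a))  # code compiled after it expects more bytes than `a` has
```

A REPL that keeps `a` alive across the `require` would have to either migrate the existing object to the new layout or refuse the change.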

I think the Julia REPL works like this: it receives an expression and then it goes and compiles that and all its dependencies to LLVM. Probably Julia can already compile individual modules to .o or some format that exposes an interface, and if not it just compiles the code on the fly. Then the result of the expression is a pointer to some value. When you pass it to another function you can be sure its shape is the one expected by the program. This is not true in Crystal, because of my “foo_extension” example above. Also, Crystal would have to retype everything and codegen everything again because there’s no notion of a module.

Another way to implement a Crystal REPL is by fully interpreting the language. The tricky parts, I think, are implementing the primitives (mainly Pointer), C bindings and inline assembly (probably just try to pass that to LLVM somehow). Dynamic library loading is also a pain point (I’m not sure how to interpret those @[Link] annotations, which can potentially link .o files). Then there are fibers and context switching, which happens in assembly.

I also wrote a small REPL some time ago for Crystal, using the second (interpreter) approach. It was really slow, but I’m sure it could have been optimized. I can’t remember why I quit, I think it was because of C bindings or something like that, not sure…

The second approach might be worth a try, to be done by the community. But you need to know how to do method resolution and how to define classes and methods, though there’s existing code for that in the compiler. I can help with input if someone starts this.


Yeah, I’m not too sure how Julia is doing it, but my main test was assigning the current time to a variable, waiting a few seconds, and then printing out the variable to see the time hadn’t changed. So it doesn’t re-compile between inputs (like what ICR is doing).
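
The difference that this time test exposes can be sketched roughly as follows. It is a toy model and not ICR's or Julia's actual code: `ReplayRepl` re-runs every line entered so far on each input (the approach the post attributes to ICR), while `PersistentRepl` keeps one live environment. All names here are hypothetical, and a counter stands in for the clock so the result is deterministic.

```python
import itertools


def make_clock():
    """Return a fake `now()` that yields 1, 2, 3, ... on successive calls."""
    ticks = itertools.count(1)
    return lambda: next(ticks)


def run(line, env):
    """Evaluate `line` as an expression if possible, otherwise execute it as a statement."""
    try:
        code = compile(line, "<repl>", "eval")
    except SyntaxError:
        exec(compile(line, "<repl>", "exec"), env)
        return None
    return eval(code, env)


class ReplayRepl:
    """Re-runs the whole input history in a fresh environment on every input."""

    def __init__(self):
        self.history = []
        self.now = make_clock()

    def feed(self, line):
        self.history.append(line)
        env = {"now": self.now}
        result = None
        for past_line in self.history:
            result = run(past_line, env)
        return result


class PersistentRepl:
    """Keeps one environment alive, so earlier results are never recomputed."""

    def __init__(self):
        self.env = {"now": make_clock()}

    def feed(self, line):
        return run(line, self.env)


for repl in (ReplayRepl(), PersistentRepl()):
    repl.feed("x = now()")
    print(type(repl).__name__, repl.feed("x"), repl.feed("x"))
```

In the replay version `x` changes between the two reads because `x = now()` runs again each time; in the persistent version it stays fixed, which is the behaviour observed in Julia.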

Are there any bits of code in Crystal now that already handle parsing code? Or would it literally need to start from the ground up?

There’s code to lex, parse and type code. There’s also code to define classes and methods in a program, and code to evaluate a Crystal program on the fly using the JIT (but the last I remember, it wasn’t working well with exceptions and dynamically loaded libraries, so some tweaks might be needed there).

When I implemented the interpreter I reused a lot of the compiler’s stack. So if someone is going to go the full-interpreter route (not jitting) then only the interpreter part (by means of an AST visitor) will be needed. Everything else already exists and is written in Crystal.
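
As a rough idea of what "the interpreter part by means of an AST visitor" means, here is a minimal tree-walking sketch. It does not use the Crystal compiler's classes; Python's built-in `ast` module stands in for the existing lexer and parser, and the `Interpreter` class and `interpret` helper are hypothetical names. It only handles assignment, variables, constants and four arithmetic operators.

```python
import ast
import operator


class Interpreter(ast.NodeVisitor):
    """Walks a parsed AST and evaluates it directly, with no code generation step."""

    BINARY_OPS = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
    }

    def __init__(self):
        self.variables = {}  # REPL state that survives between inputs

    def visit_Module(self, node):
        result = None
        for statement in node.body:
            result = self.visit(statement)
        return result

    def visit_Assign(self, node):
        value = self.visit(node.value)
        for target in node.targets:
            if not isinstance(target, ast.Name):
                raise NotImplementedError("only simple variable targets are supported")
            self.variables[target.id] = value
        return value

    def visit_Expr(self, node):
        return self.visit(node.value)

    def visit_BinOp(self, node):
        op = self.BINARY_OPS.get(type(node.op))
        if op is None:
            raise NotImplementedError(f"unsupported operator: {type(node.op).__name__}")
        return op(self.visit(node.left), self.visit(node.right))

    def visit_Name(self, node):
        return self.variables[node.id]

    def visit_Constant(self, node):
        return node.value

    def generic_visit(self, node):
        # Anything without a visit_* method above is rejected, not silently skipped.
        raise NotImplementedError(f"unsupported node: {type(node).__name__}")


def interpret(source, interpreter):
    """Parse one REPL input and evaluate it with the given interpreter."""
    return interpreter.visit(ast.parse(source))


repl = Interpreter()
interpret("a = 2 + 3", repl)
print(interpret("a * 4", repl))
```

The hard parts listed earlier in the thread (Pointer primitives, C bindings, fibers) would each need their own visitor cases, and they are exactly the pieces a sketch like this leaves out.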

The other route (jitting) is a bit harder (because of changing stuff) but it can also reuse most of the stack.

I know this is not a part of the language, but I use ICR all the time.


Same. I’m technically a maintainer on that project, so I’m very familiar with it. If we had a way to cache results in it, I think that would cover about 95% of my use cases. I tried coming up with a solution using a separate datastore, but since Crystal doesn’t have marshaling, it was very difficult to convert a string back into a value of an unknown type. Say we used Redis, for instance: you run x = Time.utc, and the value of x gets converted to JSON and shoved into Redis. The next time you reference x, ICR would look in Redis for the value and pull it back out. But it wouldn’t know that the value was an instance of Time, so it couldn’t convert it back. I think a Marshal might make that easier? (No clue, really.)
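
One way around the lost-type problem is to store a type tag next to the serialized data. The sketch below shows the idea only; it is not an existing Crystal or ICR feature. A plain dict stands in for Redis, `datetime` stands in for Time, and the `CODECS` registry plus the `dump` and `load` helpers are hypothetical.

```python
import json
from datetime import datetime, timezone

# Hypothetical registry: type tag -> (encode to a JSON-friendly value, decode back).
CODECS = {
    "datetime": (lambda value: value.isoformat(), datetime.fromisoformat),
    "int": (lambda value: value, int),
    "str": (lambda value: value, str),
}


def dump(value):
    """Serialize `value` together with a tag naming its type."""
    tag = type(value).__name__
    if tag not in CODECS:
        raise TypeError(f"no codec registered for {tag}")
    encode, _ = CODECS[tag]
    return json.dumps({"type": tag, "data": encode(value)})


def load(text):
    """Rebuild the original value by looking up the decoder for the stored tag."""
    payload = json.loads(text)
    _, decode = CODECS[payload["type"]]
    return decode(payload["data"])


store = {}  # stand-in for Redis
x = datetime(2019, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
store["x"] = dump(x)

restored = load(store["x"])
print(type(restored).__name__, restored == x)
```

The catch is that every type a user might assign in the REPL needs a registered codec, which is roughly the work a general Marshal facility would do for you.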

Your knowledge is way beyond mine. I hope people here who are more familiar with this will be able to help.
