Special request

Hi guys, I was thinking about a funny thing recently.

I would like to know: is it possible to merge two binaries into one? Or to merge a Crystal script (interpreted) with a Crystal binary, and then call the binary from the script once they are merged?

The question is a bit unclear, but here’s what I understand:

  1. The Crystal compiler can be used as a library.
    See: How can I use Crystal compiler as a library?

  2. Embedding an interpreter has been discussed, but I don’t know of any practical examples.
    Since Crystal’s interpreter is slow to start and run, some people embed Ruby instead.
    See: Anyolite 1.0.0

  3. Creating C ABI-compatible shared libraries in Crystal is difficult because function types depend on how they are called.
    A library compiled in one context may not be usable in another.
    See: Discussion between asterite and beta-ziliani

  4. As far as I know, there is no standard for a Crystal-specific ABI or shared library format beyond the C ABI.
    The Swift community, on the other hand, might be exploring this idea:
    See: Swift ABI Stability Manifesto

You could probably use something like baked filesystem or rucksack to achieve something similar.

But that raises the question: why? :rofl:

I know it may look funny, but I would like to create a new kind of packaging: embedding a CLI plus the package in one binary.

Do you get what I mean?

Packaging is such an untamed beast in Crystal for all the reasons @kojix2 listed.

This kind of thing could make developer tooling a lot easier to build. For example, in both Lucky and Amber, project-specific “CLI tasks” are built into the same binary as the rest of the application, which makes running them feel “heavy”. Fluidity between multiple binaries would allow a task to boot as an interpreted script and then defer execution to a compiled database model where needed. Currently this is handled either by compiling the whole application together (long compile times) or by building N tiny binaries that interact via the shell.

I believe my initial response was appropriate for the Crystal Forum.
However, after some thought, I realized there might have been a subtle mismatch between my answer and Fulgurance’s original intent.

While I was considering how to reuse libraries or components created with Crystal within a Crystal-based CLI tool, it seems that Fulgurance might have been imagining something slightly different:
specifically, embedding a binary created outside of Crystal into a Crystal binary, especially given that they are working on building an OS and a package manager.

After seeing the exchange with Barney, Fulgurance’s true intent became a bit clearer to me.
So, for my own study, I asked ChatGPT a series of follow-up questions to explore the topic more deeply.

Of course, as always, there’s a risk of hallucination when using ChatGPT,
but the responses were interesting enough that I thought I would share them here.


Challenges in Executing Embedded Binaries

The Kernel’s Fundamental Design

In Linux, when starting a program, the execve system call is used.
This system call is designed on the fundamental premise that the binary is read from the very beginning of the file.
There is no provision for specifying an offset to start from partway through; the entire file is treated as a single, consistent unit.

The Strict Rules of the ELF Format

Executable files in Linux follow the ELF (Executable and Linkable Format) specification.
An ELF file must place its header right at byte 0 — the very start of the file.
This header contains critical information needed to correctly load the program, such as references to the program header table.
If a file’s header were located partway through, the kernel would be unable to recognize it as a valid executable.
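
To make the byte-0 rule concrete, here is a minimal sketch in Python (chosen only for brevity; the function names are made up for illustration). It checks just the four magic bytes, which is far less than what the kernel validates, but it shows why a blob that merely contains an ELF image somewhere inside it is not itself recognized as one.

```python
ELF_MAGIC = b"\x7fELF"  # the first four bytes of every ELF file


def looks_like_elf(data: bytes) -> bool:
    """Return True if the buffer starts with the ELF magic number."""
    return data[:4] == ELF_MAGIC


def find_embedded_elf(data: bytes) -> int:
    """Return the offset of the first ELF magic found after byte 0, or -1."""
    return data.find(ELF_MAGIC, 1)
```

A buffer such as `b"junk\x7fELF..."` fails `looks_like_elf` even though `find_embedded_elf` can locate the magic at offset 4. Finding that offset is enough to copy the payload out, but not to execute it in place.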

Security Considerations

Allowing programs to start execution from an arbitrary point within a file would open the door to serious security threats.
For example, an attacker could append malicious code to the end of a legitimate binary and trick the system into executing only that part.
Such an ability would undermine the trustworthiness of file signatures and integrity checks.
Naturally, Linux kernel developers have designed the system to prevent such vulnerabilities by enforcing strict rules.

Embedded Data in Crystal Binaries

Binaries generated by the Crystal programming language also conform to the ELF format.
The compiled machine code is stored in the .text section, while strings and constants are placed in the .rodata section.
If an additional executable binary is embedded within a Crystal program, it is simply saved as raw data inside the .rodata section.
From the kernel’s perspective, this is not “an executable program” but merely a block of data.
In other words, even if an executable is embedded within a file, there is no direct way to launch it as a program.

How to Execute Embedded Binaries

Given these constraints, there are only two practical ways to execute an embedded binary:

1. Extract to a Temporary File and Execute

The embedded binary can be written out to a temporary file.
Once saved, it can be executed using execve just like any other file on the system.
This method is simple and reliable, and it works on any Unix-like operating system, provided the temporary directory is writable and not mounted noexec.
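
A rough sketch of this first approach in Python, assuming the embedded payload is already available as a byte string (the function name `run_from_tempfile` is hypothetical). Instead of replacing the current process with execve, it uses `subprocess.run`, which forks and execs, so the caller survives and can clean up the file afterwards.

```python
import os
import stat
import subprocess
import tempfile


def run_from_tempfile(payload: bytes, args=()) -> int:
    """Write payload to a temp file, mark it executable, run it, return its exit code."""
    fd, path = tempfile.mkstemp(prefix="embedded-")
    try:
        # Close the file before executing it; running a file that is
        # still open for writing fails with ETXTBSY on Linux.
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        os.chmod(path, stat.S_IRWXU)  # rwx for the owner only
        return subprocess.run([path, *args]).returncode
    finally:
        os.unlink(path)
```

For example, `run_from_tempfile(b"#!/bin/sh\nexit 7\n")` should return 7 on a system where `/bin/sh` exists and the temporary directory allows execution.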

2. Use memfd_create and fexecve

A Linux-specific approach involves the memfd_create system call, which creates an anonymous in-memory file.
The embedded binary data is written into this memory file, and fexecve is used to execute it directly from the file descriptor.
This method avoids touching the disk entirely, which makes it suitable for environments with noexec mounts or read-only filesystems and removes disk I/O from startup.

However, one trade-off is that the entire binary must first be copied into RAM, which can introduce memory overhead and a slight startup delay for large binaries.
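This approach requires Linux kernel 3.17 or later. The memfd_create wrapper function was added in glibc 2.27, so programs built against an older glibc have to invoke the system call directly.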

Looking Ahead

The possibility of someday executing a program directly from an offset within a file is, realistically, nonexistent.
The current model — based on whole-file execution — is deeply embedded in the kernel’s architecture, the ELF specification, the toolchain ecosystem, and the broader security infrastructure.
Overturning this would entail immense costs and risks.
Anyone needing such a capability would have no choice but to create a heavily modified, custom Linux distribution.

Maybe baked_file_system_mounter can help? GitHub - crystal-china/baked_file_system_mounter: assembles files in the assets folder into the executable binary using `baked_file_system` at compile time, then mounts them to a new file system folder at runtime.

This approach may sound naive, but it has worked really well for my use cases, and many of my projects use it, e.g. this is an example.

Thanks a lot guys for your feedback, I will have a look