Why Process.fork needs a good alternative in Crystal

Hey. I’m building freebsd.cr, Crystal bindings for FreeBSD’s Capsicum capability mode, Casper privilege-separation framework, and related kernel security APIs. The Process.fork deprecation warning prompted me to raise a broader question about Crystal’s direction here.

Crystal already does fork safely — internally

Here is the current stdlib implementation:

# Only used by deprecated `::Process.fork`
def self.fork
  {% raise("Process fork is unsupported with multithreaded mode") if flag?(:preview_mt) %}

  pid, errno = lock_write do
    pthread_disable_cancelstate do
      block_signals do
        pid = LibC.fork
        {pid, Errno.value}
      end
    end
  end

  case pid
  when 0
    ::Process.after_fork_child_callbacks.each(&.call)
    nil
  when -1
    raise RuntimeError.from_os_error("fork", errno)
  else
    pid
  end
end

Two things stand out:

  1. The stdlib already knows how to fork safely. It calls after_fork_child_callbacks in the child, wraps the syscall in lock_write + pthread_disable_cancelstate + block_signals, and guards against multi-threaded mode with a compile-time {% raise %}.
  2. The guard is preview_mt, not “always unsafe.” In single-threaded mode, the stdlib considers fork safe enough to implement correctly.

The deprecation is a policy choice, not a technical impossibility. What has actually been deprecated is the general-purpose API — not the safety machinery underneath it.

What we use fork for

There are two fork patterns in the library, neither replaceable by spawn:

Casper helper (src/freebsd/casper/helper.cr:111) — pure-Crystal privilege separation

child_proc = Process.fork do
  server = Server(C).new(helper_sock)
  yield server   # serve privileged requests over a UNIXSocket pair
end

The helper stays unsandboxed while the parent calls cap_enter(2) to enter capability mode.
The parent then delegates privileged operations (DNS, file access, network) to the helper over a
pre-opened socket pair.
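To make the round trip concrete, here is a minimal sketch of that pattern. It assumes a single-threaded build where the deprecated Process.fork is still available; `ask_helper` and the "resolved:" reply format are made up for illustration and are not the freebsd.cr API. The cap_enter(2) call itself is left out.

```crystal
require "socket"

# Sends one request to a forked helper and returns the helper's reply.
# Hypothetical stand-in for the Casper helper; the protocol is invented.
def ask_helper(request : String) : String?
  parent_sock, helper_sock = UNIXSocket.pair

  # Fork before any other fibers exist; the child keeps its full privileges.
  child = Process.fork do
    parent_sock.close
    if line = helper_sock.gets
      helper_sock.puts "resolved:#{line}"
    end
    helper_sock.close
  end

  helper_sock.close
  # On FreeBSD the parent would call cap_enter(2) at this point (not shown).
  parent_sock.puts request
  reply = parent_sock.gets
  parent_sock.close
  child.wait
  reply
end

puts ask_helper("example.com")
```

The socket pair is created before the fork, so the parent needs no new descriptors once it is sandboxed.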

pdfork(2) (src/freebsd/capsicum/process_descriptor.cr:171) — capability-mode-safe child management

pd = LibPdfork.pdfork(pointerof(fd), flags)
# child path:
Process.after_fork_child_callbacks.each(&.call)   # reinitialize Crystal runtime

pdfork is a FreeBSD syscall that returns the child as a file descriptor rather than a PID. Once
the parent enters capability mode it loses access to the global PID namespace, but the fd stays valid, so the parent can still pdkill/pdwait the child from inside the sandbox. We call after_fork_child_callbacks manually here because we bypassed Process.fork.

Both callsites follow exactly the same rules the stdlib enforces: fork before any concurrent fibers,
reinitialize the runtime immediately in the child.

Why spawn cannot substitute

cap_enter(2) is per-process. You cannot sandbox one fiber while leaving another unsandboxed.
Privilege separation requires genuine process separation. The same applies to other UNIX patterns that depend on real fork: daemonizing, prefork/unicorn worker models, and any
setuid/setgid/chroot isolation applied to a child rather than the parent.

What would help

  1. Stabilize Process.after_fork_child_callbacks as a public API. It is already the load-bearing piece — code wrapping raw LibC.fork or pdfork(2) needs it to reinitialize Crystal’s runtime correctly. If it disappears or gets renamed, the safe path breaks.

  2. Provide a supported escape hatch for single-threaded fork. Something like Process.single_threaded do ... end or a compile-time flag that acknowledges “this code forks early and handles reinitialization itself.” The machinery is already there; it just needs a door that isn’t marked deprecated.

  3. Don’t remove Process.fork without an equivalent. Without one, every legitimate UNIX privsep use case is pushed onto raw LibC.fork — an unsupported, undocumented path with no runtime reinitialization guarantee.

Crystal’s C-interop story is one of its strengths. The C ecosystem is full of security
vulnerabilities, and OS-level mitigations — least privilege, process isolation, capability
restrictions — are the best countermeasures available. It would be a shame if Crystal’s runtime
evolution made those mitigations harder to use, not easier.

Happy to share more implementation details from freebsd.cr as concrete examples.

Thanks for bringing this up. :+1:

It’s clear that fork is an important primitive on Unix systems. I think it should be available for doing Unixy things in Crystal.
The main reason why it’s deprecated is that it’s not portable. And we’d like stdlib to be portable as much as possible.

One option to go forward is to move such platform-specific features (Process.exec is another one, see Semantics of `Process.exec` on Windows · Issue #14422 · crystal-lang/crystal · GitHub) out of stdlib into a separate shard. Then it wouldn’t be part of the batteries-included stdlib and you’d have to pull it in explicitly as a dependency, but it would still work.

Related discussion about fork removal: Sunsetting `Process.fork` · Issue #16371 · crystal-lang/crystal · GitHub

It’s unfortunate that FreeBSD (and other BSDs?) don’t seem to provide a less troublesome way of solving the problem than using a fork variant. Having to choose between multithreading via Crystal and FreeBSD-style process separation is not a great choice to have to make.

This is true for all Unix systems: BSD, Linux, Solaris, macOS, … The reason is that the process is the security boundary. I also think this doesn’t contradict multithreaded execution, but there needs to be a supported path for both. I will make a proposal as a GitHub issue and link it here.

We never supported fork in the first place. We never wanted to. It’s been in limbo since we introduced the event loop (long before 1.0) and was never documented.

Not only is it not portable, but we also won’t try to support Process.fork in an MT environment, and MT, unlike fork, has always been planned to be the default. Stdlib starts fibers and threads before any user code is reached. User code is never safe from concurrency (or parallelism) issues, …

I don’t think it’s realistic to extract Process.fork and all the machinery into a posix shard. It would have to monkey-patch lots of places (e.g. event loops) and play catch-up with stdlib on every other release. It would also only be usable without MT (we’re about to enable execution contexts), …

There are other ways to look at the issue, though.

For example I doubt we can pdfork + exec (sadly), but you might be able to override and/or patch main and Crystal.main to initialize the very core runtime (mostly GC.init and some more) then pdfork and only then run the main user code inside the forked process.

Agreed, .fork needs to stay in the stdlib. This is not the only place where Crystal is not consistently “platform agnostic”. I understand that this simplifies the work of the language maintainers, but it basically shifts the work to the community, which will then come up with numerous different ways to handle it.

Crystal also has a UNIXServer and UNIXSocket. Taking this into account it would be better to have a UNIX package with a UNIX::Server, UNIX::Socket and UNIX::Process.fork, UNIX::Process.daemonize, UNIX::Process.exec.

Crystal is not JavaScript; it was born on UNIX. Yes, it supports WASM, but it is still finding its place there. Its C interop is much stronger than, e.g., Go’s, which makes it attractive to me for systems programming. If that is not possible with Crystal, I guess I’ll have to look at Zig, Rust or similar. Go doesn’t handle this kind of stuff well and Ruby is not compiled…

We might want to make the runtime a bit more defined and accessible for that to be a truly good idea though.

Would be pretty cute to have a fully defined runtime instance, easily findable by being the root in the stacktraces, and doing all instantiations (including having easy programmatic access to prelude and anything that today executes at startup).

I took a closer look, and I can confirm that you don’t need Process.fork and don’t need to reinitialize anything. Instead, override main and call Crystal.main so that the app is initialized independently in each process:

This is safe, MT ready, it’s a public and documented API (although a bit rough):

{% if flag?(:freebsd) %}
fun main(argc : Int32, argv : UInt8**) : Int32
  Crystal.main do
    # Sketch: pdfork(2) takes (fdp, flags); a matching binding is assumed here.
    case ret = LibC.pdfork
    when -1
      Crystal::System.print_error "pdfork failed\n"
      1
    when 0
      LibC.setenv("PDFORK_PID", "0", 1) # third argument: overwrite existing value
      Crystal.main_user_code(argc, argv)
    else
      LibC.setenv("PDFORK_PID", ret.to_s, 1)
      Crystal.main_user_code(argc, argv)
    end
  rescue ex
    Crystal::System.print_exception "Unhandled exception: ", ex
    1
  end
end

case pid = ENV["PDFORK_PID"].to_i
when 0
  puts "running in pdfork (pid=#{Process.pid})"
else
  puts "running in parent child=#{pid}"
end
{% else %}
# not freebsd: single process
{% end %}

FWIW, AF_UNIX is just a specialization on the type of socket, using exactly the same system API as other socket types. It’s not a completely independent concept such as fork.
Crystal just chose to implement different socket types as different API types.

Also, despite the name, modern versions of Windows support Unix domain sockets as well.

That looks good, although that manual override loses the direct exit from crystal/src/crystal/system/unix/main.cr at 6424595ea3d4bb71c128f4f3a436bf654b38c159 · crystal-lang/crystal · GitHub

So while this is reasonably easy as long as you are happy to go down into unsafe territory (which I assume when you want to use fork), it’s still error-prone.
Perhaps we could consider a more convenient mechanism for injecting code before the runtime starts / overriding ::main?

I believe we had previously mentioned such an option in regard to mutating environment variables at the start of the process, which is a similar use case to fork.

I appreciate that you’re trying to help me find a solution, but I currently can’t see how your proposal would help me. In the mentioned library I have to use fork to test the code that uses cap_enter. Are you suggesting compiling subprograms here?

Yes. Or just call the currently running program again with different arguments, which wouldn’t be any slower than fork.
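A rough sketch of the "call the running program again" option, for illustration only. The `--helper` flag and both method names are hypothetical; the only stdlib pieces used are Process.executable_path and Process.run.

```crystal
# Hypothetical flag; any argument the program recognises would work.
HELPER_FLAG = "--helper"

# True when this process was started in the helper role.
def helper_role?(args : Array(String)) : Bool
  args.includes?(HELPER_FLAG)
end

# Starts the current executable again in the helper role and waits for it.
def run_helper : Process::Status
  exe = Process.executable_path || raise("cannot locate own executable")
  Process.run(exe, [HELPER_FLAG], output: Process::Redirect::Inherit)
end

if helper_role?(ARGV)
  # A brand-new process: the runtime initialised itself from scratch.
  puts "helper pid=#{Process.pid}"
else
  status = run_helper
  puts "parent pid=#{Process.pid}, helper exit=#{status.exit_code}"
end
```

Because the helper is a freshly started process, nothing has to be reinitialized after a fork, and the same binary serves both roles.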

@straight-shoota I’m 99% sure that explicit exit ain’t needed (never noticed any failure) and that it’s just me being overly cautious.

Alternatively, one can redefine Crystal.main_user_code to achieve the same effect:

def Crystal.main_user_code(argc : Int32, argv : UInt8**)
  # ...
  previous_def
end

Okay, it seems to work. Thanks @ysbaddaden

How one would use it:

require "http/client"

require "../src/freebsd/casper"
require "../src/freebsd/casper/net"

# runs in separate process
FreeBSD::Casper.register_net(
  FreeBSD::Casper::Service::Net::Mode::Name2Addr |
  FreeBSD::Casper::Service::Net::Mode::ConnectDNS
) do |b|
  b.allow_name2addr("example.com", "80")
end

Time::Location.local # cache, since no access after sandboxing
FreeBSD::Capsicum.sandbox!

response = HTTP::Client.get("http://example.com/")
puts "#{response.status_code} #{response.status.description}"
puts response.body[0, 200]

The pattern forks internally (in .install_net):

  macro register_net(mode, &block)
    def Crystal.main_user_code(argc : Int32, argv : UInt8**)
      \{% if flag?(:freebsd) || flag?(:dragonfly) %}
        _chan = FreeBSD::Casper::Channel.open
        _net  = _chan.net
        _net.limit({{mode}}) {{block}}
        _chan.close
        FreeBSD::Casper.install_net(_net)
      \{% end %}
      previous_def
    end
  end

For the Capsicum/Casper use cases I have, it seems to work fine.

Oh, the previous_def will stack on each call of the macro, so you can call it anywhere and it registers yet another pdfork. Cool.
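The stacking can be seen in isolation with a small stand-in. `Boot.run` and `register_step` below are hypothetical names used only for this demo; they are not the runtime hook or the freebsd.cr macro, but they redefine a method with previous_def in the same way.

```crystal
module Boot
  # Stand-in for Crystal.main_user_code; not the real runtime hook.
  def self.run(log : Array(String)) : Nil
    log << "user code"
  end
end

# Each call redefines Boot.run and chains to the definition it replaced.
macro register_step(name)
  module Boot
    def self.run(log : Array(String)) : Nil
      log << {{name}}
      previous_def
    end
  end
end

register_step "first hook"
register_step "second hook"

log = [] of String
Boot.run(log)
puts log.join(" -> ") # second hook -> first hook -> user code
```

The most recently registered hook runs first, so the registration order matters if the hooks depend on each other.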