Eliminate environment modifications

There are alternatives, but AFAICT none are “better”. Even the totem shard you linked above comes with tradeoffs.

Setting environment variables at the start of the process before spawning additional threads seems lika a reasonable use case and should be supported.
What’s wrong with the approach I suggested before?

@straight-shoota We can’t guarantee that something won’t call setenv in parallel to a legit getenv call (or windows equivalents), even if Crystal code doesn’t expose it or makes it explicitly unsafe, the dotenv-like shards will merely wrap ENV.unsafe_set and hide it, so it might be called at whatever time.

There’s also no API to tell if there’s more than 1 thread running, and it would still be fragile because a lib can start threads without our knowledge…

But, none of the usages above mean to mutate the system environment, they just want to populate ENV as a convenience. ENV could merely keep an internal hash for overrides and fallback to the system environment for unknown keys; add a mutex around the hash and libc calls, and that would be just fine…

…but something might call setenv in parallel and the seemingly safe ENV methods still aren’t safe. I guess this is why Go copies the system environment on startup.

There might be value in fully taking over the system environment, after all :thinking:

There could still be ENV.unsafe_set that would mutate the system environment, too, for the very few cases where modifying it might be relevant, with the warning that “it might segfault” :bomb:

If some code uses setenv in a multi-threaded process, it should be responsible to make sure it’s safe. Which is impossible in most situations, so it probably just shouldn’t do it. And if it does, it’s on their head, not on anyone who wants to use getenv.
We cannot even control that in any way. libraries don’t advertise which functions call getenv.

This functionality must be restricted so it cannot be called accidentally in a multi-threaded environment.
Maybe stdlib needs to provide a hook for something that needs to run at the start of the process.

There can be safe-guards to ensure we’re not trying to mutate the environment, after spawning threads in execution contexts at least. We could also ask the operating system how many threads the current process has. But that might be costly and fragile, and also not very useful probably.

If we can ensure that environment-mutating code runs first thing in the process, as part of the runtime bootstrap, we should be pretty safe.
That’s similar to other process-wide configuration that would be unsafe with multiple threads (e.g. setting up signal handlers).

That sounds quite complex and inconvenient to use when the use cases reported don’t care to modify the actual environment…

libraries don’t advertise which functions call getenv.

The linux man pages actually do :astonished_face:

See for example getaddrinfo(3) that declare env under thread safety values. Per attributes(7) it means that getaddrinfo calls getenv. That’s only valid for linux glibc, and other systems’ man pages don’t document these properties, but it’s still nice.

Don’t they? I suppose this hasn’t been discussed explicitly.

I expect there are use cases where you would want dotenv-loaded variables to be visible outside the Crystal scope for example as library functions.

If this is not relevant, maybe ENV isn’t even the right tool for this.

I did a quick survey what other languages are doing:

  • Python: Reads environ into memory at process startup. Setenv alters both, the copy and environ. Protected by the GIL.
  • Go: Reads environ into memory at process startup. Setenv alters both, the copy and environ. Protected by mutexes.
  • Java, Swift: Immutable snapshot. No access to environ at runtime.
  • Rust: Internal synchronization for accessing environ.
  • Nim: Merely calls the libc functions (no synchronization).

I don’t think there’s much value in a snapshot if we still have to synchronize environ.

We could detect whether there’s another thread running (multiple execution contexts, or the default context has been resized), but if even Rust doesn’t care more than having a rwlock on top of environ and the libc functions, I’m not sure it’s worth it.

Yeah, perhaps throwing a RWLock on environ access could be good enough. And explicit documentation that we cannot protect against interference with non-Crystal code in the process and advise against mutating the environment entirely if possible.

It might still be interesting to implement a fiber-local overlay for mocking in order to test Crystal code that reads ENV. But this could entirely be a spec-only feature.

1 Like

Summary from several out-of-band discussions (mainly between @ysbaddaden and me):

  • We cannot iterate over environ directly because we have no control over code executed in the yielded block. We must cache the values in order to avoid deadlocks (and reduce the time we hold the pointer to avoid corruption by external, unprotected mutations).
  • We can cache for each individual call (that’s what Rust does), or cache on demand, or cache greedily at process startup (that’s what Go does).
  • Caching once at process startup is simple and predictable: ENV contains the environment variables at the start of the process plus changes made through its interface. Any changes made directly to environ are not visible.
    Reading later would have a higher chance of including external changes and potential access conflicts.
  • In order to implement the readers-writer lock, we want to start merging stuff from the nursing shard GitHub - ysbaddaden/sync: Synchronization primitives for safe concurrent and parallel Crystal code. into stdlib.
  • The overlay/caching is a specific mechanism to alleviate some of the issues coming from the OS implementation on Unix systems. It’s unnecssary on Windows, so it should be a part of the system implementation. There is an argument for consistency between platforms, but this is very system-specific stuff anyway. And a difference would only visible in strong edge cases (mutating environ directly) which we strongly advise against.
2 Likes

Just curious. The environment in every C program I’ve looked at the source code for, has only read environ at startup (usually given as arguments to main). Then if a sub-process needs a different env, a separate env array was passed to exec* (execle, execvpe) or the current env passed to exec(execl, execlp, execv, execvp). Can it just be implemented that way, or forced to be done like that? That way there is no issue with setenv in code.

Or am I missing something?

What do you mean with read environ at startup? Extract all environment variables that the program could possibly need into memory and use it from there?

Even if your program itself never calls getenv or similar, some library functions may read from environ. An example for that is getaddrinfo (at least in glibc). But there are other functions which may need to read environment variables like LOCALE or TMPDIR, for example. These usages may not even always be explicitly documented.
Still, reading is not a problem for concurrent access, as long as nobody writes. That’s of course much less likely to happen unintentionally as a side effect of some random function call.

But there are use cases like the ones mentioned here where you want to read a file with environment configuration at process startup. That can be relatively isolated because it happens at the start, but if not handled carefully could still lead to race conditions.

Maybe it is just me, but I’ve never had to modify environment variables in a program except for when spawning (fork/exec) another program. I’ve only read env vars at startup. Those are provided in c (gcc for example) in either environ (extern char **environ), or as arguments to main (int main(int argc, char *argv[], char *envp[]). So a solution could be to actually disallow changing environment variables. Only allow them to be read, or passed as an argument to Process.new (as it is now).

So reading is still allowed, but not setting except when spawning a new process.

I know c has setenv(), but since you do get those problems, and unless there is pressing need for it, don’t allow setting.

Again, unless I’m missing something.

Yeah, that’s basically what I suggested in the OP.

Most programs probably don’t need to mutate their environment.
But there are some use cases for doing that mentioned here, such as loading .env files or testing code that reads environment variables.

I still like the idea of making ENV immutable.

But we may need to allow modifications in some way and make that as safe as possible by guarding environ when accessed through ENV. Although we should still strongly advise against using it (unless you know what you’re doing).

@riffraff169 Yes, we could allow all POSIX use cases by parsing environ once at program startup then generate envp from it to pass to execve or posix_spawn. That would be safe since we’d never call any of the *env methods directly, except for some indirect getenv calls in libc and external libs.

ENV could still allow to mutate its internal hash to cover the “I want to fill ENV at runtime” scenarios. It would be safe as it wouldn’t call any *env method (not needed).

Though, there’s one edge case: you need the variables loaded at runtime to affect an external library (sic). IMO this is a huge smell and I’m still not convinced it should be supported (if not for backward compatibility): you should either configure the library programmatically or the system should have already configured the environment, possibly through a wrapper shell script.

Maybe we could introduce some flags, like -Denv=immutable or -Denv=unsafe flags…

There is also the use case of using c-libraries that use env as a means to communicate global state - from both Crystal and C sides. I agree with all the statements about thread safety and that using a global state approach is a bad idea, but full control over the lib is not possible.

Using ENV.[]= as a means to do the setting from the Crystal side is certainly convenient, would be great if there was still an ENV.unsafe_setthat was exposed for this purpose.

1 Like

I’m trying a bit of a summary here.

There are some use cases for manipulating the environment. They are very specific ones and could perhaps be handled differently, if we wanted to go all the way to an immutable environment. But this would probably cause some friction.
I still like this approach for its simplicity and inherent safety, but it’s not the most acceptable choice for the immediate future.

And perhaps immutability is not necessary. We can make mutations reasonably safe, at least when used within Crystal.
But then there are not really any useful scenarios for mutating the environment only within Crystal. We have better alternatives for passing values between different pieces of Crystal code.

Either way, it is impossible to ensure thread-safe interaction for environment access from external libraries. They cannot be aware of Crystal’s locks.
Thus, reading the environment pointer in a multi-threaded process can never be save.

My suggestion would be to follow roughly the approach of Go (and Python) for maximum thread safety:

  • Read the environment into owned memory at startup.
  • Reading methods only read from the copy (i.e. they are unaware of external mutations of the process environment)
  • Writing methods alter both, the owned copy and the process environment.
  • Access is protected by a reader-writers lock.

This should be mainly consistent with the existing implementation, except that we won’t be aware of changes to the process environment from a library. As mentioned above, this is impossible to achieve in a multi-threaded process without risking memory corruption.

Additionally, me may want to deprecate the mutating methods. Either to replace them with alternatives that are more explicitly labelled as unsafe, or without replacement (which effectively transitions to an immutable environment).
But this decision is not critical. We could defer it to a later discussion. For now, adding a big warning message to these methods might be sufficient.

4 Likes

This perfectly aligns with my latest attempt :100:

You are right, but .env have it advantage.

  1. You don’t want to pollute the entire environment, but instead specific environment variables to take effect for your app only (maybe you don’t have the permission to set environment variables through systemd?)

2. set same name ENV with different value for different project

Maybe we should update doc add a warning for should use it only when startup. easy, right?