Field study of fiber-local storage

We’re planning to add a stdlib API for fiber-local storage, which is a critical component for concurrent applications.

With the need for such a feature and the absence of an official API, Crystal libraries have adopted several custom implementations in order to get things working. But they’re often subpar (in general; they might be fine for the specific use case).

The goal is to implement a solid and efficient standard API to replace all currently used makeshift alternatives.

In order to understand how Crystal libraries use fiber-local storage, I dug through publicly available sources and I’m going to list my findings here. Quite a large number of shards use some form of fiber-local storage.

I’ve looked for some specific patterns which are commonly associated with fiber-local storage implementations. It’s likely that I missed some more unique implementations. Please add missing use cases in a comment!
I appreciate any comments on the exploration and classification. Do you think it makes sense?

This study is intended to inform an RFC about a fiber-local storage API for Crystal’s standard library.

Implementation patterns

All implementations I found use one of two base mechanisms. Of course, each individual implementation may add different amounts of convenience wrappers around it.

Custom property on Fiber

class Fiber
  property my_fls : String?
end

Fiber.current.my_fls = "foo"
  • Monkey patching additional properties into Fiber is not ideal. There’s no direct danger besides name clashes with other libraries which do the same. But it’s also not a clean solution.
  • This is equivalent to a class variable, so there’s no direct M:N mapping.
List of implementations

Hash mapping Fiber instances to values

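module MyFLS
  # Keys are compared by identity, so each Fiber instance gets its own slot.
  class_getter fibers : Hash(Fiber, String) = Hash(Fiber, String).new.compare_by_identity
end

MyFLS.fibers[Fiber.current] = "foo"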
  • The hash can be a class or instance property. The latter allows M:N mappings which is necessary for some use cases (such as connection pools).
  • Some hashes use Fiber#object_id instead of the Fiber instance as the key. I’m not sure what the point of that is. Maybe this predates Hash#compare_by_identity?
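
To illustrate the instance-property variant and why it gives an M:N mapping, here is a minimal sketch. `ConnectionPool` and the connection names are made up for this example, and like the implementations I found it is not safe for multithreading.

```crystal
# Hypothetical pool for illustration; connections are plain strings here.
class ConnectionPool
  # One table per pool instance: many fibers x many pools (M:N).
  @checkouts : Hash(Fiber, String) = Hash(Fiber, String).new.compare_by_identity

  def initialize(@available : Array(String))
  end

  # Returns the connection already held by the current fiber,
  # or takes a free one from the pool (raises if none is left).
  def checkout : String
    @checkouts[Fiber.current] ||= @available.pop
  end

  # Gives the current fiber's connection back to the pool, if it holds one.
  def release : Nil
    if conn = @checkouts.delete(Fiber.current)
      @available.push(conn)
    end
  end
end

primary = ConnectionPool.new(["primary-1", "primary-2"])
replica = ConnectionPool.new(["replica-1"])

a = primary.checkout
b = primary.checkout # same fiber, same pool: same connection as `a`
c = replica.checkout # the other pool keeps its own mapping
primary.release
```

A single class-level hash or a property on Fiber could only hold one checkout per fiber, which is why pools tend to use the per-instance hash.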
List of implementations

Hash lookup based on Fiber#object_id:

Hash lookup based on Fiber

Extra: Thread Local Storage

Stdlib also has some use cases with thread-local storage. This is insufficient when fibers can switch threads (see issue #15088 in crystal-lang/crystal, “`ThreadLocalValue` is not fiber-aware”).
This affects Regex::PCRE2#@match_data and Regex::PCRE2.@@jit_stack.

Use case classification

The different use cases can be classified into several categories:

I think there are two different modes of how fiber local variables are used:

  • A fixed property assigned to a fiber. It has exactly one value per fiber (or none through lazy initialization) and that value is unique per fiber. It must not be inherited by child fibers (structured concurrency).
    Example: Pool checkouts, transactions, recursion detection
  • A scoped property which may change over time. Assignments are usually scoped to a specific range of execution, and may be inherited by child fibers (structured concurrency).
    Example: observability contexts, configuration
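
The scoped mode could be sketched like this on top of the custom-property mechanism. Everything here is hypothetical (`sketch_locale`, `LocaleScope`, the `"en"` default); it only shows the set-and-restore shape and that a plain property is not inherited by child fibers.

```crystal
class Fiber
  # Hypothetical property added for this sketch only.
  property sketch_locale : String?
end

module LocaleScope
  # Reads the current fiber's value, falling back to an assumed default.
  def self.current : String
    Fiber.current.sketch_locale || "en"
  end

  # Sets the value for the duration of the block and restores
  # the previous value afterwards, even if the block raises.
  def self.scoped(locale : String, &)
    previous = Fiber.current.sketch_locale
    Fiber.current.sketch_locale = locale
    begin
      yield
    ensure
      Fiber.current.sketch_locale = previous
    end
  end
end

seen = [] of String
LocaleScope.scoped("de") do
  seen << LocaleScope.current # value inside the scope

  done = Channel(String).new
  spawn { done.send(LocaleScope.current) }
  seen << done.receive # child fiber: nothing was inherited, so the default
end
seen << LocaleScope.current # previous value restored after the scope
```

Inheritance by child fibers would need extra work at spawn time, which none of the two base mechanisms provide on their own.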

A challenge for the stdlib API is how to suit the needs for both of these modes.

Non use-cases

It’s also important to clarify what is not an appropriate use case for fiber-local storage.
I think it is primarily intended for matters of code structure (such as a DB transaction being checked out to one specific strand of execution, i.e. a fiber).

Fiber-local storage is not suitable for domain-specific values, such as client sessions in a web server. These should rather be passed explicitly to clearly express the contextual flow.
Examples of such questionable use are wout/mollie.cr (src/mollie/ext/fiber.cr at 8bd991229696653fea93145cbb83da6bd36abc02) and hangyas/telegram_bot (src/telegram_bot/fiber.cr at 39e0914d52925b57636bc32420961a308ed01b8b).


Expanding on the Athena use case a bit, each request (which has its own fiber) should have its own container. However when the request has a keep-alive header, the fiber is re-used so I explicitly have a line that does Fiber.current.container = ADI::ServiceContainer.new within the HTTP::Server proc.

I haven’t really thought too much about how fibers that you spawn while processing a request are handled tho given I haven’t really heard of any use cases related to that. But, it could make sense to share the same container so that you could have access to the same state. Tho at this point it would make sense to have an Athena specific version of spawn to handle this.

I think Go’s use of “Context Value” was the minimum amount of sacrilege to appease the masses who insist on violating CSP.

As easy as it is to create a tuple in Crystal, I don’t see why this is such a big deal - even for sharing a large class instance.

first fiber finishes up:
chan.send({thing1, thing2, pointerof(fatObj)})

chan receives message. then spawns new fiber with logic of:

t1, t2, fo_ptr= msg
fiber_fat_obj = fo_ptr.value

do work … pass it along!

nextchan.send({newthing3, fo_ptr})

no?
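
A runnable take on the pseudocode above might look like this. It assumes the fat object is a class instance (a reference type), in which case the reference itself can be sent and `pointerof` is not needed; `FatObj` and the channel names are made up for the sketch.

```crystal
# Hypothetical large object passed along the pipeline.
class FatObj
  property hits = 0
end

stage1 = Channel(Tuple(String, FatObj)).new
stage2 = Channel(Tuple(String, FatObj)).new

# Middle stage: receives the tuple, does its work, and passes it along.
spawn do
  label, fat = stage1.receive
  fat.hits += 1
  stage2.send({label.upcase, fat})
end

fat_obj = FatObj.new
stage1.send({"thing1", fat_obj})

# The final receiver gets the very same object, not a copy.
result, same_obj = stage2.receive
```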


I don’t understand the significance of your comment for this thread.

How would the Hash based approach work (in theory)?

If the Hash is not within the fiber, it must be wrapped in a Mutex, and the performance cost will increase with the number of fibers trying to access it concurrently. (Though ysbaddaden’s Sync is quite impressive in this regard)
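
For reference, the Mutex-wrapped variant I have in mind would be roughly the following. This is only a sketch with String values; `SharedLocals` and `run_worker` are hypothetical names, not taken from any shard.

```crystal
# Hypothetical wrapper: one shared table, every access goes through the Mutex.
class SharedLocals
  @values : Hash(Fiber, String) = Hash(Fiber, String).new.compare_by_identity
  @mutex = Mutex.new

  def get? : String?
    @mutex.synchronize { @values[Fiber.current]? }
  end

  def set(value : String) : String
    @mutex.synchronize { @values[Fiber.current] = value }
  end

  # Entries must be removed explicitly, otherwise the table
  # keeps referencing fibers that have already finished.
  def clear : Nil
    @mutex.synchronize { @values.delete(Fiber.current) }
  end
end

def run_worker(locals : SharedLocals, name : String, done : Channel(String))
  spawn do
    locals.set(name)
    Fiber.yield # let the other fiber run in between
    value = locals.get? || "missing"
    locals.clear
    done.send(value)
  end
end

locals = SharedLocals.new
done = Channel(String).new
run_worker(locals, "fiber-a", done)
run_worker(locals, "fiber-b", done)
results = [done.receive, done.receive].sort
```

All fibers contend on the same lock here, which is the cost I mean.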

Additionally, Hash(K, V) is presently not possible, so the stored value types would be quite limited, no?

apologies - what I was implying, very poorly, is to perhaps consider skipping this feature on account of it being a more lucrative vector for abuse / antipattern than genuine utility. In Go’s concurrency model (from which I am actively porting ~ 15k loc into Crystal!) there’s no support for TLS / TLS-like storage - they force it into the Context package which must be passed around in the args. I never quite understood the need … if I can pass a struct by value in an arg, why should it be shoehorned into Context?

Similar with Fibers, I can pass whatever state I need for the fiber’s context when calling the spawn block. I think where fiber-local / thread-local storage tends to be used is down the chain a few iterations - but this is just as easily passed in channels between fibers / goroutines / green-threads as stack-bound data structures to the receivers, and receivers can simply repackage those and keep forwarding them to new receiver fibers for deeper chains /pipelines or for fan-in / fan-out behavior.

Either way IMO it’s better to force users to explicitly manage the data outside of the concurrency primitive to keep from polluting it for downstream callers, but I realize that I’m probably in a minority with that opinion. cheers

I think you misunderstand what Fiber-local storage is supposed to achieve. It’s not really about passing data between fibers, channels are there for that, as you mentioned. It’s about information that needs to be stored for a specific fiber.

An excellent example is this bug for Colorize. Currently a class property is used, but that would need to be local to the executing fiber.


thanks for that link. please don’t take my comments as argumentative, I’m a lurker and seldom post in here, but I just gently wanted to challenge the need! I think we’re on the same page. From the gh issue

”The color breaks appear exactly at the boundaries of colorize strings. This would suggest that colorize is broken in multi-threaded environment and it’s not the logging.

And indeed, >> Colorize uses an unsynchronized class variable @@last_color which can be accessed from different fibers. <<

Emphasis mine - ok. so bug identified.

This should probably be a fiber-local variable instead.”

Why should this be a language problem? It would be convenient, yes. but shouldn’t the colorize pkg figure out a better method to synchronize against the last color written? A singleton Class with a mutex around accessing the property would fix this issue, no? that would mean that fibers block if they can’t get the mutex, which is desirable. if we throw the variable into fiber-local storage we’ve still got all the same concurrent problems, just local to the fiber, no? maybe I’m still misunderstanding the bug. –respectfully yours!

I get what you mean, in this case one could probably rewrite Colorize to handle it by itself, perhaps via a Colorizer instance or something. It’s completely doable in this case and it’s not even that complicated. It still illustrates how even existing code would benefit from fiber local storage.

Mutexing it would work, but you’d pay a performance cost for it. If the variable were fiber-local, all of the fibers could access their own @@last_color concurrently.
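
As a rough sketch of that idea (not how Colorize works; `Painter` and `sketch_last_color` are hypothetical stand-ins for state like @@last_color), each fiber tracks its own last color and no lock is involved:

```crystal
class Fiber
  # Hypothetical property for this sketch; not part of Colorize or the stdlib.
  property sketch_last_color : String?
end

module Painter
  # Emits a color marker only when the current fiber's last color differs.
  def self.paint(io : IO, color : String, text : String) : Nil
    fiber = Fiber.current
    if fiber.sketch_last_color != color
      io << '<' << color << '>'
      fiber.sketch_last_color = color
    end
    io << text
  end
end

def render(color : String, done : Channel(String))
  spawn do
    io = IO::Memory.new
    Painter.paint(io, color, "a")
    Fiber.yield # the other fiber paints in between
    Painter.paint(io, color, "b")
    done.send(io.to_s)
  end
end

done = Channel(String).new
render("red", done)
render("blue", done)
outputs = [done.receive, done.receive].sort
```

With a shared class variable, the interleaved fiber would overwrite the last color and each fiber would emit its marker again before "b".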

But there are cases where it is complicated, and you’d have to bend over backwards to make the logic fiber-independent.


Thanks for the clarification. Makes sense to me now what you mean.

Fiber-local storage is a thing in Crystal, even though it is not an explicit feature. As shown here, custom implementations can be found throughout the ecosystem.

I’d like to use this thread to record and assess the status quo. A separate discussion will deal with what to do about that. I appreciate the critical response, though.

But to stay focused, let’s not digress here into considerations for future developments and side issues such as Colorize state (which I did not mention in the list because it currently does not use fiber-local state; if you have a good idea how to fix this, please comment in the issue).


You can look at any of the linked implementations to see how it’s done in practice.

A simple hash obviously is not safe in a multithreaded context. I presume your question is about that?
This thread is only about collecting what’s currently used. And apparently current implementations (at least those with the hash method) are not concerned about multithreading.


Possibly related references

The call for FLS is that we do have needs for them in the stdlib itself, and in practice people keep extending Fiber (or HTTP::Context which is expected). We believe we can have a safe and lightweight mechanism for that in stdlib, and that we could expose it publicly.

The reason Go’s context is hash-like is so you can pass the object through any set of middlewares that don’t know the values to be passed but still need to keep a reference, and that reference needs to be typed (as context). That way you can compose tools from different libraries around a single context, with each tool taking whatever values it needs out of it. The downside is that the type system won’t help you anymore, and there can be name conflicts on the values, so it’s basically no better than extending Fiber or HTTP::Context.

Classic usages I can think of:

  • save the current user or account (scope)
  • set a local i18n locale / timezone (scope)
  • set up local observers/listeners (scope, observability)

The advantage of using “fiber locals” is that we don’t need to explicitly pass one or multiple contexts or objects to every single call (or at least every single object). You can transparently set up something and if any call needs it anywhere, it shall find it.

@aiac Scopes where child fibers would inherit their parent’s scope are indeed interesting.