Are Fibers dangerous in Crystal, or not?

Hi, I’m Koichi. Allow me to ask a question about the Crystal language.
(I’m not a Crystal user, but a Ruby user and developer.)

I understand that Fibers switch when the runtime requires it, such as on blocking I/O operations and so on.

My question is: is there no atomicity problem?

Example:

person = ["Alice", 10] # name and age

# I/O simulation method
def search_age_in_internet(name)
  Fiber.yield
  case name
  when "Bob"
    20
  when "Carol"
    30
  else
    0
  end
end


spawn do
  person[0] = "Bob"
  person[1] = search_age_in_internet("Bob") #=> 20
end

spawn do
  person[0] = "Carol"
  person[1] = search_age_in_internet("Carol") #=> 30
end

Fiber.yield
p person

This code prints ["Carol", 10], but Carol’s age should be 30.
In this case it is only a small program, so it is easy to find where the problem is. However, if the program is huge, I think it becomes difficult to check.

This is because there is no atomicity protection feature. The documentation says Channel is the only synchronization feature between fibers, but I don't think it can protect against this kind of atomicity violation.

I want to know what the Crystal language community thinks about this.

Thanks,
Koichi


The problem with this code is that you are using data from a background process (thread, fiber, whatever) without checking that the result is actually ready.

I think no programming language will protect you from that without some extra work/effort.

That said, after some time playing around with spawn and fibers, I have no idea how I would wait for the fibers to be done.

So, that's a good question :)

Hi ko1, welcome!

You are a famous Ruby developer :blush: Keep up the good work!

You are right, there’s a data race here. I think the same thing can happen in Ruby too:

person = ["Alice", 10] # name and age

# I/O simulation method
def search_age_in_internet(name)
  sleep(rand(1..2))

  case name
  when "Bob"
    20
  when "Carol"
    30
  else
    0
  end
end

Thread.new do
  person[0] = "Bob"
  person[1] = search_age_in_internet("Bob") # => 20
end

Thread.new do
  person[0] = "Carol"
  person[1] = search_age_in_internet("Carol") # => 30
end

# Wait a bit...
sleep(3)

p person

The above program will sometimes print ["Carol", 20]. So Fibers in Crystal are kind of similar to Threads in Ruby, except that they are lightweight.

I would like the language to prevent these mistakes but it’s really hard without making the language more complex. For example I think Rust can prevent these errors but then you have lifetimes, ownership, the borrow checker, etc., which are very complex.

Whenever you have a variable or some data shared between multiple spawn calls, you might have a data race. The way to solve it in Crystal is by communicating the data using a Channel, or by using a Mutex. It's not ideal, but it's also not very complex (in my opinion). It's the same way Go works. And eventually we could also add a runtime race detector like the one in Go.
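By the way, this also answers the earlier question about how to wait for fibers to be done. One common pattern (just a sketch, using a plain Channel(Nil) as a "done" signal) is to have each fiber send a message when it finishes, and to receive once per fiber:

done = Channel(Nil).new

spawn do
  # ... do some work ...
  done.send(nil)
end

spawn do
  # ... do some other work ...
  done.send(nil)
end

2.times { done.receive } # blocks until both fibers have finished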

But maybe the question was about person[0] and person[1] being assigned separately? Then maybe I would change the code to do:

name = "Bob"
age = search_age_in_internet("Bob")
person[0], person[1] = name, age

Or just make spawn use a Channel and send the person there:

# assuming a simple Person value type
record Person, name : String, age : Int32

channel = Channel(Person).new

spawn do
  name = "Bob"
  age = search_age_in_internet("Bob")
  channel.send(Person.new(name, age))
end

person = channel.receive

I’m sure others in the Crystal community have different perspectives about it.

What does the Ruby language community think about this?

Thank you for your reply!

What does the Ruby language community think about this?

In Ruby with threads, we have Mutex to avoid data races. However, I couldn't find a Mutex (or similar feature) in Crystal, which is why I asked this question.

I agree we can write good code as you suggested (another technique is to write programs with immutable data). However, we can overlook things. For example, people may write a program without considering Fibers at all; yet if the component (library) is good, someone will use it with Fibers.

Or do Crystal users keep Fiber scheduling in mind all the time?

I want to know about your Crystal programming experience with Fibers.
No troubles of this kind?
Only a few?
Many (like Java threads in the 199x's)?

In other programming languages like Go, there are data race detectors. They are one solution to avoid this kind of problem. However, it would be difficult to use one with the Crystal language because of the technique such detectors rely on.

Background of this question

The Ruby community (Matz) wants to introduce auto-switching Fibers just like Crystal's, without any synchronization mechanism (except Queue, the counterpart of Channel in Crystal). I think this is dangerous, and we need to introduce a Thread-compatible synchronization mechanism.

The Crystal language is a senior when it comes to these automatically scheduled Fibers. This is why I want to learn from Crystal's experience.

Thanks,
Koichi


Or does nobody use Fibers aggressively?

Hi ko1,

There is a Mutex that works for Fibers. It is not a thread-safe mutex (yet) since Crystal is still single-threaded.

https://crystal-lang.org/api/0.30.0/Mutex.html
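For example, here is a rough sketch of how it could be applied to ko1's first example (reusing person and search_age_in_internet from that post). Because the lock is held across the Fiber.yield inside search_age_in_internet, the other fiber can never update person in the middle of the two-step assignment, and a reader that also synchronizes always sees a matching name/age pair:

mutex = Mutex.new

spawn do
  mutex.synchronize do
    person[0] = "Bob"
    person[1] = search_age_in_internet("Bob")
  end
end

spawn do
  mutex.synchronize do
    person[0] = "Carol"
    person[1] = search_age_in_internet("Carol")
  end
end

Fiber.yield
mutex.synchronize { p person } # prints a consistent pair, e.g. ["Bob", 20] or ["Carol", 30]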

I think this is normally not a problem in Crystal because when you write code you already have concurrency in mind and don't let shared variables be updated by separate fibers, even without multithreading.

I did some concurrent programming in Ruby in the past, using EventMachine and fibers, and the issues are similar. You must always be careful when updating shared memory.

Now the question is what’s the impact of adding (automatic) concurrency to a mature language like Ruby. Would it break existing libraries? My bet is that it will, but as long as you add concurrency but not multithreading, the issues will be easy to spot and fix.

I’d love to see a future version of Ruby implementing Crystal-like fibers! :heart_eyes:
It was one of the reasons we migrated projects from Ruby to Erlang in the past: handling concurrency with EM/Fibers was a real nightmare and hard to make perform well.

I would say that every program takes advantage of Fibers since they are used in the std-lib. For example, the built-in HTTP server owes its performance to the context switching between fibers. This is something that the app developer might not even be aware of, though. Basically, a fiber that is waiting for IO leaves the CPU free to be used by another fiber.
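A tiny sketch of that behavior, using sleep to stand in for IO (sleep also suspends the current fiber and lets the scheduler run others):

spawn do
  sleep 0.1 # simulated IO wait: this fiber is suspended meanwhile
  puts "slow fiber done"
end

spawn do
  puts "fast fiber runs while the other one is waiting"
end

sleep 0.2 # give both fibers time to finish before the program exits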

I haven’t seen many algorithms that use fibers to compute a result, such as samples/channel_primes.cr in the crystal-lang/crystal repository on GitHub. I think that might fall into the heavy usage of fibers that you are looking for. But it depends on how one is used to expressing solutions/algorithms.

OMG, that is completely my mistake. Thank you for telling me.

It is not a thread-safe mutex (yet) since Crystal is still single-threaded.

I think Crystal only focuses on Fibers, not Threads, doesn't it? (Is there any plan to introduce threads?)

Now the question is what’s the impact of adding (automatic) concurrency to a mature language like Ruby. Would it break existing libraries? My bet is that it will, but as long as you add concurrency but not multithreading, the issues will be easy to spot and fix.

Ruby already has threads, so Ruby already supports concurrency.

but as long as you add concurrency but not multithreading, the issues will be easy to spot and fix.

Could you tell me why? An automatic scheduler introduces non-deterministic behavior and makes problems difficult to reproduce. I agree it is easier than with Threads (they switch everywhere!), but I can't say it should be easy.

However, I haven't made a big program with Crystal's fibers (in other words, co-operative green threads), so my perception may be wrong.

It was one of the reasons we migrated projects from Ruby to Erlang in the past, because handling concurrency with EM/Fibers was actually a real nightmare and hard to perform well.

I think Crystal's fibers have the same issues as Ruby's EM/Fibers. Are they not the same?
(If they are not the same, I want to know why, so that we can further protect Ruby users from that nightmare :slight_smile: )

Do app developers not care about atomicity, and yet apps work well?
Or are libraries designed to avoid such issues?

Yes. We’ve been working on it. Stay tuned.

The program runs in a single thread; unless there is an IO operation, a Channel operation or a Fiber.yield, there will be no surprising change of the underlying state.

The language enforces some safe assumptions. For example, ivars are assumed to be able to change at any time, and hence some constructs that work on local variables do not work on ivars: if var - Crystal
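A minimal sketch of the rule behind that link, with a hypothetical Box class: the nil check narrows the type of a local variable, but not of an instance variable, because the ivar could be reassigned by other code between the check and the use:

class Box
  @name : String? = nil

  def with_local
    name = @name
    if name
      name.upcase # fine: `name` is a String inside this branch
    end
  end

  def with_ivar
    if @name
      # @name is still typed String? here, so calling .upcase directly
      # would not compile; we have to handle nil explicitly:
      @name.try &.upcase
    end
  end
end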

Since Crystal has some unsafe constructs, at the end of the day it is on the library author not to mess things up, but the language will try to help as much as possible. This boils down to the type system and memory representation, which might not show up the same way in Ruby.

This thread seems appropriate to point out a recent study over real-world concurrency bugs in Go.

The paper is available at https://songlh.github.io/paper/go-study.pdf, and there is a nice wrap-up blog post about it: https://blog.acolyer.org/2019/05/17/understanding-real-world-concurrency-bugs-in-go/.

Distinct projects were examined to find out whether concurrency bugs are related to channels or to sharing memory.

Some interesting facts (read the references for a nicer overview):

  • 38.6% of examined bugs involve message passing.
  • Message passing seems to introduce more blocking issues (e.g. starving coroutines) than memory sharing does.
  • Go’s runtime race detector
    • detected 2 out of 21 reproduced “blocking” bugs.
    • detected half of reproduced “non-blocking” bugs.

Among the causes, the authors point out:

  • Coroutine creation with closures
  • Buffered vs unbuffered channel implications
  • Usage of select

It’s also worth mentioning a few things:

  • Crystal is still pretty young and its usage, compared to Go and Ruby, is not that high. That's why there doesn't seem to be a lot of code out there taking the most out of spawn and fibers.
  • Crystal runs on a single thread, so even though data races are possible (like ko1's example) they are less common: sharing data is not a problem right now, for example two fibers adding elements to the same array (see the sketch after this list).
  • Crystal might be able to provide better abstractions compared to Go because of its type system and features (generics, modules, overloads, etc.). And I expect the standard library to provide nice abstractions for dealing with the most common concurrency patterns. For example, an Actor library would be one such approach. If users are encouraged to use these patterns, then the amount of bugs might decrease.
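Here is the sketch for that second point, under the current single-threaded scheduler: fibers only switch at explicit suspension points, so both fibers below can push to the same array without corrupting it (something a preemptive thread would not guarantee):

results = [] of Int32

2.times do |i|
  spawn do
    5.times { |j| results << i * 10 + j }
  end
end

Fiber.yield
p results.size # => 10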

I use fibers inside the Task Monads https://github.com/alex-lairan/monads/blob/master/src/monads/task.cr

The advantage is that you don't have to think about concurrency: you just say you have a task that may fail, and go do something else. When you need the data from the task, there are 2 possibilities:
1 - The task is not finished -> wait for the end of the task, then return the data
2 - The task is finished -> return the data

When I communicate with the database, I always create a task :)
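Roughly, the idea looks like this (a simplified sketch of the concept, not the actual API of the shard linked above, using ko1's search_age_in_internet from the first post as the example workload):

class Task(T)
  @channel : Channel(T)

  def initialize(&block : -> T)
    @channel = Channel(T).new(1)
    # run the work in its own fiber and buffer the result
    spawn { @channel.send(block.call) }
  end

  # Blocks only if the fiber hasn't produced the value yet (call once).
  def value : T
    @channel.receive
  end
end

task = Task(Int32).new { search_age_in_internet("Bob") }
# ... do other work while the task runs in its own fiber ...
p task.value # => 20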

Could something like STM in Haskell and Clojure be implemented in Crystal? I don’t know if it is even possible, or how hard it would be. But that might turn out to be a good solution.

Interesting! Is the Mutex shared between Fibers and Threads? (Maybe it's the same situation as Ruby's plan.)

Yes. “unless there is an IO, Channel operation or Fiber.yield” is the key. My first example shows an unexpected IO (the program is small, so it is easy to find out). Of course, it is safer than Threads.

Makes sense. I'm very excited to see it.

Thank you for sharing this great summary. I only knew the title of this paper; I haven't read it yet.

BTW (completely off-topic)

Message passing seems to introduce more blocking issues (e.g. starving coroutines) than memory sharing does.

Blocking issues are easier to debug than data-race issues because we can see the backtraces.

One comment (off-topic too. Sorry):

Haskell and Clojure are basically immutable languages, and introducing STM is a good solution because STM data is protected by transactions. In other words, the language forces all mutable data to be protected by STM. I like this approach very much.

Crystal (and many other languages, including Ruby with threads) allows mutating shared data. This introduces data-race problems.

With the Guild abstraction, I want to achieve both casual mutable programming and dependable concurrent programs in Ruby.

@ko1 Wouldn't most pure Ruby code (not C extensions) run correctly multithreaded without the GIL if Mutex was Fiber-safe? C extensions could use a recursion-safe global lock, with additional functions to change the lock granularity, such as:

# Method is thread safe.  No locking.
rb_method_thread_safe(...)
# Use a specific Mutex instead of the global default.
rb_method_mutex(...)

# Methods defined after any of these calls inherit
# that call's setting for the particular class.
rb_class_use_global_mutex(klass) 
rb_class_use_per_class_mutex(klass, mutex_name) 
rb_class_use_per_instance_mutex(klass, mutex_name)
# Custom mutex possibly shared for the entire library.
rb_class_use_mutex(klass, mutex)
rb_class_thread_safe(klass)

Benefits:

  • I think this would work without modifying existing Ruby programs.
  • Guilds and redesigning existing programs aren’t necessary.
  • C extensions have backwards compatibility.
  • Minimal changes are necessary to C extensions for better performance.

Drawbacks:

  • There are probably a few more places that need thread safety.