Timeline for multithreading support

Yes, it’s RFC 2 and its accompanying shard.

Why is file IO not blocking? Is it implemented with io_uring?

What makes you think it is not blocking?

Is the intent to implement a true parallel threading model that doesn’t involve fibers, e.g. as implemented in Rust, C++, D, etc.?

With the current concurrency model using fibers, I hit a memory limit on how many fibers can be used. This problem doesn’t exist in the other languages, because the hardware threads are reused as necessary to accommodate the number of spawned jobs. That is the model I would like to see implemented as well, in order to do non-fiber parallel processing.

I’m not sure your comparison is justified. The model of running a large number of jobs on a (small) number of threads maps exactly to fibers: running many jobs on a (small) number of fibers. If you spawn a fiber for every job, the analogy would be spawning a thread for every job.

Of course, the runtime can handle many more fibers than threads, and it’s feasible to have far more fibers than CPU cores. But whether that’s actually useful depends on the nature of the jobs and their execution characteristics.
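To make that mapping concrete, here is a minimal, hypothetical sketch (not code from this thread; `job_count`, `worker_count` and the squaring “work” are placeholders) of running many jobs on a small, fixed pool of fibers fed by a channel:

```crystal
# Many jobs, few fibers: a fixed pool of worker fibers pulls job ids
# from a channel instead of spawning one fiber per job.
job_count    = 100_000
worker_count = 8

jobs    = Channel(Int32).new(64)  # buffered queue of job ids
results = Channel(Int64).new(64)

worker_count.times do
  spawn do
    # receive? returns nil once the channel is closed and drained
    while job = jobs.receive?
      results.send(job.to_i64 * job) # stand-in for real work
    end
  end
end

# Feed jobs from another fiber so the main fiber is free to collect results.
spawn do
  job_count.times { |i| jobs.send(i) }
  jobs.close
end

total = 0_i64
job_count.times { total += results.receive }
puts total
```

In a sketch like this, memory is bounded by the pool size and the channel capacities rather than by the number of jobs, and with `-Dpreview_mt` plus `CRYSTAL_WORKERS` the same fibers are scheduled across multiple threads.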

My experience with my prime sieves is that if you allocate one fiber per spawned instance, the program bombs once the fiber allocation becomes too large.

On the other hand, if you allocate a small number of fibers for a large number of spawns (parallel allocations), then system memory use keeps growing until all available memory is consumed, and then the program bombs.

None of these conditions exist in any of the languages mentioned (and others), because they do true parallel jobs-to-cores|threads hardware allocation and don’t use an intermediary software fiber abstraction.

If you want to, I can show you it happening in real time with my code.

Why do you allocate a large number of spawns on a small number of fibers? You should limit the use of Fiber and spawn to a reasonable range. Do you remember this answer by @ysbaddaden in the collatz post?
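Another common way to apply that advice, sketched below with hypothetical names (this is not the code from the collatz post), is to keep spawning a fiber per job but cap how many are in flight by using a fixed-capacity channel as a counting semaphore:

```crystal
# Cap the number of in-flight fibers with a fixed-capacity channel
# acting as a counting semaphore.
total_jobs    = 100_000
max_in_flight = 256

slots = Channel(Nil).new(max_in_flight)
done  = Channel(Int64).new(64)

spawn do
  total_jobs.times do |i|
    slots.send(nil)            # blocks once max_in_flight fibers are running
    spawn do
      done.send(i.to_i64 * i)  # stand-in for real work
      slots.receive            # free the slot; the fiber then exits
    end
  end
end

sum = 0_i64
total_jobs.times { sum += done.receive }
puts sum
```

Either pattern keeps the number of live fibers (and therefore memory use) bounded regardless of how many jobs are submitted.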

Using fibers to try to do true parallel processing is the crux of the problem. Fibers are software abstractions for performing concurrent, not parallel, processing. Parallel processing is about using hardware cores|threads to simultaneously operate processes.

In Rust, C++, D, and other languages that perform true parallel processing, the number of parallel processes to perform is not hardware limited. If I have 8|16|32 etc. threads, I can run 100,000 or more spawned jobs (assuming no memory issues for each thread) by reusing the threads until all the jobs are completed. The more threads you have, the faster it will complete.

In Rust you have crates like Rayon (and its dependencies) that take care of all the necessities to do this and provide a nice user API of methods to use.

So I’m not saying get rid of fibers. I’m saying create a true parallel multi-threaded processing model (à la Rust, D, C++, Nim, etc.) that can perform true parallel processing, so users can choose what|when to use to best fit their applications.
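For comparison, the “reuse the threads until the jobs are done” pattern can be approximated in current Crystal by chunking the work across a fixed number of fibers and compiling with `-Dpreview_mt`, so the scheduler maps them onto the available threads. A hypothetical sketch (placeholder names, trivial work):

```crystal
# Data-parallel chunking: split n jobs into `workers` contiguous ranges,
# one fiber per range. Compile with -Dpreview_mt and run with
# CRYSTAL_WORKERS=<threads> so the fibers execute on separate threads.
n       = 100_000
workers = ENV.fetch("CRYSTAL_WORKERS", "4").to_i
chunk   = (n + workers - 1) // workers

partials = Channel(Int64).new(workers)

workers.times do |w|
  spawn do
    sum   = 0_i64
    first = w * chunk
    last  = Math.min(first + chunk, n)
    (first...last).each { |i| sum += i.to_i64 * i } # stand-in for real work
    partials.send(sum)
  end
end

total = 0_i64
workers.times { total += partials.receive }
puts total
```

This isn’t a Rayon-style work-stealing scheduler, just a static split, but it shows the same jobs-to-threads shape without one fiber per job.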

Parallel processing is about using hardware cores|threads to simultaneously operate processes.

So I’m not saying get rid of fibers. I’m saying create a true parallel multi-threaded processing model (à la Rust, D, C++, Nim, etc.) that can perform true parallel processing, so users can choose what|when to use to best fit their applications.

I thought what you said (the goal you want to achieve) is what this post is about.

Crystal’s fiber multi-threading support is about hiding these details and letting the compiler and runtime handle the creation and reuse of existing threads. Concurrency is about structure; parallelism is about execution.
Parallelism is a property of the runtime of our program, not of the code, so it should be handled by the compiler and runtime: the user just needs to write concurrent code (like goroutines in Go), and the runtime executes it in parallel when possible (as Go does). Once this is done, we wouldn’t need to specify CRYSTAL_WORKERS anymore; MT could even be enabled by default.
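For reference, this is roughly what that looks like from the user’s side today: concurrency is expressed with `spawn` and channels, and parallel execution is opted into at build/run time (assuming the current `-Dpreview_mt` flag and `CRYSTAL_WORKERS` variable):

```crystal
# Concurrent code only; whether it runs in parallel is a runtime property.
#   crystal build -Dpreview_mt app.cr
#   CRYSTAL_WORKERS=8 ./app
ch = Channel(Int32).new

4.times do |i|
  spawn { ch.send(i * i) }
end

4.times { puts ch.receive }
```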

In case you missed it, this is exactly what’s happening:

I’m a bit confused. The part about concurrency vs. parallelism is completely understandable. Parallelism is a necessity for some algorithms to be performant.
But I don’t get how that affects memory allocation. If you allocate the same large number of jobs and execute them in threads, does that use less memory than if you execute them in fibers? I don’t see how that would happen.


Like I said, I can show it happening in real time under certain circumstances with my sieve programs.

System memory use is affected by the number of active threads. If a single thread requires X memory, then N threads will use N·X of system memory. For my sieve code the memory per thread is low, so I don’t have to worry about it. However, for other applications on an older system (8 threads, 8 GB memory) it was a limitation. So the more memory you have, the less you have to worry about this for most applications.

But since you say the RFC 2 implementation will do true MT parallelism, I’ll be pleased when it’s finished and ready to play with.