Adding #size to Channel

For application monitoring purposes I would like to know roughly how many messages are sitting in a Channel created with a buffer.

The naive approach would be to add a size method to Channel that calls the internal Deque's @queue.size. However, the conversation on Restoring Channel#full? suggests that it may not be that simple in multi-threaded (MT) mode.

Any opinions?

Also, the method name size might be problematic, as it could be confused with the capacity of the buffer given when creating a channel, i.e. would Channel.new(1024).size always return 1024, or the number of items?

Maybe a separate method called capacity or some-such could also be added for that piece of information, although personally I don’t need it.
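To make the naive idea concrete, here is a minimal sketch that reopens Channel and adds two helpers. The names buffered_size and buffer_capacity are hypothetical and not part of the standard library. The sketch assumes the internal @queue (a nilable Deque) and @capacity instance variables found in the stdlib source at the time of writing, which may change between Crystal versions. The read is deliberately unsynchronized, so under MT it is only an approximation.

```crystal
class Channel(T)
  # Hypothetical helper: approximate number of messages currently buffered.
  # Assumes the internal `@queue : Deque(T)?`, which is nil for unbuffered channels.
  # No lock is taken, so the value may be stale in multi-threaded mode.
  def buffered_size : Int32
    @queue.try(&.size) || 0
  end

  # Hypothetical helper: the capacity the channel was created with.
  # Assumes the internal `@capacity` instance variable.
  def buffer_capacity : Int32
    @capacity
  end
end

ch = Channel(Int32).new(4)
ch.send(1) # does not block while the buffer has room
ch.send(2)

puts ch.buffered_size
puts ch.buffer_capacity
```

Keeping the two names distinct avoids the size versus capacity ambiguity raised above.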

You might be able to know how full the buffer is, but that will tell you nothing about other potential fibers waiting on that channel’s buffers.

So, you can use ch.@queue.size as a rough indication of whether there are fibers waiting, but it won’t offer a complete measure of load.

Checking the size of @senders and @receivers might help, but they are linked lists, and implementing a size for them would require more synchronization.

Channel is a communication and synchronization tool. It has a buffer to improve concurrent performance but it’s not intended for other use cases. The buffer should actually not be relevant for application monitoring.
If you need some kind of queue for processing messages, you would be better off using a dedicated buffer structure instead of repurposing Channel’s buffer ability. Such a buffer would be easy to instrument.

I have one producer putting messages into the Channel and (currently) one “worker” fibre reading from it (potentially more fibres in future) and doing the heavy lifting. From reading the docs this seems like what a Channel is for but maybe I am wrong.

Would I be able to do the same thing with one producer and multiple “worker” fibres with a simple Deque?

Yes, Channel can work for this use case. But it does nothing more than that. If you want to pull metrics, it seems like you might rather like to use a dedicated queue which provides such features. Channel is just a concurrency primitive. Nothing fancy. It’s not designed to give performance statistics. IMO its capacity should usually be either 0 (unbuffered) or a relatively small number (depending on the use case). It’s not really suitable for say a job queue.

You would still need Channel for synchronization. But you could send messages from the producer through a channel to a dedicated queue (for example based on Deque). Workers can similarly retrieve messages from the queue through a channel.
This might seem like a lot of duplication, but considering Channel only as a communication tool, it makes sense to have a stateful mechanism for storing messages. This brings a lot of flexibility through direct access to the queue and allows you to pull metrics.
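A rough sketch of that arrangement, under some assumptions: MeteredQueue, inbox and outbox are hypothetical names, both channels are unbuffered, and a single fiber owns the Deque so that no extra locking is needed around it. The size is published through an Atomic so other fibers can read it; it is a snapshot and may lag slightly behind the latest send.

```crystal
# Hypothetical dedicated queue: one fiber owns the Deque, producers send to
# `inbox`, workers receive from `outbox`, and `size` can be read for metrics.
class MeteredQueue(T)
  getter inbox : Channel(T)
  getter outbox : Channel(T)

  def initialize
    @inbox = Channel(T).new
    @outbox = Channel(T).new
    @items = Deque(T).new
    @size = Atomic(Int32).new(0)
    spawn { run }
  end

  # Snapshot of the number of queued messages.
  def size : Int32
    @size.get
  end

  private def run
    loop do
      if @items.empty?
        # Nothing to hand out yet, so only accept new messages.
        @items.push(@inbox.receive)
      else
        # Either accept a new message or hand the oldest one to a worker.
        select
        when item = @inbox.receive
          @items.push(item)
        when @outbox.send(@items.first)
          @items.shift
        end
      end
      @size.set(@items.size)
    end
  end
end

queue = MeteredQueue(Int32).new
3.times { |i| queue.inbox.send(i) }
first = queue.outbox.receive # a worker would do this in its own fiber
puts queue.size
```

Because the Deque is a plain object owned by your own code, adding further metrics (high-water mark, wait times) is a matter of extending this class rather than reaching into Channel.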

I think the point here is, if we want to get metrics about the load of the system, we should not look into getting them directly from the concurrency primitives.

That’s understandable. On the other hand, as an engineer looking into capacity planning, I may want to get insight into the efficiency of my system. For this purpose, I would suggest an approach where we instrument our own code, rather than the primitives. For example, our system could run an observer fiber aggregating metrics about system load sent from other fibers.
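As a minimal sketch of such an observer fiber (not the example the author had in mind): LoadEvent, LoadObserver and pending are hypothetical names, and the events channel capacity of 64 is an arbitrary choice. Application code reports what it does, and a single fiber aggregates the counts.

```crystal
# Hypothetical event type reported by the application's own code.
enum LoadEvent
  Enqueued
  Processed
end

# Hypothetical observer: one fiber aggregates events sent by other fibers.
class LoadObserver
  getter events : Channel(LoadEvent)

  def initialize
    @events = Channel(LoadEvent).new(64)
    @enqueued = Atomic(Int32).new(0)
    @processed = Atomic(Int32).new(0)
    spawn { run }
  end

  # Messages reported as enqueued but not yet reported as processed.
  def pending : Int32
    @enqueued.get - @processed.get
  end

  private def run
    # `receive?` returns nil once the channel is closed, which ends the loop.
    while event = @events.receive?
      case event
      when .enqueued?  then @enqueued.add(1)
      when .processed? then @processed.add(1)
      end
    end
  end
end

observer = LoadObserver.new
observer.events.send(LoadEvent::Enqueued)  # the producer reports each message it sends
observer.events.send(LoadEvent::Enqueued)
observer.events.send(LoadEvent::Processed) # a worker reports each message it finishes
Fiber.yield # let the observer fiber catch up
puts observer.pending
```

The channels carrying the actual work stay untouched; only the code around them is instrumented.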

I’d love to get some time to put together an example to illustrate this, but then again, just another item on my list :man_facepalming:
