For application monitoring purposes, I would like to know roughly how many messages are sitting in a `Channel` created with a buffer.
The naive approach would be to add a `size` method to `Channel` that calls `@queue.size` on the internal `Deque`. However, the conversation on Restoring Channel#full? suggests that it may not be that simple with multi-threading (MT).
Any opinions?
Also, the method name `size` might be problematic, as it could be confused with the buffer size given when creating the channel, i.e. would `Channel.new(1024).size` always return 1024, or the number of items? Maybe a separate method called `capacity` or some such could also be added for that piece of information, although personally I don’t need it.
You might be able to know how full the buffer is, but that will tell you nothing about other potential fibers waiting on that channel’s buffers.
So, you can use `ch.@queue.size` as a measure to check whether there are fibers waiting, but it won’t offer a complete measure of load.
Checking the size of `@senders` and `@receivers` might help, but they are linked lists, and implementing a `size` for them would require more synchronization.
`Channel` is a communication and synchronization tool. It has a buffer to improve concurrent performance, but it’s not intended for other use cases. The buffer should actually not be relevant for application monitoring.
If you need some kind of queue for processing messages, you would be better off using a dedicated buffer structure instead of repurposing `Channel`’s buffering ability. Such a buffer would be easy to instrument.
I have one producer putting messages into the Channel and (currently) one “worker” fibre reading from it (potentially more fibres in future) and doing the heavy lifting. From reading the docs this seems like what a Channel is for but maybe I am wrong.
Would I be able to do the same thing with one producer and multiple “worker” fibres with a simple Deque?
Yes, `Channel` can work for this use case. But it does nothing more than that. If you want to pull metrics, you would probably be better served by a dedicated queue which provides such features. `Channel` is just a concurrency primitive. Nothing fancy. It’s not designed to give performance statistics. IMO its capacity should usually be either 0 (unbuffered) or a relatively small number (depending on the use case). It’s not really suitable for, say, a job queue.
You would still need `Channel` for synchronization. But you could send messages from the producer through a channel to a dedicated queue (for example based on `Deque`). Workers can similarly retrieve messages from the queue through a fiber.
This might seem like a lot of duplication, but considering `Channel` only as a communication tool, it makes sense to have a stateful mechanism for storing messages. This brings a lot of flexibility through direct access to the queue and allows you to pull metrics.
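To make the idea above a bit more concrete, here is a rough sketch of such a dedicated queue. All the names (`BufferedQueue`, `push`, `pop`, `size`) are made up for illustration and are not part of the standard library. A single fiber owns a `Deque` and talks to producers and workers only through unbuffered channels, so the `Deque` itself never needs a lock. The reported size is approximate, since it can lag slightly behind a `push` that has just returned.

```crystal
# Hypothetical helper, not a stdlib class: a queue fiber that owns a Deque.
class BufferedQueue(T)
  def initialize
    @items = Deque(T).new
    @inbox = Channel(T).new        # producers -> queue fiber (unbuffered)
    @outbox = Channel(T).new       # queue fiber -> workers (unbuffered)
    @size = Atomic(Int32).new(0)   # last size published by the queue fiber
    spawn { run }
  end

  # Called by producers. Blocks until the queue fiber accepts the item.
  def push(item : T) : Nil
    @inbox.send(item)
  end

  # Called by workers. Blocks until an item is available.
  def pop : T
    @outbox.receive
  end

  # Approximate number of stored items, safe to read from any fiber.
  def size : Int32
    @size.get
  end

  private def run
    loop do
      if @items.empty?
        # Nothing to hand out yet, so only wait for producers.
        @items.push(@inbox.receive)
      else
        # Accept a new item or hand the oldest one to a worker,
        # whichever becomes possible first.
        select
        when item = @inbox.receive
          @items.push(item)
        when @outbox.send(@items.first)
          @items.shift
        end
      end
      @size.set(@items.size)
    end
  end
end

queue = BufferedQueue(String).new
3.times { |i| queue.push("job #{i}") }
Fiber.yield # let the queue fiber catch up before sampling
puts "queued: #{queue.size}"
puts "next:   #{queue.pop}"
```

Note that this sketch lets the `Deque` grow without bound. A real version would probably want an upper limit, at which point the queue fiber would stop receiving from the inbox until a worker takes something.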
I think the point here is, if we want to get metrics about the load of the system, we should not look into getting them directly from the concurrency primitives.
That’s understandable. On the other hand, as an engineer looking into capacity planning, I may want to get insight into the efficiency of my system. For this purpose, I would suggest an approach where we instrument our own code, rather than the primitives. For example, our system could run an `observer` fiber aggregating metrics about system load sent from other fibers.
I’d love to get some time to put together an example to illustrate this, but then again, that’s just another item on my list.
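In the meantime, a very small sketch of the observer idea could look something like the following. The `Event` enum and the `LoadObserver` class are hypothetical names invented for this illustration. The producer reports `Enqueued` before handing a job over, the worker reports `Processed` when it is done, and one observer fiber keeps the running totals. Because only the observer fiber writes the counters, no extra locking is needed in this single-threaded sketch.

```crystal
# Hypothetical event type reported by the instrumented code.
enum Event
  Enqueued
  Processed
end

# Hypothetical observer: aggregates load metrics sent by other fibers.
class LoadObserver
  getter pending = 0
  getter processed = 0
  # Small buffer so that reporting an event rarely blocks the reporter.
  getter events = Channel(Event).new(32)

  def initialize
    spawn do
      # Runs until the events channel is closed.
      while event = @events.receive?
        case event
        when Event::Enqueued
          @pending += 1
        when Event::Processed
          @pending -= 1
          @processed += 1
        end
      end
    end
  end
end

observer = LoadObserver.new
jobs = Channel(Int32).new(4)
finished = Channel(Nil).new

# One worker fiber doing the "heavy lifting".
spawn do
  while jobs.receive?
    # ... real work would go here ...
    observer.events.send(Event::Processed)
  end
  finished.send(nil)
end

# The producer reports first, then hands the job over, so the
# Enqueued event always precedes the matching Processed event.
5.times do |i|
  observer.events.send(Event::Enqueued)
  jobs.send(i)
end
jobs.close
finished.receive

Fiber.yield # give the observer a chance to drain its events
puts "pending=#{observer.pending} processed=#{observer.processed}"
```

The monitoring code can then read `observer.pending` whenever it likes, without touching the internals of `Channel` at all. With more workers, each one would simply send its own `Processed` events to the same observer.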