Check for newly received messages in HTTP::WebSocket

Hello,

I’m using HTTP::WebSocket to handle websocket connections, but I pass the underlying file descriptor to select(2) to know when a new message has arrived (because I manage multiple file descriptors). I also wrote a read function in the HTTP::WebSocket class to read a single message (as discussed in a previous forum thread, "WebSocket and silent disconnections (solved)"; I also sent a PR: https://github.com/crystal-lang/crystal/pull/8392 ). Now I’ve run into a problem.

HTTP::WebSocket can read a message over the TCP connection, and this class manages boundaries correctly: when a single client sends multiple messages at the same time, they do not overlap and we get them separately. However, only the first message is handled by my read function (which is basically the #run method without the loop). Once that message has been handled, there is nothing left to read on the file descriptor, so select(2) does not return.

So, I think that HTTP::WebSocket buffers messages, and I would like to know if there is a message in the buffer I can read.

I tried to read HTTP::WebSocket and HTTP::WebSocket::Protocol, but they are a bit obscure to me. As I understand it, the @io attribute in HTTP::WebSocket::Protocol is used to read messages from the underlying socket, but I don’t fully understand what’s going on.

Thanks for the help!

I investigated: the problem comes from the Socket library, which has buffered IO.

Here is how I create the websocket:

# I accept a TCP connection
client = server.accept
# [...] I do the websocket handshake myself, for reasons
# then at some point I create the WebSocket based on the TCP Socket
wsclient = WebSocket.new client

Now, since the Socket IO is buffered:

  1. the first return of the select function warns me about a new received packet,
  2. I read a single message but the others are buffered by the Socket library,
  3. the code goes back to the select, which doesn’t return since there is nothing to read on the file descriptor anymore,
  4. once another message is received, the client’s WebSocket instance reads a buffered message,
  5. since (4) hasn’t read anything on the file descriptor, select returns immediately.

(4) and (5) loop until the end of the buffer, when we have to read the socket for new messages.
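
The buffering described in the steps above can be reproduced without a websocket. This is a minimal sketch, assuming an IO.pipe as a stand-in for the TCP socket (both are buffered IOs) and newline-terminated lines as stand-ins for websocket messages. It uses peek, which returns what is already in the read buffer and only performs a read if that buffer is empty, so calling it on an empty buffer can block.

```crystal
# Assumption: a pipe stands in for the client socket, lines stand in for messages.
reader, writer = IO.pipe

# Two "messages" arrive back to back, as if sent at the same time.
writer.puts "first"
writer.puts "second"
writer.flush

# Reading one message pulls everything available into the user-space buffer.
first = reader.gets

# The buffer is not empty, so peek returns its content without another read.
# The second message is already here and no longer pending on the descriptor.
leftover = reader.peek.try { |bytes| String.new(bytes) }

puts first
puts leftover.inspect
```

Since the second message now lives in the IO's own buffer, a select(2) call on the descriptor has nothing to report until more data arrives.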

The question is: what should I do?

  1. Make unbuffered sockets? I don’t think so. I think it will create other problems.
  2. Create a function in the Socket or IO class to get the buffer size? Would a simple function calling .size on IO#in_buffer_rem be enough?

Thanks for your patience.

I fixed the problem by creating a function that tells if there is still a message in the buffer of the Socket.

The Socket library uses the IO::Buffered module, which I chose to extend with this new function (it’s relevant for any class including the IO::Buffered module). Since HTTP::WebSocket is compatible with all IO instances, IO needs this function as well, even if it doesn’t really make sense there.

# IO class with the new function, always returning false
class IO
  def still_something_to_read? : Bool
    false
  end
end

# IO::Buffered module informs when a message is still in the buffer
module IO::Buffered
  def still_something_to_read? : Bool
    @in_buffer_rem.size > 0
  end
end

With these functions, I’m able to loop over buffered messages. I think this can be useful and I will upstream it. I’m just not sure about the function name. Any ideas?
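
A loop over buffered messages could look roughly like the sketch below. It is not the actual application code: a pipe and plain lines stand in for the socket and the websocket messages, and the helper is redefined here only to keep the example self-contained. It relies on the internal @in_buffer_rem variable of IO::Buffered, which is an implementation detail.

```crystal
# Helper from the post above, repeated so the sketch runs on its own.
module IO::Buffered
  def still_something_to_read? : Bool
    @in_buffer_rem.size > 0
  end
end

# Assumption: a pipe stands in for the client socket, lines for messages.
reader, writer = IO.pipe
writer.puts "one"
writer.puts "two"
writer.puts "three"
writer.flush

messages = [] of String

# Pretend select(2) just reported the descriptor as readable:
# read one message, then drain whatever is still buffered.
if line = reader.gets
  messages << line
end
while reader.still_something_to_read?
  if line = reader.gets
    messages << line
  end
end

p messages
```

Note that a non-empty buffer only means some bytes are left, not that a complete message is available, so a read inside the loop may still have to wait for the rest of a message.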

So, I’m replying to myself once again, this time to announce the PR: https://github.com/crystal-lang/crystal/pull/8693

\o/

I chose to name the method empty?. Is this name appropriate for this function? Hope so.

Thanks everyone for the read.

I don’t think I completely understand your use case. Could you share some code to show what you’re trying to do? Ideally it would be a reduced example, but the actual implementation would also help.

Have you read Concurrency - Crystal? I think you’re approaching this problem the wrong way for Crystal. Crystal handles non-blocking IO internally, so you shouldn’t call select yourself; use fibers and channels instead.
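
As a rough illustration of the fiber-and-channel approach, here is a minimal sketch. Its assumptions: pipes stand in for client sockets, lines stand in for messages, and the Incoming record and inbox channel are hypothetical names. Each connection gets its own fiber that reads in a loop and forwards what it reads to a shared channel, so no manual select is involved.

```crystal
# Hypothetical message wrapper: which connection sent what.
record Incoming, source : Int32, text : String

inbox = Channel(Incoming).new

# Assumption: two pipes stand in for two client connections.
pipes = Array.new(2) { IO.pipe }

pipes.each_with_index do |(reader, _writer), id|
  spawn do
    # gets suspends only this fiber until a full line is available;
    # buffered data is consumed by the same loop, so nothing gets stuck.
    while line = reader.gets
      inbox.send Incoming.new(id, line)
    end
  end
end

# Simulate each client sending two messages at once.
pipes.each_with_index do |(_reader, writer), id|
  writer.puts "hello from #{id}"
  writer.puts "bye from #{id}"
  writer.flush
end

# The main fiber handles messages from all connections in one place.
received = [] of Incoming
4.times { received << inbox.receive }
received.each { |msg| puts "#{msg.source}: #{msg.text}" }
```

With HTTP::WebSocket the shape would presumably be the same, with each fiber running the websocket's read loop for one client. The order in which messages from different connections arrive on the channel is not guaranteed.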