Asynchronous HTTP requests


#1

So, in the vein of @aarongodin’s comments: if you do

require "http"
addr = "https://jsonplaceholder.typicode.com/todos/1"
HTTP::Client.get addr do |result|
  if result.status_code == 200
    IO.copy result.body_io, STDOUT
    puts "\n ^^ result from #{addr}"
  else
    puts "request failed with status #{result.status_code.inspect}"
  end
end
puts "after the request"
Fiber.yield
puts "after yielding the fiber"

The output is:

{
  "userId": 1,
  "id": 1,
  "title": "delectus aut autem",
  "completed": false
}
 ^^ result from https://jsonplaceholder.typicode.com/todos/1
after the request
after yielding the fiber

… no spawn, which is kind of expected, since nothing runs concurrently. However, if you do…

require "http"
addr = "https://jsonplaceholder.typicode.com/todos/1"
spawn do
  HTTP::Client.get addr do |result|
    if result.status_code == 200
      IO.copy result.body_io, STDOUT
      puts "\n ^^ result from #{addr}"
    else
      puts "request failed with status #{result.status_code.inspect}"
    end
  end
end
puts "after the request"
Fiber.yield
puts "after yielding the fiber"

…I would expect the result to be

after the request
{
  "userId": 1,
  "id": 1,
  "title": "delectus aut autem",
  "completed": false
}
 ^^ result from https://jsonplaceholder.typicode.com/todos/1
after yielding the fiber

… but instead, it is…

after the request
after yielding the fiber

…is there any way to asynchronously perform an HTTP request?? That seems kinda essential…


#2

Hi!

You might want to read this

When you do Fiber.yield, the runtime checks whether there’s another fiber ready to execute. Probably there is none, because the HTTP client is still waiting to hear back from the server, so execution continues on the main fiber, prints “after yielding the fiber” and then finishes.

It’s probably easier if you tell us what you want to do and we can tell you how to solve it.
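The scheduler behavior described here is easy to see with a fiber that does no IO at all. This is just an illustrative sketch, not code from the thread:

```crystal
spawn do
  puts "in the spawned fiber"
end
Fiber.yield # the spawned fiber is ready (it never blocks), so it runs now
puts "back in the main fiber"
```

Because the spawned fiber never waits on IO, Fiber.yield does hand control to it, and “in the spawned fiber” prints before “back in the main fiber”. With the HTTP request, the spawned fiber is parked waiting on the socket, so there is nothing ready for the scheduler to run.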


#3

You might want to replace Fiber.yield with sleep and some timeout (or literally anything that keeps the main fiber busy). Fiber.yield is not guaranteed to yield to the spawned fiber performing the request, and even if it does, it might yield back to the main fiber as soon as it hits some IO it needs to wait for. The main fiber then finishes, and the program exits before the request is completed.
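As a minimal sketch of that suggestion (using the same URL as the earlier posts), replacing Fiber.yield with a sleep keeps the main fiber alive long enough for the request to finish:

```crystal
require "http"

addr = "https://jsonplaceholder.typicode.com/todos/1"
spawn do
  HTTP::Client.get addr do |result|
    IO.copy result.body_io, STDOUT
    puts "\n ^^ result from #{addr}"
  end
end
puts "after the request"
sleep 5.seconds # crude: waits a fixed time regardless of when the request finishes
puts "after sleeping"
```

This is only a stopgap; later posts in the thread cover more robust ways to wait for the fiber.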


#4

@scott I’d suggest taking a look at the await_async shard, which gives you nice syntactic sugar, IIUC useful in such scenarios.


#5

I’d suggest taking a look at await_async

That’s a really unfortunate name. It has nothing to do with the traditional async/await found in other languages like C# and JavaScript. This is just a regular Future, Task or Promise.


#6

True indeed, @anykeyh might be interested in choosing another name for it then… :slight_smile:


#7

So much trouble I’ve had from naming this gem this way, while it behaves exactly like await/async in Scala, or in the once-awesome IcedCoffeeScript. Two languages somehow way better than JavaScript (trolling intended :smile_cat:).

But yeah, basically await_async gives you some syntactic sugar to wrap promises. It also defers exceptions raised in the fiber to the fiber waiting for the result, for simplicity’s sake.
It was built to ship small script-like applications, where architecture doesn’t matter so much. I personally use it actively in HTTP scraping tasks.

If your application turns out to be bigger, you may want to implement job queuing with channels. They are standard, easy to write, and will cost you only a few more lines of code.
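A sketch of that channel-based approach (the URLs are just the placeholder service used earlier in the thread): each fiber sends its result over a channel, and the main fiber blocks on receive until every request has reported back:

```crystal
require "http"

addrs = [
  "https://jsonplaceholder.typicode.com/todos/1",
  "https://jsonplaceholder.typicode.com/todos/2",
]

done = Channel(String).new

addrs.each do |addr|
  spawn do
    response = HTTP::Client.get(addr)
    done.send "#{addr} -> #{response.status_code}"
  end
end

# receive blocks, so the main fiber can't exit before all requests finish
addrs.size.times { puts done.receive }
```

Unlike sleep, this waits exactly as long as the requests take, and the requests themselves run concurrently.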


#8

while it behaves exactly like await/async in Scala

No, it doesn’t. Check the Scala example: you are supposed to call await inside an async block. Then Scala presumably rewrites the whole thing into continuations or something.

In your case you have async, which means future, and await, which means “wait for the future to complete and get its result”.

I still think that naming this async/await is not correct.


#9

IMO, I view this as an issue. Fiber.yield should yield to that previous spawn, and the flow of execution the OP expected is logically sound. That’s probably what most developers think should happen (unless you know the inner workings of the language).

Replacing Fiber.yield with sleep, or with a timeout to “fill the fiber”, would never really cross a developer’s mind, because they believe Fiber.yield is waiting for the previous spawn. This is not illogical. @scott Just curious, is this kinda how you felt as well?

edit: I’ve also seen this issue/question being brought up before AFAIK
edit2: When I say “a developer’s mind”, I mean a non core developer of the language


#10

Sorry that I haven’t been on in a couple of days. This topic came up in the Gitter channel, and that is why I brought up the question. I have found a solution. As soon as I saw @Sija’s comment about the await_async shard, I remembered Crystal’s built-in future implementation! Using this feature solves the issue IMO:

require "http"
addr = "https://jsonplaceholder.typicode.com/todos/1"
promise = future do
  HTTP::Client.get addr do |result|
    if result.status_code == 200
      IO.copy result.body_io, STDOUT
      puts "\n ^^ result from #{addr}"
    else
      puts "request failed with status #{result.status_code.inspect}"
    end
  end
end
puts "after the request"
promise.get
puts "after yielding the fiber"

#11

Interesting, I was not aware of Crystal’s future method. I’ll try it out, as it looks promising (hehe) for what I’m looking to do.

To respond to others asking about the use case for this: I’m a programmer coming primarily from JS, where it’s a concern to perform synchronous tasks that halt code execution. I suppose I’m carrying over that concept, and I wanted to better understand what happens when I call HTTP::Client.get.


#12

If only it were that simple.

Suppose you have a DNS name example.com with three A records. Suppose that it goes down for some reason. Your call will block for 6 minutes (120 s is the default TCP timeout, times the number of IP addresses that the socket will try to connect to). Finally, suppose that your program emits an HTTP request every second to ping something there, and I hope you can see the problem.
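For what it’s worth, those defaults can be tightened on the client instance itself. This is a sketch using the connect_timeout and read_timeout setters; example.com stands in for any host here, and the exception class is IO::TimeoutError in recent Crystal versions (older releases used IO::Timeout):

```crystal
require "http"

client = HTTP::Client.new(URI.parse("https://example.com"))
client.connect_timeout = 5.seconds # fail fast instead of the OS-level TCP timeout
client.read_timeout = 10.seconds   # bound how long we wait for the response

begin
  response = client.get("/")
  puts response.status_code
rescue IO::TimeoutError
  puts "request timed out"
end
```

This doesn’t address the broader point about futures piling up, but it does bound how long any single blocked request can hang.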


#13

Eventually, HTTP::Client should be able to try to connect to all possible endpoints concurrently if the first one doesn’t respond immediately (way before any regular timeout). We’re not there yet, but I think that’s the plan. The user shouldn’t have to worry about such details.


#14

No, but that’s not the point, it’s just an example. My point is that Futures are not a good enough abstraction to let you forget about possible problems, because you will end up with millions of futures waiting to resolve if for any reason your code starts to block, and that can be anything from sockets to files to child processes to plain old race conditions.