Negative WaitGroup counter (Runtime Error)

Hi,

Do you have any idea why I get this funny runtime exception when I use my C function with the otherwise working code from Jamie (he used HTTP::Client instead)?

karl@rantanplan:~/src/crystal/futures$ ./kurl 
LibAcme.curl("https://crystal-lang.org") # => 200
200 www.python.org
Unhandled exception in spawn: Negative WaitGroup counter (RuntimeError)
  from /home/karl/.crystal-1.13.2-1/share/crystal/src/wait_group.cr:80:5 in 'add'
  from /home/karl/.crystal-1.13.2-1/share/crystal/src/wait_group.cr:87:5 in 'done'
  from src/kurl.cr:52:5 in '->'
  from /home/karl/.crystal-1.13.2-1/share/crystal/src/fiber.cr:143:11 in 'run'
  from /home/karl/.crystal-1.13.2-1/share/crystal/src/fiber.cr:95:34 in '->'
  from ???
Unhandled exception in spawn: Negative WaitGroup counter (RuntimeError)
  from /home/karl/.crystal-1.13.2-1/share/crystal/src/wait_group.cr:65:7 in 'add'
  from src/kurl.cr:42:5 in '->'
  from /home/karl/.crystal-1.13.2-1/share/crystal/src/fiber.cr:143:11 in 'run'
  from /home/karl/.crystal-1.13.2-1/share/crystal/src/fiber.cr:95:34 in '->'
  from ???

The C function works when called stand-alone (LibAcme.curl("https://crystal-lang.org") # => 200); it only fails in the fiber/WaitGroup context. It also fails when I call it directly instead of through the anonymous function.

@[Link("curl")]
@[Link(ldflags: "#{__DIR__}/libacme.o")]
lib LibAcme
  fun curl = ripp(url : LibC::Char*) : LibC::Long
end

require "http"
require "wait_group"

urls = %w{
  www.python.org
  www.cpan.org
  www.perl.org
  developer.apple.com
  www.sbcl.org
  crystal-lang.org
  www.graalvm.org
}

fn = ->(url : String) {
  LibAcme.curl "https://#{url}"
}

p! LibAcme.curl("https://crystal-lang.org")

# channel = Channel(Tuple(HTTP::Status, String)).new
channel = Channel(Tuple(Int64, String)).new

producer_wg = WaitGroup.new
consumer_wg = WaitGroup.new

urls.each do |url|
  producer_wg.add
  spawn do
    # result = {HTTP::Client.get("https://#{url}").status, url}
    result = {(fn.call(url)), url}
    channel.send result
    consumer_wg.add
  ensure
    producer_wg.done
  end
end

spawn do
  while result = channel.receive?
    status, url = result
    puts "#{status} #{url}"
    consumer_wg.done
  end
end

producer_wg.wait
channel.close
consumer_wg.wait
  

Here is the code of the lib (though I think the error doesn't have anything to do with it):

#include <curl/curl.h>

long int ripp(const char *url) {
  CURL *curl = curl_easy_init();
  long status = 0;
  const char *uagent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) "
                       "AppleWebKit/605.1.15 (KHTML, like Gecko) "
                       "Version/13.0 Safari/605.1.15";

  if (!curl)
    return status;

  curl_easy_setopt(curl, CURLOPT_TIMEOUT, 2L);
  curl_easy_setopt(curl, CURLOPT_URL, url);
  curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);
  curl_easy_setopt(curl, CURLOPT_USERAGENT, uagent);
  curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
  curl_easy_perform(curl);
  curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &status);
  curl_easy_cleanup(curl); /* release the handle */
  return status;
}

Thanks for any hint.

Best regards, Karl

WaitGroup is meant to count the number of live fibers, not the number of messages sent through a channel (or anything else). You increment before you spawn a fiber, and decrement before terminating the fiber.

In your case, you’ll have N producer fibers (where N is urls.size) but only 1 consumer fiber (that’s enough).
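
For illustration, here is a minimal sketch of that pattern (the fetch proc and the shortened url list are placeholders, not from your original code): each WaitGroup counts fibers, the producers add/done around their own lifetime, and the single consumer gets its own counter of 1.

require "wait_group"

fetch = ->(url : String) { LibAcme.curl "https://#{url}" }

urls        = %w{crystal-lang.org www.python.org}
channel     = Channel(Tuple(Int64, String)).new
producer_wg = WaitGroup.new
consumer_wg = WaitGroup.new

urls.each do |url|
  producer_wg.add # increment before spawning each producer fiber
  spawn do
    channel.send({fetch.call(url), url})
  ensure
    producer_wg.done # decrement when the fiber terminates
  end
end

consumer_wg.add # exactly one consumer fiber
spawn do
  while result = channel.receive?
    status, url = result
    puts "#{status} #{url}"
  end
ensure
  consumer_wg.done
end

producer_wg.wait # all producer fibers have terminated
channel.close    # lets receive? return nil in the consumer
consumer_wg.wait # the consumer has drained the channel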

As I mentioned: if I use the original code, result = {HTTP::Client.get("https://#{url}").status, url}, the exception is not thrown. Then it works like a charm.

Update:

Changing the code in the loop makes it work with both HTTP::Client and libcurl.

Strangely enough, the culprit was the consumer_wg.add inside the spawn.
Strangely enough, the whole affair is more or less coding by guessing.

urls.each do |url|
  producer_wg.add
  consumer_wg.add # <-
  spawn do
    # result = {HTTP::Client.get("https://#{url}").status, url}
    result = {fn.call(url), url}
    channel.send result
    # consumer_wg.add # <-
  ensure
    producer_wg.done
  end
end

I think I originally put the consumer_wg.add line inside the loop to account for exceptions in trying to make the request (for example, DNS failure, TCP closed stream, or other I/O errors). I was trying to only add something to the consumer wait group if a response was received and added to the channel — IIRC I added a timeout that raised an error to simulate this. Otherwise, you end up waiting to consume responses that never come back.

What Julien mentioned about the wait group being intended for counting fibers makes sense here, and maybe having a consumer wait group isn’t the best choice. It was just what I could come up with in the moment that worked for the test cases I threw at it.
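
One way to keep that property with just the producer-side WaitGroup is to rescue inside the fiber and send a sentinel status, so the consumer still receives exactly one message per URL even when the request fails. A rough sketch; the rescue and the 0_i64 sentinel are my assumption here, not something from the original code:

urls.each do |url|
  producer_wg.add
  spawn do
    status =
      begin
        fn.call(url)
      rescue # e.g. DNS failure, closed stream, timeout
        0_i64 # hypothetical sentinel: the consumer never waits for a missing reply
      end
    channel.send({status, url})
  ensure
    producer_wg.done
  end
end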

Actually, everything can be done more simply:

@[Link("curl")]
@[Link(ldflags: "#{__DIR__}/src/libacme.o")]
lib LibAcme
  fun curl = ripp(url : LibC::Char*) : LibC::Long
end

require "wait_group"

def frob(fn : Proc, urls : Array(String)) : Array(Tuple(Int64, String))
  wg = WaitGroup.new
  chanl = Channel(Tuple(Int64, String)).new

  urls.each do |url|
    wg.add
    spawn do
      result = {fn.call(url), url}
      chanl.send result
    ensure
      wg.done
    end
  end

  results = Array(Tuple(Int64, String)).new

  spawn do
    while result = chanl.receive?
      results << result
    end
  end

  wg.wait
  results
end

urls = %w{
  www.python.org
  www.cpan.org
  www.perl.org
  developer.apple.com
  www.sbcl.org
  crystal-lang.org
  www.graalvm.org
}

curl = ->(url : String) {
  LibAcme.curl "https://#{url}"
}

frob(curl, urls).each do |status, url|
  puts "#{status} #{url}"
end

This works with libcurl as well as with HTTP::Client.
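
The HTTP::Client variant is just a proc of the same shape; roughly like this (a sketch, converting the status code to Int64 so the channel type inside frob still matches):

require "http/client"

http = ->(url : String) {
  HTTP::Client.get("https://#{url}").status.code.to_i64
}

frob(http, urls).each do |status, url|
  puts "#{status} #{url}"
end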

I was just quite surprised that the code behaved differently depending on the method used for the HTTP requests.

Regards, Karl