Waiting for a page to finish loading when using HTTP::Client (or crest)?

Here is an example.

http://www.ip111.cn returns the user's own IP addresses as seen from a site inside China and from a site outside China, respectively (the two can differ).

How can this site know the IP as seen from abroad? It uses an iframe tag to send a request to another site, which then reports the IP seen from abroad, as in the following screenshot:

My question is: is it possible, using only HTTP::Client or a crest-like tool, to wait for www.ip111.cn to finish loading its iframe tag, and then use a simple parser, e.g. lexbor, to parse the expected response?

Thanks.

If you mean how does it know whether you’re in China, it probably uses something like MaxMind to geolocate the user’s IP address. It can also use ASNs as an even more performant way to determine whether the user is in a given network. At my company, we use ASNs in the algorithm that detects bot traffic to determine whether to block requests before they even reach the app.

Sure, that can be done with just the stdlib.

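require "http/client"
require "xml"

url = "http://www.ip111.cn/"
# Fetch the page and look for the first iframe in the returned HTML.
if iframe = XML.parse_html(HTTP::Client.get(url).body).xpath_node("//iframe")
  # The iframe's `src` is the URL a browser would request next.
  if src = iframe["src"]?
    uri = URI.parse(src)
    # This prints the host part of the `src` URL (nil if it has none).
    ip = uri.host

    pp ip
  else
    puts "The iframe has no `src` attribute."
  end
else
  puts "no iframe"
end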

But what’s your goal in doing that?

Sorry for the confusion. What I want is to get my public IP as seen when visiting a site inside China and a site outside China.

This is almost exactly what I do right now (check this code). What I really want is to get the result directly from the website I am visiting, instead of making the extra requests on my side.

As in the following screenshot:

If you wait a while, it shows the IP seen when visiting (1) a site in China, (2) a site outside China, and (3) a site blocked by China.

If we used Selenium instead (although it is indeed a bit overkill for such a simple tool), we could query those elements until they return the expected result (the IP addresses above).

My question is: is it possible to do the same work with a simple HTTP::Client?

Thanks

@zw963 what your code is doing I think is correct - when a browser receives an HTML payload from a server that contains an iframe tag with a src attribute, the browser also makes a request to that provided URL and puts the newly returned HTML into the iframe tag in the original page (i.e. the browser is also making two requests).

Selenium essentially drives a real browser (headless or not), so it runs through those 2+ requests for you. I don’t think it would necessarily make sense for HTTP::Client to do this for you, since the behavior of the iframe tag is part of the HTML specification, not the HTTP protocol itself.
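To make the "two requests" point concrete, here is a rough sketch of doing by hand what the browser does for an iframe: fetch the page, collect every iframe `src`, then request each of those URLs and read the body. The helper names `iframe_urls` and `fetch_iframe_bodies` are made up for illustration, and I'm assuming the iframe response body contains the IP in some readable form, which I haven't verified for this site. It also won't help if the page fills in the values with JavaScript rather than an iframe.

```crystal
require "http/client"
require "xml"
require "uri"

# Returns the absolute URL of every iframe in `html` that has a non-empty
# `src`, resolved against `base` so relative paths work too.
def iframe_urls(html : String, base : URI) : Array(URI)
  XML.parse_html(html).xpath_nodes("//iframe").compact_map do |node|
    src = node["src"]?
    base.resolve(src) if src && !src.empty?
  end
end

# Hypothetical helper: fetches the page, then each iframe, mimicking the
# extra requests a browser would make. Returns {iframe URL, body} pairs.
def fetch_iframe_bodies(page_url : String) : Array(Tuple(URI, String))
  base = URI.parse(page_url)
  page = HTTP::Client.get(base).body
  iframe_urls(page, base).map do |frame_url|
    {frame_url, HTTP::Client.get(frame_url).body.strip}
  end
end

# Example call (makes real network requests):
# fetch_iframe_bodies("http://www.ip111.cn/").each do |url, body|
#   puts "#{url}: #{body}"
# end
```

There is nothing to "wait" for here: HTTP::Client returns the HTML exactly as the server sent it, so the second round of requests has to be made explicitly, and each iframe body may itself need parsing to pull out the IP.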

Regarding this point:

"the browser also makes a request to that provided URL and puts the newly returned HTML into the iframe tag in the original page (i.e. the browser is also making two requests)"

In fact, the real issue is that some sites make the situation more complex: they may not use an iframe at all, and even visiting the provided URL directly (to get the real IP) is not possible, because a token must be provided. https://ip.skk.moe/ is an example; see the screenshot for a token example.

So, I think a headless browser might be the only solution in this case.