Block inlining vs method inlining

The documentation stresses that blocks are always inlined. So, given this code:

module M
  def self.s
    t = 0.0

    N.r do |x|
      t += x
    end

    t
  end
end

module N
  F = 7.5

  def self.r
    t = 2.0
    while (x = rand) < 0.8
      yield x*t*F
    end
  end
end

if I understand the docs correctly, the method M.s ends up being something like this (conceptually):

def self.s
  t = 0.0

  __t = 2.0
  while (__x = rand) < 0.8
    x = __x*__t*::N::F
    t += x
  end

  t
end

(Of course, the __ prefix is not meant literally; it just means “the variables somehow do not clash”.)

Wouldn’t you call that “method inlining”? That is what I visualize when a compiler inlines functions in C or Erlang.

Would it be more correct to say: “methods that accept a block are inlined”?

No, it works the other way around. s will turn into something like

  def self.s
    t = 0.0

    N.r__1(out t)

    t
  end

and a copy of r will be generated for each distinct block it’s passed, so r__1 turns into something like:

  def self.r__1(__t)
    t = 2.0
    while (x = rand) < 0.8
      __x = x*t*F
      __t += __x
    end
  end

It’s very similar to method inlining, except that the block doesn’t cause a method to be inlined; the block itself is the method being inlined.


Really! I believe this section of the reference is not very clear then.

You really define a function for every single call site of the method? Does every call to each in the whole Crystal stdlib and in user code create a distinct each method?

Was that out in your snippet pseudo-code for closured variables? What happens with self? That connects with the discussion in Performance struct vs class.

You really define a function for every single call site of the method? Does every call to each in the whole Crystal stdlib and in user code create a distinct each method?

As far as I’m aware, yes.

Was that out in your snippet pseudo-code for closured variables? What happens with self?

Yes, consider that pure pseudo-code; I don’t think there’s valid Crystal code to emulate what really happens :) I just meant to indicate that __t is the callsite’s t. You can imagine the generated method somehow has full access to the callsite’s scope, in the case of a closuring block anyway.


Glad I asked! I had it backwards!

So I went full circle. First I imagined literal block inlining. Since the blocks passed at different call sites are generally different, the obvious meaning of “inlining” made no sense, and then the examples in the reference made me think it was the method being inlined. So I am back at “each call site invokes a copy of the original method with the block inlined, and closured vars, constants, self, etc. working as expected”, whatever that technically translates to.

It actually works as Xavier says. Blocks with yield don’t generate new method definitions. They are inlined.


This can also be seen if you generate the LLVM IR (e.g. with crystal build --emit llvm-ir): you won’t find a definition of r (nor anything similar to it), and s will have part of r inlined in it.

That’s interesting! So, since a method that yields is always inlined, that method does not even need to exist at all in the final program.

Oh sorry, I didn’t realize! Ignore everything I said then :D

When I was working on the debugger and debugging the compiler, I was looking explicitly at

visit(Yield)

and

codegen_with_block:

and it was clear that inlining happens in the case of a block. Also, when I looked into the generated IR code for methods with blocks, it showed the code as inlined.

No problem! It’s not explained anywhere, so you could only guess.

It took me a long time to realize this, as I could not understand at the time why local block variables were not shown in the debugger, and why Crystal had a bug with methods with blocks where it jumped somewhere in the local file instead of going into the file that contains the source code for that method.

Some bits and pieces from @asterite helped me click all the pieces into the whole picture.

So that section is correct though.
