I feel so helpless right now

I don’t expect much, but maybe someone has ideas?

    {dbg: {13, 0.852736937448849, Pointer(Float64)@0x83ca2c68, Pointer(Fiber::Context)@0x14e2bd4, Pointer(Void)@0x83ca2a6c, Pointer(Fiber::Context)@0x70df1c, Pointer(Void)@0xb64a2c7c}}

    Thread 1 "tcp-latency" received signal SIGSEGV, Segmentation fault.
    0x004a7e5e in swapcontext () at /usr/share/crystal/src/fiber/context/i686.cr:20
    (gdb) x/1i 0x004a7e5e
    => 0x4a7e5e <swapcontext+30>:   pop    %esi
    (gdb) info registers
    eax            0x70df1c 7397148
    ecx            0x14e2bd4        21900244
    edx            0xd      13
    ebx            0x5b9000 6000640
    esp            0xb64a2c7c       0xb64a2c7c
    ebp            0x83ca2bc0       0x83ca2bc0
    esi            0x83ca2a6c       -2083902868
    edi            0x14e2bd4        21900244
    eip            0x4a7e5e 0x4a7e5e <swapcontext+30>
    eflags         0x10202  [ IF RF ]
    cs             0x73     115
    ss             0x7b     123
    ds             0x7b     123
    es             0x7b     123
    fs             0x0      0
    gs             0x33     51

I’m on i686, by the way. The dbg line at the top is produced by this patch to resume (I also made callstack public):

    current, @current = @current, fiber
    a = rand()
    p dbg: {CallStack.new.callstack.size, a, pointerof(a), pointerof(current.@context), current.@context.stack_top, pointerof(fiber.@context), fiber.@context.stack_top}
    Fiber.swapcontext(pointerof(current.@context), pointerof(fiber.@context))

I think I understand now: esp holds 0xb64a2c7c, but inspecting /proc/maps shows:

    b54a3000-b54a4000 ---p 00000000 00:00 0
    b54a4000-b5ca3000 rw-p 00000000 00:00 0
    b64a3000-b64a4000 ---p 00000000 00:00 0
    b64a4000-b6ca3000 rw-p 00000000 00:00 0

So esp points below the lowest mapped stack region: the memory was somehow unmapped, despite the pointer to it still existing in the program inside new_context.stack_top. I should say that what I’m doing is not the documented use of fibers (this particular fiber had already finished), but if anyone is interested, well, here you go.

Maybe you are experiencing the same thing that was solved very recently in https://github.com/crystal-lang/crystal/pull/8138

I’m really not qualified to give an opinion one way or the other. I’m not using MT, though, if it makes a difference.

Sorry, from that I thought that you might be using fibers with threads, even without the new MT.

What is the undocumented use of fibers you are doing?

Subclassing Fiber, patching the scheduler, and in general storing fiber instances. I’m starting to think that last bit was a mistake, but why does the GC free a block that is still in use?

I know it only marks “living fibers” as GC roots when it does a collection… I wonder if that plays into this…

The issue I pointed to was because the fibers’ stack memory is kept in a pool, to reduce pressure on the GC. But stacks were being marked as available in the pool before the fiber using them had finished. If that doesn’t give you an idea of what’s going wrong in your scenario, ignore it.

I think that’s indeed what’s going on. I was under the impression that stacks are under GC control, but in fact their lifecycle is managed directly by Fiber::StackPool:

    pointer = LibC.mmap(nil, STACK_SIZE, LibC::PROT_READ | LibC::PROT_WRITE, flags, -1, 0)
    LibC.munmap(stack, STACK_SIZE)

All I have left is a case where this behaviour occurs with ~100 fibers but not with ~5, where the program runs fine.

Thank you very much, I’ve definitely learned something.