Invalid memory access raised instead of the expected cast-failed error: what tools should I use to debug this type of error?

Because the code involved is quite complicated, producing a minimal reproduction is not easy.

When I run my code, I get an error like this:


╰─ $ cr run -d src/procodile.cr -- status
Invalid memory access (signal 11) at address 0x0
[0x555a2a973b86] *Exception::CallStack::print_backtrace:Nil +118 in /home/zw963/.cache/crystal/crystal-run-procodile.tmp
[0x555a2a958b2e] ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil) +366 in /home/zw963/.cache/crystal/crystal-run-procodile.tmp
[0x7f0bd4462a00] ?? +139688782735872 in /usr/lib/libc.so.6
[0x555a2a95eac8] ~procProc((Procodile::ControlClientReplyForStatusCommand | Nil)) +40 in /home/zw963/.cache/crystal/crystal-run-procodile.tmp
[0x555a2aa781f8] *Procodile::CLI#dispatch:Nil +248 in /home/zw963/.cache/crystal/crystal-run-procodile.tmp
[0x555a2a946bd4] __crystal_main +4356 in /home/zw963/.cache/crystal/crystal-run-procodile.tmp
[0x555a2ab924bd] *Crystal::main_user_code<Int32, Pointer(Pointer(UInt8))>:Nil +45 in /home/zw963/.cache/crystal/crystal-run-procodile.tmp
[0x555a2ab923fe] *Crystal::main<Int32, Pointer(Pointer(UInt8))>:Int32 +94 in /home/zw963/.cache/crystal/crystal-run-procodile.tmp
[0x555a2a954a7d] main +45 in /home/zw963/.cache/crystal/crystal-run-procodile.tmp
[0x7f0bd444d290] ?? +139688782647952 in /usr/lib/libc.so.6
[0x7f0bd444d34a] __libc_start_main +138 in /usr/lib/libc.so.6
[0x555a2a9459f5] _start +37 in /home/zw963/.cache/crystal/crystal-run-procodile.tmp
[0x0] ???


I know why this issue happens, and the fix is quite easy: when I add a nil at the end of the status method (src/procodile/commands/status_command.cr:31), or add a : Nil return type declaration to force the status method to always return nil, the code works.
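
In isolation, the two fixes described above might look like the sketch below. This is not the project's code: Reply and fetch_reply are hypothetical stand-ins for the real reply type and the control client call, and the method names are made up for illustration.

```crystal
# Hypothetical stand-in for the reply type used by the real status command.
record Reply, text : String

# Hypothetical helper that returns a nilable value.
def fetch_reply : Reply?
  Reply.new("running")
end

# Variant 1: end the method with an explicit nil.
def status_with_trailing_nil
  reply = fetch_reply
  puts reply.text if reply
  nil
end

# Variant 2: declare `: Nil`; Crystal then discards the body's last value.
def status_with_nil_return_type : Nil
  fetch_reply
end

# Both method pointers are typed Proc(Nil), so no conversion is needed later.
puts typeof(->status_with_trailing_nil)
puts typeof(->status_with_nil_return_type)
```

Either way the method pointer already has the type the dispatch code expects, so the compiler never has to convert a proc with a different return type.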

This is because the status method is converted to a Proc and used in src/procodile/cli.cr#L60 like this:

  callable = ->status
  callable.as(Proc(Nil)).call  # Cast to a Proc(Nil) here.

But the issue is that it should raise a cast error like the following instead of an invalid memory access:

Unhandled exception: cast from Proc((ControlClientReplyForStatusCommand | Nil)) to Proc(Nil) failed, at /home/zw963/Crystal/git/procodile.cr/1.cr:85:1:85 (TypeCastError)

So, assuming I am not using any unsafe or low-level primitives, when does an invalid memory access error typically happen? What tools should I use to debug it?


Steps to reproduce:

  1. Clone zw963/procodile_cr from GitHub (🐊 Run processes in the background (and foreground) on Mac & Linux from a Procfile (for production and/or development environments)).
  2. Create a Procfile like this:

test1: sleep 1000000
test2: sleep 1000000

  3. Start the daemon in one terminal:

crystal run -d src/procodile.cr -- start --clean --foreground

  4. Open a new terminal and check the daemon status:

crystal run -d src/procodile.cr -- status

  5. Comment out the nil mentioned above in src/procodile/commands/status_command.cr:31, then repeat step 4.

Thanks

This looks like a compiler bug. Apparently the proc is not properly cast to Proc(Nil).

The type of command.callable is declared Proc(Nil), so it should not be able to hold any other proc type.

In fact, I created a snippet to try to reproduce it, but had no luck.

https://play.crystal-lang.org/#/r/e9zp

With that snippet I get the expected cast error:

Unhandled exception: cast from Proc((ControlClientReplyForStatusCommand | Nil)) to Proc(Nil) failed, at /home/zw963/Crystal/git/1.cr:91:3:91 (TypeCastError)
  from 1.cr:91:3 in '__crystal_main'
  from /home/zw963/Crystal/share/crystal/src/crystal/main.cr:115:5 in 'main_user_code'
  from /home/zw963/Crystal/share/crystal/src/crystal/main.cr:101:7 in 'main'
  from /home/zw963/Crystal/share/crystal/src/crystal/main.cr:127:3 in 'main'
  from /usr/lib/libc.so.6 in '??'
  from /usr/lib/libc.so.6 in '__libc_start_main'
  from ../sysdeps/x86_64/start.S:117 in '_start'
  from ???

So there may be some other issue that causes the invalid memory access. I want to track it down, but I don't know how.

Casting to Proc(Nil) should be disallowed and produce a compile error. It’s a breaking change but it’s the only way going forward.

In the original code, this explicit cast isn’t necessary. If you remove it, the code behaves the same way.

diff
--- i/src/procodile/cli.cr
+++ w/src/procodile/cli.cr
@@ -57,7 +57,7 @@ module Procodile

     def dispatch(command)
       if self.class.commands.has_key?(command)
-        self.class.commands[command].callable.as(Proc(Nil)).call
+        self.class.commands[command].callable.call
       else
         raise Error.new("Invalid command '#{command}'")
       end

It appears to be happening somewhere implicitly. The type of CliCommand#@callable is Proc(Nil), but somehow the compiler allows assigning a proc that apparently isn’t Proc(Nil).

In the original code, this explicit cast isn’t necessary. If you remove it, the code behaves the same way.

Yes, I found the same thing just now. I probably added the .as(Proc(Nil)) because the compiler prompted me to at some intermediate step while porting the Ruby code to Crystal. The original Ruby version uses public_send in the dispatch method; I changed it to use Proc#call for the same purpose.


I found it. I create each new command with CliCommand.new, like this:

{% begin %}
  # .....
  def initialize
    @options = Procodile::CliOptions.new
    @config = uninitialized Procodile::Config

    {% for e in COMMANDS %}
      {% name = e[0] %}
      {% description = e[1] %}

      self.class.commands[{{ name.id.stringify }}] = CliCommand.new(
        name: {{ name.id.stringify }},
        description: {{ description.id.stringify }},
        options: @@options[{{ name }}],
        callable: ->{{ name.id }}
      )
    {% end %}
  end
{% end %}

In fact, CliCommand is a struct defined with the record macro in src/procodile/procfile_option.cr#L64-L68, which enforces that callable must be a Proc(Nil).

record CliCommand,
    name : String,
    description : String?,
    options : Proc(OptionParser, Procodile::CLI, Nil)?,
    callable : Proc(Nil)

Then, as you said, somehow the compiler allows assigning a proc that apparently isn’t Proc(Nil).

I think that in this case the compiler should give the user an exception instead of an invalid memory access.


Yes, the compiler either shouldn’t allow assigning ->status to an ivar with type Proc(Nil) or make sure that it’s properly lowered to that type.
The latter is supposed to be happening, but apparently it doesn’t always work correctly.
I could not reproduce the issue in isolation.

But in the interpreter, the following program shows an effect of the same problem:

record Command, callable : Proc(Nil)

def status
  1.as(Int32?)
end

callable = Command.new(->status).callable
callable.call # Error: BUG: data left on stack (16 bytes): Bytes[11, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

The error message indicates that the proc actually left a return value on the stack. It’s the type id for Int32 and the value 1. This means it wasn’t properly lowered to Proc(Nil).

This bug is already tracked in crystal-lang/crystal issue #10911 on GitHub: "Automatic casting of non-Nil proc return type to Nil is wrong for struct types".
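
Until that is fixed, one possible way to avoid relying on the automatic conversion is to wrap the call in a proc literal whose body ends in nil. The sketch below reuses the Command record and status method from the interpreter example; the wrapped variable is just an illustrative name, and I have not verified this against every affected compiler version.

```crystal
record Command, callable : Proc(Nil)

def status
  1.as(Int32?)
end

# The literal's body ends in nil, so the proc is already a Proc(Nil)
# before it is stored in the record; no conversion has to happen.
wrapped = ->{ status; nil }
command = Command.new(wrapped)
command.callable.call

puts typeof(wrapped)
```

The same idea should carry over to the macro-generated CliCommand.new calls, by passing a proc literal instead of a bare method pointer.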