I ran through my small handful of tests. I’m still having problems with self (I couldn’t tell if that was fixed); however, arguments passed to functions now work for me with lldb (command line, no GUI).
Ok. I fixed a handful of debug issues in the last commit, so I am looking for volunteers again.
self should now show its values correctly, and I believe all local variables do too (including those inside blocks).
You can find my branch here:
https://github.com/skuznetsov/crystal/tree/debug
PS: A shameless plug: I just created a new repo with my JavaScript code for Smart Minesweepers, based on an evolutionary algorithm (ported and improved from the ai-junkie.com site). If anyone is interested, feel free to look at my https://github.com/skuznetsov/smart_minesweepers repo.
It is just one HTML file with all the code inside, so it can be opened locally in a browser.
self is working.
(lldb) frame variable self[0]
(SDB::DebugTest) self[0] = {
ihash_sn = 0x00000001006c4680
ihash_nn = 0x00000001006c4640
inumber = 2
ifloat = 2
}
For some reason the lldb script import is failing. I noticed the formatter changed in the pull request.
(lldb) command script import /Users/lribeiro/Work/crystal/etc/lldb/crystal_formatters.py
error: module importing failed: loading unimplemented.
EDIT1: I looked at the Python change and it looks inconsequential. I tested the old version just in case and it does not work either, so it looks like something broke in my local environment. Checking.
For some reason the Python interpreter does not start in the vanilla Homebrew lldb 9. The native lldb bundled with Xcode works with my Python formatter, though.
Thanks, I tested again with /Library/Developer/CommandLineTools/usr/bin/lldb and it looks good.
Local variables and instance variables are working well.
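The difference between the two lldb builds can be checked without loading the formatter at all. Here is a minimal sketch, assuming the lldb binary accepts --batch and -o and that its script command runs the embedded Python interpreter. The helper names and the marker string are hypothetical, made up for this example:

```python
import subprocess

# Hypothetical marker. It is assembled inside lldb's Python so that the
# echoed command line itself never contains the full string.
MARKER = "lldb-python-ok"
PROBE = "script print('lldb' + '-python-' + 'ok')"


def lldb_python_check_command(lldb_path):
    """Build an lldb invocation that asks the embedded Python to print MARKER."""
    return [lldb_path, "--batch", "-o", PROBE]


def lldb_has_python(lldb_path, timeout=30):
    """Return True if the given lldb binary can run a one-line Python script."""
    try:
        result = subprocess.run(
            lldb_python_check_command(lldb_path),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing, not executable, or hung while starting up.
        return False
    return MARKER in result.stdout
```

Calling lldb_has_python("/Library/Developer/CommandLineTools/usr/bin/lldb") and then the same function on the Homebrew binary should show which of the two can load Python formatters at all.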
@lribeiro Excellent!
I do have a question though: does gdb work for you on macOS?
It does not work for me, as it hangs when I run the app (it is codesigned, but when I run the app under gdb it does not show me the root password box and just hangs, so I have to kill the gdb session).
Ok. I tested my debug changes on my AWS Ubuntu instance with lldb and gdb and they work fine as well.
gdb on Mac doesn’t work. In user mode it complains about codesigning (even though it’s signed, and with csrutil disable); in sudo mode it gets stuck in initialisation after the run command.
So that means it is not just a local issue on my side, as I had correctly presumed. I even reinstalled the operating system to be sure that it is not a configuration issue.
I need more volunteers on different Linux flavours to test debugging with lldb and gdb and post feedback here.
I can test on Debian, but I have used gdb only twice in my life, and neither time was recent.
Still, if you can supply some kind of manual or script to follow, I can certainly do that.
I would need to know how to test, and some examples of what would be useful for you to have tested.
Thanks for working on this!
OK, I tested on Linux with gdb. Without --debug it works fine; with --debug I get this, even outside of gdb, FWIW…
$ ./bin.kemal_server --ssl --ssl-key-file _key.pem --ssl-cert-file cert.pem -p 3000
[development] Kemal is ready to lead at https://0.0.0.0:3000
Invalid memory access (signal 11) at address 0x7fcc52a6bb48
[0x559941e78f06] *CallStack::print_backtrace:Int32 +118
[0x559941dd9ce9] __crystal_sigfault_handler +361
[0x5599421c13b1] sigfault_handler +40
[0x7fcc52e03f40] ???
[0x7fcc52a6bb48] ???
within gdb is similar:
(gdb) r --ssl --ssl-key-file _key.pem --ssl-cert-file cert.pem -p 3000
Starting program: /home/joshua/dev/sensible-cinema/html5_javascript/kemal_server/bin.kemal_server --ssl --ssl-key-file _key.pem --ssl-cert-file cert.pem -p 3000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff76c4700 (LWP 22654)]
[New Thread 0x7ffff6ec3700 (LWP 22655)]
[New Thread 0x7ffff66c2700 (LWP 22656)]
[development] Kemal is ready to lead at https://0.0.0.0:3000
Thread 1 "bin.kemal_serve" received signal SIGSEGV, Segmentation fault.
0x00007ffff76ceb48 in ?? ()
(gdb) bt
#0 0x00007ffff76ceb48 in ?? ()
#1 0x00007ffff76cebe8 in ?? ()
#2 0x00007ffff76ceb48 in ?? ()
#3 0x00007ffff76cebe8 in ?? ()
#4 0x00007ffff76c8a40 in ?? ()
#5 0x00007ffff76c8a40 in ?? ()
#6 0x00007ffff76ceb40 in ?? ()
#7 0x00007ffff76cebe0 in ?? ()
#8 0x00007ffff76ceb40 in ?? ()
#9 0x00007ffff76cebe0 in ?? ()
#10 0x00005555557b56c5 in resume (fiber=0x7ffff76ceb40)
at /home/joshua/dev/downloads/crystal-debug/src/crystal/scheduler.cr:48
#11 0x00005555557b3d78 in resume (self=0x7ffff76ceb40) at /home/joshua/dev/downloads/crystal-debug/src/fiber.cr:197
#12 0x00005555557b5668 in reschedule (self=0x7ffff76c8a40)
at /home/joshua/dev/downloads/crystal-debug/src/crystal/scheduler.cr:147
#13 0x00005555557b550a in sleep (self=0x7ffff76c8a40, time=...)
at /home/joshua/dev/downloads/crystal-debug/src/crystal/scheduler.cr:162
#14 0x00005555557b5d75 in sleep (time=...) at /home/joshua/dev/downloads/crystal-debug/src/crystal/scheduler.cr:52
#15 0x00005555556aa8c2 in sleep (seconds=1) at /home/joshua/dev/downloads/crystal-debug/src/concurrent.cr:15
#16 0x00005555556ad0d8 in -> () at /home/joshua/dev/sensible-cinema/html5_javascript/kemal_server/kemal_server.cr:24
#17 0x00005555557b421d in run (self=0x7ffff76cebe0) at /home/joshua/dev/downloads/crystal-debug/src/primitives.cr:255
#18 0x000055555568f52d in -> (f=0x7ffff76cebe0) at /home/joshua/dev/downloads/crystal-debug/src/fiber.cr:92
#19 0x0000000000000000 in ?? ()
and lldb (linux):
(lldb) r --ssl --ssl-key-file _key.pem --ssl-cert-file cert.pem -p 3000
Process 22628 launched: '/home/joshua/kemal_server/bin.kemal_server' (x86_64)
[development] Kemal is ready to lead at https://0.0.0.0:3000
Process 22628 stopped
* thread #1, name = 'bin.kemal_serve', stop reason = signal SIGSEGV: address access protected (fault address: 0x7ffff76ceb48)
frame #0: 0x00007ffff76ceb48
-> 0x7ffff76ceb48: movabsb 0x100007ffff3e819, %al
0x7ffff76ceb51: addb %al, (%rax)
0x7ffff76ceb53: addb %al, (%rax)
0x7ffff76ceb55: addb %al, (%rax)
With normal Crystal git master and --debug it doesn’t segfault, FWIW… Thanks.
Code available upon request.
Interesting.
I am not getting it with my test suite on Linux, and it is clean on my macOS.
Can you send me the test example you used (if it is nothing secret) in a direct message, so I can test it and identify the issue?
Actually, I was able to reproduce it with my code. It does not happen right away, though, but after a few seconds in wait mode, while it is waiting to accept an HTTP connection.
Will investigate.
Ok. After step by step execution I was able to identify where it failed:
resumable? (self=0x7ffff47c9a40) at /home/ubuntu/Projects/crystal/src/fiber.cr:172
172 def resumable?
(gdb)
resume (self=0x7ffff7eeabc0, fiber=0x7ffff7eeff00) at /home/ubuntu/Projects/crystal/src/crystal/scheduler.cr:92
92 {% if flag?(:preview_mt) %}
(gdb)
set_stackbottom (stack_bottom=0x7ffff7eeff00) at /home/ubuntu/Projects/crystal/src/gc/boehm.cr:245
245 {% if flag?(:preview_mt) %}
(gdb)
resume (self=0x7ffff7eeabc0, fiber=0x7ffff7eeff00) at /home/ubuntu/Projects/crystal/src/crystal/scheduler.cr:99
99 current, @current = @current, fiber
(gdb)
100 Fiber.swapcontext(pointerof(current.@context), pointerof(fiber.@context))
(gdb)
swapcontext (current_context=0x7ffff7eeff00, new_context=0x7ffff7eeff00) at /home/ubuntu/Projects/crystal/src/fiber/context/x86_64-sysv.cr:18
18 def self.swapcontext(current_context, new_context) : Nil
(gdb)
Thread 1 "crweb" received signal SIGSEGV, Segmentation fault.
0x00007ffff7eefdc8 in ?? ()
Will try to figure out why it fails there on Linux but works on macOS.
One interesting fact I see is that the old and new contexts are the same. Could that be the cause?
Should we switch context at all if it is the same context? Could we save cycles here?
I think I know why.
The Fiber.swapcontext method is @[Naked] and @[NoInline].
Its body is a single asm fragment that expects the incoming parameters in specific registers.
I should skip injecting any debug info into @[Naked] methods.
Ok. Checking for @[Naked] functions and not generating debug info for them did the trick.
After I figured it out, I saw that a commit Eldar Yusupov made 10 months ago added the same check to the code.
Can you retest now?
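The check itself is small. Here is a toy model of it in Python, not the compiler's actual code: FunctionInfo and should_emit_debug_info are hypothetical names for this sketch. The comment about why extra code around the asm body is harmful is my reading of the crash above, not something confirmed in the compiler source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionInfo:
    """Toy stand-in for a function being compiled; not the compiler's real type."""
    name: str
    annotations: frozenset = frozenset()


def should_emit_debug_info(func, debug_enabled):
    """Decide whether debug info is generated for func."""
    if not debug_enabled:
        return False
    # Assumption: a naked function has no prologue or epilogue, and its asm
    # body reads its arguments from the registers they arrived in, so
    # anything added around that body for the debugger can disturb it.
    return "Naked" not in func.annotations


# Fiber.swapcontext carries both annotations mentioned in this thread.
swapcontext = FunctionInfo("Fiber.swapcontext", frozenset({"Naked", "NoInline"}))
resume = FunctionInfo("resume")
```

With --debug on, should_emit_debug_info(swapcontext, True) is False while should_emit_debug_info(resume, True) is True, so ordinary methods keep their debug info and only naked ones are left alone.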
My apologies for not responding earlier.
Writing the manual or script will take me some time. I am juggling plenty of tasks with different priorities at the same time, so hopefully I will be able to find some time to write it.
Working much better now.
It even gives me “better” backtraces, here’s the normal one:
(gdb) bt
#0 -> (env=0x151) at /home/joshua/dev/sensible-cinema/html5_javascript/kemal_server/kemal_server.cr:623
#1 0x00005555556b20c0 in -> (context=0x151)
at /home/joshua/dev/sensible-cinema/html5_javascript/kemal_server/lib/kemal/src/kemal/route.cr:255
#2 0x00005555558c9144 in process_request (self=0x1ab, context=0x151)
at /home/joshua/dev/sensible-cinema/html5_javascript/kemal_server/lib/kemal/src/kemal/route_handler.cr:255
#3 0x00005555558c902d in call (self=0x1ab, context=0x151)
at /home/joshua/dev/sensible-cinema/html5_javascript/kemal_server/lib/kemal/src/kemal/route_handler.cr:17
#4 0x00005555558cf70c in call_next (self=0x1ad, context=0x151) at /usr/share/crystal/src/http/server/handler.cr:26
#5 0x00005555558cf071 in call (self=0x1ad, context=0x151)
with the modified compiler:
#0 -> (env=0x7ffff28d94c0) at /home/joshua/dev/downloads/crystal-debug/src/indexable.cr:267
#1 0x00005555556b3e34 in -> (context=0x7ffff28d94c0) at /home/joshua/dev/downloads/crystal-debug/src/primitives.cr:255
#2 0x00005555558f32df in process_request (self=0x7ffff4e88bc0, context=0x7ffff28d94c0)
at /home/joshua/dev/downloads/crystal-debug/src/primitives.cr:255
#3 0x00005555558f31ac in call (self=0x7ffff4e88bc0, context=0x7ffff28d94c0)
at /home/joshua/dev/sensible-cinema/html5_javascript/kemal_server/lib/kemal/src/kemal/route_handler.cr:17
though the process_request method is not located in primitives.cr, so I’m not sure what’s at fault there. lldb shows the same behavior.
I can see local variables in lldb and gdb, nice. I wasn’t sure how to actually display strings in either of them, or array values (or how to dereference some other variable types, like Arrays of NamedTuples).
Stepping through with the “next” command gets wonky sometimes; it goes to weird places like
/home/joshua/dev/downloads/crystal-debug/src/indexable.cr:188
instead of stepping to the next line.
I can kind of work around it by setting another breakpoint on the next line of the file.
Then again, the normal compiler also does wonky things when stepping through, so this is possibly unrelated.
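For anyone who wants to script that workaround, here is a minimal sketch that builds the gdb commands for it. The function name is hypothetical. It assumes the line after the current one contains code and is actually reached, which is not true at a return or a branch:

```python
def step_over_commands(source_file, current_line):
    """Return gdb commands that run to the line after current_line."""
    target = f"{source_file}:{current_line + 1}"
    # tbreak sets a one-shot breakpoint; continue runs until it is hit.
    return [f"tbreak {target}", "continue"]
```

For example, step_over_commands("kemal_server.cr", 623) gives ["tbreak kemal_server.cr:624", "continue"], which can be typed at the (gdb) prompt in place of “next”.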
Keep up the good work!