Have I hit compiler bugs?

Hi all,

crystal version
Crystal 1.17.0 (2025-07-16)

LLVM: 20.1.8
Default target: aarch64-apple-darwin24.1.0

I’m seeing some weird behavior and wanted to check whether these are known issues or whether I’ve run into one or more new compiler bugs.

I’ve got a fairly large codebase and I’m seeing two issues:

  1. The compiler sometimes freezes for minutes during shards build (the longest I’ve seen while trying to understand this was ~9m). Aborting and rerunning completes the compilation in a reasonable time (~2m). And no, it’s not downloading deps, as I see the “Dependencies are satisfied” message.
  2. This one I’m not sure how to explain without sounding crazy. I want to believe my code is fairly straightforward for the most part: the code where I’m seeing this issue is a webapp backend doing routine backend stuff (fetching from the db, sending an HTTP response, and such). The issue is that the compiler seems to happily chug along even when it sees undefined methods. What’s even more puzzling is that it complains with one set of args but not another. So if I have two consecutive lines
this_method_does_not_exist(123) # compile error, as expected
this_method_does_not_exist("user_id=$1", 123) # no compile error on this line, even with the line above commented out

it only complains about the first, not the second! See below for the compiler error on the first call, just to confirm.

[Screenshot 2025-07-31 at 11.21.28 PM: compiler error output]

I’m unable to share the full code at the moment but will see if I can reproduce it with a smaller example. Meanwhile, I was hoping someone has seen this before.

Thanks!

Does it then error on the second line if you comment out the first? If so, I’d say this part is somewhat expected because the compiler doesn’t collect all errors; it just stops and reports the first one it comes across.

@Blacksmoke16 No, it does not error on the second line if I comment out the first. I understand the compiler shows just one error at a time.

Is it possible the second one works because there is an overload that matches String, Int32 but not only Int32?

No. Only two occurrences in the codebase as you can see below. I’ve tried replacing the word this_method_does_not_exist with gibberish to double check but still the same issue.

Been trying to recreate this issue with minimal code and here’s what I have so far:

# EXPECTED -- fails with undefined method 'bar' for top-level
def foo(user : NamedTuple?)
  bar(123)
end

foo(nil)

# UNEXPECTED -- compiles even though bar is not defined
def foo(user : NamedTuple?)
  bar(user.not_nil![:id]?)
end

foo(nil)

# UNEXPECTED -- compiles even though bar is not defined
def foo(user : Tuple?)
  bar(user.not_nil![0]?)
end

foo(nil)

# UNEXPECTED -- compiles even though neither bar nor quux is defined
def foo(user : Hash(String,String)?)
  bar(user.not_nil!.quux)
end

foo(nil)

# EXPECTED -- fails with undefined method 'bar' for top-level
def foo(user : NamedTuple?)
  bar(user.not_nil!)
end

foo([true, false].sample ? {id: 1} : nil)

I was expecting all the UNEXPECTED cases above to fail compilation because of the missing method, irrespective of whether user happens to be nil or not. Is my understanding incorrect?

Also, I’m not a compiler developer so I may be completely wrong here, but it looks to me like .not_nil! calls on values known to be nil are somehow allowing compilation to continue despite missing methods?

I’m about to do one of the most disliked things in today’s world.
Instead of thinking for myself, I will copy and paste an AI’s answer into a forum.
But sadly, it often helps me understand something.

In the age of AI, what truly matters is not how quickly we can give the right answer, but the human mind that holds doubt and curiosity. The value lies in the question itself…

Thanks for sharing this! Looks to be confirming my understanding about Crystal optimizing away unreachable code without alerting. Perhaps there’s a compile flag that would catch everything?

And while this optimization is definitely helpful to reduce compile times, it now leaves me (hopefully I’m not alone) wondering about the correctness of my code even though it has compiled. Trying to understand where the compiler decided to ignore chunks of code in a large codebase is tiring and not easy in my opinion.

I was refactoring a large class and when it compiled without any errors at the first attempt I was surprised I got everything right but knew that was near impossible! Now I’m having to comment out parts of code to understand the source of the issue.

I don’t know how you are using Crystal compiler 1.17.0, but all of the examples above failed to compile when I tested with 1.17.1.

And if your screenshot is correct, the two calls to this_method_does_not_exist are made at the top level, where this issue should never happen. Neither your sample code nor the AI answer seems to match your case.

So my answer is: maybe something is broken in your compiler?

@zw963 Thank you, I wasn’t aware 1.17.1 was out, upgraded and tried but still see the same issue as earlier, all UNEXPECTEDs are still compiling fine.

Crystal 1.17.1 (2025-07-22)

LLVM: 20.1.8
Default target: aarch64-apple-darwin24.1.0

I’m on MacOS 15.1 (24B83) if it helps.

It def does reproduce on 1.17, and even on master. The real question is what we can/should do to mitigate this because, while a bit unexpected, it is working as designed as far as I can tell, mainly because typeof(user) # => Nil. I.e., the compiler knows not_nil! will raise, rather than that it only might raise.
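
A minimal sketch of that typeof point, assuming a local variable named user stands in for the method argument (the variable is illustrative, not taken from the codebase in question):

```crystal
# `user` is an illustrative local; it plays the role of the nil argument.
user = nil

# typeof is resolved at compile time and never evaluates its expression,
# so not_nil! is not actually called here.
puts typeof(user)          # Nil
puts typeof(user.not_nil!) # NoReturn
```

Since the argument expression has type NoReturn, the compiler appears to treat the call that would receive it as unreachable, which would explain why bar is never looked up.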

IMO this is why it’s always good to have test coverage on stuff as that can trigger the different code paths to ensure everything is working as you expect.
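
One way to exercise the non-nil path without a full spec suite, sketched here as an assumption rather than an established recipe, is to wrap a representative call in typeof. The foo and bar definitions below are hypothetical stand-ins for the examples earlier in the thread; bar is defined so that the snippet compiles.

```crystal
# Hypothetical stand-ins for the thread's foo and bar.
def bar(id)
  id
end

def foo(user : NamedTuple?)
  bar(user.not_nil![:id]?)
end

# typeof never runs foo, but computing the return type requires the
# compiler to type the body of foo for a NamedTuple(id: Int32) argument.
typeof(foo({id: 123}))

# The nil path still compiles and raises at runtime.
begin
  foo(nil)
rescue NilAssertionError
  puts "nil path raised NilAssertionError"
end
```

If the bar definition is removed, the typeof line should fail to compile with an undefined method error, because foo has to be instantiated for a NamedTuple in order to compute its return type.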

@Blacksmoke16 Thanks for checking/confirming. While I agree test coverage could help catch this and maybe more (assuming my test coverage is comprehensive, which requires yet more code and time), in my humble opinion relying on coverage defeats the purpose of having an ahead-of-time compiler to catch such fundamental issues as compiling successfully despite missing code, all because a nil could happen somewhere.

Is this case (where the compiler “knows” that .not_nil! will unconditionally raise if the function is executed) something that the compiler could (feasibly) recognize at compile-time and issue a warning for? Presumably implementing such a feature wouldn’t be the highest priority, but I do think it would be helpful and might not be infeasible or impractical to implement.

@RespiteSage I appreciate the possibility of adding compiler checks for straightforward cases to prevent this, but I must emphasize that my simplified example was just to drive the point home. In the code I’m refactoring, I’ve got nil checks everywhere, but this issue has still somehow crept in!

If you passed a non-nil value, it would indeed fail to compile.

def foo(user : NamedTuple?)
  bar(user.not_nil![:id]?)
end

foo(nil)
foo({id: 123})

It’s surprising the first time you come across it (also every time you forget about this quirk) and, therefore, not ideal, but when you understand how methods with union-typed arguments are compiled, it makes a lot more sense. Specifically, if you call the method with a subset of the types allowed by the union, Crystal will only compile the method for that subset of types. If you only ever pass nil to foo, the compiler ignores the fact that it can be called with a NamedTuple entirely. This brings an improvement in compilation speed.
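
A rough illustration of that per-call instantiation, using a hypothetical arg_type method that simply reports the compile-time type of its argument:

```crystal
# Hypothetical helper: the restriction allows NamedTuple or Nil, but each
# instantiation only sees the type actually passed at the call site.
def arg_type(user : NamedTuple?)
  typeof(user)
end

puts arg_type(nil)       # Nil
puts arg_type({id: 123}) # NamedTuple(id: Int32)
```

With only the first call present, the NamedTuple instantiation never exists, so nothing in the method body is checked against it.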

The second thing is that, as mentioned above, unreachable code is not compiled. The same thing happens if you have an early unconditional return. This code compiles despite calling a method that doesn’t exist and does not raise an exception.

def foo(user : NamedTuple?)
  return 42

  this_method_does_not_exist!
end

foo(nil)
foo({id: 123})

Your example code is a confluence of both of these quirks. And while I recognize the confusion there, I honestly think those choices make the right tradeoffs. Compilation speed is one of the biggest complaints about Crystal; compiling unreachable code and methods for argument types that are never used could make it significantly worse.

If someone could figure out ways to do it without sacrificing compilation speed, that would be great, but I’m not sure how that would be accomplished.

@jgaskins Thank you – from what I gather so far it looks like Crystal has chosen to aggressively optimize away unreachable code to reduce compile times without giving the developer any hint on the what or the why. This shifts the onus of correctness checking from the compiler back to the developer, which is what a dynamic language does and which is what most developers want to escape from when they reach out for a static language. On compilation speed – I, like the others, would love Crystal to compile at blazing speeds but if that means we will have to forego all the various ways the compiler can save us from ourselves through compiler warnings, I’d rather choose slow, which makes me wonder if this tradeoff can be revisited?

So what you mean is that when foo(nil) is invoked, the body of method foo is unreachable, so there’s no compile-time error?

Maybe there’s a way to make tools like crystal tool unreachable better.

I had DeepWiki write a plan for crystal tool undefined and had Claude implement it. Trying to understand what the AI has output.

That’s one interpretation. Not the most compassionate one, but you can certainly choose to see it that way.

The compile time savings may not have been intentional. I don’t have enough insight into the design of the compiler to assign intent.