The Crystal Programming Language Forum

Compiler issue with non-existent NIL conditions

@jzakiya in any case this is very easy to figure out:

  • your method last expression is a each
  • each returns nil
  • the compiler is telling you something is nil
  • if you compile with --error-trace the compiler points to each

Maybe put a bit more effort into understanding what the compiler is telling you?

There is absolutely nothing wrong with the semantics of the code.
The code is correct and runs. The problem isn’t the code it’s the compiler enforcing an unnecessary condition, not on the method, but on its use.

Yes, I want the code to fail at runtime! (This would never happen for this code’s use.)
This would tell me there isn’t enough date in the array, which would be obvious to deal with.
The way it is now you are making a hypothetical determination on the runtime use of the method, which is totally inconsistent and subjective.

The compiler should only fail for definitive compile time structural issues, and leave runtime issues for the user to deal with (though the compiler may issue warning notices of potential problems).

I’m assuming (and I appreciate any correction if this is wrong) that you’re coming from Ruby (or a similar language) and interested in Crystal primarily for speed. That’s a valid path to come to this language, but I personally came to Crystal because of its combination of readability/ease-of-use and safety. Compile-time safety is, I think, one of the central components of the ethos of Crystal development. If you’re going to be using Crystal code in some real, potentially long-running application, then it doesn’t make sense to allow a potential runtime error that might be a wild corner case when you could easily catch it at compile-time. Aside from that point, it seems like it would be a gargantuan amount of work (and destructive of most of the core parts of the language) to just make it a runtime error. The type system is integral to Crystal, and this problem (as you think of it) that you’re running into is really just a result of the type system.

Now, if @HankAnderson’s solution ends up making sense and becoming part of the language, the difficulty of this particular will go away for you. However, at the end of the day, the compiler can only do so much type inference. The compiler doesn’t know that it’s inevitable that nums.size == count will eventually be true (and is the only way to break out of this loop). Even if you have a simple conditional, the compiler still plays it safe (which I think is fair):

a = 3

if a == 5
  b = 0
else
  b = nil
end

p! typeof(b) # => (Int32 | Nil)

Anyway, I agree with @Blacksmoke16 that a better way to do this would be to find a way to use an Enumerable method or something to return the nums array directly, instead of requiring the explicit return and the nums below. That may not be feasible, though. An explicit nums really isn’t that bad; if you’re writing any code that you intend to put out in the world, eventually someone will probably have to read it (this is especially the case with Rosetta Code, which is meant to be read for comprehension), and it actually helps to be more explicit.

Can we keep the focus on the issues and not the messenger.

There is nothing semantically wrong with the method’s code. It only “failed” to compile when I added an additional output statement using last(). But when I include in the method the never reached line at the end nums, all the output statements run with no compiler errors.

  1. A fundamental question is then, why does nums line allow compiling for all outputs?
    I assume this isn’t the mechanics for writing code, so then it’s a bug, and it needs to be fixed.

  2. Why doesn’t the compiler flag the nums line as an error/warning?
    The compiler knows the code aborts inside the loop, so why does adding this line that is never reached not only makes it compile, but doesn’t throw an error/warning?

  3. Why does the compiler conclude last() at runtime could be NIL?
    Each output statement is given a constant value at compile time to determine the size of the nums array. So why then doesn’t the compiler know that for each output statement there will always be enough data in the array to satisfy the last() condition?

Look, it’s good that these issues are identified now before 1.0.
You should be trying to eliminate as much technical debt as possible now, and not carry it forward, which will make it harder (but not impossible) to fix/remove.

I would hope people would be curious enough to want to understand the answers to these questions, and try to fix them to make the language more consistent|correct and thus easier and more fun for users to code with.

struct PalindromicGapfuls
  include Enumerable(UInt64)

  @gap : Int32

  def initialize(@digit : Int32)
    @gap = @digit * 11
  end

  def each
    (2..).each do |power|
      base = 10_u64 ** (power >> 1)
      lower = base * @digit
      upper = base * (@digit + 1)
      lower.step(to: upper - 1, by: 10) do |candidate|
        if power.odd?
          candidate.upto(candidate + 9) do |palindrome_half|
            palindrome_half = palindrome_half.to_s
            palindrome = (palindrome_half + palindrome_half.reverse).to_u64
            yield palindrome if palindrome % @gap == 0
          end
        else
          palindrome_half = (candidate // 10).to_s
          palindrome_base = (palindrome_half + "0" + palindrome_half.reverse).to_u64

          9.times do |i|
            palindrome = palindrome_base + (base * i)
            yield palindrome if palindrome % @gap == 0
          end
        end
      end
    end
  end
end

puts "First 20 palindromic gapful numbers ending with:"
1.upto(9) do |digit| 
  puts "#{digit} : #{PalindromicGapfuls.new(digit).first(20)}"
end
        
puts        
puts "Last 15 of first 100 palindromic gapful numbers ending in:"
1.upto(9) do |digit| 
 puts "#{digit} : #{PalindromicGapfuls.new(digit).first(100).last(15)}"
end

This is why I never enjoyed Rosetta Code much. It seems so much more focused on puzzle solving than showing off the actual language.

This is fair.

I’ve tried to simplify your code a little bit to show the structure without actually doing the calculations:

def f(size)
  # this is what you want to output
  nums = [] of Int32
  
  # This loop returns nil (i.e. nothing)
  (2..).each do |n|
    n1   = n * 100
    n2   = n * 200
    
    # This loop returns nil
    n1.step(to: n2 - 1, by: 10) do |outer|
      # This loop returns nil
      (outer..outer + 9).each do |inner|
        num = outer + inner
        nums << num if num % 7  == 0
        
        # We know as programmers that this is always the exit point of this function,
        # but the compiler sees "if" and allows for the case when it isn't
        return nums if nums.size == size
      end
    end
  end
  
  # without the following line, the return type of the function is the return type of
  # the loop above (that is, nil) OR the return type of the "return" line above
  nums
end

puts "#{f(5)}" # => [406, 420, 427, 441, 448]

puts "#{f(5).last(3)}" # => [427, 441, 448]

This has been explained above, but the reason is that up to this point, the compiler doesn’t know that the function necessarily returns in the loops. Until it gets to the nums line, the last statement of the function, which is the implicit return type, returns only nil. However, it also sees that there is a return inside of a conditional, which means that there is a possibility (again, the compiler doesn’t know that it’s certain) of returning Array(UInt64) (or Array(Int32) in my reduction). Therefore, without the nums line, the return type of the function is (Array(UInt64) | Nil). As I mentioned above, though, the last statement of a function is considered the return type (unless there is a return at the end of all conditional branches before it, in which case it’s truly unreachable code according to the compiler), so nums works the same at the end as return nums.

As I wrote above, the bolded section is false. The compiler does not know that the code aborts inside the loop, so it’s not a bug. While we as programmers can see that nums will never be reached at runtime, the compiler does not see that at compile-time because 1) the return in the function body is inside of a conditional, and 2) that conditional is within 3 loops, which the compiler doesn’t necessarily know will always execute at least once. From the compiler’s point-of-view, there are a number of execution paths (the outermost #each never executes, the middle #step never executes, the innermost #each never executes, and the conditional around return is never true) in which that return is unreachable, so it won’t disregard what it sees as viable execution paths (where return is never reached). So there is no warning/error for unreachable code because the compiler sees that line as reachable.

I think I’ve already addressed this, because it’s the same as the reason that, without the nums line, the return type of your function will be (Array(UInt64) | Nil). There’s actually no compilation-time checks about whether the array has enough data for Array#last (because how would you even do that?); if there wasn’t enough data, it would raise an exception at runtime. The error isn’t related to whether the array is long enough but rather whether the receiver of #last is actually an array.

A quick note: if you give your function (when it doesn’t have the nums line) an explicit output type (def palindromegen(digit, count) : Array(UInt64)), you’ll get a more useful compiler error:

Error: method must return Array(UInt64) but it is returning (Array(UInt64) | Nil)

Finally, I hope you find these answers useful. It seems to me both that you’re misunderstanding something about the language and that you’re a very skilled programmer, so I tried to find a good middle-ground in explaining things thoroughly without explaining things you’d already know.

1 Like

@jzakiya Many replies have been given to you.

As a summary: yes, it’s a bit unexpected that you have to add something after (2..).each { ... } to tell the compiler that you don’t expect that loop to end. This is a bug in the standard library, or a feature request to improve the situation.

So maybe open a bug or a feature request specifically for that? There’s no need to continue discussing anything here.

There is no bug.

1 Like

There is a bug, or an improvement, depending on how you want to see it.

This:

def foo
  (2..).each do |x|
    return 1
  end
end

typeof(foo) # => Int32 | Nil

The bug is that for a loop that never ends, the compiler think it ends, then we get Nil in the type.

The compiler knows (2..).each { ... } will never end. It can even know that at compile time (the E in Range(B, E) is Nil, there are already some checks around that). If there’s no end, we can generate while true as a loop. Then the compiler knows the loop never ends. Then the type of foo becomes Int32.

Just replace std code with my code here and you’ll see that the code from @jzakiya start to compile without modifications: Compiler issue with non-existent NIL conditions

Maybe a bug, maybe a feature request, I don’t know. Something can be done to improve the situation.

This compiles|runs using loop do ... versus (2..).each do |pow|.
WHY?
They are functionally equivalent (to a user), but obviously not to the compiler.

def palindromegen(digit, count)
  nums = [] of UInt64
  nn  = digit * 11
  num = 0.to_u64
  pow = 1
  loop do pow += 1        # replaces: (2..).each do |pow|
    base = 10_u64**(pow >> 1)
    n1   = base * digit
    n2   = base * (digit + 1)
    n1.step(to: n2 - 1, by: 10) do |n1|
      ns = (n1 // 10).to_s
      (n1..n1 + 9).each_with_index do |pal, i|
        num = (pal.to_s + pal.to_s.reverse).to_u64      if pow.odd?
        num = (ns + "0" + ns.reverse).to_u64 + base * i if pow.even?
        nums << num if num % nn  == 0
        return nums if nums.size == count
      end
    end
  end
end

count = 20
puts "First 20 palindromic gapful numbers 100 ending with:"
(1..9).each { |digit| print "#{digit} : #{palindromegen(digit, count)}\n" }
count = 100
puts "\nLast 15 of first 100 palindromic gapful numbers ending in:"
(1..9).each { |digit| print "#{digit} : #{palindromegen(digit, count).last(15)}\n"

This also compiles|runs using (2..).select do |pow|

def palindromegen(digit, count)
  nums = [] of UInt64
  nn  = digit * 11
  num = 0.to_u64
  (2..).select do |pow|
    base = 10_u64**(pow >> 1)
    n1   = base * digit
    n2   = base * (digit + 1)
    n1.step(to: n2 - 1, by: 10) do |n1|
      ns = (n1 // 10).to_s
      (n1..n1 + 9).each_with_index do |pal, i|
        num = (pal.to_s + pal.to_s.reverse).to_u64      if pow.odd?
        num = (ns + "0" + ns.reverse).to_u64 + base * i if pow.even?
        nums << num if num % nn  == 0
        return nums if nums.size == count
      end
    end
  end
end

This also works.

def palindromegen(digit, count)
  nums = [] of UInt64
  nn  = digit * 11
  num = 0.to_u64
  pow = 1
  while true
    pow += 1
    base = 10_u64**(pow >> 1)
    n1   = base * digit
    n2   = base * (digit + 1)
    n1.step(to: n2 - 1, by: 10) do |n1|
      ns = (n1 // 10).to_s
      (n1..n1 + 9).each_with_index do |pal, i|
        num = (pal.to_s + pal.to_s.reverse).to_u64      if pow.odd?
        num = (ns + "0" + ns.reverse).to_u64 + base * i if pow.even?
        nums << num if num % nn  == 0
        return nums if nums.size == count
      end
    end
  end
end

So there’s something intrinsically different to the compiler about (2..).each.

If you don’t make an effort to understand how the language works, and why it does what it does, you will keep complaining forever.

Or even read our replies here.

That’s what we’ve been telling you all along! :-)

This is a fruitless back-and-forth because you’re not focusing on what I’m raising, as a language user.

I consider the behavior in this example of the endless range form (2..).each a bug because it doesn’t behave as I expect and as I desire it to behave!.

You don’t care that I consider it a bug, or even just undesirable behavior, so I as a user have to bear the consequences of what I consider is a poor design decision, which can be fixed.

So there’s no point for me to discuss this anymore since I figured out how to write equivalent functional code that performs as I want.

I’m pretty sure he’s agreeing with you in that regard :laughing:. And I tend to agree that this specific case could be improved.

Hey @jhass I like your struct implementation, and ran it over all the outputs examples. It has comparable speed with my version (+|- 1|2 secs), and shows off Crystal’s OOP ability. Do you mind if I post it on Rosetta Code? I’ll list you as it contributor in that thread.

BTW: I corrected a minor error in a loop: 9.times should be 10.times

Sure, consider it CC0. I hope it can inspire future solutions and show how naming is important :)

2 Likes

FYI: final version posted to Rosetta Code.

struct PalindromicGapfuls
  include Enumerable(UInt64)

  @nn : Int32

  def initialize(@digit : Int32)
    @nn = @digit * 11                    # digit gapful divisor
  end

  def each
    (2..).each do |power|
      base    = 10_u64 ** (power >> 1)  # value of middle digit position
      this_lo = base * @digit           # starting half for this digit: 10.. to  90.. 
      next_lo = base * (@digit + 1)     # starting half for next digit: 20.. to 100..
      this_lo.step(to: next_lo - 1, by: 10) do |front_half|   # d_10; d_20; d_30; ...
        if power.odd?                   # for even number of total digits: ab..ba
          front_half.upto(front_half + 9) do |upper_half|     # d_n0 to d_n9
            palindrome_half = upper_half.to_s
            palindrome = (palindrome_half + palindrome_half.reverse).to_u64
            yield palindrome if palindrome % @nn == 0
          end
        else                           # for odd number of total digits: ab..m..ba
          palindrome_half = (front_half // 10).to_s
          palindrome = (palindrome_half + "0" + palindrome_half.reverse).to_u64
          # increase palindome middle digit from ---0--- to ---9---
          10.times do
            yield palindrome if palindrome % @nn == 0
            palindrome += base
          end
        end
      end
    end
  end
end

count, keep = 20, 20
puts "First 20 palindromic gapful numbers ending with:"
1.upto(9) { |digit| puts "#{digit} : #{PalindromicGapfuls.new(digit).first(count).last(keep)}" }
  
count, keep = 100, 15
puts "\nLast 15 of first 100 palindromic gapful numbers ending in:"
1.upto(9) { |digit| puts "#{digit} : #{PalindromicGapfuls.new(digit).first(count).last(keep)}" }
     
count, keep = 1_000, 10
puts "\nLast 10 of first 1000 palindromic gapful numbers ending in:"
1.upto(9) { |digit| puts "#{digit} : #{PalindromicGapfuls.new(digit).first(count).last(keep)}" }

count, keep = 100_000, 1
puts "\n100,000th palindromic gapful number ending with:"
1.upto(9) { |digit| puts "#{digit} : #{PalindromicGapfuls.new(digit).first(count).last(keep)}" }

count, keep = 1_000_000, 1
puts "\n1,000,000th palindromic gapful number ending with:"
1.upto(9) { |digit| puts "#{digit} : #{PalindromicGapfuls.new(digit).first(count).last(keep)}" }

count, keep = 10_000_000, 1
puts "\n10,000,000th palindromic gapful number ending with:"
1.upto(9) { |digit| puts "#{digit} : #{PalindromicGapfuls.new(digit).first(count).last(keep)}" }

Why did you rename @gap to @nn? I agree it’s not a good name, mainly because I had no real context on what this is even solving, but better than @nn certainly. If you feel the need to explain what it is in a comment, rename the variable instead. Code is write once, read often. So from the comment, sounds like @gap_divisor or something might be a good name.

What does lo stand for? I avoid abbreviations that are not 110% clear in the context.

I’m not sure why you feel the need to choose different variable names for upper_half and palindrome_half, reassigning there does not create a union since the variable is not captured anywhere. It’s the same value, just in different representations that are never needed at the same time. I even considered to just add a .map(&.to_s) initially, but wanted to avoid allocating the two iterators.

Also actually let’s use .skip(count - keep).first(keep) and maybe extract the printing to a method instead of copy pasting it five times. The initial two variants seems different enough to keep as is, but looking at those it seems too similar too often.

In general I harmonized all 3 versions (my 2 and yours) and used names that best (to me) conceptually portrayed|explained their roles in the algorithm, and ended up using 95% of your naming conventions. @gap is misleading, and @nn better, by showing the form of the divisor for each digit (considered @dd, but eh…). Basically, I wanted to use short|concise|clear names. The comments are more important than the names, like this_lo|next_lo, which really didn’t need to be changed from n1|n2 in my original versions, because they’re only used once. Ultimately, I chose the names that made the most sense to me, and made me smile. :smiley:

Tried .skip(count - keep).first(keep) too, but it provided no performance|memory advantages I could detect, so I kept the shorter code. (Though good to know it’s doable.)

I also added the Ruby versions, which are about a 98% literal translation.

The bottom line is, this was a good learning exercise for me. I think all the versions now are much easier to comprehend, and illustrate different capabilities of Crystal|Ruby. And I’m finished with this task.