Constants and Compiler

This question is about how the compiler deals with numerical constants at compile time.

The following code compiles without errors when using the wrapping (overflow) operators &*, &**, etc. However, at runtime, for large number inputs, it gives overflow errors.

But most of these numbers are just representations of constants I thought would be precomputed at compile time, e.g. 46 &* 10 &**8 is the constant 4_600_000_000.

Is my thinking wrong?
And in cases like this, is it better (for speed, memory, etc.) to write out the values explicitly as constant numbers, or will the compiler do that anyway (though it seems it doesn’t)?

def select_pg(endnum, startnum)
  # local copies of the input bounds
  end_num   = endnum
  start_num = startnum
  range = end_num - start_num
  pg = 5
  if start_num <= Math.isqrt(end_num) #.to_u64   # for one array of primes up to N
    pg =  7 if end_num >  50 * 10 &** 4
    pg = 11 if end_num > 305 * 10 &** 5
  else                                            # for split array cases
    pg =  7 if (10 &** 6 <= range && range < 10 &** 7 && start_num < 10 &** 8)        ||
               (10 &** 7 <= range && range < 10 &** 8 && start_num < 46 &* 10 &** 8)  ||
               (10 &** 8 <= range && range < 10 &** 9 && start_num < 16 &* 10 &** 10) ||
               (range >= 10 &** 9 && start_num < 26 &* 10 &** 12)

    pg = 11 if (10 &** 8 <= range && range < 10 &** 9 && start_num < 55 &* 10 &** 7)  ||
               (range >= 10 &** 9 && start_num < 45 &* 10 &** 9)
  end
  primes = [2, 3, 5, 7, 11, 13].select { |p| p <= pg }
  {primes, primes.product} # [excluded primes, modpg] for PG
end

There are no constants involved here. It seems you assume arithmetic operations between number literals are evaluated at compile time. This is not the case. Operators can be overloaded, so the value of these expressions can only be determined at runtime.

The default type for integer literals is Int32, and that’s also the result type of arithmetic operations between Int32 values. The result of some of these calculations (for example 10 ** 10) doesn’t fit into Int32, which is why it overflows.

I don’t expect you really want to use wrapping operators here. That seems very odd.

The solution is probably just to use a bigger data type like Int64 if you have values bigger than what fits into Int32.
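To illustrate (a minimal sketch, assuming the default overflow-checking semantics): with Int32 literals the expression raises at runtime, while widening to Int64 gives the expected value.

begin
  puts 10 ** 10        # Int32 ** Int32: 10_000_000_000 doesn't fit into Int32
rescue ex : OverflowError
  puts ex.message      # => "Arithmetic overflow"
end

puts 10_i64 ** 10      # => 10000000000, fits comfortably into Int64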


By using the right type, Int64, it seems that the operation can be done at compile time:

require "benchmark"

x = y = 0_i64

Benchmark.ips(warmup: 0, calculation: 1) do |b|
  b.report("4_600_000_000") do
    x = 4_600_000_000
  end
  b.report("46_i64 &* 10 &**8") do
    y = 46_i64 &* 10 &**8
  end
end

puts x
puts y
    4_600_000_000  19.74M ( 50.66ns) (±18.45%)  0.0B/op        fastest
46_i64 &* 10 &**8  19.53M ( 51.19ns) (±17.43%)  0.0B/op   1.01× slower
4600000000
4600000000

Crystal source:

42i64 &** 10i64

LLVM IR:

call i64 @"*Int64@Int#&**<Int32>:Int64"(i64 42, i64 10)

It seems LLVM does some optimizations from there. But that’s outside the control of the compiler, i.e. Crystal can’t make any guarantees about compile-time optimizations.
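If you want to check what LLVM actually emits for an expression like this, you can dump the IR yourself and look for the folded constant, for example:

crystal build --release --emit llvm-ir foo.cr   # writes foo.ll, which you can inspect for the constant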

And it definitely requires wrapping operators. I wouldn’t recommend using them unless you specifically want wrapping behaviour, because it’s easy to silently break your code when the data type isn’t big enough. Also, they look ugly, and that’s for a reason.

For your use case it would probably be great to use scientific notation for such constants, like 50e4. Unfortunately that’s currently only supported for float types.
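In the meantime, a float literal in scientific notation can be converted explicitly (a small workaround sketch, not a substitute for real integer scientific notation):

n   = 50e4.to_i64   # 50e4 is a Float64 literal; => 500000 (Int64)
big = 46e8.to_i64   # => 4600000000; exact here, since 4.6e9 is exactly representable as a Float64
puts n, big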

Thanks @straight-shoota for your proposal for creating scientific notation for integer constants:

Just to make sure I understand it, with your proposal 46_i64 &* 10 &**8 could be written as 46e8u64 or 46e8i64, correct?

This would definitely make numerical based source code simpler to write, and more concise.

Let me propose an additional feature to add to Crystal that would not only apply to numerical constants, but also to Strings and possibly other object types as well.

The day after Christmas I proposed this new feature regarding constants on Ruby issues:

Interpreting constants at compile time: https://bugs.ruby-lang.org/issues/17474

Basically, I proposed taking a feature that’s been a part of Forth since its creation in 1970 (by Chuck Moore): allowing expressions that evaluate to constants (values/objects) to be computed at compile time (in Ruby’s case, while the code is parsed), with the resulting constants used in the runtime code. Thankfully (after one rejection), it looks like they are considering it.

This feature would create syntax for users to tell the compiler to evaluate source code expressions that result in constants at compile time, and substitute the computed results into the runtime executable, to save doing those computations at runtime.

So if we use the syntax, say, [[...]], then:

[[46_i64 &* 10 &**8]] would be evaluated and replaced with 4_600_000_000.

But that’s a simple case that would be addressed by the integer scientific syntax.

Its real value will be in converting expensive operations that produce constant values/objects.

Examples:

[[Math.sqrt(Math.cos(Math::PI/6)**2 + Math.sin(Math::PI/6)**2)]]

... [[SHA.digest.new("some string or value")]]...

...[[("Hello World " * 3).reversed]] ...

...[["Merry Christmas" + " and " + "Happy New Year!"]] ...

Other languages, to some extent, allow some of this capability (see Rust below), but these are cumbersome, as they require constants to be extracted outside their place of use. This feature would allow users to write their source code to show exactly what the algorithms and operations within the code are doing (self-documenting), and to tell the compiler which expressions to evaluate at compile time, to increase performance.

I know this may not be something that could make it into 1.0, but I hope this is something you will consider, as I don’t think (conceptually) it should be that difficult to implement.

Why do we need new syntax for something you can already do?

VALUE = Math.sqrt(Math.cos(Math::PI/6)**2 + Math.sin(Math::PI/6)**2)

If you want it to be a constant, why not just use a constant? Or better yet, use the resulting value instead of having it be an expression.
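For example (hypothetical constant names, just to spell out the two alternatives):

# Option 1: name the expression as a constant, so it is written once
# and its intent stays visible at the definition site.
UNIT_NORM = Math.sqrt(Math.cos(Math::PI/6)**2 + Math.sin(Math::PI/6)**2)

# Option 2: since the expression always evaluates to 1.0, just use the value.
UNIT_NORM_LITERAL = 1.0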


Hey @Blacksmoke16, I was giving examples to show conceptually how it would work. Like I said, a lot of numerical expressions might not need this if the compiler already evaluates them to constants to use at runtime.

But for ''expensive" and compound expressions, especially for non-numeric expressions, this would come in handy.

This is sort of equivalent to the use of pragmas, e.g. in Rust, Nim, etc., where you can annotate source code with compiler directives (like inline pragmas), giving users finer-grained control over how the compiler produces the resulting executable.

No compiler (writer) can ever anticipate every possible case (especially for compound operations) that could be evaluated at compile time into runtime constants. This gives users the ability to explicitly tell the compiler how to treat source code in cases the compiler won’t naturally handle optimally.

Such a syntax technically already exists in Crystal: Macros.
Macro expressions are evaluated at compile time.

The macro language is by far not as extensive as regular Crystal code with the stdlib. So of your examples, only the last one actually works right now:

{{"Merry Christmas" + " and " + "Happy New Year!"}}

Using custom types and custom methods in macro land isn’t currently supported. There are proposals like https://github.com/crystal-lang/crystal/issues/8835 which would improve on this. But it’s still an open debate whether adding more complexity to the macro language is actually a good idea.

You can also move expensive compile-time calculations out of the compilation process using the run macro. That executes a Crystal file at compile time and inserts the result into the source code. This adds a lot of overhead, though, so it’s only helpful for really heavy things.
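A rough sketch of that mechanism (the helper file name and the computed value are made-up examples):

# gen_constant.cr (hypothetical helper file), executed at compile time;
# whatever it prints is inserted at the macro call site
puts 46_i64 * 10_i64 ** 8

# main.cr
BIG = {{ run("./gen_constant.cr") }} # expands to `BIG = 4600000000` before codegen
puts BIG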

What I’m proposing is a very lightweight mechanism for users to annotate code sections, in place, that can be evaluated to constants, with minimum semantics. It’s only for this purpose, and nothing else.

Using macros would be way more than what is needed to do this. Plus, users would have to extract the code sections into probably multiple external macros, when all they want to do is affect specific code sections that may have no relation to each other (one snippet could be a number, another a String, another a Symbol, etc.).

When you develop code, your first goal is to get it working to produce the outcomes you seek; then you can try to optimize it (speed, memory, etc.). This would be a very simple way to annotate code in place to test these kinds of optimizations.

It would essentially be a switch in the compilation process that says: for this annotated code section, compile it as normal, but then execute it, substitute its result in place, and continue with compilation. This would only apply to code sections that have no runtime inputs/dependencies.