First time being bitten by Crystal autocasting

This is probably not a bug, and more just something that caught me off guard really bad, but I finally had a pretty devastating bug go to production that I’m surprised wasn’t caught by the compiler.

I have some code that looks like this:

class PaymentGateway
  def charge_card(amount : Float64)
    #...
  end
end

class Product
  def amount_in_cents : Int32
    4499
  end
end

Which eventually does something like this:

gateway.charge_card(product.amount_in_cents)

We were doing a fairly large refactor where we had to move a property type from Float to Int, and we somehow missed the gateway update. All specs passed, all test charges through the test gateway went through fine. It wasn’t until it hit production and we had users contacting us that we learned this was sending $4499.00 to the payment processor instead of $44.99

I get why this works, and I’m sure changing it would lead to a lot more issues, but I still wanted to bring this up because of the impact. At the very least, let this serve as a warning for anyone else doing lots of money transactions in production :sweat_smile:

EDIT: Just for clarity here, my post isn’t asking “how” I could have avoided this. I’m well aware of how I could have avoided it. I probably shouldn’t have posted in “help & support”, but none of the topics were really “hey, heads up for new Crystal users”. I was more just surprised that Crystal didn’t catch this for me, and figured this could lead to some potentially bad bugs if others aren’t aware. Now that I know the no_number_autocast exists, it’s an easy workaround, and something to keep in mind.

2 Likes

Why no BigDecimal?

Be sure to write a spec that catches this to save yourself in the future :sweat_smile:.

You can turn it off via -Dno_number_autocast if you want to be more strict. Otherwise yea, this is more a business logic issue as the two numeric types are compatible.

Ref: Compile-time flags - Crystal

3 Likes

oh nice. I didn’t know about that flag.

Another way to solve it is to wrap your classes representing money in types, so that PaymentGateway would take some Dollar class and not a Float, whereas amount_in_cents would return someting denominated in cents.

Aside: representing money as float can be problematic, as they tend to not be able to represent money accurately.

7 Likes

I haven’t needed to make a proper library (as I don’t write software that handles money amounts), but if you’re interested in using fixed-point arithmetic for money representation, I have bindings to the LLVM fixed point arithmetic functions floating around somewhere on my machine, and I’m happy to save you that bit of work.

1 Like

I appreciate the offer. We don’t need anything super complicated. Int32 actually works great. It’s just the API we use requires something specific. Adding the no autocast flag is exactly what we needed; I just didn’t know it existed.

1 Like

I think the idea of the no_autocast thing is that it will eventually go away. It’s just there in case there’s a bug in that feature.

All of this to say: I think this is a very reasonable bad experience due to autocasting. That is, autocasting is very nice until it’s not, like in this example. So it’s definitely something we should consider dropping if it can cause silent bugs.

6 Likes

On the other hand, the case at hand could just as well have been Float64 in both places, and then autocasting couldn’t have been blamed.

Exactly. Autocasting may play a part in this, but I don’t see it as the major issue.

The problem is that both APIs have a different understanding of what a numeric value represents. Even if both used the same number type, they’d still disagree on whether it represents cents or a full denomination (dollar). Arguably, that would be less likely when only identical types match up.

So this particular bug would’ve been spotted without implicit autocasting. But that would be just coincidental. Autocasting is not responsible for the coding error existing in the first place.

Instead of disabling autocasting to avoid such problems, I’d strongly advise to follow the advise given here to write proper tests. Using value objects to represent concepts such as money would also be recommended.

7 Likes

Fixed point decimals and BigDecimal (@mdwagner) are nice. But for the vast majority of use cases, integer based amounts in the lowest representable fraction of a denomination (i.e. cents) are all you need for precise accounting. Ideally wrapped in a value type of course.

2 Likes

When we devised autocasting, we thought “if it fits, then it works”, and in several cases that’s reasonable. A library function returning Int32 can reasonable be casted to Int64, and it will be most reasonably to do so automatically.

But it’s valid to think that a change from a type might have more meaning than just the amount of values it can represent, as this example shows. Perhaps we should disallow such autocast from integer to floats?

3 Likes

I am more in favor to keep autocast only for literal values and not arbitrary expressions. Things can still go wrong in literals, but it feels much more scoped the chance.

Definitely this bug would have happen also if both sizes were using the same type: Int32, but there is a semantic difference about what that number represents: dollars, cents, 1/10 of cents, etc.

5 Likes

I was thinking about this as well.
The fact that it fits is not always an accurate representation of semantics. Integer and floating point types are very different systems for representing numeric values.
I assume most developers would probably not expect implicit casting between int and float types.

Between different integer types this is less surprising. They use the same representation format, with only differing in size. IMO autocasting integer types is a more reasonable feature and I’d expect it to be more useful.

Perhaps a more ideal solution that would be to remove the need for having to deal with different number types in Crystal programs except for some specific use cases (ref: [RFC] Simplify integers and floats · Issue #8111 · crystal-lang/crystal · GitHub). Technically, most programs should not need to use any of the smaller number types. The only major problem is with Int32 being the default type, but there are relevant use cases where you need bigger types (Int64 or Int128) and they work perfectly fine on most platforms, but they’re not the default type :man_shrugging:
As long as there is a common need to combine the use of different number types, I think it makes sense to make usage a bit easier for developers by supporting autocasting.
If that wasn’t the case, I’d even be happy about restricting operations between different number types even more (ref: [RFC] Disallow integer/float operations between different types · Issue #8872 · crystal-lang/crystal · GitHub).

2 Likes

I think I’m more inclined to leave this autocasting enabled. I believe this bug is more due to how types were used to represent something, and how later those same values started meaning something else.

The only scenario where I think this could lead to bugs is exactly this one: passing integer values around but assume they have a different meaning that just those integer values. In what other scenario passing 3 instead of 3.0 would make a difference?

6 Likes

LOL, we got the same problem in a production system last year.
Well, all related micro services are written in Ruby.

People make mistakes.

Would type safety help catching this bug?
I don’t know.

We upped our game in integration testing since that incident. Luckily it’s a green field project and our initial production run was a test run.

When develop a Crystal App, most of us may use cystal build without --release flag many times and use --release flag only a few times, so the usage frequencey of flag decide we should make --release as optional flag instead of default flag.

For most of us , if the frequency of not use autocast is more higher than use autocast, so we should turn it off defalut instead of turn it on default?

The enhanced autocasting is not meant to be an optional language feature that you can turn on and off at your desire. The only reason there’s a flag to opt out is in case of an emergency if some unforseen effect breaks existing code. There has been no reporting of that where the autocasting implementation was at fault.
This flag is expected to vanish in one of the next releases. And it’s probably becoming useless at some point when code in stdlib and other libraries will be written with autocasting in mind and would break without it.

There’s no way the compiler would support both autocasting and non-autocasting semantics side by side. We’re in for either having autocasting for everyone or no autocasting for everyone.

That’s very different from options such as --release which don’t have any semantic impact.

4 Likes

Some languages demands strict types and will rise compiler errors about not matching types, some script languages accepts anything and points errors at runtime, Crystal tries to autocast source type to target type and people like it but sometimes this can put some code at risk the developer don’t want, so… what about some syntax to pinpoint places where autocast should not be used, and only the defined type should be accepted in a strict mode or a compiler error should rise? For example:

class PaymentGateway
  def charge_card(amount : Float64!) # MUST be a Float64, notice the exclamation
    #...
  end
end

class Product
  def amount_in_cents : Int32
    4499
  end
end

gateway.charge_card(product.amount_in_cents) # Compiler error! Expected Float64, received Int32
1 Like

With Allow annotations on method/macro parameters · Issue #12039 · crystal-lang/crystal · GitHub support being released in 1.5, another option could be like:

class PaymentGateway
  def charge_card(@[Strict] amount : Float64)
    #...
  end
end

Naming TBD ofc. Could also be like @[NoCast], or @[Ignore("autocast")], …

1 Like