someMethod?, someMethod and someMethod!

IMO the method names are somewhat inconsistent.
If there is a foo?, one can rely on it returning either some value or nil (unless it is a predicate that returns a Bool in the first place).
If there is a foo!, one can also rely on it raising when there is no value.
But in general it’s not obvious how the bare foo will behave, or whether it has a foo? or a foo! counterpart.

An example for foo? and foo

enumerable.max?
enumerable.max

An example for foo and foo!

enumerable.find
enumerable.find!

and, to some extent, even not_nil! (even though, for obvious reasons, there is no not_nil)

While I prefer the latter (probably so the ? can be reserved for boolean methods, and because it fits well with not_nil!), I don’t much care which variant replaces the other.
It would just be nice if it were always handled the same way.
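
To make the two naming patterns concrete, here is a minimal sketch using the stdlib methods mentioned above. The sample data and variable names are only illustrative.

```crystal
numbers = [3, 1, 2]
empty = [] of Int32

# foo / foo? pair: the bare method raises, the ? variant returns nil.
biggest = numbers.max # 3
none = empty.max?     # nil (empty.max would raise instead)

# foo / foo! pair: the bare method returns nil, the ! variant raises.
missing = numbers.find { |n| n > 5 }    # nil
first_big = numbers.find! { |n| n > 1 } # 3 (raises if nothing matches)
```

So the bare name raises in the first pair and returns nil in the second, which is the inconsistency in question.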

From what I’ve seen, ! more commonly (in Crystal at least) means that a method mutates self, e.g. Hash#merge! versus Hash#merge. Because of this, a method without ! or ? would raise by default, and you’d use the ? version if you want nil instead of an exception. ? can also denote a method returning a Bool, like active?.
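
A small sketch of that reading of the suffixes, using Hash#merge and Hash#merge! (the hashes and variable names are made up for the example):

```crystal
base = {"a" => 1}
extra = {"b" => 2}

merged = base.merge(extra) # returns a new Hash, base is untouched
size_before = base.size    # 1

base.merge!(extra)       # mutates base in place
size_after = base.size   # 2

has_b = base.has_key?("b") # ? as a Bool predicate: true
```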


I can absolutely live with that. But then things like Enumerable’s index/index! and find/find! (and possibly others) should be adjusted to follow this pattern :+1:


I think it’s a good idea, but because of the 1.0 backwards compatibility guarantee it’s going to be hard, or I’d say impossible, to make this change.

I think it’s just something we’ll have to live with.

Can this be a 2.0 change?

I think someone should build a list of all the needed changes. Maybe once that’s done we’ll realize there’s no correct way to standardize it that makes sense. But let’s build the list first.


I don’t think there is any reason to change anything.

The general idea with these method names is that the method name without any suffix provides the most expected behaviour. Variants with alternative behaviours exist with the ? or ! suffixes, providing nilable or raising alternatives to the main method.

  • Enumerable#max compares the elements in the collection and returns the biggest one. The collection is expected to be non-empty (otherwise it raises). If you want to handle the empty case, you can use #max? to return nil instead of raising.
  • Enumerable#find searches the collection and returns the first matching element. It is to be expected that there might be no matching element, in which case it returns nil (or any other configurable value). If you are sure there is at least one matching element (or want this to fail otherwise), you can use #find! which raises.
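
As a rough illustration of handling the non-default case for each of the two methods (sample data and variable names are arbitrary):

```crystal
words = ["apple", "kiwi"]
empty = [] of Int32

# find takes an optional fallback that is returned when nothing matches.
fallback = words.find("none") { |w| w.size > 10 } # "none"

# max? lets you supply a default for the empty case ...
max_or_zero = empty.max? || 0 # 0

# ... while the bare max raises Enumerable::EmptyError on an empty collection.
rescued = begin
  empty.max
rescue Enumerable::EmptyError
  -1
end
```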

I think this is a pretty good system. It might not always be clear whether a method without suffix raises or returns nil. But the code usually behaves as you would expect. It’s easy to use because it has good defaults.
The main method (without any suffix) is a good fit for most use cases with implicitly reasonable semantics. You only need to employ one of the variants when you need something special.

If we were to use a static system with two fixed styles of method names for raising and nilable variants, using these methods would require an explicit decision on which semantics to use. That makes them harder to use. Instead of following a good default, you’d have to think about these details.


IMO the ask is not to change semantics but to devise a mechanism for the stdlib and documentation to stick to the conventions, to avoid confusing newcomers to the language. Even the examples you have provided in your post contradict that convention :slight_smile:

Enumerable#max raises while Enumerable#max? returns nil

vs

Enumerable#find returns nil while Enumerable#find! raises

I find it unpredictable right now.
foo[] vs foo.find: I don’t see why I would expect one of those to be more likely to raise on failure than the other. And if they weren’t equally likely to fail, I couldn’t tell you which one I would expect to fail more often. That is especially true for these two methods, as I use them for exactly the same thing: getting one specific object from a container. If I can address the object directly (by its position in an array or its key in a hash) I use [], and if the object has to be identified by its instance variables etc., I go with find. They are imo equally likely to fail, as they solve the exact same problem in 99% of cases (my cases at least). Yes, [] picks the only possible element, while find returns the first of potentially multiple matches if one uses it like that; I actually can’t even think of a use case for wanting just the first of multiple results.

So far I had understood “!” (when it doesn’t mean “mutates self”) as “this might raise, but I will either handle the exception myself, can ensure it won’t fail, or am fully aware of the consequences if it goes wrong”. It’s been like that with almost all such methods, especially with casting: “!” means it’s now my responsibility. But then this pattern should be followed always or never, so one really knows what to expect. I was actually surprised by the behaviour of find; I did not expect it, and was instead quite sure find stood for something else entirely (the same way I often get confused by first, which I frequently want to use with the same intention).

I’m totally fine if everyone else is happy with the status quo. But imo it’s so wrong that it could almost be considered a bug. I would have opened an issue on GitHub, but as it behaves the way the docs describe, it’s not a bug but a choice. Like I said, I accept it if it stays the way it is. But I really don’t agree with the reasons given to justify the current situation.

Just try it yourself (without looking it up): which methods foo can you think of that either return a specific object from a container or return nil (and have a foo! that raises on failure)? any, none, one, minmax, …, maybe? find’s pattern isn’t unique, but it’s very rare (e.g. Enumerable actually has only two such methods, of which find is one), and I would have no clue which two methods those are if I hadn’t eventually looked it up (and I do so again every time I forget).

So if I understand it correctly, all methods should have two variants: the no-suffix one which returns nilable, and the ! which raises? Or would it be the other way around, the non-suffix one which raises and the ? which returns nilable?

When we started designing the language we faced this choice. For array, if you have code like this:

a = [1, 2, 3]
puts a[0] + 1

it just felt strange to us that that wouldn’t compile. I can see that the array has 3 elements, so it would be strange to have to write a[0]!. We felt that the code would be littered with ! for no real reason, and that it wouldn’t look good.

That said, the same argument could be made with this code:

a = [1, 2, 3]
e = a.find { |x| x < 2 }

because I can clearly see that there’s such an element.

We just felt that when you access an array by index you usually deal with arrays with a known size. Like, it’s super strange to see in code things like a[28]. Sometimes you want to work with the first element, so you do a[0] or a.first, both of which raise if the element is not found. Or maybe you want the last one, which is a[-1] or a.last. But trying to find an element that matches a condition feels like sometimes you wouldn’t find what you want.
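
A short sketch of what that trade-off looks like with the current stdlib behaviour (variable names are illustrative only):

```crystal
a = [1, 2, 3]
empty = [] of Int32

plus_one = a[0] + 1   # compiles because a[0] is Int32, not Int32 | Nil
out_of_range = a[28]? # nil; the bare a[28] would raise IndexError
ends = {a.first, a.last, a[-1]}
maybe_first = empty.first? # nil; the bare empty.first would raise

# With a nilable result the compiler requires a nil check before arithmetic.
found = a.find { |x| x < 2 }
found_plus_one = found ? found + 1 : nil
```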

We could try standardizing all of this for Crystal 2.0. But for that we need someone to build the entire list of methods and how they would like them to behave.


Just to clarify: This is not about the likelihood of success. It’s about the expected semantics of the operation and its premises.

When using an array accessor #[], you’re asking to fetch the element at a specific index. This index is a piece of information that is probably based on some assumptions on the collection which make you expect an element at this index.

The #find method, on the other hand, is far less predictable, and it is often unclear whether a matching element even exists. Of course, this depends on the kind of collection and the type of the condition. There may be cases where it’s obvious that some value will match. Then you could use #find! instead.
But the semantics of #find fit well for the generic case where you’re just looking through the collection to see if any element matches.

Why is this important, though? I think it’s quite easy. If you want to use a method with specific semantics (such as #find), you just use that one. Maybe you need to look up the exact name for some of them.
If you use the main implementation (without any suffix), you’ll get the default behaviour that works best for this specific operation. It’ll most likely be what you need.


To add another perspective: many times I used hash[...] and hit a KeyError when I really should have used hash[...]?. In the case of a Hash it’s less clear whether fetching an element by key will likely succeed or not, so maybe hash[...] should have been nilable from the beginning, with hash[...]! as the raising variant.
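
For reference, a minimal sketch of the current Hash behaviour being described (the hash contents are made up):

```crystal
ages = {"alice" => 30}

known = ages["alice"]               # 30
unknown = ages["bob"]?              # nil instead of raising
with_default = ages.fetch("bob", 0) # 0

# The bare accessor raises KeyError for a missing key.
message = begin
  ages["bob"]
  "found"
rescue KeyError
  "missing key"
end
```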


I’ll also throw in my request for a consistent convention regarding ? and ! for method suffixes in the stdlib. I’m not too opinionated on what that would be, so long as I could infer information about <method> given only its suffix, before even reading the method name. Sounds like current behavior is:

  • <method>? - returns Bool or instance of nilable type
  • <method> - returns nil (e.g. no return methods), instance of any type, or raises
  • <method>! - returns modified self, or raises

The proposal I saw above by @anon69898395 seems to be instead:

  • <method>? - returns Bool or instance of nilable type
  • <method> - returns nil (e.g. no return methods), instance of any non-nilable type, or raises
  • <method>! - returns modified self

I like this since I can predict what <method>! does without guessing, and <method>? is the only method I would expect to find within an if or other conditional statement. The bare <method>, by convention, gives me something exclusively nil, exclusively a non-nil instance, or it raises.

An alternative approach I think @asterite is driving at as a possible example is:

  • <method>? - returns Bool or instance of nilable type
  • <method> - returns nil, or instance of any type
  • <method>! - returns instance of non-nilable type, or modified self, or raises

This proposal feels like it mixes the punctuation meaning a bit, with both <method>? and <method> returning nilable types, and <method>! being an extra safety check or a modification of self.

The argument that @straight-shoota makes for find vs find! feels more subjective to me - if I have a collection, it’s more often I have certain guarantees around it rather than having little idea what’s in it, and I usually approach it with “Somewhere in this collection is an element with certain properties. Find it.” as opposed to “This collection might have an element with certain properties. Try and find it?”. The punctuation of both those sentences should indicate the first method I usually try and the behavior I expect first :smile: A separate method query might have better semantics for what he describes as find instead, since the word query already implies a question, and query! would override that to being “this should return something, dangit!”. Regardless, the meaning of the punctuation here changes depending on the method name and the interpretation of the reader.
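
To sketch what that could look like: the query and query! methods below are purely hypothetical and do not exist in Crystal’s standard library; they are only added here by reopening Enumerable for illustration.

```crystal
# Hypothetical methods, not part of the stdlib.
module Enumerable(T)
  # "This collection might have such an element. Try and find it?"
  def query(& : T ->) : T?
    each { |elem| return elem if yield elem }
    nil
  end

  # "This should return something, dangit!"
  def query!(& : T ->) : T
    each { |elem| return elem if yield elem }
    raise "no element matched"
  end
end

hit = [1, 2, 3].query { |x| x > 1 }  # 2
miss = [1, 2, 3].query { |x| x > 5 } # nil
sure = [1, 2, 3].query! { |x| x > 2 } # 3
```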

My two cents at least, and apologies if I misconstrued anyone’s points above :stuck_out_tongue:
