Collections: `any?` vs `!empty`

straight-shoota · January 28, 2023, 12:20am

A PR for applying ameba linter rules in the shards repo sparked a discussion about the semantics of any? vs. !empty? for collections.

The original discussion is in this comment thread:

I think this topic warrants its own discussion in the broader community.

The discussion was sparked by the ameba rule Performance/AnyInsteadOfEmpty which claims that calls to any? should be replaced by negating the result of empty?.
Enumerable#any? returns true if at least one of the collection’s members is truthy whereas Enumerable#empty? returns false if there is at least one member in the collection, regardless of whether it’s truthy.
The vast majority of use cases care about the plain existence of any item in the collection and don’t care about truthiness. In fact, truthiness doesn’t matter at all when the item type cannot be falsey. That’s the case when it does not include Nil, Bool, or Pointer - those are the only types that can have falsey values - and means that any? and !empty? are exactly equivalent.
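
To make the difference concrete, here is a minimal sketch. The variable names are purely illustrative and not taken from the shards code base:

```crystal
# Item type Int32 cannot be falsey: both predicates always agree.
numbers = [1, 2, 3]
puts numbers.any?    # => true
puts !numbers.empty? # => true

# Item type Bool | Nil can be falsey: the predicates can disagree.
flags = [nil, false]
puts flags.any?    # => false (no truthy member)
puts !flags.empty? # => true  (there are two members)
```

Only the second collection is affected by the choice between the two methods.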

(Note: I think the categorization of that rule is wrong. It’s primarily about semantics, performance is only an additional aspect and actually irrelevant because there’s no performance difference when the item type cannot be falsey - and if it can, there’s a difference in semantics and performance doesn’t matter.)

(Note 2: We’re only talking about the #any? overload that doesn’t take any arguments or blocks. The other variants have specific semantics that are unrelated to this discussion.)

I care a lot about code readability.
!empty? is a double negation. Resolving that to a positive predicate reduces cognitive load. When called with a receiver, the additional visual distance between the method name and the prefixed negation operator further increases cognitive load.
The diff from that original discussion is a good showcase:

-    spec.dependencies.any? || (Shards.with_development? && spec.development_dependencies.any?)
+    !spec.dependencies.empty? || (Shards.with_development? && !spec.development_dependencies.empty?)

I find the original expression much easier to comprehend than the suggested alternative.

The semantics in this case are identical: the type of spec.dependencies is Array(Dependency), so it cannot have any falsey values.

So for this example, I see no good reason to change this code to a less readable version for no other gain.

This thoughtbot article was brought up to underline the preference of !empty? over any?

I completely agree with the article’s sentiment for Ruby. But Crystal is different. Typed collections mean that in most cases any? is already semantically exactly equivalent to !empty?. And what’s important: the item type precisely attests that. So the type of the collection tells whether it could possibly contain any falsey item, and determining that does not require actually iterating over the collection, as it does in Ruby.
Collections including boolean values or pointers are relatively rare in the first place. The main application of falsey item values in a collection is nil. That applies to both Crystal and Ruby (the latter doesn’t even have pointers) but in Crystal collections with nilable item types are much less common due to static typing: If a type is nilable, you have to explicitly handle that. Thus it’s common to get rid of nil values as early as possible to prevent the compiler constantly bugging about it.
So we see far fewer nil values appear in Crystal collections than in Ruby. [1]
And remember, in Crystal when the item type cannot be falsey, any? == !empty? applies.
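
As a rough illustration of dropping nil values early, the following sketch uses Array#compact. The variable names are made up for this example:

```crystal
# A nilable item type: any? and !empty? can disagree.
raw = [nil, nil] of Int32 | Nil
puts raw.any?    # => false
puts !raw.empty? # => true

# compact drops the nil values and narrows the item type to Int32,
# after which the two predicates cannot disagree any more.
clean = raw.compact
puts clean.any?    # => false
puts !clean.empty? # => false
```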

Maybe it would help to clear things up if any? had exactly the same meaning as !empty?. That would require going the extra mile for the alternative and less commonly used meaning of “has any element that’s not falsey” with an explicit any?(&.itself) or a new method like any_truthy?.
A possible compromise could also be to concede the name of any? to retain the current semantics, but introduce another method that’s exactly equivalent to !empty?. I can’t think of a good concise name for that, though.
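
A minimal sketch of what the explicit variant could look like, assuming it simply forwards to any?(&.itself). any_truthy? is only the name floated in this post; it does not exist in the standard library:

```crystal
module Enumerable(T)
  # Hypothetical method, not part of the standard library.
  # Spells out the "at least one truthy member" meaning explicitly.
  def any_truthy? : Bool
    any?(&.itself)
  end
end

mixed = [nil, false, 0] # 0 is truthy in Crystal
falsey_only = [nil, false]

puts mixed.any_truthy?       # => true
puts falsey_only.any_truthy? # => false
```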

[1] Doing the same in Ruby would probably be a good idea, it’s just that nobody bugs you about it.

cyangle · January 28, 2023, 1:51am

any? asks whether there’s any truthy member; empty? asks whether the collection’s size is 0.
Most of the time, people are just checking whether the size of the collection is 0 or not. any? and !empty? are two different things, and most of the time you want !empty?, not any?.

I prefer correctness over readability.

There’s a similar issue with using if to filter out nil values:

x : Bool? = false
if y = x
  puts "y is not nil"
else
  puts "y is nil"
end
# => y is nil

The code below is ugly, but it always filters out nil values only:

x : Bool? = false
unless (y = x).nil?
  puts "y is not nil"
else
  puts "y is nil"
end
# => y is not nil

Personally I don’t like code that’s only correct with certain conditions.

cyangle · January 28, 2023, 1:59am

I highly recommend John Ousterhout’s book: A Philosophy of Software Design.

Code with special cases introduces complexity.

HertzDevil · January 28, 2023, 1:59am

The type of spec is defined here:

https://github.com/crystal-lang/shards/blob/aa5fa555a9590319c812ed925968e2839fe56bec/src/commands/command.cr#L39-L45

def spec
  @spec ||= if File.exists?(spec_path)
              Spec.from_file(spec_path)
            else
              raise Error.new("Missing #{spec_filename}. Please run 'shards init'")
            end
end

The type of spec.dependencies is defined here:

https://github.com/crystal-lang/shards/blob/aa5fa555a9590319c812ed925968e2839fe56bec/src/spec.cr#L192-L194

def dependencies
  @dependencies ||= [] of Dependency
end

So it takes the reader extra time to go through these two places to infer that spec.dependencies is really an Array of elements that cannot be falsey, such that any? and !empty? are equivalent, when logically that snippet is merely checking for non-emptiness. All of this adds more cognitive load to the above snippet than typing out !empty?. If the double negation is so irritating then even size > 0 is preferable over any?.

That said, I agree that semantic checks like this could be disabled on purely syntactic linters like Ameba, as the find.not_nil! → find! check is precisely why we had to release Crystal 1.7.1.

tsornson · January 28, 2023, 5:27am

I’m going to agree with both sides so far - I find the !empty? to be cognitively heavier than it needs to be, and the current any? only working as expected when the collection type is non-falsey is unintuitive and likely to cause problems.

Would it make sense to special case the any? method that doesn’t take a block argument to only check if the collection is not empty, regardless of truthy or falsey types?

When I read the block argument version, I instinctively translate it to “are there any elements in this collection that look like this?”, while for the non-block version I translate it to “are there any elements in this collection?”. The truthy vs. falsey aspect is unintuitive to me unless I know it’s a holdover from the block version of the method. If the values of the collection matter to me in a truthy vs. falsey way, I think I’d first naively try any?(&.itself). Maybe that’s just me, though.

Sija · January 28, 2023, 9:24am

@straight-shoota Btw, where’s the double-negation in the first place? Empty is a state, it’s not a negation. The whole thing reads exactly like it should - as an answer to the question “Is the collection empty?”: “No, it’s not empty.”

straight-shoota · January 28, 2023, 1:14pm

Yes, that’s exactly what I’m wondering about.

If we only talk about the specific circumstances of this instance or the general axiom to avoid any?, we’re not getting to the root of the problem.
I think Enumerable#any? is wrong.
The mere presence of a linter rule that explicitly and without restriction suggests not using it is a clear sign that something is at odds.

There are two meanings associated with any?:

1. Are there any elements in the collection?
2. Are there any truthy elements in the collection?

That’s a subtle difference, but for most cases it doesn’t matter because both semantics align.
The intention of Enumerable#any? is 2. but people tend to expect 1. That’s exactly what the ameba rule is about, to address the probable misconception about any?.

I think the semantics should rightfully prioritize 1. That’s what a reader implicitly assumes and it’s a far more common use case than 2.

Now changing this would be hard. It’s a breaking change and it’s a silent change of behaviour.

Maybe a possible route could be to deprecate Enumerable#any? for collections that can contain falsey types. That’s currently not possible, but we could enhance the compiler to allow issuing deprecations from macros. Then we could drop that in the next major release to make a hard break, and following that we could introduce it again in a minor release. That’s of course a long journey, but maybe it’s worth it?

An alternative solution would be if we could find another method name to express 1. semantics, then we could introduce that independently of any?. But I don’t have much hope for that.

straight-shoota · January 28, 2023, 1:16pm

Yes, empty is a state. But I don’t care about empty, I care about knowing if there is any element. Empty is the negation of that state. With negating that again to get what I actually need, I’m using a double negation.

straight-shoota · January 28, 2023, 1:21pm

size > 0 expresses the intent correctly, but it’s not a good implementation for generic collections. Enumerable#size actually counts all the elements. At best that only leads to bad performance. But it can have worse effects such as invalidating the entire collection or entering an infinite loop (Remove iteration in Enumerable#size · Issue #10014 · crystal-lang/crystal · GitHub).
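
A small sketch of the iteration cost, assuming a type that only includes Enumerable and therefore falls back to the default #size and #empty? implementations. Countdown and its yielded counter are invented for this illustration:

```crystal
# Hypothetical collection that records how many elements it has yielded.
class Countdown
  include Enumerable(Int32)

  getter yielded = 0

  def initialize(@from : Int32)
  end

  def each(&)
    @from.downto(1) do |i|
      @yielded += 1
      yield i
    end
  end
end

sized = Countdown.new(1000)
sized.size > 0
puts sized.yielded # => 1000, the default #size walks every element

probed = Countdown.new(1000)
!probed.empty?
puts probed.yielded # => 1, the default #empty? stops at the first element
```

Types such as Array override these methods and answer from a stored size, so the cost only shows up for generic Enumerable implementations.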

HertzDevil · January 28, 2023, 1:31pm

This is not mentioned yet, but Indexable#presence could be an option too, considering its roots in Rails

yxhuvud · January 28, 2023, 6:46pm

Thankfully most collections will be of the types that actually provide a more efficient implementation, like Arrays.

Ragmaanir · January 29, 2023, 9:07pm

I personally tend to avoid any? too because of the extra meaning. Would be nice to have a present? or some? method that just is the same as !empty?.

zw963 · January 30, 2023, 4:26pm

I haven’t read the full thread, but obviously any? is not the same as !empty?. The inverse of any? is none?, whereas empty? only checks whether the number of elements is zero.

See the following Ruby example:

[8] pry(main)> [nil,false].any?
=> false
[9] pry(main)> [nil,false].none?
=> true
[10] pry(main)> ![nil,false].empty?
=> true

hugopl · February 1, 2023, 12:19am

Having any? and any_truthy? seems to me the best option (for Crystal 2.x). Maybe it’s because I’m not a native English speaker, but for a long time I thought that any? was the inverse of empty?.

asterite · February 1, 2023, 2:01am

If we could start from scratch, I would have the non block versions just rely on counts.

I think it could be a valid path forward behind some flag, so that when 2.0 lands that’s the behavior.

hugopl · February 1, 2023, 3:07am

I like this idea: Without the flag, a deprecation warning telling to use any_truthy?, with the flag, no warnings and any? behaving like !empty?.

Xen · February 13, 2023, 1:05pm

I’ll just say that I’m the type that’ll take the readability of any? even with the gotchas. Just like I’ll use if ($something) in PHP rather than if (!empty($something)), even though I know they’re not quite the same thing (and protect myself against the edge cases in other ways).

straight-shoota · September 20, 2023, 6:52pm

Changing the semantics of Enumerable#any? would be nice. But it’s gonna take a long time.

So I’m thinking about a solution that can be implemented in a reasonable time frame.
I’d like to introduce a new method for determining if the collection contains any elements. It would return the negation of #empty?.

Following HertzDevil’s suggestion (Collections: `any?` vs `!empty` - #10 by HertzDevil), Enumerable#present? could be an option.
I’m a bit wary about this because it’s related to String#presence, which considers whitespace as empty. So these semantics would be comparable to falsey items in a collection.

Another option would be Enumerable#populated?. It’s a bit longer but maybe a bit less ambiguous?
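
For illustration only, a minimal sketch of such a method, assuming it is a plain negation of #empty?. populated? is just the candidate name from this post and does not exist in the standard library:

```crystal
module Enumerable(T)
  # Hypothetical method, not part of the standard library.
  # True if the collection has at least one element, truthy or not.
  def populated? : Bool
    !empty?
  end
end

flags = [nil, false]
puts flags.populated? # => true
puts flags.any?       # => false

nothing = [] of String
puts nothing.populated? # => false
```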

Thoughts?

HertzDevil · September 20, 2023, 8:18pm

By the way, this is how #presence and friends would look in Crystal if we copy ActiveSupport’s definitions:

class Object
  def blank? : Bool
    responds_to?(:empty?) ? !!empty? : !self
  end

  def present? : Bool
    !blank?
  end

  def presence : self?
    self unless blank?
  end
end

struct Nil
  def blank?
    true
  end
end

class String
  def blank?
    # same as before
  end
end

ActiveSupport considers collections with blank? elements to be present?, so only String has the special treatment regarding spaces, and [nil, false].presence would return itself. (I’m not sure about the purpose of the other overrides.)

kojix2 · September 21, 2023, 1:50pm

I have never used any in the sense of !empty?

In my Japanese misunderstanding of English, I feel that it is normal for any to take a block. In Japanese, any is translated as “どれか（なにか）～がある” and the “～” usually acts as a block. It is not impossible to mean “there is more than zero” by not passing the block “～”, but such sentences are rare.

1. Are there any elements in the collection?
2. Are there any truthy elements in the collection?

I just learned that 1 is preferred in English. If I have to choose between one of the two, I would definitely prefer to keep 2. I like the idea of present or some. (However, this is my personal choice and I would like to follow the community’s opinion.)

I remember that this kind of issue was discussed on slack ruby-jp before, but I don’t remember how the discussion went because slack deletes old posts…
