The Crystal Programming Language Forum

Brainstorm: how to have more Ruby-like iterators?

In Ruby if you write this:

a = [1, 2, 3]
e = a.each

e.each do |x|
  p x
end

e.each do |x|
  p x
end

You get 1, 2, 3, 1, 2, 3 as output (with newlines in between).

In Crystal you just get 1, 2, 3.

I think Ruby’s behavior is a bit more intuitive. An iterator represents some state ready to be executed, and you can execute it as many times as you want.

I see this as similar to Rails ActiveRecord. You can do:

# This represents the users with id > 5
users = User.where('id > 5')

# We can iterate them all
users.each do |x|
end

# And again if we want
users.each do |x|
end

That is, users is a computation ready to be executed, and it can be executed as many times as we want.

The thing is, Ruby also has next and rewind on its Enumerator. But calling next or rewind doesn’t affect what happens with each, map, to_a, etc.

However, I think mixing each with next is never done in practice: you either want internal iteration or external iteration.
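To make the point about next and rewind concrete, here is a small Ruby check (the variable names are only for illustration). It shows that the external cursor moved by next is separate from what to_a and map see:

```ruby
a = [1, 2, 3]
e = a.each                     # no block given, so this is an Enumerator

first  = e.next                # external iteration: moves the cursor
second = e.next

# Internal iteration always starts from the beginning,
# no matter where the external cursor currently is.
all     = e.to_a
doubled = e.map { |x| x * 2 }

third = e.next                 # the cursor was not touched by to_a / map

e.rewind                       # only resets the external cursor
after_rewind = e.next

p first, second, all, doubled, third, after_rewind
# prints 1, 2, [1, 2, 3], [2, 4, 6], 3, 1 (one per line)
```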

So maybe we can bring back rewind to Iterator and let each do a rewind just before iterating. Rewinding shouldn’t be expensive…
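A rough sketch of the "each rewinds first" idea, written in Ruby only to model the behaviour. The class name RewindableArrayIterator and the STOP sentinel are made up for illustration; this is not Crystal's actual Iterator API, just one way the proposal could look:

```ruby
# Hypothetical model of an iterator whose #each rewinds before iterating.
class RewindableArrayIterator
  STOP = Object.new            # sentinel returned when the iterator is exhausted

  def initialize(array)
    @array = array
    @index = 0
  end

  # External iteration: return the next value, or STOP at the end.
  def next
    return STOP if @index >= @array.size
    value = @array[@index]
    @index += 1
    value
  end

  # Cheap for an array: just reset the index.
  def rewind
    @index = 0
    self
  end

  # Internal iteration: always start from the beginning.
  def each
    rewind
    until (value = self.next).equal?(STOP)
      yield value
    end
    self
  end
end

it = RewindableArrayIterator.new([1, 2, 3])
seen = []
it.each { |x| seen << x }
it.each { |x| seen << x }
p seen # prints [1, 2, 3, 1, 2, 3]
```

Note that with this design a partially consumed iterator cannot be resumed through each, since each always resets the position.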

Or any other way we can make this work?

It’s a small detail, but I think it would make things much more intuitive and fun to work with.


Maybe combining different levels #next and #each isn’t that useful. But what about partial iteration?

Currently, you can do this:

a = [1, 2, 3]
e = a.each
 
e.each do |x|
  p x
  break if x > 1
end
  
puts "now the big numbers:"
 
e.each do |x|
  p x
end

I’m not sure how much this is used, but I figure it can be useful at times. And the fact that the current iterator behaviour hasn’t led to any complaints might be a sign that it’s actually okay as it is.
If your proposal makes implementing iterators more complicated because you essentially need to handle two separate iterators (you mentioned that in the other thread), I feel we need a very strong argument to go this way.

On a more general consideration, iterators are generally thought of as (relatively) cheap and disposable. You don’t have to rewind to reuse it. Just get a new one. Ruby’s iterator implementation does some extra stuff, but I doubt it’s worth it. There are many iterator implementations out there which cannot rewind.


I think I would expect to already get a “new” one when I do e.each do |x| ... again.
How do you explicitly get a new one?


You call each on the original object:

a = [1, 2, 3]
e1 = a.each
e2 = a.each
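As a quick check that the two are really independent, here is the same idea in Ruby, where each without a block also returns a fresh object (the variable names after e1 and e2 are only for illustration):

```ruby
a = [1, 2, 3]
e1 = a.each
e2 = a.each

e1.next                        # advance e1 twice
e1.next

from_e2 = e2.next              # e2 still starts at the beginning
from_e1 = e1.next              # e1 continues where it left off

p from_e2, from_e1             # prints 1, then 3
```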

I really like the usability. Is there a way to measure the performance impact?

The problem is that we don’t know the implementation yet, so it’s hard to measure.

That said, I think what @straight-shoota said makes sense so we don’t need to change or add anything.

I think having an explicitly rewindable Iterator would be useful. When I was trying to figure out how to implement a #product method on Iterator (this), I ended up implementing it in basically the same way as @asterite’s sample implementation for rewindable iterators. So I do think there’s a use case for this. However, I also agree that we don’t need to make every iterator a little heavier just to allow it. It seems like most people still use iterators once, so rewindable iterators can be added without necessarily needing them to be the default iterator.
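For illustration, this is roughly why product wants rewind: the inner iterator has to be replayed once per outer element. A minimal sketch using Ruby's Enumerator#next and Enumerator#rewind; the product helper here is hypothetical and is not the implementation referred to above:

```ruby
# Hypothetical helper: cartesian product of two enumerators,
# built only on next and rewind.
def product(outer, inner)
  Enumerator.new do |y|
    outer.rewind
    loop do                    # Kernel#loop stops when next raises StopIteration
      x = outer.next
      inner.rewind             # replay the inner iterator for every outer element
      loop do
        y << [x, inner.next]
      end
    end
  end
end

pairs = product([1, 2].each, [:a, :b].each).to_a
p pairs # prints [[1, :a], [1, :b], [2, :a], [2, :b]]
```

Without rewind, the inner iterator would be exhausted after the first outer element, so the only alternatives are buffering its values or asking the source for a new iterator each time.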

Yeah, the product example is a very good use case. Just for that I would at least bring back rewind, maybe making it raise by default with “not implemented” so you don’t necessarily have to implement it if you don’t want to (but all iterators in the stdlib will).
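A sketch of what "raise by default" could look like, modelled in Ruby. The names SketchIterator, OneShot and ArrayCursor are invented for this example and do not correspond to anything in Crystal's stdlib:

```ruby
# Hypothetical base module: rewind exists everywhere but raises unless overridden.
module SketchIterator
  def rewind
    raise NotImplementedError, "#{self.class} does not implement rewind"
  end
end

# An iterator that does not bother to support rewinding.
class OneShot
  include SketchIterator
end

# An iterator that overrides the default.
class ArrayCursor
  include SketchIterator

  attr_reader :index

  def initialize(array)
    @array = array
    @index = 0
  end

  def advance
    @index += 1
    self
  end

  def rewind
    @index = 0
    self
  end
end

cursor = ArrayCursor.new([1, 2, 3]).advance.advance
cursor.rewind
p cursor.index                 # prints 0

begin
  OneShot.new.rewind
rescue NotImplementedError => e
  puts e.message               # prints "OneShot does not implement rewind"
end
```

In this model the failure only shows up when rewind is actually called on an iterator that lacks it.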

Unless you specifically make it raise during compilation, that seems like it could introduce runtime errors unnecessarily (when they could instead be compiler errors).