Ruby Enumerator and Enumerable interaction: StopIteration dependency

Question

I have just come across some interesting Enumerator behaviour. An Enumerator seems to depend on its position in the underlying Enumerable: once you have peeked past the end of the Enumerable and a StopIteration has been raised, the Enumerator does not notice any later extension of the Enumerable.

Two examples demonstrate:

a=[1, 2, 3]
e=a.each
 => #<Enumerator: [1, 2, 3]:each> 
2.4.0 :027 > e.next
 => 1 
2.4.0 :028 > a.insert(1, 4)
 => [1, 4, 2, 3] 
2.4.0 :029 > e.next
 => 4 
2.4.0 :031 > e.next
 => 2

OK, so far, so good. But what about this? Let's define a method to extend an array when we hit the end:

def a_ext(a,enum)
  enum.peek
rescue StopIteration
  a << a[-1] + 1
end

Now let's see what happens when we use it:

2.4.0 :012 > a=[1, 2, 3]
 => [1, 2, 3] 
2.4.0 :013 > e = a.each
 => #<Enumerator: [1, 2, 3]:each>
2.4.0 :016 > 3.times{e.next} 
 => 3

We have reached the end of the array - so call a_ext to extend the array

2.4.0 :018 > a_ext(a,e)
 => [1, 2, 3, 4] 
2.4.0 :019 > e.peek
StopIteration: iteration reached an end

????!!

It looks like once you have hit StopIteration, the Enumerator won't check again to see if the Array (I guess in general, an Enumerable) has been extended.
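
The two sessions above can be condensed into one standalone script. This is only a sketch of the observed behaviour, not an explanation of it; the variable names (seen_after_grow, first_peek, second_peek, after_rewind) are made up for illustration. It suggests that growing the array before the enumerator has run off the end is picked up, while growing it after StopIteration has been raised is not:

```ruby
a = [1, 2, 3]
e = a.each

3.times { e.next }          # consume 1, 2, 3; the end has not been probed yet
a << 4                      # grow the array BEFORE the enumerator sees the end
seen_after_grow = e.next    # the still-running each call yields the new element

# Probe past the end; on StopIteration grow the array again.
first_peek =
  begin
    e.peek
  rescue StopIteration
    a << 5
    :stop_iteration
  end

# The array now has a fifth element, but the enumerator stays finished.
second_peek =
  begin
    e.peek
  rescue StopIteration
    :stop_iteration
  end

e.rewind                    # rewind restarts from the first element
after_rewind = 5.times.map { e.next }
```

Calling rewind gets the enumerator going again, but from the first element rather than from the newly added tail.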

Is this expected behaviour? A bug? A feature?

Why might you want to do this? Well, with a Hash you can set a default value by passing Hash::new a block, and you can also pass a block to Array::new. But the block that Array::new takes receives only the index, not the Array and the index (unlike Hash::new, whose block yields the hash and the key). This makes it extremely ugly and difficult to build an array that can be extended while enumerating through it.

For example, imagine an appointments diary where you want to enumerate through to find the first free day. This is naturally an Array rather than a Hash (as it is ordered), but it is very hard to extend while iterating through it.
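
One possible way around this, sketched under the assumption that wrapping the array in a custom Enumerator is acceptable: the hypothetical helper growing_enum below (not part of Ruby) walks the array by index and calls a caller-supplied block to append a new element whenever the index runs past the end. The enumeration therefore never finishes and StopIteration is never raised.

```ruby
# Hypothetical helper, not a Ruby built-in: enumerate `array` by index and
# append the block's return value whenever the index passes the end.
def growing_enum(array, &next_value)
  Enumerator.new do |yielder|
    index = 0
    loop do
      array << next_value.call(array) if index >= array.size
      yielder << array[index]
      index += 1
    end
  end
end

# External iteration, as in the question: peeking past the end extends the array.
a = [1, 2, 3]
e = growing_enum(a) { |arr| arr[-1] + 1 }
3.times { e.next }
peeked = e.peek             # appends 4 to a and returns it

# Diary example: the first free day may lie beyond the current end.
diary = [:busy, :busy, :busy]
days = growing_enum(diary) { |_arr| :free }
first_free_day = days.find_index(:free)
```

Because the enumerator is infinite, it should only be consumed with methods that stop early (next, peek, find, find_index, first, take). The arr[-1] + 1 block also assumes a non-empty array.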

Thoughts?

IMHO you should avoid updating an array while iterating over it to avoid inconsistency. In your example you could use find or find_index instead of using an enumerator. — sschmeck

Aleksei Matiushkin · Accepted Answer · 2017-09-28T10:26:39

I believe the reason is that StopIteration has a result attribute whose value is known if and only if the iteration loop has ended. Consider the following three examples:

[1,2,3].enum_for(:reduce, :*)          # #1, delegated to Array#reduce

[1,2,3].enum_for(:each)                # #2, delegated to Array#each (which takes no arguments)

o = Object.new
def o.each; yield 1; yield 2; yield 3; 100; end # #3

Once the exception is raised (created), the value has to be known (by the way, it is 6 in the first case, [1,2,3] in the second and 100 in the third). That means allowing the loop to be re-entered would introduce an inconsistency: the value exists, but is no longer correct.

Enumerator must distinguish “in-the-loop” and “finished” states, and it can’t go back from the latter to the former for the reasons described above. That’s probably why it is implemented that way.
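
The result attribute can be observed directly. Below is a minimal sketch for the second and third cases; final_result is a hypothetical helper introduced only for this illustration, and o.each is written out in full:

```ruby
# Hypothetical helper: drain an enumerator with next and return the
# `result` carried by the StopIteration raised at the end.
def final_result(enum)
  enum.next while true
rescue StopIteration => e
  e.result
end

o = Object.new
def o.each
  yield 1
  yield 2
  yield 3
  100
end

array = [1, 2, 3]
array_result  = final_result(array.each)        # Array#each returns its receiver
object_result = final_result(o.enum_for(:each)) # last expression of o.each

# Once finished, the enumerator keeps raising and the result stays the same.
e = o.enum_for(:each)
3.times { e.next }
repeated_results = 2.times.map do
  begin
    e.next
  rescue StopIteration => ex
    ex.result
  end
end
```

Here array_result is the array itself and object_result is 100, and every further call to next after the end reports the same result, which fits the idea that a finished enumerator cannot quietly resume.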
