Should I Create A Copy When Iterating Over A List Using Enumerate In Python
Solution 1:
First, let’s answer your direct question:
enumerate
doesn’t help anything here. It works as if it held an iterator to the underlying iterable (and, at least in CPython, that’s exactly what it does), so anything that wouldn’t be legal or safe to do with a list iterator isn’t legal or safe to do with an enumerate object wrapped around that list iterator.
Your original use case—setting test_list[index] = new_value
—is safe in practice—but I’m not sure whether it’s guaranteed to be safe or not.
Your new use case—calling test_list.pop(index)
—is probably not safe.
The most obvious implementation of a list iterator is basically just a reference to the list and an index into that list. So, if you insert or delete at the current position, or to the left of that position, you’re definitely breaking the iterator. For example, if you delete lst[i]
, that shifts everything from i + 1
to the end up one position, so when you move on to i + 1
, you’re skipping over the original i + 1
th value, because it’s now the i
th. But if you insert or delete to the right of the current position, that’s not a problem.
Since test_list.pop(index)
deletes at or left of the current position, it's not safe even with this implementation. (Of course if you've carefully written your algorithm so that skipping over the value after a hit never matters, maybe even that's fine. But more algorithms won't handle that.)
It’s conceivable that a Python implementation could instead store a raw pointer to the current position in the array used for the list storage. Which would mean that inserting anywhere could break the iterator, because an insert can cause the whole list to get reallocated to new memory. And so could deleting anywhere, if the implementation sometimes reallocates lists on shrinking. I don't think the Python disallows implementations that do all of this, so if you want to be paranoid, it may be safer to just never insert or delete while iterating.
If you’re just replacing an existing value, it’s hard to imagine how that could break the iterator under any reasonable implementation. But, as far as I'm aware, the language reference and list
library reference don't actually make any promises about the implementation of list iterators.
So, it's up to you whether you care about "safe in my implementation", "safe in every implementation every written to date", "safe in every conceivable (to me) implementation", or "guaranteed safe by the reference".
I think most people happily replace list items during iteration, but avoid shrinking or growing the list. However, there's definitely production code out there that at least deletes to the right of the iterator.
Solution 2:
Since it was my comment which lead to this, I'll add my follow up:
enumerate can be thought of as a generator so it will just take a sequence (any sequence or iterator actually) and just "generate" an incrementing index to be yielded with each item from the passed sequence (so it's not making a copy or mutating the list anyway just using an "enumerate object").
In the case of the code in that question, you were never changing the length of the list you were iterating over and once the if statement was run the value of the element did not matter. So the copy wasn't needed, it will be needed when an element is removed as the iterator index is shared with the list and does not account for removed elements.
The Python Ninja has a great example of when you should use a copy (or move to list comprehension)
Post a Comment for "Should I Create A Copy When Iterating Over A List Using Enumerate In Python"