I need to iterate over a stream of pandas.Series objects (the kind of object is irrelevant, though). Optionally, an arbitrary function is applied to each Series, and here is the catch: this function can be a generator function that yields two or more values. I was hoping more_itertools.flatten would help, but it fails when a regular function, or no function at all, is mapped over the generator, because the items are then not iterables to unpack. Is there a way to turn this iterable into a simple generator of Series objects? Here is a simple example that shows the issue:
In (1): from more_itertools import flatten
...:
...: def generator():
...:     for i in range(10):
...:         yield i
...:
...: def postprocess1(i):
...:     yield 2*i
...:
...: def postprocess1_return(i):
...:     return 2*i
...:
...: def postprocess2(i):
...:     yield from (i, 2*i)
...:
In (2): list(generator())
...:
Out(2): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In (3): list(map(postprocess1, generator()))
...:
Out(3):
[<generator object postprocess1 at 0x7f5a402916d0>,
 <generator object postprocess1 at 0x7f5a40291e40>,
 <generator object postprocess1 at 0x7f5a40291f20>,
 <generator object postprocess1 at 0x7f5a40291dd0>,
 <generator object postprocess1 at 0x7f5a40291eb0>,
 <generator object postprocess1 at 0x7f5a40209040>,
 <generator object postprocess1 at 0x7f5a40209190>,
 <generator object postprocess1 at 0x7f5a402092e0>,
 <generator object postprocess1 at 0x7f5a402090b0>,
 <generator object postprocess1 at 0x7f5a40209350>]
In (4): list(map(postprocess1_return, generator()))
...:
Out(4): [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
In (5): list(map(postprocess2, generator()))
...:
Out(5):
[<generator object postprocess2 at 0x7f5a403ad430>,
 <generator object postprocess2 at 0x7f5a40209580>,
 <generator object postprocess2 at 0x7f5a402097b0>,
 <generator object postprocess2 at 0x7f5a40209510>,
 <generator object postprocess2 at 0x7f5a40209430>,
 <generator object postprocess2 at 0x7f5a40209740>,
 <generator object postprocess2 at 0x7f5a402096d0>,
 <generator object postprocess2 at 0x7f5a40209820>,
 <generator object postprocess2 at 0x7f5a40209660>,
 <generator object postprocess2 at 0x7f5a40209890>]
In (6): list(flatten(generator()))
...:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-7cd770547fa4> in <module>
----> 1 list(flatten(generator()))
TypeError: 'int' object is not iterable
In (7): list(flatten(map(postprocess1, generator())))
...:
Out(7): [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
In (8): list(flatten(map(postprocess1_return, generator())))
...:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-35ce9aef7285> in <module>
----> 1 list(flatten(map(postprocess1_return, generator())))
TypeError: 'int' object is not iterable
In (9): list(flatten(map(postprocess2, generator())))
Out(9): [0, 0, 1, 2, 2, 4, 3, 6, 4, 8, 5, 10, 6, 12, 7, 14, 8, 16, 9, 18]
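To make the desired behaviour concrete, here is a minimal sketch of the kind of helper I have in mind. The name flat_map is hypothetical (it is not part of more_itertools), and it assumes that only results produced by generator functions should be unpacked. It checks for types.GeneratorType rather than for iterability in general, because a pandas.Series is itself iterable and must be passed through whole.

```python
from types import GeneratorType


def flat_map(func, iterable):
    """Hypothetical helper: apply func (if given) and unpack generator results only."""
    for item in iterable:
        result = item if func is None else func(item)
        if isinstance(result, GeneratorType):
            # func was a generator function: emit each yielded value separately.
            yield from result
        else:
            # Regular function or no function: emit the value unchanged.
            yield result


def generator():
    for i in range(10):
        yield i


def postprocess1_return(i):
    return 2 * i


def postprocess2(i):
    yield from (i, 2 * i)


print(list(flat_map(None, generator())))
print(list(flat_map(postprocess1_return, generator())))
print(list(flat_map(postprocess2, generator())))
```

With this sketch, all three cases (no function, a regular function, a generator function) produce a flat stream, but it would not unpack a function that returns a list or tuple; whether that should count as "several values" is an open design choice.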