54
votes

I've read that in CPython, the interpreter stack (the list of Python functions called to reach this point) is mixed with the C stack (the list of C functions that were called in the interpreter's own code). If so, then how are generators and coroutines implemented? How do they remember their execution state? Does CPython copy each generator's / coroutine's stack to and from an OS stack? Or does CPython simply keep the generator's topmost stack frame on the heap, since the generator can only yield from that topmost frame?

2
A few existing answers and comments claim that Python maintains a "program stack" which is completely separate from the VM's C stack. This claim is wrong. Check the link: en.wikipedia.org/wiki/Stackless_Python Stackless Python exists but is not mainstream. The understanding in the question is right. - Dong Feng
I accidentally answered myself nearly four years later by co-authoring a chapter that includes an explanation of how generators and coroutines are implemented: aosabook.org/en/500L/a-web-crawler-with-asyncio-coroutines.html - A. Jesse Jiryu Davis
Great article, very dense. - Benjamin Toueg
Unrelated, but... how did you get, in under 4 years, from asking about how generators are implemented to writing a book chapter with Guido on this topic? :) - max
Hah! Implementing and maintaining Motor, my MongoDB driver for Tornado and asyncio, meant I kept using and thinking about coroutines for the last few years. I indulged my curiosity by reading CPython source (more legible than I feared it would be) and Tornado's source and then, when asyncio was written, I read that too. Plus I wanted to speak at conferences, which further motivated me to investigate coroutines and async so I could give talks on the subject. - A. Jesse Jiryu Davis

2 Answers

19
votes

The yield statement takes the currently executing context, much like a closure, and turns it into a standalone object. This object has a __next__ method (next in Python 2) which resumes execution right after that yield statement; its __iter__ method simply returns the object itself.

So the generator's piece of the call stack, its frame, is kept alive as a heap object.
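As a rough illustration of that behaviour (the function name counter is made up for this sketch), each call to next() resumes the body just after the previous yield, and the local variable keeps its value in between:

```python
def counter():
    # 'total' is a local of the generator; it survives between resumptions
    total = 0
    while True:
        total += 1
        yield total  # execution pauses here and resumes here on the next call

gen = counter()          # nothing in the body has run yet
first = next(gen)        # runs up to the first yield
second = next(gen)       # resumes after the yield, loops, yields again
same = iter(gen) is gen  # a generator is its own iterator

print(first, second, same)
```
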

52
votes

The notion that Python's stack and C stack in a running Python program are intermixed can be misleading.

The Python stack is completely separate from the actual C stack used by the interpreter. The data structures on the Python stack are full Python "frame" objects (which can even be introspected, and have some attributes changed, at run time). This stack is managed by the Python virtual machine, which is itself written in C and thus has a normal, machine-level C program stack of its own.
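A small sketch of what "frame objects that can be introspected" looks like in practice, assuming CPython (inspect.currentframe() may return None on other implementations). The function names inner and outer and the variable marker are invented for the example:

```python
import inspect

def inner():
    frame = inspect.currentframe()  # frame object for this call
    caller = frame.f_back           # frame object of whoever called us
    # Read the function names and one of the caller's local variables
    return (frame.f_code.co_name,
            caller.f_code.co_name,
            caller.f_locals["marker"])

def outer():
    marker = "set in outer"
    return inner()

result = outer()
print(result)
```

The chain of f_back links is the Python-level stack the answer is talking about; each link is an ordinary Python object.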

When using generators and iterators, the interpreter simply stores the respective frame object somewhere other than on the Python program stack, and pushes it back there when execution of the generator resumes. This "somewhere else" is the generator object itself. Calling the "next" method (__next__ in Python 3) or the "send" method on the generator object causes this to happen.
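One way to see the frame stored on the generator object, as a sketch assuming CPython: the gi_frame attribute exposes the suspended frame, and inspect.getgeneratorstate() reports where the generator is in its life cycle. The name gen_func is made up for the example:

```python
import inspect

def gen_func():
    x = 1
    yield x
    x = 2
    yield x

g = gen_func()
state_created = inspect.getgeneratorstate(g)    # body has not started

next(g)                                         # run to the first yield
state_suspended = inspect.getgeneratorstate(g)
x_after_first = g.gi_frame.f_locals["x"]        # local stored in the frame

next(g)                                         # resume, run to the second yield
x_after_second = g.gi_frame.f_locals["x"]

try:
    next(g)                                     # body finishes
except StopIteration:
    pass
state_closed = inspect.getgeneratorstate(g)
frame_after_close = g.gi_frame                  # frame is released once done

print(state_created, state_suspended, state_closed)
```

While the generator is suspended its locals are reachable through gi_frame even though the generator is not on the running call stack.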