A Curious Course on Coroutines and Concurrency

Copyright (C) 2009, All Rights Reserved
David Beazley
http://www.dabeaz.com

Presented at PyCon 2009, March 25, 2009.

Related Tutorials

Introduction

This tutorial is a practical exploration of using Python coroutines (extended generators) to solve problems in data processing, event handling, and concurrent programming. The material starts with generators and builds up to a complete multitasking environment that can run thousands of concurrent tasks without using threads or event-driven callbacks (i.e., the "reactor" model).

Note: This tutorial might be viewed as a sequel to the tutorial Generator Tricks for System Programmers I presented at PyCon'08 in Chicago. If you have never used generator functions before, you might want to look at that presentation for more information. This coroutine tutorial is meant to stand on its own, but you'll get a more complete picture if you combine it with the generator presentation.
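As a rough illustration of the difference between a generator and a coroutine, here is a minimal sketch. It is not taken from the course files. The names countdown, grep, pulled, and matches are chosen only for this example, and the code uses Python 3 syntax (on Python 2.5 you would write g.next() instead of next(g)).

```python
def countdown(n):
    """A plain generator: produces values that a consumer pulls out."""
    while n > 0:
        yield n
        n -= 1

def grep(pattern, matches):
    """A coroutine: consumes values that are pushed in with send()."""
    while True:
        line = yield                # Suspend here until a value is sent in
        if pattern in line:
            matches.append(line)

# Generators are driven by iteration (values are pulled).
pulled = list(countdown(3))

# Coroutines are driven by send() (values are pushed).
matches = []                        # Illustrative sink for matching lines
g = grep("python", matches)
next(g)                             # Prime the coroutine: run to the first yield
g.send("a series of tubes")
g.send("python generators rock")
g.close()                           # Shut the coroutine down
```

The generator hands values to whoever iterates over it, while the coroutine sits at its yield expression waiting for data. That second pattern is what the later parts of the course build on.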

Requirements and Support Data Files

This tutorial requires Python 2.5 or newer. No third-party modules are required. Examples have been tested on both Unix and Windows XP. Examples will work on Python 3 as long as you convert all of the print statements to print() function calls.

The following file contains some supporting data files that are used by the various code samples. Download this to your machine to work with the examples that follow.

This download also includes a PDF of the lecture slides.

Code Samples

Here are various code samples from the course. You can either cut and paste these from the browser or simply work with them directly in the "coroutines" directory. The order in which the files are listed follows the course material. These examples are written to run inside the "coroutines" directory that gets created when you unzip the above file containing the support data.

Part 1 : Introduction to Generators and Coroutines

Part 2 : Coroutines, Pipelines, and Dataflow

Part 3 : Coroutines and Event Dispatching

Part 4 : From Data Processing to Concurrent Programming

Part 7 : Writing an Operating System

Part 8 : The Problem with Subroutines and the Stack

Part 9 : Final Words

Design Commentary

One of the trickiest parts of working with a language feature like coroutines is figuring out how they should interact with other program elements such as functions and classes. If you look at various libraries and frameworks, you will find that each one addresses this issue in a slightly different way.

The design of the task scheduler I wrote for this class is strongly biased towards my prior experience teaching courses on operating system design. One of the most critical parts of writing an OS is protecting the system while keeping user applications isolated from it and from each other. In the sample code, the Scheduler class represents a kind of "operating system." You will notice that the tasks defined in this course never directly interact with this scheduler (other than using yield to execute scheduler traps). That is, they do not hold references to the scheduler object, they do not invoke methods on the scheduler, they do not inspect internal scheduler data, and they do not hold references to other tasks. For all practical purposes, the scheduler and the tasks are two completely different execution domains. There is a good reason for keeping this separation. Namely, it promotes loose coupling between tasks and their execution environment. One could imagine creating other kinds of task schedulers that run tasks within threads or subprocesses. If you've set things up right and taken great care to keep tasks and scheduling separate, such schedulers would be able to run existing tasks without modification. Of course, the devil is in the details :-).
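To make that separation concrete, here is a minimal sketch of the idea, written for this page in Python 3 syntax. It is a simplified illustration under my own assumptions, not the course's actual scheduler code: the SystemCall, GetTid, and Task classes, the worker function, and the log list are all names chosen for this example. The point to notice is that worker never sees the scheduler. It only yields a request object and receives a result.

```python
from collections import deque

class SystemCall:
    """Base class for a request ("trap") that a task yields to the scheduler."""
    def handle(self, sched, task):
        raise NotImplementedError

class GetTid(SystemCall):
    """Illustrative trap: ask for the ID of the calling task."""
    def handle(self, sched, task):
        task.sendval = task.tid     # Value delivered when the task resumes
        sched.schedule(task)

class Task:
    """Wraps a generator-based coroutine and assigns it a task ID."""
    next_tid = 0
    def __init__(self, target):
        Task.next_tid += 1
        self.tid = Task.next_tid
        self.target = target
        self.sendval = None
    def run(self):
        # Deliver any pending value, then run until the next yield.
        value, self.sendval = self.sendval, None
        return self.target.send(value)

class Scheduler:
    """The "operating system": owns the tasks and services their traps."""
    def __init__(self):
        self.ready = deque()
    def new(self, target):
        task = Task(target)
        self.schedule(task)
        return task.tid
    def schedule(self, task):
        self.ready.append(task)
    def mainloop(self):
        while self.ready:
            task = self.ready.popleft()
            try:
                result = task.run()
            except StopIteration:
                continue                    # Task finished; drop it
            if isinstance(result, SystemCall):
                result.handle(self, task)   # Scheduler side of the trap
            else:
                self.schedule(task)         # Plain yield: just reschedule

log = []    # Illustrative record of what each task did

def worker(name, count):
    # The task holds no reference to the scheduler or to other tasks.
    tid = yield GetTid()
    for i in range(count):
        log.append((name, tid, i))
        yield                       # Give other tasks a turn

sched = Scheduler()
tid_a = sched.new(worker("a", 2))
tid_b = sched.new(worker("b", 2))
sched.mainloop()
```

Because worker only speaks the yield protocol, a different scheduler (for example, one that runs tasks in threads) could in principle run the same function unchanged, provided it services the same request objects.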

Contact Me

Concurrency is a topic that generally interests me. I welcome all feedback, comments, and suggestions for improvement on this course material. Please feel free to contact me by sending an email to "dave" at "dabeaz.com".


Copyright (C) 2005-2024, David Beazley