[ PYTHON NETWORKS, CONCURRENCY, AND DISTRIBUTED SYSTEMS ]

A one-of-a-kind course focused on using Python to build distributed systems. Topics include socket programming, internet data handling (XML, JSON, etc.), simple web programming, WSGI, REST, actors, remote procedure call (RPC), message passing, map-reduce, distributed objects, and asynchronous I/O. The course also includes material on concurrent programming techniques including threads, processes, and multiprocessing. A major focus of this course is on the underlying principles that form the foundation of the programming frameworks and applications that you may be using now. You will walk away with new insight and ideas for improving your code.

This course is also offered on an ongoing basis in Chicago.

Syllabus

  1. Network Fundamentals and Socket Programming. An introduction to some basic concepts of network programming. Covers the essential details of TCP/IP and programming with sockets. Students will learn how to write both TCP and UDP based clients and servers.
  2. Client-side Programming. A look at high-level library modules that allow Python to connect to standard Internet and web-related services (e.g., HTTP, FTP, and XML-RPC). Special attention will be given to the urllib2 module that allows Python to interact with web servers.
  3. Internet Data Handling. A brief overview of library modules that are used to process common Internet data formats such as HTML, XML, and JSON.
  4. Web Programming. The absolute basics of web programming in Python. Topics include CGI scripting, the WSGI interface, and implementing custom HTTP servers. Note: This section is primarily focused on how to put a web-based interface on low-level services as might be encountered in a distributed computing environment. It does not cover web frameworks or the problem of using Python to build a website.
  5. Thread Programming. Everything you wanted to know about Python threads, but were afraid to ask. Includes the absolute basics of using the threading module and different techniques for using threads to carry out work. Includes detailed coverage of using different synchronization primitives, queues, and thread pools. Also provides detailed information on the Global Interpreter Lock (GIL), tuning parameters, and the interaction between threads and C/C++ extension modules.
  6. Multiprocessing. A tour of features provided by the multiprocessing library added in Python 2.6. Covers processes, queues, pipes, process pools, and shared memory regions. Examples will illustrate how multiprocessing can be used to achieve higher performance when working on multiple CPU cores.
  7. Message Passing and Data Serialization. Message passing is a core component of distributed computation. This section provides an in-depth look at different interprocess communication mechanisms, their performance characteristics, and tuning options. In addition, different approaches for serializing Python data structures are explored. Topics include the subprocess module, named pipes, network sockets, memory mapped regions, pickle, marshal, structure packing, and binary I/O. The section concludes with information on high-level messaging systems such as ZeroMQ and AMQP.
  8. Distributed Programming. An in-depth tour of different distributed programming techniques. Topics include programming with actors, client-server computing, REST, remote procedure call, map-reduce, and distributed objects. Also includes material on XML-RPC and WSGI.
  9. Advanced I/O Handling. A look at different I/O handling techniques including blocking, non-blocking, asynchronous, and event-driven I/O. The primary goal of this section is to better understand the I/O handling used by different libraries and frameworks such as asyncore and Twisted.
  10. Generators and Coroutines. An overview of concurrent programming using generators and coroutines. The major focus of this section is on using generators to implement user-level task switching and to better understand libraries based on microthreads, tasklets, green-threads, and similarly named entities.
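
As a small taste of the last topic, user-level task switching with generators can be sketched roughly as follows. This is a minimal illustration and is not taken from the course materials; the names countdown and run_round_robin are hypothetical, and a real microthread library would also deal with I/O waiting, errors, and task priorities.

```python
from collections import deque


def countdown(name, n, log):
    """A task: record one step per turn, then hand control back."""
    while n > 0:
        log.append((name, n))
        n -= 1
        yield  # cooperative switch point: the scheduler resumes us later


def run_round_robin(tasks):
    """Run generator-based tasks in turn until every task has finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished, so drop it
        ready.append(task)      # otherwise put it at the back of the queue


log = []
run_round_robin([countdown("a", 2, log), countdown("b", 3, log)])
print(log)
```

Each yield is a point where a task voluntarily gives up control, so the two tasks interleave without threads or operating system support. The scheduler here is a plain round-robin queue, chosen only because it is the simplest policy to read.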

Instruction Format

The course is designed to be taught on a 9-5 schedule with a one-hour lunch break. This course consists of both lecture slides and hands-on programming exercises, with most of the time spent programming. Participants should plan on spending 4-5 hours each day working on exercises.

Prerequisites

This course assumes a working knowledge of Python programming. Students should already know how to write and debug programs and be familiar with core language features such as functions, classes, modules, and the most commonly used modules in the standard library. Given that the topics in this course are heavily focused on networking and systems programming, students are well-advised to have some prior background working with processes, threads, and network programming.

About the Instructor

All courses are taught by David Beazley, author of the Python Essential Reference and nominated member of the Python Software Foundation. David has been an active member of the Python community since 1996 and is the creator of several Python-related packages including SWIG and PLY. From 1990 to 1997, he worked part time at Los Alamos National Laboratory where he helped pioneer the use of Python on massively parallel supercomputers. From 1998 to 2005, he was an assistant professor in the department of computer science at the University of Chicago where he taught courses in operating systems, networks, and compilers. In addition to his work with Python, Dave has extensive experience with C, C++, and assembly language programming. Dave has a Ph.D. in computer science and an M.S. in mathematics. An academic CV is available upon request.

Logistics

The class is best suited for 10 or fewer students. A larger class size is possible, but due to the advanced nature of the material it should not exceed 16 students.

You are responsible for providing the instruction space, a video projector, and machines where students can work on the programming exercises. The course can be taught on Windows, Linux, or Mac OS X. However, all machines must be equipped with the latest version of Python (currently Python 2.6) and may require a small set of third-party libraries.

2013 Schedule and Pricing

Classes are normally scheduled at least 8 weeks in advance. However, classes in the Chicago area can often be scheduled on shorter notice depending on availability.

The cost of Python Networks, Concurrency, and Distributed Systems with up to 10 students is $15000. This is an all-inclusive price that includes instructor travel expenses. Additional students can be added for $1200/student.

Contact

For more information, you can contact me by sending email to "dave" at "dabeaz.com".


Copyright (C) 2005-2024, David Beazley