Mastering Python 3 I/O
Copyright (C) 2010
David M. Beazley
http://www.dabeaz.com
Presented at PyCon'10, February 17, 2010, Atlanta, Georgia.
Introduction
As most Python programmers know, Python 3 breaks backwards
compatibility with Python 2 in both its syntax and the semantics of
built-in operations. One of the most radical changes is the
ground-up redesign of the I/O system. This tutorial takes
a tour of the new I/O stack. Topics include text processing, binary data
handling, system interfaces, the io library module, memory views, and
porting advice.
Support Files
The following file contains supporting data files that are used
in some of the code samples, along with a few code fragments to
experiment with.
This download also includes all of the code samples that follow below.
Code Samples
Here are various code samples that you can use to try things out
during the course. They're presented in the same order as
the presentation slides.
Preliminaries:
- timethis.py. A utility function for
making performance measurements. Used in many of the code samples
that follow.
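For readers without the download, here is a minimal sketch of the kind of timing helper such measurements need. It is not the contents of timethis.py; the name `timed` and the `results` dictionary are hypothetical, and the code assumes a Python 3 with `time.perf_counter()`.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    """Hypothetical helper: record the wall-clock time of a block under label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Store the elapsed time even if the timed block raises.
        results[label] = time.perf_counter() - start

results = {}
with timed("sum of squares", results):
    total = sum(n * n for n in range(100000))

for label, elapsed in results.items():
    print("{0}: {1:.6f} s".format(label, elapsed))
```

A context manager keeps the measured code inline; a decorator would work equally well for timing whole functions.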
Part 1 : Introducing Python 3
- printlinks.py. A Python 2 program
that simply prints all of the links on a specified HTML page fetched
with urlopen(). Try converting this program to Python 3 using
2to3.
Part 2 : Working with Text
- textop.py. Performance timings of various
text operations. Try it with different versions of Python.
- textformat.py. Examples of new-style
formatting applied to a list of tuples in order to make a formatted table.
- textformat2.py. Examples of new-style
formatting applied to a list of dictionaries in order to make a formatted table.
- textformat3.py. Examples of new-style
formatting applied to a list of instances in order to make a formatted table.
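The three cases above (tuples, dictionaries, instances) can be sketched with `str.format()` as follows. The sample data, the `Holding` class, and the column widths are all invented for illustration and are not taken from the textformat files.

```python
# Hypothetical sample data, one record per row.
rows = [("ACME", 50, 91.1), ("IBM", 100, 70.25)]

class Holding(object):
    """Hypothetical record type used for the attribute-lookup case."""
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

# Tuples: positional fields, unpacked with *row.
from_tuples = ["{0:<8} {1:>6d} {2:>10.2f}".format(*row) for row in rows]

# Dictionaries: named fields, unpacked with **record.
dicts = [dict(name=n, shares=s, price=p) for n, s, p in rows]
from_dicts = ["{name:<8} {shares:>6d} {price:>10.2f}".format(**d) for d in dicts]

# Instances: attribute lookup inside the replacement field.
objs = [Holding(*row) for row in rows]
from_objs = ["{0.name:<8} {0.shares:>6d} {0.price:>10.2f}".format(o) for o in objs]

print("{0:<8} {1:>6} {2:>10}".format("Name", "Shares", "Price"))
for line in from_tuples:
    print(line)
```

All three variants share the same format specifiers (`<8`, `>6d`, `>10.2f`); only the way each field is looked up changes.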
Part 3 : Binary Data Handling
- msgfrag.py. A comparison of joining byte
fragments together using concatenation, join, and bytearray
extension.
- structwrite.py. Two techniques of
writing binary data structures are compared.
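The three ways of joining byte fragments named above look roughly like this. This is an illustrative sketch rather than the code in msgfrag.py; the fragment contents are made up, and no timings are shown.

```python
# Hypothetical message fragments.
fragments = [("frag" + str(i)).encode("ascii") for i in range(1000)]

# 1. Concatenation: bytes are immutable, so each += builds a new object.
msg_concat = b""
for frag in fragments:
    msg_concat += frag

# 2. join: assembles the result from all fragments in a single call.
msg_join = b"".join(fragments)

# 3. bytearray extension: appends to one mutable buffer in place.
buf = bytearray()
for frag in fragments:
    buf.extend(frag)
msg_extend = bytes(buf)
```

All three produce the same bytes; they differ in how much intermediate copying they do, which is what a timing comparison would measure.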
Part 4 : System Interfaces
No files.
Part 5 : The io module
These files have a few simple performance tests for comparing different
file modes, encodings, etc. You should try these under both Python 2 and 3.
- iterlines.py. Iterate over lines of a text file using native open().
- itercodecs.py. Iterate over lines of a text file using codecs.open().
- iterbin.py. Iterate over lines of a text file using
binary file mode.
- iterenc.py. Iterate over lines of a text file using different text encodings. (Python 3 only).
- readall.py. Read the entire contents of a file
all at once.
- find404.py. Find all 404 errors in a web server log using text and binary file modes.
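As a rough illustration of the text-versus-binary comparison, the sketch below scans a small log in both modes. It is not find404.py: the log lines, the file name, and the simple `" 404 "` substring test are assumptions made for the example, and a real log parser would check the status field more carefully.

```python
import os
import tempfile

# Hypothetical access-log lines in a common web server format.
sample = [
    '127.0.0.1 - - [17/Feb/2010:10:00:00 -0500] "GET /index.html HTTP/1.1" 200 1024',
    '127.0.0.1 - - [17/Feb/2010:10:00:01 -0500] "GET /missing.html HTTP/1.1" 404 209',
    '127.0.0.1 - - [17/Feb/2010:10:00:02 -0500] "GET /old/page HTTP/1.1" 404 209',
]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "access.log")
    with open(path, "w", encoding="latin-1") as f:
        for line in sample:
            f.write(line + "\n")

    # Text mode: each line is decoded to str as it is read.
    with open(path, "r", encoding="latin-1") as f:
        text_hits = [line.rstrip() for line in f if " 404 " in line]

    # Binary mode: lines stay as bytes, so the pattern must be bytes too.
    with open(path, "rb") as f:
        bin_hits = [line.rstrip() for line in f if b" 404 " in line]

print(len(text_hits), len(bin_hits))
```

The binary version skips decoding altogether, which is why it is worth timing against the text version on a large log.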
Part 6 : Standard Library Issues
No files.
Part 7 : Memory Views and I/O
- pipearray.py and getarray.py. An example of directly sending a binary array through a pipe created with the subprocess module.
- structpack.py. Packing a bytearray in-place versus incremental extension.
- receive.py
and send.py. An example of sending a large
buffer over a socket using memoryviews.
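Two of the ideas above can be sketched without a network connection. The first part contrasts incremental extension with packing in place using `struct`. The second shows how memoryview slices let a send loop advance through a large buffer without copying it. This is not the code from structpack.py or send.py: the record layout, `send_all`, and `fake_send` (a stand-in for a socket's `send()` that accepts only part of a buffer) are hypothetical.

```python
import struct

# Hypothetical record: little-endian int32 + float64, no padding.
record = struct.Struct("<id")
values = [(i, i * 0.5) for i in range(100)]

# Incremental extension: the bytearray grows one record at a time.
grown = bytearray()
for v in values:
    grown.extend(record.pack(*v))

# In-place packing: preallocate once, then write each record at its offset.
packed = bytearray(record.size * len(values))
for n, v in enumerate(values):
    record.pack_into(packed, n * record.size, *v)

def send_all(send, data, chunk=4096):
    """Pass all of data to send(), which may accept only part of each buffer."""
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        # Slicing a memoryview creates a new view, not a copy of the bytes.
        sent += send(view[sent:sent + chunk])
    return sent

received = bytearray()

def fake_send(buf):
    """Stand-in for sock.send(): accepts at most 1000 bytes per call."""
    n = min(len(buf), 1000)
    received.extend(buf[:n])
    return n

data = bytes(range(256)) * 100
total_sent = send_all(fake_send, data)
```

With a real socket, `sock.send` would take the place of `fake_send`; slicing the bytes object directly would instead copy the remaining data on every iteration.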
Feedback
I'm always looking for ways to improve presentation materials and examples.
Send your ideas to dave@dabeaz.com.