Mastering Python 3 I/O
Copyright (C) 2011
David M. Beazley
http://www.dabeaz.com
Presented at PyCon'11, March 10, 2011, Atlanta, Georgia.
Note: A 2010 version of this tutorial is also available.
Introduction
As most Python programmers know, Python 3 breaks backwards
compatibility with Python 2 in both its syntax and the semantics of
built-in operations. One of the most radical changes is the
ground-up redesign of the I/O system. This tutorial takes
a tour of the new I/O system and covers the issues that are critical to
understand if you're going to port existing code. Topics include text processing, binary data
handling, system interfaces, the io library module, and
porting advice.
Support Files
The support download contains data files that are used
in some of the code samples, along with a few code fragments to
experiment with.
It also includes all of the code samples listed below.
Code Samples
Here are a few code samples that you can use to try things out
during the course. The course doesn't rely heavily upon these
examples, but I'll try a few out here and there.
Preliminaries:
- timethis.py. A utility function for
making performance measurements. Used in many of the code samples
that follow.
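The contents of timethis.py are not reproduced on this page. As a rough idea of what such a utility might look like, here is a minimal sketch of a timing decorator. The names `timed` and `count_down` are hypothetical and are not taken from the course files.

```python
import time
from functools import wraps

def timed(func):
    """Hypothetical decorator: print the wall-clock time of each call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Report the elapsed time even if func raises.
            elapsed = time.perf_counter() - start
            print("{}: {:.6f} s".format(func.__name__, elapsed))
    return wrapper

@timed
def count_down(n):
    # A trivial CPU-bound loop to have something to measure.
    while n > 0:
        n -= 1
    return n

result = count_down(100000)
```

A decorator keeps the measurement code out of the function being timed, which makes it easy to run the same sample under different Python versions.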
Part 1 : Introducing Python 3
- printlinks.py. A Python 2 program
that simply prints all of the links on a specified HTML page fetched
with urlopen(). Try converting this program to Python 3 using
2to3.
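For comparison after you have tried the conversion yourself, the following is a sketch of how a link printer could be written directly in Python 3. It is not the converted printlinks.py, and the names `LinkCollector`, `extract_links`, and `print_links` are made up for illustration. It also assumes the page can be decoded as UTF-8, which may not hold for a real site.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Hypothetical helper: collect the href of every <a> tag."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

def extract_links(html_text):
    """Return all link targets found in a str of HTML."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links

def print_links(url, encoding="utf-8"):
    """Fetch a page and print its links (encoding is an assumption)."""
    with urlopen(url) as response:
        raw = response.read()      # bytes in Python 3, not str
    for link in extract_links(raw.decode(encoding, errors="replace")):
        print(link)

sample = '<a href="http://a.example/">x</a><a name="n">y</a><a href="/b">z</a>'
links = extract_links(sample)
```

The point to notice is that urlopen() hands back bytes, so the data has to be decoded explicitly before any text processing takes place.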
Part 2 : Working with Text
- textop.py. Performance timings of various
text operations. Try it with different versions of Python.
Part 3 : Printing and Formatting
- textformat.py. Examples of new-style
formatting applied to a list of tuples in order to make a formatted table.
- textformat2.py. Examples of new-style
formatting applied to a list of dictionaries in order to make a formatted table.
- textformat3.py. Examples of new-style
formatting applied to a list of instances in order to make a formatted table.
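The three cases above differ only in how the format string refers to each field. Here is a small sketch using made-up data; the `Holding` class and the column widths are assumptions, not the contents of the textformat files.

```python
# Hypothetical sample data: (name, shares, price)
rows_as_tuples = [("ACME", 50, 91.1), ("IBM", 100, 123.45)]
rows_as_dicts = [
    {"name": n, "shares": s, "price": p} for n, s, p in rows_as_tuples
]

class Holding:
    """Hypothetical record type used for the instance case."""
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

rows_as_instances = [Holding(*row) for row in rows_as_tuples]

# Positional fields for tuples.
from_tuples = [
    "{0:<10} {1:>10d} {2:>10.2f}".format(*row) for row in rows_as_tuples
]
# Keyword fields for dictionaries.
from_dicts = [
    "{name:<10} {shares:>10d} {price:>10.2f}".format(**row)
    for row in rows_as_dicts
]
# Attribute lookups for instances.
from_instances = [
    "{0.name:<10} {0.shares:>10d} {0.price:>10.2f}".format(row)
    for row in rows_as_instances
]

for line in from_tuples:
    print(line)
```

All three produce the same table; only the field references change, while the alignment and precision specifiers after the colon stay identical.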
Part 4 : Binary Data Handling
- msgfrag.py. A comparison of joining byte
fragments together using concatenation, join, and bytearray
extension.
- structwrite.py. Two techniques of
writing binary data structures are compared.
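The sketch below shows the three ways of joining byte fragments named above, followed by two possible ways of writing packed records. The page does not say which two techniques structwrite.py compares, so the choice of struct.pack() per record versus pack_into() on a preallocated bytearray is an assumption, as are the record layout and all function names.

```python
import struct

def join_by_concat(fragments):
    msg = b""
    for frag in fragments:
        msg += frag            # creates a new bytes object each time
    return msg

def join_by_join(fragments):
    return b"".join(fragments)

def join_by_bytearray(fragments):
    msg = bytearray()
    for frag in fragments:
        msg.extend(frag)       # grows the buffer in place
    return bytes(msg)

# Hypothetical record layout: little-endian int32 followed by float64.
RECORD = struct.Struct("<id")
records = [(1, 2.5), (2, 3.5), (3, 4.5)]

def pack_each(recs):
    # Pack every record separately, then join the pieces.
    return b"".join(RECORD.pack(*rec) for rec in recs)

def pack_into_buffer(recs):
    # Preallocate one buffer and pack each record at its offset.
    buf = bytearray(RECORD.size * len(recs))
    for i, rec in enumerate(recs):
        RECORD.pack_into(buf, i * RECORD.size, *rec)
    return bytes(buf)

fragments = [b"ab", b"cd", b"ef"]
joined = join_by_join(fragments)
packed = pack_each(records)
```

Wrapping each function with a timing utility and feeding it a few hundred thousand fragments is one way to see how the approaches compare on your own machine.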
Part 5 : The io module
These files have a few simple performance tests for comparing different
file modes, encodings, etc. You should try these under both Python 2 and 3.
- iterlines.py. Iterate over lines of a text file using native open().
- itercodecs.py. Iterate over lines of a text file using codecs.open().
- iterbin.py. Iterate over lines of a text file using
binary file mode.
- iterenc.py. Iterate over lines of a text file using different text encodings. (Python 3 only).
- readall.py. Read the entire contents of a file
all at once.
- find404.py. Find all 404 errors in a web server log using text and binary file modes.
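To make the text versus binary distinction concrete, here is a sketch of a 404 search done both ways on a tiny made-up log. The sample log lines, the function names, and the choice of latin-1 are assumptions for illustration, not the contents of find404.py.

```python
import os
import tempfile

# Hypothetical log data in a common web server log layout.
SAMPLE_LOG = (
    '127.0.0.1 - - [10/Mar/2011:10:00:00 -0500] "GET /index.html HTTP/1.1" 200 1043\n'
    '127.0.0.1 - - [10/Mar/2011:10:00:01 -0500] "GET /missing.html HTTP/1.1" 404 210\n'
    '127.0.0.1 - - [10/Mar/2011:10:00:02 -0500] "GET /gone.html HTTP/1.1" 404 210\n'
)

def find_404_text(path, encoding="latin-1"):
    # Text mode: every line is decoded to str before it is tested.
    with open(path, "r", encoding=encoding) as f:
        return [line for line in f if " 404 " in line]

def find_404_binary(path):
    # Binary mode: lines stay as bytes, so the pattern must be bytes too.
    with open(path, "rb") as f:
        return [line for line in f if b" 404 " in line]

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "access.log")
    with open(log_path, "wb") as f:
        f.write(SAMPLE_LOG.encode("latin-1"))
    text_hits = find_404_text(log_path)
    binary_hits = find_404_binary(log_path)
```

The binary version skips decoding altogether, and matching lines can be decoded afterwards if str results are needed.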
Part 6 : System Interfaces
No files.
Part 7 : Library Design Issues
No files.
Feedback
I'm always looking for ways to improve presentation materials and examples.
Send your ideas to dave@dabeaz.com.