Py on the Horizon

Notes from Brett Cannon’s talk on the changes in Python 2.6 and 3.0.

Published on September 3, 2008.

from future import coder_who_says_py

Python core developer Brett Cannon gave a talk last night at VanPyZ on the changes coming in Python 2.6 and 3.0. While anyone can read “what’s new” in 2.6 and 3.0, his talk brought some clarity to the benefits of these changes.

Using my notes, I’ll do my best to summarize what we learned. Please keep in mind that I’m still a Python amateur, and some things were way over my head.

Brett started with what’s new in 2.6, like the str.format() method, which has a nicer, more flexible syntax than the % formatting operator. For example, positional arguments can appear in any order and can be repeated:

"{1} {1} {0} {1} {1}".format('future', 'under')
# => under under future under under

Any object can be used as a slice index by defining the __index__ method. Abstract Base Classes (ABCs) can be used to register what a class implements. Class decorators add function-decorator-like post-processing to classes, providing a simpler mechanism than metaclasses for many cases.
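As a rough sketch of two of these features, here is an object standing in for an integer in a slice, followed by a class decorator. The names Position, register, registry and Plugin are made up for illustration and were not part of the talk.

```python
class Position:
    """Hypothetical wrapper that can stand in for an int when slicing."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Called by Python whenever an integer index is required.
        return self.value


letters = "abcdef"
sliced = letters[Position(1):Position(4)]   # same as letters[1:4]

registry = []


def register(cls):
    """Class decorator: record the class name, then return the class unchanged."""
    registry.append(cls.__name__)
    return cls


@register
class Plugin:
    pass
```

The decorator runs once, right after the class body is executed, which is the kind of post-processing that previously called for a metaclass.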

New modules include io, simplejson as json, PyProcessing as multiprocessing, and the AST (Abstract Syntax Tree) module. AST could be useful for implementing something like C#’s LINQ or Ruby’s Sequel, as well as for templating languages in the vein of Brevé.
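To get a feel for two of these modules, here is a small sketch using json and ast. The sample data and the expression being parsed are invented for the example.

```python
import ast
import json

# Round-trip a small structure through JSON text.
settings = {"name": "VanPyZ", "versions": [2.6, 3.0]}
encoded = json.dumps(settings)
decoded = json.loads(encoded)

# Parse a line of source code and collect every variable name it mentions.
tree = ast.parse("total = price * quantity")
names = sorted(
    node.id for node in ast.walk(tree) if isinstance(node, ast.Name)
)
```

Here names ends up holding price, quantity and total. Walking the tree like this is the starting point for tools that inspect or translate Python expressions.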

Many of the changes help ease the transition to 3.0, such as the -3 flag for extra warnings. Per-user site-packages directories make it simple to use (and test) the same code with multiple interpreters. You can adopt Python 3.0 behavior in 2.6; for example, from __future__ import division causes the / operator to return a float (the // operator performs floor division, rounding down to a whole number).
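A quick sketch of the two division operators as they behave in Python 3. Under 2.6 the same results require from __future__ import division at the very top of the module.

```python
# Python 3 behavior; in 2.6, add `from __future__ import division`
# as the first statement of the module to get the same results.
half = 1 / 2          # true division always returns a float
floored = 7 // 2      # floor division rounds down
negative = -7 // 2    # rounds toward negative infinity, not toward zero
```

Note the last line: // floors rather than truncates, so -7 // 2 is -4.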

Bits and Bytes

Perhaps the most obvious change in Python 3.0 is that the Unicode type is replacing the old string type. For binary data there is a bytes type. In Python 2.6, bytes is an alias for the str type, such that:

(b'a' == 'a') # => True

Python 3.0 introduces an entirely different bytes type, so this is not the case.

For performance reasons Unicode strings are stored in UTF-16 internally, but you will likely stick with UTF-8 and let Python take care of the translation. Source code and identifiers can also use Unicode, though only to a certain point. I’m not sure if שׁלוֹם = 'peace' will work, but this is mainly there to help OLPC kids learn Python.

The mutable counterpart for bytes is the bytearray type. There is also a memoryview planned to replace buffers.
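Here is a small sketch of how text, bytes, bytearray and memoryview relate in Python 3. The sample values are my own.

```python
text = "café"                      # str: a sequence of Unicode characters
data = text.encode("utf-8")        # bytes: the encoded form of the text

same = (data == text)              # bytes and str never compare equal in 3.x
round_trip = data.decode("utf-8")  # decoding gets the original text back

buf = bytearray(b"abc")            # mutable counterpart of bytes
buf[0] = ord("x")                  # items are integers, so assign an int

view = memoryview(buf)             # a window onto buf, without copying
tail = view[1:3].tobytes()
```

The encoded form is one byte longer than the text, because é takes two bytes in UTF-8.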

Data Structures

Moving on to Python 3 specific enhancements, list comprehensions have been generalized to apply to dictionaries and sets as well. Sets also have a literal form, {1,2,3}, essentially an unordered dictionary with no values.
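For instance, with a made-up list of words, the dictionary and set forms look like this:

```python
words = ["py", "on", "the", "horizon"]

# Dictionary comprehension: map each word to its length.
lengths = {word: len(word) for word in words}

# Set comprehension: the distinct lengths only.
unique_lengths = {len(word) for word in words}

# Set literal, using the same braces but with no key: value pairs.
primes = {2, 3, 7}
```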

When unpacking a list, you can use star unpacking to capture any other elements, whether from the beginning, middle or end:

head, *tail = some_list
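A slightly fuller sketch, with some_list filled in by me, showing the starred name at the end, the beginning and the middle:

```python
some_list = [1, 2, 3, 4, 5]

head, *tail = some_list            # star at the end
*init, last = some_list            # star at the beginning
first, *middle, final = some_list  # star in the middle
```

The starred name always receives a list, even when it captures nothing.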

Functions

Keyword arguments are handy for APIs with many options. Python 3.0 introduces keyword-only arguments, which are especially useful for booleans, where True by itself is meaningless. Define func(*, load=True), and any parameters after the * can only be passed by keyword, never positionally.
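As a sketch, here is a variation on func with one ordinary parameter added (path is my own addition) so the difference is visible:

```python
def func(path, *, load=True):
    """`load` comes after the bare *, so it must be passed by keyword."""
    return (path, load)


by_keyword = func("data.txt", load=False)

try:
    func("data.txt", False)        # passing load positionally is rejected
    positional_allowed = True
except TypeError:
    positional_allowed = False
```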

Function annotations can be used for type checking (param: int) via a decorator, but could also find uses in documenting parameters.
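Here is one way such a decorator might look. This is only a sketch: check_types and repeat are names I made up, and inspect.signature arrived in a later Python 3 release than 3.0.

```python
import functools
import inspect


def check_types(func):
    """Hypothetical decorator: compare arguments against the annotations."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = func.__annotations__.get(name)
            # Only check annotations that are actual classes.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    "{0} must be {1}".format(name, expected.__name__)
                )
        return func(*args, **kwargs)

    return wrapper


@check_types
def repeat(word: str, times: int):
    return word * times


ok = repeat("ab", 2)

try:
    repeat("ab", "2")
    rejected = False
except TypeError:
    rejected = True
```

Python itself ignores the annotations; they only mean something because the decorator reads them.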

There is some new MetaClass stuff, but I really didn’t follow.

True Closures

Having learned Lua last year, I was happy when the closures came up. I recently had to do a little hack to get a closure in Python 2.5:

def parse_footnotes(text):
  """Converts my custom style footnotes to Markdown Extra format"""
  count = [0]       # using a list is a trick to do closures in Python
  footnotes = []
  def handle_match(mo):
    """replace matched {fn ...} with [^1]"""
    count[0] += 1
    footnotes.append(mo.group(mo.lastindex))    # store footnote for later
    return u'[^%d]' % count[0]

  ...

In this example, I am writing a callback to process each match for a regular expression and return the substitute value. I needed to keep a count in order to return [^1], [^2], etc. Writing a class for this is a little excessive; a closure is just right.

Python 3.0 introduces the nonlocal keyword, which works in a similar fashion to global. The above code could be written like this:

def parse_footnotes(text):
  """Converts my custom style footnotes to Markdown Extra format"""
  count = 0
  footnotes = []
  def handle_match(mo):
    """replace matched {fn ...} with [^1]"""
    nonlocal count
    count += 1
    footnotes.append(mo.group(mo.lastindex))    # store footnote for later
    return '[^{0:d}]'.format(count)

  ...

Exceptions

Exceptions received quite an overhaul. Besides adopting a consistent syntax, exceptions can now be chained. If an exception occurs while handling another exception, the original exception is not lost, and the traceback is attached to each exception object.
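A minimal sketch of explicit chaining with raise ... from. The load_setting function and its error message are invented for the example.

```python
def load_setting(settings, key):
    try:
        return settings[key]
    except KeyError as exc:
        # Chain the new exception to the one that caused it.
        raise ValueError("missing setting: " + key) from exc


try:
    load_setting({}, "theme")
except ValueError as err:
    caught = err   # keep a reference; `err` is cleared after the block

original = caught.__cause__        # the KeyError that started it all
traceback = caught.__traceback__   # the traceback lives on the exception
```

Even without the from clause, Python records the exception that was being handled in the __context__ attribute.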

BaseException is at the root, and KeyboardInterrupt, SystemExit and Exception inherit from that. You can catch Exception and let the other two fall through for the interpreter to handle.
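To illustrate, here is a sketch with a hypothetical run_safely helper that catches Exception. An ordinary error is handled, while SystemExit passes straight through.

```python
def run_safely(task):
    """Hypothetical helper: recover from ordinary errors only."""
    try:
        return task()
    except Exception as exc:
        return "recovered from " + type(exc).__name__


def failing():
    raise ValueError("bad input")


def quitting():
    raise SystemExit(1)


result = run_safely(failing)

try:
    run_safely(quitting)
    exit_escaped = False
except SystemExit:
    # SystemExit is not an Exception subclass, so run_safely let it through.
    exit_escaped = True
```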

Refactoring

Brett Cannon did the major overhaul of the standard library, moving related modules into packages, dropping unmaintained OS modules, and renaming things to make more sense.

Implicit relative imports have gone away; only the explicit form, from . import ..., remains.

There are several minor changes that break backwards compatibility, such as raw_input becoming input and xrange becoming range.

The open() builtin uses the new io module in 3.0, whereas 2.6 still uses the old file implementation unless you write io.open().
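Here is a sketch of writing and reading text through io.open() with an explicit encoding. The file name and contents are my own, and the file goes into a throwaway temporary directory.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "greeting.txt")

# io.open() takes an encoding and deals in Unicode text.
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(u"caf\u00e9\n")

with io.open(path, "r", encoding="utf-8") as handle:
    content = handle.read()

# Opening the same file in binary mode returns the raw UTF-8 bytes.
with io.open(path, "rb") as handle:
    raw = handle.read()
```

In Python 3 the builtin open() and io.open() are the same function, so code written this way should behave the same after the switch.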

All these sorts of changes should be handled by running 2to3 after taking care of any deprecation warnings when running under 2.6 with the -3 flag.

Summary

Python 2.6 and 3.0 are presently scheduled for an October release. The changes in 3.0 look like really good improvements that move the language forward.

Porting any one project to 3.0 shouldn’t take too long; it’s mainly an issue of prerequisites. One of Python’s major strengths is the vast library of third-party tools. You need to wait for, or get involved with, the porting of any libraries you use before making the switch. That’s going to take some time.

Meanwhile, we need to be aware of the changes if we want the transition to be as smooth as possible.

Thanks to Brett Cannon for his talk, and VanPyZ for hosting it.

Nathan Youngman

Software Developer and Author