465 lines
13 KiB
Plaintext
465 lines
13 KiB
Plaintext
PEP: 290
|
||
Title: Code Migration and Modernization
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Raymond Hettinger <python@rcn.com>
|
||
Status: Active
|
||
Type: Informational
|
||
Content-Type: text/x-rst
|
||
Created: 06-Jun-2002
|
||
Post-History:
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP is a collection of procedures and ideas for updating Python
|
||
applications when newer versions of Python are installed.
|
||
|
||
The migration tips highlight possible areas of incompatibility and
|
||
make suggestions on how to find and resolve those differences. The
|
||
modernization procedures show how older code can be updated to take
|
||
advantage of new language features.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
This repository of procedures serves as a catalog or checklist of
|
||
known migration issues and procedures for addressing those issues.
|
||
|
||
Migration issues can arise for several reasons. Some obsolete
|
||
features are slowly deprecated according to the guidelines in :pep:`4`.
|
||
Also, some code relies on undocumented behaviors which are
|
||
subject to change between versions. Some code may rely on behavior
|
||
which was subsequently shown to be a bug and that behavior changes
|
||
when the bug is fixed.
|
||
|
||
Modernization options arise when new versions of Python add features
|
||
that allow improved clarity or higher performance than previously
|
||
available.
|
||
|
||
|
||
Guidelines for New Entries
|
||
==========================
|
||
|
||
Developers with commit access may update this PEP directly. Others
|
||
can send their ideas to a developer for possible inclusion.
|
||
|
||
While a consistent format makes the repository easier to use, feel
|
||
free to add or subtract sections to improve clarity.
|
||
|
||
Grep patterns may be supplied as tool to help maintainers locate code
|
||
for possible updates. However, fully automated search/replace style
|
||
regular expressions are not recommended. Instead, each code fragment
|
||
should be evaluated individually.
|
||
|
||
The contra-indications section is the most important part of a new
|
||
entry. It lists known situations where the update SHOULD NOT be
|
||
applied.
|
||
|
||
|
||
Migration Issues
|
||
================
|
||
|
||
Comparison Operators Not a Shortcut for Producing 0 or 1
|
||
--------------------------------------------------------
|
||
|
||
Prior to Python 2.3, comparison operations returned 0 or 1 rather
|
||
than True or False. Some code may have used this as a shortcut for
|
||
producing zero or one in places where their boolean counterparts are
|
||
not appropriate. For example::
|
||
|
||
def identity(m=1):
|
||
"""Create and m-by-m identity matrix"""
|
||
return [[i==j for i in range(m)] for j in range(m)]
|
||
|
||
In Python 2.2, a call to identity(2) would produce::
|
||
|
||
[[1, 0], [0, 1]]
|
||
|
||
In Python 2.3, the same call would produce::
|
||
|
||
[[True, False], [False, True]]
|
||
|
||
Since booleans are a subclass of integers, the matrix would continue
|
||
to calculate normally, but it will not print as expected. The list
|
||
comprehension should be changed to read::
|
||
|
||
return [[int(i==j) for i in range(m)] for j in range(m)]
|
||
|
||
There are similar concerns when storing data to be used by other
|
||
applications which may expect a number instead of True or False.
|
||
|
||
|
||
Modernization Procedures
|
||
========================
|
||
|
||
Procedures are grouped by the Python version required to be able to
|
||
take advantage of the modernization.
|
||
|
||
Python 2.4 or Later
|
||
-------------------
|
||
|
||
Inserting and Popping at the Beginning of Lists
|
||
'''''''''''''''''''''''''''''''''''''''''''''''
|
||
|
||
Python's lists are implemented to perform best with appends and pops on
|
||
the right. Use of ``pop(0)`` or ``insert(0, x)`` triggers O(n) data
|
||
movement for the entire list. To help address this need, Python 2.4
|
||
introduces a new container, ``collections.deque()`` which has efficient
|
||
append and pop operations on the both the left and right (the trade-off
|
||
is much slower getitem/setitem access). The new container is especially
|
||
helpful for implementing data queues:
|
||
|
||
Pattern::
|
||
|
||
c = list(data) --> c = collections.deque(data)
|
||
c.pop(0) --> c.popleft()
|
||
c.insert(0, x) --> c.appendleft()
|
||
|
||
Locating::
|
||
|
||
grep pop(0 or
|
||
grep insert(0
|
||
|
||
Simplifying Custom Sorts
|
||
''''''''''''''''''''''''
|
||
|
||
In Python 2.4, the ``sort`` method for lists and the new ``sorted``
|
||
built-in function both accept a ``key`` function for computing sort
|
||
keys. Unlike the ``cmp`` function which gets applied to every
|
||
comparison, the key function gets applied only once to each record.
|
||
It is much faster than cmp and typically more readable while using
|
||
less code. The key function also maintains the stability of the
|
||
sort (records with the same key are left in their original order.
|
||
|
||
Original code using a comparison function::
|
||
|
||
names.sort(lambda x,y: cmp(x.lower(), y.lower()))
|
||
|
||
Alternative original code with explicit decoration::
|
||
|
||
tempnames = [(n.lower(), n) for n in names]
|
||
tempnames.sort()
|
||
names = [original for decorated, original in tempnames]
|
||
|
||
Revised code using a key function::
|
||
|
||
names.sort(key=str.lower) # case-insensitive sort
|
||
|
||
|
||
Locating: ``grep sort *.py``
|
||
|
||
Replacing Common Uses of Lambda
|
||
'''''''''''''''''''''''''''''''
|
||
|
||
In Python 2.4, the ``operator`` module gained two new functions,
|
||
itemgetter() and attrgetter() that can replace common uses of
|
||
the ``lambda`` keyword. The new functions run faster and
|
||
are considered by some to improve readability.
|
||
|
||
Pattern::
|
||
|
||
lambda r: r[2] --> itemgetter(2)
|
||
lambda r: r.myattr --> attrgetter('myattr')
|
||
|
||
Typical contexts::
|
||
|
||
sort(studentrecords, key=attrgetter('gpa')) # set a sort field
|
||
map(attrgetter('lastname'), studentrecords) # extract a field
|
||
|
||
Locating: ``grep lambda *.py``
|
||
|
||
Simplified Reverse Iteration
|
||
''''''''''''''''''''''''''''
|
||
|
||
Python 2.4 introduced the ``reversed`` builtin function for reverse
|
||
iteration. The existing approaches to reverse iteration suffered
|
||
from wordiness, performance issues (speed and memory consumption),
|
||
and/or lack of clarity. A preferred style is to express the
|
||
sequence in a forwards direction, apply ``reversed`` to the result,
|
||
and then loop over the resulting fast, memory friendly iterator.
|
||
|
||
Original code expressed with half-open intervals::
|
||
|
||
for i in range(n-1, -1, -1):
|
||
print seqn[i]
|
||
|
||
Alternative original code reversed in multiple steps::
|
||
|
||
rseqn = list(seqn)
|
||
rseqn.reverse()
|
||
for value in rseqn:
|
||
print value
|
||
|
||
Alternative original code expressed with extending slicing::
|
||
|
||
for value in seqn[::-1]:
|
||
print value
|
||
|
||
Revised code using the ``reversed`` function::
|
||
|
||
for value in reversed(seqn):
|
||
print value
|
||
|
||
Python 2.3 or Later
|
||
-------------------
|
||
|
||
Testing String Membership
|
||
'''''''''''''''''''''''''
|
||
|
||
In Python 2.3, for ``string2 in string1``, the length restriction on
|
||
``string2`` is lifted; it can now be a string of any length. When
|
||
searching for a substring, where you don't care about the position of
|
||
the substring in the original string, using the ``in`` operator makes
|
||
the meaning clear.
|
||
|
||
Pattern::
|
||
|
||
string1.find(string2) >= 0 --> string2 in string1
|
||
string1.find(string2) != -1 --> string2 in string1
|
||
|
||
Replace apply() with a Direct Function Call
|
||
'''''''''''''''''''''''''''''''''''''''''''
|
||
|
||
In Python 2.3, apply() was marked for Pending Deprecation because it
|
||
was made obsolete by Python 1.6's introduction of * and ** in
|
||
function calls. Using a direct function call was always a little
|
||
faster than apply() because it saved the lookup for the builtin.
|
||
Now, apply() is even slower due to its use of the warnings module.
|
||
|
||
Pattern::
|
||
|
||
apply(f, args, kwds) --> f(*args, **kwds)
|
||
|
||
Note: The Pending Deprecation was removed from apply() in Python 2.3.3
|
||
since it creates pain for people who need to maintain code that works
|
||
with Python versions as far back as 1.5.2, where there was no
|
||
alternative to apply(). The function remains deprecated, however.
|
||
|
||
|
||
Python 2.2 or Later
|
||
-------------------
|
||
|
||
Testing Dictionary Membership
|
||
'''''''''''''''''''''''''''''
|
||
|
||
For testing dictionary membership, use the 'in' keyword instead of the
|
||
'has_key()' method. The result is shorter and more readable. The
|
||
style becomes consistent with tests for membership in lists. The
|
||
result is slightly faster because ``has_key`` requires an attribute
|
||
search and uses a relatively expensive function call.
|
||
|
||
Pattern::
|
||
|
||
if d.has_key(k): --> if k in d:
|
||
|
||
Contra-indications:
|
||
|
||
1. Some dictionary-like objects may not define a
|
||
``__contains__()`` method::
|
||
|
||
if dictlike.has_key(k)
|
||
|
||
Locating: ``grep has_key``
|
||
|
||
|
||
Looping Over Dictionaries
|
||
'''''''''''''''''''''''''
|
||
|
||
Use the new ``iter`` methods for looping over dictionaries. The
|
||
``iter`` methods are faster because they do not have to create a new
|
||
list object with a complete copy of all of the keys, values, or items.
|
||
Selecting only keys, values, or items (key/value pairs) as needed
|
||
saves the time for creating throwaway object references and, in the
|
||
case of items, saves a second hash look-up of the key.
|
||
|
||
Pattern::
|
||
|
||
for key in d.keys(): --> for key in d:
|
||
for value in d.values(): --> for value in d.itervalues():
|
||
for key, value in d.items():
|
||
--> for key, value in d.iteritems():
|
||
|
||
Contra-indications:
|
||
|
||
1. If you need a list, do not change the return type::
|
||
|
||
def getids(): return d.keys()
|
||
|
||
2. Some dictionary-like objects may not define
|
||
``iter`` methods::
|
||
|
||
for k in dictlike.keys():
|
||
|
||
3. Iterators do not support slicing, sorting or other operations::
|
||
|
||
k = d.keys(); j = k[:]
|
||
|
||
4. Dictionary iterators prohibit modifying the dictionary::
|
||
|
||
for k in d.keys(): del[k]
|
||
|
||
|
||
``stat`` Methods
|
||
''''''''''''''''
|
||
|
||
Replace ``stat`` constants or indices with new ``os.stat`` attributes
|
||
and methods. The ``os.stat`` attributes and methods are not
|
||
order-dependent and do not require an import of the ``stat`` module.
|
||
|
||
Pattern::
|
||
|
||
os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime
|
||
os.stat("foo")[stat.ST_MTIME] --> os.path.getmtime("foo")
|
||
|
||
Locating: ``grep os.stat`` or ``grep stat.S``
|
||
|
||
|
||
Reduce Dependency on ``types`` Module
|
||
'''''''''''''''''''''''''''''''''''''
|
||
|
||
The ``types`` module is likely to be deprecated in the future. Use
|
||
built-in constructor functions instead. They may be slightly faster.
|
||
|
||
Pattern::
|
||
|
||
isinstance(v, types.IntType) --> isinstance(v, int)
|
||
isinstance(s, types.StringTypes) --> isinstance(s, basestring)
|
||
|
||
Full use of this technique requires Python 2.3 or later
|
||
(``basestring`` was introduced in Python 2.3), but Python 2.2 is
|
||
sufficient for most uses.
|
||
|
||
Locating: ``grep types *.py | grep import``
|
||
|
||
|
||
Avoid Variable Names that Clash with the ``__builtins__`` Module
|
||
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
|
||
|
||
In Python 2.2, new built-in types were added for ``dict`` and ``file``.
|
||
Scripts should avoid assigning variable names that mask those types.
|
||
The same advice also applies to existing builtins like ``list``.
|
||
|
||
Pattern::
|
||
|
||
file = open('myfile.txt') --> f = open('myfile.txt')
|
||
dict = obj.__dict__ --> d = obj.__dict__
|
||
|
||
Locating: ``grep 'file ' *.py``
|
||
|
||
|
||
Python 2.1 or Later
|
||
-------------------
|
||
|
||
``whrandom`` Module Deprecated
|
||
''''''''''''''''''''''''''''''
|
||
|
||
All random-related methods have been collected in one place, the
|
||
``random`` module.
|
||
|
||
Pattern::
|
||
|
||
import whrandom --> import random
|
||
|
||
Locating: ``grep whrandom``
|
||
|
||
|
||
Python 2.0 or Later
|
||
-------------------
|
||
|
||
String Methods
|
||
''''''''''''''
|
||
|
||
The string module is likely to be deprecated in the future. Use
|
||
string methods instead. They're faster too.
|
||
|
||
Pattern::
|
||
|
||
import string ; string.method(s, ...) --> s.method(...)
|
||
c in string.whitespace --> c.isspace()
|
||
|
||
Locating: ``grep string *.py | grep import``
|
||
|
||
|
||
``startswith`` and ``endswith`` String Methods
|
||
''''''''''''''''''''''''''''''''''''''''''''''
|
||
|
||
Use these string methods instead of slicing. No slice has to be
|
||
created and there's no risk of miscounting.
|
||
|
||
Pattern::
|
||
|
||
"foobar"[:3] == "foo" --> "foobar".startswith("foo")
|
||
"foobar"[-3:] == "bar" --> "foobar".endswith("bar")
|
||
|
||
|
||
The ``atexit`` Module
|
||
'''''''''''''''''''''
|
||
|
||
The atexit module supports multiple functions to be executed upon
|
||
program termination. Also, it supports parameterized functions.
|
||
Unfortunately, its implementation conflicts with the sys.exitfunc
|
||
attribute which only supports a single exit function. Code relying
|
||
on sys.exitfunc may interfere with other modules (including library
|
||
modules) that elect to use the newer and more versatile atexit module.
|
||
|
||
Pattern::
|
||
|
||
sys.exitfunc = myfunc --> atexit.register(myfunc)
|
||
|
||
|
||
Python 1.5 or Later
|
||
-------------------
|
||
|
||
Class-Based Exceptions
|
||
''''''''''''''''''''''
|
||
|
||
String exceptions are deprecated, so derive from the ``Exception``
|
||
base class. Unlike the obsolete string exceptions, class exceptions
|
||
all derive from another exception or the ``Exception`` base class.
|
||
This allows meaningful groupings of exceptions. It also allows an
|
||
"``except Exception``" clause to catch all exceptions.
|
||
|
||
Pattern::
|
||
|
||
NewError = 'NewError' --> class NewError(Exception): pass
|
||
|
||
Locating: Use `PyChecker <http://pychecker.sourceforge.net/>`__.
|
||
|
||
|
||
All Python Versions
|
||
-------------------
|
||
|
||
Testing for ``None``
|
||
''''''''''''''''''''
|
||
|
||
Since there is only one ``None`` object, equality can be tested with
|
||
identity. Identity tests are slightly faster than equality tests.
|
||
Also, some object types may overload comparison, so equality testing
|
||
may be much slower.
|
||
|
||
Pattern::
|
||
|
||
if v == None --> if v is None:
|
||
if v != None --> if v is not None:
|
||
|
||
Locating: ``grep '== None'`` or ``grep '!= None'``
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
End:
|