560 lines
19 KiB
Plaintext
560 lines
19 KiB
Plaintext
PEP: 355
|
||
Title: Path - Object oriented filesystem paths
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Björn Lindqvist <bjourne@gmail.com>
|
||
Status: Draft
|
||
Type: Standards Track (library)
|
||
Created: 24-Jan-2006
|
||
Content-Type: text/plain
|
||
Python-Version: 2.5
|
||
|
||
|
||
Abstract
|
||
|
||
This PEP describes a new class, Path, to be added to the os
|
||
module, for handling paths in an object oriented fashion. The
|
||
"weak" deprecation of various related functions is also discussed
|
||
and recommended.
|
||
|
||
|
||
Background
|
||
|
||
The ideas expressed in this PEP are not recent, but have been
|
||
debated in the Python community for many years. Many have felt
|
||
that the API for manipulating file paths as offered in the os.path
|
||
module is inadequate. The first proposal for a Path object was
|
||
raised by Just van Rossum on python-dev in 2001 [2]. In 2003,
|
||
Jason Orendorff released version 1.0 of the "path module" which
|
||
was the first public implementation that used objects to represent
|
||
paths [3].
|
||
|
||
The path module quickly became very popular and numerous attempts
|
||
were made to get the path module included in the Python standard
|
||
library; [4], [5], [6], [7].
|
||
|
||
This PEP summarizes the the ideas and suggestions people have
|
||
expressed about the path module and proposes that a modified
|
||
version should be included in the standard library.
|
||
|
||
|
||
Motivation
|
||
|
||
Dealing with filesystem paths is a common task in any programming
|
||
language, and very common in a high-level language like Python.
|
||
Good support for this task is needed, because:
|
||
|
||
- Almost every program uses paths to access files. It makes sense
|
||
that a task, that is so often performed, should be as intuitive
|
||
and as easy to perform as possible.
|
||
|
||
- It makes Python an even better replacement language for
|
||
over-complicated shell scripts.
|
||
|
||
Currently, Python has a large number of different functions
|
||
scattered over half a dozen modules for handling paths. This
|
||
makes it hard for newbies and experienced developers to to choose
|
||
the right method.
|
||
|
||
The Path class provides the following enhancements over the
|
||
current common practice:
|
||
|
||
- One "unified" object provides all functionality from previous
|
||
functions.
|
||
|
||
- Subclassability - the Path object can be extended to support
|
||
paths other than filesystem paths. The programmer does not need
|
||
to learn a new API, but can reuse his or her knowledge of Path
|
||
to deal with the extended class.
|
||
|
||
- With all related functionality in one place, the right approach
|
||
is easier to learn as one does not have to hunt through many
|
||
different modules for the right functions.
|
||
|
||
- Python is an object oriented language. Just like files,
|
||
datetimes and sockets are objects so are paths, they are not
|
||
merely strings to be passed to functions. Path objects is
|
||
inherently a pythonic idea.
|
||
|
||
- Path takes advantage of properties. Properties make for more
|
||
readable code.
|
||
|
||
if imgpath.ext == 'jpg':
|
||
jpegdecode(imgpath)
|
||
|
||
Is better than:
|
||
|
||
if os.path.splitexit(imgpath)[1] == 'jpg':
|
||
jpegdecode(imgpath)
|
||
|
||
|
||
Rationale
|
||
|
||
The following points summarize the design:
|
||
|
||
- Path extends from string, therefore all code which expects
|
||
string pathnames need not be modified and no existing code will
|
||
break.
|
||
|
||
- A Path object can be created either by using the classmethod
|
||
Path.cwd, by instantiating the class with a string representing
|
||
a path or by using the default constructor which is equivalent
|
||
to Path(".").
|
||
|
||
- Path provides common pathname manipulation, pattern expansion,
|
||
pattern matching and other high-level file operations including
|
||
copying. Basically Path provides everything path-related except
|
||
the manipulation of file contents, for which file objects are
|
||
better suited.
|
||
|
||
- Platform incompatibilites are dealt with by not instantiating
|
||
system specific methods.
|
||
|
||
|
||
Specification
|
||
|
||
This class defines the following public interface (docstrings have
|
||
been extracted from the reference implementation, and shortened
|
||
for brevity; see the reference implementation for more detail):
|
||
|
||
class Path(str):
|
||
|
||
# Special Python methods:
|
||
def __new__(cls, *args) => Path
|
||
"""
|
||
Creates a new path object concatenating the *args. *args
|
||
may only contain Path objects or strings. If *args is
|
||
empty, Path(os.curdir) is created.
|
||
"""
|
||
def __repr__(self): ...
|
||
def __add__(self, more): ...
|
||
def __radd__(self, other): ...
|
||
|
||
# Alternative constructor.
|
||
def cwd(cls): ...
|
||
|
||
# Operations on path strings:
|
||
def abspath(self) => Path
|
||
"""Returns the absolute path of self as a new Path object."""
|
||
def normcase(self): ...
|
||
def normpath(self): ...
|
||
def realpath(self): ...
|
||
def expanduser(self): ...
|
||
def expandvars(self): ...
|
||
def basename(self): ...
|
||
def expand(self): ...
|
||
def splitpath(self) => (Path, str)
|
||
"""p.splitpath() -> Return (p.parent, p.name)."""
|
||
def stripext(self) => Path
|
||
"""p.stripext() -> Remove one file extension from the path."""
|
||
def splitunc(self): ... [1]
|
||
def splitall(self): ...
|
||
def relpath(self): ...
|
||
def relpathto(self, dest): ...
|
||
|
||
# Properties about the path:
|
||
parent => Path
|
||
"""This Path's parent directory as a new path object."""
|
||
name => str
|
||
"""The name of this file or directory without the full path."""
|
||
ext => str
|
||
"""
|
||
The file extension or an empty string if Path refers to a
|
||
file without an extension or a directory.
|
||
"""
|
||
drive => str
|
||
"""
|
||
The drive specifier. Always empty on systems that don't
|
||
use drive specifiers.
|
||
"""
|
||
namebase => str
|
||
"""
|
||
The same as path.name, but with one file extension
|
||
stripped off.
|
||
"""
|
||
uncshare[1]
|
||
|
||
# Operations that return lists of paths:
|
||
def listdir(self, pattern = None): ...
|
||
def dirs(self, pattern = None): ...
|
||
def files(self, pattern = None): ...
|
||
def walk(self, pattern = None): ...
|
||
def walkdirs(self, pattern = None): ...
|
||
def walkfiles(self, pattern = None): ...
|
||
def match(self, pattern) => bool
|
||
"""Returns True if self.name matches the given pattern."""
|
||
|
||
def matchcase(self, pattern) => bool
|
||
"""
|
||
Like match() but is guaranteed to be case sensitive even
|
||
on platforms with case insensitive filesystems.
|
||
"""
|
||
def glob(self, pattern):
|
||
|
||
# Methods for retrieving information about the filesystem
|
||
# path:
|
||
def exists(self): ...
|
||
def isabs(self): ...
|
||
def isdir(self): ...
|
||
def isfile(self): ...
|
||
def islink(self): ...
|
||
def ismount(self): ...
|
||
def samefile(self, other): ... [1]
|
||
def atime(self): ...
|
||
"""Last access time of the file."""
|
||
def mtime(self): ...
|
||
"""Last-modified time of the file."""
|
||
def ctime(self): ...
|
||
"""
|
||
Return the system's ctime which, on some systems (like
|
||
Unix) is the time of the last change, and, on others (like
|
||
Windows), is the creation time for path.
|
||
"""
|
||
def size(self): ...
|
||
def access(self, mode): ... [1]
|
||
def stat(self): ...
|
||
def lstat(self): ...
|
||
def statvfs(self): ... [1]
|
||
def pathconf(self, name): ... [1]
|
||
|
||
# Methods for manipulating information about the filesystem
|
||
# path.
|
||
def utime(self, times) => None
|
||
def chmod(self, mode) => None
|
||
def chown(self, uid, gid) => None [1]
|
||
def rename(self, new) => None
|
||
def renames(self, new) => None
|
||
|
||
# Create/delete operations on directories
|
||
def mkdir(self, mode = 0777): ...
|
||
def makedirs(self, mode = 0777): ...
|
||
def rmdir(self): ...
|
||
def removedirs(self): ...
|
||
|
||
# Modifying operations on files
|
||
def touch(self): ...
|
||
def remove(self): ...
|
||
def unlink(self): ...
|
||
|
||
# Modifying operations on links
|
||
def link(self, newpath): ...
|
||
def symlink(self, newlink): ...
|
||
def readlink(self): ...
|
||
def readlinkabs(self): ...
|
||
|
||
# High-level functions from shutil
|
||
def copyfile(self, dst): ...
|
||
def copymode(self, dst): ...
|
||
def copystat(self, dst): ...
|
||
def copy(self, dst): ...
|
||
def copy2(self, dst): ...
|
||
def copytree(self, dst, symlinks = True): ...
|
||
def move(self, dst): ...
|
||
def rmtree(self, ignore_errors = False, onerror = None): ...
|
||
|
||
# Special stuff from os
|
||
def chroot(self): ... [1]
|
||
def startfile(self): ... [1]
|
||
|
||
|
||
Replacing older functions with the Path class
|
||
|
||
In this section, "a ==> b" means that b can be used as a
|
||
replacement for a.
|
||
|
||
In the following examples, we assume that the Path class is
|
||
imported with "from path import Path".
|
||
|
||
1. Replacing os.path.join
|
||
--------------------------
|
||
|
||
os.path.join(os.getcwd(), "foobar")
|
||
==>
|
||
Path(Path.cwd(), "foobar")
|
||
|
||
os.path.join("foo", "bar", "baz")
|
||
==>
|
||
Path("foo", "bar", "baz")
|
||
|
||
|
||
2. Replacing os.path.splitext
|
||
------------------------------
|
||
|
||
fname = "Python2.4.tar.gz"
|
||
os.path.splitext(fname)[1]
|
||
==>
|
||
fname = Path("Python2.4.tar.gz")
|
||
fname.ext
|
||
|
||
Or if you want both parts:
|
||
|
||
fname = "Python2.4.tar.gz"
|
||
base, ext = os.path.splitext(fname)
|
||
==>
|
||
fname = Path("Python2.4.tar.gz")
|
||
base, ext = fname.namebase, fname.extx
|
||
|
||
|
||
3. Replacing glob.glob
|
||
-----------------------
|
||
|
||
lib_dir = "/lib"
|
||
libs = glob.glob(os.path.join(lib_dir, "*s.o"))
|
||
==>
|
||
lib_dir = Path("/lib")
|
||
libs = lib_dir.files("*.so")
|
||
|
||
|
||
Deprecations
|
||
|
||
Introducing this module to the standard library introduces a need
|
||
for the "weak" deprecation of a number of existing modules and
|
||
functions. These modules and functions are so widely used that
|
||
they cannot be truly deprecated, as in generating
|
||
DeprecationWarning. Here "weak deprecation" means notes in the
|
||
documentation only.
|
||
|
||
The table below lists the existing functionality that should be
|
||
deprecated.
|
||
|
||
Path method/property Deprecates function
|
||
-------------------- -------------------
|
||
normcase() os.path.normcase()
|
||
normpath() os.path.normpath()
|
||
realpath() os.path.realpath()
|
||
expanduser() os.path.expanduser()
|
||
expandvars() os.path.expandvars()
|
||
parent os.path.dirname()
|
||
name os.path.basename()
|
||
splitpath() os.path.split()
|
||
drive os.path.splitdrive()
|
||
ext os.path.splitext()
|
||
splitunc() os.path.splitunc()
|
||
__new__() os.path.join(), os.curdir
|
||
listdir() os.listdir() [fnmatch.filter()]
|
||
match() fnmatch.fnmatch()
|
||
matchcase() fnmatch.fnmatchcase()
|
||
glob() glob.glob()
|
||
exists() os.path.exists()
|
||
isabs() os.path.isabs()
|
||
isdir() os.path.isdir()
|
||
isfile() os.path.isfile()
|
||
islink() os.path.islink()
|
||
ismount() os.path.ismount()
|
||
samefile() os.path.samefile()
|
||
atime() os.path.getatime()
|
||
ctime() os.path.getctime()
|
||
mtime() os.path.getmtime()
|
||
size() os.path.getsize()
|
||
cwd() os.getcwd()
|
||
access() os.access()
|
||
stat() os.stat()
|
||
lstat() os.lstat()
|
||
statvfs() os.statvfs()
|
||
pathconf() os.pathconf()
|
||
utime() os.utime()
|
||
chmod() os.chmod()
|
||
chown() os.chown()
|
||
rename() os.rename()
|
||
renames() os.renames()
|
||
mkdir() os.mkdir()
|
||
makedirs() os.makedirs()
|
||
rmdir() os.rmdir()
|
||
removedirs() os.removedirs()
|
||
remove() os.remove()
|
||
unlink() os.unlink()
|
||
link() os.link()
|
||
symlink() os.symlink()
|
||
readlink() os.readlink()
|
||
chroot() os.chroot()
|
||
startfile() os.startfile()
|
||
copyfile() shutil.copyfile()
|
||
copymode() shutil.copymode()
|
||
copystat() shutil.copystat()
|
||
copy() shutil.copy()
|
||
copy2() shutil.copy2()
|
||
copytree() shutil.copytree()
|
||
move() shutil.move()
|
||
rmtree() shutil.rmtree()
|
||
|
||
The Path class deprecates the whole of os.path, shutil, fnmatch
|
||
and glob. A big chunk of os is also deprecated.
|
||
|
||
|
||
Closed Issues
|
||
|
||
A number contentious issues have been resolved since this PEP
|
||
first appeared on python-dev:
|
||
|
||
* The __div__() method was removed. Overloading the / (division)
|
||
operator may be "too much magic" and make path concatenation
|
||
appear to be division. The method can always be re-added later
|
||
if the BFDL so desires. In its place, __new__() got an *args
|
||
argument that accepts both Path and string objects. The *args
|
||
are concatenated with os.path.join() which is used to construct
|
||
the Path object. These changes obsoleted the problematic
|
||
joinpath() method which was removed.
|
||
|
||
* The methods and the properties getatime()/atime,
|
||
getctime()/ctime, getmtime()/mtime and getsize()/size duplicated
|
||
each other. These methods and properties have been merged to
|
||
atime(), ctime(), mtime() and size(). The reason they are not
|
||
properties instead, is because there is a possibility that they
|
||
may change unexpectedly. The following example is not
|
||
guaranteed to always pass the assertion:
|
||
|
||
p = Path("foobar")
|
||
s = p.size()
|
||
assert p.size() == s
|
||
|
||
|
||
Open Issues
|
||
|
||
Some functionality of Jason Orendorff's path module have been
|
||
omitted:
|
||
|
||
* Function for opening a path - better handled by the builtin
|
||
open().
|
||
|
||
* Functions for reading and writing whole files - better handled
|
||
by file objects' own read() and write() methods.
|
||
|
||
* A chdir() function may be a worthy inclusion.
|
||
|
||
* A deprecation schedule needs to be set up. How much
|
||
functionality should Path implement? How much of existing
|
||
functionality should it deprecate and when?
|
||
|
||
* The name obviously has to be either "path" or "Path," but where
|
||
should it live? In its own module or in os?
|
||
|
||
* Due to Path subclassing either str or unicode, the following
|
||
non-magic, public methods are availible on Path objects:
|
||
|
||
capitalize(), center(), count(), decode(), encode(),
|
||
endswith(), expandtabs(), find(), index(), isalnum(),
|
||
isalpha(), isdigit(), islower(), isspace(), istitle(),
|
||
isupper(), join(), ljust(), lower(), lstrip(), replace(),
|
||
rfind(), rindex(), rjust(), rsplit(), rstrip(), split(),
|
||
splitlines(), startswith(), strip(), swapcase(), title(),
|
||
translate(), upper(), zfill()
|
||
|
||
On python-dev it has been argued whether this inheritance is
|
||
sane or not. Most persons debating said that most string
|
||
methods doesn't make sense in the context of filesystem paths --
|
||
they are just dead weight. The other position, also argued on
|
||
python-dev, is that inheriting from string is very convenient
|
||
because it allows code to "just work" with Path objects without
|
||
having to be adapted for them.
|
||
|
||
One of the problems is that at the Python level, there is no way
|
||
to make an object "string-like enough," so that it can be passed
|
||
to the builtin function open() (and other builtins expecting a
|
||
string or buffer), unless the object inherits from either str or
|
||
unicode. Therefore, to not inherit from string requires changes
|
||
in CPython's core.
|
||
|
||
The functions and modules that this new module is trying to
|
||
replace (os.path, shutil, fnmatch, glob and parts of os) are
|
||
expected to be available in future Python versions for a long
|
||
time, to preserve backwards compatibility.
|
||
|
||
|
||
Reference Implementation
|
||
|
||
Currently, the Path class is implemented as a thin wrapper around
|
||
the standard library modules fnmatch, glob, os, os.path and
|
||
shutil. The intention of this PEP is to move functionality from
|
||
the aforementioned modules to Path while they are being
|
||
deprecated.
|
||
|
||
For more detail and an implementation see:
|
||
|
||
http://wiki.python.org/moin/PathModule
|
||
|
||
|
||
Examples
|
||
|
||
In this section, "a ==> b" means that b can be used as a
|
||
replacement for a.
|
||
|
||
1. Make all python files in the a directory executable
|
||
------------------------------------------------------
|
||
|
||
DIR = '/usr/home/guido/bin'
|
||
for f in os.listdir(DIR):
|
||
if f.endswith('.py'):
|
||
path = os.path.join(DIR, f)
|
||
os.chmod(path, 0755)
|
||
==>
|
||
for f in Path('/usr/home/guido/bin').files("*.py"):
|
||
f.chmod(0755)
|
||
|
||
2. Delete emacs backup files
|
||
----------------------------
|
||
|
||
def delete_backups(arg, dirname, names):
|
||
for name in names:
|
||
if name.endswith('~'):
|
||
os.remove(os.path.join(dirname, name))
|
||
os.path.walk(os.environ['HOME'], delete_backups, None)
|
||
==>
|
||
d = Path(os.environ['HOME'])
|
||
for f in d.walkfiles('*~'):
|
||
f.remove()
|
||
|
||
3. Finding the relative path to a file
|
||
--------------------------------------
|
||
|
||
b = Path('/users/peter/')
|
||
a = Path('/users/peter/synergy/tiki.txt')
|
||
a.relpathto(b)
|
||
|
||
4. Splitting a path into directory and filename
|
||
-----------------------------------------------
|
||
|
||
os.path.split("/path/to/foo/bar.txt")
|
||
==>
|
||
Path("/path/to/foo/bar.txt").splitpath()
|
||
|
||
5. List all Python scripts in the current directory tree
|
||
--------------------------------------------------------
|
||
|
||
list(Path().walkfiles("*.py"))
|
||
|
||
|
||
References and Footnotes
|
||
|
||
[1] Method is not guaranteed to be availible on all platforms.
|
||
|
||
[2] "(idea) subclassable string: path object?", van Rossum, 2001
|
||
http://mail.python.org/pipermail/python-dev/2001-August/016663.html
|
||
|
||
[3] "path module v1.0 released", Orendorff, 2003
|
||
http://mail.python.org/pipermail/python-announce-list/2003-January/001984.html
|
||
|
||
[4] "Some RFE for review", Birkenfeld, 2005
|
||
http://mail.python.org/pipermail/python-dev/2005-June/054438.html
|
||
|
||
[5] "path module", Orendorff, 2003
|
||
http://mail.python.org/pipermail/python-list/2003-July/174289.html
|
||
|
||
[6] "PRE-PEP: new Path class", Roth, 2004
|
||
http://mail.python.org/pipermail/python-list/2004-January/201672.html
|
||
|
||
[7] http://wiki.python.org/moin/PathClass
|
||
|
||
|
||
Copyright
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: latin-1
|
||
End:
|