2016-05-16 16:24:35 -04:00
|
|
|
|
PEP: 519
|
|
|
|
|
Title: Adding a file system path protocol
|
|
|
|
|
Version: $Revision$
|
|
|
|
|
Last-Modified: $Date$
|
|
|
|
|
Author: Brett Cannon <brett@python.org>,
|
|
|
|
|
Koos Zevenhoven <k7hoven@gmail.com>
|
2016-05-16 17:25:39 -04:00
|
|
|
|
Status: Accepted
|
2016-05-16 16:24:35 -04:00
|
|
|
|
Type: Standards Track
|
|
|
|
|
Content-Type: text/x-rst
|
|
|
|
|
Created: 11-May-2016
|
2016-05-28 14:32:59 -04:00
|
|
|
|
Python-Version: 3.6
|
2016-05-16 16:24:35 -04:00
|
|
|
|
Post-History: 11-May-2016,
|
|
|
|
|
12-May-2016,
|
|
|
|
|
13-May-2016
|
2016-05-16 17:32:12 -04:00
|
|
|
|
Resolution: https://mail.python.org/pipermail/python-dev/2016-May/144646.html
|
2016-05-16 16:24:35 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
|
========
|
|
|
|
|
|
|
|
|
|
This PEP proposes a protocol for classes which represent a file system
|
|
|
|
|
path to be able to provide a ``str`` or ``bytes`` representation.
|
|
|
|
|
Changes to Python's standard library are also proposed to utilize this
|
|
|
|
|
protocol where appropriate to facilitate the use of path objects where
|
|
|
|
|
historically only ``str`` and/or ``bytes`` file system paths are
|
|
|
|
|
accepted. The goal is to facilitate the migration of users towards
|
|
|
|
|
rich path objects while providing an easy way to work with code
|
|
|
|
|
expecting ``str`` or ``bytes``.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rationale
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
Historically in Python, file system paths have been represented as
|
|
|
|
|
strings or bytes. This choice of representation has stemmed from C's
|
|
|
|
|
own decision to represent file system paths as
|
|
|
|
|
``const char *`` [#libc-open]_. While that is a totally serviceable
|
|
|
|
|
format to use for file system paths, it's not necessarily optimal. At
|
|
|
|
|
issue is the fact that while all file system paths can be represented
|
|
|
|
|
as strings or bytes, not all strings or bytes represent a file system
|
|
|
|
|
path. This can lead to issues where any e.g. string duck-types to a
|
|
|
|
|
file system path whether it actually represents a path or not.
|
|
|
|
|
|
|
|
|
|
To help elevate the representation of file system paths from their
|
|
|
|
|
representation as strings and bytes to a richer object representation,
|
|
|
|
|
the pathlib module [#pathlib]_ was provisionally introduced in
|
|
|
|
|
Python 3.4 through PEP 428. While considered by some as an improvement
|
|
|
|
|
over strings and bytes for file system paths, it has suffered from a
|
|
|
|
|
lack of adoption. Typically the key issue listed for the low adoption
|
|
|
|
|
rate has been the lack of support in the standard library. This lack
|
|
|
|
|
of support required users of pathlib to manually convert path objects
|
|
|
|
|
to strings by calling ``str(path)`` which many found error-prone.
|
|
|
|
|
|
|
|
|
|
One issue in converting path objects to strings comes from
|
|
|
|
|
the fact that the only generic way to get a string representation of
|
|
|
|
|
the path was to pass the object to ``str()``. This can pose a
|
|
|
|
|
problem when done blindly as nearly all Python objects have some
|
|
|
|
|
string representation whether they are a path or not, e.g.
|
|
|
|
|
``str(None)`` will give a result that
|
|
|
|
|
``builtins.open()`` [#builtins-open]_ will happily use to create a new
|
|
|
|
|
file.
|
|
|
|
|
|
|
|
|
|
Exacerbating this whole situation is the
|
|
|
|
|
``DirEntry`` object [#os-direntry]_. While path objects have a
|
|
|
|
|
representation that can be extracted using ``str()``, ``DirEntry``
|
|
|
|
|
objects expose a ``path`` attribute instead. Having no common
|
|
|
|
|
interface between path objects, ``DirEntry``, and any other
|
|
|
|
|
third-party path library has become an issue. A solution that allows
|
|
|
|
|
any path-representing object to declare that it is a path and a way
|
|
|
|
|
to extract a low-level representation that all path objects could
|
|
|
|
|
support is desired.
|
|
|
|
|
|
|
|
|
|
This PEP then proposes to introduce a new protocol to be followed by
|
|
|
|
|
objects which represent file system paths. Providing a protocol allows
|
|
|
|
|
for explicit signaling of what objects represent file system paths as
|
|
|
|
|
well as a way to extract a lower-level representation that can be used
|
|
|
|
|
with older APIs which only support strings or bytes.
|
|
|
|
|
|
|
|
|
|
Discussions regarding path objects that led to this PEP can be found
|
|
|
|
|
in multiple threads on the python-ideas mailing list archive
|
|
|
|
|
[#python-ideas-archive]_ for the months of March and April 2016 and on
|
|
|
|
|
the python-dev mailing list archives [#python-dev-archive]_ during
|
|
|
|
|
April 2016.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Proposal
|
|
|
|
|
========
|
|
|
|
|
|
|
|
|
|
This proposal is split into two parts. One part is the proposal of a
|
|
|
|
|
protocol for objects to declare and provide support for exposing a
|
|
|
|
|
file system path representation. The other part deals with changes to
|
|
|
|
|
Python's standard library to support the new protocol. These changes
|
|
|
|
|
will also lead to the pathlib module dropping its provisional status.
|
|
|
|
|
|
|
|
|
|
Protocol
|
|
|
|
|
--------
|
|
|
|
|
|
|
|
|
|
The following abstract base class defines the protocol for an object
|
|
|
|
|
to be considered a path object::
|
|
|
|
|
|
|
|
|
|
import abc
|
|
|
|
|
import typing as t
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
class PathLike(abc.ABC):
|
|
|
|
|
|
|
|
|
|
"""Abstract base class for implementing the file system path protocol."""
|
|
|
|
|
|
|
|
|
|
@abc.abstractmethod
|
|
|
|
|
def __fspath__(self) -> t.Union[str, bytes]:
|
|
|
|
|
"""Return the file system path representation of the object."""
|
|
|
|
|
raise NotImplementedError
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Objects representing file system paths will implement the
|
|
|
|
|
``__fspath__()`` method which will return the ``str`` or ``bytes``
|
|
|
|
|
representation of the path. The ``str`` representation is the
|
|
|
|
|
preferred low-level path representation as it is human-readable and
|
|
|
|
|
what people historically represent paths as.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Standard library changes
|
|
|
|
|
------------------------
|
|
|
|
|
|
|
|
|
|
It is expected that most APIs in Python's standard library that
|
|
|
|
|
currently accept a file system path will be updated appropriately to
|
|
|
|
|
accept path objects (whether that requires code or simply an update
|
|
|
|
|
to documentation will vary). The modules mentioned below, though,
|
|
|
|
|
deserve specific details as they have either fundamental changes that
|
|
|
|
|
empower the ability to use path objects, or entail additions/removal
|
|
|
|
|
of APIs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
builtins
|
|
|
|
|
''''''''
|
|
|
|
|
|
|
|
|
|
``open()`` [#builtins-open]_ will be updated to accept path objects as
|
|
|
|
|
well as continue to accept ``str`` and ``bytes``.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
os
|
|
|
|
|
'''
|
|
|
|
|
|
|
|
|
|
The ``fspath()`` function will be added with the following semantics::
|
|
|
|
|
|
|
|
|
|
import typing as t
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def fspath(path: t.Union[PathLike, str, bytes]) -> t.Union[str, bytes]:
|
|
|
|
|
"""Return the string representation of the path.
|
|
|
|
|
|
2016-06-15 15:56:31 -04:00
|
|
|
|
If str or bytes is passed in, it is returned unchanged. If __fspath__()
|
|
|
|
|
returns something other than str or bytes then TypeError is raised. If
|
|
|
|
|
this function is given something that is not str, bytes, or os.PathLike
|
|
|
|
|
then TypeError is raised.
|
2016-05-16 16:24:35 -04:00
|
|
|
|
"""
|
|
|
|
|
if isinstance(path, (str, bytes)):
|
|
|
|
|
return path
|
|
|
|
|
|
|
|
|
|
# Work from the object's type to match method resolution of other magic
|
|
|
|
|
# methods.
|
|
|
|
|
path_type = type(path)
|
|
|
|
|
try:
|
2016-06-15 15:56:31 -04:00
|
|
|
|
path = path_type.__fspath__(path)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
except AttributeError:
|
|
|
|
|
if hasattr(path_type, '__fspath__'):
|
|
|
|
|
raise
|
2016-06-15 15:56:31 -04:00
|
|
|
|
else:
|
|
|
|
|
if isinstance(path, (str, bytes)):
|
|
|
|
|
return path
|
|
|
|
|
else:
|
|
|
|
|
raise TypeError("expected __fspath__() to return str or bytes, "
|
|
|
|
|
"not " + type(path).__name__)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
|
2016-06-15 15:56:31 -04:00
|
|
|
|
raise TypeError("expected str, bytes or os.PathLike object, not "
|
|
|
|
|
+ path_type.__name__)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
|
|
|
|
|
The ``os.fsencode()`` [#os-fsencode]_ and
|
|
|
|
|
``os.fsdecode()`` [#os-fsdecode]_ functions will be updated to accept
|
|
|
|
|
path objects. As both functions coerce their arguments to
|
|
|
|
|
``bytes`` and ``str``, respectively, they will be updated to call
|
|
|
|
|
``__fspath__()`` if present to convert the path object to a ``str`` or
|
|
|
|
|
``bytes`` representation, and then perform their appropriate
|
|
|
|
|
coercion operations as if the return value from ``__fspath__()`` had
|
|
|
|
|
been the original argument to the coercion function in question.
|
|
|
|
|
|
|
|
|
|
The addition of ``os.fspath()``, the updates to
|
|
|
|
|
``os.fsencode()``/``os.fsdecode()``, and the current semantics of
|
|
|
|
|
``pathlib.PurePath`` provide the semantics necessary to
|
|
|
|
|
get the path representation one prefers. For a path object,
|
|
|
|
|
``pathlib.PurePath``/``Path`` can be used. To obtain the ``str`` or
|
|
|
|
|
``bytes`` representation without any coersion, then ``os.fspath()``
|
|
|
|
|
can be used. If a ``str`` is desired and the encoding of ``bytes``
|
|
|
|
|
should be assumed to be the default file system encoding, then
|
|
|
|
|
``os.fsdecode()`` should be used. If a ``bytes`` representation is
|
|
|
|
|
desired and any strings should be encoded using the default file
|
|
|
|
|
system encoding, then ``os.fsencode()`` is used. This PEP recommends
|
|
|
|
|
using path objects when possible and falling back to string paths as
|
|
|
|
|
necessary and using ``bytes`` as a last resort.
|
|
|
|
|
|
|
|
|
|
Another way to view this is as a hierarchy of file system path
|
|
|
|
|
representations (highest- to lowest-level): path → str → bytes. The
|
|
|
|
|
functions and classes under discussion can all accept objects on the
|
|
|
|
|
same level of the hierarchy, but they vary in whether they promote or
|
|
|
|
|
demote objects to another level. The ``pathlib.PurePath`` class can
|
|
|
|
|
promote a ``str`` to a path object. The ``os.fspath()`` function can
|
|
|
|
|
demote a path object to a ``str`` or ``bytes`` instance, depending
|
|
|
|
|
on what ``__fspath__()`` returns.
|
|
|
|
|
The ``os.fsdecode()`` function will demote a path object to
|
|
|
|
|
a string or promote a ``bytes`` object to a ``str``. The
|
|
|
|
|
``os.fsencode()`` function will demote a path or string object to
|
|
|
|
|
``bytes``. There is no function that provides a way to demote a path
|
|
|
|
|
object directly to ``bytes`` while bypassing string demotion.
|
|
|
|
|
|
|
|
|
|
The ``DirEntry`` object [#os-direntry]_ will gain an ``__fspath__()``
|
|
|
|
|
method. It will return the same value as currently found on the
|
|
|
|
|
``path`` attribute of ``DirEntry`` instances.
|
|
|
|
|
|
|
|
|
|
The Protocol_ ABC will be added to the ``os`` module under the name
|
|
|
|
|
``os.PathLike``.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
os.path
|
|
|
|
|
'''''''
|
|
|
|
|
|
|
|
|
|
The various path-manipulation functions of ``os.path`` [#os-path]_
|
|
|
|
|
will be updated to accept path objects. For polymorphic functions that
|
|
|
|
|
accept both bytes and strings, they will be updated to simply use
|
|
|
|
|
``os.fspath()``.
|
|
|
|
|
|
|
|
|
|
During the discussions leading up to this PEP it was suggested that
|
|
|
|
|
``os.path`` not be updated using an "explicit is better than implicit"
|
|
|
|
|
argument. The thinking was that since ``__fspath__()`` is polymorphic
|
|
|
|
|
itself it may be better to have code working with ``os.path`` extract
|
|
|
|
|
the path representation from path objects explicitly. There is also
|
|
|
|
|
the consideration that adding support this deep into the low-level OS
|
|
|
|
|
APIs will lead to code magically supporting path objects without
|
|
|
|
|
requiring any documentation updated, leading to potential complaints
|
|
|
|
|
when it doesn't work, unbeknownst to the project author.
|
|
|
|
|
|
|
|
|
|
But it is the view of this PEP that "practicality beats purity" in
|
|
|
|
|
this instance. To help facilitate the transition to supporting path
|
|
|
|
|
objects, it is better to make the transition as easy as possible than
|
|
|
|
|
to worry about unexpected/undocumented duck typing support for
|
|
|
|
|
path objects by projects.
|
|
|
|
|
|
|
|
|
|
There has also been the suggestion that ``os.path`` functions could be
|
|
|
|
|
used in a tight loop and the overhead of checking or calling
|
|
|
|
|
``__fspath__()`` would be too costly. In this scenario only
|
|
|
|
|
path-consuming APIs would be directly updated and path-manipulating
|
|
|
|
|
APIs like the ones in ``os.path`` would go unmodified. This would
|
|
|
|
|
require library authors to update their code to support path objects
|
|
|
|
|
if they performed any path manipulations, but if the library code
|
|
|
|
|
passed the path straight through then the library wouldn't need to be
|
|
|
|
|
updated. It is the view of this PEP and Guido, though, that this is an
|
|
|
|
|
unnecessary worry and that performance will still be acceptable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
pathlib
|
|
|
|
|
'''''''
|
|
|
|
|
|
|
|
|
|
The constructor for ``pathlib.PurePath`` and ``pathlib.Path`` will be
|
|
|
|
|
updated to accept ``PathLike`` objects. Both ``PurePath`` and ``Path``
|
|
|
|
|
will continue to not accept ``bytes`` path representations, and so if
|
|
|
|
|
``__fspath__()`` returns ``bytes`` it will raise an exception.
|
|
|
|
|
|
|
|
|
|
The ``path`` attribute will be removed as this PEP makes it
|
|
|
|
|
redundant (it has not been included in any released version of Python
|
|
|
|
|
and so is not a backwards-compatibility concern).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
C API
|
|
|
|
|
'''''
|
|
|
|
|
|
|
|
|
|
The C API will gain an equivalent function to ``os.fspath()``::
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
|
Return the file system path representation of the object.
|
|
|
|
|
|
|
|
|
|
If the object is str or bytes, then allow it to pass through with
|
|
|
|
|
an incremented refcount. If the object defines __fspath__(), then
|
|
|
|
|
return the result of that method. All other types raise a TypeError.
|
|
|
|
|
*/
|
|
|
|
|
PyObject *
|
|
|
|
|
PyOS_FSPath(PyObject *path)
|
|
|
|
|
{
|
|
|
|
|
_Py_IDENTIFIER(__fspath__);
|
|
|
|
|
PyObject *func = NULL;
|
|
|
|
|
PyObject *path_repr = NULL;
|
|
|
|
|
|
|
|
|
|
if (PyUnicode_Check(path) || PyBytes_Check(path)) {
|
|
|
|
|
Py_INCREF(path);
|
|
|
|
|
return path;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
func = _PyObject_LookupSpecial(path, &PyId___fspath__);
|
|
|
|
|
if (NULL == func) {
|
|
|
|
|
return PyErr_Format(PyExc_TypeError,
|
|
|
|
|
"expected str, bytes or os.PathLike object, "
|
|
|
|
|
"not %S",
|
|
|
|
|
path->ob_type);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
path_repr = PyObject_CallFunctionObjArgs(func, NULL);
|
|
|
|
|
Py_DECREF(func);
|
2016-06-15 15:56:31 -04:00
|
|
|
|
if (!PyUnicode_Check(path_repr) && !PyBytes_Check(path_repr)) {
|
|
|
|
|
Py_DECREF(path_repr);
|
|
|
|
|
return PyErr_Format(PyExc_TypeError,
|
|
|
|
|
"expected __fspath__() to return str or bytes, "
|
|
|
|
|
"not %S",
|
|
|
|
|
path_repr->ob_type);
|
|
|
|
|
}
|
|
|
|
|
|
2016-05-16 16:24:35 -04:00
|
|
|
|
return path_repr;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Backwards compatibility
|
|
|
|
|
=======================
|
|
|
|
|
|
|
|
|
|
There are no explicit backwards-compatibility concerns. Unless an
|
|
|
|
|
object incidentally already defines a ``__fspath__()`` method there is
|
|
|
|
|
no reason to expect the pre-existing code to break or expect to have
|
|
|
|
|
its semantics implicitly changed.
|
|
|
|
|
|
|
|
|
|
Libraries wishing to support path objects and a version of Python
|
|
|
|
|
prior to Python 3.6 and the existence of ``os.fspath()`` can use the
|
|
|
|
|
idiom of
|
|
|
|
|
``path.__fspath__() if hasattr(path, "__fspath__") else path``.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Implementation
|
|
|
|
|
==============
|
|
|
|
|
|
2016-05-20 16:19:36 -04:00
|
|
|
|
This is the task list for what this PEP proposes to be changed in
|
|
|
|
|
Python 3.6:
|
2016-05-16 16:24:35 -04:00
|
|
|
|
|
|
|
|
|
#. Remove the ``path`` attribute from pathlib
|
2016-06-09 17:16:57 -04:00
|
|
|
|
(`done <http://bugs.python.org/issue22570>`__)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
#. Remove the provisional status of pathlib
|
2016-06-10 15:34:59 -04:00
|
|
|
|
(`done <https://hg.python.org/lookup/a5a013ca5687>`__)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
#. Add ``os.PathLike``
|
2016-06-09 20:00:28 -04:00
|
|
|
|
(`code <https://hg.python.org/lookup/e672cf63d08a>`__ and
|
2016-06-15 15:13:03 -04:00
|
|
|
|
`docs <http://hg.python.org/lookup/6239673d5e1d>`__ done)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
#. Add ``PyOS_FSPath()``
|
2016-06-09 20:00:28 -04:00
|
|
|
|
(`code <https://hg.python.org/lookup/780cbe18082e>`__ and
|
2016-06-15 15:13:03 -04:00
|
|
|
|
`docs <http://hg.python.org/lookup/cec1f00c538d>`__ done)
|
2016-05-20 16:19:36 -04:00
|
|
|
|
#. Add ``os.fspath()``
|
2016-06-09 17:16:57 -04:00
|
|
|
|
(`done <done <https://hg.python.org/lookup/780cbe18082e>`__)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
#. Update ``os.fsencode()``
|
2016-06-09 17:16:57 -04:00
|
|
|
|
(`done <https://hg.python.org/lookup/00991aa5fdb5>`__)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
#. Update ``os.fsdecode()``
|
2016-06-09 17:16:57 -04:00
|
|
|
|
(`done <https://hg.python.org/lookup/00991aa5fdb5>`__)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
#. Update ``pathlib.PurePath`` and ``pathlib.Path``
|
2016-06-10 15:34:59 -04:00
|
|
|
|
(`done <https://hg.python.org/lookup/a5a013ca5687>`__)
|
2016-05-20 16:19:36 -04:00
|
|
|
|
|
|
|
|
|
#. Add ``__fspath__()``
|
|
|
|
|
#. Add ``os.PathLike`` support to the constructors
|
|
|
|
|
|
2016-06-10 17:39:51 -04:00
|
|
|
|
#. Add ``__fspath__()`` to ``DirEntry``
|
|
|
|
|
(`done <https://hg.python.org/lookup/5a62d682636e>`__)
|
2016-05-20 16:19:36 -04:00
|
|
|
|
|
|
|
|
|
#. Update ``builtins.open()``
|
2016-06-09 17:16:57 -04:00
|
|
|
|
(`done <https://hg.python.org/lookup/254125a265d2>`__)
|
2016-05-16 16:24:35 -04:00
|
|
|
|
#. Update ``os.path``
|
2016-06-09 17:42:31 -04:00
|
|
|
|
(`bug <http://bugs.python.org/issue27186>`__)
|
2016-06-09 17:16:57 -04:00
|
|
|
|
#. Add a `glossary <https://docs.python.org/3.6/glossary.html>`__ entry for "path-like"
|
|
|
|
|
(`bug <http://bugs.python.org/issue27186>`__)
|
2016-05-20 16:19:36 -04:00
|
|
|
|
#. Update `"What's New" <https://docs.python.org/3.6/whatsnew/3.6.html>`_
|
2016-05-16 16:24:35 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rejected Ideas
|
|
|
|
|
==============
|
|
|
|
|
|
|
|
|
|
Other names for the protocol's method
|
|
|
|
|
-------------------------------------
|
|
|
|
|
|
|
|
|
|
Various names were proposed during discussions leading to this PEP,
|
|
|
|
|
including ``__path__``, ``__pathname__``, and ``__fspathname__``. In
|
|
|
|
|
the end people seemed to gravitate towards ``__fspath__`` for being
|
|
|
|
|
unambiguous without being unnecessarily long.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Separate str/bytes methods
|
|
|
|
|
--------------------------
|
|
|
|
|
|
|
|
|
|
At one point it was suggested that ``__fspath__()`` only return
|
|
|
|
|
strings and another method named ``__fspathb__()`` be introduced to
|
|
|
|
|
return bytes. The thinking is that by making ``__fspath__()`` not be
|
|
|
|
|
polymorphic it could make dealing with the potential string or bytes
|
|
|
|
|
representations easier. But the general consensus was that returning
|
|
|
|
|
bytes will more than likely be rare and that the various functions in
|
|
|
|
|
the os module are the better abstraction to promote over direct
|
|
|
|
|
calls to ``__fspath__()``.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Providing a ``path`` attribute
|
|
|
|
|
------------------------------
|
|
|
|
|
|
|
|
|
|
To help deal with the issue of ``pathlib.PurePath`` not inheriting
|
|
|
|
|
from ``str``, originally it was proposed to introduce a ``path``
|
|
|
|
|
attribute to mirror what ``os.DirEntry`` provides. In the end,
|
|
|
|
|
though, it was determined that a protocol would provide the same
|
|
|
|
|
result while not directly exposing an API that most people will never
|
|
|
|
|
need to interact with directly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Have ``__fspath__()`` only return strings
|
|
|
|
|
------------------------------------------
|
|
|
|
|
|
|
|
|
|
Much of the discussion that led to this PEP revolved around whether
|
|
|
|
|
``__fspath__()`` should be polymorphic and return ``bytes`` as well as
|
|
|
|
|
``str`` or only return ``str``. The general sentiment for this view
|
|
|
|
|
was that ``bytes`` are difficult to work with due to their
|
|
|
|
|
inherent lack of information about their encoding and PEP 383 makes
|
|
|
|
|
it possible to represent all file system paths using ``str`` with the
|
|
|
|
|
``surrogateescape`` handler. Thus, it would be better to forcibly
|
|
|
|
|
promote the use of ``str`` as the low-level path representation for
|
|
|
|
|
high-level path objects.
|
|
|
|
|
|
|
|
|
|
In the end, it was decided that using ``bytes`` to represent paths is
|
|
|
|
|
simply not going to go away and thus they should be supported to some
|
|
|
|
|
degree. The hope is that people will gravitate towards path objects
|
|
|
|
|
like pathlib and that will move people away from operating directly
|
|
|
|
|
with ``bytes``.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A generic string encoding mechanism
|
|
|
|
|
-----------------------------------
|
|
|
|
|
|
|
|
|
|
At one point there was a discussion of developing a generic mechanism
|
|
|
|
|
to extract a string representation of an object that had semantic
|
|
|
|
|
meaning (``__str__()`` does not necessarily return anything of
|
|
|
|
|
semantic significance beyond what may be helpful for debugging). In
|
|
|
|
|
the end, it was deemed to lack a motivating need beyond the one this
|
|
|
|
|
PEP is trying to solve in a specific fashion.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Have __fspath__ be an attribute
|
|
|
|
|
-------------------------------
|
|
|
|
|
|
|
|
|
|
It was briefly considered to have ``__fspath__`` be an attribute
|
|
|
|
|
instead of a method. This was rejected for two reasons. One,
|
|
|
|
|
historically protocols have been implemented as "magic methods" and
|
|
|
|
|
not "magic methods and attributes". Two, there is no guarantee that
|
|
|
|
|
the lower-level representation of a path object will be pre-computed,
|
|
|
|
|
potentially misleading users that there was no expensive computation
|
|
|
|
|
behind the scenes in case the attribute was implemented as a property.
|
|
|
|
|
|
|
|
|
|
This also indirectly ties into the idea of introducing a ``path``
|
|
|
|
|
attribute to accomplish the same thing. This idea has an added issue,
|
|
|
|
|
though, of accidentally having any object with a ``path`` attribute
|
|
|
|
|
meet the protocol's duck typing. Introducing a new magic method for
|
|
|
|
|
the protocol helpfully avoids any accidental opting into the protocol.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Provide specific type hinting support
|
|
|
|
|
-------------------------------------
|
|
|
|
|
|
|
|
|
|
There was some consideration to provdinga generic ``typing.PathLike``
|
|
|
|
|
class which would allow for e.g. ``typing.PathLike[str]`` to specify
|
|
|
|
|
a type hint for a path object which returned a string representation.
|
|
|
|
|
While potentially beneficial, the usefulness was deemed too small to
|
|
|
|
|
bother adding the type hint class.
|
|
|
|
|
|
|
|
|
|
This also removed any desire to have a class in the ``typing`` module
|
|
|
|
|
which represented the union of all acceptable path-representing types
|
|
|
|
|
as that can be represented with
|
|
|
|
|
``typing.Union[str, bytes, os.PathLike]`` easily enough and the hope
|
|
|
|
|
is users will slowly gravitate to path objects only.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Provide ``os.fspathb()``
|
|
|
|
|
------------------------
|
|
|
|
|
|
|
|
|
|
It was suggested that to mirror the structure of e.g.
|
|
|
|
|
``os.getcwd()``/``os.getcwdb()``, that ``os.fspath()`` only return
|
|
|
|
|
``str`` and that another function named ``os.fspathb()`` be
|
|
|
|
|
introduced that only returned ``bytes``. This was rejected as the
|
|
|
|
|
purposes of the ``*b()`` functions are tied to querying the file
|
|
|
|
|
system where there is a need to get the raw bytes back. As this PEP
|
|
|
|
|
does not work directly with data on a file system (but which *may*
|
|
|
|
|
be), the view was taken this distinction is unnecessary. It's also
|
|
|
|
|
believed that the need for only bytes will not be common enough to
|
|
|
|
|
need to support in such a specific manner as ``os.fsencode()`` will
|
|
|
|
|
provide similar functionality.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Call ``__fspath__()`` off of the instance
|
|
|
|
|
-----------------------------------------
|
|
|
|
|
|
|
|
|
|
An earlier draft of this PEP had ``os.fspath()`` calling
|
|
|
|
|
``path.__fspath__()`` instead of ``type(path).__fspath__(path)``. The
|
|
|
|
|
changed to be consistent with how other magic methods in Python are
|
|
|
|
|
resolved.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Acknowledgements
|
|
|
|
|
================
|
|
|
|
|
|
|
|
|
|
Thanks to everyone who participated in the various discussions related
|
|
|
|
|
to this PEP that spanned both python-ideas and python-dev. Special
|
|
|
|
|
thanks to Stephen Turnbull for direct feedback on early drafts of this
|
|
|
|
|
PEP. More special thanks to Koos Zevenhoven and Ethan Furman for not
|
|
|
|
|
only feedback on early drafts of this PEP but also helping to drive
|
|
|
|
|
the overall discussion on this topic across the two mailing lists.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
References
|
|
|
|
|
==========
|
|
|
|
|
|
|
|
|
|
.. [#python-ideas-archive] The python-ideas mailing list archive
|
|
|
|
|
(https://mail.python.org/pipermail/python-ideas/)
|
|
|
|
|
|
|
|
|
|
.. [#python-dev-archive] The python-dev mailing list archive
|
|
|
|
|
(https://mail.python.org/pipermail/python-dev/)
|
|
|
|
|
|
|
|
|
|
.. [#libc-open] ``open()`` documention for the C standard library
|
|
|
|
|
(http://www.gnu.org/software/libc/manual/html_node/Opening-and-Closing-Files.html)
|
|
|
|
|
|
|
|
|
|
.. [#pathlib] The ``pathlib`` module
|
|
|
|
|
(https://docs.python.org/3/library/pathlib.html#module-pathlib)
|
|
|
|
|
|
|
|
|
|
.. [#builtins-open] The ``builtins.open()`` function
|
|
|
|
|
(https://docs.python.org/3/library/functions.html#open)
|
|
|
|
|
|
|
|
|
|
.. [#os-fsencode] The ``os.fsencode()`` function
|
|
|
|
|
(https://docs.python.org/3/library/os.html#os.fsencode)
|
|
|
|
|
|
|
|
|
|
.. [#os-fsdecode] The ``os.fsdecode()`` function
|
|
|
|
|
(https://docs.python.org/3/library/os.html#os.fsdecode)
|
|
|
|
|
|
|
|
|
|
.. [#os-direntry] The ``os.DirEntry`` class
|
|
|
|
|
(https://docs.python.org/3/library/os.html#os.DirEntry)
|
|
|
|
|
|
|
|
|
|
.. [#os-path] The ``os.path`` module
|
|
|
|
|
(https://docs.python.org/3/library/os.path.html#module-os.path)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This document has been placed in the public domain.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
..
|
|
|
|
|
Local Variables:
|
|
|
|
|
mode: indented-text
|
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
|
sentence-end-double-space: t
|
|
|
|
|
fill-column: 70
|
|
|
|
|
coding: utf-8
|
|
|
|
|
End:
|