PEP 471: Ben Hoyt updates
parent
06c61b9447
commit
89ae8bb813
318
pep-0471.txt
|
@ -8,7 +8,7 @@ Type: Standards Track
|
|||
Content-Type: text/x-rst
|
||||
Created: 30-May-2014
|
||||
Python-Version: 3.5
|
||||
Post-History: 27-Jun-2014, 8-Jul-2014
|
||||
Post-History: 27-Jun-2014, 8-Jul-2014, 14-Jul-2014, 18-Jul-2014
|
||||
|
||||
|
||||
Abstract
|
||||
|
@ -16,9 +16,9 @@ Abstract
|
|||
|
||||
This PEP proposes including a new directory iteration function,
|
||||
``os.scandir()``, in the standard library. This new function adds
|
||||
useful functionality and increases the speed of ``os.walk()`` by 2-10
|
||||
times (depending on the platform and file system) by significantly
|
||||
reducing the number of times ``stat()`` needs to be called.
|
||||
useful functionality and increases the speed of ``os.walk()`` by 2-20
|
||||
times (depending on the platform and file system) by avoiding calls to
|
||||
``os.stat()`` in most cases.
|
||||
|
||||
|
||||
Rationale
|
||||
|
@ -34,8 +34,8 @@ But the underlying system calls -- ``FindFirstFile`` /
|
|||
``FindNextFile`` on Windows and ``readdir`` on POSIX systems --
|
||||
already tell you whether the files returned are directories or not, so
|
||||
no further system calls are needed. Further, the Windows system calls
|
||||
return all the information for a ``stat_result`` object, such as file
|
||||
size and last modification time.
|
||||
return all the information for a ``stat_result`` object on the directory
|
||||
entry, such as file size and last modification time.
|
||||
|
||||
In short, you can reduce the number of system calls required for a
|
||||
tree function like ``os.walk()`` from approximately 2N to N, where N
|
||||
|
@ -56,7 +56,7 @@ iterates instead of returning them as one big list. This improves
|
|||
memory efficiency for iterating very large directories.
|
||||
|
||||
So, as well as providing a ``scandir()`` iterator function for calling
|
||||
directly, Python's existing ``os.walk()`` function could be sped up a
|
||||
directly, Python's existing ``os.walk()`` function can be sped up a
|
||||
huge amount.
|
||||
|
||||
.. _`Issue 11406`: http://bugs.python.org/issue11406
|
||||
|
@ -67,7 +67,8 @@ Implementation
|
|||
|
||||
The implementation of this proposal was written by Ben Hoyt (initial
|
||||
version) and Tim Golden (who helped a lot with the C extension
|
||||
module). It lives on GitHub at `benhoyt/scandir`_.
|
||||
module). It lives on GitHub at `benhoyt/scandir`_. (The implementation
|
||||
may lag behind the updates to this PEP a little.)
|
||||
|
||||
.. _`benhoyt/scandir`: https://github.com/benhoyt/scandir
|
||||
|
||||
|
@ -82,67 +83,83 @@ the standard library, as well as integration into ``posixmodule.c``.
|
|||
Specifics of proposal
|
||||
=====================
|
||||
|
||||
os.scandir()
|
||||
------------
|
||||
|
||||
Specifically, this PEP proposes adding a single function to the ``os``
|
||||
module in the standard library, ``scandir``, that takes a single,
|
||||
optional string as its argument::
|
||||
|
||||
scandir(path='.') -> generator of DirEntry objects
|
||||
scandir(directory='.') -> generator of DirEntry objects
|
||||
|
||||
Like ``listdir``, ``scandir`` calls the operating system's directory
|
||||
iteration system calls to get the names of the files in the ``path``
|
||||
directory, but it's different from ``listdir`` in two ways:
|
||||
iteration system calls to get the names of the files in the given
|
||||
``directory``, but it's different from ``listdir`` in two ways:
|
||||
|
||||
* Instead of returning bare filename strings, it returns lightweight
|
||||
``DirEntry`` objects that hold the filename string and provide
|
||||
simple methods that allow access to the additional data the
|
||||
operating system returned.
|
||||
operating system may have returned.
|
||||
|
||||
* It returns a generator instead of a list, so that ``scandir`` acts
|
||||
as a true iterator instead of returning the full list immediately.
|
||||
|
||||
``scandir()`` yields a ``DirEntry`` object for each file and directory
|
||||
in ``path``. Just like ``listdir``, the ``'.'`` and ``'..'``
|
||||
pseudo-directories are skipped, and the entries are yielded in
|
||||
system-dependent order. Each ``DirEntry`` object has the following
|
||||
attributes and methods:
|
||||
``scandir()`` yields a ``DirEntry`` object for each file and
|
||||
sub-directory in ``directory``. Just like ``listdir``, the ``'.'``
|
||||
and ``'..'`` pseudo-directories are skipped, and the entries are
|
||||
yielded in system-dependent order. Each ``DirEntry`` object has the
|
||||
following attributes and methods:
|
||||
|
||||
* ``name``: the entry's filename, relative to the ``path`` argument
|
||||
(corresponds to the return values of ``os.listdir``)
|
||||
* ``name``: the entry's filename, relative to the ``directory``
|
||||
argument (corresponds to the return values of ``os.listdir``)
|
||||
|
||||
* ``full_name``: the entry's full path name -- the equivalent of
|
||||
``os.path.join(path, entry.name)``
|
||||
* ``path``: the entry's full path name (not necessarily an absolute
|
||||
path) -- the equivalent of ``os.path.join(directory, entry.name)``
|
||||
|
||||
* ``is_dir()``: like ``os.path.isdir()``, but much cheaper -- it never
|
||||
requires a system call on Windows, and usually doesn't on POSIX
|
||||
systems
|
||||
* ``is_dir(*, follow_symlinks=True)``: similar to
|
||||
``pathlib.Path.is_dir()``, but the return value is cached on the
|
||||
``DirEntry`` object; doesn't require a system call in most cases;
|
||||
don't follow symbolic links if ``follow_symlinks`` is False
|
||||
|
||||
* ``is_file()``: like ``os.path.isfile()``, but much cheaper -- it
|
||||
never requires a system call on Windows, and usually doesn't on
|
||||
POSIX systems
|
||||
* ``is_file(*, follow_symlinks=True)``: similar to
|
||||
``pathlib.Path.is_file()``, but the return value is cached on the
|
||||
``DirEntry`` object; doesn't require a system call in most cases;
|
||||
don't follow symbolic links if ``follow_symlinks`` is False
|
||||
|
||||
* ``is_symlink()``: like ``os.path.islink()``, but much cheaper -- it
|
||||
never requires a system call on Windows, and usually doesn't on
|
||||
POSIX systems
|
||||
* ``is_symlink()``: similar to ``pathlib.Path.is_symlink()``, but the
|
||||
return value is cached on the ``DirEntry`` object; doesn't require a
|
||||
system call in most cases
|
||||
|
||||
* ``lstat()``: like ``os.lstat()``, but much cheaper on some systems
|
||||
-- it only requires a system call on POSIX systems
|
||||
* ``stat(*, follow_symlinks=True)``: like ``os.stat()``, but the
|
||||
return value is cached on the ``DirEntry`` object; does not require a
|
||||
system call on Windows (except for symlinks); don't follow symbolic links
|
||||
(like ``os.lstat()``) if ``follow_symlinks`` is False
|
||||
|
||||
The ``is_X`` methods may perform a ``stat()`` call under certain
|
||||
conditions (for example, on certain file systems on POSIX systems),
|
||||
and therefore possibly raise ``OSError``. The ``lstat()`` method will
|
||||
call ``stat()`` on POSIX systems and therefore also possibly raise
|
||||
``OSError``. See the "Notes on exception handling" section for more
|
||||
details.
|
||||
All *methods* may perform system calls in some cases and therefore
|
||||
possibly raise ``OSError`` -- see the "Notes on exception handling"
|
||||
section for more details.
|
||||
|
||||
The ``DirEntry`` attribute and method names were chosen to be the same
|
||||
as those in the new ``pathlib`` module for consistency.
|
||||
as those in the new ``pathlib`` module where possible, for
|
||||
consistency. The only difference in functionality is that the
|
||||
``DirEntry`` methods cache their values on the entry object after the
|
||||
first call.
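As a rough illustration of the attributes and methods listed above, the sketch below classifies each entry in a directory. It assumes the API as it eventually shipped in Python 3.5's ``os`` module; the helper name ``describe_entries`` and the demo file names are hypothetical, chosen only for this example.

```python
import os
import tempfile


def describe_entries(directory='.'):
    """Return sorted (name, path, kind) tuples for each entry in directory.

    describe_entries is a hypothetical helper, not part of the proposal.
    """
    rows = []
    for entry in os.scandir(directory):
        # is_symlink() never follows links; is_dir()/is_file() follow
        # them by default (follow_symlinks=True).
        if entry.is_symlink():
            kind = 'symlink'
        elif entry.is_dir():
            kind = 'dir'
        elif entry.is_file():
            kind = 'file'
        else:
            kind = 'other'
        rows.append((entry.name, entry.path, kind))
    return sorted(rows)


# Small demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'sub'))
    with open(os.path.join(tmp, 'a.txt'), 'w') as f:
        f.write('hello')
    demo_rows = describe_entries(tmp)
    demo_kinds = [(name, kind) for name, _path, kind in demo_rows]
    demo_paths_ok = all(path == os.path.join(tmp, name)
                        for name, path, _kind in demo_rows)
```

Here ``entry.path`` is simply ``directory`` joined with ``entry.name``, so it is absolute only when the argument passed to ``scandir()`` was.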
|
||||
|
||||
Like the other functions in the ``os`` module, ``scandir()`` accepts
|
||||
either a bytes or str object for the ``path`` parameter, and returns
|
||||
the ``DirEntry.name`` and ``DirEntry.full_name`` attributes with the
|
||||
same type as ``path``. However, it is *strongly recommended* to use
|
||||
the str type, as this ensures cross-platform support for Unicode
|
||||
filenames.
|
||||
either a bytes or str object for the ``directory`` parameter, and
|
||||
returns the ``DirEntry.name`` and ``DirEntry.path`` attributes with
|
||||
the same type as ``directory``. However, it is *strongly recommended*
|
||||
to use the str type, as this ensures cross-platform support for
|
||||
Unicode filenames. (On Windows, bytes filenames have been deprecated
|
||||
since Python 3.3).
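The type-matching rule can be checked with a short sketch like the one below. It assumes the function as shipped in Python 3.5 and later, and uses ``os.fsencode()`` to build the bytes form of the same directory name; the file name is made up for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'example.txt'), 'w') as f:
        f.write('x')

    # str in, str out
    str_entry = list(os.scandir(tmp))[0]
    str_types = (type(str_entry.name), type(str_entry.path))

    # bytes in, bytes out (same directory, encoded with the
    # file system encoding)
    bytes_entry = list(os.scandir(os.fsencode(tmp)))[0]
    bytes_types = (type(bytes_entry.name), type(bytes_entry.path))
```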
|
||||
|
||||
os.walk()
|
||||
---------
|
||||
|
||||
As part of this proposal, ``os.walk()`` will also be modified to use
|
||||
``scandir()`` rather than ``listdir()`` and ``os.path.isdir()``. This
|
||||
will increase the speed of ``os.walk()`` very significantly (as
|
||||
mentioned above, by 2-20 times, depending on the system).
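To give a feel for the change, here is a heavily reduced sketch of a top-down walk built on ``scandir()``. It is not the standard library implementation: ``simple_walk`` is a hypothetical name, and the sketch omits ``topdown``, ``onerror``, ``followlinks`` and in-place pruning of the directory list.

```python
import os
import tempfile


def simple_walk(top):
    """Yield (dirpath, dirnames, filenames) top-down, never descending
    into symlinked directories. A sketch only, with no error handling.
    """
    dirnames, filenames, subdirs = [], [], []
    for entry in os.scandir(top):
        if entry.is_dir():
            dirnames.append(entry.name)
            # Symlinks to directories are listed but not descended into.
            if not entry.is_symlink():
                subdirs.append(entry.path)
        else:
            filenames.append(entry.name)
    yield top, dirnames, filenames
    for path in subdirs:
        yield from simple_walk(path)


def normalized(walk_result):
    """Sort a walk result so two walks can be compared."""
    return sorted((d, sorted(ds), sorted(fs)) for d, ds, fs in walk_result)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'a', 'b'))
    for rel in ('top.txt',
                os.path.join('a', 'mid.txt'),
                os.path.join('a', 'b', 'leaf.txt')):
        with open(os.path.join(tmp, rel), 'w') as f:
            f.write('x')
    ours = normalized(simple_walk(tmp))
    theirs = normalized(os.walk(tmp))
```

The point of the sketch is that the directory/file split needs no separate ``os.path.isdir()`` call per name; the entry usually already knows its type.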
|
||||
|
||||
|
||||
Examples
|
||||
|
@ -154,7 +171,7 @@ uses it::
|
|||
|
||||
dirs = []
|
||||
non_dirs = []
|
||||
for entry in os.scandir(path):
|
||||
for entry in os.scandir(directory):
|
||||
if entry.is_dir():
|
||||
dirs.append(entry)
|
||||
else:
|
||||
|
@ -165,19 +182,25 @@ scandir than ``os.listdir()`` and ``os.path.isdir()`` on both Windows
|
|||
and POSIX systems.
|
||||
|
||||
Or, for getting the total size of files in a directory tree, showing
|
||||
use of the ``DirEntry.lstat()`` method and ``DirEntry.full_name``
|
||||
use of the ``DirEntry.stat()`` method and ``DirEntry.path``
|
||||
attribute::
|
||||
|
||||
def get_tree_size(path):
|
||||
"""Return total size of files in path and subdirs."""
|
||||
def get_tree_size(directory):
|
||||
"""Return total size of files in directory and subdirs."""
|
||||
total = 0
|
||||
for entry in os.scandir(path):
|
||||
if entry.is_dir():
|
||||
total += get_tree_size(entry.full_name)
|
||||
for entry in os.scandir(directory):
|
||||
if entry.is_dir(follow_symlinks=False):
|
||||
total += get_tree_size(entry.path)
|
||||
else:
|
||||
total += entry.lstat().st_size
|
||||
total += entry.stat(follow_symlinks=False).st_size
|
||||
return total
|
||||
|
||||
This also shows the use of the ``follow_symlinks`` parameter to
|
||||
``is_dir()`` -- in a recursive function like this, we probably don't
|
||||
want to follow links. (To properly follow links in a recursive
|
||||
function like this we'd want special handling for the case where
|
||||
following a symlink leads to a recursive loop.)
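One possible shape for that special handling, offered only as a sketch: remember each visited directory by its ``(st_dev, st_ino)`` pair and skip any directory already seen. The function name and the ``_seen`` parameter are hypothetical, the sketch has no error handling (a broken symlink would raise ``OSError`` from ``stat()``), and the demo assumes a platform where ``os.symlink()`` is permitted.

```python
import os
import tempfile


def get_tree_size_following_links(directory, _seen=None):
    """Return total size of files under directory, following symlinks
    but visiting each real directory at most once.
    """
    if _seen is None:
        _seen = set()
    st = os.stat(directory)            # follows links to the real directory
    key = (st.st_dev, st.st_ino)
    if key in _seen:                   # already visited: a loop or a repeat
        return 0
    _seen.add(key)

    total = 0
    for entry in os.scandir(directory):
        if entry.is_dir():             # follow_symlinks=True by default
            total += get_tree_size_following_links(entry.path, _seen)
        else:
            total += entry.stat().st_size
    return total


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'a.txt'), 'w') as f:
        f.write('hello')               # 5 bytes
    os.mkdir(os.path.join(tmp, 'sub'))
    # sub/loop points back at the top directory, creating a cycle.
    os.symlink(tmp, os.path.join(tmp, 'sub', 'loop'),
               target_is_directory=True)
    loop_safe_total = get_tree_size_following_links(tmp)
```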
|
||||
|
||||
Note that ``get_tree_size()`` will get a huge speed boost on Windows,
|
||||
because no extra stat calls are needed, but on POSIX systems the size
|
||||
information is not returned by the directory iteration functions, so
|
||||
|
@ -188,10 +211,10 @@ Notes on caching
|
|||
----------------
|
||||
|
||||
The ``DirEntry`` objects are relatively dumb -- the ``name`` and
|
||||
``full_name`` attributes are obviously always cached, and the ``is_X``
|
||||
and ``lstat`` methods cache their values (immediately on Windows via
|
||||
``path`` attributes are obviously always cached, and the ``is_X``
|
||||
and ``stat`` methods cache their values (immediately on Windows via
|
||||
``FindNextFile``, and on first use on POSIX systems via a ``stat``
|
||||
call) and never refetch from the system.
|
||||
system call) and never refetch from the system.
|
||||
|
||||
For this reason, ``DirEntry`` objects are intended to be used and
|
||||
thrown away after iteration, not stored in long-lived data structures
|
||||
|
@ -199,50 +222,61 @@ and the methods called again and again.
|
|||
|
||||
If developers want "refresh" behaviour (for example, for watching a
|
||||
file's size change), they can simply use ``pathlib.Path`` objects,
|
||||
or call the regular ``os.lstat()`` or ``os.path.getsize()`` functions
|
||||
or call the regular ``os.stat()`` or ``os.path.getsize()`` functions
|
||||
which get fresh data from the operating system every call.
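The difference between the cached and the fresh value can be seen in a small sketch like this one, which assumes the behaviour described in this section as implemented in Python 3.5 and later; the file name is made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, 'data.bin')
    with open(file_path, 'wb') as f:
        f.write(b'12345')

    entry = list(os.scandir(tmp))[0]
    size_before = entry.stat().st_size        # fetched (or already known) and cached

    with open(file_path, 'ab') as f:
        f.write(b'67890')                     # file grows to 10 bytes

    cached_size = entry.stat().st_size        # still the cached value
    fresh_size = os.stat(entry.path).st_size  # asks the operating system again
```

After the append, ``entry.stat()`` keeps reporting the old size while ``os.stat()`` reports the new one, which is why long-lived ``DirEntry`` objects are a poor fit for watching files.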
|
||||
|
||||
|
||||
Notes on exception handling
|
||||
---------------------------
|
||||
|
||||
``DirEntry.is_X()`` and ``DirEntry.lstat()`` are explicitly methods
|
||||
``DirEntry.is_X()`` and ``DirEntry.stat()`` are explicitly methods
|
||||
rather than attributes or properties, to make it clear that they may
|
||||
not be cheap operations, and they may do a system call. As a result,
|
||||
these methods may raise ``OSError``.
|
||||
not be cheap operations (although they often are), and they may do a
|
||||
system call. As a result, these methods may raise ``OSError``.
|
||||
|
||||
For example, ``DirEntry.lstat()`` will always make a system call on
|
||||
For example, ``DirEntry.stat()`` will always make a system call on
|
||||
POSIX-based systems, and the ``DirEntry.is_X()`` methods will make a
|
||||
``stat()`` system call on such systems if ``readdir()`` returns a
|
||||
``d_type`` with a value of ``DT_UNKNOWN``, which can occur under
|
||||
certain conditions or on certain file systems.
|
||||
``stat()`` system call on such systems if ``readdir()`` does not
|
||||
support ``d_type`` or returns a ``d_type`` with a value of
|
||||
``DT_UNKNOWN``, which can occur under certain conditions or on
|
||||
certain file systems.
|
||||
|
||||
For this reason, when a user requires fine-grained error handling,
|
||||
it's good to catch ``OSError`` around these method calls and then
|
||||
handle as appropriate.
|
||||
Often this does not matter -- for example, ``os.walk()`` as defined in
|
||||
the standard library only catches errors around the ``listdir()``
|
||||
calls.
|
||||
|
||||
Also, because the exception-raising behaviour of the ``DirEntry.is_X``
|
||||
methods matches that of ``pathlib`` -- which only raises ``OSError``
|
||||
in the case of permissions or other fatal errors, but returns False
|
||||
if the path doesn't exist or is a broken symlink -- it's often
|
||||
not necessary to catch errors around the ``is_X()`` calls.
|
||||
|
||||
However, when a user requires fine-grained error handling, it may be
|
||||
desirable to catch ``OSError`` around all method calls and handle as
|
||||
appropriate.
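A broken symlink is a convenient way to see both behaviours side by side. The sketch below assumes the semantics described above as implemented in Python 3.5 and later, and a platform where ``os.symlink()`` is permitted; the names ``missing`` and ``dangling`` are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a symlink whose target does not exist.
    os.symlink(os.path.join(tmp, 'missing'), os.path.join(tmp, 'dangling'))
    entry = list(os.scandir(tmp))[0]

    # The is_X() methods report False rather than raising.
    is_symlink = entry.is_symlink()
    is_dir = entry.is_dir()
    is_file = entry.is_file()

    # stat() follows the link by default, so it raises for a broken one.
    try:
        entry.stat()
        stat_error = None
    except OSError as error:
        stat_error = error

    # Asking about the link itself succeeds.
    link_stat = entry.stat(follow_symlinks=False)
```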
|
||||
|
||||
For example, below is a version of the ``get_tree_size()`` example
|
||||
shown above, but with basic error handling added::
|
||||
shown above, but with fine-grained error handling added::
|
||||
|
||||
def get_tree_size(path):
|
||||
"""Return total size of files in path and subdirs. If
|
||||
is_dir() or lstat() fails, print an error message to stderr
|
||||
def get_tree_size(directory):
|
||||
"""Return total size of files in directory and subdirs. If
|
||||
is_dir() or stat() fails, print an error message to stderr
|
||||
and assume zero size (for example, file has been deleted).
|
||||
"""
|
||||
total = 0
|
||||
for entry in os.scandir(path):
|
||||
for entry in os.scandir(directory):
|
||||
try:
|
||||
is_dir = entry.is_dir()
|
||||
is_dir = entry.is_dir(follow_symlinks=False)
|
||||
except OSError as error:
|
||||
print('Error calling is_dir():', error, file=sys.stderr)
|
||||
continue
|
||||
if is_dir:
|
||||
total += get_tree_size(entry.full_name)
|
||||
total += get_tree_size(entry.path)
|
||||
else:
|
||||
try:
|
||||
total += entry.lstat().st_size
|
||||
total += entry.stat(follow_symlinks=False).st_size
|
||||
except OSError as error:
|
||||
print('Error calling lstat():', error, file=sys.stderr)
|
||||
print('Error calling stat():', error, file=sys.stderr)
|
||||
return total
|
||||
|
||||
|
||||
|
@ -316,6 +350,12 @@ For example:
|
|||
Seems pretty solid, so first thing, just want to say nice work!"
|
||||
[via personal email]
|
||||
|
||||
* Matt Z: "I used scandir to dump the contents of a network dir in
|
||||
under 15 seconds. 13 root dirs, 60,000 files in the structure. This
|
||||
will replace some old VBA code embedded in a spreadsheet that was
|
||||
taking 15-20 minutes to do the exact same thing." [via personal
|
||||
email]
|
||||
|
||||
Others have `requested a PyPI package`_ for it, which has been
|
||||
created. See `PyPI package`_.
|
||||
|
||||
|
@ -331,13 +371,11 @@ of July 7, 2014:
|
|||
* Forks: 20
|
||||
* Issues: 4 open, 26 closed
|
||||
|
||||
**However, the much larger point is this:**, if this PEP is accepted,
|
||||
``os.walk()`` can easily be reimplemented using ``scandir`` rather
|
||||
than ``listdir`` and ``stat``, increasing the speed of ``os.walk()``
|
||||
very significantly. There are thousands of developers, scripts, and
|
||||
production code that would benefit from this large speedup of
|
||||
``os.walk()``. For example, on GitHub, there are almost as many uses
|
||||
of ``os.walk`` (194,000) as there are of ``os.mkdir`` (230,000).
|
||||
Also, because this PEP will increase the speed of ``os.walk()``
|
||||
significantly, there are thousands of developers and scripts, and a lot
|
||||
of production code, that would benefit from it. For example, on GitHub,
|
||||
there are almost as many uses of ``os.walk`` (194,000) as there are of
|
||||
``os.mkdir`` (230,000).
|
||||
|
||||
|
||||
Rejected ideas
|
||||
|
@ -392,12 +430,51 @@ and this `June 2014 python-dev thread on PEP 471
|
|||
<https://mail.python.org/pipermail/python-dev/2014-June/135217.html>`_.
|
||||
|
||||
|
||||
Methods not following symlinks by default
|
||||
-----------------------------------------
|
||||
|
||||
There was much debate on python-dev (see messages in `this thread
|
||||
<https://mail.python.org/pipermail/python-dev/2014-July/135485.html>`_)
|
||||
over whether the ``DirEntry`` methods should follow symbolic links or
|
||||
not (when the ``is_X()`` methods had no ``follow_symlinks`` parameter).
|
||||
|
||||
Initially they did not (see previous versions of this PEP and the
|
||||
scandir.py module), but Victor Stinner made a pretty compelling case on
|
||||
python-dev that following symlinks by default is a better idea, because:
|
||||
|
||||
* following links is usually what you want (in 92% of cases in the
|
||||
standard library, functions using ``os.listdir()`` and
|
||||
``os.path.isdir()`` do follow symlinks)
|
||||
|
||||
* that's the precedent set by the similar functions
|
||||
``os.path.isdir()`` and ``pathlib.Path.is_dir()``, so to do
|
||||
otherwise would be confusing
|
||||
|
||||
* with the non-link-following approach, if you wanted to follow links
|
||||
you'd have to say something like ``if (entry.is_symlink() and
|
||||
os.path.isdir(entry.path)) or entry.is_dir()``, which is clumsy
|
||||
|
||||
As a case in point that shows the non-symlink-following version is
|
||||
error prone, this PEP's author had a bug caused by getting this
|
||||
exact test wrong in his initial implementation of ``scandir.walk()``
|
||||
in scandir.py (see `Issue #4 here
|
||||
<https://github.com/benhoyt/scandir/issues/4>`_).
|
||||
|
||||
In the end there was not total agreement that the methods should
|
||||
follow symlinks, but there was basic consensus among the most involved
|
||||
participants, and this PEP's author believes that the above case is
|
||||
strong enough to warrant following symlinks by default.
|
||||
|
||||
In addition, it's straightforward to call the relevant methods with
|
||||
``follow_symlinks=False`` if the other behaviour is desired.
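The three variants discussed in this section can be compared directly on a symlink that points at a directory. This is a sketch assuming the final ``follow_symlinks`` parameter as implemented in Python 3.5 and later, on a platform where ``os.symlink()`` is permitted; the names ``real`` and ``link`` are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'real'))
    os.symlink(os.path.join(tmp, 'real'), os.path.join(tmp, 'link'),
               target_is_directory=True)
    entries = {entry.name: entry for entry in os.scandir(tmp)}
    link = entries['link']

    # Default: follow the link, like os.path.isdir().
    default_result = link.is_dir()

    # Opt out: ask about the link itself.
    no_follow_result = link.is_dir(follow_symlinks=False)

    # The clumsy test that would be needed if the methods never
    # followed links (here emulated with follow_symlinks=False).
    clumsy_result = ((link.is_symlink() and os.path.isdir(link.path))
                     or link.is_dir(follow_symlinks=False))
```

The default call and the clumsy expression agree, which is the case made above for following links by default.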
|
||||
|
||||
|
||||
DirEntry attributes being properties
|
||||
------------------------------------
|
||||
|
||||
In some ways it would be nicer for the ``DirEntry`` ``is_X()`` and
|
||||
``lstat()`` to be properties instead of methods, to indicate they're
|
||||
very cheap or free. However, this isn't quite the case, as ``lstat()``
|
||||
``stat()`` to be properties instead of methods, to indicate they're
|
||||
very cheap or free. However, this isn't quite the case, as ``stat()``
|
||||
will require an OS call on POSIX-based systems but not on Windows.
|
||||
Even ``is_dir()`` and friends may perform an OS call on POSIX-based
|
||||
systems if the ``dirent.d_type`` value is ``DT_UNKNOWN`` (on certain
|
||||
|
@ -422,8 +499,8 @@ In `this July 2014 python-dev message
|
|||
<https://mail.python.org/pipermail/python-dev/2014-July/135303.html>`_,
|
||||
Paul Moore suggested a solution that was a "thin wrapper round the OS
|
||||
feature", where the ``DirEntry`` object had only static attributes:
|
||||
``name``, ``full_name``, and ``is_X``, with the ``st_X`` attributes
|
||||
only present on Windows. The idea was to use this simpler, lower-level
|
||||
``name``, ``path``, and ``is_X``, with the ``st_X`` attributes only
|
||||
present on Windows. The idea was to use this simpler, lower-level
|
||||
function as a building block for higher-level functions.
|
||||
|
||||
At first there was general agreement that simplifying in this way was
|
||||
|
@ -459,19 +536,24 @@ because ``stat()`` will be called (and hence potentially raise
|
|||
``OSError``) during iteration, leading to a rather ugly, hand-made
|
||||
iteration loop::
|
||||
|
||||
it = os.scandir(path)
|
||||
it = os.scandir(directory)
|
||||
while True:
|
||||
try:
|
||||
entry = next(it)
|
||||
except OSError as error:
|
||||
handle_error(path, error)
|
||||
handle_error(directory, error)
|
||||
except StopIteration:
|
||||
break
|
||||
|
||||
Or it means that ``scandir()`` would have to accept an ``onerror``
|
||||
argument -- a function to call when ``stat()`` errors occur during
|
||||
iteration. This seems to this PEP's author neither as direct nor as
|
||||
Pythonic as ``try``/``except`` around a ``DirEntry.lstat()`` call.
|
||||
Pythonic as ``try``/``except`` around a ``DirEntry.stat()`` call.
|
||||
|
||||
Another drawback is that ``os.scandir()`` is written to make code faster.
|
||||
Always calling ``os.lstat()`` on POSIX would not bring any speedup. In most
|
||||
cases, you don't need the full ``stat_result`` object -- the ``is_X()``
|
||||
methods are enough and this information is already known.
|
||||
|
||||
See `Ben Hoyt's July 2014 reply
|
||||
<https://mail.python.org/pipermail/python-dev/2014-July/135312.html>`_
|
||||
|
@ -513,7 +595,7 @@ Return values being overloaded stat_result objects
|
|||
--------------------------------------------------
|
||||
|
||||
Another alternative discussed was for the return values to be
|
||||
overloaded ``stat_result`` objects with ``name`` and ``full_name``
|
||||
overloaded ``stat_result`` objects with ``name`` and ``path``
|
||||
attributes. However, apart from this being a strange (and strained!)
|
||||
kind of overloading, this has the same problems mentioned above --
|
||||
most of the ``stat_result`` information is not fetched by
|
||||
|
@ -526,15 +608,15 @@ Return values being pathlib.Path objects
|
|||
With Antoine Pitrou's new standard library ``pathlib`` module, it
|
||||
at first seems like a great idea for ``scandir()`` to return instances
|
||||
of ``pathlib.Path``. However, ``pathlib.Path``'s ``is_X()`` and
|
||||
``lstat()`` functions are explicitly not cached, whereas ``scandir``
|
||||
``stat()`` functions are explicitly not cached, whereas ``scandir``
|
||||
has to cache them by design, because it's (often) returning values
|
||||
from the original directory iteration system call.
|
||||
|
||||
And if the ``pathlib.Path`` instances returned by ``scandir`` cached
|
||||
lstat values, but the ordinary ``pathlib.Path`` objects explicitly
|
||||
stat values, but the ordinary ``pathlib.Path`` objects explicitly
|
||||
don't, that would be more than a little confusing.
|
||||
|
||||
Guido van Rossum explicitly rejected ``pathlib.Path`` caching lstat in
|
||||
Guido van Rossum explicitly rejected ``pathlib.Path`` caching stat in
|
||||
the context of scandir `here
|
||||
<https://mail.python.org/pipermail/python-dev/2013-November/130583.html>`_,
|
||||
making ``pathlib.Path`` objects a bad choice for scandir return
|
||||
|
@ -564,35 +646,45 @@ here is a short list of some this PEP's author has in mind:
|
|||
Previous discussion
|
||||
===================
|
||||
|
||||
* `Original thread Ben Hoyt started on python-ideas`_ about speeding
|
||||
up ``os.walk()``
|
||||
* `Original November 2012 thread Ben Hoyt started on python-ideas
|
||||
<https://mail.python.org/pipermail/python-ideas/2012-November/017770.html>`_
|
||||
about speeding up ``os.walk()``
|
||||
|
||||
* Python `Issue 11406`_, which includes the original proposal for a
|
||||
scandir-like function
|
||||
|
||||
* `Further thread Ben Hoyt started on python-dev`_ that refined the
|
||||
``scandir()`` API, including Nick Coghlan's suggestion of scandir
|
||||
yielding ``DirEntry``-like objects
|
||||
* `Further May 2013 thread Ben Hoyt started on python-dev
|
||||
<https://mail.python.org/pipermail/python-dev/2013-May/126119.html>`_
|
||||
that refined the ``scandir()`` API, including Nick Coghlan's
|
||||
suggestion of scandir yielding ``DirEntry``-like objects
|
||||
|
||||
* `Another thread Ben Hoyt started on python-dev`_ to discuss the
|
||||
interaction between scandir and the new ``pathlib`` module
|
||||
* `November 2013 thread Ben Hoyt started on python-dev
|
||||
<https://mail.python.org/pipermail/python-dev/2013-November/130572.html>`_
|
||||
to discuss the interaction between scandir and the new ``pathlib``
|
||||
module
|
||||
|
||||
* `Final thread Ben Hoyt started on python-dev`_ to discuss the first
|
||||
version of this PEP, with extensive discussion about the API.
|
||||
* `June 2014 thread Ben Hoyt started on python-dev
|
||||
<https://mail.python.org/pipermail/python-dev/2014-June/135215.html>`_
|
||||
to discuss the first version of this PEP, with extensive discussion
|
||||
about the API
|
||||
|
||||
* `Question on StackOverflow`_ about why ``os.walk()`` is slow and
|
||||
pointers on how to fix it (this inspired the author of this PEP
|
||||
early on)
|
||||
* `First July 2014 thread Ben Hoyt started on python-dev
|
||||
<https://mail.python.org/pipermail/python-dev/2014-July/135377.html>`_
|
||||
to discuss his updates to PEP 471
|
||||
|
||||
* `BetterWalk`_, this PEP's author's previous attempt at this, on
|
||||
which the scandir code is based
|
||||
* `Second July 2014 thread Ben Hoyt started on python-dev
|
||||
<https://mail.python.org/pipermail/python-dev/2014-July/135485.html>`_
|
||||
to discuss the remaining decisions needed to finalize PEP 471,
|
||||
specifically whether the ``DirEntry`` methods should follow symlinks
|
||||
by default
|
||||
|
||||
.. _`Original thread Ben Hoyt started on python-ideas`: https://mail.python.org/pipermail/python-ideas/2012-November/017770.html
|
||||
.. _`Further thread Ben Hoyt started on python-dev`: https://mail.python.org/pipermail/python-dev/2013-May/126119.html
|
||||
.. _`Another thread Ben Hoyt started on python-dev`: https://mail.python.org/pipermail/python-dev/2013-November/130572.html
|
||||
.. _`Final thread Ben Hoyt started on python-dev`: https://mail.python.org/pipermail/python-dev/2014-June/135215.html
|
||||
.. _`Question on StackOverflow`: http://stackoverflow.com/questions/2485719/very-quickly-getting-total-size-of-folder
|
||||
.. _`BetterWalk`: https://github.com/benhoyt/betterwalk
|
||||
* `Question on StackOverflow
|
||||
<http://stackoverflow.com/questions/2485719/very-quickly-getting-total-size-of-folder>`_
|
||||
about why ``os.walk()`` is slow and pointers on how to fix it (this
|
||||
inspired the author of this PEP early on)
|
||||
|
||||
* `BetterWalk <https://github.com/benhoyt/betterwalk>`_, this PEP's
|
||||
author's previous attempt at this, on which the scandir code is based
|
||||
|
||||
|
||||
Copyright
|
||||
|
|