PEP 706: Add some notes, remove open issue, fix mistakes & formatting (#3023)

* Add some notes, remove open issue, fix mistakes & formatting

- Remove Open Issue “How far should this be backported?”.
  That'll be up to release managers, no need to put it in the PEP.
- Add a section on adding filters to zipfile. Not *quite* a
  “rejected idea”, but IMO the PEP is good enough without it.
- Add a note on a registration mechanism
- Fix a few obvious mistakes, typos, formatting

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Petr Viktorin 2023-02-22 17:19:58 +01:00 committed by GitHub
parent 3eb648d9b4
commit 706b7aa2e1
1 changed file with 55 additions and 10 deletions

@@ -296,9 +296,9 @@ that will be used if the ``filter`` argument is missing or ``None``.
If both the argument and attribute are ``None``:
-* In Python 3.12-3.13, a ``DeprecationWarning`` will be emitted and
-extraction will use the ``'fully_trusted'`` filter.
-* In Python 3.14+, it will use the ``'data'`` filter.
+* In Python 3.12-3.13, a ``DeprecationWarning`` will be emitted and
+extraction will use the ``'fully_trusted'`` filter.
+* In Python 3.14+, it will use the ``'data'`` filter.
Applications and system integrators may wish to change ``extraction_filter``
of the ``TarFile`` class itself to set a global default.
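
As a minimal sketch of the global default described in this hunk (not part of the PEP text), an application could set ``TarFile.extraction_filter`` once at start-up. The helper name ``set_default_extraction_filter`` is hypothetical, and the ``hasattr`` check assumes the code may also run on Python versions without filter support.

```python
import tarfile


def set_default_extraction_filter():
    """Make 'data' the process-wide default filter, if filters are supported.

    Hypothetical helper; returns True if the default was changed.
    """
    if not hasattr(tarfile, "data_filter"):
        # Python without extraction filters: nothing to configure.
        return False
    # staticmethod() keeps the function from being bound as a method
    # (and receiving the TarFile instance as an extra first argument)
    # when it is looked up through a TarFile instance.
    tarfile.TarFile.extraction_filter = staticmethod(tarfile.data_filter)
    return True
```

Calling the helper once, before any archive is opened, should be enough; an explicit ``filter`` argument passed to ``extract()`` or ``extractall()`` still takes precedence over the attribute.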
@@ -343,7 +343,7 @@ New docs will tell users to consider:
* checking that filenames have expected extensions (discouraging files that
execute when you “click on them”, or extension-less files like Windows
special device names),
-* limiting the number of extracted, files total size of extracted data,
+* limiting the number of extracted files, total size of extracted data,
and size of individual files,
* checking for files that would be shadowed on case-insensitive filesystems.
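
Some of the checks in this list can be sketched as a scan over member headers before anything is extracted. This is only an illustration under stated assumptions: ``check_archive`` and ``ArchiveLimitError`` are hypothetical names, the limits are arbitrary placeholders, and ``str.casefold()`` is merely an approximation of how a case-insensitive filesystem compares names.

```python
import tarfile


class ArchiveLimitError(Exception):
    """Hypothetical error for archives that exceed a configured limit."""


def check_archive(tar, max_members=10_000, max_member_size=100 * 2**20,
                  max_total_size=2**30):
    """Scan member headers and raise ArchiveLimitError on a violation.

    All limits are illustrative defaults, not recommendations.
    """
    total_size = 0
    seen = {}  # case-folded name -> first original name with that folding
    for count, member in enumerate(tar, start=1):
        if count > max_members:
            raise ArchiveLimitError(f"more than {max_members} members")
        if member.size > max_member_size:
            raise ArchiveLimitError(f"{member.name!r} is too large")
        total_size += member.size
        if total_size > max_total_size:
            raise ArchiveLimitError("total size limit exceeded")
        folded = member.name.casefold()
        first = seen.setdefault(folded, member.name)
        if first != member.name:
            raise ArchiveLimitError(
                f"{member.name!r} would shadow {first!r} on a "
                "case-insensitive filesystem")
```

The function only reads headers, so it would typically be called on an open ``TarFile`` right before ``extractall()``.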
@@ -385,11 +385,12 @@ tarfile CLI
-----------
The CLI (``python -m tarfile``) will gain a ``--filter`` option
-that will take the nams of one of the provided default filters.
+that will take the name of one of the provided default filters.
It won't be possible to specify a custom filter function.
If ``--filter`` is not given, the CLI will use the default filter
-(``'legacy_warning'`` for a deprecation period, then ``'data'``).
+(``'fully_trusted'`` with a deprecation warning now, and ``'data'`` from
+Python 3.14 on).
There will be no short option. (``-f`` would be confusingly similar to
the filename option of GNU ``tar``.)
@@ -417,7 +418,9 @@ gains a ``filter`` argument, if it ever does).
If ``filter`` is not specified (or left as ``None``), it won't be passed
on, so extracting a tarball will use the default filter
-(``'legacy_warning'`` for a deprecation period, then ``'data'``).
+(``'fully_trusted'`` with a deprecation warning now, and ``'data'`` from
+Python 3.14 on).
Complex filters
---------------
@@ -433,6 +436,14 @@ For example, with a hypothetical ``StatefulFilter`` users would write::
A simple ``StatefulFilter`` example will be added to the docs.
+.. note::
+The need for stateful filters is a reason against allowing
+registration of custom filter names in addition to ``'fully_trusted'``,
+``'tar'`` and ``'data'``.
+With such a mechanism, API for (at least) set-up and tear-down would need
+to be set in stone.
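
To illustrate the set-up and tear-down that the note refers to, here is a minimal sketch of a stateful filter written as a context manager. ``CountingFilter`` and ``TooManyMembers`` are hypothetical names, not part of the PEP or of ``tarfile``; the sketch delegates to ``tarfile.data_filter`` where it exists and otherwise passes members through unchanged.

```python
import tarfile


class TooManyMembers(Exception):
    """Hypothetical error raised when the member limit is exceeded."""


class CountingFilter:
    """Hypothetical stateful filter that counts the members it is given."""

    def __init__(self, max_members=10_000):
        self.max_members = max_members
        self.file_count = 0

    def __enter__(self):
        # Set-up hook: reset the per-extraction state.
        self.file_count = 0
        return self

    def __exit__(self, *exc_info):
        # Tear-down hook: this sketch holds no resources to release.
        return False

    def __call__(self, member, path):
        self.file_count += 1
        if self.file_count > self.max_members:
            raise TooManyMembers(f"more than {self.max_members} members")
        # Delegate the actual safety checks to the 'data' filter when the
        # running Python provides it.
        data_filter = getattr(tarfile, "data_filter", None)
        return data_filter(member, path) if data_filter else member
```

On a Python with filter support, such an object would be used as ``with CountingFilter() as filter_func:`` followed by ``tar.extractall(path, filter=filter_func)``.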
Backwards Compatibility
=======================
@@ -564,10 +575,44 @@ Feature-wise, *tar format* and *UNIX-like filesystem* are essentially
equivalent, so ``tar`` is a good name.
-Open Issues
-===========
+Possible Further Work
+=====================
-How far should this be backported?
+Adding filters to zipfile and shutil.unpack_archive
+---------------------------------------------------
+For consistency, :external+py3.11:mod:`zipfile` and
+:external+py3.11:func:`shutil.unpack_archive` could gain support
+for a ``filter`` argument.
+However, this would require research that this PEP's author can't promise
+for Python 3.12.
+Filters for ``zipfile`` would probably not help security.
+Zip is used primarily for cross-platform data bundles, and correspondingly,
+:external+py3.11:meth:`ZipFile.extract <zipfile.ZipFile.extract>`'s defaults
+are already similar to what a ``'data'`` filter would do.
+A ``'fully_trusted'`` filter, which would *newly allow* absolute paths and
+``..`` path components, might not be useful for much except
+a unified ``unpack_archive`` API.
+Filters should be useful for use cases other than security, but those
+would usually need custom filter functions, and those would need API that works
+with both :external+py3.11:class:`~tarfile.TarInfo` and
+:external+py3.11:class:`~zipfile.ZipInfo`.
+That is *definitely* out of scope of this PEP.
+If only this PEP is implemented and nothing changes for ``zipfile``,
+the effect for callers of ``unpack_archive`` is that the default
+for *tar* files is changing from ``'fully_trusted'`` to
+the more appropriate ``'data'``.
+In the interim period, Python 3.12-3.13 will emit ``DeprecationWarning``.
+That's annoying, but there are several ways to handle it: e.g. add a
+``filter`` argument conditionally, set ``TarFile.extraction_filter``
+globally, or ignore/suppress the warning until Python 3.14.
+Also, since many calls to ``unpack_archive`` are likely to be unsafe,
+there's hope that the ``DeprecationWarning`` will often turn out to be
+a helpful hint to review affected code.
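
One of the options listed above, adding a ``filter`` argument conditionally, might look like the following sketch. ``unpack_tar_archive`` is a hypothetical helper name. The sketch assumes that ``shutil.unpack_archive`` accepts ``filter`` whenever ``tarfile.data_filter`` exists, and it only covers tar archives.

```python
import shutil
import tarfile


def unpack_tar_archive(filename, extract_dir):
    """Unpack a tar archive, requesting the 'data' filter when supported.

    Hypothetical helper; it is only meant for tar archives.
    """
    if hasattr(tarfile, "data_filter"):
        # Filter support is present: ask for the safer behaviour explicitly,
        # which also avoids the DeprecationWarning.
        shutil.unpack_archive(filename, extract_dir, format="tar",
                              filter="data")
    else:
        # Python without filter support: there is no ``filter`` argument
        # to pass, so fall back to the old call.
        shutil.unpack_archive(filename, extract_dir, format="tar")
```

Feature detection with ``hasattr`` is used rather than a version check so that the same code should also work on older releases that received the feature as a backport.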
Thanks