PEP 615: Move ICU support and alternate paths to rejected ideas (#1337)

* Move ICU support from "open" to "rejected".

* Move alternate environment variables to rejected
Paul Ganssle 2020-03-29 12:25:55 -04:00 committed by GitHub
parent 38debf93e6
commit cfe6639a6d
1 changed files with 75 additions and 82 deletions


@ -367,6 +367,14 @@ customize it. This PEP provides for three such avenues for customization:
2. Per-run configuration via environment variables
3. Runtime configuration change via a ``reset_tzpath`` function
In all methods of configuration, the search path must consist of only absolute,
rather than relative paths. Implementations may choose to ignore, warn or raise
an exception if a string other than an absolute path is found (and may make
different choices depending on the context — e.g. raising an exception when an
invalid path is passed to ``reset_tzpath`` but warning when one is included in
the environment variable). If an exception is not raised, any strings other
than an absolute path must not be included in the time zone search path.
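The rule above can be sketched as follows. This is a minimal illustration, not the reference implementation: ``parse_tzpath`` is a hypothetical helper name, and it takes the "warn" option, although an implementation may equally ignore the entry or raise.

```python
import os
import warnings


def parse_tzpath(raw):
    """Split a PATH-like string and keep only the absolute entries.

    Hypothetical helper for illustration; not part of the proposed API.
    """
    if not raw:
        return ()
    kept = []
    for entry in raw.split(os.pathsep):
        if os.path.isabs(entry):
            kept.append(entry)
        else:
            # This sketch warns; an implementation may instead ignore or raise.
            warnings.warn(
                f"ignoring non-absolute time zone path entry: {entry!r}"
            )
    return tuple(kept)
```

Whichever of the three behaviors is chosen, the returned search path never contains a non-absolute entry.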
Compile-time options
####################
@ -603,6 +611,72 @@ The ``zoneinfo`` approach is more closely based on ``dateutil.tz``, which
implemented support for ``fold`` (including a backport to older versions) just
before the release of Python 3.6.
Windows support via Microsoft's ICU API
---------------------------------------
Windows does not ship the time zone database as TZif files, but as of Windows
10's 2017 Creators Update, Microsoft has provided an API for interacting with
the International Components for Unicode (ICU) project [#icu-project]_
[#ms-icu-documentation]_ , which includes an API for accessing time zone data —
sourced from the IANA time zone database. [#icu-timezone-api]_
Providing bindings for this would allow us to support Windows "out of the box"
without the need to install the ``tzdata`` package, but unfortunately the C
headers provided by Windows do not provide any access to the underlying time
zone data — only an API to query the system for transition and offset
information is available. This would constrain the semantics of any ICU-based
implementation in ways that may not be compatible with a non-ICU-based
implementation — particularly around the behavior of the cache.
Since it seems like ICU cannot be used as simply an additional data source for
``ZoneInfo`` files, this PEP considers the ICU support to be out of scope, and
probably better supported by a third-party library.
Alternative environment variable configurations
-----------------------------------------------
This PEP proposes to use a single environment variable: ``PYTHONTZPATH``.
This is based on the assumption that the majority of users who would want to
manipulate the time zone path would want to fully replace it (e.g. "I know
exactly where my time zone data is"), and other use cases like prepending to
the existing search path would be less common.
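As a rough sketch of the "replace, not augment" semantics described above: ``effective_tzpath`` is a hypothetical name, the environment and the default path are passed in explicitly, and the handling of a variable that is set but empty is an assumption of this sketch, not something this section specifies.

```python
import os


def effective_tzpath(environ, default_tzpath):
    """Pick the search path: PYTHONTZPATH, when set, replaces the default.

    Hypothetical helper for illustration; not part of the proposed API.
    """
    raw = environ.get("PYTHONTZPATH")
    if raw is None:
        # Variable not set: fall back to the default search path.
        return tuple(default_tzpath)
    # Assumption: a set-but-empty variable yields an empty search path,
    # and non-absolute entries are silently dropped.
    return tuple(p for p in raw.split(os.pathsep) if os.path.isabs(p))
```

The default entries appear in the result only when the variable is not set at all, which is the "I know exactly where my time zone data is" use case.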
There are several other schemes that were considered and rejected:
1. Separate ``PYTHONTZPATH`` into two environment variables:
``DEFAULT_PYTHONTZPATH`` and ``PYTHONTZPATH``, where ``PYTHONTZPATH`` would
contain values to append (or prepend) to the default time zone path, and
``DEFAULT_PYTHONTZPATH`` would *replace* the default time zone path. This
was rejected because it would likely lead to user confusion if the primary
use case is to replace rather than augment.
2. Adding either ``PYTHONTZPATH_PREPEND``, ``PYTHONTZPATH_APPEND`` or both, so
that users can augment the search path on either end without attempting to
determine what the default time zone path is. This was rejected as likely to
be unnecessary, and because it could easily be added in a
backwards-compatible manner in future updates if there is much demand for
such a feature.
3. Use only the ``PYTHONTZPATH`` variable, but provide a custom special value
that represents the default time zone path, e.g. ``<<DEFAULT_TZPATH>>``, so
that users could append to the time zone path: for example,
``PYTHONTZPATH=<<DEFAULT_TZPATH>>:/my/path`` would append ``/my/path`` to
the end of the time zone path.
One advantage to this scheme would be that it would add a natural extension
point for specifying non-file-based elements on the search path, such as
changing the priority of ``tzdata`` if it exists, or if native support for
TZDIST [#rfc7808]_ were to be added to the library in the future.
This was rejected mainly because these sorts of special values are not
usually found in ``PATH``-like variables and the only currently proposed use
case is a stand-in for the default ``TZPATH``, which can be acquired by
executing a Python program to query for the default value. An additional
factor in rejecting this is that because ``PYTHONTZPATH`` accepts only
absolute paths, any string that does not represent a valid absolute path is
implicitly reserved for future use, so it would be possible to introduce
these special values as necessary in a backwards-compatible way in future
versions of the library.
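Item 3 notes that the default path can be obtained by running a Python program, so a user who wants to append can do so without a special value. A minimal sketch, assuming the default is available as a sequence of strings; ``extended_tzpath`` is a hypothetical helper, not part of the proposal:

```python
import os


def extended_tzpath(default_tzpath, extra_paths):
    """Return a PYTHONTZPATH value: the default entries followed by extras.

    Hypothetical helper for illustration; not part of the proposed API.
    """
    for path in extra_paths:
        # PYTHONTZPATH accepts only absolute paths, so reject others early.
        if not os.path.isabs(path):
            raise ValueError(f"not an absolute path: {path!r}")
    return os.pathsep.join([*default_tzpath, *extra_paths])
```

In the module as released in Python 3.9, the first argument could plausibly be ``zoneinfo.TZPATH``. Note that this attribute reflects the search path of the running interpreter, so it equals the default only when ``PYTHONTZPATH`` is not already set.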
Open Issues
===========
@ -695,87 +769,6 @@ separate top-level ``zoneinfo`` module because the benefits of nesting are not
so great that it overwhelms the practical implementation concerns, but this
still requires some discussion.
Structure of the ``PYTHON_TZPATH`` environment variable
=======================================================
This PEP proposes to use a single environment variable: ``PYTHONTZPATH``.
This is based on the assumption that the majority of
users who would want to manipulate the time zone path would want to fully
replace it (e.g. "I know exactly where my time zone data is"), and other
use cases like prepending to the existing search path would be less common.
There are several other schemes that were considered and weakly rejected:
1. Separate ``PYTHON_TZPATH`` into two environment variables:
``DEFAULT_PYTHONTZPATH`` and ``PYTHONTZPATH``, where ``PYTHONTZPATH`` would
contain values to append (or prepend) to the default time zone path, and
``DEFAULT_PYTHONTZPATH`` would *replace* the default time zone path. This
was rejected because it would likely lead to user confusion if the primary
use case is to replace rather than augment.
2. Adding either ``PYTHONTZPATH_PREPEND``, ``PYTHONTZPATH_APPEND`` or both, so
that users can augment the search path on either end without attempting to
determine what the default time zone path is. This was rejected as likely to
be unnecessary, and because it could easily be added in a
backwards-compatible manner in future updates if there is much demand for
such a feature.
3. Use only the ``PYTHONTZPATH`` variable, but provide a custom special value
that represents the default time zone path, e.g. ``<<DEFAULT_TZPATH>>``, so
users could append to the time zone path with, e.g.
``PYTHONTZPATH=<<DEFAULT_TZPATH>>:/my/path`` could be used to append
``/my/path`` to the end of the time zone path.
This was rejected mainly because these sort of special values are not
usually found in ``PATH``-like variables, and it would be hard to discover
mistakes in your implementation.
One advantage to this scheme would be that it would add a natural extension
point for specifying non-file-based elements on the search path, such as
changing the priority of ``tzdata`` if it exists, or if native support for
TZDIST [#rfc7808]_ were to be added to the library in the future.
Windows support via Microsoft's ICU API
=======================================
Windows does not ship the time zone database as TZif files, but as of Windows
10's 2017 Creators Update, Microsoft has provided an API for interacting with
the International Components for Unicode (ICU) project [#icu-project]_
[#ms-icu-documentation]_ , which includes an API for accessing time zone data —
sourced from the IANA time zone database. [#icu-timezone-api]_
Providing bindings for this would allow for a mostly seamless cross-platform
experience for users on sufficiently recent versions of Windows — even without
falling back to the ``tzdata`` package.
This is a promising area, but is less mature than the remainder of the proposal,
and so there are several open issues with regards to Windows support:
1. None of the popular third party time zone libraries provide support for ICU
(``dateutil``'s native windows time zone support relies on legacy time zones
provided in the Windows Registry [#dateutil-tzwin]_, which would be
unsuitable as a drop-in replacement for TZif files), so this would need to
be developed *de novo* in the standard library, rather than first maturing
in the third party ecosystem.
2. The most likely implementation for this would be to have ``TZPATH`` default
to empty on Windows and have a search path precedence of ``TZPATH`` > ICU
> ``tzdata``, but this prevents end users from forcing the use of ``tzdata``
by setting an empty ``TZPATH``.
Two possible solutions for this are:
1. Add a mechanism to disable ICU globally independent of setting
``TZPATH``.
2. Add a cross-platform mechanism to give ``tzdata`` the highest
precedence.
3. This is not part of the reference implementation and it is uncertain whether
it can be ready and vetted in time for the Python 3.9 feature freeze. It is
an open question whether a failure to implement native Windows support in
3.9 should defer the release of ``zoneinfo`` or if only the ICU-based
Windows support should be deferred.
Footnotes
=========
@ -881,7 +874,7 @@ Other time zone implementations:
https://dateutil.readthedocs.io/en/stable/tz.html
.. [#dateutil-tzwin]
``dateutil.tz.win``: Concreate time zone implementations wrapping Windows
``dateutil.tz.win``: Concrete time zone implementations wrapping Windows
time zones
https://dateutil.readthedocs.io/en/stable/tzwin.html