PEP 495: Are two values enough?

This commit is contained in:
Alexander Belopolsky 2015-08-22 17:24:15 -04:00
parent 98fc50b68c
commit 7d2fb4b8db
1 changed files with 107 additions and 5 deletions

@@ -396,8 +396,8 @@ This proposal will have little effect on the programs that do not read
 the ``first`` flag explicitly or use tzinfo implementations that do.
 The only visible change for such programs will be that conversions to
 and from POSIX timestamps will now round-trip correctly (up to
-floating point rounding). Programs that implemented work-arounds to
-the old incorrect behavior will need to be modified.
+floating point rounding). Programs that implemented a work-around to
+the old incorrect behavior may need to be modified.

 Pickles produced by older programs will remain fully forward
 compatible. Only datetime/time instances with ``first=False`` pickled
@@ -439,10 +439,10 @@ The main problem with the ``tm_isdst`` field is that it is impossible
 to know what value is appropriate for ``tm_isdst`` without knowing the
 details about the time zone that are only available to the ``tzinfo``
 implementation. Thus while ``tm_isdst`` is useful in the *output* of
-methods such as ``time.localtime``, it is cumbursome as an *input* of
+methods such as ``time.localtime``, it is cumbersome as an *input* of
 methods such as ``time.mktime``.

-If the programmer misspecifies a non-negative value of ``tm_isdst`` to
+If the programmer misspecified a non-negative value of ``tm_isdst`` to
 ``time.mktime``, the result will be time that is 1 hour off and since
 there is rarely a way to know anything about DST *before* a call to
 ``time.mktime`` is made, the only sane choice is usually
@@ -480,7 +480,7 @@ The following alternative names have been proposed:
 **later**
 A close contender to "fold". One author dislikes it because
 it is confusable with equally fitting "latter," but in the age
-of autocompletion everywhere this is a small consideration. A
+of auto-completion everywhere this is a small consideration. A
 stronger objection may be that in the case of missing time, we
 will have ``later=True`` instance converted to an earlier time by
 ``.astimezone(timezone.utc)`` than that with ``later=False``.
@@ -498,6 +498,108 @@ The following alternative names have been proposed:
above.)
Are two values enough?
---------------------------------------

The ``time.mktime`` interface allows three values for the ``tm_isdst``
flag: -1, 0, and 1. As we explained above, -1 (asking ``mktime`` to
determine whether DST is in effect for the given time from the rest of
the fields) is the only choice that is useful in practice.

With the ``first`` flag, however, ``datetime.timestamp()`` will return
the same value as ``mktime`` with ``tm_isdst=-1`` 99.98% of the time
for most time zones with DST transitions. Moreover,
``tm_isdst=-1``-like behavior is specified *regardless* of the value
of ``first``.
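The 99.98% figure can be checked with a rough calculation. The sketch
below assumes a zone with exactly one missing hour and one repeated
hour per year and ignores leap years; the variable names are
illustrative only.

```python
HOURS_PER_YEAR = 365 * 24   # 8760, ignoring leap years
problem_hours = 2           # assumed: one gap hour plus one fold hour

# Share of local hours that are either missing or ambiguous.
problem_share = problem_hours / HOURS_PER_YEAR
# Share of local hours where both interpretations agree.
agree_share = 1 - problem_share

print(f"{problem_share:.2%}")
print(f"{agree_share:.2%}")
```

Zones whose transitions shift the clock by something other than one
hour would give slightly different numbers.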
It is only in the remaining 0.02% of cases (2 hours per year) that
``datetime.timestamp()`` and ``mktime`` with ``tm_isdst=-1`` may
disagree. However, even in these cases, most ``mktime``
implementations will return either the ``first=True`` or the
``first=False`` value, even though the relevant standards allow
``mktime`` to return -1 and set an error code.

In other words, ``tm_isdst=-1`` behavior is not missing from this PEP.
On the contrary, it is the only behavior provided, in two different
well-defined flavors. The behavior that is missing is when a given
local hour is interpreted as a different local hour because of a
misspecified ``tm_isdst``.
For example, in the DST-observing time zones in the Northern
hemisphere (where DST is in effect in June) one can get

.. code::

  >>> from time import mktime, localtime
  >>> t = mktime((2015, 6, 1, 12, 0, 0, -1, -1, 0))
  >>> localtime(t)[:]
  (2015, 6, 1, 13, 0, 0, 0, 152, 1)

Note that 12:00 was interpreted as 13:00 by ``mktime``. With
``datetime.timestamp`` and ``datetime.fromtimestamp``, it is currently
guaranteed that

.. code::

  >>> import datetime
  >>> t = datetime.datetime(2015, 6, 1, 12).timestamp()
  >>> datetime.datetime.fromtimestamp(t)
  datetime.datetime(2015, 6, 1, 12, 0)
This PEP extends the same guarantee to both values of ``first``:

.. code::

  >>> t = datetime.datetime(2015, 6, 1, 12, first=True).timestamp()
  >>> datetime.datetime.fromtimestamp(t)
  datetime.datetime(2015, 6, 1, 12, 0)

.. code::

  >>> t = datetime.datetime(2015, 6, 1, 12, first=False).timestamp()
  >>> datetime.datetime.fromtimestamp(t)
  datetime.datetime(2015, 6, 1, 12, 0)
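The round-trip guarantee can be tried on a current interpreter. The
sketch below is not part of this draft: it assumes Python 3.6 or
later, where the flag was released under the name ``fold`` with the
opposite sense (``fold=0`` corresponds to ``first=True``), and it
assumes that the local time zone has no transition at noon on
1 June 2015.

```python
from datetime import datetime

noon = datetime(2015, 6, 1, 12)

# Round-trip the same wall-clock time through a POSIX timestamp
# for both values of the disambiguation flag.
results = []
for fold in (0, 1):
    dt = noon.replace(fold=fold)
    back = datetime.fromtimestamp(dt.timestamp())
    results.append(back == noon)

print(results)
```

Because noon on that date is neither ambiguous nor missing, both
values of the flag are expected to give the same timestamp and the
same wall-clock time on the way back.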
Thus one of the suggested uses for ``first=-1`` -- to match the legacy
behavior -- is not needed. Either choice of ``first`` will match the
old behavior except in the few cases where the old behavior was
undefined.

Another suggestion was to use ``first=-1`` or ``first=None`` to
indicate that the program truly has no means to deal with the folds
and gaps and ``dt.utcoffset()`` should raise an error whenever ``dt``
represents an ambiguous or missing local time.
The main problem with this proposal is that ``dt.utcoffset()`` is
used internally in situations where raising an error is not an option:
for example, in dictionary lookups or list/set membership checks. So
strict gap/fold checking behavior would need to be controlled by a
separate flag, say ``dt.utcoffset(raise_on_gap=True,
raise_on_fold=False)``. However, this functionality can be easily
implemented in user code:

.. code::

  # User-defined exceptions; the datetime module does not provide them.
  class AmbiguousTimeError(Exception):
      pass

  class MissingTimeError(Exception):
      pass

  def utcoffset(dt, raise_on_gap=True, raise_on_fold=False):
      u = dt.utcoffset()
      v = dt.replace(first=not dt.first).utcoffset()
      if u == v:
          return u
      # In a fold the first occurrence has the larger UTC offset; in a
      # gap, first=True selects the smaller (pre-transition) offset.
      if (u > v) == dt.first:
          if raise_on_fold:
              raise AmbiguousTimeError
      else:
          if raise_on_gap:
              raise MissingTimeError
      return u
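The point about dictionary lookups can be seen with a small
experiment: hashing an aware ``datetime`` calls ``utcoffset()``
implicitly, so an exception raised there surfaces in code that never
asked for an offset. ``StrictZone`` below is a hypothetical class
written only for this illustration.

```python
from datetime import datetime, timedelta, tzinfo


class StrictZone(tzinfo):
    """Hypothetical zone whose utcoffset() refuses to guess."""

    def utcoffset(self, dt):
        raise ValueError("ambiguous or missing local time")

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "STRICT"


dt = datetime(2015, 11, 1, 1, 30, tzinfo=StrictZone())

try:
    lookup = {dt: "event"}  # building the dict hashes the key
    failed = False
except ValueError:
    failed = True

print(failed)
```

Here the failure comes from building the dictionary, not from an
explicit ``dt.utcoffset()`` call.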
Moreover, raising an error in the problem cases is only one of many
possible solutions. An interactive program can ask the user for
additional input, while a server process may log a warning and take an
appropriate default action. We cannot possibly provide functions for
all possible user requirements, but this PEP provides the means to
implement any desired behavior in a few lines of code.
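As an illustration of such user code, the sketch below classifies a
local time instead of raising, so the caller can prompt, log, or pick
a default. It is written against Python 3.6 or later, where the flag
was released as ``fold`` (``fold=0`` corresponds to ``first=True``).
``Eastern2015`` and ``classify`` are hypothetical names, and the zone
hard-codes the 2015 US Eastern transitions purely for the example.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class Eastern2015(tzinfo):
    """Toy zone: US Eastern rules hard-coded for 2015 only."""

    STD, DST = timedelta(hours=-5), timedelta(hours=-4)
    GAP = datetime(2015, 3, 8, 2)    # 02:00-03:00 is skipped
    FOLD = datetime(2015, 11, 1, 1)  # 01:00-02:00 happens twice

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.GAP <= wall < self.GAP + HOUR:
            # Missing time: fold=0 uses the pre-transition rules.
            return self.DST if dt.fold else self.STD
        if self.FOLD <= wall < self.FOLD + HOUR:
            # Ambiguous time: fold=0 is the first (DST) occurrence.
            return self.STD if dt.fold else self.DST
        in_dst = self.GAP + HOUR <= wall < self.FOLD
        return self.DST if in_dst else self.STD

    def dst(self, dt):
        return self.utcoffset(dt) - self.STD

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def classify(dt):
    """Return 'fold', 'gap' or 'ok' for an aware datetime."""
    u = dt.utcoffset()
    v = dt.replace(fold=1 - dt.fold).utcoffset()
    if u == v:
        return "ok"
    return "fold" if (u < v) == bool(dt.fold) else "gap"


tz = Eastern2015()
for wall in [(2015, 3, 8, 2, 30), (2015, 11, 1, 1, 30), (2015, 6, 1, 12, 0)]:
    print(wall, classify(datetime(*wall, tzinfo=tz)))
```

Returning a label rather than raising leaves the policy to the
caller: an interactive program might ask which occurrence was meant,
while a server might log the ``'gap'`` or ``'fold'`` result and
continue.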
Implementation
==============