python-peps/peps/pep-0495.rst

892 lines
34 KiB
ReStructuredText
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

PEP: 495
Title: Local Time Disambiguation
Version: $Revision$
Last-Modified: $Date$
Author: Alexander Belopolsky <alexander.belopolsky@gmail.com>, Tim Peters <tim.peters@gmail.com>
Discussions-To: datetime-sig@python.org
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 02-Aug-2015
Python-Version: 3.6
Resolution: https://mail.python.org/pipermail/datetime-sig/2015-September/000900.html
Abstract
========
This PEP adds a new attribute ``fold`` to instances of the
``datetime.time`` and ``datetime.datetime`` classes that can be used
to differentiate between two moments in time for which local times are
the same. The allowed values for the ``fold`` attribute will be 0 and 1
with 0 corresponding to the earlier and 1 to the later of the two
possible readings of an ambiguous local time.
Rationale
=========
In most world locations, there have been and will be times when
local clocks are moved back. [#]_ In those times, intervals are
introduced in which local clocks show the same time twice in the same
day. In these situations, the information displayed on a local clock
(or stored in a Python datetime instance) is insufficient to identify
a particular moment in time. The proposed solution is to add an
attribute to the ``datetime`` instances taking values of 0 and 1 that
will enumerate the two ambiguous times.
.. image:: pep-0495-daylightsavings.png
:align: center
:alt: A cartoon of a strong man struggling to stop the hands of a large clock.
The caption reads: You can't stop time... but you can turn it back one
hour at 2 a.m. Oct. 28 when daylight-saving time ends and standard time
begins."
:width: 30%
.. [#] People who live in locations observing the Daylight Saving
Time (DST) move their clocks back (usually one hour) every Fall.
It is less common, but occasionally clocks can be moved back for
other reasons. For example, Ukraine skipped the spring-forward
transition in March 1990 and instead, moved their clocks back on
July 1, 1990, switching from Moscow Time to Eastern European Time.
In that case, standard (winter) time was in effect before and after
the transition.
Both DST and standard time changes may result in time shifts other
than an hour.
Terminology
===========
When clocks are moved back, we say that a *fold* [#]_ is created in time.
When the clocks are moved forward, a *gap* is created. A local time
that falls in the fold is called *ambiguous*. A local time that falls
in the gap is called *missing*.
.. [#] The term "fall-backward fold" was invented in 1990s by Paul Eggert
of UCLA who used it in various Internet discussions related to the C language
standard that culminated in a `Defect Report #139`_.
.. _Defect Report #139: http://www.open-std.org/jtc1/sc22/wg14/docs/rr/dr_136.html
Proposal
========
The "fold" attribute
--------------------
We propose adding an attribute called ``fold`` to instances of the
``datetime.time`` and ``datetime.datetime`` classes. This attribute
should have the value 0 for all instances except those that represent
the second (chronologically) moment in time in an ambiguous case. For
those instances, the value will be 1. [#]_
.. [#] An instance that has ``fold=1`` in a non-ambiguous case is
said to represent an invalid time (or is invalid for short), but
users are not prevented from creating invalid instances by passing
``fold=1`` to a constructor or to a ``replace()`` method. This
is similar to the current situation with the instances that fall in
the spring-forward gap. Such instances don't represent any valid
time, but neither the constructors nor the ``replace()`` methods
check whether the instances that they produce are valid. Moreover,
this PEP specifies how various functions should behave when given an
invalid instance.
Affected APIs
-------------
Attributes
..........
Instances of ``datetime.time`` and ``datetime.datetime`` classes will
get a new attribute ``fold`` with two possible values: 0 and 1.
Constructors
............
The ``__new__`` methods of the ``datetime.time`` and
``datetime.datetime`` classes will get a new keyword-only argument
called ``fold`` with the default value 0. The value of the
``fold`` argument will be used to initialize the value of the
``fold`` attribute in the returned instance.
Methods
.......
The ``replace()`` methods of the ``datetime.time`` and
``datetime.datetime`` classes will get a new keyword-only argument
called ``fold``. It will behave similarly to the other ``replace()``
arguments: if the ``fold`` argument is specified and given a value 0
or 1, the new instance returned by ``replace()`` will have its
``fold`` attribute set to that value. In CPython, any non-integer
value of ``fold`` will raise a ``TypeError``, but other
implementations may allow the value ``None`` to behave the same as
when ``fold`` is not given. [#]_ (This is
a nod to the existing difference in treatment of ``None`` arguments
in other positions of this method across Python implementations;
it is not intended to leave the door open for future alternative
interpretation of ``fold=None``.) If the ``fold`` argument is not
specified, the original value of the ``fold`` attribute is copied to
the result.
.. [#] PyPy and pure Python implementation distributed with CPython
already allow ``None`` to mean "no change to existing
attribute" for all other attributes in ``replace()``.
C-API
.....
Access macros will be defined to extract the value of ``fold`` from
``PyDateTime_DateTime`` and ``PyDateTime_Time`` objects.
.. code::
int PyDateTime_DATE_GET_FOLD(PyDateTime_DateTime *o)
Return the value of ``fold`` as a C ``int``.
.. code::
int PyDateTime_TIME_GET_FOLD(PyDateTime_Time *o)
Return the value of ``fold`` as a C ``int``.
New constructors will be defined that will take an additional
argument to specify the value of ``fold`` in the created
instance:
.. code::
PyObject* PyDateTime_FromDateAndTimeAndFold(
int year, int month, int day, int hour, int minute,
int second, int usecond, int fold)
Return a ``datetime.datetime`` object with the specified year, month,
day, hour, minute, second, microsecond and fold.
.. code::
PyObject* PyTime_FromTimeAndFold(
int hour, int minute, int second, int usecond, int fold)
Return a ``datetime.time`` object with the specified hour, minute,
second, microsecond and fold.
Affected Behaviors
------------------
What time is it?
................
The ``datetime.now()`` method called without arguments will set
``fold=1`` when returning the second of the two ambiguous times in a
system local time fold. When called with a ``tzinfo`` argument, the
value of the ``fold`` will be determined by the ``tzinfo.fromutc()``
implementation. When an instance of the ``datetime.timezone`` class
(the stdlib's fixed-offset ``tzinfo`` subclass,
*e.g.* ``datetime.timezone.utc``) is passed as ``tzinfo``, the
returned datetime instance will always have ``fold=0``.
The ``datetime.utcnow()`` method is unaffected.
Conversion from naive to aware
..............................
A new feature is proposed to facilitate conversion from naive datetime
instances to aware.
The ``astimezone()`` method will now work for naive ``self``. The
system local timezone will be assumed in this case and the ``fold``
flag will be used to determine which local timezone is in effect
in the ambiguous case.
For example, on a system set to US/Eastern timezone::
>>> dt = datetime(2014, 11, 2, 1, 30)
>>> dt.astimezone().strftime('%D %T %Z%z')
'11/02/14 01:30:00 EDT-0400'
>>> dt.replace(fold=1).astimezone().strftime('%D %T %Z%z')
'11/02/14 01:30:00 EST-0500'
An implication is that ``datetime.now(tz)`` is fully equivalent to
``datetime.now().astimezone(tz)`` (assuming ``tz`` is an instance of a
post-PEP ``tzinfo`` implementation, i.e. one that correctly handles
and sets ``fold``).
Conversion from POSIX seconds from EPOCH
........................................
The ``fromtimestamp()`` static method of ``datetime.datetime`` will
set the ``fold`` attribute appropriately in the returned object.
For example, on a system set to US/Eastern timezone::
>>> datetime.fromtimestamp(1414906200)
datetime.datetime(2014, 11, 2, 1, 30)
>>> datetime.fromtimestamp(1414906200 + 3600)
datetime.datetime(2014, 11, 2, 1, 30, fold=1)
Conversion to POSIX seconds from EPOCH
......................................
The ``timestamp()`` method of ``datetime.datetime`` will return different
values for ``datetime.datetime`` instances that differ only by the value
of their ``fold`` attribute if and only if these instances represent an
ambiguous or a missing time.
When a ``datetime.datetime`` instance ``dt`` represents an ambiguous
time, there are two values ``s0`` and ``s1`` such that::
datetime.fromtimestamp(s0) == datetime.fromtimestamp(s1) == dt
(This is because ``==`` disregards the value of fold -- see below.)
In this case, ``dt.timestamp()`` will return the smaller of ``s0``
and ``s1`` values if ``dt.fold == 0`` and the larger otherwise.
For example, on a system set to US/Eastern timezone::
>>> datetime(2014, 11, 2, 1, 30, fold=0).timestamp()
1414906200.0
>>> datetime(2014, 11, 2, 1, 30, fold=1).timestamp()
1414909800.0
When a ``datetime.datetime`` instance ``dt`` represents a missing
time, there is no value ``s`` for which::
datetime.fromtimestamp(s) == dt
but we can form two "nice to know" values of ``s`` that differ
by the size of the gap in seconds. One is the value of ``s``
that would correspond to ``dt`` in a timezone where the UTC offset
is always the same as the offset right before the gap and the
other is the similar value but in a timezone the UTC offset
is always the same as the offset right after the gap.
The value returned by ``dt.timestamp()`` given a missing
``dt`` will be the greater of the two "nice to know" values
if ``dt.fold == 0`` and the smaller otherwise.
(This is not a typo -- it's intentionally backwards from the rule for
ambiguous times.)
For example, on a system set to US/Eastern timezone::
>>> datetime(2015, 3, 8, 2, 30, fold=0).timestamp()
1425799800.0
>>> datetime(2015, 3, 8, 2, 30, fold=1).timestamp()
1425796200.0
Aware datetime instances
........................
Users of pre-PEP implementations of ``tzinfo`` will not see any
changes in the behavior of their aware datetime instances. Two such
instances that differ only by the value of the ``fold`` attribute will
not be distinguishable by any means other than an explicit access to
the ``fold`` value. (This is because these pre-PEP implementations
are not using the ``fold`` attribute.)
On the other hand, if an object's ``tzinfo`` is set to a fold-aware
implementation, then in a fold or gap the value of ``fold`` will
affect the result of several methods:
``utcoffset()``, ``dst()``, ``tzname()``, ``astimezone()``,
``strftime()`` (if the "%Z" or "%z" directive is used in the format
specification), ``isoformat()``, and ``timetuple()``.
Combining and splitting date and time
.....................................
The ``datetime.datetime.combine()`` method will copy the value of the
``fold`` attribute to the resulting ``datetime.datetime`` instance.
The ``datetime.datetime.time()`` method will copy the value of the
``fold`` attribute to the resulting ``datetime.time`` instance.
Pickles
.......
The value of the fold attribute will only be saved in pickles created
with protocol version 4 (introduced in Python 3.4) or greater.
Pickle sizes for the ``datetime.datetime`` and ``datetime.time``
objects will not change. The ``fold`` value will be encoded in the
first bit of the 3rd byte of the ``datetime.datetime``
pickle payload; and in the first bit of the 1st byte of the
``datetime.time`` payload. In the `current implementation`_
these bytes are used to store the month (1-12) and hour (0-23) values
and the first bit is always 0. We picked these bytes because they are
the only bytes that are checked by the current unpickle code. Thus
loading post-PEP ``fold=1`` pickles in a pre-PEP Python will result in
an exception rather than an instance with out of range components.
.. _current implementation: https://hg.python.org/cpython/file/v3.5.0/Include/datetime.h#l10
Implementations of tzinfo in the Standard Library
=================================================
No new implementations of ``datetime.tzinfo`` abstract class are
proposed in this PEP. The existing (fixed offset) timezones do
not introduce ambiguous local times and their ``utcoffset()``
implementation will return the same constant value as they do now
regardless of the value of ``fold``.
The basic implementation of ``fromutc()`` in the abstract
``datetime.tzinfo`` class will not change. It is currently not used
anywhere in the stdlib because the only included ``tzinfo``
implementation (the ``datetime.timezone`` class implementing fixed
offset timezones) overrides ``fromutc()``. Keeping the default
implementation unchanged has the benefit that pre-PEP 3rd party
implementations that inherit the default ``fromutc()`` are not
accidentally affected.
Guidelines for New tzinfo Implementations
=========================================
Implementors of concrete ``datetime.tzinfo`` subclasses who want to
support variable UTC offsets (due to DST and other causes) should follow
these guidelines.
Ignorance is Bliss
------------------
New implementations of ``utcoffset()``, ``tzname()`` and ``dst()``
methods should ignore the value of ``fold`` unless they are called on
the ambiguous or missing times.
In the Fold
-----------
New subclasses should override the base-class ``fromutc()`` method and
implement it so that in all cases where two different UTC times ``u0`` and
``u1`` (``u0`` <``u1``) correspond to the same local time ``t``,
``fromutc(u0)`` will return an instance with ``fold=0`` and
``fromutc(u1)`` will return an instance with ``fold=1``. In all
other cases the returned instance should have ``fold=0``.
The ``utcoffset()``, ``tzname()`` and ``dst()`` methods should use the
value of the fold attribute to determine whether an otherwise
ambiguous time ``t`` corresponds to the time before or after the
transition. By definition, ``utcoffset()`` is greater before and
smaller after any transition that creates a fold. The values returned
by ``tzname()`` and ``dst()`` may or may not depend on the value of
the ``fold`` attribute depending on the kind of the transition.
.. image:: pep-0495-fold.svg
:align: center
:alt: Diagram of relationship between UTC and local time around a
fall-back transition see full description on page.
:width: 60%
:class: invert-in-dark-mode
The sketch above illustrates the relationship between the UTC and
local time around a fall-back transition. The zig-zag line is a graph
of the function implemented by ``fromutc()``. Two intervals on the
UTC axis adjacent to the transition point and having the size of the
time shift at the transition are mapped to the same interval on the
local axis. New implementations of ``fromutc()`` method should set
the fold attribute to 1 when ``self`` is in the region marked in
yellow on the UTC axis. (All intervals should be treated as closed on
the left and open on the right.)
Mind the Gap
------------
The ``fromutc()`` method should never produce a time in the gap.
If the ``utcoffset()``, ``tzname()`` or ``dst()`` method is called on a
local time that falls in a gap, the rules in effect before the
transition should be used if ``fold=0``. Otherwise, the rules in
effect after the transition should be used.
.. image:: pep-0495-gap.svg
:align: center
:alt: Diagram of relationship between UTC and local time around a
spring-forward transition see full description on page.
:width: 60%
:class: invert-in-dark-mode
The sketch above illustrates the relationship between the UTC and
local time around a spring-forward transition. At the transition, the
local clock is advanced skipping the times in the gap. For the
purposes of determining the values of ``utcoffset()``, ``tzname()``
and ``dst()``, the line before the transition is extended forward to
find the UTC time corresponding to the time in the gap with ``fold=0``
and for instances with ``fold=1``, the line after the transition is
extended back.
Summary of Rules at a Transition
--------------------------------
On ambiguous/missing times ``utcoffset()`` should return values
according to the following table:
+-----------------+----------------+-----------------------------+
| | fold=0 | fold=1 |
+=================+================+=============================+
| Fold | oldoff | newoff = oldoff - delta |
+-----------------+----------------+-----------------------------+
| Gap | oldoff | newoff = oldoff + delta |
+-----------------+----------------+-----------------------------+
where ``oldoff`` (``newoff``) is the UTC offset before (after) the
transition and ``delta`` is the absolute size of the fold or the gap.
Note that the interpretation of the fold attribute is consistent in
the fold and gap cases. In both cases, ``fold=0`` (``fold=1``) means
use ``fromutc()`` line before (after) the transition to find the UTC
time. Only in the "Fold" case, the UTC times ``u0`` and ``u1`` are
"real" solutions for the equation ``fromutc(u) == t``, while in the
"Gap" case they are "imaginary" solutions.
The DST Transitions
-------------------
On a missing time introduced at the start of DST, the values returned
by ``utcoffset()`` and ``dst()`` methods should be as follows
+-----------------+----------------+------------------+
| | fold=0 | fold=1 |
+=================+================+==================+
| utcoffset() | stdoff | stdoff + dstoff |
+-----------------+----------------+------------------+
| dst() | zero | dstoff |
+-----------------+----------------+------------------+
On an ambiguous time introduced at the end of DST, the values returned
by ``utcoffset()`` and ``dst()`` methods should be as follows
+-----------------+----------------+------------------+
| | fold=0 | fold=1 |
+=================+================+==================+
| utcoffset() | stdoff + dstoff| stdoff |
+-----------------+----------------+------------------+
| dst() | dstoff | zero |
+-----------------+----------------+------------------+
where ``stdoff`` is the standard (non-DST) offset, ``dstoff`` is the
DST correction (typically ``dstoff = timedelta(hours=1)``) and ``zero
= timedelta(0)``.
Temporal Arithmetic and Comparison Operators
============================================
.. epigraph::
| In *mathematicks* he was greater
| Than Tycho Brahe, or Erra Pater:
| For he, by geometric scale,
| Could take the size of pots of ale;
| Resolve, by sines and tangents straight,
| If bread or butter wanted weight,
| And wisely tell what hour o' th' day
| The clock does strike by algebra.
-- "Hudibras" by Samuel Butler
The value of the ``fold`` attribute will be ignored in all operations
with naive datetime instances. As a consequence, naive
``datetime.datetime`` or ``datetime.time`` instances that differ only
by the value of ``fold`` will compare as equal. Applications that
need to differentiate between such instances should check the value of
``fold`` explicitly or convert those instances to a timezone that does
not have ambiguous times (such as UTC).
The value of ``fold`` will also be ignored whenever a timedelta is
added to or subtracted from a datetime instance which may be either
aware or naive. The result of addition (subtraction) of a timedelta
to (from) a datetime will always have ``fold`` set to 0 even if the
original datetime instance had ``fold=1``.
No changes are proposed to the way the difference ``t - s`` is
computed for datetime instances ``t`` and ``s``. If both instances
are naive or ``t.tzinfo`` is the same instance as ``s.tzinfo``
(``t.tzinfo is s.tzinfo`` evaluates to ``True``) then ``t - s`` is a
timedelta ``d`` such that ``s + d == t``. As explained in the
previous paragraph, timedelta addition ignores both ``fold`` and
``tzinfo`` attributes and so does intra-zone or naive datetime
subtraction.
Naive and intra-zone comparisons will ignore the value of ``fold`` and
return the same results as they do now. (This is the only way to
preserve backward compatibility. If you need an aware intra-zone
comparison that uses the fold, convert both sides to UTC first.)
The inter-zone subtraction will be defined as it is now: ``t - s`` is
computed as ``(t - t.utcoffset()) - (s -
s.utcoffset()).replace(tzinfo=t.tzinfo)``, but the result will
depend on the values of ``t.fold`` and ``s.fold`` when either
``t.tzinfo`` or ``s.tzinfo`` is post-PEP. [#]_
.. [#] Note that the new rules may result in a paradoxical situation
when ``s == t`` but ``s - u != t - u``. Such paradoxes are
not really new and are inherent in the overloading of the minus
operator differently for intra- and inter-zone operations. For
example, one can easily construct datetime instances ``t`` and ``s``
with some variable offset ``tzinfo`` and a datetime ``u`` with
``tzinfo=timezone.utc`` such that ``(t - u) - (s - u) != t - s``.
The explanation for this paradox is that the minuses inside the
parentheses and the two other minuses are really three different
operations: inter-zone datetime subtraction, timedelta subtraction,
and intra-zone datetime subtraction, which each have the mathematical
properties of subtraction separately, but not when combined in a
single expression.
Aware datetime Equality Comparison
----------------------------------
The aware datetime comparison operators will work the same as they do
now, with results indirectly affected by the value of ``fold`` whenever
the ``utcoffset()`` value of one of the operands depends on it, with one
exception. Whenever one or both of the operands in inter-zone comparison is
such that its ``utcoffset()`` depends on the value of its ``fold``
fold attribute, the result is ``False``. [#]_
.. [#] This exception is designed to preserve the hash and equivalence
invariants in the face of paradoxes of inter-zone arithmetic.
Formally, ``t == s`` when ``t.tzinfo is s.tzinfo`` evaluates to
``False`` can be defined as follows. Let ``toutc(t, fold)`` be a
function that takes an aware datetime instance ``t`` and returns a
naive instance representing the same time in UTC assuming a given
value of ``fold``:
.. code::
def toutc(t, fold):
u = t - t.replace(fold=fold).utcoffset()
return u.replace(tzinfo=None)
Then ``t == s`` is equivalent to
.. code::
toutc(t, fold=0) == toutc(t, fold=1) == toutc(s, fold=0) == toutc(s, fold=1)
Backward and Forward Compatibility
==================================
This proposal will have little effect on the programs that do not read
the ``fold`` flag explicitly or use tzinfo implementations that do.
The only visible change for such programs will be that conversions to
and from POSIX timestamps will now round-trip correctly (up to
floating point rounding). Programs that implemented a work-around to
the old incorrect behavior may need to be modified.
Pickles produced by older programs will remain fully forward
compatible. Only datetime/time instances with ``fold=1`` pickled
in the new versions will become unreadable by the older Python
versions. Pickles of instances with ``fold=0`` (which is the
default) will remain unchanged.
Questions and Answers
=====================
Why not call the new flag "isdst"?
----------------------------------
A non-technical answer
......................
* Alice: Bob - let's have a stargazing party at 01:30 AM tomorrow!
* Bob: Should I presume initially that Daylight Saving Time is or is
not in effect for the specified time?
* Alice: Huh?
-------
* Bob: Alice - let's have a stargazing party at 01:30 AM tomorrow!
* Alice: You know, Bob, 01:30 AM will happen twice tomorrow. Which time do you have in mind?
* Bob: I did not think about it, but let's pick the first.
-------
(same characters, an hour later)
-------
* Bob: Alice - this Py-O-Clock gadget of mine asks me to choose
between fold=0 and fold=1 when I set it for tomorrow 01:30 AM.
What should I do?
* Alice: I've never hear of a Py-O-Clock, but I guess fold=0 is
the first 01:30 AM and fold=1 is the second.
A technical reason
..................
While the ``tm_isdst`` field of the ``time.struct_time`` object can be
used to disambiguate local times in the fold, the semantics of such
disambiguation are completely different from the proposal in this PEP.
The main problem with the ``tm_isdst`` field is that it is impossible
to know what value is appropriate for ``tm_isdst`` without knowing the
details about the time zone that are only available to the ``tzinfo``
implementation. Thus while ``tm_isdst`` is useful in the *output* of
methods such as ``time.localtime``, it is cumbersome as an *input* of
methods such as ``time.mktime``.
If the programmer misspecified a non-negative value of ``tm_isdst`` to
``time.mktime``, the result will be time that is 1 hour off and since
there is rarely a way to know anything about DST *before* a call to
``time.mktime`` is made, the only sane choice is usually
``tm_isdst=-1``.
Unlike ``tm_isdst``, the proposed ``fold`` attribute has no effect on
the interpretation of the datetime instance unless without that
attribute two (or no) interpretations are possible.
Since it would be very confusing to have something called ``isdst``
that does not have the same semantics as ``tm_isdst``, we need a
different name. Moreover, the ``datetime.datetime`` class already has
a method called ``dst()`` and if we called ``fold`` "isdst", we would
necessarily have situations when "isdst" is zero but ``dst()`` is not
or the other way around.
Why "fold"?
-----------
Suggested by Guido van Rossum and favored by one (but initially
disfavored by another) author. A consensus was reached after the
allowed values for the attribute were changed from False/True to 0/1.
The noun "fold" has correct connotations and easy mnemonic rules, but
at the same time does not invite unbased assumptions.
What is "first"?
----------------
This was a working name of the attribute chosen initially because the
obvious alternative ("second") conflicts with the existing attribute.
It was rejected mostly on the grounds that it would make True a
default value.
The following alternative names have also been considered:
**later**
A close contender to "fold". One author dislikes it because
it is confusable with equally fitting "latter," but in the age
of auto-completion everywhere this is a small consideration. A
stronger objection may be that in the case of missing time, we
will have ``later=True`` instance converted to an earlier time by
``.astimezone(timezone.utc)`` that that with ``later=False``.
Yet again, this can be interpreted as a desirable indication that
the original time is invalid.
**which**
The `original`_ placeholder name for the ``localtime`` function
branch index was `independently proposed`_ for the name of the
disambiguation attribute and received `some support`_.
**repeated**
Did not receive any support on the mailing list.
**ltdf**
(Local Time Disambiguation Flag) - short and no-one will attempt
to guess what it means without reading the docs. (This abbreviation
was used in PEP discussions with the meaning ``ltdf=False`` is the
earlier by those who didn't want to endorse any of the alternatives.)
.. _original: https://mail.python.org/pipermail/python-dev/2015-April/139099.html
.. _independently proposed: https://mail.python.org/pipermail/datetime-sig/2015-August/000479.html
.. _some support: https://mail.python.org/pipermail/datetime-sig/2015-August/000483.html
Are two values enough?
----------------------
Several reasons have been raised to allow a ``None`` or -1 value for
the ``fold`` attribute: backward compatibility, analogy with ``tm_isdst``
and strict checking for invalid times.
Backward Compatibility
......................
It has been suggested that backward compatibility can be improved if
the default value of the ``fold`` flag was ``None`` which would
signal that pre-PEP behavior is requested. Based on the analysis
below, we believe that the proposed changes with the ``fold=0``
default are sufficiently backward compatible.
This PEP provides only three ways for a program to discover that two
otherwise identical datetime instances have different values of
``fold``: (1) an explicit check of the ``fold`` attribute; (2) if
the instances are naive - conversion to another timezone using the
``astimezone()`` method; and (3) conversion to ``float`` using the
``timestamp()`` method.
Since ``fold`` is a new attribute, the first option is not available
to the existing programs. Note that option (2) only works for naive
datetimes that happen to be in a fold or a gap in the system time
zone. In all other cases, the value of ``fold`` will be ignored in
the conversion unless the instances use a ``fold``-aware ``tzinfo``
which would not be available in a pre-PEP program. Similarly, the
``astimezone()`` called on a naive instance will not be available in
such program because ``astimezone()`` does not currently work with
naive datetimes.
This leaves us with only one situation where an existing program can
start producing different results after the implementation of this PEP:
when a ``datetime.timestamp()`` method is called on a naive datetime
instance that happen to be in the fold or the gap. In the current
implementation, the result is undefined. Depending on the system
``mktime`` implementation, the programs can see different results or
errors in those cases. With this PEP in place, the value of timestamp
will be well-defined in those cases but will depend on the value of
the ``fold`` flag. We consider the change in
``datetime.timestamp()`` method behavior a bug fix enabled by this
PEP. The old behavior can still be emulated by the users who depend
on it by writing ``time.mktime(dt.timetuple()) + 1e-6*dt.microsecond``
instead of ``dt.timestamp()``.
Analogy with tm_isdst
.....................
The ``time.mktime`` interface allows three values for the ``tm_isdst``
flag: -1, 0, and 1. As we explained above, -1 (asking ``mktime`` to
determine whether DST is in effect for the given time from the rest of
the fields) is the only choice that is useful in practice.
With the ``fold`` flag, however, ``datetime.timestamp()`` will return
the same value as ``mktime`` with ``tm_isdst=-1`` in 99.98% of the
time for most time zones with DST transitions. Moreover,
``tm_isdst=-1``-like behavior is specified *regardless* of the value
of ``fold``.
It is only in the 0.02% cases (2 hours per year) that the
``datetime.timestamp()`` and ``mktime`` with ``tm_isdst=-1`` may
disagree. However, even in this case, most of the ``mktime``
implementations will return the ``fold=0`` or the ``fold=1``
value even though relevant standards allow ``mktime`` to return -1 and
set an error code in those cases.
In other words, ``tm_isdst=-1`` behavior is not missing from this PEP.
To the contrary, it is the only behavior provided in two different
well-defined flavors. The behavior that is missing is when a given
local hour is interpreted as a different local hour because of the
misspecified ``tm_isdst``.
For example, in the DST-observing time zones in the Northern
hemisphere (where DST is in effect in June) one can get
.. code::
>>> from time import mktime, localtime
>>> t = mktime((2015, 6, 1, 12, 0, 0, -1, -1, 0))
>>> localtime(t)[:]
(2015, 6, 1, 13, 0, 0, 0, 152, 1)
Note that 12:00 was interpreted as 13:00 by ``mktime``. With the
``datetime.timestamp``, ``datetime.fromtimestamp``, it is currently
guaranteed that
.. code::
>>> t = datetime.datetime(2015, 6, 1, 12).timestamp()
>>> datetime.datetime.fromtimestamp(t)
datetime.datetime(2015, 6, 1, 12, 0)
This PEP extends the same guarantee to both values of ``fold``:
.. code::
>>> t = datetime.datetime(2015, 6, 1, 12, fold=0).timestamp()
>>> datetime.datetime.fromtimestamp(t)
datetime.datetime(2015, 6, 1, 12, 0)
.. code::
>>> t = datetime.datetime(2015, 6, 1, 12, fold=1).timestamp()
>>> datetime.datetime.fromtimestamp(t)
datetime.datetime(2015, 6, 1, 12, 0)
Thus one of the suggested uses for ``fold=-1`` -- to match the legacy
behavior -- is not needed. Either choice of ``fold`` will match the
old behavior except in the few cases where the old behavior was
undefined.
Strict Invalid Time Checking
............................
Another suggestion was to use ``fold=-1`` or ``fold=None`` to
indicate that the program truly has no means to deal with the folds
and gaps and ``dt.utcoffset()`` should raise an error whenever ``dt``
represents an ambiguous or missing local time.
The main problem with this proposal, is that ``dt.utcoffset()`` is
used internally in situations where raising an error is not an option:
for example, in dictionary lookups or list/set membership checks. So
strict gap/fold checking behavior would need to be controlled by a
separate flag, say ``dt.utcoffset(raise_on_gap=True,
raise_on_fold=False)``. However, this functionality can be easily
implemented in user code:
.. code::
def utcoffset(dt, raise_on_gap=True, raise_on_fold=False):
u = dt.utcoffset()
v = dt.replace(fold=not dt.fold).utcoffset()
if u == v:
return u
if (u < v) == dt.fold:
if raise_on_fold:
raise AmbiguousTimeError
else:
if raise_on_gap:
raise MissingTimeError
return u
Moreover, raising an error in the problem cases is only one of many
possible solutions. An interactive program can ask the user for
additional input, while a server process may log a warning and take an
appropriate default action. We cannot possibly provide functions for
all possible user requirements, but this PEP provides the means to
implement any desired behavior in a few lines of code.
Implementation
==============
* Github fork: https://github.com/abalkin/cpython/tree/issue24773-s3
* Tracker issue: http://bugs.python.org/issue24773
Copyright
=========
This document has been placed in the public domain.
Picture Credit
==============
This image is a work of a U.S. military or Department of Defense
employee, taken or made as part of that person's official duties. As a
work of the U.S. federal government, the image is in the public
domain.