PEP: 495
Title: Local Time Disambiguation
Version: $Revision$
Last-Modified: $Date$
Author: Alexander Belopolsky <alexander.belopolsky@gmail.com>
Discussions-To: Datetime-SIG <datetime-sig@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 02-Aug-2015


Abstract
========

This PEP adds a boolean member to instances of the ``datetime.time``
and ``datetime.datetime`` classes that can be used to differentiate
between two moments in time for which local times are the same.

.. sidebar:: US public service advertisement

  .. image:: pep-0495-daylightsavings.png
     :align: right
     :width: 15%


Rationale
=========

In most world locations there have been and will be times when
local clocks are moved back.  At those times, intervals are introduced
in which local clocks show the same time twice in the same day.  In
these situations, the information displayed on a local clock (or
stored in a Python datetime instance) is insufficient to identify a
particular moment in time.  The proposed solution is to add a boolean
flag to the ``datetime`` instances that will distinguish between the
two ambiguous times.


Terminology
===========

When clocks are moved back, we say that a *fold* is created in the
fabric of time.  When the clocks are moved forward, a *gap* is created.
A local time that falls in the fold is called *ambiguous*.  A local
time that falls in the gap is called *missing*.


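As a minimal sketch of these definitions, the following helper
classifies a naive local time around a single UTC offset change.  The
name ``classify`` and its parameters are hypothetical and are not part
of this proposal; the sketch assumes that exactly one transition is
relevant to the time being examined.

.. code:: python

    from datetime import datetime, timedelta

    def classify(local, transition_utc, old_offset, new_offset):
        """Classify a naive local time around one UTC offset change.

        transition_utc is the naive UTC instant at which the offset
        changes from old_offset to new_offset (both timedelta values).
        """
        # Wall clock readings at the transition under each offset.
        wall_before = transition_utc + old_offset
        wall_after = transition_utc + new_offset
        if wall_after <= local < wall_before:
            return "ambiguous"  # clocks moved back: reading occurs twice
        if wall_before <= local < wall_after:
            return "missing"    # clocks moved forward: reading is skipped
        return "regular"

    EST = timedelta(hours=-5)
    EDT = timedelta(hours=-4)

    # US/Eastern: DST ended 2014-11-02 06:00 UTC, started 2015-03-08 07:00 UTC.
    in_fold = classify(datetime(2014, 11, 2, 1, 30),
                       datetime(2014, 11, 2, 6, 0), EDT, EST)
    in_gap = classify(datetime(2015, 3, 8, 2, 30),
                      datetime(2015, 3, 8, 7, 0), EST, EDT)

Here ``in_fold`` is ``"ambiguous"`` and ``in_gap`` is ``"missing"``.
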
Proposal
========

The "first" flag
----------------

We propose adding a boolean member called ``first`` to the instances
of ``datetime.time`` and ``datetime.datetime`` classes.  This member
should have the value ``True`` for all instances except those that
represent the second (chronologically) moment in time in an ambiguous
case. [#]_

.. [#] An instance that has ``first=False`` in a non-ambiguous case is
   said to represent an invalid time (or is invalid for short), but
   users are not prevented from creating invalid instances by passing
   ``first=False`` to a constructor or to a ``replace()`` method.  This
   is similar to the current situation with the instances that fall in
   the spring-forward gap.  Such instances don't represent any valid
   time, but neither the constructors nor the ``replace()`` methods
   check whether the instances that they produce are valid.  Moreover,
   this PEP specifies how various functions should behave when given an
   invalid instance.

Affected APIs
-------------

Attributes
..........

Instances of ``datetime.time`` and ``datetime.datetime`` will get a
new boolean attribute called ``first``.

Constructors
............

The ``__new__`` methods of the ``datetime.time`` and
``datetime.datetime`` classes will get a new keyword-only argument
called ``first`` with the default value ``True``.  The value of the
``first`` argument will be used to initialize the value of the
``first`` attribute in the returned instance.

Methods
.......

The ``replace()`` methods of the ``datetime.time`` and
``datetime.datetime`` classes will get a new keyword-only argument
called ``first``.  It will behave similarly to the other ``replace()``
arguments: if the ``first`` argument is specified and given a boolean
value, the new instance returned by ``replace()`` will have its
``first`` attribute set to that value.  In CPython, a non-boolean
value of ``first`` will raise a ``TypeError``, but other
implementations may allow the value ``None`` to behave the same as
when ``first`` is not given.  If the ``first`` argument is not
specified, the original value of the ``first`` attribute is copied to
the result.

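The constructor and ``replace()`` rules above can be sketched with a
small stand-in class.  ``WallTime`` and the ``_UNSET`` sentinel are
hypothetical names used only for illustration; this is not the
proposed implementation, merely one way to obtain the described
behavior in pure Python.

.. code:: python

    _UNSET = object()  # sentinel: tells "not given" apart from any real value

    class WallTime:
        """Hypothetical stand-in for datetime.time with a ``first`` flag."""

        def __init__(self, hour, minute, *, first=True):
            # Mirror the CPython rule: only real booleans are accepted.
            if not isinstance(first, bool):
                raise TypeError("first must be a bool")
            self.hour, self.minute, self.first = hour, minute, first

        def replace(self, hour=None, minute=None, *, first=_UNSET):
            # An unspecified ``first`` is copied from the original instance.
            if first is _UNSET:
                first = self.first
            return WallTime(self.hour if hour is None else hour,
                            self.minute if minute is None else minute,
                            first=first)

    t = WallTime(1, 30, first=False)
    kept = t.replace(minute=45)      # flag copied: kept.first is False
    flipped = t.replace(first=True)  # flag overridden explicitly
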
C-API
.....

Access macros will be defined to extract the value of ``first`` from
``PyDateTime_DateTime`` and ``PyDateTime_Time`` objects.

.. code::

  bool PyDateTime_GET_FIRST(PyDateTime_DateTime *o)

Return the value of ``first`` as a C99 ``bool``.

.. code::

  bool PyDateTime_TIME_GET_FIRST(PyDateTime_Time *o)

Return the value of ``first`` as a C99 ``bool``.

Additional constructors will be defined that will take an additional
boolean argument to specify the value of ``first`` in the created
instance:

.. code::

  PyObject* PyDateTime_FromDateAndTimeAndFirst(int year, int month, int day, int hour, int minute, int second, int usecond, bool first)

Return a ``datetime.datetime`` object with the specified year, month,
day, hour, minute, second, microsecond and first.

.. code::

  PyObject* PyTime_FromTimeAndFirst(int hour, int minute, int second, int usecond, bool first)

Return a ``datetime.time`` object with the specified hour, minute,
second, microsecond and first.


Affected Behaviors
------------------

Conversion from naive to aware
..............................

The ``astimezone()`` method will now work for naive ``self``.  The
system local timezone will be assumed in this case and the ``first``
flag will be used to determine which UTC offset is in effect
in the ambiguous case.

For example, on a system set to US/Eastern timezone::

    >>> dt = datetime(2014, 11, 2, 1, 30)
    >>> dt.astimezone().strftime('%D %T %Z%z')
    '11/02/14 01:30:00 EDT-0400'
    >>> dt.replace(first=False).astimezone().strftime('%D %T %Z%z')
    '11/02/14 01:30:00 EST-0500'

Conversion to POSIX seconds from EPOCH
......................................

The ``timestamp()`` method of ``datetime.datetime`` will return different
values for ``datetime.datetime`` instances that differ only by the value
of their ``first`` attribute if and only if these instances represent an
ambiguous or a missing time.

When a ``datetime.datetime`` instance ``dt`` represents an ambiguous
(repeated) time, there are two values ``s0`` and ``s1`` such that::

    datetime.fromtimestamp(s0) == datetime.fromtimestamp(s1) == dt

In this case, ``dt.timestamp()`` will return the smaller of ``s0``
and ``s1`` values if ``dt.first == True`` and the larger otherwise.


For example, on a system set to US/Eastern timezone::

    >>> datetime(2014, 11, 2, 1, 30, first=True).timestamp()
    1414906200.0
    >>> datetime(2014, 11, 2, 1, 30, first=False).timestamp()
    1414909800.0


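The two values in this example can be checked with plain fixed-offset
arithmetic in current Python.  This is only a sketch: the helper
``ambiguous_timestamp`` is a hypothetical name, and the EDT and EST
offsets are hard-coded rather than looked up in a timezone database.

.. code:: python

    from datetime import datetime, timedelta, timezone

    EDT = timezone(timedelta(hours=-4), "EDT")  # offset before the fold
    EST = timezone(timedelta(hours=-5), "EST")  # offset after the fold

    def ambiguous_timestamp(naive, first):
        """Timestamp of a wall time in the fold, chosen by the flag."""
        # first=True selects the earlier moment (pre-transition offset),
        # first=False the later one (post-transition offset).
        tz = EDT if first else EST
        return naive.replace(tzinfo=tz).timestamp()

    wall = datetime(2014, 11, 2, 1, 30)
    s0 = ambiguous_timestamp(wall, first=True)   # 1414906200.0
    s1 = ambiguous_timestamp(wall, first=False)  # 1414909800.0

The two results differ by 3600 seconds, the size of the fold.
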
When a ``datetime.datetime`` instance ``dt`` represents a missing
time, there is no value ``s`` for which::

    datetime.fromtimestamp(s) == dt

but we can form two "nice to know" values of ``s`` that differ
by the size of the gap in seconds.  One is the value of ``s``
that would correspond to ``dt`` in a timezone where the UTC offset
is always the same as the offset right before the gap and the
other is the similar value but in a timezone where the UTC offset
is always the same as the offset right after the gap.

The value returned by ``dt.timestamp()`` given a missing
``dt`` will be the larger of the two "nice to know" values
if ``dt.first == True`` and the smaller otherwise.

For example, on a system set to US/Eastern timezone::

    >>> datetime(2015, 3, 8, 2, 30, first=True).timestamp()
    1425799800.0
    >>> datetime(2015, 3, 8, 2, 30, first=False).timestamp()
    1425796200.0


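The "nice to know" values for the gap can be reproduced in the same
way.  The sketch below hard-codes the offsets on either side of the
2015 US/Eastern spring-forward transition; ``missing_timestamp`` is a
hypothetical helper name, not a proposed API.  It also shows where
each value lands when converted back to local time.

.. code:: python

    from datetime import datetime, timedelta, timezone

    EST = timezone(timedelta(hours=-5), "EST")  # offset right before the gap
    EDT = timezone(timedelta(hours=-4), "EDT")  # offset right after the gap

    def missing_timestamp(naive, first):
        """Timestamp for a wall time in the gap, chosen by the flag."""
        # first=True extends the pre-gap offset forward (larger value);
        # first=False extends the post-gap offset backward (smaller value).
        tz = EST if first else EDT
        return naive.replace(tzinfo=tz).timestamp()

    wall = datetime(2015, 3, 8, 2, 30)
    larger = missing_timestamp(wall, first=True)    # 1425799800.0
    smaller = missing_timestamp(wall, first=False)  # 1425796200.0

    # Neither value maps back to 02:30: each one corresponds to a valid
    # local time on the opposite side of the gap.
    after_gap = datetime.fromtimestamp(larger, EDT)    # 03:30 EDT
    before_gap = datetime.fromtimestamp(smaller, EST)  # 01:30 EST
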
Conversion from POSIX seconds from EPOCH
........................................


The ``fromtimestamp()`` class method of ``datetime.datetime`` will
set the ``first`` attribute appropriately in the returned object.

For example, on a system set to US/Eastern timezone::

    >>> datetime.fromtimestamp(1414906200)
    datetime.datetime(2014, 11, 2, 1, 30)
    >>> datetime.fromtimestamp(1414906200 + 3600)
    datetime.datetime(2014, 11, 2, 1, 30, first=False)


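A minimal sketch of how the flag could be derived from a timestamp is
shown below.  It handles only the single 2014 US/Eastern fall-back
transition; the constant names and the ``local_from_timestamp`` helper
are hypothetical, and the function returns the flag alongside a naive
datetime because released Python has no ``first`` attribute.

.. code:: python

    from datetime import datetime, timedelta, timezone

    TRANSITION = 1414908000        # 2014-11-02 06:00:00 UTC, end of DST
    OLD = timedelta(hours=-4)      # EDT, in effect before the transition
    NEW = timedelta(hours=-5)      # EST, in effect after the transition

    def local_from_timestamp(s):
        """Return (naive local datetime, first) for POSIX timestamp s."""
        utc = datetime.fromtimestamp(s, timezone.utc).replace(tzinfo=None)
        if s < TRANSITION:
            return utc + OLD, True
        # Right after the clocks go back, wall clock readings repeat for
        # as many seconds as the offset decreased.
        repeated = s < TRANSITION + (OLD - NEW).total_seconds()
        return utc + NEW, not repeated

    first_pass = local_from_timestamp(1414906200)
    second_pass = local_from_timestamp(1414906200 + 3600)

Both calls yield the wall time 01:30 on 2 November 2014; the flag is
``True`` for ``first_pass`` and ``False`` for ``second_pass``.
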
Combining and splitting date and time
.....................................

The ``datetime.datetime.combine()`` method will copy the value of the
``first`` attribute to the resulting ``datetime.datetime`` instance.

The ``datetime.datetime.time()`` method will copy the value of the
``first`` attribute to the resulting ``datetime.time`` instance.


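The copying behavior can be observed in released Python (3.6 and
later) through the ``fold`` attribute, which is not the ``first``
attribute described in this draft: ``fold`` has the inverted sense,
so ``fold=1`` stands in for ``first=False`` in the sketch below.

.. code:: python

    from datetime import date, datetime, time

    # ``fold=1`` plays the role of ``first=False`` (an assumption of this
    # sketch; the ``first`` keyword itself is not available).
    t = time(1, 30, fold=1)
    dt = datetime.combine(date(2014, 11, 2), t)

    copied_to_datetime = dt.fold     # flag carried from time to datetime
    copied_to_time = dt.time().fold  # and back from datetime to time
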
Implementations of tzinfo in stdlib
...................................

No new implementations of the ``datetime.tzinfo`` abstract class are
introduced in this PEP.  The existing (fixed offset) timezones do
not introduce ambiguous local times and their ``utcoffset()``
implementation will return the same constant value as they do now
regardless of the value of ``first``.

The basic implementation of ``fromutc()`` in the abstract
``datetime.tzinfo`` class will not change.  It is currently not
used anywhere in the stdlib because the only included ``tzinfo``
implementation (the ``datetime.timezone`` class implementing fixed
offset timezones) overrides ``fromutc()``.

New guidelines will be published for implementing concrete timezones
with variable UTC offset.


Guidelines for new tzinfo implementations
-----------------------------------------

Implementors of concrete ``datetime.tzinfo`` subclasses who want to
support variable UTC offsets (due to DST and other causes) must follow
these guidelines.

New subclasses must override the base-class ``fromutc()`` method and
implement it so that in all cases where two UTC times ``u1`` and
``u2`` (``u1`` < ``u2``) correspond to the same local time,
``fromutc(u1)`` will return an instance with ``first=True`` and
``fromutc(u2)`` will return an instance with ``first=False``.  In all
other cases the returned instance must have ``first=True``.

New implementations of ``utcoffset()`` and ``dst()`` methods should
ignore the value of ``first`` unless they are called on the ambiguous
or missing times.

On an ambiguous time introduced at the end of DST, the values returned
by ``utcoffset()`` and ``dst()`` methods should be as follows:

+-----------------+----------------+------------------+
|                 | first=True     | first=False      |
+=================+================+==================+
| utcoffset()     | stdoff + hour  | stdoff           |
+-----------------+----------------+------------------+
| dst()           | hour           | zero             |
+-----------------+----------------+------------------+

where ``stdoff`` is the standard (non-DST) offset,
``hour = timedelta(hours=1)`` and ``zero = timedelta(0)``.

On a missing time introduced at the start of DST, the values returned
by ``utcoffset()`` and ``dst()`` methods should be as follows:


+-----------------+----------------+------------------+
|                 | first=True     | first=False      |
+=================+================+==================+
| utcoffset()     | stdoff         | stdoff + hour    |
+-----------------+----------------+------------------+
| dst()           | zero           | hour             |
+-----------------+----------------+------------------+


On ambiguous/missing times introduced by a change in the standard time
offset, the ``dst()`` method should return the same value regardless of
the value of ``first`` and the ``utcoffset()`` method should return
values according to the following table, where ``oldoff`` and
``newoff`` are the UTC offsets before and after the change and
``delta`` is the (positive) size of the fold or gap:

+-----------------+----------------+-----------------------------+
|                 | first=True     | first=False                 |
+=================+================+=============================+
| ambiguous       | oldoff         | newoff = oldoff - delta     |
+-----------------+----------------+-----------------------------+
| missing         | oldoff         | newoff = oldoff + delta     |
+-----------------+----------------+-----------------------------+



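The guidelines above can be sketched as a small ``tzinfo`` subclass.
Several things here are assumptions made for illustration only: the
class ``Eastern`` covers just the post-2007 US/Eastern DST rules, the
helpers ``_nth_sunday`` and ``_is_first`` are hypothetical, and
because released Python has no ``first`` attribute, the sketch falls
back to the ``fold`` attribute of Python 3.6 and later, whose sense is
inverted (``fold=1`` corresponds to ``first=False``).

.. code:: python

    from datetime import datetime, timedelta, timezone, tzinfo

    HOUR = timedelta(hours=1)
    ZERO = timedelta(0)

    def _nth_sunday(year, month, n):
        """Midnight of the n-th Sunday (n >= 1) of a month, naive."""
        d = datetime(year, month, 1)
        d += timedelta(days=(6 - d.weekday()) % 7)  # first Sunday
        return d + timedelta(weeks=n - 1)

    def _is_first(dt):
        # Stand-in for the proposed ``first`` attribute: use it if it
        # exists, otherwise invert ``fold``.
        return getattr(dt, "first", not getattr(dt, "fold", 0))

    class Eastern(tzinfo):
        """Simplified US/Eastern zone; an illustration, not a tz database."""
        stdoff = timedelta(hours=-5)

        def _bounds(self, year):
            # Local wall clock starts of the gap and of the fold.
            start = _nth_sunday(year, 3, 2).replace(hour=2)  # gap [02:00, 03:00)
            end = _nth_sunday(year, 11, 1).replace(hour=1)   # fold [01:00, 02:00)
            return start, end

        def dst(self, dt):
            naive = dt.replace(tzinfo=None)
            start, end = self._bounds(dt.year)
            first = _is_first(dt)
            if start <= naive < start + HOUR:   # missing time
                return ZERO if first else HOUR
            if end <= naive < end + HOUR:       # ambiguous time
                return HOUR if first else ZERO
            return HOUR if start + HOUR <= naive < end else ZERO

        def utcoffset(self, dt):
            return self.stdoff + self.dst(dt)

        def tzname(self, dt):
            return "EDT" if self.dst(dt) else "EST"

        def fromutc(self, dt):
            # dt carries UTC field values with tzinfo set to self.
            naive = dt.replace(tzinfo=None)
            start, end = self._bounds(dt.year)
            utc_start = start - self.stdoff   # DST begins at 07:00 UTC
            utc_end = end - self.stdoff       # DST ends at 06:00 UTC
            if utc_start <= naive < utc_end:
                local, fold = naive + self.stdoff + HOUR, 0
            else:
                local = naive + self.stdoff
                # The hour after the end of DST repeats local 01:00-02:00.
                fold = 1 if utc_end <= naive < utc_end + HOUR else 0
            return local.replace(tzinfo=self, fold=fold)

    tz = Eastern()
    amb = datetime(2014, 11, 2, 1, 30, tzinfo=tz)
    second = datetime(2014, 11, 2, 6, 30, tzinfo=timezone.utc).astimezone(tz)

With this class, ``amb.utcoffset()`` is ``stdoff + hour`` and
``second`` is the repeated 01:30 with ``fold=1`` and an offset of
``stdoff``, matching the first table.
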
Pickle size
-----------

Pickle sizes for the ``datetime.datetime`` and ``datetime.time``
objects will not change.  The ``first`` flag will be encoded in the
first (most significant) bit of the minute byte: the 6th byte
(offset 5) of the ``datetime.datetime`` pickle payload or the 2nd byte
(offset 1) of the ``datetime.time`` payload.  In the
`current implementation`_ these bytes are used to store the minute
value (0-59) and the first bit is always 0.  Note that ``first=True``
will be encoded as 0 in the first bit and ``first=False`` as 1.  (This
change only affects the pickle format.  In the C implementation, the
``first`` member will get a full byte to store the actual boolean
value.)

We chose the minute byte to store the ``first`` bit because this
choice preserves the natural ordering.

.. _current implementation: https://hg.python.org/cpython/file/d3b20bff9c5d/Include/datetime.h#l17

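The bit layout described in this section can be sketched as follows.
The function names ``encode_minute`` and ``decode_minute`` are
hypothetical; the sketch shows only the packing of one byte and is
not the pickle code itself.

.. code:: python

    def encode_minute(minute, first=True):
        """Pack a minute (0-59) and the flag into a single byte value."""
        if not 0 <= minute <= 59:
            raise ValueError("minute must be in 0..59")
        # first=True leaves the high bit clear, so the byte is identical
        # to the one written before the flag existed.
        return minute | (0 if first else 0x80)

    def decode_minute(byte):
        """Return (minute, first) from a packed byte value."""
        return byte & 0x7F, not (byte & 0x80)

    plain = encode_minute(30)                 # 30, same as the old format
    flagged = encode_minute(30, first=False)  # 158, high bit set

Because ``first=True`` is stored as a zero bit, payloads for the
default case are byte-for-byte identical to those written today.
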
Temporal Arithmetic
-------------------

The value of ``first`` will be ignored in all operations except those
that involve conversion between timezones. [#]_  As a consequence,
``datetime.datetime`` or ``datetime.time`` instances that differ only
by the value of ``first`` will compare as equal.  Applications that
need to differentiate between such instances should check the value of
``first`` or convert them to a timezone that does not have ambiguous
times.

The result of addition (subtraction) of a timedelta to (from) a
datetime will always have ``first`` set to ``True`` even if the
original datetime instance had ``first=False``.

.. [#] As of Python 3.5, ``tzinfo`` is ignored whenever a timedelta is
   added to or subtracted from a ``datetime.datetime`` instance or when
   one ``datetime.datetime`` instance is subtracted from another with
   the same (even not-None) ``tzinfo``.  This may change in the future,
   but such changes are outside the scope of this PEP.


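These rules can be observed in released Python (3.6 and later) with
the ``fold`` attribute standing in for ``first``.  This substitution
is an assumption of the sketch: ``fold`` has the inverted sense, so
``fold=1`` corresponds to ``first=False`` and ``fold=0`` to
``first=True``.

.. code:: python

    from datetime import datetime, timedelta

    first_pass = datetime(2014, 11, 2, 1, 30)           # like first=True
    second_pass = datetime(2014, 11, 2, 1, 30, fold=1)  # like first=False

    # The flag is ignored by comparison of naive instances.
    same = first_pass == second_pass

    # Arithmetic resets the flag to its default.
    shifted = second_pass + timedelta(minutes=1)
    back = shifted - timedelta(minutes=1)  # same wall time, flag reset

Here ``same`` is ``True``, and both ``shifted.fold`` and ``back.fold``
are ``0`` even though ``second_pass.fold`` is ``1``.
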
Backward and Forward Compatibility
----------------------------------

This proposal will have little effect on programs that neither read
the ``first`` flag explicitly nor use tzinfo implementations that do.
The only visible change for such programs will be that conversions to
and from POSIX timestamps will now round-trip correctly (up to
floating point rounding).  Programs that implemented work-arounds to
the old incorrect behavior will need to be modified.

Pickles produced by older programs will remain fully forward
compatible.  Only datetime/time instances with ``first=False`` pickled
in the new versions will become unreadable by the older Python
versions.  Pickles of instances with ``first=True`` (which is the
default) will remain unchanged.

Questions and Answers
=====================

1. Why not call the new flag "isdst"?

-------

* Alice: Bob - let's have a stargazing party at 01:30 AM tomorrow!
* Bob: Should I presume initially that summer time (for example, Daylight
  Saving Time) is or is not (respectively) in effect for the specified time?
* Alice: Huh?

-------

* Bob: Alice - let's have a stargazing party at 01:30 AM tomorrow!
* Alice: You know, Bob, 01:30 AM will happen twice tomorrow. Which time do you have in mind?
* Bob: I did not think about it, but let's pick the first.


2. Why "first"?

* Rejections:

  **second**
    rejected because ``second`` is already the name of an existing
    attribute.

  **later**
    rejected because "later" is confusable with "latter".

  **earlier**
    rejected because "earlier" has the same issue as "first" (requires
    default to be True) but is two characters longer.

* Remaining possibilities:

  **repeated**
    this is a strong candidate

  **is_first**
    arguably more grammatically correct than "first"

  **ltdf**
    (Local Time Disambiguation Flag) - short and no-one will
    attempt to guess what it means without reading the docs.

Implementation
==============

* GitHub fork: https://github.com/abalkin/cpython
* Tracker issue: http://bugs.python.org/issue24773


Copyright
=========

This document has been placed in the public domain.


Picture Credit
==============

This image is a work of a U.S. military or Department of Defense
employee, taken or made as part of that person's official duties.  As a
work of the U.S. federal government, the image is in the public
domain.