PEP: 495
Title: Local Time Disambiguation
Version: $Revision$
Last-Modified: $Date$
Author: Alexander Belopolsky <alexander.belopolsky@gmail.com>, Tim Peters <tim.peters@gmail.com>
Discussions-To: Datetime-SIG <datetime-sig@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 02-Aug-2015

Abstract
========

This PEP adds a boolean member to the instances of ``datetime.time``
and ``datetime.datetime`` classes that can be used to differentiate
between two moments in time for which local times are the same.

.. sidebar:: US public service advertisement

   .. image:: pep-0495-daylightsavings.png
      :align: center
      :width: 95%

Rationale
=========

In most world locations there have been and will be times when
local clocks are moved back. [#]_ At those times, intervals are introduced
in which local clocks show the same time twice in the same day. In
these situations, the information displayed on a local clock (or
stored in a Python datetime instance) is insufficient to identify a
particular moment in time. The proposed solution is to add a boolean
flag to the ``datetime`` instances that will distinguish between the
two ambiguous times.

.. [#] People who live in locations observing Daylight Saving
   Time (DST) move their clocks back (usually one hour) every Fall.

   It is less common, but occasionally clocks can be moved back for
   other reasons. For example, Ukraine skipped the spring-forward
   transition in March 1990 and instead moved their clocks back on
   July 1, 1990, switching from Moscow Time to Eastern European Time.
   In that case, standard (winter) time was in effect before and after
   the transition.

   Both DST and standard time changes may result in time shifts other
   than an hour.

Terminology
===========

When clocks are moved back, we say that a *fold* is created in time.
When the clocks are moved forward, a *gap* is created. A local time
that falls in the fold is called *ambiguous*. A local time that falls
in the gap is called *missing*.

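The definitions above can be illustrated with a short sketch. It is not
part of the proposal: the function name ``classify`` and its parameters
are hypothetical, and it assumes a single transition described by its
UTC instant and by the UTC offsets in effect before and after it.

```python
from datetime import datetime, timedelta


def classify(local, transition_utc, oldoff, newoff):
    """Classify a naive local time around one UTC-offset transition.

    Returns "ambiguous", "missing" or "regular".
    """
    # Wall-clock readings at the transition instant under each offset.
    wall_before = transition_utc + oldoff
    wall_after = transition_utc + newoff
    if wall_after <= local < wall_before:
        return "ambiguous"  # clocks moved back: the reading occurs twice
    if wall_before <= local < wall_after:
        return "missing"    # clocks moved forward: the reading never occurs
    return "regular"


EDT = timedelta(hours=-4)
EST = timedelta(hours=-5)

# US/Eastern, fall 2014: clocks move back at 06:00 UTC on November 2.
fall = classify(datetime(2014, 11, 2, 1, 30), datetime(2014, 11, 2, 6), EDT, EST)
# US/Eastern, spring 2015: clocks move forward at 07:00 UTC on March 8.
spring = classify(datetime(2015, 3, 8, 2, 30), datetime(2015, 3, 8, 7), EST, EDT)
print(fall, spring)  # ambiguous missing
```
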
Proposal
========

The "first" flag
----------------

We propose adding a boolean member called ``first`` to the instances
of ``datetime.time`` and ``datetime.datetime`` classes. This member
should have the value ``True`` for all instances except those that
represent the second (chronologically) moment in time in an ambiguous
case. [#]_

.. [#] An instance that has ``first=False`` in a non-ambiguous case is
   said to represent an invalid time (or is invalid for short), but
   users are not prevented from creating invalid instances by passing
   ``first=False`` to a constructor or to a ``replace()`` method. This
   is similar to the current situation with the instances that fall in
   the spring-forward gap. Such instances don't represent any valid
   time, but neither the constructors nor the ``replace()`` methods
   check whether the instances that they produce are valid. Moreover,
   this PEP specifies how various functions should behave when given an
   invalid instance.

Affected APIs
-------------

Attributes
..........

Instances of ``datetime.time`` and ``datetime.datetime`` will get a
new boolean attribute called ``first``.

Constructors
............

The ``__new__`` methods of the ``datetime.time`` and
``datetime.datetime`` classes will get a new keyword-only argument
called ``first`` with the default value ``True``. The value of the
``first`` argument will be used to initialize the value of the
``first`` attribute in the returned instance.

Methods
.......

The ``replace()`` methods of the ``datetime.time`` and
``datetime.datetime`` classes will get a new keyword-only argument
called ``first``. It will behave similarly to the other ``replace()``
arguments: if the ``first`` argument is specified and given a boolean
value, the new instance returned by ``replace()`` will have its
``first`` attribute set to that value. In CPython, a non-boolean value
of ``first`` will raise a ``TypeError``, but other implementations may
allow the value ``None`` to behave the same as when ``first`` is not
given. If the ``first`` argument is not specified, the original value
of the ``first`` attribute is copied to the result.

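The ``replace()`` semantics described above can be sketched in pure
Python. The class ``WallTime`` and the sentinel ``_UNSET`` below are
hypothetical names used only for illustration. They model the described
behavior on a tiny stand-in class and are not the proposed
implementation.

```python
_UNSET = object()  # sentinel: distinguishes "not given" from any real value


class WallTime:
    """Hypothetical stand-in for ``datetime.time`` with the proposed flag."""

    def __init__(self, hour=0, minute=0, *, first=True):
        self.hour = hour
        self.minute = minute
        self.first = first

    def replace(self, hour=None, minute=None, *, first=_UNSET):
        if first is _UNSET:
            first = self.first  # not specified: copy the original value
        elif not isinstance(first, bool):
            # Mirrors the CPython behavior described in the text.
            raise TypeError("first must be a bool")
        return WallTime(
            self.hour if hour is None else hour,
            self.minute if minute is None else minute,
            first=first,
        )


t = WallTime(1, 30, first=False)
print(t.replace(minute=45).first)   # False: the flag is carried over
print(t.replace(first=True).first)  # True: the flag is overridden
```

A private sentinel is used instead of ``None`` as the default so that a
stricter implementation can reject ``None`` while still telling "not
given" apart from an explicit value.
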
C-API
.....

Access macros will be defined to extract the value of ``first`` from
``PyDateTime_DateTime`` and ``PyDateTime_Time`` objects.

.. code::

    bool PyDateTime_GET_FIRST(PyDateTime_DateTime *o)

Return the value of ``first`` as a C99 ``bool``.

.. code::

    bool PyDateTime_TIME_GET_FIRST(PyDateTime_Time *o)

Return the value of ``first`` as a C99 ``bool``.

New constructors will be defined that take an additional
boolean argument to specify the value of ``first`` in the created
instance:

.. code::

    PyObject* PyDateTime_FromDateAndTimeAndFirst(int year, int month, int day, int hour, int minute, int second, int usecond, bool first)

Return a ``datetime.datetime`` object with the specified year, month,
day, hour, minute, second, microsecond and first.

.. code::

    PyObject* PyTime_FromTimeAndFirst(int hour, int minute, int second, int usecond, bool first)

Return a ``datetime.time`` object with the specified hour, minute,
second, microsecond and first.

Affected Behaviors
------------------

What time is it?
................

The ``datetime.now()`` method called with no arguments will set
``first=False`` when returning the second of the two ambiguous times
in a fold. When called with a ``tzinfo`` argument, the value of
``first`` will be determined by the ``tzinfo.fromutc()``
implementation. If an instance of the built-in ``datetime.timezone``
is passed as ``tzinfo``, the returned datetime instance will always
have ``first=True``.

Conversion from naive to aware
..............................

The ``astimezone()`` method will now work for naive ``self``. The
system local timezone will be assumed in this case and the ``first``
flag will be used to determine which local timezone is in effect
in the ambiguous case.

For example, on a system set to US/Eastern timezone::

    >>> dt = datetime(2014, 11, 2, 1, 30)
    >>> dt.astimezone().strftime('%D %T %Z%z')
    '11/02/14 01:30:00 EDT-0400'
    >>> dt.replace(first=False).astimezone().strftime('%D %T %Z%z')
    '11/02/14 01:30:00 EST-0500'

Conversion from POSIX seconds from EPOCH
........................................

The ``fromtimestamp()`` class method of ``datetime.datetime`` will
set the ``first`` attribute appropriately in the returned object.

For example, on a system set to US/Eastern timezone::

    >>> datetime.fromtimestamp(1414906200)
    datetime.datetime(2014, 11, 2, 1, 30)
    >>> datetime.fromtimestamp(1414906200 + 3600)
    datetime.datetime(2014, 11, 2, 1, 30, first=False)

Conversion to POSIX seconds from EPOCH
......................................

The ``timestamp()`` method of ``datetime.datetime`` will return different
values for ``datetime.datetime`` instances that differ only by the value
of their ``first`` attribute if and only if these instances represent an
ambiguous or a missing time.

When a ``datetime.datetime`` instance ``dt`` represents an ambiguous
time, there are two values ``s0`` and ``s1`` such that::

    datetime.fromtimestamp(s0) == datetime.fromtimestamp(s1) == dt

In this case, ``dt.timestamp()`` will return the smaller of ``s0``
and ``s1`` if ``dt.first == True`` and the larger otherwise.

For example, on a system set to US/Eastern timezone::

    >>> datetime(2014, 11, 2, 1, 30, first=True).timestamp()
    1414906200.0
    >>> datetime(2014, 11, 2, 1, 30, first=False).timestamp()
    1414909800.0

When a ``datetime.datetime`` instance ``dt`` represents a missing
time, there is no value ``s`` for which::

    datetime.fromtimestamp(s) == dt

but we can form two "nice to know" values of ``s`` that differ
by the size of the gap in seconds. One is the value of ``s``
that would correspond to ``dt`` in a timezone where the UTC offset
is always the same as the offset right before the gap, and the
other is the similar value but in a timezone where the UTC offset
is always the same as the offset right after the gap.

The value returned by ``dt.timestamp()`` given a missing
``dt`` will be the larger of the two "nice to know" values
if ``dt.first == True`` and the smaller otherwise.

For example, on a system set to US/Eastern timezone::

    >>> datetime(2015, 3, 8, 2, 30, first=True).timestamp()
    1425799800.0
    >>> datetime(2015, 3, 8, 2, 30, first=False).timestamp()
    1425796200.0

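Both rules above reduce to one selection: ``first=True`` uses the UTC
offset in effect before the transition and ``first=False`` uses the
offset in effect after it. The sketch below reproduces the four example
values under that reading. The helper ``posix_timestamp`` and its
parameters are hypothetical, and the offsets are passed in explicitly
so the code does not depend on the system timezone.

```python
from datetime import datetime, timedelta, timezone


def posix_timestamp(local, first, oldoff, newoff):
    """Timestamp of a naive local time that falls in a fold or a gap.

    ``oldoff`` and ``newoff`` are the UTC offsets before and after the
    transition that created the fold or the gap.
    """
    offset = oldoff if first else newoff
    return local.replace(tzinfo=timezone(offset)).timestamp()


EDT = timedelta(hours=-4)
EST = timedelta(hours=-5)

# Fold: clocks go from EDT back to EST, so first=True gives the smaller value.
fold = datetime(2014, 11, 2, 1, 30)
print(posix_timestamp(fold, True, EDT, EST))   # 1414906200.0
print(posix_timestamp(fold, False, EDT, EST))  # 1414909800.0

# Gap: clocks go from EST forward to EDT, so first=True gives the larger value.
gap = datetime(2015, 3, 8, 2, 30)
print(posix_timestamp(gap, True, EST, EDT))    # 1425799800.0
print(posix_timestamp(gap, False, EST, EDT))   # 1425796200.0
```
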
Combining and splitting date and time
.....................................

The ``datetime.datetime.combine()`` method will copy the value of the
``first`` attribute to the resulting ``datetime.datetime`` instance.

The ``datetime.datetime.time()`` method will copy the value of the
``first`` attribute to the resulting ``datetime.time`` instance.

Pickles
.......

Pickle sizes for the ``datetime.datetime`` and ``datetime.time``
objects will not change. The ``first`` flag will be encoded in the
first bit of the 5th byte of the ``datetime.datetime`` pickle payload
or the 2nd byte of the ``datetime.time`` payload. In the
`current implementation`_ these bytes are used to store the minute
value (0-59) and the first bit is always 0. Note that ``first=True``
will be encoded as 0 in the first bit and ``first=False`` as 1. (This
change only affects the pickle format. In the C implementation, the
``first`` member will get a full byte to store the actual boolean
value.)

.. _current implementation: https://hg.python.org/cpython/file/d3b20bff9c5d/Include/datetime.h#l17

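The encoding can be sketched as simple bit packing. This is an
illustration only: it assumes that "first bit" means the most
significant bit of the byte, which the text above does not state
explicitly, and the helper names ``pack_minute`` and ``unpack_minute``
are hypothetical.

```python
FIRST_BIT = 0x80  # assumption: "first bit" is the most significant bit


def pack_minute(minute, first=True):
    """Store a minute (0-59) and the flag in a single byte value."""
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in 0..59")
    # first=True leaves the bit clear, so such pickles are unchanged.
    return minute if first else minute | FIRST_BIT


def unpack_minute(byte):
    """Return (minute, first) recovered from a packed byte value."""
    return byte & 0x7F, not (byte & FIRST_BIT)


print(pack_minute(30))               # 30
print(pack_minute(30, first=False))  # 158
print(unpack_minute(158))            # (30, False)
```

Because the minute never exceeds 59, it fits in the low six bits, so
setting a high bit cannot collide with a valid minute value.
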
Implementations of tzinfo in stdlib
===================================

No new implementations of the ``datetime.tzinfo`` abstract class are
proposed in this PEP. The existing (fixed offset) timezones do
not introduce ambiguous local times and their ``utcoffset()``
implementation will return the same constant value as it does now
regardless of the value of ``first``.

The basic implementation of ``fromutc()`` in the abstract
``datetime.tzinfo`` class will not change. It is currently not
used anywhere in the stdlib because the only included ``tzinfo``
implementation (the ``datetime.timezone`` class implementing fixed
offset timezones) overrides ``fromutc()``.

Guidelines for New tzinfo Implementations
=========================================

Implementors of concrete ``datetime.tzinfo`` subclasses who want to
support variable UTC offsets (due to DST and other causes) should follow
these guidelines.

Ignorance is Bliss
------------------

New implementations of ``utcoffset()``, ``tzname()`` and ``dst()``
methods should ignore the value of ``first`` unless they are called on
ambiguous or missing times.

In the DST Fold
---------------

New subclasses should override the base-class ``fromutc()`` method and
implement it so that in all cases where two UTC times ``u1`` and
``u2`` (``u1`` < ``u2``) correspond to the same local time,
``fromutc(u1)`` will return an instance with ``first=True`` and
``fromutc(u2)`` will return an instance with ``first=False``. In all
other cases the returned instance should have ``first=True``.

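A minimal sketch of this rule for a single transition follows. It is
not a ``tzinfo`` subclass: the free function ``fromutc_sketch`` and its
parameters are hypothetical, it works on naive datetimes, and it
returns the flag alongside the local time because no released
``datetime`` carries a ``first`` attribute.

```python
from datetime import datetime, timedelta


def fromutc_sketch(utc, transition_utc, oldoff, newoff):
    """Convert a naive UTC datetime to (naive local datetime, first)."""
    if utc < transition_utc:
        return utc + oldoff, True
    local = utc + newoff
    # After clocks move back, readings earlier than the old wall clock
    # at the transition instant are showing for the second time.
    repeated = local < transition_utc + oldoff
    return local, not repeated


EDT = timedelta(hours=-4)
EST = timedelta(hours=-5)
transition = datetime(2014, 11, 2, 6)  # US/Eastern fall-back instant in UTC

u1 = datetime(2014, 11, 2, 5, 30)
u2 = datetime(2014, 11, 2, 6, 30)
print(fromutc_sketch(u1, transition, EDT, EST))  # 01:30 local, first=True
print(fromutc_sketch(u2, transition, EDT, EST))  # 01:30 local, first=False
```
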
On an ambiguous time introduced at the end of DST, the values returned
by ``utcoffset()`` and ``dst()`` methods should be as follows:

+-----------------+----------------+------------------+
|                 | first=True     | first=False      |
+=================+================+==================+
| utcoffset()     | stdoff + dstoff| stdoff           |
+-----------------+----------------+------------------+
| dst()           | dstoff         | zero             |
+-----------------+----------------+------------------+

where ``stdoff`` is the standard (non-DST) offset, ``dstoff`` is the
DST correction (typically ``dstoff = timedelta(hours=1)``) and ``zero
= timedelta(0)``.

Mind the DST Gap
----------------

On a missing time introduced at the start of DST, the values returned
by ``utcoffset()`` and ``dst()`` methods should be as follows:

+-----------------+----------------+------------------+
|                 | first=True     | first=False      |
+=================+================+==================+
| utcoffset()     | stdoff         | stdoff + dstoff  |
+-----------------+----------------+------------------+
| dst()           | zero           | dstoff           |
+-----------------+----------------+------------------+

Non-DST Folds and Gaps
----------------------

On ambiguous/missing times introduced by the change in the standard time
offset, the ``dst()`` method should return the same value regardless of
the value of ``first`` and the ``utcoffset()`` method should return
values according to the following table:

+-----------------+----------------+-----------------------------+
|                 | first=True     | first=False                 |
+=================+================+=============================+
| ambiguous       | oldoff         | newoff = oldoff - delta     |
+-----------------+----------------+-----------------------------+
| missing         | oldoff         | newoff = oldoff + delta     |
+-----------------+----------------+-----------------------------+

where ``delta`` is the size of the fold or the gap.

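The offset selection in these guidelines can be sketched for a single
transition as follows. The function ``select_utcoffset`` and its
parameters are hypothetical. It consults the flag only inside the fold
or the gap, in line with the "Ignorance is Bliss" guideline, and is a
sketch rather than a complete ``tzinfo`` implementation.

```python
from datetime import datetime, timedelta


def select_utcoffset(local, first, transition_utc, oldoff, newoff):
    """Pick the UTC offset for a naive local time near one transition."""
    # The fold or the gap lies between the two wall-clock readings of
    # the transition instant.
    lo, hi = sorted((transition_utc + oldoff, transition_utc + newoff))
    if lo <= local < hi:
        # Ambiguous or missing time: the flag decides.
        return oldoff if first else newoff
    # Regular time: the flag is ignored.
    return oldoff if local < lo else newoff


EDT = timedelta(hours=-4)
EST = timedelta(hours=-5)
fall_back = datetime(2014, 11, 2, 6)  # US/Eastern, UTC instant of the fold

ambiguous = datetime(2014, 11, 2, 1, 30)
print(select_utcoffset(ambiguous, True, fall_back, EDT, EST) == EDT)   # True
print(select_utcoffset(ambiguous, False, fall_back, EDT, EST) == EST)  # True
```
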
Temporal Arithmetic
===================

The value of ``first`` will be ignored in all operations except those
that involve conversion between timezones. [#]_ As a consequence,
``datetime.datetime`` or ``datetime.time`` instances that differ only
by the value of ``first`` will compare as equal. Applications that
need to differentiate between such instances should check the value of
``first`` or convert them to a timezone that does not have ambiguous
times.

The result of addition (subtraction) of a timedelta to (from) a
datetime will always have ``first`` set to ``True`` even if the
original datetime instance had ``first=False``.

.. [#] Computing a difference between two aware datetime instances
   with different values of ``tzinfo`` involves an implicit timezone
   conversion. In this case, the result may depend on the value of
   the ``first`` flag in either of the instances, but only if the
   instance has ``tzinfo`` that accounts for the value of ``first``
   in its ``utcoffset()`` method.

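The comparison and arithmetic rules can be modeled with a small
stand-in class. ``Moment`` below is a hypothetical name used only for
illustration and is not the proposed implementation. It excludes the
flag from comparison and resets it on addition and subtraction.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Moment:
    """Hypothetical stand-in for a datetime carrying the proposed flag."""

    wall: datetime
    # compare=False keeps the flag out of == and hash().
    first: bool = field(default=True, compare=False)

    def __add__(self, delta):
        # Arithmetic uses the wall-clock value; the flag resets to True.
        return Moment(self.wall + delta)

    def __sub__(self, delta):
        return Moment(self.wall - delta)


a = Moment(datetime(2014, 11, 2, 1, 30), first=True)
b = Moment(datetime(2014, 11, 2, 1, 30), first=False)
print(a == b)                         # True: the flag is ignored
print((b + timedelta(hours=1)).first)  # True: the flag is reset
```
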
Backward and Forward Compatibility
==================================

This proposal will have little effect on programs that do not read
the ``first`` flag explicitly or use tzinfo implementations that do.
The only visible change for such programs will be that conversions to
and from POSIX timestamps will now round-trip correctly (up to
floating point rounding). Programs that implemented workarounds for
the old incorrect behavior will need to be modified.

Pickles produced by older programs will remain fully forward
compatible. Only datetime/time instances with ``first=False`` pickled
in the new versions will become unreadable by the older Python
versions. Pickles of instances with ``first=True`` (which is the
default) will remain unchanged.

Questions and Answers
=====================

Why not call the new flag "isdst"?
----------------------------------

A non-technical answer
......................

* Alice: Bob - let's have a stargazing party at 01:30 AM tomorrow!
* Bob: Should I presume initially that Daylight Saving Time is or is
  not in effect for the specified time?
* Alice: Huh?

-------

* Bob: Alice - let's have a stargazing party at 01:30 AM tomorrow!
* Alice: You know, Bob, 01:30 AM will happen twice tomorrow. Which time do you have in mind?
* Bob: I did not think about it, but let's pick the first.

A technical reason
..................

While the ``tm_isdst`` field of the ``time.struct_time`` object can be
used to disambiguate local times in the fold, the semantics of such
disambiguation are completely different from the proposal in this PEP.

The main problem with the ``tm_isdst`` field is that it is impossible
to know what value is appropriate for ``tm_isdst`` without knowing the
details about the time zone that are only available to the ``tzinfo``
implementation. Thus while ``tm_isdst`` is useful in the *output* of
methods such as ``time.localtime``, it is cumbersome as an *input* of
methods such as ``time.mktime``.

If the programmer misspecifies a non-negative value of ``tm_isdst`` to
``time.mktime``, the result will be a time that is 1 hour off, and since
there is rarely a way to know anything about DST *before* a call to
``time.mktime`` is made, the only sane choice is usually
``tm_isdst=-1``.

Unlike ``tm_isdst``, the proposed ``first`` flag has no effect on the
interpretation of the datetime instance unless without that flag two
(or no) interpretations are possible.

Since it would be very confusing to have something called ``isdst``
that does not have the same semantics as ``tm_isdst``, we need a
different name. Moreover, the ``datetime.datetime`` class already has
a method called ``dst()`` and if we called ``first`` "isdst", we would
necessarily have situations when "isdst" and ``bool(dst())`` values
are different.

Why "first"?
------------

This is a working name chosen initially because the obvious
alternative ("second") conflicts with the existing attribute. It has
since become clear that it is desirable to have a flag with the
default value ``False`` and such that chronological ordering of
disambiguated (datetime, flag) pairs would match their lexicographical
order.

The following alternative names have been proposed:

**fold**
    Suggested by Guido van Rossum and favored by one (but disfavored
    by another) author. Has correct connotations and easy mnemonic
    rules, but at the same time does not invite unfounded assumptions.

**later**
    A close contender to "fold". One author dislikes it because
    it is confusable with equally fitting "latter," but in the age
    of autocompletion everywhere this is a small consideration. A
    stronger objection may be that in the case of missing time, we
    will have a ``later=True`` instance converted to an earlier time by
    ``.astimezone(timezone.utc)`` than that with ``later=False``.
    Yet again, this can be interpreted as a desirable indication that
    the original time is invalid.

**repeated**
    Did not receive any support on the mailing list.

**ltdf**
    (Local Time Disambiguation Flag) - short and no one will attempt
    to guess what it means without reading the docs. (Feel free to
    use it in discussions with the meaning ltdf=False is the
    earlier if you don't want to endorse any of the alternatives
    above.)

Implementation
==============

* GitHub fork: https://github.com/abalkin/cpython
* Tracker issue: http://bugs.python.org/issue24773

Copyright
=========

This document has been placed in the public domain.

Picture Credit
==============

This image is a work of a U.S. military or Department of Defense
employee, taken or made as part of that person's official duties. As a
work of the U.S. federal government, the image is in the public
domain.