PEP: 495
Title: Local Time Disambiguation
Version: $Revision$
Last-Modified: $Date$
Author: Alexander Belopolsky
Discussions-To: Datetime-SIG
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 02-Aug-2015


Abstract
========

This PEP adds a boolean member to the instances of the ``datetime.time``
and ``datetime.datetime`` classes that can be used to differentiate
between two moments in time for which local times are the same.

.. sidebar:: US public service advertisement

   .. image:: pep-0495-daylightsavings.png
      :align: right
      :width: 15%


Rationale
=========

In most world locations there have been and will be times when local
clocks are moved back. Each such move introduces an interval in which
local clocks show the same time twice in the same day. In these
situations, the information displayed on a local clock (or stored in a
Python datetime instance) is insufficient to identify a particular
moment in time. The proposed solution is to add a boolean flag to the
``datetime`` instances that will distinguish between the two ambiguous
times.


Terminology
===========

When clocks are moved back, we say that a *fold* is created in the
fabric of time. When the clocks are moved forward, a *gap* is created.
A local time that falls in the fold is called *ambiguous*. A local time
that falls in the gap is called *missing*.


Proposal
========

The "first" flag
----------------

We propose adding a boolean member called ``first`` to the instances of
the ``datetime.time`` and ``datetime.datetime`` classes. This member
should have the value ``True`` for all instances except those that
represent the second (chronologically) moment in time in an ambiguous
case. [#]_

.. [#] An instance that has ``first=False`` in a non-ambiguous case is
   said to represent an invalid time (or is invalid for short), but
   users are not prevented from creating invalid instances by passing
   ``first=False`` to a constructor or to a ``replace()`` method. This
   is similar to the current situation with the instances that fall in
   the spring-forward gap. Such instances don't represent any valid
   time, but neither the constructors nor the ``replace()`` methods
   check whether the instances that they produce are valid. Moreover,
   this PEP specifies how various functions should behave when given an
   invalid instance.


Affected APIs
-------------

Attributes
..........

Instances of ``datetime.time`` and ``datetime.datetime`` will get a new
boolean attribute called ``first``.

Constructors
............

The ``__new__`` methods of the ``datetime.time`` and
``datetime.datetime`` classes will get a new keyword-only argument
called ``first`` with the default value ``True``. The value of the
``first`` argument will be used to initialize the value of the
``first`` attribute in the returned instance.

Methods
.......

The ``replace()`` methods of the ``datetime.time`` and
``datetime.datetime`` classes will get a new keyword-only argument
called ``first``. It will behave similarly to the other ``replace()``
arguments: if the ``first`` argument is specified and given a boolean
value, the new instance returned by ``replace()`` will have its
``first`` attribute set to that value. In CPython, a non-boolean value
of ``first`` will raise a ``TypeError``, but other implementations may
allow the value ``None`` to behave the same as when ``first`` is not
given. If the ``first`` argument is not specified, the original value
of the ``first`` attribute is copied to the result.


Affected Behaviors
------------------

Conversion from naive to aware
..............................

The ``astimezone()`` method will now work for naive ``self``. The
system local timezone will be assumed in this case and the ``first``
flag will be used to determine which local timezone is in effect in the
ambiguous case.

For example, on a system set to US/Eastern timezone::

    >>> dt = datetime(2014, 11, 2, 1, 30)
    >>> dt.astimezone().strftime('%D %T %Z%z')
    '11/02/14 01:30:00 EDT-0400'
    >>> dt.replace(first=False).astimezone().strftime('%D %T %Z%z')
    '11/02/14 01:30:00 EST-0500'

Conversion to POSIX seconds from EPOCH
......................................

The ``timestamp()`` method of ``datetime.datetime`` will return
different values for ``datetime.datetime`` instances that differ only
by the value of their ``first`` attribute if and only if these
instances represent an ambiguous or a missing time.

When a ``datetime.datetime`` instance ``dt`` represents an ambiguous
(repeated) time, there are two values ``s0`` and ``s1`` such that::

    datetime.fromtimestamp(s0) == datetime.fromtimestamp(s1) == dt

In this case, ``dt.timestamp()`` will return the smaller of the ``s0``
and ``s1`` values if ``dt.first == True`` and the larger otherwise.

For example, on a system set to US/Eastern timezone::

    >>> datetime(2014, 11, 2, 1, 30, first=True).timestamp()
    1414906200.0
    >>> datetime(2014, 11, 2, 1, 30, first=False).timestamp()
    1414909800.0

When a ``datetime.datetime`` instance ``dt`` represents a missing time,
there is no value ``s`` for which::

    datetime.fromtimestamp(s) == dt

but we can form two "nice to know" values of ``s`` that differ by the
size of the gap in seconds. One is the value of ``s`` that would
correspond to ``dt`` in a timezone where the UTC offset is always the
same as the offset right before the gap, and the other is the similar
value but in a timezone where the UTC offset is always the same as the
offset right after the gap.

The value returned by ``dt.timestamp()`` given a missing ``dt`` will be
the larger of the two "nice to know" values if ``dt.first == True`` and
the smaller otherwise.

For example, on a system set to US/Eastern timezone::

    >>> datetime(2015, 3, 8, 2, 30, first=True).timestamp()
    1425799800.0
    >>> datetime(2015, 3, 8, 2, 30, first=False).timestamp()
    1425796200.0

Conversion from POSIX seconds from EPOCH
........................................

The ``fromtimestamp()`` class method of ``datetime.datetime`` will set
the ``first`` attribute appropriately in the returned object.

For example, on a system set to US/Eastern timezone::

    >>> datetime.fromtimestamp(1414906200)
    datetime.datetime(2014, 11, 2, 1, 30)
    >>> datetime.fromtimestamp(1414906200 + 3600)
    datetime.datetime(2014, 11, 2, 1, 30, first=False)

Combining and splitting date and time
.....................................

The ``datetime.datetime.combine()`` method will copy the value of the
``first`` attribute of its ``time`` argument to the resulting
``datetime.datetime`` instance.

The ``datetime.datetime.time()`` method will copy the value of the
``first`` attribute to the resulting ``datetime.time`` instance.

Implementations of tzinfo in stdlib
...................................

No new implementations of the ``datetime.tzinfo`` abstract class are
introduced in this PEP. The existing (fixed offset) timezones do not
introduce ambiguous local times and their ``utcoffset()``
implementation will return the same constant value as they do now
regardless of the value of ``first``.

The basic implementation of ``fromutc()`` in the abstract
``datetime.tzinfo`` class will not change. It is currently not used
anywhere in the stdlib because the only included ``tzinfo``
implementation (the ``datetime.timezone`` class implementing fixed
offset timezones) overrides ``fromutc()``.

New guidelines will be published for implementing concrete timezones
with variable UTC offset.


Guidelines for new tzinfo implementations
-----------------------------------------

Implementors of concrete ``datetime.tzinfo`` subclasses who want to
support variable UTC offsets (due to DST and other causes) must follow
these guidelines.

New subclasses must override the base-class ``fromutc()`` method and
implement it so that in all cases where two UTC times ``u1`` and ``u2``
(``u1`` < ``u2``) correspond to the same local time, ``fromutc(u1)``
will return an instance with ``first=True`` and ``fromutc(u2)`` will
return an instance with ``first=False``. In all other cases the
returned instance must have ``first=True``.

New implementations of the ``utcoffset()`` and ``dst()`` methods should
ignore the value of ``first`` unless they are called on ambiguous or
missing times.

On an ambiguous time introduced at the end of DST, the values returned
by the ``utcoffset()`` and ``dst()`` methods should be as follows:

+-----------------+----------------+------------------+
|                 | first=True     | first=False      |
+=================+================+==================+
| utcoffset()     | stdoff + hour  | stdoff           |
+-----------------+----------------+------------------+
| dst()           | hour           | zero             |
+-----------------+----------------+------------------+

where ``stdoff`` is the standard (non-DST) offset,
``hour = timedelta(hours=1)`` and ``zero = timedelta(0)``.
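
The end-of-DST table above can be read as a small lookup. The following
is a minimal sketch, not part of the proposal: ``fold_offsets`` is a
hypothetical helper, and ``first`` is passed as a plain argument because
the attribute proposed here is not available on existing ``datetime``
instances.

```python
from datetime import timedelta

HOUR = timedelta(hours=1)
ZERO = timedelta(0)


def fold_offsets(stdoff, first):
    """Return (utcoffset, dst) for an ambiguous time at the end of DST.

    Hypothetical helper mirroring the table: ``stdoff`` is the standard
    (non-DST) offset and ``first`` is the proposed disambiguation flag.
    """
    if first:
        # First occurrence of the repeated hour: DST is still in effect.
        return stdoff + HOUR, HOUR
    # Second occurrence: the clocks have already moved back.
    return stdoff, ZERO


# Example values only: -5 hours is the US/Eastern standard offset.
est = timedelta(hours=-5)
edt_pair = fold_offsets(est, first=True)
est_pair = fold_offsets(est, first=False)
```

A real ``tzinfo`` subclass would also have to decide whether a given
local time falls in the fold at all; the sketch covers only the choice
between the two columns once that is known.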

On a missing time introduced at the start of DST, the values returned
by the ``utcoffset()`` and ``dst()`` methods should be as follows:

+-----------------+----------------+------------------+
|                 | first=True     | first=False      |
+=================+================+==================+
| utcoffset()     | stdoff         | stdoff + hour    |
+-----------------+----------------+------------------+
| dst()           | zero           | hour             |
+-----------------+----------------+------------------+

On ambiguous/missing times introduced by a change in the standard time
offset, the ``dst()`` method should return the same value regardless of
the value of ``first`` and the ``utcoffset()`` method should return
values according to the following table:

+-----------------+----------------+-----------------------------+
|                 | first=True     | first=False                 |
+=================+================+=============================+
| ambiguous       | oldoff         | newoff = oldoff - delta     |
+-----------------+----------------+-----------------------------+
| missing         | oldoff         | newoff = oldoff + delta     |
+-----------------+----------------+-----------------------------+


Pickle size
-----------

Pickle sizes for the ``datetime.datetime`` and ``datetime.time``
objects will not change. The ``first`` flag will be encoded in the
first bit of the 5th byte of the ``datetime.datetime`` pickle payload
or the 2nd byte of the ``datetime.time`` payload. In the
`current implementation`_ these bytes are used to store the minute
value (0-59) and the first bit is always 0. Note that ``first=True``
will be encoded as 0 in the first bit and ``first=False`` as 1. (This
change only affects the pickle format. In the C implementation, the
``first`` member will get a full byte to store the actual boolean
value.)

We chose the minute byte to store the ``first`` bit because this choice
preserves the natural ordering.

.. _current implementation: https://hg.python.org/cpython/file/d3b20bff9c5d/Include/datetime.h#l17


Temporal Arithmetics
--------------------

The value of ``first`` will be ignored in all operations except those
that involve conversion between timezones. [#]_ As a consequence,
``datetime.datetime`` or ``datetime.time`` instances that differ only
by the value of ``first`` will compare as equal. Applications that need
to differentiate between such instances should check the value of
``first`` or convert them to a timezone that does not have ambiguous
times.

The result of addition (subtraction) of a timedelta to (from) a
datetime will always have ``first`` set to ``True`` even if the
original datetime instance had ``first=False``.

.. [#] As of Python 3.5, ``tzinfo`` is ignored whenever a timedelta is
   added to or subtracted from a ``datetime.datetime`` instance or when
   one ``datetime.datetime`` instance is subtracted from another with
   the same (even not-None) ``tzinfo``. This may change in the future,
   but such changes are outside of the scope of this PEP.


Backward and Forward Compatibility
----------------------------------

This proposal will have little effect on the programs that do not read
the ``first`` flag explicitly or use tzinfo implementations that do.
The only visible change for such programs will be that conversions to
and from POSIX timestamps will now round-trip correctly (up to floating
point rounding). Programs that implemented work-arounds to the old
incorrect behavior will need to be modified.

Pickles produced by older programs will remain fully forward
compatible. Only datetime/time instances with ``first=False`` pickled
in the new versions will become unreadable by the older Python
versions. Pickles of instances with ``first=True`` (which is the
default) will remain unchanged.


Questions and Answers
=====================

1. Why not call the new flag "isdst"?

-------

* Alice: Bob - let's have a stargazing party at 01:30 AM tomorrow!
* Bob: Should I presume initially that summer time (for example,
  Daylight Saving Time) is or is not (respectively) in effect for the
  specified time?
* Alice: Huh?

-------

* Bob: Alice - let's have a stargazing party at 01:30 AM tomorrow!
* Alice: You know, Bob, 01:30 AM will happen twice tomorrow. Which time
  do you have in mind?
* Bob: I did not think about it, but let's pick the first.


2. Why "first"?

* Rejections:

  **second**
        rejected because "second" is already there.

  **later**
        rejected because "later" is confusable with "latter".

  **earlier**
        rejected because "earlier" has the same issue as "first"
        (requires default to be True) but is two characters longer.

* Remaining possibilities:

  **repeated**
        this is a strong candidate

  **is_first**
        arguably more grammatically correct than "first"

  **ltdf**
        (Local Time Disambiguation Flag) - short and no-one will
        attempt to guess what it means without reading the docs.


Implementation
==============

* GitHub fork: https://github.com/abalkin/cpython
* Tracker issue: http://bugs.python.org/issue24773


Copyright
=========

This document has been placed in the public domain.


Picture Credit
==============

This image is a work of a U.S. military or Department of Defense
employee, taken or made as part of that person's official duties. As a
work of the U.S. federal government, the image is in the public domain.