PEP 410: Rephrase datetime.datetime and tuple of integers sections; add

"compatible with float" in criteria to choose a timestamp type
Victor Stinner 2012-02-17 14:05:52 +01:00
parent 2aa4edc29f
commit 6e3f3791ef
1 changed files with 122 additions and 55 deletions


@ -45,8 +45,8 @@ mailbox with the mailbox module, etc.
An arbitrary resolution is preferred over a fixed resolution (like nanosecond) An arbitrary resolution is preferred over a fixed resolution (like nanosecond)
to not have to change the API when a better resolution is required. For to not have to change the API when a better resolution is required. For
example, the NTP protocol uses fractions of 2\ :sup:`32` seconds example, the NTP protocol uses fractions of 2\ :sup:`32` seconds
(approximately 2.3 x 10\ :sup:`-10` second), whereas the NTP protocol version (approximately 2.3 × 10\ :sup:`-10` second), whereas the NTP protocol version
4 uses fractions of 2\ :sup:`64` seconds (5.4 x 10\ :sup:`-20` second). 4 uses fractions of 2\ :sup:`64` seconds (5.4 × 10\ :sup:`-20` second).
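The two resolutions quoted above can be checked with a short calculation. This is only an illustrative sketch: it assumes the fractional part is a 32-bit (respectively 64-bit) binary fraction of a second, and the variable names are made up for the example.

```python
from fractions import Fraction

# Smallest step of a 32-bit and of a 64-bit binary fraction of a second.
ntp_resolution = Fraction(1, 2**32)
ntp4_resolution = Fraction(1, 2**64)

print(f"{float(ntp_resolution):.1e}")   # 2.3e-10
print(f"{float(ntp4_resolution):.1e}")  # 5.4e-20
```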
.. note:: .. note::
With a resolution of 1 microsecond (10\ :sup:`-6`), float timestamps lose With a resolution of 1 microsecond (10\ :sup:`-6`), float timestamps lose
@ -105,20 +105,27 @@ decimal.Decimal, is only used when requested explicitly.
Alternatives: Timestamp types Alternatives: Timestamp types
============================= =============================
To support timestamps with a nanosecond resolution, five types were considered: To support timestamps with a nanosecond resolution, the following types have
been considered:
* 128 bits float * 128 bits float
* decimal.Decimal * decimal.Decimal
* datetime.datetime * datetime.datetime
* datetime.timedelta * datetime.timedelta
* tuple of integers * tuple of integers
* timespec structure
Criteria: Criteria:
* Doing arithmetic on timestamps must be possible. * Doing arithmetic on timestamps must be possible
* Timestamps must be comparable. * Timestamps must be comparable
* The type must have a resolution of at least 1 nanosecond (without losing * An arbitrary resolution, or at least a resolution of 1 nanosecond without
precision) or an arbitrary resolution. losing precision
* Compatibility with the float type
It should be possible to coerce the new timestamp to float for backward
compatibility, even though programs should not receive the new type unless
they explicitly ask for it.
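As a rough illustration of the "compatible with float" criterion, the sketch below coerces a decimal.Decimal timestamp to float. The timestamp value is made up for the example; the point is that the coercion works, although the float cannot keep every nanosecond digit.

```python
from decimal import Decimal

# A hypothetical timestamp with nanosecond resolution, stored exactly.
timestamp = Decimal("1329483952.123456789")

# Coercion to float works, which keeps float-based code usable.
as_float = float(timestamp)

# The float is close to the exact value, but not equal to it.
error = abs(Decimal(as_float) - timestamp)
print(repr(as_float))
print(error != 0)  # True
```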
128 bits float 128 bits float
-------------- --------------
@ -142,76 +149,136 @@ with the Python license.
datetime.datetime datetime.datetime
----------------- -----------------
Except os.stat(), time.time() and time.clock_gettime(time.CLOCK_REALTIME), all Advantages:
time functions have an unspecified starting point and no timezone information,
and so cannot be converted to datetime.datetime.
* datetime.datetime is the natural choice for a timestamp because it is clear
that this type contains a timestamp, whereas int, float and Decimal are
raw numbers.
* datetime.datetime is an absolute timestamp and so is well defined.
* datetime.datetime gives direct access to the year, month, day, hours,
minutes and seconds.
* datetime.datetime has time-related methods, such as methods to format the
timestamp as a string (e.g. datetime.datetime.strftime).
Drawbacks:
* Except os.stat(), time.time() and time.clock_gettime(time.CLOCK_REALTIME),
all time functions have an unspecified starting point and no timezone
information, and so cannot be converted to datetime.datetime.
* datetime.datetime has issues with timezones. For example, a datetime object
without a timezone and a datetime with a timezone cannot be compared.
* datetime.datetime has ordering issues with daylight saving time (DST) in the
duplicate hour of switching from DST to normal time.
* datetime.datetime is not as well integrated as Epoch timestamps: there is
no datetime.datetime.totimestamp() function. Most functions expecting
timestamps don't support datetime.datetime. For example, os.utime() expects a
tuple of Epoch timestamps.
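The timezone drawback listed above can be reproduced in a few lines. This is a minimal sketch using an arbitrary date; ordering a naive datetime against an aware one raises TypeError.

```python
from datetime import datetime, timezone

naive = datetime(2012, 2, 17, 14, 5, 52)
aware = datetime(2012, 2, 17, 14, 5, 52, tzinfo=timezone.utc)

try:
    naive < aware
except TypeError:
    # Naive and aware datetimes cannot be ordered against each other.
    comparison_failed = True
else:
    comparison_failed = False

print(comparison_failed)  # True
```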
datetime.datetime has been rejected because it cannot be used for functions
using an unspecified starting point like os.times() or time.clock().
.. note::
datetime.datetime only supports microsecond resolution, but can be enhanced datetime.datetime only supports microsecond resolution, but can be enhanced
to support nanosecond. to support nanosecond.
datetime.datetime has issues with timezone. For example, a datetime object
without timezone and a datetime with a timezone cannot be compared.
datetime.datetime has ordering issues with daylight saving time (DST) in the
duplicate hour of switching from DST to normal time.
datetime.datetime is not as well integrated as Epoch timestamps: there is no
datetime.datetime.totimestamp() function. Most functions expecting timestamps
don't support datetime.datetime. For example, os.utime() expects a tuple of Epoch
timestamps.
datetime.timedelta datetime.timedelta
------------------ ------------------
As datetime.datetime, datetime.timedelta only supports microsecond resolution, datetime.timedelta is the natural choice for a relative timestamp because it is
but can be enhanced to support nanosecond. clear that this type contains a timestamp, whereas int, float and Decimal are
raw numbers. It can be used with datetime.datetime to get an absolute
timestamp.
datetime.timedelta is not as well integrated as Epoch timestamps: some datetime.timedelta has been rejected because it is not "compatible" with float,
functions don't accept this type as input. Converting a timedelta object to a whereas Decimal can be converted to float, and has a fixed resolution. One new
float (number of seconds) requires calling an explicit method, standard timestamp type is enough, and Decimal is preferred over
timedelta.total_seconds(). Supporting timedelta would require changing every datetime.timedelta.
function getting timestamps, whereas all functions supporting float already
accept Decimal because Decimal can be cast to float. .. note::
datetime.timedelta only supports microsecond resolution, but can be enhanced
to support nanosecond.
.. _tuple-integers: .. _tuple-integers:
Tuple of integers Tuple of integers
----------------- -----------------
Creating a tuple of integers is simple and fast, but arithmetic operations To expose C functions in Python, a tuple of integers is the natural choice to
cannot be done directly on tuples. For example, (2, 1) - (2, 0) fails with a store a timestamp because the C language uses structures with integer fields
TypeError. (e.g. timeval and timespec structures). Using only integers avoids the loss of
precision (Python supports integers of arbitrary length). Creating and parsing a
tuple of integers is simple and fast.
An integer fraction can be used to store any number without loss of precision Depending on the exact format of the tuple, the precision can be arbitrary or
with any resolution: (numerator: int, denominator: int). The timestamp value fixed. The precision can be chosen so that the loss of precision is smaller than
can be computed with a simple division: numerator / denominator. an arbitrary limit like one nanosecond.
For the C implementation, a variant can be used to avoid integer overflow Different formats have been proposed:
because C types have a fixed size: (intpart: int, numerator: int, denominator:
int), value = intpart + numerator / denominator. Still to avoid integer
overflow in C types, numerator can be bigger than denominator while intpart can
be zero.
Other formats have been proposed: * A: (numerator, denominator)
* A: (sec, nsec): value = sec + nsec * 10\ :sup:`-9` * value = numerator / denominator
* B: (intpart, floatpart, exponent): value = intpart + floatpart * 10\ :sup:`exponent` * resolution = 1 / denominator
* C: (intpart, floatpart, base, exponent): value = intpart + floatpart * base\ :sup:`exponent` * the numerator is a signed integer and can be bigger than the denominator
* denominator > 0
The format A only supports nanosecond resolution. Formats A and B lose * B: (seconds, numerator, denominator)
precision if the clock frequency cannot be written as a power of 10: if the
clock frequency is not coprime with 2 and 5.
For some clocks, like ``QueryPerformanceCounter()`` on Windows, the frequency * value = seconds + numerator / denominator
is only known at runtime. The base and exponent have to be computed. If * resolution = 1 / denominator
computing the base and the exponent is too expensive (or not possible, e.g. if * seconds is a signed integer
the frequency is a prime number), exponent=1 can be used. The format (C) is * 0 <= numerator < denominator
just a fraction if exponent=1. * denominator > 0
The only advantage of these formats is a small optimization if the base is 2 * C: (intpart, floatpart, base, exponent)
for float or if the base 10 for Decimal. In other cases, frequency = base\
:sup:`exponent` must be computed again to convert a timestamp as float or * value = intpart + floatpart × base\ :sup:`exponent`
Decimal. Storing directly the frequency in the denominator is simpler. * resolution = base\ :sup:`exponent`
* intpart is a signed integer
* 0 <= floatpart < base\ :sup:`-exponent`
* base > 0
* exponent is a signed integer and should be negative
* D: (intpart, floatpart, exponent)
* value = intpart + floatpart × 10\ :sup:`exponent`
* resolution = 10\ :sup:`exponent`
* intpart is a signed integer
* 0 <= floatpart < 10\ :sup:`-exponent`
* exponent is a signed integer and should be negative
* E: (sec, nsec)
* value = sec + nsec × 10\ :sup:`-9`
* resolution = 10\ :sup:`-9` (nanosecond)
* sec is a signed integer
* 0 <= nsec < 10\ :sup:`9`
All formats support an arbitrary resolution, except the format (E).
The format (D) may lose precision if the clock frequency is arbitrary and
cannot be expressed as a power of 10. The format (C) has a similar
issue, but in such a case, it is possible to use base=frequency and exponent=-1.
The formats (D) and (E) allow optimization for conversion to float if the base
is 2 and to decimal.Decimal if the base is 10.
The format (A) supports arbitrary precision, is simple (only two fields), only
requires a simple division to get the floating point value, and is already used
by float.as_integer_ratio().
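To make the formats easier to compare, the sketch below converts each of them to an exact fractions.Fraction using the value formulas listed above. The helper names are hypothetical and are not part of any proposed API. Format (D) is omitted because it is format (C) with base 10.

```python
from fractions import Fraction

def from_format_a(numerator, denominator):
    # A: value = numerator / denominator
    return Fraction(numerator, denominator)

def from_format_b(seconds, numerator, denominator):
    # B: value = seconds + numerator / denominator
    return seconds + Fraction(numerator, denominator)

def from_format_c(intpart, floatpart, base, exponent):
    # C: value = intpart + floatpart * base ** exponent (negative exponent)
    return intpart + floatpart * Fraction(base) ** exponent

def from_format_e(sec, nsec):
    # E: value = sec + nsec * 10 ** -9
    return sec + Fraction(nsec, 10**9)

# 1.5 seconds expressed in each format.
values = [
    from_format_a(3, 2),
    from_format_b(1, 1, 2),
    from_format_c(1, 5, 10, -1),
    from_format_e(1, 500_000_000),
]
print(all(value == Fraction(3, 2) for value in values))  # True

# float.as_integer_ratio() already returns a format (A) tuple.
print((1.5).as_integer_ratio())  # (3, 2)
```

Every format yields the same exact value here; the formats differ only in which resolutions they can represent and in how many fields must be carried around.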
To simplify the implementation (especially if implemented in C to avoid integer
overflow), it may be possible to accept a numerator bigger than the denominator
(e.g. floatpart bigger than its upper bound for the format (C)), and
normalize the tuple later.
Tuples of integers have been rejected because they don't support arithmetic
operations.
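The lack of arithmetic support is easy to demonstrate. The sketch below uses two made-up (sec, nsec) tuples: subtraction raises TypeError, and addition is defined but concatenates the tuples rather than adding the fields.

```python
a = (2, 1)
b = (2, 0)

try:
    a - b
except TypeError:
    subtraction_supported = False
else:
    subtraction_supported = True

print(subtraction_supported)  # False

# "+" is defined, but it concatenates instead of adding field by field.
print(a + b)  # (2, 1, 2, 0)

# Comparison does work, because tuples compare lexicographically.
print(a > b)  # True
```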
.. note::
On Windows, the ``QueryPerformanceCounter()`` clock uses the frequency of
the processor which is an arbitrary number and can be read using
``QueryPerformanceFrequency()``.
timespec structure timespec structure
------------------ ------------------