PEP 410: Rephrase datetime.datetime and tuple of integers sections; add

"compatible with float" in criteria to choose a timestamp type
Victor Stinner 2012-02-17 14:05:52 +01:00
parent 2aa4edc29f
commit 6e3f3791ef
1 changed files with 122 additions and 55 deletions


@ -45,8 +45,8 @@ mailbox with the mailbox module, etc.
An arbitrary resolution is preferred over a fixed resolution (like nanosecond)
to not have to change the API when a better resolution is required. For
example, the NTP protocol uses fractions of 2\ :sup:`-32` seconds
(approximately 2.3 x 10\ :sup:`-10` second), whereas the NTP protocol version
4 uses fractions of 2\ :sup:`-64` seconds (5.4 x 10\ :sup:`-20` second).
(approximately 2.3 × 10\ :sup:`-10` second), whereas the NTP protocol version
4 uses fractions of 2\ :sup:`-64` seconds (5.4 × 10\ :sup:`-20` second).
.. note::
With a resolution of 1 microsecond (10\ :sup:`-6`), float timestamps lose
@ -105,20 +105,27 @@ decimal.Decimal, is only used when requested explicitly.
Alternatives: Timestamp types
=============================
To support timestamps with a nanosecond resolution, five types were considered:
To support timestamps with a nanosecond resolution, the following types have
been considered:
* 128 bits float
* decimal.Decimal
* datetime.datetime
* datetime.timedelta
* tuple of integers
* timespec structure
Criteria:
* Doing arithmetic on timestamps must be possible.
* Timestamps must be comparable.
* The type must have a resolution of at least 1 nanosecond (without losing
precision) or an arbitrary resolution.
* Doing arithmetic on timestamps must be possible
* Timestamps must be comparable
* An arbitrary resolution, or at least a resolution of 1 nanosecond without
losing precision
* Compatibility with the float type
It should be possible to coerce the new timestamp type to float for backward
compatibility, even though programs should not receive the new type unless
they explicitly ask for it.
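The float-compatibility criterion can be illustrated with decimal.Decimal. The following is a minimal sketch and not part of the PEP text; the timestamp value is made up for illustration. It shows that a Decimal timestamp can be coerced to float, and that the coercion drops the nanosecond digits.

```python
from decimal import Decimal

# Hypothetical timestamp with nanosecond digits (illustrative value only).
ts = Decimal("1329483952.123456789")

# Coercion to float works, which is what backward compatibility requires.
as_float = float(ts)

# A 64-bit float cannot hold all 19 significant digits, so converting the
# float back to Decimal does not give the original value.
roundtrip = Decimal(repr(as_float))
print(roundtrip == ts)

# Decimal arithmetic and comparison keep the last nanosecond.
later = ts + Decimal("0.000000001")
print(later > ts, later - ts)
```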
128 bits float
--------------
@ -142,76 +149,136 @@ with the Python license.
datetime.datetime
-----------------
Except os.stat(), time.time() and time.clock_gettime(time.CLOCK_REALTIME), all
time functions have an unspecified starting point and no timezone information,
and so cannot be converted to datetime.datetime.
Advantages:
datetime.datetime only supports microsecond resolution, but can be enhanced
to support nanosecond.
* datetime.datetime is the natural choice for a timestamp because it is clear
that this type contains a timestamp, whereas int, float and Decimal are
raw numbers.
* datetime.datetime is an absolute timestamp and so is well defined.
* datetime.datetime gives direct access to the year, month, day, hours,
minutes and seconds.
* datetime.datetime has time-related methods, such as methods to format the
timestamp as a string (e.g. datetime.datetime.strftime).
datetime.datetime has issues with timezone. For example, a datetime object
without timezone and a datetime with a timezone cannot be compared.
Drawbacks:
datetime.datetime has ordering issues with daylight saving time (DST) in the
duplicate hour of switching from DST to normal time.
* Except os.stat(), time.time() and time.clock_gettime(time.CLOCK_REALTIME),
all time functions have an unspecified starting point and no timezone
information, and so cannot be converted to datetime.datetime.
* datetime.datetime has issues with timezone. For example, a datetime object
without timezone and a datetime with a timezone cannot be compared.
* datetime.datetime has ordering issues with daylight saving time (DST) in the
duplicate hour of switching from DST to normal time.
* datetime.datetime is not as well integrated as Epoch timestamps: there is
no datetime.datetime.totimestamp() function. Most functions expecting
timestamps don't support datetime.datetime. For example, os.utime() expects a
tuple of Epoch timestamps.
datetime.datetime is not as well integrated as Epoch timestamps: there is no
datetime.datetime.totimestamp() function. Most functions expecting timestamps
don't support datetime.datetime. For example, os.utime() expects a tuple of Epoch
timestamps.
datetime.datetime has been rejected because it cannot be used for functions
using an unspecified starting point like os.times() or time.clock().
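The timezone drawback can be reproduced with the standard library. This is a small sketch, not part of the PEP; the helper ``can_order`` and the sample date are hypothetical names and values chosen for illustration. It checks that ordering a naive datetime against an aware one raises TypeError, and that the type's resolution is one microsecond.

```python
from datetime import datetime, timedelta, timezone

# Same wall-clock fields, once without and once with a timezone.
naive = datetime(2012, 2, 17, 14, 5, 52)
aware = datetime(2012, 2, 17, 14, 5, 52, tzinfo=timezone.utc)


def can_order(a, b):
    """Return True if ``a < b`` can be evaluated, False if it raises TypeError."""
    try:
        a < b
    except TypeError:
        return False
    return True


print(can_order(naive, naive))  # two naive datetimes can be ordered
print(can_order(naive, aware))  # mixing naive and aware cannot

# The smallest representable difference is one microsecond.
print(datetime.resolution == timedelta(microseconds=1))
```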
.. note::
datetime.datetime only supports microsecond resolution, but can be enhanced
to support nanosecond.
datetime.timedelta
------------------
Like datetime.datetime, datetime.timedelta only supports microsecond resolution,
but can be enhanced to support nanosecond.
datetime.timedelta is the natural choice for a relative timestamp because it is
clear that this type contains a timestamp, whereas int, float and Decimal are
raw numbers. It can be used with datetime.datetime to get an absolute
timestamp.
datetime.timedelta is not as well integrated as Epoch timestamps: some
functions don't accept this type as input. Converting a timedelta object to a
float (number of seconds) requires calling an explicit method,
timedelta.total_seconds(). Supporting timedelta would require changing every
function that takes timestamps, whereas all functions supporting float already
accept Decimal because Decimal can be cast to float.
datetime.timedelta has been rejected because it has a fixed resolution and is
not "compatible" with float, whereas Decimal can be converted to float. One new
standard timestamp type is enough, and Decimal is preferred over
datetime.timedelta.
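The difference in float compatibility can be checked directly. This is a minimal sketch under the assumption that "compatible with float" means accepted by ``float()``; the sample durations are illustrative. The standard library method that converts a timedelta to seconds is ``timedelta.total_seconds()``.

```python
from datetime import timedelta
from decimal import Decimal

delta = timedelta(seconds=1, microseconds=500000)

# float() does not accept a timedelta: an explicit method call is needed.
try:
    float(delta)
    coercible = True
except TypeError:
    coercible = False
print(coercible)

# Explicit conversion to a number of seconds.
seconds = delta.total_seconds()
print(seconds)

# A Decimal is accepted by float() without any extra method call.
dec_seconds = float(Decimal("1.5"))
print(dec_seconds)
```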
.. note::
datetime.timedelta only supports microsecond resolution, but can be enhanced
to support nanosecond.
.. _tuple-integers:
Tuple of integers
-----------------
Creating a tuple of integers is simple and fast, but arithmetic operations
cannot be done directly on tuples. For example, (2, 1) - (2, 0) fails with a
TypeError.
To expose C functions in Python, a tuple of integers is the natural choice to
store a timestamp because the C language uses structures with integer fields
(e.g. timeval and timespec structures). Using only integers avoids the loss of
precision (Python supports integers of arbitrary length). Creating and parsing a
tuple of integers is simple and fast.
An integer fraction can be used to store any number without loss of precision
with any resolution: (numerator: int, denominator: int). The timestamp value
can be computed with a simple division: numerator / denominator.
Depending on the exact format of the tuple, the precision can be arbitrary or
fixed. The precision can be chosen so that the loss of precision is smaller than
an arbitrary limit such as one nanosecond.
For the C implementation, a variant can be used to avoid integer overflow
because C types have a fixed size: (intpart: int, numerator: int, denominator:
int), value = intpart + numerator / denominator. Also to avoid integer
overflow in C types, the numerator can be bigger than the denominator while
intpart is zero.
Different formats have been proposed:
Other formats have been proposed:
* A: (numerator, denominator)
* A: (sec, nsec): value = sec + nsec * 10\ :sup:`-9`
* B: (intpart, floatpart, exponent): value = intpart + floatpart * 10\ :sup:`exponent`
* C: (intpart, floatpart, base, exponent): value = intpart + floatpart * base\ :sup:`exponent`
* value = numerator / denominator
* resolution = 1 / denominator
* the numerator is a signed integer and can be bigger than the denominator
* denominator > 0
The format A only supports nanosecond resolution. Formats A and B lose
precision if the clock period cannot be written exactly in base 10, that is
if the clock frequency has a prime factor other than 2 and 5.
* B: (seconds, numerator, denominator)
For some clocks, like ``QueryPerformanceCounter()`` on Windows, the frequency
is only known at runtime. The base and exponent have to be computed. If
computing the base and the exponent is too expensive (or not possible, e.g. if
the frequency is a prime number), exponent=1 can be used. The format (C) is
just a fraction if exponent=1.
* value = seconds + numerator / denominator
* resolution = 1 / denominator
* seconds is a signed integer
* 0 <= numerator < denominator
* denominator > 0
The only advantage of these formats is a small optimization if the base is 2
for float or if the base is 10 for Decimal. In other cases, frequency = base\
:sup:`exponent` must be computed again to convert a timestamp to float or
Decimal. Storing the frequency directly in the denominator is simpler.
* C: (intpart, floatpart, base, exponent)
* value = intpart + floatpart × base\ :sup:`exponent`
* resolution = base\ :sup:`exponent`
* intpart is a signed integer
* 0 <= floatpart < base\ :sup:`-exponent`
* base > 0
* exponent is a signed integer and should be negative
* D: (intpart, floatpart, exponent)
* value = intpart + floatpart × 10\ :sup:`exponent`
* resolution = 10\ :sup:`exponent`
* intpart is a signed integer
* 0 <= floatpart < 10\ :sup:`-exponent`
* exponent is a signed integer and should be negative
* E: (sec, nsec)
* value = sec + nsec × 10\ :sup:`-9`
* resolution = 10\ :sup:`-9` (nanosecond)
* sec is a signed integer
* 0 <= nsec < 10\ :sup:`9`
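The value formulas of the five formats can be compared side by side. The sketch below is illustrative only: the function names ``value_a`` to ``value_e`` are hypothetical, and fractions.Fraction is used here merely to compute exact values. It assumes the convention stated above that the exponent is negative for formats (C) and (D).

```python
from fractions import Fraction


def value_a(numerator, denominator):
    # A: value = numerator / denominator
    return Fraction(numerator, denominator)


def value_b(seconds, numerator, denominator):
    # B: value = seconds + numerator / denominator
    return seconds + Fraction(numerator, denominator)


def value_c(intpart, floatpart, base, exponent):
    # C: value = intpart + floatpart * base**exponent (exponent is negative)
    return intpart + floatpart * Fraction(base) ** exponent


def value_d(intpart, floatpart, exponent):
    # D: same as C with the base fixed to 10
    return value_c(intpart, floatpart, 10, exponent)


def value_e(sec, nsec):
    # E: same as D with the exponent fixed to -9
    return value_d(sec, nsec, -9)


# The same duration, 1.5 seconds, written in each format.
values = [
    value_a(3, 2),
    value_b(1, 1, 2),
    value_c(1, 1, 2, -1),
    value_d(1, 5, -1),
    value_e(1, 500000000),
]
print(all(v == Fraction(3, 2) for v in values))
```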
All formats support an arbitrary resolution, except the format (E).
The format (D) may lose precision if the clock frequency is arbitrary and
cannot be expressed as a power of 10. The format (C) has a similar
issue, but in such a case it is possible to use base=frequency and exponent=-1.
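The loss of precision can be made concrete with a single clock tick. This is a sketch under stated assumptions: the frequency below is an illustrative value, not one taken from the PEP, and the variable names are hypothetical. Format (D) is tried with exponent=-9, format (C) with base=frequency and exponent=-1.

```python
from fractions import Fraction

frequency = 3579545  # illustrative clock frequency in Hz (assumption)
ticks = 1

# Exact duration of one tick, in seconds.
exact = Fraction(ticks, frequency)

# Format (D), exponent=-9: floatpart must be a whole number of nanoseconds,
# so the tick duration has to be rounded.
floatpart_d = round(exact * 10**9)
as_format_d = floatpart_d * Fraction(10) ** -9

# Format (C), base=frequency and exponent=-1: floatpart is the tick count.
as_format_c = ticks * Fraction(frequency) ** -1

print(as_format_d == exact)  # rounding changed the value
print(as_format_c == exact)  # the value is stored exactly
```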
The formats (C) and (D) allow optimization for conversion to float if the base
is 2 and to decimal.Decimal if the base is 10.
The format (A) supports arbitrary precision, is simple (only two fields), only
requires a simple division to get the floating point value, and is already used
by float.as_integer_ratio().
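As a short illustration of format (A), the sketch below uses float.as_integer_ratio() and a plain division. The timestamp tuple is a made-up value, and fractions.Fraction appears only to check the result exactly.

```python
from fractions import Fraction

# float.as_integer_ratio() already returns a (numerator, denominator) pair.
numerator, denominator = (0.25).as_integer_ratio()
print(numerator, denominator)

# Hypothetical format (A) timestamp with a nanosecond denominator.
ts = (1329483952123456789, 10**9)

# Exact value, kept as a fraction.
value = Fraction(*ts)

# Floating point value: a single division (precision is lost here).
as_float = ts[0] / ts[1]
print(as_float)
```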
To simplify the implementation (especially if implemented in C to avoid integer
overflow), it may be possible to accept a numerator bigger than the denominator
(e.g. a floatpart exceeding its normalized upper bound for the format (C)), and
normalize the tuple later.
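Deferred normalization could look like the following for format (B). This is only a sketch: ``normalize`` is a hypothetical helper, not an API proposed by the PEP.

```python
def normalize(seconds, numerator, denominator):
    """Return an equivalent tuple with 0 <= numerator < denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    # divmod() floors, so a negative numerator borrows from the seconds.
    carry, numerator = divmod(numerator, denominator)
    return (seconds + carry, numerator, denominator)


print(normalize(0, 7, 2))   # (3, 1, 2)
print(normalize(5, -1, 4))  # (4, 3, 4)
```

Using divmod() keeps the numerator non-negative even for negative inputs, which matches the constraint 0 <= numerator < denominator.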
Tuples of integers have been rejected because they don't support arithmetic
operations.
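The missing arithmetic is easy to demonstrate. In this sketch the tuples are read as hypothetical (sec, nsec) timestamps, and ``to_fraction`` is an illustrative helper showing the explicit conversion that callers would need before subtracting.

```python
from fractions import Fraction

a = (2, 1)  # hypothetical (sec, nsec) timestamps
b = (2, 0)

# Tuples do not implement subtraction.
try:
    a - b
    supported = True
except TypeError:
    supported = False
print(supported)


def to_fraction(ts):
    """Convert a (sec, nsec) tuple to an exact number of seconds."""
    sec, nsec = ts
    return sec + Fraction(nsec, 10**9)


# Arithmetic only works after an explicit conversion.
diff = to_fraction(a) - to_fraction(b)
print(diff)
```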
.. note::
On Windows, the ``QueryPerformanceCounter()`` clock uses the frequency of
the processor which is an arbitrary number and can be read using
``QueryPerformanceFrequency()``.
timespec structure
------------------