PEP: 410
Title: Use decimal.Decimal type for timestamps
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@haypocalc.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-February-2012
Python-Version: 3.3


Abstract
========

Python 3.3 introduced functions supporting nanosecond resolution. Python 3.3
only supports int or float to store timestamps, but these types cannot be used
to store a timestamp with nanosecond resolution. This PEP proposes
decimal.Decimal as a new type for timestamps.


Motivation
==========

Python 2.3 introduced float timestamps to support subsecond resolutions.
os.stat() has used float timestamps by default since Python 2.5.

Python 3.3 introduced functions supporting nanosecond resolutions:

* os module: stat(), utimensat(), futimens()
* time module: clock_gettime(), clock_getres(), wallclock()

The Python float type uses the binary64 format of the IEEE 754 standard. With a
resolution of 1 nanosecond (10\ :sup:`-9`), float timestamps lose precision for values
bigger than 2\ :sup:`24` seconds (194 days: 1970-07-14 for an Epoch timestamp).

.. note::
   With a resolution of 1 microsecond (10\ :sup:`-6`), float timestamps lose precision
   for values bigger than 2\ :sup:`33` seconds (272 years: 2242-03-16 for an Epoch
   timestamp).

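
The dates quoted above can be checked with a short sketch. It uses only the
standard library; the variable names are illustrative and are not part of this
proposal.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Dates at which an Epoch timestamp reaches 2**24 and 2**33 seconds.
ns_limit = EPOCH + timedelta(seconds=2**24)
us_limit = EPOCH + timedelta(seconds=2**33)
print(ns_limit.date(), us_limit.date())

# At 2**24 seconds, adjacent binary64 values are 2**-28 s (about 3.7 ns)
# apart, so adding one nanosecond is absorbed by rounding.
big = float(2**24)
one_ns_absorbed = (big + 1e-9 == big)
print(one_ns_absorbed)
```

The last comparison shows the practical effect: two events one nanosecond
apart can end up with the same float timestamp.
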
Specification
=============

Add decimal.Decimal as a new type for timestamps. Decimal supports any
timestamp resolution, supports arithmetic operations and is comparable.
Functions taking float inputs accept Decimal directly: the Decimal is converted
implicitly to float, even if the conversion may lose precision.

Add an optional *timestamp* argument to:

* os module: fstat(), fstatat(), lstat() and stat()
* time module: clock(), clock_gettime(), clock_getres(), time() and
  wallclock()

The *timestamp* argument is a type; three types are supported:

* int
* float
* decimal.Decimal

The float type is still used by default for backward compatibility.

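
A minimal sketch of how the *timestamp* argument could behave. The helper
``convert_ns()`` and the wrapper ``now()`` are hypothetical names, not part of
this proposal. As a stand-in for the clock reading, the sketch assumes
``time.time_ns()``, which only exists in later Python versions (3.7 and newer).

```python
import time
from decimal import Decimal

NS_PER_SEC = 10**9

def convert_ns(nanoseconds, timestamp=float):
    """Convert an integer nanosecond count to the requested timestamp type.

    Hypothetical helper mimicking the proposed ``timestamp=type`` argument.
    """
    if timestamp is int:
        return nanoseconds // NS_PER_SEC          # drops the fractional part
    if timestamp is float:
        return nanoseconds / NS_PER_SEC           # may lose precision
    if timestamp is Decimal:
        # Exact as long as the result fits the decimal context precision
        # (28 digits by default).
        return Decimal(nanoseconds) / NS_PER_SEC
    raise ValueError("unsupported timestamp type: %r" % (timestamp,))

def now(timestamp=float):
    """Hypothetical equivalent of the proposed time.time(timestamp=type)."""
    return convert_ns(time.time_ns(), timestamp)

sample = 1328054400123456789   # arbitrary reading, in nanoseconds
print(convert_ns(sample, int))
print(convert_ns(sample, float))
print(convert_ns(sample, Decimal))
```

Only the Decimal result keeps all nine fractional digits of the sample reading.
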
Support decimal.Decimal (without implicit conversion to float, to avoid loss of
precision) in functions having timestamp arguments:

* datetime.datetime.fromtimestamp()
* time.gmtime(), time.localtime()
* os.utimensat(), os.futimens()

os.stat_float_times() is deprecated: use the timestamp=int argument instead.

.. note::
   The decimal module is implemented in Python and is slow, but there is a C
   reimplementation which is almost ready for inclusion in CPython.


Backwards Compatibility
=======================

The default timestamp type is unchanged, so there is no impact on backward
compatibility or on performance. The new timestamp type,
decimal.Decimal, is only used when requested explicitly.


Alternatives: Timestamp types
=============================

To support timestamps with a nanosecond resolution, five types were considered:

* 128 bits float
* decimal.Decimal
* datetime.datetime
* datetime.timedelta
* tuple of integers

Criteria:

* Doing arithmetic on timestamps must be possible.
* Timestamps must be comparable.
* The type must have a resolution of at least 1 nanosecond (without losing
  precision) or an arbitrary resolution.

128 bits float
--------------

Add a new IEEE 754-2008 quad-precision float type. The IEEE 754-2008 quad
precision float has 1 sign bit, 15 bits of exponent and 112 bits of mantissa.

128 bits float is supported by the GCC (4.3), Clang and ICC compilers. Python must
be portable and so cannot rely on a type only available on some platforms. For
example, Visual C++ 2008 doesn't support 128 bits float, even though it is used
to build the official Windows executables. Another example: GCC 4.3 does not
support __float128 in 32-bit mode on x86 (but GCC 4.4 does).

Intel CPUs have an FPU (x87) supporting 80-bit floats, but not using SSE
instructions. Other CPU vendors don't support this float size.

There is also a license issue: GCC uses the MPFR library for 128 bits float,
a library distributed under the GNU LGPL license. This license is not compatible
with the Python license.

datetime.datetime
-----------------

datetime.datetime only supports microsecond resolution, but can be enhanced
to support nanosecond resolution.

datetime.datetime has issues:

- there is no easy way to convert it into "seconds since the epoch"
- any broken-down time has issues of time stamp ordering in the
  duplicate hour of switching from DST to normal time
- time zone support is flaky-to-nonexistent in the datetime module

datetime.datetime is also more complex than a simple number.

datetime.timedelta
------------------

Like datetime.datetime, datetime.timedelta only supports microsecond resolution,
but can be enhanced to support nanosecond resolution.

Even though datetime.timedelta meets most criteria, it was not selected because it
is more complex than a simple number and is not accepted by functions taking
timestamp inputs.


.. _tuple-integers:

Tuple of integers
-----------------

Creating a tuple of integers is simple and fast, but arithmetic operations
cannot be done directly on tuples. For example, (2, 1) - (2, 0) fails with a
TypeError.

An integer fraction can be used to store any number without loss of precision
with any resolution: (numerator: int, denominator: int). The timestamp value
can be computed with a simple division: numerator / denominator.

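
Both points can be illustrated with a short sketch. The sample reading is
arbitrary, and fractions.Fraction is used here only to compare values exactly;
it is not one of the types considered by this PEP.

```python
from decimal import Decimal
from fractions import Fraction

# Subtracting two tuples is not defined.
first, second = (2, 1), (2, 0)
try:
    first - second
except TypeError as exc:
    tuple_error = exc

# A (numerator, denominator) pair holds the value exactly.
numerator, denominator = 1328054400123456789, 10**9
exact = Fraction(numerator, denominator)

# The type used for the division decides how much precision survives.
as_float = numerator / denominator
as_decimal = Decimal(numerator) / Decimal(denominator)

print(Fraction(as_float) == exact)     # False: the float is rounded
print(Fraction(as_decimal) == exact)   # True: the Decimal is exact
```
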
For the C implementation, a variant can be used to avoid integer overflow
because C types have a fixed size: (intpart: int, numerator: int, denominator:
int), value = intpart + numerator / denominator. Also to avoid integer
overflow in C types, the numerator can be bigger than the denominator while
intpart can be zero.

Other formats have been proposed:

* A: (sec, nsec): value = sec + nsec * 10\ :sup:`-9`
* B: (intpart, floatpart, exponent): value = intpart + floatpart * 10\ :sup:`exponent`
* C: (intpart, floatpart, base, exponent): value = intpart + floatpart * base\ :sup:`exponent`

The format A only supports nanosecond resolution. Formats A and B lose
precision if the clock period is not an exact decimal fraction of a second,
that is, if the clock frequency has a prime factor other than 2 or 5.

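
The precision loss of format A can be sketched as follows. The two clock
frequencies are hypothetical values chosen for illustration, and the helper
names are not part of this proposal.

```python
from fractions import Fraction

def to_format_a(ticks, frequency):
    """Convert a tick count to format A, (sec, nsec), truncating to 1 ns."""
    sec, remainder = divmod(ticks, frequency)
    nsec = remainder * 10**9 // frequency
    return sec, nsec

def format_a_value(sec, nsec):
    """Exact value represented by a format A pair."""
    return sec + Fraction(nsec, 10**9)

# Hypothetical 10 MHz clock: 10**7 = 2**7 * 5**7, so format A is exact.
exact_case = format_a_value(*to_format_a(25, 10**7)) == Fraction(25, 10**7)

# Hypothetical 3 Hz clock: 1/3 second has no finite decimal expansion.
lossy_case = format_a_value(*to_format_a(1, 3)) == Fraction(1, 3)

print(exact_case, lossy_case)
```
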
For some clocks, like ``QueryPerformanceCounter()`` on Windows, the frequency
is only known at runtime. The base and exponent have to be computed. If
computing the base and the exponent is too expensive (or not possible, e.g. if
the frequency is a prime number), exponent=1 can be used. The format (C) is
just a fraction if exponent=1.

The only advantage of these formats is a small optimization if the base is 2
for float or if the base is 10 for Decimal. In other cases,
frequency = base\ :sup:`exponent` must be computed again to convert a timestamp
to float or Decimal. Storing the frequency directly in the denominator is simpler.


Alternatives: API design
========================

Add a global flag to change the timestamp type
----------------------------------------------

A global flag like os.stat_decimal_times(), similar to os.stat_float_times(),
can be added to set the timestamp type globally.

A global flag may cause issues with libraries and applications expecting float
instead of Decimal. A float cannot be converted implicitly to Decimal. The
os.stat_float_times() case is different because an int can be converted
implicitly to float.

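
The difference between the two cases can be sketched with a function written
for float timestamps. ``elapsed_since()`` is a hypothetical example of such
library code.

```python
from decimal import Decimal

def elapsed_since(start, now):
    """Library-style code written with float timestamps in mind."""
    return now - start + 0.5   # mixes a float constant into the result

float_result = elapsed_since(10.0, 12.0)
int_result = elapsed_since(10, 12)        # int is promoted to float

try:
    elapsed_since(Decimal("10"), Decimal("12"))
except TypeError as exc:
    decimal_error = exc   # Decimal and float do not mix in arithmetic

print(float_result, int_result, type(decimal_error).__name__)
```
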
Add a protocol to create a timestamp
------------------------------------

Instead of hardcoding how timestamps are created, a new protocol can be added
to create a timestamp from a fraction. time.time(timestamp=type) would call
type.__from_fraction__(numerator, denominator) to create a timestamp object of
the specified type.

If the type doesn't support the protocol, a fallback can be used:
type(numerator) / type(denominator).

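
A minimal sketch of this protocol with its fallback. The dispatcher
``make_timestamp()`` and the ``Milliseconds`` type are hypothetical and only
illustrate how a function such as time.time() might use the protocol.

```python
from decimal import Decimal

def make_timestamp(numerator, denominator, timestamp=float):
    """Build a timestamp of the requested type from a fraction."""
    from_fraction = getattr(timestamp, "__from_fraction__", None)
    if from_fraction is not None:
        return from_fraction(numerator, denominator)
    # Fallback: type(numerator) / type(denominator).
    return timestamp(numerator) / timestamp(denominator)

class Milliseconds(int):
    """Hypothetical user-defined type implementing the protocol."""

    @classmethod
    def __from_fraction__(cls, numerator, denominator):
        return cls(numerator * 1000 // denominator)

print(make_timestamp(3, 2))                  # float, through the fallback
print(make_timestamp(1, 8, Decimal))         # Decimal, through the fallback
print(make_timestamp(3, 2, Milliseconds))    # through __from_fraction__
```
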
A variant is to use a "converter" callback to create a timestamp. Example
creating a float timestamp::

    def timestamp_to_float(numerator, denominator):
        return float(numerator) / float(denominator)

Common converters can be provided by time, datetime and other modules, or maybe
a specific "hires" module. Users can define their own converters.

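
As a sketch of what other converters might look like, here are a Decimal
converter and a user-defined one. ``read_clock()`` is a hypothetical stand-in
for a function such as time.time() accepting a converter, and it passes a
fixed sample reading.

```python
from decimal import Decimal

def timestamp_to_decimal(numerator, denominator):
    """Converter returning a Decimal, rounded to the context precision."""
    return Decimal(numerator) / Decimal(denominator)

def timestamp_to_ns(numerator, denominator):
    """User-defined converter: integer number of nanoseconds, truncated."""
    return numerator * 10**9 // denominator

def read_clock(converter, ticks=1328054400123456789, frequency=10**9):
    """Hypothetical clock reader passing a fixed reading to the converter."""
    return converter(ticks, frequency)

print(read_clock(timestamp_to_decimal))
print(read_clock(timestamp_to_ns))
```
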
Such a protocol has a limitation: the structure of data passed to the protocol or
the callback has to be decided once and cannot be changed later. For example,
adding a timezone or the absolute start of the timestamp (e.g. Epoch or
unspecified start for monotonic clocks) would break the API.

The protocol proposal was considered excessive given the requirements, but
the specific syntax proposed (time.time(timestamp=type)) allows it to be
introduced later if compelling use cases are discovered.

.. note::
   Other formats can also be used instead of a fraction: see the
   `Tuple of integers <tuple-integers>`_ section.

Add new fields to os.stat
-------------------------

It was proposed to add 3 fields to the os.stat() structure to get the
nanoseconds of timestamps.

Populating the extra fields is time consuming. If the new fields are available by
default, any call to os.stat() would be slower. If the new fields are optional, the
stat structure would have a variable number of fields, which can be surprising.

In any case, this approach does not help with the time module.

Add a boolean argument
----------------------

Because we only need one new type, decimal.Decimal, a simple boolean flag
can be added. For example, time.time(decimal=True) or time.time(hires=True).

The boolean argument API was rejected because it is not "pythonic". Changing
the return type with a parameter value is preferred over a boolean parameter (a
flag).

Add new functions
-----------------

Add new functions for each type, for example:

* time.clock_decimal()
* time.time_decimal()
* os.stat_decimal()
* etc.

Adding a new function for each function creating timestamps duplicates a lot
of code.


Links
=====

* `Issue #11457: os.stat(): add new fields to get timestamps as Decimal objects with nanosecond resolution <http://bugs.python.org/issue11457>`_
* `Issue #13882: Add format argument for time.time(), time.clock(), ... to get a timestamp as a Decimal object <http://bugs.python.org/issue13882>`_
* `[Python-Dev] Store timestamps as decimal.Decimal objects <http://mail.python.org/pipermail/python-dev/2012-January/116025.html>`_


Copyright
=========

This document has been placed in the public domain.