Add PEP 410: Use decimal.Decimal type for timestamps
parent 634ee68583
commit 22e5bccdac

PEP: 410
Title: Use decimal.Decimal type for timestamps
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@haypocalc.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-February-2012
Python-Version: 3.3


Abstract
========

Python 3.3 introduced functions supporting nanosecond resolution. However,
Python 3.3 only supports int and float to store timestamps, and neither type
can store a timestamp with nanosecond resolution without losing precision.
This PEP proposes adding decimal.Decimal as a timestamp type.


Motivation
==========

Python 2.3 introduced float timestamps to support subsecond resolutions.
os.stat() uses float timestamps by default since Python 2.5.

Python 3.3 introduced functions supporting nanosecond resolutions:

* os module: stat(), utimensat(), futimens()
* time module: clock_gettime(), clock_getres(), wallclock()

The Python float type uses binary64 format of the IEEE 754 standard. With a
resolution of 1 nanosecond (10^-9), float timestamps lose precision for values
bigger than 2^24 seconds (194 days: 1970-07-14 for an Epoch timestamp).

.. note::
   With a resolution of 1 microsecond (10^-6), float timestamps lose precision
   for values bigger than 2^33 seconds (272 years: 2242-03-16 for an Epoch
   timestamp).
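
The precision loss described above is easy to reproduce. The following sketch
is an illustration only, not part of the proposal: it stores the same
timestamp, one nanosecond past an arbitrarily chosen Epoch value from early
2012, once as a float and once as a Decimal.

```python
from decimal import Decimal

# Arbitrary Epoch timestamp (early 2012) plus one nanosecond.
text = "1328000000.000000001"

as_float = float(text)
as_decimal = Decimal(text)

# The float silently drops the nanosecond...
float_lost_it = (as_float == 1328000000.0)
# ...and adding one nanosecond to the float changes nothing either.
add_is_noop = (as_float + 1e-9 == as_float)
# The Decimal keeps every digit.
remainder = as_decimal - Decimal(1328000000)

print(float_lost_it, add_is_noop, remainder)
```

Both float checks are true for this value, while the Decimal remainder is
exactly one nanosecond.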


Specification
=============

Add decimal.Decimal as a new type for timestamps. Add an optional *timestamp*
argument to:

* os module: fstat(), fstatat(), lstat() and stat()
* time module: clock(), clock_gettime(), clock_getres(), time() and
  wallclock()

The *timestamp* argument is a type; three types are supported:

* int
* float
* decimal.Decimal

The float type is still used by default for backward compatibility.
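
How such a *timestamp* argument could select the result type can be sketched
in pure Python. This is a minimal sketch under stated assumptions, not the
proposed implementation: ``from_nanoseconds()`` and ``current_time()`` are
hypothetical helper names, and ``time.time_ns()`` (Python 3.7 and newer)
merely stands in for a clock with nanosecond resolution.

```python
import time
from decimal import Decimal


def from_nanoseconds(ns, timestamp=float):
    """Convert an integer count of nanoseconds to the requested type.

    Hypothetical helper, for illustration only.
    """
    if timestamp is int:
        return ns // 10**9              # whole seconds, fraction dropped
    if timestamp is float:
        return ns / 10**9               # default; may lose precision
    if timestamp is Decimal:
        # Shift the decimal point by 9 digits; exact as long as the value
        # fits in the current decimal context precision (28 digits by default).
        return Decimal(ns).scaleb(-9)
    raise ValueError("unsupported timestamp type: %r" % (timestamp,))


def current_time(timestamp=float):
    """Hypothetical stand-in for time.time(timestamp=...)."""
    return from_nanoseconds(time.time_ns(), timestamp)
```

With this sketch, ``current_time()`` keeps returning a float, and
``current_time(timestamp=Decimal)`` returns a Decimal only when asked to.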

Support decimal.Decimal (without implicit conversion to float, to avoid loss
of precision) in functions having timestamp arguments:

* datetime.datetime.fromtimestamp()
* time.gmtime(), time.localtime()
* os.utimensat(), os.futimens()


Backwards Compatibility
=======================

The default timestamp type is unchanged, so there is no impact on backward
compatibility or on performance. The new timestamp type, decimal.Decimal, is
only used when requested explicitly.


Alternatives: Timestamp types
=============================

To support timestamps with a nanosecond resolution, five types were considered:

* 128 bits float
* decimal.Decimal
* datetime.datetime
* datetime.timedelta
* tuple of integers

Criteria:

* Doing arithmetic on timestamps must be possible.
* Timestamps must be comparable.
* The type must have a resolution of at least 1 nanosecond (without losing
  precision) or an arbitrary resolution.

128 bits float
--------------

Add a new IEEE 754-2008 quad-precision float type. The IEEE 754-2008 quad
precision float has 1 sign bit, 15 bits of exponent and 112 bits of mantissa.

128 bits float is supported by GCC (4.3), Clang and ICC. The problem is that
Visual C++ 2008 doesn't support it. Python must be portable and so cannot rely
on a type only available on some platforms. Another example: GCC 4.3 does not
support __float128 in 32-bit mode on x86 (but GCC 4.4 does).

Intel CPUs have an FPU supporting 80-bit floats, but not through SSE
instructions. Other CPU vendors don't support this float size.

There is also a license issue: GCC uses the MPFR library which is distributed
under the GNU LGPL license. This license is incompatible with the Python
license.

datetime.datetime
-----------------

datetime.datetime only supports microsecond resolution, but can be enhanced
to support nanosecond resolution.

datetime.datetime has issues:

- there is no easy way to convert it into "seconds since the epoch"
- any broken-down time has issues of time stamp ordering in the
  duplicate hour of switching from DST to normal time
- time zone support is flaky-to-nonexistent in the datetime module

datetime.timedelta
------------------

Like datetime.datetime, datetime.timedelta only supports microsecond
resolution, but can be enhanced to support nanosecond resolution.

Even though datetime.timedelta meets most of the criteria, it was not selected
because it is more complex than a simple number and is not accepted by
functions getting timestamp inputs.


decimal.Decimal
---------------

Decimal has an arbitrary precision, supports arithmetic operations and is
comparable. Functions getting float inputs support Decimal directly: the
Decimal is converted implicitly to float, even if the conversion may lose
precision.
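
These properties can be checked with a short example. It is only an
illustration: the timestamp values are arbitrary, and ``math.modf()`` is used
as one example of a function that expects a float and accepts a Decimal
through implicit conversion.

```python
import math
from decimal import Decimal

t1 = Decimal("1328000000.000000001")
t2 = Decimal("1328000000.000000002")

elapsed = t2 - t1      # arithmetic stays exact
is_later = t2 > t1     # timestamps can be ordered like any number

# math.modf() expects a float: the Decimal is converted implicitly and the
# nanoseconds are lost in the conversion.
fraction, seconds = math.modf(t2)

print(elapsed, is_later, fraction, seconds)
```

The subtraction yields exactly one nanosecond, whereas the fractional part
returned by ``math.modf()`` is zero.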

Using Decimal by default would cause a bootstrap issue because the module is
implemented in Python; however, using Decimal by default was not considered.

The decimal module is implemented in Python and is slow, but there is a C
reimplementation which is almost ready for inclusion in CPython.

Tuple of integers
-----------------

Creating a tuple of integers is simple and fast, but arithmetic operations
cannot be done directly on tuples. For example, (2, 1) - (2, 0) fails with a
TypeError.
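
The failure, and the hand-written arithmetic a tuple would require instead,
can be sketched as follows. The ``(sec, nsec)`` layout and the ``subtract()``
helper are assumptions made for this illustration; the text above does not
prescribe them.

```python
a = (2, 1)
b = (2, 0)

try:
    a - b
except TypeError as exc:
    error = str(exc)    # tuples do not define subtraction


def subtract(x, y):
    """Subtract two (sec, nsec) tuples; hypothetical helper."""
    sec = x[0] - y[0]
    nsec = x[1] - y[1]
    if nsec < 0:
        # Borrow one second when the nanosecond part underflows.
        sec -= 1
        nsec += 10**9
    return (sec, nsec)


difference = subtract(a, b)
```

Every arithmetic and comparison operation would need a helper of this kind,
which is what the first criterion is meant to avoid.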

An integer fraction can be used to store any rational number without loss of
precision with any resolution: (numerator: int, denominator: int). The
timestamp value can be computed with a simple division: numerator /
denominator.

For the C implementation, a variant can be used to avoid integer overflow
because C types have a fixed size: (intpart: int, numerator: int, denominator:
int), value = intpart + numerator / denominator. Also to avoid integer
overflow in C types, numerator can be bigger than denominator while intpart can
be zero.
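
Both layouts can be modelled in Python. This is a sketch only: the helper
names are hypothetical, and ``fractions.Fraction`` is used for the division
because dividing two Python ints with ``/`` would produce a float and
reintroduce the precision loss.

```python
from fractions import Fraction


def from_fraction(numerator, denominator):
    """value = numerator / denominator, computed exactly."""
    return Fraction(numerator, denominator)


def from_intpart(intpart, numerator, denominator):
    """value = intpart + numerator / denominator, computed exactly."""
    return intpart + Fraction(numerator, denominator)


# The same timestamp in both layouts.
plain = from_fraction(1328000000000000001, 10**9)
split = from_intpart(1328000000, 1, 10**9)

# The numerator may exceed the denominator while intpart is zero.
improper = from_intpart(0, 3, 2)
```

Here ``plain`` and ``split`` compare equal, and ``improper`` is simply 3/2.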

Other formats have been proposed:

* A: (sec, nsec): value = sec + nsec * 10 ** -9
* B: (intpart, floatpart, exponent): value = intpart + floatpart * 10 ** exponent
* C: (intpart, floatpart, base, exponent): value = intpart + floatpart * base ** exponent

The format A only supports nanosecond resolution. Formats A and B lose
precision if the clock frequency is not a power of 10. The format C has a
similar issue.
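
The loss for format A can be made concrete with a clock whose frequency is not
a power of 10. The frequency below is an arbitrary example value chosen for
this sketch, and truncation to whole nanoseconds is an assumption about how
the conversion would be done.

```python
from fractions import Fraction

FREQUENCY = 3579545     # ticks per second; example value, not a power of 10
ticks = 1

# Exact duration of one tick, in seconds.
exact = Fraction(ticks, FREQUENCY)

# Format A: (sec, nsec), truncated to whole nanoseconds.
sec, remainder = divmod(ticks, FREQUENCY)
nsec = remainder * 10**9 // FREQUENCY
format_a = sec + Fraction(nsec, 10**9)

# Non-zero: one tick is not a whole number of nanoseconds.
error = exact - format_a
```

A (numerator, denominator) fraction stores the same tick as (1, 3579545)
without any rounding.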


Alternatives: API design
========================

Add a global flag to change the timestamp type
----------------------------------------------

A global flag like os.stat_decimal_times(), similar to os.stat_float_times(),
can be added to set the timestamp type globally.

A global flag may cause issues with libraries and applications expecting float
instead of Decimal. A float cannot be converted implicitly to Decimal. The
os.stat_float_times() case is different because an int can be converted
implicitly to float.
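
The difference between the two cases shows up as soon as code mixes types.
The values below are arbitrary and only illustrate the behaviour of the
built-in types.

```python
from decimal import Decimal

int_time = 1328000000
float_time = 1328000000.5
decimal_time = Decimal("1328000000.5")

# int and float mix freely: the int is converted to float.
mixed = int_time + float_time

# float and Decimal do not mix: the operation raises TypeError.
try:
    float_time + decimal_time
except TypeError:
    mixing_fails = True
else:
    mixing_fails = False
```

Code written for float timestamps that performs such arithmetic would
therefore start failing if a global flag switched the type to Decimal.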

Add a protocol to create a timestamp
------------------------------------

Instead of hardcoding how timestamps are created, a new protocol can be added
to create a timestamp from a fraction. time.time(timestamp=type) would call
type.__from_fraction__(numerator, denominator) to create a timestamp object of
the specified type.

If the type doesn't support the protocol, a fallback can be used:
type(numerator) / type(denominator).
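
The lookup and its fallback can be sketched in a few lines. Everything here is
hypothetical: ``make_timestamp()`` stands in for the code inside a function
such as time.time(), and ``Nanoseconds`` is an invented type that implements
the protocol.

```python
from decimal import Decimal
from fractions import Fraction


class Nanoseconds(int):
    """Hypothetical timestamp type: an integer count of nanoseconds."""

    @classmethod
    def __from_fraction__(cls, numerator, denominator):
        return cls(numerator * 10**9 // denominator)


def make_timestamp(timestamp_type, numerator, denominator):
    """Create a timestamp of timestamp_type from a fraction."""
    factory = getattr(timestamp_type, "__from_fraction__", None)
    if factory is not None:
        return factory(numerator, denominator)
    # Fallback for types that do not implement the protocol.
    return timestamp_type(numerator) / timestamp_type(denominator)
```

With this sketch, float, Decimal and Fraction go through the fallback, while
``make_timestamp(Nanoseconds, 3, 2)`` is built by the type itself.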

A variant is to use a "converter" callback to create a timestamp. Example
creating a float timestamp::

    def timestamp_to_float(numerator, denominator):
        # Convert the (numerator, denominator) fraction to a float.
        return float(numerator) / float(denominator)

Common converters can be provided by time, datetime and other modules, or maybe
a specific "hires" module. Users can define their own converters.

Such a protocol has a limitation: the structure of the data passed to the
protocol or the callback has to be decided once and cannot be changed later.
For example, adding a timezone or the absolute start of the timestamp (e.g.
Epoch or unspecified start for monotonic clocks) would break the API.

Add new fields to os.stat
-------------------------

It was proposed to add 3 fields to the os.stat() structure to get the
nanoseconds of timestamps.

Populating the extra fields is time consuming. If new fields are available by
default, any call to os.stat() would be slower. If new fields are optional, the
stat structure would have a variable number of fields, which can be surprising.

Anyway, this approach does not help with the time module.

Add a boolean argument
----------------------

Because we only need one new type, decimal.Decimal, a simple boolean flag
can be added. For example, time.time(decimal=True) or time.time(hires=True).

Add new functions
-----------------

Add new functions for each type, examples:

* time.clock_decimal()
* time.time_decimal()
* os.stat_decimal()
* etc.

Adding a new function for each function creating timestamps duplicates a lot
of code.


Links
=====

* `Issue #11457: os.stat(): add new fields to get timestamps as Decimal objects with nanosecond resolution <http://bugs.python.org/issue11457>`_
* `Issue #13882: Add format argument for time.time(), time.clock(), ... to get a timestamp as a Decimal object <http://bugs.python.org/issue13882>`_
* `[Python-Dev] Store timestamps as decimal.Decimal objects <http://mail.python.org/pipermail/python-dev/2012-January/116025.html>`_


Copyright
=========

This document has been placed in the public domain.