python-peps/pep-0564.rst

386 lines
12 KiB
ReStructuredText

PEP: 564
Title: Add new time functions with nanosecond resolution
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16-October-2017
Python-Version: 3.7
Abstract
========
Add five new functions to the ``time`` module: ``time_ns()``,
``perf_counter_ns()``, ``monotonic_ns()``, ``clock_gettime_ns()`` and
``clock_settime_ns()``. They are similar to the function without the
``_ns`` suffix, but have nanosecond resolution: use a number of
nanoseconds as a Python int.
The best ``time.time_ns()`` resolution measured in Python is 3 times
better then ``time.time()`` resolution on Linux and Windows.
Rationale
=========
Float type limited to 104 days
------------------------------
The clocks resolution of desktop and latop computers is getting closer
to nanosecond resolution. More and more clocks have a frequency in MHz,
up to GHz for the CPU TSC clock.
The Python ``time.time()`` function returns the current time as a
floatting point number which is usually a 64-bit binary floatting number
(in the IEEE 754 format).
The problem is that the float type starts to lose nanoseconds after 104
days. Conversion from nanoseconds (``int``) to seconds (``float``) and
then back to nanoseconds (``int``) to check if conversions lose
precision::
# no precision loss
>>> x = 2 ** 52 + 1; int(float(x * 1e-9) * 1e9) - x
0
# precision loss! (1 nanosecond)
>>> x = 2 ** 53 + 1; int(float(x * 1e-9) * 1e9) - x
-1
>>> print(datetime.timedelta(seconds=2 ** 53 / 1e9))
104 days, 5:59:59.254741
``time.time()`` returns seconds elapsed since the UNIX epoch: January
1st, 1970. This function loses precision since May 1970 (47 years ago)::
>>> import datetime
>>> unix_epoch = datetime.datetime(1970, 1, 1)
>>> print(unix_epoch + datetime.timedelta(seconds=2**53 / 1e9))
1970-04-15 05:59:59.254741
Previous rejected PEP
---------------------
Five years ago, the PEP 410 proposed a large and complex change in all
Python functions returning time to support nanosecond resolution using
the ``decimal.Decimal`` type.
The PEP was rejected for different reasons:
* The idea of adding a new optional parameter to change the result type
was rejected. It's an uncommon (and bad?) programming practice in
Python.
* It was not clear if hardware clocks really had a resolution of 1
nanosecond, especially at the Python level.
* The ``decimal.Decimal`` type is uncommon in Python and so requires
to adapt code to handle it.
Issues caused by precision loss
-------------------------------
Example 1: measure time delta
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A server is running for longer than 104 days. A clock is read before
and after running a function to measure its performance. This benchmark
lose precision only because the float type used by clocks, not because
of the clock resolution.
On Python microbenchmarks, it is common to see function calls taking
less than 100 ns. A difference of a single nanosecond becomes
significant.
Example 2: compare time with different resolution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Two programs "A" and "B" are runing on the same system, so use the system
block. The program A reads the system clock with nanosecond resolution
and writes the timestamp with nanosecond resolution. The program B reads
the timestamp with nanosecond resolution, but compares it to the system
clock read with a worse resolution. To simplify the example, let's say
that it reads the clock with second resolution. If that case, there is a
window of 1 second while the program B can see the timestamp written by A
as "in the future".
Nowadays, more and more databases and filesystems support storing time
with nanosecond resolution.
.. note::
This issue was already fixed for file modification time by adding the
``st_mtime_ns`` field to the ``os.stat()`` result, and by accepting
nanoseconds in ``os.utime()``. This PEP proposes to generalize the
fix.
CPython enhancements of the last 5 years
----------------------------------------
Since the PEP 410 was rejected:
* The ``os.stat_result`` structure got 3 new fields for timestamps as
nanoseconds (Python ``int``): ``st_atime_ns``, ``st_ctime_ns``
and ``st_mtime_ns``.
* The PEP 418 was accepted, Python 3.3 got 3 new clocks:
``time.monotonic()``, ``time.perf_counter()`` and
``time.process_time()``.
* The CPython private "pytime" C API handling time now uses a new
``_PyTime_t`` type: simple 64-bit signed integer (C ``int64_t``).
The ``_PyTime_t`` unit is an implementation detail and not part of the
API. The unit is currently ``1 nanosecond``.
Existing Python APIs using nanoseconds as int
---------------------------------------------
The ``os.stat_result`` structure has 3 fields for timestamps as
nanoseconds (``int``): ``st_atime_ns``, ``st_ctime_ns`` and
``st_mtime_ns``.
The ``ns`` parameter of the ``os.utime()`` function accepts a
``(atime_ns: int, mtime_ns: int)`` tuple: nanoseconds.
Changes
=======
New functions
-------------
This PEP adds five new functions to the ``time`` module:
* ``time.clock_gettime_ns(clock_id)``
* ``time.clock_settime_ns(clock_id, time: int)``
* ``time.perf_counter_ns()``
* ``time.monotonic_ns()``
* ``time.time_ns()``
These functions are similar to the version without the ``_ns`` suffix,
but use nanoseconds as Python ``int``.
For example, ``time.monotonic_ns() == int(time.monotonic() * 1e9)`` if
``monotonic()`` value is small enough to not lose precision.
Unchanged functions
-------------------
This PEP only proposed to add new functions getting or setting clocks
with nanosecond resolution. Clocks are likely to lose precision,
especially when their reference is the UNIX epoch.
Python has other functions handling time (get time, timeout, etc.), but
no nanosecond variant is proposed for them since they are less likely to
lose precision.
Example of unchanged functions:
* ``os`` module: ``sched_rr_get_interval()``, ``times()``, ``wait3()``
and ``wait4()``
* ``resource`` module: ``ru_utime`` and ``ru_stime`` fields of
``getrusage()``
* ``signal`` module: ``getitimer()``, ``setitimer()``
* ``time`` module: ``clock_getres()``
Since the ``time.clock()`` function was deprecated in Python 3.3, no
``time.clock_ns()`` is added.
New nanosecond flavor of these functions may be added later, if a
concrete use case comes in.
Alternatives and discussion
===========================
Sub-nanosecond resolution
-------------------------
``time.time_ns()`` API is not "future-proof": if clocks resolutions
increase, new Python functions may be needed.
In practive, the resolution of 1 nanosecond is currently enough for all
structures used by all operating systems functions.
Hardware clock with a resolution better than 1 nanosecond already
exists. For example, the frequency of a CPU TSC clock is the CPU base
frequency: the resolution is around 0.3 ns for a CPU running at 3
GHz. Users who have access to such hardware and really need
sub-nanosecond resolution can easily extend Python for their needs.
Such rare use case don't justify to design the Python standard library
to support sub-nanosecond resolution.
For the CPython implementation, nanosecond resolution is convenient: the
standard and well supported ``int64_t`` type can be used to store time.
It supports a time delta between -292 years and 292 years. Using the
UNIX epoch as reference, this type supports time since year 1677 to year
2262::
>>> 1970 - 2 ** 63 / (10 ** 9 * 3600 * 24 * 365.25)
1677.728976954687
>>> 1970 + 2 ** 63 / (10 ** 9 * 3600 * 24 * 365.25)
2262.271023045313
Modify time.time() result type
------------------------------
It was proposed to modify ``time.time()`` to return a different float
type with better precision.
The PEP 410 proposed to use ``decimal.Decimal`` which already exists and
supports arbitray precision, but it was rejected. Apart
``decimal.Decimal``, no portable ``float`` type with better precision is
currently available in Python.
Changing the builtin Python ``float`` type is out of the scope of this
PEP.
Moreover, changing existing functions to return a new type introduces a
risk of breaking the backward compatibility even the new type is
designed carefully.
Different types
---------------
Many ideas of new types were proposed to support larger or arbitrary
precision: fractions, structures or 2-tuple using integers,
fixed-precision floating point number, etc.
See also the PEP 410 for a previous long discussion on other types.
Adding a new type requires more effort to support it, than reusing
the existing ``int`` type. The standard library, third party code and
applications would have to be modified to support it.
The Python ``int`` type is well known, well supported, ease to
manipulate, and supports all arithmetic operations like:
``dt = t2 - t1``.
Moreover, using nanoseconds as integer is not new in Python, it's
already used for ``os.stat_result`` and
``os.utime(ns=(atime_ns, mtime_ns))``.
.. note::
If the Python ``float`` type becomes larger (ex: decimal128 or
float128), the ``time.time()`` precision will increase as well.
Different API
-------------
The ``time.time(ns=False)`` API was proposed to avoid adding new
functions. It's an uncommon (and bad?) programming practice in Python to
change the result type depending on a parameter.
Different options were proposed to allow the user to choose the time
resolution. If each Python module uses a different resolution, it can
become difficult to handle different resolutions, instead of just
seconds (``time.time()`` returning ``float``) and nanoseconds
(``time.time_ns()`` returning ``int``). Moreover, as written above,
there is no need for resolution better than 1 nanosecond in practive in
the Python standard library.
New time_ns module
------------------
Add a new ``time_ns`` module which contains the five new functions:
* ``time_ns.clock_gettime(clock_id)``
* ``time_ns.clock_settime(clock_id, time: int)``
* ``time_ns.perf_counter()``
* ``time_ns.monotonic()``
* ``time_ns.time()``
The first question is if the ``time_ns`` should expose exactly the same
API (constants, functions, etc.) than the ``time`` module. It can be
painful to maintain two flavors of the ``time`` module. How users use
suppose to make a choice between these two modules?
If tomorrow, other nanosecond variant are needed in the ``os`` module,
will we have to add a new ``os_ns`` module as well? There are functions
related to time in many modules: ``time``, ``os``, ``signal``,
``resource``, ``select``, etc.
Another idea is to add a ``time.ns`` submodule or a nested-namespace to
get the ``time.ns.time()`` syntax.
Annex: Clocks Resolution in Python
==================================
Script ot measure the smallest difference between two ``time.time()`` and
``time.time_ns()`` reads ignoring differences of zero::
import math
import time
LOOPS = 10 ** 6
print("time.time_ns(): %s" % time.time_ns())
print("time.time(): %s" % time.time())
min_dt = [abs(time.time_ns() - time.time_ns())
for _ in range(LOOPS)]
min_dt = min(filter(bool, min_dt))
print("min time_ns() delta: %s ns" % min_dt)
min_dt = [abs(time.time() - time.time())
for _ in range(LOOPS)]
min_dt = min(filter(bool, min_dt))
print("min time() delta: %s ns" % math.ceil(min_dt * 1e9))
Results of time(), perf_counter() and monotonic().
Linux (kernel 4.12 on Fedora 26):
* time_ns(): **84 ns**
* time(): **239 ns**
* perf_counter_ns(): 84 ns
* perf_counter(): 82 ns
* monotonic_ns(): 84 ns
* monotonic(): 81 ns
Windows 8.1:
* time_ns(): **318000 ns**
* time(): **894070 ns**
* perf_counter_ns(): 100 ns
* perf_counter(): 100 ns
* monotonic_ns(): 15000000 ns
* monotonic(): 15000000 ns
The difference on ``time.time()`` is significant: **84 ns (2.8x better)
vs 239 ns on Linux and 318 us (2.8x better) vs 894 us on Windows**. The
difference (presion loss) will be larger next years since every day adds
864,00,000,000,000 nanoseconds to the system clock.
The difference on ``time.perf_counter()`` and ``time.monotonic clock()``
is not visible in this quick script since the script runs less than 1
minute, and the uptime of the computer used to run the script was
smaller than 1 week. A significant difference should be seen with an
uptime of 104 days or greater.
.. note::
Internally, Python starts ``monotonic()`` and ``perf_counter()``
clocks at zero on some platforms which indirectly reduce the
precision loss.
Links
=====
* `bpo-31784: Implementation of the PEP 564
<https://bugs.python.org/issue31784>`_
Copyright
=========
This document has been placed in the public domain.