PEP 433: cleanup, complete, reorganize

Victor Stinner 2013-01-13 00:04:29 +01:00
parent 81f79a9a15
commit a58a370ff5
1 changed files with 192 additions and 130 deletions


@ -23,7 +23,9 @@ Rationale
On UNIX, subprocess closes file descriptors greater than 2 by default since
Python 3.2 [#subprocess_close]_. All file descriptors created by the parent
process are automatically closed.
process are automatically closed. ``xmlrpc.server.SimpleXMLRPCServer`` sets
the close-on-exec flag of the listening socket, but the parent class
``socketserver.BaseServer`` does not set this flag.
There are other cases creating a subprocess or executing a new program where
file descriptors are not closed: functions of the os.spawn*() family and third
@ -31,6 +33,9 @@ party modules calling ``exec()`` or ``fork()`` + ``exec()``. In this case, file
descriptors are shared between the parent and the child processes which is
usually unexpected and causes various issues.
This PEP proposes to continue the work started with the change in the
``subprocess`` module, to fix the issue in any code, and not just code using
``subprocess``.
Inherited file descriptors issues
---------------------------------
@ -95,22 +100,10 @@ Applications still have to close explicitly file descriptors after a
``fork()``. The close-on-exec flag only closes file descriptors after
``exec()``, and so after ``fork()`` + ``exec()``.
Many functions of the Python standard library creating file descriptors are not
changed by the PEP, and so will not have the close-on-exec flag set. Some
examples:
* ``os.urandom()``: on UNIX, it creates a file descriptor to read
``/dev/urandom``. Adding a ``cloexec`` argument to ``os.urandom()`` does
not make sense on Windows.
* ``curses.windows.getwin()`` and ``curses.windows.putwin()`` create a temporary file using ``fdopen(fd, "wb+");``
* ``mmap.mmap()`` opens ``/dev/zero`` using ``open("/dev/zero", O_RDWR);`` if
``MAP_ANONYMOUS`` is not defined.
* If the ``PYTHONSTARTUP`` environment variable is set, the corresponding file
is opened using ``fopen(startup, "r");``
* ``python script.py`` opens ``script.py`` using ``fopen(filename, "r");``
* etc.
Third party modules creating file descriptors may not set close-on-exec flag.
This PEP only changes the close-on-exec flag of file descriptors created by the
Python standard library, or by modules using the standard library. Third party
modules not using the standard library should be modified to conform to this
PEP. The new ``os.set_cloexec()`` function can be used for example.
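As a rough illustration of what a helper like the proposed ``os.set_cloexec()`` could do on UNIX, here is a minimal sketch built on the existing ``fcntl`` module. The names ``set_cloexec`` and ``get_cloexec`` below are hypothetical stand-ins for this example, not the final API of the PEP.

```python
import fcntl
import os


def set_cloexec(fd, cloexec=True):
    """Set or clear the close-on-exec flag of fd (UNIX only, hypothetical helper)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def get_cloexec(fd):
    """Return True if the close-on-exec flag is set on fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


if __name__ == "__main__":
    r, w = os.pipe()
    set_cloexec(r, True)
    print("close-on-exec set on read end:", get_cloexec(r))
    os.close(r)
    os.close(w)
```

Reading the flags before writing them is two system calls and is not atomic with the creation of the descriptor, so another thread may still fork and exec in between.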
Impacted functions:
@ -129,9 +122,13 @@ Impacted modules:
XXX Should ``subprocess.Popen`` set the close-on-exec flag on the file XXX
XXX descriptors of the ``pass_fds`` argument of the constructor? XXX
.. note::
See `Close file descriptors after fork`_ for a possible solution for
``fork()`` without ``exec()``.
Proposition
===========
Proposal
========
This PEP proposes to add a new optional argument ``cloexec`` on functions
creating file descriptors in the Python standard library. If the argument is
@ -160,41 +157,196 @@ Add a new optional ``cloexec`` argument to:
The default value of the ``cloexec`` argument is ``False`` to keep the backward
compatibility.
The close-on-exec flag will not be set on file descriptors 0 (stdin), 1
(stdout) and 2 (stderr), because these files are expected to be inherited. It
would still be possible to set close-on-exec flag explicitly using
``os.set_cloexec()``.
Drawbacks:
* Many functions of the Python standard library creating file descriptors
cannot be changed by this proposal, because adding a ``cloexec`` optional
argument would be surprising and too many functions would need it. For
example, ``os.urandom()`` uses a temporary file on UNIX, but it calls a
function of Windows API on Windows. Adding a ``cloexec`` argument to
``os.urandom()`` would not make sense. See `Always set close-on-exec flag`_
for an incomplete list of functions creating file descriptors.
* Checking if a module creates file descriptors is difficult. For example,
``os.urandom()`` creates a file descriptor on UNIX to read ``/dev/urandom``
(and closes it at exit), whereas it is implemented using a function call on
Windows. It is not possible to control close-on-exec flag of the file
descriptor used by ``os.urandom()``, because ``os.urandom()`` API does not
allow it.
Alternatives
============
Always set close-on-exec flag
-----------------------------
Always set close-on-exec flag on new file descriptors created by Python. This
alternative just changes the default value of the new ``cloexec`` argument.
If a file must be inherited by child processes, ``cloexec=False`` argument can
be used.
``subprocess.Popen`` constructor has a ``pass_fds`` argument to specify which
file descriptors must be inherited. The close-on-exec flag of these file
descriptors must be changed with ``os.set_cloexec()``.
Example of functions creating file descriptors which will be modified to
set close-on-exec flag:
* ``os.urandom()`` (on UNIX)
* ``curses.window.getwin()``, ``curses.window.putwin()``
* ``mmap.mmap()`` (if ``MAP_ANONYMOUS`` is not defined)
* ``oss.open()``
* ``Modules/main.c``: ``RunStartupFile()``
* ``Python/pythonrun.c``: ``PyRun_SimpleFileExFlags()``
* ``Modules/getpath.c``: ``search_for_exec_prefix()``
* ``Modules/zipimport.c``: ``read_directory()``
* ``Modules/_ssl.c``: ``load_dh_params()``
* ``PC/getpathp.c``: ``calculate_path()``
* ``Python/errors.c``: ``PyErr_ProgramText()``
* ``Python/import.c``: ``imp_load_dynamic()``
* TODO: ``PC/_msi.c``
Many functions are impacted indirectly by this alternative. Examples:
* ``logging.FileHandler``
Advantages of setting close-on-exec flag by default:
* There are far more programs that are bitten by FD inheritance upon
exec (see `Inherited file descriptors issues`_ and `Security`_) than
programs relying on it
(see `Applications using inherance of file descriptors`_).
Drawbacks of setting close-on-exec flag by default:
* The os module is written as a thin wrapper to system calls (to functions of
the C standard library). If atomic flags to set close-on-exec flag are not
supported (see `Appendix: Operating system support`_), a single Python
function call may call 2 or 3 system calls (see `Performances`_ section).
* Extra system calls, if any, may slow down Python: see `Performances`_.
* It violates the principle of least surprise. Developers using the os module
may expect that Python respects the POSIX standard and so that close-on-exec
flag is not set by default.
Backward compatibility: only a few programs rely on inheritance of file
descriptors, and they only pass a few file descriptors, usually just one.
These programs will fail immediately with an ``EBADF`` error, and it will be
simple to fix them: add the ``cloexec=False`` argument or use
``os.set_cloexec(fd, False)``.
The ``subprocess`` module will be changed anyway to unset close-on-exec flag on
file descriptors listed in the ``pass_fds`` argument of Popen constructor. So
it is possible that these programs will not need any fix if they use the
``subprocess`` module.
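The existing ``pass_fds`` argument of ``subprocess.Popen`` already lets a parent name the descriptors a child should keep. The sketch below assumes a UNIX system; the helper name ``read_from_child`` and the inline child script are invented for the example. The child writes to a pipe whose write end is passed explicitly.

```python
import os
import subprocess
import sys


def read_from_child():
    """Run a child that writes to an explicitly inherited pipe (UNIX only)."""
    r, w = os.pipe()
    # The child receives the descriptor number on its command line.
    child_code = "import os, sys; os.write(int(sys.argv[1]), b'hello')"
    try:
        subprocess.check_call(
            [sys.executable, "-c", child_code, str(w)],
            pass_fds=(w,),  # only this descriptor (plus 0, 1, 2) is kept open
        )
    finally:
        os.close(w)
    try:
        return os.read(r, 100)
    finally:
        os.close(r)


if __name__ == "__main__":
    print(read_from_child())
```

Because the list of inherited descriptors is explicit, the module knows exactly which descriptors need the close-on-exec flag cleared.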
Add a function to set close-on-exec flag by default
---------------------------------------------------
An alternative is to also add a function to change the default behaviour
globally. It would be possible to set close-on-exec flag for the whole
application including all modules and the Python standard library. This
alternative is based on the `Proposal`_ and adds extra changes.
Add new functions:
* ``sys.getdefaultcloexec() -> bool``: get the default value of the
close-on-exec flag for new file descriptors
* ``sys.setdefaultcloexec(cloexec: bool)``: enable or disable close-on-exec
flag; the state of the flag can be overridden in each function creating a
file descriptor
The major change is that the default value of the ``cloexec`` argument is
``sys.getdefaultcloexec()``, instead of ``False``.
When ``sys.setdefaultcloexec(True)`` is called to set close-on-exec by default,
we have the same drawbacks as the `Always set close-on-exec
flag`_ alternative.
There are additional drawbacks of having two behaviours depending on
``sys.getdefaultcloexec()`` value:
* It is no longer possible to know if the close-on-exec flag will be set or not
just by reading the source code.
Close file descriptors after fork
---------------------------------
This PEP does not fix issues with applications using ``fork()`` without
``exec()``. Python needs a generic mechanism to register callbacks which
would be called after a fork, see `Add an 'afterfork' module`_. Such
registry could be used to close file descriptors just after a ``fork()``.
Drawbacks:
* This alternative does not solve the problem for programs using ``exec()``
without ``fork()``.
* A third party module may call directly the C function ``fork()`` which will
not call "atfork" callbacks.
* All functions creating file descriptors must be changed to register a
callback and then unregister their callback when the file is closed. Or a
list of *all* open file descriptors must be maintained.
* The operating system is a better place than Python to automatically close
file descriptors. For example, it is not easy to avoid a race condition
between closing the file and unregistering the callback closing the file.
open(): add "e" flag to mode
----------------------------
A new "e" mode would set close-on-exec flag (best-effort).
This alternative only solves the problem for ``open()``. For example,
``socket.socket()`` and ``os.pipe()`` do not have a ``mode`` argument.
Since its version 2.7, the GNU libc supports ``"e"`` flag for ``fopen()``. It
uses ``O_CLOEXEC`` if available, or uses ``fcntl(fd, F_SETFD, FD_CLOEXEC)``.
With Visual Studio, ``fopen()`` accepts a ``"N"`` flag which uses ``O_NOINHERIT``.
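The "``O_CLOEXEC`` if available, else ``fcntl()``" strategy described for the GNU libc can be sketched in Python on UNIX as follows. The function name ``open_cloexec`` is hypothetical; ``os.O_CLOEXEC`` is looked up with ``getattr()`` because it is not defined on every platform.

```python
import fcntl
import os


def open_cloexec(path, flags=os.O_RDONLY, mode=0o666):
    """Open path with close-on-exec set, atomically when possible (UNIX only)."""
    o_cloexec = getattr(os, "O_CLOEXEC", 0)
    fd = os.open(path, flags | o_cloexec, mode)
    if not o_cloexec:
        # Best-effort fallback: two extra system calls, not atomic.
        try:
            current = fcntl.fcntl(fd, fcntl.F_GETFD)
            fcntl.fcntl(fd, fcntl.F_SETFD, current | fcntl.FD_CLOEXEC)
        except OSError:
            os.close(fd)
            raise
    return fd


if __name__ == "__main__":
    fd = open_cloexec(os.devnull)
    print(bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))
    os.close(fd)
```

The atomic path avoids the window in which another thread could fork and exec before the flag is set.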
Applications using inherance of file descriptors
================================================
Most developers don't know that file descriptors are inherited by default. Most
programs do not rely on inheritance of file descriptors. For example,
``subprocess.Popen`` was changed in Python 3.2 to close all file descriptors
greater than 2 in the child process by default. No user complained about this
behavior change.
Network servers using fork may want to pass the client socket to the child
process. For example, a CGI server passes the client socket through file
descriptors 0 (stdin) and 1 (stdout) using ``dup2()``.
process. For example, on UNIX a CGI server passes the client socket through file
descriptors 0 (stdin) and 1 (stdout) using ``dup2()``. This specific case is
not impacted by this PEP because the close-on-exec flag is never set on file
descriptors smaller than 3.
Example of programs taking file descriptors from the parent process:
To access a restricted resource like creating a socket listening on a TCP port
lower than 1024 or reading a file containing sensitive data like passwords, a
common practice is: start as the root user, create a file descriptor, create
a child process, pass the file descriptor to the child process and exit.
Security is very important in such a use case: leaking another file descriptor
would be a critical security vulnerability (see `Security`_). Instead of
exiting, the root process may keep running to monitor the child process, and
start a new child process with the same file descriptor if the previous child
process crashed.
Example of programs taking file descriptors from the parent process using a
command line option:
* gpg: ``--status-fd <fd>``, ``--logger-fd <fd>``, etc.
* openssl: ``-pass fd:<fd>``
* qemu: ``-add-fd <fd>``
* valgrind: ``--log-fd=<fd>``, ``--input-fd=<fd>``, etc.
* qemu: ``-add-fd <fd>`` command line option
* GnuPG: ``--status-fd <fd>``, ``--logger-fd <fd>``, etc.
* openssl command: ``-pass fd:<fd>``
* xterm: ``-S <fd>``
On Linux, it is possible to use ``/dev/fd/<fd>`` filename to pass a file
descriptor to a program expecting a filename. It can be used to pass a password
for example.
These applications only pass a few file descriptors, usually only one.
Fixing these applications to unset close-on-exec flag should be easy.
If the ``subprocess`` module is used, inherited file descriptors must be specified
using the ``pass_fds`` argument (except if the ``close_fds`` argument is set
explicitly to ``False``). So the ``subprocess`` module knows the list of file
descriptors on which close-on-exec flag must be unset.
File descriptors 0 (stdin), 1 (stdout) and 2 (stderr) are expected to be
inherited and so should not have the close-on-exec flag. So a CGI server should
not be impacted by this PEP.
On Linux, it is possible to use ``"/dev/fd/<fd>"`` filename to pass a file
descriptor to a program expecting a filename.
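A minimal sketch of the ``/dev/fd/<fd>`` technique, assuming a system that provides ``/dev/fd`` (Linux, for example). The helper name ``pass_secret_by_filename`` and the inline child script are invented for the example; the child only sees a filename and never the secret on its command line.

```python
import os
import subprocess
import sys


def pass_secret_by_filename(secret):
    """Hand bytes to a child through a pipe exposed as /dev/fd/<fd>."""
    r, w = os.pipe()
    try:
        os.write(w, secret)  # small payload, fits in the pipe buffer
    finally:
        os.close(w)  # the child sees EOF after the payload
    # The child treats its argument as an ordinary filename.
    child_code = "import sys; sys.stdout.write(open(sys.argv[1]).read())"
    try:
        return subprocess.check_output(
            [sys.executable, "-c", child_code, "/dev/fd/%d" % r],
            pass_fds=(r,),  # keep the read end open in the child
        )
    finally:
        os.close(r)


if __name__ == "__main__":
    if os.path.exists("/dev/fd"):
        print(pass_secret_by_filename(b"s3cret"))
```

The descriptor must stay inheritable for this to work, which is why it is listed in ``pass_fds``.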
Performances
@ -313,96 +465,6 @@ There is no backward incompatible change. The default behaviour is unchanged:
the close-on-exec flag is not set by default.
Alternatives
============
Always set close-on-exec flag
-----------------------------
Always set close-on-exec flag on new file descriptors created by Python. This
alternative just changes the default value of the new ``cloexec`` argument.
``subprocess.Popen`` constructor has a ``pass_fds`` argument to specify which
file descriptors must be inherited. The close-on-exec flag of these file
descriptors must be changed with ``os.set_cloexec()``.
If the close-on-exec flag must not be set, ``cloexec=False`` can be specified.
Advantages of setting close-on-exec flag by default:
* There are far more programs that are bitten by FD inheritance upon
exec (see `Inherited file descriptors issues`_ and `Security`_) than
programs relying on it.
* Checking if a module creates file descriptors is difficult. For example,
``os.urandom()`` creates a file descriptor on UNIX to read ``/dev/urandom``
(and closes it at exit), whereas it is implemented using a function call on
Windows. It is not possible to control close-on-exec flag of the file
descriptor used by ``os.urandom()``, because ``os.urandom()`` API does not
allow it.
* No need to add a new ``cloexec`` argument everywhere: functions creating
file descriptors will read ``sys.getdefaultcloexec()`` to decide if the
close-on-exec flag must be set or not. For example, adding a ``cloexec``
argument to ``os.urandom()`` does not make sense on Windows.
Drawbacks of setting close-on-exec flag by default:
* The os module is written as a thin wrapper to system calls (to functions of
the C standard library). If atomic flags are not supported, a single Python
function call may now call 2 or 3 system calls (see `Performances`_
section).
* Extra system calls, if any, may slow down Python: see `Performances`_.
* It violates the principle of least surprise. Developers using the os module
may expect that Python respects the POSIX standard and so that close-on-exec
flag is not set by default.
* Only file descriptors created by the Python standard library will comply with
``sys.setdefaultcloexec()``. The close-on-exec flag is unchanged for file
descriptors created by third party modules calling directly C functions.
Third party modules will have to be modified to read
``sys.getdefaultcloexec()`` to make them comply with this PEP.
Add a function to set close-on-exec flag by default
---------------------------------------------------
An alternative is to also add a function to change the default behaviour
globally. It would be possible to set close-on-exec flag for the whole
application including all modules and the Python standard library. This
alternative is based on the PEP but adds extra changes.
Add new functions:
* ``sys.getdefaultcloexec() -> bool``: get the default value of the
close-on-exec flag for new file descriptors
* ``sys.setdefaultcloexec(cloexec: bool)``: enable or disable close-on-exec
flag; the state of the flag can be overridden in each function creating a
file descriptor
The major change is that the default value of the ``cloexec`` argument is
``sys.getdefaultcloexec()``, instead of ``False``.
When ``sys.setdefaultcloexec(True)`` is called to set close-on-exec by default,
we have the same drawbacks as the `Always set close-on-exec
flag`_ alternative.
There are additional drawbacks of having two behaviours depending on
``sys.getdefaultcloexec()`` value:
* It is no longer possible to know if the close-on-exec flag will be set or not
just by reading the source code.
open(): add "e" flag to mode
----------------------------
A new "e" mode would set close-on-exec flag (best-effort).
This API does not allow the close-on-exec flag to be disabled explicitly if it
was enabled globally with ``sys.setdefaultcloexec()``.
Note: Since its version 2.7, the GNU libc supports ``"e"`` flag for ``fopen()``.
It uses ``O_CLOEXEC`` if available, or uses ``fcntl(fd, F_SETFD, FD_CLOEXEC)``.
Appendix: Operating system support
==================================