PEP: 446
Title: Make newly created file descriptors non-inheritable
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 5-August-2013
Python-Version: 3.4


Abstract
========

Leaking file descriptors in child processes causes various annoying
issues and is a known major security vulnerability. Using the
``subprocess`` module with the *close_fds* parameter set to ``True`` is
not possible in some cases, and performs poorly on some platforms.

This PEP proposes to make all file descriptors created by Python
non-inheritable by default to reduce the risk of these issues. This PEP
also fixes a race condition in multithreaded applications on operating
systems supporting atomic flags to create non-inheritable file
descriptors.


Rationale
=========

Inheritance of File Descriptors
-------------------------------

Each operating system handles the inheritance of file descriptors
differently. Windows creates non-inheritable handles by default, whereas
UNIX and the POSIX API of Windows create inheritable file descriptors by
default. Python prefers the POSIX API over the native Windows API to
have a single code base and to use the same type for file descriptors,
and so it creates inheritable file descriptors.

There is one exception: ``os.pipe()`` creates non-inheritable pipes on
Windows, whereas it creates inheritable pipes on UNIX. The reason is an
implementation artifact: ``os.pipe()`` calls ``CreatePipe()`` on Windows
(native API), whereas it calls ``pipe()`` on UNIX (POSIX API). The call
to ``CreatePipe()`` was added in Python in 1994, before the introduction
of ``pipe()`` in the POSIX API in Windows 98. The `issue #4708 `_
proposes to change ``os.pipe()`` on Windows to create inheritable pipes.


Inheritance of File Descriptors on Windows
------------------------------------------

On Windows, the native type of file objects is the handle (C type
``HANDLE``).
These handles have a ``HANDLE_FLAG_INHERIT`` flag which defines whether
a handle can be inherited by a child process. For the POSIX API, the C
runtime (CRT) also provides file descriptors (C type ``int``). The
handle of a file descriptor can be retrieved using the function
``_get_osfhandle(fd)``. A file descriptor can be created from a handle
using the function ``_open_osfhandle(handle)``.

Using `CreateProcess() `_, handles are only inherited if their
inheritable flag (``HANDLE_FLAG_INHERIT``) is set and if the
``bInheritHandles`` parameter of ``CreateProcess()`` is ``TRUE``; all
file descriptors except standard streams (0, 1, 2) are closed in the
child process, even if ``bInheritHandles`` is ``TRUE``. Using the
``spawnv()`` function, all inheritable handles and all inheritable file
descriptors are inherited by the child process. This function uses the
undocumented fields *cbReserved2* and *lpReserved2* of the
`STARTUPINFO `_ structure to pass an array of file descriptors.

To replace standard streams (stdin, stdout, stderr) using
``CreateProcess()``, the ``STARTF_USESTDHANDLES`` flag must be set in
the *dwFlags* field of the ``STARTUPINFO`` structure and the
*bInheritHandles* parameter of ``CreateProcess()`` must be set to
``TRUE``. So when at least one standard stream is replaced, all
inheritable handles are inherited by the child process.

See also:

* `Handle Inheritance `_
* `Q315939: PRB: Child Inherits Unintended Handles During
  CreateProcess Call `_


Inheritance of File Descriptors on UNIX
---------------------------------------

POSIX provides a *close-on-exec* flag on file descriptors to
automatically close a file descriptor when the C function ``execv()`` is
called. File descriptors with the *close-on-exec* flag cleared are
inherited by the child process; file descriptors with the flag set are
closed in the child process.

The flag can be set in two syscalls (one to get current flags, a second
to set new flags) using ``fcntl()``::

    int flags, res;
    flags = fcntl(fd, F_GETFD);
    if (flags == -1) { /* handle the error */ }
    flags |= FD_CLOEXEC;
    /* or "flags &= ~FD_CLOEXEC;" to clear the flag */
    res = fcntl(fd, F_SETFD, flags);
    if (res == -1) { /* handle the error */ }

FreeBSD, Linux, Mac OS X, NetBSD, OpenBSD and QNX also support setting
the flag in a single syscall using ``ioctl()``::

    int res;
    res = ioctl(fd, FIOCLEX, 0);
    /* ioctl() returns -1 on failure, 0 on success */
    if (res == -1) { /* handle the error */ }

The *close-on-exec* flag has no effect on ``fork()``: all file
descriptors are inherited by the child process. The `Python issue #16500
"Add an atfork module" `_ proposes to add a new ``atfork`` module to
execute code at fork; it may be used to close file descriptors
automatically.


Issues with Inheritable File Descriptors
----------------------------------------

Most of the time, inheritable file descriptors "leaked" in child
processes are not noticed, because they don't cause major bugs. It does
not mean that these bugs must not be fixed.

Two common issues with inherited file descriptors:

* On Windows, a directory cannot be removed before all file handles open
  in the directory are closed. The same issue can be seen with files,
  except if the file was created with the ``FILE_SHARE_DELETE`` flag
  (``O_TEMPORARY`` mode for ``open()``).
* If a listening socket is leaked in a child process, the socket address
  cannot be reused until both the parent and child processes have
  terminated. For example, if a web server spawns a new program to
  handle a process, and the server restarts while the program is not
  done, the server cannot start because the TCP port is still in use.

Example of issues in open source projects:

* `Mozilla (Firefox) `_: open since 2002-05
* `dbus library `_: fixed in 2008-05 (`dbus commit `_), close file
  descriptors in the child process
* `autofs `_: fixed in 2009-02, set the CLOEXEC flag
* `qemu `_: fixed in 2009-12 (`qemu commit `_), set CLOEXEC flag
* `Tor `_: fixed in 2010-12, set CLOEXEC flag
* `OCaml `_: open since 2011-04, "PR#5256: Processes opened using
  Unix.open_process* inherit all opened file descriptors (including
  sockets)"
* `ØMQ `_: open since 2012-08
* `Squid `_: open since 2012-07


Security Vulnerability
----------------------

Leaking file descriptors is also a well known security vulnerability:
read `FIO42-C. Ensure files are properly closed when they are no longer
needed `_ of the CERT.

An untrusted child process can read sensitive data like passwords and
take control of the parent process through leaked file descriptors. It
is for example a way to escape from a chroot. With a leaked listening
socket, a child process can accept new connections to read sensitive
data.

Example of vulnerabilities:

* `Hijacking Apache https by mod_php `_ (2003)

  * Apache: `Apr should set FD_CLOEXEC if APR_FOPEN_NOCLEANUP is not
    set `_: fixed in 2009
  * PHP: `system() (and similar) don't cleanup opened handles of
    Apache `_: open since 2006

* `CWE-403: Exposure of File Descriptor to Unintended Control Sphere `_
  (2008)
* `OpenSSH Security Advisory: portable-keysign-rand-helper.adv `_ (2011)


Issues fixed in the subprocess module
-------------------------------------

Inherited file descriptors caused 4 issues in the ``subprocess`` module:

* `Issue #2320: Race condition in subprocess using stdin `_ (created in
  2008)
* `Issue #3006: subprocess.Popen causes socket to remain open after
  close `_ (created in 2008)
* `Issue #7213: subprocess leaks open file descriptors between Popen
  instances causing hangs `_ (created in 2009)
* `Issue #12786: subprocess wait() hangs when stdin is closed `_
  (created in 2011)

These issues were fixed in Python 3.2 by 4 different changes in the
``subprocess`` module:

* Pipes are now non-inheritable;
* The default value of the *close_fds* parameter is now ``True``, with
  one exception on Windows: the default value is ``False`` if at least
  one standard stream is replaced;
* A new *pass_fds* parameter has been added;
* Creation of a ``_posixsubprocess`` module implemented in C.


Atomic Creation of non-inheritable File Descriptors
---------------------------------------------------

In a multithreaded application, an inheritable file descriptor can be
created just before a new program is spawned, before the file descriptor
is made non-inheritable. In this case, the file descriptor is leaked to
the child process. This race condition can be avoided if the file
descriptor is created non-inheritable directly.

FreeBSD, Linux, Mac OS X, Windows and many other operating systems
support creating non-inheritable file descriptors with the inheritable
flag cleared atomically at the creation of the file descriptor.

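
As an illustration of atomic creation, the sketch below requests the
close-on-exec flag in the creating call itself, using the POSIX
``O_CLOEXEC`` flag that Python exposes as ``os.O_CLOEXEC``. It assumes a
Linux-like system where ``os.pipe2()`` and the ``fcntl`` module are
available; the helper names ``is_close_on_exec()`` and
``open_non_inheritable()`` are hypothetical and not part of this PEP.

```python
import fcntl
import os


def is_close_on_exec(fd):
    """Return True if the close-on-exec flag is set on fd (POSIX only)."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def open_non_inheritable(path):
    """Open path read-only, asking for close-on-exec in the open() call.

    The flag is applied by the kernel when the descriptor is created, so
    no other thread can observe the descriptor without it.
    """
    return os.open(path, os.O_RDONLY | os.O_CLOEXEC)


fd = open_non_inheritable(os.devnull)
try:
    flag_set_at_creation = is_close_on_exec(fd)
finally:
    os.close(fd)

# pipe2() is the pipe() variant that accepts creation flags.
read_end, write_end = os.pipe2(os.O_CLOEXEC)
try:
    pipe_flags = (is_close_on_exec(read_end), is_close_on_exec(write_end))
finally:
    os.close(read_end)
    os.close(write_end)
```

Checking the flag with ``fcntl()`` after creation, as
``is_close_on_exec()`` does, is also how code can detect a kernel that
silently ignores ``O_CLOEXEC``.
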
A new ``WSA_FLAG_NO_HANDLE_INHERIT`` flag for ``WSASocket()`` was added
in Windows 7 SP1 and Windows Server 2008 R2 SP1 to create
non-inheritable sockets. If this flag is used on an older Windows
version (e.g. Windows XP SP3), ``WSASocket()`` fails with
``WSAEPROTOTYPE``.

On UNIX, new flags were added for files and sockets:

* ``O_CLOEXEC``: available on Linux (2.6.23), FreeBSD (8.3),
  Mac OS X 10.8, OpenBSD 5.0, Solaris 11, QNX, BeOS, next NetBSD release
  (6.1?). This flag is part of POSIX.1-2008.
* ``SOCK_CLOEXEC`` flag for ``socket()`` and ``socketpair()``,
  available on Linux 2.6.27, OpenBSD 5.2, NetBSD 6.0.
* ``fcntl()``: ``F_DUPFD_CLOEXEC`` flag, available on Linux 2.6.24,
  OpenBSD 5.0, FreeBSD 9.1, NetBSD 6.0, Solaris 11. This flag is part
  of POSIX.1-2008.
* ``fcntl()``: ``F_DUP2FD_CLOEXEC`` flag, available on FreeBSD 9.1
  and Solaris 11.
* ``recvmsg()``: ``MSG_CMSG_CLOEXEC``, available on Linux 2.6.23,
  NetBSD 6.0.

On Linux older than 2.6.23, the ``O_CLOEXEC`` flag is simply ignored. So
``fcntl()`` must be called to check if the file descriptor is
non-inheritable: ``O_CLOEXEC`` is not supported if the ``FD_CLOEXEC``
flag is missing. On Linux older than 2.6.27, ``socket()`` or
``socketpair()`` fail with ``errno`` set to ``EINVAL`` if the
``SOCK_CLOEXEC`` flag is set in the socket type.

New functions:

* ``dup3()``: available on Linux 2.6.27 (and glibc 2.9)
* ``pipe2()``: available on Linux 2.6.27 (and glibc 2.9)
* ``accept4()``: available on Linux 2.6.28 (and glibc 2.10)

On Linux older than 2.6.28, ``accept4()`` fails with ``errno`` set to
``ENOSYS``.

Summary:

===========================  ===============  ====================================
Operating System             Atomic File      Atomic Socket
===========================  ===============  ====================================
FreeBSD                      8.3 (2012)       X
Linux                        2.6.23 (2007)    2.6.27 (2008)
Mac OS X                     10.8 (2012)      X
NetBSD                       6.1 (?)          6.0 (2012)
OpenBSD                      5.0 (2011)       5.2 (2012)
Solaris                      11 (2011)        X
Windows                      XP (2001)        7 SP1 (2011), 2008 R2 SP1 (2011)
===========================  ===============  ====================================

Legend:

* "Atomic File": first version of the operating system supporting
  atomic creation of a non-inheritable file descriptor using ``open()``
* "Atomic Socket": first version of the operating system supporting
  atomic creation of a non-inheritable socket
* "X": not supported yet


Status of Python 3.3
--------------------

Python 3.3 creates inheritable file descriptors on all platforms, except
``os.pipe()`` which creates non-inheritable file descriptors on Windows.

New constants and functions related to the atomic creation of
non-inheritable file descriptors were added to Python 3.3:
``os.O_CLOEXEC``, ``os.pipe2()`` and ``socket.SOCK_CLOEXEC``.

On UNIX, the ``subprocess`` module closes all file descriptors in the
child process by default, except standard streams (0, 1, 2) and file
descriptors of the *pass_fds* parameter. If the *close_fds* parameter is
set to ``False``, all inheritable file descriptors are inherited by the
child process.

On Windows, the ``subprocess`` module closes all handles and file
descriptors in the child process by default. If at least one standard
stream (stdin, stdout or stderr) is replaced (e.g. redirected into a
pipe), all inheritable handles are inherited by the child process.

All inheritable file descriptors are inherited by the child process
using the functions of the ``os.execv*()`` and ``os.spawn*()`` families.

On UNIX, the ``multiprocessing`` module uses ``os.fork()`` and so all
file descriptors are inherited by child processes.

On Windows, all inheritable handles are inherited by the child process
using the ``multiprocessing`` module; all file descriptors except
standard streams are closed.

Summary:

===========================  =============  ==================  =============
Module                       FD on UNIX     Handles on Windows  FD on Windows
===========================  =============  ==================  =============
subprocess, default          STD, pass_fds  none                STD
subprocess, replace stdout   STD, pass_fds  all                 STD
subprocess, close_fds=False  all            all                 STD
multiprocessing              all            all                 STD
os.execv(), os.spawn()       all            all                 all
===========================  =============  ==================  =============

Legend:

* "all": all *inheritable* file descriptors or handles are inherited in
  the child process
* "none": all handles are closed in the child process
* "STD": only file descriptors 0 (stdin), 1 (stdout) and 2 (stderr) are
  inherited in the child process
* "pass_fds": file descriptors of the *pass_fds* parameter of the
  subprocess are inherited


Performance of Closing All File Descriptors
--------------------------------------------

On UNIX, the ``subprocess`` module closes almost all file descriptors in
the child process. This operation requires MAXFD system calls, where
MAXFD is the maximum number of file descriptors, even if there are only
a few open file descriptors. This maximum can be read using:
``sysconf("SC_OPEN_MAX")``.

The operation can be slow if MAXFD is large. For example, on a FreeBSD
buildbot with ``MAXFD=655,000``, the operation took 300 ms: see
`issue #11284: slow close file descriptors `_.

On Linux, Python 3.3 gets the list of all open file descriptors from
``/proc//fd/``, and so performance depends on the number of open file
descriptors, not on MAXFD.

See also:

* `Python issue #1663329 `_: subprocess close_fds perform poor if
  ``SC_OPEN_MAX`` is high
* `Squid Bug #837033 `_: Squid should set CLOEXEC on opened FDs. "32k+
  close() calls in each child process take a long time ([12-56] seconds)
  in Xen PV guests."

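
The difference between the two strategies can be sketched from Python.
The example assumes Linux with ``/proc`` mounted and uses
``/proc/self/fd`` (the entry for the calling process) as the directory
to list; the function names are hypothetical and this is not the code
used by the ``subprocess`` module.

```python
import os


def max_fd_limit():
    """Upper bound on descriptor numbers: the cost of a brute-force loop."""
    return os.sysconf("SC_OPEN_MAX")


def open_fds_from_proc():
    """Return the descriptors currently open in this process (Linux only).

    The descriptor used to read the directory appears in the listing but
    is closed again by the time os.listdir() returns, so entries that
    are no longer open are skipped.
    """
    fds = []
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        try:
            os.fstat(fd)
        except OSError:
            continue  # closed since the directory was read
        fds.append(fd)
    return sorted(fds)


read_end, write_end = os.pipe()
try:
    open_now = open_fds_from_proc()
    limit = max_fd_limit()
finally:
    os.close(read_end)
    os.close(write_end)
```

A loop over ``range(limit)`` costs one system call per possible
descriptor, whereas iterating over ``open_now`` costs one per descriptor
that actually exists.
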

Proposal
========

Non-inheritable File Descriptors
--------------------------------

The following functions are modified to make newly created file
descriptors non-inheritable by default:

* ``asyncore.dispatcher.create_socket()``
* ``io.FileIO``
* ``io.open()``
* ``open()``
* ``os.dup()``
* ``os.dup2()``
* ``os.fdopen()``
* ``os.open()``
* ``os.openpty()``
* ``os.pipe()``
* ``select.devpoll()``
* ``select.epoll()``
* ``select.kqueue()``
* ``socket.socket()``
* ``socket.socket.accept()``
* ``socket.socket.dup()``
* ``socket.socket.fromfd()``
* ``socket.socketpair()``

When available, atomic flags are used to make file descriptors
non-inheritable. Atomicity is not guaranteed because a fallback is
required when atomic flags are not available.


New Functions
-------------

* ``os.get_inheritable(fd: int)``: return ``True`` if the file
  descriptor can be inherited by child processes, ``False`` otherwise.
* ``os.set_inheritable(fd: int, inheritable: bool)``: clear or set the
  inheritable flag of the specified file descriptor.

These new functions are available on all platforms.

On Windows, these functions also accept file descriptors of sockets:
the result of ``sockobj.fileno()``.


Other Changes
-------------

* On UNIX, subprocess makes file descriptors of the *pass_fds* parameter
  inheritable. The file descriptor is made inheritable in the child
  process after the ``fork()`` and before ``execv()``, so the
  inheritable flag of file descriptors is unchanged in the parent
  process.

* ``os.dup2(fd, fd2)`` makes *fd2* inheritable if *fd2* is ``0``
  (stdin), ``1`` (stdout) or ``2`` (stderr) and *fd2* is different from
  *fd*.

Since Python should only create non-inheritable file descriptors, it is
safe to use subprocess with the *close_fds* parameter set to ``False``.
Not closing file descriptors explicitly is faster, especially on
platforms with a large maximum number of file descriptors.

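
The behaviour proposed above can be observed with a short sketch. It
assumes a UNIX system and an interpreter that implements this PEP
(Python 3.4 or later; ``subprocess.run()`` needs 3.5); ``CHILD_CODE``
and ``child_view()`` are hypothetical names introduced only for the
demonstration.

```python
import os
import subprocess
import sys

# Child program: try to read 5 bytes from the descriptor number in argv[1].
CHILD_CODE = """
import os, sys
try:
    print(os.read(int(sys.argv[1]), 5).decode())
except OSError:
    print("closed")
"""


def child_view(fd, **popen_kwargs):
    """Run a child interpreter and report what it sees on descriptor fd."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(fd)],
        stdout=subprocess.PIPE,
        **popen_kwargs
    )
    return result.stdout.decode().strip()


read_fd, write_fd = os.pipe()
try:
    os.write(write_fd, b"hello")

    # Descriptors created by os.pipe() start out non-inheritable.
    created_inheritable = os.get_inheritable(read_fd)

    # close_fds=False skips the explicit close loop, but the descriptor
    # is still closed at exec because it is non-inheritable.
    without_pass_fds = child_view(read_fd, close_fds=False)

    # pass_fds makes the descriptor inheritable in the child only.
    with_pass_fds = child_view(read_fd, pass_fds=(read_fd,))
    parent_flag_after = os.get_inheritable(read_fd)

    # The flag can also be toggled explicitly in the parent.
    os.set_inheritable(read_fd, True)
    after_set = os.get_inheritable(read_fd)
finally:
    os.close(read_fd)
    os.close(write_fd)
```

The order of the two child runs matters in this sketch: the first child
cannot read the pipe, so the data is still available for the second one.
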
The default value of the *close_fds* parameter is unchanged, because
third party modules, especially extensions implemented in C, may not
conform immediately to PEP 446 (they may still create inheritable file
descriptors).


Backward Compatibility
======================

This PEP breaks applications relying on inheritance of file descriptors.
Developers are encouraged to reuse the high-level Python module
``subprocess`` which handles the inheritance of file descriptors in a
portable way.

Applications using the ``subprocess`` module with the *pass_fds*
parameter or using ``os.dup2()`` to redirect standard streams should not
be affected.

Python no longer conforms to POSIX, since file descriptors are now made
non-inheritable by default. Python was not designed to conform to POSIX,
but was designed to develop portable applications.


Related Work
============

The programming languages Go, Perl and Ruby make newly created file
descriptors non-inheritable by default: since Go 1.0 (2009), Perl 1.0
(1987) and Ruby 2.0 (2013).

The SCons project, written in Python, overrides builtin functions
``file()`` and ``open()`` to make files non-inheritable on Windows:
see `win32.py `_.


Rejected Alternatives
=====================

Add a new open_noinherit() function
-----------------------------------

In June 2007, Henning von Bargen proposed on the python-dev mailing list
to add a new open_noinherit() function to fix issues of inherited file
descriptors in child processes. At this time, the default value of the
*close_fds* parameter of the subprocess module was ``False``.

Read the mail thread: `[Python-Dev] Proposal for a new function
"open_noinherit" to avoid problems with subprocesses and security risks `_.


PEP 433
-------

PEP 433, entitled "Easier suppression of file descriptor inheritance",
is a previous attempt proposing various other alternatives, but no
consensus could be reached.


No special case for standard streams
------------------------------------

Functions handling file descriptors should not handle standard streams
(file descriptors ``0``, ``1``, ``2``) differently.

This option does not work on Windows. On Windows, calling
``SetHandleInformation()`` to set or clear the ``HANDLE_FLAG_INHERIT``
flag on standard streams (0, 1, 2) fails with the Windows error 87
(invalid argument). If ``os.dup2(fd, fd2)`` always made *fd2*
non-inheritable, the function would raise an exception when used to
redirect standard streams.

Another option is to add a new *inheritable* parameter to ``os.dup2()``.

This PEP has a special case for ``os.dup2()`` to not break backward
compatibility on applications redirecting standard streams before
calling the C function ``execv()``. Examples in the Python standard
library: ``CGIHTTPRequestHandler.run_cgi()`` and ``pty.fork()`` use
``os.dup2()`` to redirect stdin, stdout and stderr.


Links
=====

Python issues:

* `#10115: Support accept4() for atomic setting of flags at socket
  creation `_
* `#12105: open() does not able to set flags, such as O_CLOEXEC `_
* `#12107: TCP listening sockets created without FD_CLOEXEC flag `_
* `#16850: Add "e" mode to open(): close-and-exec
  (O_CLOEXEC) / O_NOINHERIT `_
* `#16860: Use O_CLOEXEC in the tempfile module `_
* `#16946: subprocess: _close_open_fd_range_safe() does not set
  close-on-exec flag on Linux < 2.6.23 if O_CLOEXEC is defined `_
* `#17070: Use the new cloexec to improve security and avoid bugs `_
* `#18571: Implementation of the PEP 446: non-inheritable file
  descriptors `_

Other links:

* `Secure File Descriptor Handling `_ (Ulrich Drepper, 2008)
* `Ghosts of Unix past, part 2: Conflated designs `_ (Neil Brown, 2010)
  explains the history of ``O_CLOEXEC`` and ``O_NONBLOCK`` flags
* `File descriptor handling changes in 2.6.27 `_


Copyright
=========

This document has been placed into the public domain.