python-peps/pep-0324.txt

367 lines
14 KiB
Plaintext
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

PEP: 324
Title: process - New POSIX process module
Version: $Revision$
Last-Modified: $Date$
Author: Peter Astrand <astrand@lysator.liu.se>
Status: Draft
Type: Standards Track (library)
Created: 19-Nov-2003
Content-Type: text/plain
Python-Version: 2.4
Abstract
This PEP describes a new module for starting and communicating
with processes on POSIX systems.
Motivation
Starting new processes is a common task in any programming
language, and very common in a high-level language like Python.
Good support for this task is needed, because:
- Inappropriate functions for starting processes could mean a
security risk: If the program is started through the shell, and
the arguments contain shell meta characters, the result can be
disastrous. [1]
- It makes Python an even better replacement language for
over-complicated shell scripts.
Currently, Python has a large number of different functions for
process creation. This makes it hard for developers to choose.
The process module provides the following enhancements over
previous functions:
- One "unified" module provides all functionality from previous
functions.
- Cross-process exceptions: Exceptions happening in the child
before the new process has started to execute are re-raised in
the parent. This means that it's easy to handle exec()
failures, for example. With popen2, for example, it's
impossible to detect if the execution failed.
- A hook for executing custom code between fork and exec. This
can be used for, for example, changing uid.
- No implicit call of /bin/sh. This means that there is no need
for escaping dangerous shell meta characters.
- All combinations of file descriptor redirection is possible.
For example, the "python-dialog" [2] needs to spawn a process
and redirect stderr, but not stdout. This is not possible with
current functions, without using temporary files.
- With the process module, it's possible to control if all open
file descriptors should be closed before the new program is
executed.
- Support for connecting several subprocesses (shell "pipe").
- Universal newline support.
- A communicate() method, which makes it easy to send stdin data
and read stdout and stderr data, without risking deadlocks.
Most people are aware of the flow control issues involved with
child process communication, but not all have the patience or
skills to write a fully correct and deadlock-free select loop.
This means that many Python applications contain race
conditions. A communicate() method in the standard library
solves this problem.
Rationale
The following points summarizes the design:
- process was based on popen2, which is tried-and-tested.
- The factory functions in popen2 have been removed, because I
consider the class constructor equally easy to work with.
- popen2 contains several factory functions and classes for
different combinations of redirection. process, however,
contains one single class. Since the process module supports 12
different combinations of redirection, providing a class or
function for each of them would be cumbersome and not very
intuitive. Even with popen2, this is a readability problem.
For example, many people cannot tell the difference between
popen2.popen2 and popen2.popen4 without using the documentation.
- Two small utility functions are provided: process.call() and
process.callv(). These aims to be an enhancement over
os.system(), while still very easy to use:
- It does not use the Standard C function system(), which has
limitations.
- It does not call the shell implicitly.
- No need for quoting; using a variable argument list.
- The return value is easier to work with.
The call() utility function accepts an 'args' argument, just
like the Popen class constructor. It waits for the command to
complete, then returns the returncode attribute. The
implementation is very simple:
def call(*args, **kwargs):
return Popen(*args, **kwargs).wait()
The motivation behind the call() function is simple: Starting a
process and wait for it to finish is a common task.
The callv() function is identical to call(), except that each
non-keyword argument is treated as a program argument. This
gives a slightly nicer syntax. The drawback is that callv() does
not allow specifying the program and it's arguments as a
whitespace-separated string: The entire (first) string would be
intepreted as the executable. The implementation of callv() is
also very simple:
def callv(*args, **kwargs):
return Popen(args, **kwargs).wait()
While Popen supports a wide range of options, many users have
simple needs. Many people are using os.system() today, mainly
because it provides a simple interface. Consider this example:
os.system("stty sane -F " + device)
With process.call(), this would look like:
process.call(["stty", "sane", "-F", device])
Some people feel that the list brackets are clumsy. With
callv(), they are not needed:
process.callv("stty", "sane", "-F", device)
- The "preexec" functionality makes it possible to run arbitrary
code between fork and exec. One might ask why there are special
arguments for setting the environment and current directory, but
not for, for example, setting the uid. The answer is:
- Changing environment and working directory is considered
fairly common.
- Old functions like spawn() has support for an
"env"-argument.
- env and cwd are considered quite cross-platform: They make
sense even on Windows.
Specification
This module defines one class called Popen:
class Popen(args, bufsize=0, executable=None,
stdin=None, stdout=None, stderr=None,
preexec_fn=None, close_fds=False,
cwd=None, env=None, universal_newlines=False,
startupinfo=None, creationflags=0):
Arguments are:
- args should be a sequence of program arguments. The program to
execute is normally the first item in the args sequence, but can
be explicitly set by using the executable argument.
- On UNIX: the Popen class uses os.execvp() to execute the child
program, which operates on sequences. If args is a string, it
will be converted to a sequence using the cmdline2list method.
Please note that syntax for quoting arguments is different from
a typical UNIX shell. See the documentation of the cmdline2list
method for more information.
- On Windows: the Popen class uses CreateProcess() to execute the
child program, which operates on strings. If args is a
sequence, it will be converted to a string using the
list2cmdline method. Please note that not all MS Windows
applications interpret the command line the same way: The
list2cmdline is designed for applications using the same rules
as the MS C runtime.
- bufsize, if given, has the same meaning as the corresponding
argument to the built-in open() function: 0 means unbuffered, 1
means line buffered, any other positive value means use a buffer
of (approximately) that size. A negative bufsize means to use
the system default, which usually means fully buffered. The
default value for bufsize is 0 (unbuffered).
- stdin, stdout and stderr specify the executed programs' standard
input, standard output and standard error file handles,
respectively. Valid values are PIPE, an existing file
descriptor (a positive integer), an existing file object, and
None. PIPE indicates that a new pipe to the child should be
created. With None, no redirection will occur; the child's file
handles will be inherited from the parent. Additionally, stderr
can be STDOUT, which indicates that the stderr data from the
applications should be captured into the same file handle as for
stdout.
- If preexec_fn is set to a callable object, this object will be
called in the child process just before the child is executed.
- If close_fds is true, all file descriptors except 0, 1 and 2
will be closed before the child process is executed.
- If cwd is not None, the current directory will be changed to cwd
before the child is executed.
- If env is not None, it defines the environment variables for the
new process.
- If universal_newlines is true, the file objects stdout and
stderr are opened as a text files, but lines may be terminated
by any of '\n', the Unix end-of-line convention, '\r', the
Macintosh convention or '\r\n', the Windows convention. All of
these external representations are seen as '\n' by the Python
program. Note: This feature is only available if Python is
built with universal newline support (the default). Also, the
newlines attribute of the file objects stdout, stdin and stderr
are not updated by the communicate() method.
- The startupinfo and creationflags, if given, will be passed to
the underlying CreateProcess() function. They can specify
things such as appearance of the main window and priority for
the new process. (Windows only)
This module also defines two shortcut functions:
- call(*args, **kwargs):
Run command with arguments. Wait for command to complete, then
return the returncode attribute.
The arguments are the same as for the Popen constructor. Example:
retcode = call(["ls", "-l"])
- callv(*args, **kwargs):
Run command with arguments. Wait for command to complete, then
return the returncode attribute.
This function is identical to call(), except that each non-keyword
argument is treated as a program argument. Example:
retcode = callv("ls", "-l")
This is equivalent to:
retcode = call(["ls", "-l"])
Exceptions
----------
Exceptions raised in the child process, before the new program has
started to execute, will be re-raised in the parent.
Additionally, the exception object will have one extra attribute
called 'child_traceback', which is a string containing traceback
information from the childs point of view.
The most common exception raised is OSError. This occurs, for
example, when trying to execute a non-existent file. Applications
should prepare for OSErrors.
A ValueError will be raised if Popen is called with invalid arguments.
Security
--------
Unlike some other popen functions, this implementation will never call
/bin/sh implicitly. This means that all characters, including shell
metacharacters, can safely be passed to child processes.
Popen objects
-------------
Instances of the Popen class have the following methods:
poll()
Check if child process has terminated. Returns returncode
attribute.
wait()
Wait for child process to terminate. Returns returncode attribute.
communicate(input=None)
Interact with process: Send data to stdin. Read data from
stdout and stderr, until end-of-file is reached. Wait for
process to terminate. The optional stdin argument should be a
string to be sent to the child process, or None, if no data
should be sent to the child.
communicate() returns a tuple (stdout, stderr).
Note: The data read is buffered in memory, so do not use this
method if the data size is large or unlimited.
The following attributes are also available:
stdin
If the stdin argument is PIPE, this attribute is a file object
that provides input to the child process. Otherwise, it is None.
stdout
If the stdout argument is PIPE, this attribute is a file object
that provides output from the child process. Otherwise, it is
None.
stderr
If the stderr argument is PIPE, this attribute is file object that
provides error output from the child process. Otherwise, it is
None.
pid
The process ID of the child process.
returncode
The child return code. A None value indicates that the
process hasn't terminated yet. A negative value -N indicates
that the child was terminated by signal N (UNIX only).
Open Issues
Currently, the reference implementation requires the "win32all"
extensions when running on the Windows platform. This dependency
could probably be eliminated by providing a small "glue" module
written in C, just like the _winreg module.
Reference Implementation
A reference implementation is available from
http://www.lysator.liu.se/~astrand/popen5/.
References
[1] Secure Programming for Linux and Unix HOWTO, section 8.3.
http://www.dwheeler.com/secure-programs/
[2] Python Dialog
http://pythondialog.sourceforge.net/
Copyright
This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End: