reSTify PEP 209 (#406)

Huang Huang 2017-09-13 07:22:41 +08:00 committed by Guido van Rossum
parent 58037ff300
commit ca41d18043
1 changed files with 193 additions and 165 deletions

@ -5,12 +5,14 @@ Last-Modified: $Date$
Author: barrett@stsci.edu (Paul Barrett), oliphant@ee.byu.edu (Travis Oliphant)
Status: Withdrawn
Type: Standards Track
Content-Type: text/x-rst
Created: 03-Jan-2001
Python-Version: 2.2
Post-History:
Abstract
========
This PEP proposes a redesign and re-implementation of the multi-
dimensional array module, Numeric, to make it easier to add new
@ -28,6 +30,7 @@ Abstract
Motivation
==========
Multi-dimensional arrays are commonly used to store and manipulate
data in science, engineering, and computing. Python currently has
@ -43,6 +46,7 @@ Motivation
Proposal
========
This proposal recommends a re-design and re-implementation of
Numeric 1, henceforth called Numeric 2, which will enable new
@ -80,7 +84,7 @@ Proposal
responsibility of coercion to the operator. By using internal
buffers, a coercion operation can be done for each array
(including output arrays), if necessary, at the time of the
operation. Benchmarks [1] have shown that performance is at
operation. Benchmarks [1]_ have shown that performance is at
most degraded only slightly and is improved in cases where the
internal buffers are less than the L2 cache size and the
processor is under load. To avoid array coercion altogether,
@ -178,6 +182,7 @@ Proposal
Design and Implementation
=========================
The design of Numeric 2 has four primary classes:
@ -187,13 +192,15 @@ Design and Implementation
of an array-type, e.g. its name, its size in bytes, its coercion
relations with respect to other types, etc., e.g.
> Int32 = ArrayType('Int32', 4, 'doc-string')
::
Int32 = ArrayType('Int32', 4, 'doc-string')
Its relation to the other types is defined when the C-extension
module for that type is imported. The corresponding Python code
is:
is::
> Int32.astype[Real64] = Real64
Int32.astype[Real64] = Real64
This says that the Real64 array-type has higher priority than the
Int32 array-type.
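The coercion relation described above can be sketched in pure Python. This is only an illustration: the ``ArrayType`` class body, the ``coerce`` helper, and its lookup rule are assumptions made for the example, not the C-level design the PEP leaves unspecified.

```python
class ArrayType:
    """Hypothetical stand-in for the proposed ArrayType class."""

    def __init__(self, name, size, doc=""):
        self.name = name
        self.size = size      # size of one element in bytes
        self.doc = doc
        self.astype = {}      # coercion relations: other type -> result type

    def __repr__(self):
        return f"ArrayType({self.name!r})"


def coerce(a, b):
    """Return the common type of a and b (assumed lookup rule)."""
    if a is b:
        return a
    if b in a.astype:
        return a.astype[b]
    if a in b.astype:
        return b.astype[a]
    raise TypeError(f"no coercion rule between {a.name} and {b.name}")


Int32 = ArrayType('Int32', 4, 'doc-string')
Real64 = ArrayType('Real64', 8, 'doc-string')

# Real64 has higher priority than Int32, as in the text.
Int32.astype[Real64] = Real64

common = coerce(Int32, Real64)
```

Storing the rule on only one of the two types is enough here because ``coerce`` checks both directions.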
@ -202,7 +209,8 @@ Design and Implementation
implementation. Additional attributes can be added on an
individual basis, e.g. .bitsize or .bitstrides for the bit type.
Attributes:
Attributes::
.name: e.g. "Int32", "Float64", etc.
.typecode: e.g. 'i', 'f', etc.
(for backward compatibility)
@ -210,13 +218,14 @@ Design and Implementation
.array_rules (mapping): rules between array types
.pyobj_rules (mapping): rules between array and python types
.doc: documentation string
Methods:
Methods::
__init__(): initialization
__del__(): destruction
__repr__(): representation
C-API:
This still needs to be fleshed-out.
C-API: This still needs to be fleshed-out.
2. UFunc:
@ -226,7 +235,9 @@ Design and Implementation
object whose attributes are name, total and input number of
arguments, a document string, and an empty CFunc dictionary; e.g.
> add = UFunc('add', 3, 2, 'doc-string')
::
add = UFunc('add', 3, 2, 'doc-string')
When defined the add instance has no C functions associated with
it and therefore can do no work. The CFunc dictionary is
@ -235,23 +246,25 @@ Design and Implementation
function name, function descriptor, and the CUFunc object. The
corresponding Python code is
> add.register('add', (Int32, Int32, Int32), cfunc-add)
::
add.register('add', (Int32, Int32, Int32), cfunc-add)
In the initialization function of an array type module, e.g.
Int32, there are two C API functions: one to initialize the
coercion rules and the other to register the CFunc objects.
When an operation is applied to some arrays, the __call__ method
When an operation is applied to some arrays, the ``__call__`` method
is invoked. It gets the type of each array (if the output array
is not given, it is created from the coercion rules) and checks
the CFunc dictionary for a key that matches the argument types.
If it exists the operation is performed immediately, otherwise the
coercion rules are used to search for a related operation and set
of conversion functions. The __call__ method then invokes a
of conversion functions. The ``__call__`` method then invokes a
compute method written in C to iterate over slices of each array,
namely:
namely::
> _ufunc.compute(slice, data, func, swap, conv)
_ufunc.compute(slice, data, func, swap, conv)
The 'func' argument is a CFuncObject, while the 'swap' and 'conv'
arguments are lists of CFuncObjects for those arrays needing pre-
@ -260,7 +273,7 @@ Design and Implementation
of iterations for each dimension along with the buffer offset and
step size for each array and each dimension.
We have predefined several UFuncs for use by the __call__ method:
We have predefined several UFuncs for use by the ``__call__`` method:
cast, swap, getobj, and setobj. The cast and swap functions do
coercion and byte-swapping, respectively, and the getobj and setobj
functions do coercion between Numeric arrays and Python sequences.
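The dispatch performed by ``__call__`` can be approximated in pure Python. The sketch below is hypothetical: arrays are modelled as ``(type_name, values)`` pairs, and the ``RANK`` and ``CAST`` tables stand in for the coercion rules and cast functions, none of which are specified in the text.

```python
import operator

# Assumed coercion tables for the example only.
RANK = {'Int32': 0, 'Real64': 1}        # higher value = higher priority
CAST = {'Int32': int, 'Real64': float}  # per-element conversion functions


class UFunc:
    """Hypothetical pure-Python stand-in for the proposed UFunc class."""

    def __init__(self, name, nargs, iargs, doc=""):
        self.name = name
        self.nargs = nargs    # total number of arguments (inputs + outputs)
        self.iargs = iargs    # number of input arguments
        self.doc = doc
        self.cfuncs = {}      # type signature -> callable

    def register(self, name, signature, cfunc):
        # `name` is accepted only to mirror the call shown in the text.
        self.cfuncs[signature] = cfunc

    def __call__(self, *args):
        in_types = tuple(type_name for type_name, _ in args)
        columns = [values for _, values in args]
        # 1. Use a registered function whose input types match exactly.
        for signature, cfunc in self.cfuncs.items():
            if signature[:self.iargs] == in_types:
                return signature[-1], [cfunc(*row) for row in zip(*columns)]
        # 2. Otherwise coerce all inputs to the highest-priority type.
        target = max(in_types, key=RANK.__getitem__)
        signature = (target,) * self.nargs
        if signature not in self.cfuncs:
            raise TypeError(f"{self.name}: no function for {in_types}")
        convert = CAST[target]
        columns = [[convert(v) for v in column] for column in columns]
        cfunc = self.cfuncs[signature]
        return target, [cfunc(*row) for row in zip(*columns)]


add = UFunc('add', 3, 2, 'doc-string')
add.register('add', ('Int32', 'Int32', 'Int32'), operator.add)
add.register('add', ('Real64', 'Real64', 'Real64'), operator.add)

exact = add(('Int32', [1, 2]), ('Int32', [3, 4]))
mixed = add(('Int32', [1, 2]), ('Real64', [0.5, 0.5]))
```

The real design iterates over array slices in C through ``_ufunc.compute``; the list comprehensions here only mimic that loop.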
@ -268,13 +281,16 @@ Design and Implementation
The following attributes and methods are proposed for the core
implementation.
Attributes:
Attributes::
.name: e.g. "add", "subtract", etc.
.nargs: number of total arguments
.iargs: number of input arguments
.cfuncs (mapping): the set of C functions
.doc: documentation string
Methods:
Methods::
__init__(): initialization
__del__(): destruction
__repr__(): representation
@ -284,8 +300,7 @@ Design and Implementation
register(): register a CUFunc
unregister(): unregister a CUFunc
C-API:
This still needs to be fleshed-out.
C-API: This still needs to be fleshed-out.
3. Array:
@ -293,18 +308,23 @@ Design and Implementation
type, endian-ness of the data, etc. Its operators, '+', '-',
etc. just invoke the corresponding UFunc function, e.g.
> def __add__(self, other):
> return ufunc.add(self, other)
::
def __add__(self, other):
return ufunc.add(self, other)
The following attributes, methods, and functions are proposed for
the core implementation.
Attributes:
Attributes::
.shape: shape of the array
.format: type of the array
.real (only complex): real part of a complex array
.imag (only complex): imaginary part of a complex array
Methods:
Methods::
__init__(): initialization
__del__(): destruction
__repr__(): representation
@ -320,15 +340,15 @@ Design and Implementation
aslist(): create list from array
asstring(): create string from array
Functions:
Functions::
fromlist(): create array from sequence
fromstring(): create array from string
array(): create array with shape and value
concat(): concatenate two arrays
resize(): resize array
C-API:
This still needs to be fleshed-out.
C-API: This still needs to be fleshed-out.
4. ArrayView
@ -337,8 +357,7 @@ Design and Implementation
arrays cannot be reshaped or flattened using just pointer and
step-size information.
C-API:
This still needs to be fleshed-out.
C-API: This still needs to be fleshed-out.
5. C-extension modules:
@ -351,6 +370,8 @@ Design and Implementation
i.e. iterate over arrays using a specified C function. The
interface of these functions is the same as Numeric 1, i.e.
::
int (*CFunc)(char *data, int *steps, int repeat, void *func);
and their functionality is expected to be the same, i.e. they
@ -361,11 +382,11 @@ Design and Implementation
Attributes:
Methods:
Methods::
compute():
C-API:
This still needs to be fleshed-out.
C-API: This still needs to be fleshed-out.
b. _int32, _real64, etc.:
@ -379,6 +400,7 @@ Design and Implementation
Open Issues
===========
1. Does slicing syntax default to copy or view behavior?
@ -402,13 +424,13 @@ Open Issues
2. Does item syntax default to copy or view behavior?
A similar question arises with the item syntax. For example, if a
= [[0,1,2], [3,4,5]] and b = a[0], then changing b[0] also changes
a[0][0], because a[0] is a reference or view of the first row of
a. Therefore, if c is a 2-d array, it would appear that c[i]
A similar question arises with the item syntax. For example, if
``a = [[0,1,2], [3,4,5]]`` and ``b = a[0]``, then changing ``b[0]`` also changes
``a[0][0]``, because ``a[0]`` is a reference or view of the first row of a.
Therefore, if c is a 2-d array, it would appear that ``c[i]``
should return a 1-d array which is a view into, instead of a copy
of, c for consistency. Yet, c[i] can be considered just a
shorthand for c[i,:] which would imply copy behavior assuming
of, c for consistency. Yet, ``c[i]`` can be considered just a
shorthand for ``c[i,:]`` which would imply copy behavior assuming
slicing syntax returns a copy. Should Numeric 2 behave the same
way as lists and return a view, or should it return a copy?
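The list behavior that motivates this question can be checked directly in Python. The snippet uses only built-in lists, so it says nothing about what Numeric 2 would do; the variable names ``view_result`` and ``copy_result`` are introduced for the example.

```python
a = [[0, 1, 2], [3, 4, 5]]

# Item syntax: b is a reference to the first row, not a copy of it.
b = a[0]
b[0] = 9
view_result = a[0][0]   # the change made through b is visible in a

# Slice syntax on a list: c is a new list holding the same elements.
c = a[1][:]
c[0] = 7
copy_result = a[1][0]   # a is unaffected by the change made through c
```

For built-in lists, then, item access on a nested list behaves like a view while slicing behaves like a (shallow) copy, which is the inconsistency the open issue asks Numeric 2 to resolve one way or the other.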
@ -531,6 +553,7 @@ Open Issues
Implementation Steps
====================
1. Implement basic UFunc capability
@ -567,7 +590,7 @@ Implementation Steps
b. Implement multidimensional arrays
c. Implement some of basic Array methods using UFuncs:
+, -, *, /, etc.
+, -, \*, /, etc.
d. Enable UFuncs to use Python sequences.
@ -587,6 +610,7 @@ Implementation Steps
Incompatibilities
=================
The following is a list of incompatibilities in behavior between
Numeric 1 and Numeric 2.
@ -629,6 +653,7 @@ Incompatibilities
Appendices
==========
A. Implicit sub-arrays iteration
@ -651,34 +676,37 @@ Appendices
Copyright
=========
This document is placed in the public domain.
Related PEPs
============
PEP 207: Rich Comparisons
* PEP 207: Rich Comparisons
by Guido van Rossum and David Ascher
PEP 208: Reworking the Coercion Model
* PEP 208: Reworking the Coercion Model
by Neil Schemenauer and Marc-Andre' Lemburg
PEP 211: Adding New Linear Algebra Operators to Python
* PEP 211: Adding New Linear Algebra Operators to Python
by Greg Wilson
PEP 225: Elementwise/Objectwise Operators
* PEP 225: Elementwise/Objectwise Operators
by Huaiyu Zhu
PEP 228: Reworking Python's Numeric Model
* PEP 228: Reworking Python's Numeric Model
by Moshe Zadka
References
==========
[1] P. Greenfield 2000. private communication.
.. [1] P. Greenfield 2000. private communication.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil