686 lines
29 KiB
Plaintext
686 lines
29 KiB
Plaintext
PEP: 209
|
||
Title: Multi-dimensional Arrays
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: barrett@stsci.edu (Paul Barrett), oliphant@ee.byu.edu (Travis Oliphant)
|
||
Status: Withdrawn
|
||
Type: Standards Track
|
||
Created: 03-Jan-2001
|
||
Python-Version: 2.2
|
||
Post-History:
|
||
|
||
|
||
Abstract
|
||
|
||
This PEP proposes a redesign and re-implementation of the multi-
|
||
dimensional array module, Numeric, to make it easier to add new
|
||
features and functionality to the module. Aspects of Numeric 2
|
||
that will receive special attention are efficient access to arrays
|
||
exceeding a gigabyte in size and composed of inhomogeneous data
|
||
structures or records. The proposed design uses four Python
|
||
classes: ArrayType, UFunc, Array, and ArrayView; and a low-level
|
||
C-extension module, _ufunc, to handle the array operations
|
||
efficiently. In addition, each array type has its own C-extension
|
||
module which defines the coercion rules, operations, and methods
|
||
for that type. This design enables new types, features, and
|
||
functionality to be added in a modular fashion. The new version
|
||
will introduce some incompatibilities with the current Numeric.
|
||
|
||
|
||
Motivation
|
||
|
||
Multi-dimensional arrays are commonly used to store and manipulate
|
||
data in science, engineering, and computing. Python currently has
|
||
an extension module, named Numeric (henceforth called Numeric 1),
|
||
which provides a satisfactory set of functionality for users
|
||
manipulating homogeneous arrays of data of moderate size (of order
|
||
10 MB). For access to larger arrays (of order 100 MB or more) of
|
||
possibly inhomogeneous data, the implementation of Numeric 1 is
|
||
inefficient and cumbersome. In the future, requests by the
|
||
Numerical Python community for additional functionality is also
|
||
likely as PEPs 211: Adding New Linear Operators to Python, and
|
||
225: Elementwise/Objectwise Operators illustrate.
|
||
|
||
|
||
Proposal
|
||
|
||
This proposal recommends a re-design and re-implementation of
|
||
Numeric 1, henceforth called Numeric 2, which will enable new
|
||
types, features, and functionality to be added in an easy and
|
||
modular manner. The initial design of Numeric 2 should focus on
|
||
providing a generic framework for manipulating arrays of various
|
||
types and should enable a straightforward mechanism for adding new
|
||
array types and UFuncs. Functional methods that are more specific
|
||
to various disciplines can then be layered on top of this core.
|
||
This new module will still be called Numeric and most of the
|
||
behavior found in Numeric 1 will be preserved.
|
||
|
||
The proposed design uses four Python classes: ArrayType, UFunc,
|
||
Array, and ArrayView; and a low-level C-extension module to handle
|
||
the array operations efficiently. In addition, each array type
|
||
has its own C-extension module which defines the coercion rules,
|
||
operations, and methods for that type. At a later date, when core
|
||
functionality is stable, some Python classes can be converted to
|
||
C-extension types.
|
||
|
||
Some planned features are:
|
||
|
||
1. Improved memory usage
|
||
|
||
This feature is particularly important when handling large arrays
|
||
and can produce significant improvements in performance as well as
|
||
memory usage. We have identified several areas where memory usage
|
||
can be improved:
|
||
|
||
a. Use a local coercion model
|
||
|
||
Instead of using Python's global coercion model which creates
|
||
temporary arrays, Numeric 2, like Numeric 1, will implement a
|
||
local coercion model as described in PEP 208 which defers the
|
||
responsibility of coercion to the operator. By using internal
|
||
buffers, a coercion operation can be done for each array
|
||
(including output arrays), if necessary, at the time of the
|
||
operation. Benchmarks [1] have shown that performance is at
|
||
most degraded only slightly and is improved in cases where the
|
||
internal buffers are less than the L2 cache size and the
|
||
processor is under load. To avoid array coercion altogether,
|
||
C functions having arguments of mixed type are allowed in
|
||
Numeric 2.
|
||
|
||
b. Avoid creation of temporary arrays
|
||
|
||
In complex array expressions (i.e. having more than one
|
||
operation), each operation will create a temporary array which
|
||
will be used and then deleted by the succeeding operation. A
|
||
better approach would be to identify these temporary arrays
|
||
and reuse their data buffers when possible, namely when the
|
||
array shape and type are the same as the temporary array being
|
||
created. This can be done by checking the temporary array's
|
||
reference count. If it is 1, then it will be deleted once the
|
||
operation is done and is a candidate for reuse.
|
||
|
||
c. Optional use of memory-mapped files
|
||
|
||
Numeric users sometimes need to access data from very large
|
||
files or to handle data that is greater than the available
|
||
memory. Memory-mapped arrays provide a mechanism to do this
|
||
by storing the data on disk while making it appear to be in
|
||
memory. Memory- mapped arrays should improve access to all
|
||
files by eliminating one of two copy steps during a file
|
||
access. Numeric should be able to access in-memory and
|
||
memory-mapped arrays transparently.
|
||
|
||
d. Record access
|
||
|
||
In some fields of science, data is stored in files as binary
|
||
records. For example, in astronomy, photon data is stored as a
|
||
1 dimensional list of photons in order of arrival time. These
|
||
records or C-like structures contain information about the
|
||
detected photon, such as its arrival time, its position on the
|
||
detector, and its energy. Each field may be of a different
|
||
type, such as char, int, or float. Such arrays introduce new
|
||
issues that must be dealt with, in particular byte alignment
|
||
or byte swapping may need to be performed for the numeric
|
||
values to be properly accessed (though byte swapping is also
|
||
an issue for memory mapped data). Numeric 2 is designed to
|
||
automatically handle alignment and representational issues
|
||
when data is accessed or operated on. There are two
|
||
approaches to implementing records; as either a derived array
|
||
class or a special array type, depending on your point-of-
|
||
view. We defer this discussion to the Open Issues section.
|
||
|
||
|
||
2. Additional array types
|
||
|
||
Numeric 1 has 11 defined types: char, ubyte, sbyte, short, int,
|
||
long, float, double, cfloat, cdouble, and object. There are no
|
||
ushort, uint, or ulong types, nor are there more complex types
|
||
such as a bit type which is of use to some fields of science and
|
||
possibly for implementing masked-arrays. The design of Numeric 1
|
||
makes the addition of these and other types a difficult and
|
||
error-prone process. To enable the easy addition (and deletion)
|
||
of new array types such as a bit type described below, a re-design
|
||
of Numeric is necessary.
|
||
|
||
a. Bit type
|
||
|
||
The result of a rich comparison between arrays is an array of
|
||
boolean values. The result can be stored in an array of type
|
||
char, but this is an unnecessary waste of memory. A better
|
||
implementation would use a bit or boolean type, compressing
|
||
the array size by a factor of eight. This is currently being
|
||
implemented for Numeric 1 (by Travis Oliphant) and should be
|
||
included in Numeric 2.
|
||
|
||
3. Enhanced array indexing syntax
|
||
|
||
The extended slicing syntax was added to Python to provide greater
|
||
flexibility when manipulating Numeric arrays by allowing
|
||
step-sizes greater than 1. This syntax works well as a shorthand
|
||
for a list of regularly spaced indices. For those situations
|
||
where a list of irregularly spaced indices are needed, an enhanced
|
||
array indexing syntax would allow 1-D arrays to be arguments.
|
||
|
||
4. Rich comparisons
|
||
|
||
The implementation of PEP 207: Rich Comparisons in Python 2.1
|
||
provides additional flexibility when manipulating arrays. We
|
||
intend to implement this feature in Numeric 2.
|
||
|
||
5. Array broadcasting rules
|
||
|
||
When an operation between a scalar and an array is done, the
|
||
implied behavior is to create a new array having the same shape as
|
||
the array operand containing the scalar value. This is called
|
||
array broadcasting. It also works with arrays of lesser rank,
|
||
such as vectors. This implicit behavior is implemented in Numeric
|
||
1 and will also be implemented in Numeric 2.
|
||
|
||
|
||
Design and Implementation
|
||
|
||
The design of Numeric 2 has four primary classes:
|
||
|
||
1. ArrayType:
|
||
|
||
This is a simple class that describes the fundamental properties
|
||
of an array-type, e.g. its name, its size in bytes, its coercion
|
||
relations with respect to other types, etc., e.g.
|
||
|
||
> Int32 = ArrayType('Int32', 4, 'doc-string')
|
||
|
||
Its relation to the other types is defined when the C-extension
|
||
module for that type is imported. The corresponding Python code
|
||
is:
|
||
|
||
> Int32.astype[Real64] = Real64
|
||
|
||
This says that the Real64 array-type has higher priority than the
|
||
Int32 array-type.
|
||
|
||
The following attributes and methods are proposed for the core
|
||
implementation. Additional attributes can be added on an
|
||
individual basis, e.g. .bitsize or .bitstrides for the bit type.
|
||
|
||
Attributes:
|
||
.name: e.g. "Int32", "Float64", etc.
|
||
.typecode: e.g. 'i', 'f', etc.
|
||
(for backward compatibility)
|
||
.size (in bytes): e.g. 4, 8, etc.
|
||
.array_rules (mapping): rules between array types
|
||
.pyobj_rules (mapping): rules between array and python types
|
||
.doc: documentation string
|
||
Methods:
|
||
__init__(): initialization
|
||
__del__(): destruction
|
||
__repr__(): representation
|
||
|
||
C-API:
|
||
This still needs to be fleshed-out.
|
||
|
||
|
||
2. UFunc:
|
||
|
||
This class is the heart of Numeric 2. Its design is similar to
|
||
that of ArrayType in that the UFunc creates a singleton callable
|
||
object whose attributes are name, total and input number of
|
||
arguments, a document string, and an empty CFunc dictionary; e.g.
|
||
|
||
> add = UFunc('add', 3, 2, 'doc-string')
|
||
|
||
When defined the add instance has no C functions associated with
|
||
it and therefore can do no work. The CFunc dictionary is
|
||
populated or registered later when the C-extension module for an
|
||
array-type is imported. The arguments of the register method are:
|
||
function name, function descriptor, and the CUFunc object. The
|
||
corresponding Python code is
|
||
|
||
> add.register('add', (Int32, Int32, Int32), cfunc-add)
|
||
|
||
In the initialization function of an array type module, e.g.
|
||
Int32, there are two C API functions: one to initialize the
|
||
coercion rules and the other to register the CFunc objects.
|
||
|
||
When an operation is applied to some arrays, the __call__ method
|
||
is invoked. It gets the type of each array (if the output array
|
||
is not given, it is created from the coercion rules) and checks
|
||
the CFunc dictionary for a key that matches the argument types.
|
||
If it exists the operation is performed immediately, otherwise the
|
||
coercion rules are used to search for a related operation and set
|
||
of conversion functions. The __call__ method then invokes a
|
||
compute method written in C to iterate over slices of each array,
|
||
namely:
|
||
|
||
> _ufunc.compute(slice, data, func, swap, conv)
|
||
|
||
The 'func' argument is a CFuncObject, while the 'swap' and 'conv'
|
||
arguments are lists of CFuncObjects for those arrays needing pre-
|
||
or post-processing, otherwise None is used. The data argument is
|
||
a list of buffer objects, and the slice argument gives the number
|
||
of iterations for each dimension along with the buffer offset and
|
||
step size for each array and each dimension.
|
||
|
||
We have predefined several UFuncs for use by the __call__ method:
|
||
cast, swap, getobj, and setobj. The cast and swap functions do
|
||
coercion and byte-swapping, respectively and the getobj and setobj
|
||
functions do coercion between Numeric arrays and Python sequences.
|
||
|
||
The following attributes and methods are proposed for the core
|
||
implementation.
|
||
|
||
Attributes:
|
||
.name: e.g. "add", "subtract", etc.
|
||
.nargs: number of total arguments
|
||
.iargs: number of input arguments
|
||
.cfuncs (mapping): the set C functions
|
||
.doc: documentation string
|
||
Methods:
|
||
__init__(): initialization
|
||
__del__(): destruction
|
||
__repr__(): representation
|
||
__call__(): look-up and dispatch method
|
||
initrule(): initialize coercion rule
|
||
uninitrule(): uninitialize coercion rule
|
||
register(): register a CUFunc
|
||
unregister(): unregister a CUFunc
|
||
|
||
C-API:
|
||
This still needs to be fleshed-out.
|
||
|
||
3. Array:
|
||
|
||
This class contains information about the array, such as shape,
|
||
type, endian-ness of the data, etc.. Its operators, '+', '-',
|
||
etc. just invoke the corresponding UFunc function, e.g.
|
||
|
||
> def __add__(self, other):
|
||
> return ufunc.add(self, other)
|
||
|
||
The following attributes, methods, and functions are proposed for
|
||
the core implementation.
|
||
|
||
Attributes:
|
||
.shape: shape of the array
|
||
.format: type of the array
|
||
.real (only complex): real part of a complex array
|
||
.imag (only complex): imaginary part of a complex array
|
||
Methods:
|
||
__init__(): initialization
|
||
__del__(): destruction
|
||
__repr_(): representation
|
||
__str__(): pretty representation
|
||
__cmp__(): rich comparison
|
||
__len__():
|
||
__getitem__():
|
||
__setitem__():
|
||
__getslice__():
|
||
__setslice__():
|
||
numeric methods:
|
||
copy(): copy of array
|
||
aslist(): create list from array
|
||
asstring(): create string from array
|
||
|
||
Functions:
|
||
fromlist(): create array from sequence
|
||
fromstring(): create array from string
|
||
array(): create array with shape and value
|
||
concat(): concatenate two arrays
|
||
resize(): resize array
|
||
|
||
C-API:
|
||
This still needs to be fleshed-out.
|
||
|
||
4. ArrayView
|
||
|
||
This class is similar to the Array class except that the reshape
|
||
and flat methods will raise exceptions, since non-contiguous
|
||
arrays cannot be reshaped or flattened using just pointer and
|
||
step-size information.
|
||
|
||
C-API:
|
||
This still needs to be fleshed-out.
|
||
|
||
5. C-extension modules:
|
||
|
||
Numeric2 will have several C-extension modules.
|
||
|
||
a. _ufunc:
|
||
|
||
The primary module of this set is the _ufuncmodule.c. The
|
||
intention of this module is to do the bare minimum,
|
||
i.e. iterate over arrays using a specified C function. The
|
||
interface of these functions is the same as Numeric 1, i.e.
|
||
|
||
int (*CFunc)(char *data, int *steps, int repeat, void *func);
|
||
|
||
and their functionality is expected to be the same, i.e. they
|
||
iterate over the inner-most dimension.
|
||
|
||
The following attributes and methods are proposed for the core
|
||
implementation.
|
||
|
||
Attributes:
|
||
|
||
Methods:
|
||
compute():
|
||
|
||
C-API:
|
||
This still needs to be fleshed-out.
|
||
|
||
b. _int32, _real64, etc.:
|
||
|
||
There will also be C-extension modules for each array type,
|
||
e.g. _int32module.c, _real64module.c, etc. As mentioned
|
||
previously, when these modules are imported by the UFunc
|
||
module, they will automatically register their functions and
|
||
coercion rules. New or improved versions of these modules can
|
||
be easily implemented and used without affecting the rest of
|
||
Numeric 2.
|
||
|
||
|
||
Open Issues
|
||
|
||
1. Does slicing syntax default to copy or view behavior?
|
||
|
||
The default behavior of Python is to return a copy of a sub-list
|
||
or tuple when slicing syntax is used, whereas Numeric 1 returns a
|
||
view into the array. The choice made for Numeric 1 is apparently
|
||
for reasons of performance: the developers wish to avoid the
|
||
penalty of allocating and copying the data buffer during each
|
||
array operation and feel that the need for a deep copy of an array
|
||
to be rare. Yet, some have argued that Numeric's slice notation
|
||
should also have copy behavior to be consistent with Python lists.
|
||
In this case the performance penalty associated with copy behavior
|
||
can be minimized by implementing copy-on-write. This scheme has
|
||
both arrays sharing one data buffer (as in view behavior) until
|
||
either array is assigned new data at which point a copy of the
|
||
data buffer is made. View behavior would then be implemented by
|
||
an ArrayView class, whose behavior be similar to Numeric 1 arrays,
|
||
i.e. .shape is not settable for non-contiguous arrays. The use of
|
||
an ArrayView class also makes explicit what type of data the array
|
||
contains.
|
||
|
||
2. Does item syntax default to copy or view behavior?
|
||
|
||
A similar question arises with the item syntax. For example, if a
|
||
= [[0,1,2], [3,4,5]] and b = a[0], then changing b[0] also changes
|
||
a[0][0], because a[0] is a reference or view of the first row of
|
||
a. Therefore, if c is a 2-d array, it would appear that c[i]
|
||
should return a 1-d array which is a view into, instead of a copy
|
||
of, c for consistency. Yet, c[i] can be considered just a
|
||
shorthand for c[i,:] which would imply copy behavior assuming
|
||
slicing syntax returns a copy. Should Numeric 2 behave the same
|
||
way as lists and return a view or should it return a copy.
|
||
|
||
3. How is scalar coercion implemented?
|
||
|
||
Python has fewer numeric types than Numeric which can cause
|
||
coercion problems. For example, when multiplying a Python scalar
|
||
of type float and a Numeric array of type float, the Numeric array
|
||
is converted to a double, since the Python float type is actually
|
||
a double. This is often not the desired behavior, since the
|
||
Numeric array will be doubled in size which is likely to be
|
||
annoying, particularly for very large arrays. We prefer that the
|
||
array type trumps the python type for the same type class, namely
|
||
integer, float, and complex. Therefore, an operation between a
|
||
Python integer and an Int16 (short) array will return an Int16
|
||
array. Whereas an operation between a Python float and an Int16
|
||
array would return a Float64 (double) array. Operations between
|
||
two arrays use normal coercion rules.
|
||
|
||
4. How is integer division handled?
|
||
|
||
In a future version of Python, the behavior of integer division
|
||
will change. The operands will be converted to floats, so the
|
||
result will be a float. If we implement the proposed scalar
|
||
coercion rules where arrays have precedence over Python scalars,
|
||
then dividing an array by an integer will return an integer array
|
||
and will not be consistent with a future version of Python which
|
||
would return an array of type double. Scientific programmers are
|
||
familiar with the distinction between integer and float-point
|
||
division, so should Numeric 2 continue with this behavior?
|
||
|
||
5. How should records be implemented?
|
||
|
||
There are two approaches to implementing records depending on your
|
||
point-of-view. The first is two divide arrays into separate
|
||
classes depending on the behavior of their types. For example,
|
||
numeric arrays are one class, strings a second, and records a
|
||
third, because the range and type of operations of each class
|
||
differ. As such, a record array is not a new type, but a
|
||
mechanism for a more flexible form of array. To easily access and
|
||
manipulate such complex data, the class is comprised of numeric
|
||
arrays having different byte offsets into the data buffer. For
|
||
example, one might have a table consisting of an array of Int16,
|
||
Real32 values. Two numeric arrays, one with an offset of 0 bytes
|
||
and a stride of 6 bytes to be interpreted as Int16, and one with an
|
||
offset of 2 bytes and a stride of 6 bytes to be interpreted as
|
||
Real32 would represent the record array. Both numeric arrays
|
||
would refer to the same data buffer, but have different offset and
|
||
stride attributes, and a different numeric type.
|
||
|
||
The second approach is to consider a record as one of many array
|
||
types, albeit with fewer, and possibly different, array operations
|
||
than for numeric arrays. This approach considers an array type to
|
||
be a mapping of a fixed-length string. The mapping can either be
|
||
simple, like integer and floating-point numbers, or complex, like
|
||
a complex number, a byte string, and a C-structure. The record
|
||
type effectively merges the struct and Numeric modules into a
|
||
multi-dimensional struct array. This approach implies certain
|
||
changes to the array interface. For example, the 'typecode'
|
||
keyword argument should probably be changed to the more
|
||
descriptive 'format' keyword.
|
||
|
||
a. How are record semantics defined and implemented?
|
||
|
||
Which ever implementation approach is taken for records, the
|
||
syntax and semantics of how they are to be accessed and
|
||
manipulated must be decided, if one wishes to have access to
|
||
sub-fields of records. In this case, the record type can
|
||
essentially be considered an inhomogeneous list, like a tuple
|
||
returned by the unpack method of the struct module; and a 1-d
|
||
array of records may be interpreted as a 2-d array with the
|
||
second dimension being the index into the list of fields.
|
||
This enhanced array semantics makes access to an array of one
|
||
or more of the fields easy and straightforward. It also
|
||
allows a user to do array operations on a field in a natural
|
||
and intuitive way. If we assume that records are implemented
|
||
as an array type, then last dimension defaults to 0 and can
|
||
therefore be neglected for arrays comprised of simple types,
|
||
like numeric.
|
||
|
||
6. How are masked-arrays implemented?
|
||
|
||
Masked-arrays in Numeric 1 are implemented as a separate array
|
||
class. With the ability to add new array types to Numeric 2, it
|
||
is possible that masked-arrays in Numeric 2 could be implemented
|
||
as a new array type instead of an array class.
|
||
|
||
7. How are numerical errors handled (IEEE floating-point errors in
|
||
particular)?
|
||
|
||
It is not clear to the proposers (Paul Barrett and Travis
|
||
Oliphant) what is the best or preferred way of handling errors.
|
||
Since most of the C functions that do the operation, iterate over
|
||
the inner-most (last) dimension of the array. This dimension
|
||
could contain a thousand or more items having one or more errors
|
||
of differing type, such as divide-by-zero, underflow, and
|
||
overflow. Additionally, keeping track of these errors may come at
|
||
the expense of performance. Therefore, we suggest several
|
||
options:
|
||
|
||
a. Print a message of the most severe error, leaving it to
|
||
the user to locate the errors.
|
||
|
||
b. Print a message of all errors that occurred and the number
|
||
of occurrences, leaving it to the user to locate the errors.
|
||
|
||
c. Print a message of all errors that occurred and a list of
|
||
where they occurred.
|
||
|
||
d. Or use a hybrid approach, printing only the most severe
|
||
error, yet keeping track of what and where the errors
|
||
occurred. This would allow the user to locate the errors
|
||
while keeping the error message brief.
|
||
|
||
8. What features are needed to ease the integration of FORTRAN
|
||
libraries and code?
|
||
|
||
It would be a good idea at this stage to consider how to ease the
|
||
integration of FORTRAN libraries and user code in Numeric 2.
|
||
|
||
|
||
Implementation Steps
|
||
|
||
1. Implement basic UFunc capability
|
||
|
||
a. Minimal Array class:
|
||
|
||
Necessary class attributes and methods, e.g. .shape, .data,
|
||
.type, etc.
|
||
|
||
b. Minimal ArrayType class:
|
||
|
||
Int32, Real64, Complex64, Char, Object
|
||
|
||
c. Minimal UFunc class:
|
||
|
||
UFunc instantiation, CFunction registration, UFunc call for
|
||
1-D arrays including the rules for doing alignment,
|
||
byte-swapping, and coercion.
|
||
|
||
d. Minimal C-extension module:
|
||
|
||
_UFunc, which does the innermost array loop in C.
|
||
|
||
This step implements whatever is needed to do: 'c = add(a, b)'
|
||
where a, b, and c are 1-D arrays. It teaches us how to add
|
||
new UFuncs, to coerce the arrays, to pass the necessary
|
||
information to a C iterator method and to do the actually
|
||
computation.
|
||
|
||
2. Continue enhancing the UFunc iterator and Array class
|
||
|
||
a. Implement some access methods for the Array class:
|
||
print, repr, getitem, setitem, etc.
|
||
|
||
b. Implement multidimensional arrays
|
||
|
||
c. Implement some of basic Array methods using UFuncs:
|
||
+, -, *, /, etc.
|
||
|
||
d. Enable UFuncs to use Python sequences.
|
||
|
||
3. Complete the standard UFunc and Array class behavior
|
||
|
||
a. Implement getslice and setslice behavior
|
||
|
||
b. Work on Array broadcasting rules
|
||
|
||
c. Implement Record type
|
||
|
||
4. Add additional functionality
|
||
|
||
a. Add more UFuncs
|
||
|
||
b. Implement buffer or mmap access
|
||
|
||
|
||
Incompatibilities
|
||
|
||
The following is a list of incompatibilities in behavior between
|
||
Numeric 1 and Numeric 2.
|
||
|
||
1. Scalar coercion rules
|
||
|
||
Numeric 1 has single set of coercion rules for array and Python
|
||
numeric types. This can cause unexpected and annoying problems
|
||
during the calculation of an array expression. Numeric 2 intends
|
||
to overcome these problems by having two sets of coercion rules:
|
||
one for arrays and Python numeric types, and another just for
|
||
arrays.
|
||
|
||
2. No savespace attribute
|
||
|
||
The savespace attribute in Numeric 1 makes arrays with this
|
||
attribute set take precedence over those that do not have it set.
|
||
Numeric 2 will not have such an attribute and therefore normal
|
||
array coercion rules will be in effect.
|
||
|
||
3. Slicing syntax returns a copy
|
||
|
||
The slicing syntax in Numeric 1 returns a view into the original
|
||
array. The slicing behavior for Numeric 2 will be a copy. You
|
||
should use the ArrayView class to get a view into an array.
|
||
|
||
4. Boolean comparisons return a boolean array
|
||
|
||
A comparison between arrays in Numeric 1 results in a Boolean
|
||
scalar, because of current limitations in Python. The advent of
|
||
Rich Comparisons in Python 2.1 will allow an array of Booleans to
|
||
be returned.
|
||
|
||
5. Type characters are deprecated
|
||
|
||
Numeric 2 will have an ArrayType class composed of Type instances,
|
||
for example Int8, Int16, Int32, and Int for signed integers. The
|
||
typecode scheme in Numeric 1 will be available for backward
|
||
compatibility, but will be deprecated.
|
||
|
||
|
||
Appendices
|
||
|
||
A. Implicit sub-arrays iteration
|
||
|
||
A computer animation is composed of a number of 2-D images or
|
||
frames of identical shape. By stacking these images into a single
|
||
block of memory, a 3-D array is created. Yet the operations to be
|
||
performed are not meant for the entire 3-D array, but on the set
|
||
of 2-D sub-arrays. In most array languages, each frame has to be
|
||
extracted, operated on, and then reinserted into the output array
|
||
using a for-like loop. The J language allows the programmer to
|
||
perform such operations implicitly by having a rank for the frame
|
||
and array. By default these ranks will be the same during the
|
||
creation of the array. It was the intention of the Numeric 1
|
||
developers to implement this feature, since it is based on the
|
||
language J. The Numeric 1 code has the required variables for
|
||
implementing this behavior, but was never implemented. We intend
|
||
to implement implicit sub-array iteration in Numeric 2, if the
|
||
array broadcasting rules found in Numeric 1 do not fully support
|
||
this behavior.
|
||
|
||
|
||
Copyright
|
||
|
||
This document is placed in the public domain.
|
||
|
||
|
||
Related PEPs
|
||
|
||
PEP 207: Rich Comparisons
|
||
by Guido van Rossum and David Ascher
|
||
|
||
PEP 208: Reworking the Coercion Model
|
||
by Neil Schemenauer and Marc-Andre' Lemburg
|
||
|
||
PEP 211: Adding New Linear Algebra Operators to Python
|
||
by Greg Wilson
|
||
|
||
PEP 225: Elementwise/Objectwise Operators
|
||
by Huaiyu Zhu
|
||
|
||
PEP 228: Reworking Python's Numeric Model
|
||
by Moshe Zadka
|
||
|
||
|
||
References
|
||
|
||
[1] P. Greenfield 2000. private communication.
|
||
|
||
|
||
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
End:
|