Added optional two-phase commit (TPC) API extension as proposed
by James Henstridge and discussed on the DB-SIG.

Lots of small text edits (XXX->*, prepend all methods with a dot,
more indents, etc.).

Updated note on the Python datetime module objects.
Marc-André Lemburg 2008-03-03 12:37:19 +00:00
parent 989daa17e3
commit 6cc202c7bf
1 changed files with 232 additions and 96 deletions


@ -27,11 +27,12 @@ Introduction
* Implementation Hints for Module Authors
* Optional DB API Extensions
* Optional Error Handling Extensions
* Optional Two-Phase Commit Extensions
* Frequently Asked Questions
* Major Changes from Version 1.0 to Version 2.0
* Open Issues
* Footnotes
* Acknowledgements
* Acknowledgments
Comments and questions about this specification may be directed
to the SIG for Database Interfacing with Python
@ -242,14 +243,14 @@ Connection Objects
Cursor Objects
These objects represent a database cursor, which is used to
manage the context of a fetch operation. Cursors created from
the same connection are not isolated, i.e., any changes
done to the database by a cursor are immediately visible by the
other cursors. Cursors created from different connections can
or can not be isolated, depending on how the transaction support
is implemented (see also the connection's rollback() and commit()
methods.)
These objects represent a database cursor, which is used to manage
the context of a fetch operation. Cursors created from the same
connection are not isolated, i.e., any changes done to the
database by a cursor are immediately visible by the other
cursors. Cursors created from different connections can or can not
be isolated, depending on how the transaction support is
implemented (see also the connection's .rollback() and .commit()
methods).
Cursor Objects should respond to the following methods and
attributes:
@ -257,16 +258,26 @@ Cursor Objects
.description
This read-only attribute is a sequence of 7-item
sequences. Each of these sequences contains information
describing one result column: (name, type_code,
display_size, internal_size, precision, scale,
null_ok). The first two items (name and type_code) are
mandatory, the other five are optional and must be set to
None if meaningfull values are not provided.
sequences.
Each of these sequences contains information describing
one result column:
(name,
type_code,
display_size,
internal_size,
precision,
scale,
null_ok)
The first two items (name and type_code) are mandatory,
the other five are optional and are set to None if no
meaningful values can be provided.
This attribute will be None for operations that
do not return rows or if the cursor has not had an
operation invoked via the executeXXX() method yet.
operation invoked via the .execute*() method yet.
The type_code can be interpreted by comparing it to the
Type Objects specified in the section below.
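As an illustration of the .description layout described above, here
is a minimal sketch using the standard library sqlite3 module. The
table and column names are made up for the example. Note that sqlite3
only fills in the name item and leaves the other six as None, so it
is a loose instance of the rule rather than a reference
implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# No .execute*() call has been made yet, so there is nothing to describe.
before_execute = cur.description

cur.execute("CREATE TABLE t (id INTEGER, name TEXT)")  # hypothetical table
cur.execute("SELECT id, name FROM t")

# One 7-item sequence per result column; item 0 is the column name.
column_names = [col[0] for col in cur.description]
item_counts = [len(col) for col in cur.description]

conn.close()
```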
@ -274,13 +285,13 @@ Cursor Objects
.rowcount
This read-only attribute specifies the number of rows that
the last executeXXX() produced (for DQL statements like
the last .execute*() produced (for DQL statements like
'select') or affected (for DML statements like 'update' or
'insert').
The attribute is -1 in case no executeXXX() has been
The attribute is -1 in case no .execute*() has been
performed on the cursor or the rowcount of the last
operation is not determinable by the interface. [7]
operation cannot be determined by the interface. [7]
Note: Future versions of the DB API specification could
redefine the latter case to have the object return None
@ -300,7 +311,7 @@ Cursor Objects
The procedure may also provide a result set as
output. This must then be made available through the
standard fetchXXX() methods.
standard .fetch*() methods.
.close()
@ -324,7 +335,7 @@ Cursor Objects
but different parameters are bound to it (many times).
For maximum efficiency when reusing an operation, it is
best to use the setinputsizes() method to specify the
best to use the .setinputsizes() method to specify the
parameter types and sizes ahead of time. It is legal for
a parameter to not match the predefined information; the
implementation should compensate, possibly with a loss of
@ -332,7 +343,7 @@ Cursor Objects
The parameters may also be specified as list of tuples to
e.g. insert multiple rows in a single operation, but this
kind of usage is deprecated: executemany() should be used
kind of usage is deprecated: .executemany() should be used
instead.
Return values are not defined.
@ -344,7 +355,7 @@ Cursor Objects
found in the sequence seq_of_parameters.
Modules are free to implement this method using multiple
calls to the execute() method or by using array operations
calls to the .execute() method or by using array operations
to have the database process the sequence as a whole in
one call.
@ -354,7 +365,7 @@ Cursor Objects
an exception when it detects that a result set has been
created by an invocation of the operation.
The same comments as for execute() also apply accordingly
The same comments as for .execute() also apply accordingly
to this method.
Return values are not defined.
@ -366,10 +377,10 @@ Cursor Objects
available. [6]
An Error (or subclass) exception is raised if the previous
call to executeXXX() did not produce any result set or no
call to .execute*() did not produce any result set or no
call was issued yet.
fetchmany([size=cursor.arraysize])
.fetchmany([size=cursor.arraysize])
Fetch the next set of rows of a query result, returning a
sequence of sequences (e.g. a list of tuples). An empty
@ -384,14 +395,14 @@ Cursor Objects
returned.
An Error (or subclass) exception is raised if the previous
call to executeXXX() did not produce any result set or no
call to .execute*() did not produce any result set or no
call was issued yet.
Note there are performance considerations involved with
the size parameter. For optimal performance, it is
usually best to use the arraysize attribute. If the size
parameter is used, then it is best for it to retain the
same value from one fetchmany() call to the next.
same value from one .fetchmany() call to the next.
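The batching behaviour of .fetchmany() can be sketched as follows,
again assuming the standard library sqlite3 module as the database
module. The table name and the batch size of 4 are arbitrary choices
for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE n (v INTEGER)")  # hypothetical table
cur.executemany("INSERT INTO n (v) VALUES (?)", [(i,) for i in range(10)])

# .arraysize is the batch size used when .fetchmany() gets no size argument.
cur.arraysize = 4
cur.execute("SELECT v FROM n ORDER BY v")

batch_sizes = []
while True:
    rows = cur.fetchmany()
    if not rows:  # an empty sequence means no more rows are available
        break
    batch_sizes.append(len(rows))

conn.close()
```

Setting .arraysize once and calling .fetchmany() without an argument
keeps the size constant from one call to the next, as recommended
above.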
.fetchall()
@ -401,7 +412,7 @@ Cursor Objects
performance of this operation.
An Error (or subclass) exception is raised if the previous
call to executeXXX() did not produce any result set or no
call to .execute*() did not produce any result set or no
call was issued yet.
.nextset()
@ -419,23 +430,23 @@ Cursor Objects
result set.
An Error (or subclass) exception is raised if the previous
call to executeXXX() did not produce any result set or no
call to .execute*() did not produce any result set or no
call was issued yet.
.arraysize
This read/write attribute specifies the number of rows to
fetch at a time with fetchmany(). It defaults to 1 meaning
to fetch a single row at a time.
fetch at a time with .fetchmany(). It defaults to 1
meaning to fetch a single row at a time.
Implementations must observe this value with respect to
the fetchmany() method, but are free to interact with the
the .fetchmany() method, but are free to interact with the
database a single row at a time. It may also be used in
the implementation of executemany().
the implementation of .executemany().
.setinputsizes(sizes)
This can be used before a call to executeXXX() to
This can be used before a call to .execute*() to
predefine memory areas for the operation's parameters.
sizes is specified as a sequence -- one item for each
@ -446,7 +457,7 @@ Cursor Objects
area will be reserved for that column (this is useful to
avoid predefined areas for large inputs).
This method would be used before the executeXXX() method
This method would be used before the .execute*() method
is invoked.
Implementations are free to have this method do nothing
@ -460,7 +471,7 @@ Cursor Objects
will set the default size for all large columns in the
cursor.
This method would be used before the executeXXX() method
This method would be used before the .execute*() method
is invoked.
Implementations are free to have this method do nothing
@ -475,7 +486,7 @@ Type Objects and Constructors
database in a particular string format. Similar problems exist
for "Row ID" columns or large binary items (e.g. blobs or RAW
columns). This presents problems for Python since the parameters
to the executeXXX() method are untyped. When the database module
to the .execute*() method are untyped. When the database module
sees a Python string object, it doesn't know if it should be bound
as a simple CHAR column, as a raw BINARY item, or as a DATE.
@ -567,23 +578,11 @@ Type Objects and Constructors
Implementation Hints for Module Authors
* The preferred object types for the date/time objects are those
defined in the mxDateTime package. It provides all necessary
constructors and methods both at Python and C level.
* The preferred object type for Binary objects are the
buffer types available in standard Python starting with
version 1.5.2. Please see the Python documentation for
details. For information about the C interface have a
look at Include/bufferobject.h and
Objects/bufferobject.c in the Python source
distribution.
* Starting with Python 2.3, module authors can also use the object
types defined in the standard datetime module for date/time
processing. However, it should be noted that this does not
expose a C API like mxDateTime does which means that integration
with C based database modules is more difficult.
* Date/time objects can be implemented as Python datetime module
objects (available since Python 2.3, with a C API since 2.4) or
using the mxDateTime package (available for all Python versions
since 1.5.2). They both provide all necessary constructors and
methods at Python and C level.
* Here is a sample implementation of the Unix ticks based
constructors for date/time delegating work to the generic
@ -592,13 +591,21 @@ Implementation Hints for Module Authors
import time
def DateFromTicks(ticks):
return apply(Date,time.localtime(ticks)[:3])
return Date(*time.localtime(ticks)[:3])
def TimeFromTicks(ticks):
return apply(Time,time.localtime(ticks)[3:6])
return Time(*time.localtime(ticks)[3:6])
def TimestampFromTicks(ticks):
return apply(Timestamp,time.localtime(ticks)[:6])
return Timestamp(*time.localtime(ticks)[:6])
* The preferred object type for Binary objects are the
buffer types available in standard Python starting with
version 1.5.2. Please see the Python documentation for
details. For information about the C interface have a
look at Include/bufferobject.h and
Objects/bufferobject.c in the Python source
distribution.
* This Python class allows implementing the above type
objects even though the description type code field yields
@ -675,17 +682,18 @@ Optional DB API Extensions
It has been proposed to make usage of these extensions optionally
visible to the programmer by issuing Python warnings through the
Python warning framework. To make this feature useful, the warning
messages must be standardized in order to be able to mask them. These
standard messages are referred to below as "Warning Message".
messages must be standardized in order to be able to mask
them. These standard messages are referred to below as "Warning
Message".
Cursor Attribute .rownumber
This read-only attribute should provide the current 0-based
index of the cursor in the result set or None if the index cannot
be determined.
index of the cursor in the result set or None if the index
cannot be determined.
The index can be seen as index of the cursor in a sequence (the
result set). The next fetch operation will fetch the row
The index can be seen as index of the cursor in a sequence
(the result set). The next fetch operation will fetch the row
indexed by .rownumber in that sequence.
Warning Message: "DB-API extension cursor.rownumber used"
@ -740,7 +748,7 @@ Optional DB API Extensions
this cursor.
The list is cleared by all standard cursor methods calls (prior
to executing the call) except for the .fetchXXX() calls
to executing the call) except for the .fetch*() calls
automatically to avoid excessive memory usage and can also be
cleared by executing "del cursor.messages[:]".
@ -779,7 +787,8 @@ Optional DB API Extensions
Cursor Method .__iter__()
Return self to make cursors compatible to the iteration protocol.
Return self to make cursors compatible to the iteration
protocol [8].
Warning Message: "DB-API extension cursor.__iter__() used"
@ -812,31 +821,151 @@ Optional Error Handling Extensions
Cursor/Connection Attribute .errorhandler
Read/write attribute which references an error handler to call
in case an error condition is met.
Read/write attribute which references an error handler to call
in case an error condition is met.
The handler must be a Python callable taking the following
arguments: errorhandler(connection, cursor, errorclass,
errorvalue) where connection is a reference to the connection
on which the cursor operates, cursor a reference to the cursor
(or None in case the error does not apply to a cursor),
errorclass is an error class which to instantiate using
errorvalue as construction argument.
The handler must be a Python callable taking the following
arguments:
The standard error handler should add the error information to
the appropriate .messages attribute (connection.messages or
cursor.messages) and raise the exception defined by the given
errorclass and errorvalue parameters.
errorhandler(connection, cursor, errorclass, errorvalue)
If no errorhandler is set (the attribute is None), the standard
error handling scheme as outlined above, should be applied.
where connection is a reference to the connection on which the
cursor operates, cursor a reference to the cursor (or None in
case the error does not apply to a cursor), errorclass is an
error class which to instantiate using errorvalue as
construction argument.
Warning Message: "DB-API extension .errorhandler used"
The standard error handler should add the error information to
the appropriate .messages attribute (connection.messages or
cursor.messages) and raise the exception defined by the given
errorclass and errorvalue parameters.
If no errorhandler is set (the attribute is None), the
standard error handling scheme as outlined above, should be
applied.
Warning Message: "DB-API extension .errorhandler used"
Cursors should inherit the .errorhandler setting from their
connection objects at cursor creation time.
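The following sketch shows one way the standard error handling
scheme could be written as a callable with the signature given
above. DemoError, DemoConnection and report_error are hypothetical
names introduced only for this example; a real module would use its
own Error classes and connection type.

```python
class DemoError(Exception):
    """Hypothetical stand-in for a database module's Error class."""


class DemoConnection:
    """Hypothetical object exposing only .messages and .errorhandler."""

    def __init__(self):
        self.messages = []
        self.errorhandler = None


def standard_errorhandler(connection, cursor, errorclass, errorvalue):
    # Record the error on the cursor if one is involved, else on the connection.
    target = cursor if cursor is not None else connection
    target.messages.append((errorclass, errorvalue))
    raise errorclass(errorvalue)


def report_error(connection, cursor, errorclass, errorvalue):
    # An .errorhandler of None means: apply the standard scheme.
    source = cursor if cursor is not None else connection
    handler = source.errorhandler or standard_errorhandler
    handler(connection, cursor, errorclass, errorvalue)


conn = DemoConnection()
try:
    report_error(conn, None, DemoError, "table missing")
except DemoError as exc:
    caught = str(exc)
```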
Optional Two-Phase Commit Extensions
Many databases have support for two-phase commit (TPC) which
allows managing transactions across multiple database connections
and other resources.
If a database backend provides support for two-phase commit and
the database module author wishes to expose this support, the
following API should be implemented. NotSupportedError should be
raised if the database backend's support for two-phase commit
can only be checked at run-time.
TPC Transaction IDs
As many databases follow the XA specification, transaction IDs
are formed from three components:
* a format ID
* a global transaction ID
* a branch qualifier
For a particular global transaction, the first two components
should be the same for all resources. Each resource in the
global transaction should be assigned a different branch
qualifier.
The various components must satisfy the following criteria:
* format ID: a non-negative 32-bit integer.
* global transaction ID and branch qualifier: byte strings no
longer than 64 characters.
Transaction IDs are created with the .xid() connection method:
.xid(format_id, global_transaction_id, branch_qualifier)
Returns a transaction ID object suitable for passing to the
.tpc_*() methods of this connection.
If the database connection does not support TPC, a
NotSupportedError is raised.
The type of the object returned by .xid() is not defined, but
it must provide sequence behaviour, allowing access to the
three components. A conforming database module could choose
to represent transaction IDs with tuples rather than a custom
object.
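A transaction ID object with sequence behaviour could look like the
sketch below. The names Xid and make_xid are hypothetical, and
reading "non-negative 32-bit integer" as a signed 32-bit range is an
assumption of this example, not something the text above settles.

```python
from collections import namedtuple


class Xid(namedtuple("Xid", "format_id global_transaction_id branch_qualifier")):
    """Hypothetical transaction ID type; a plain tuple would also conform."""

    __slots__ = ()


def make_xid(format_id, global_transaction_id, branch_qualifier):
    # Assumption: the format ID must fit a signed 32-bit integer.
    if not isinstance(format_id, int) or not 0 <= format_id <= 2**31 - 1:
        raise ValueError("format_id must be a non-negative 32-bit integer")
    for part in (global_transaction_id, branch_qualifier):
        if not isinstance(part, bytes) or len(part) > 64:
            raise ValueError("ID components must be byte strings of at most 64 bytes")
    return Xid(format_id, global_transaction_id, branch_qualifier)


xid = make_xid(1, b"order-42", b"db-a")

# Sequence behaviour: the three components can be unpacked or indexed.
format_id, gtrid, bqual = xid
```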
TPC Connection Methods
.tpc_begin(xid)
Begins a TPC transaction with the given transaction ID xid.
This method should be called outside of a transaction
(i.e. nothing may have executed since the last .commit() or
.rollback()).
Furthermore, it is an error to call .commit() or .rollback()
within the TPC transaction. A ProgrammingError is raised, if
the application calls .commit() or .rollback() during an
active TPC transaction.
If the database connection does not support TPC, a
NotSupportedError is raised.
.tpc_prepare()
Performs the first phase of a transaction started with
.tpc_begin(). A ProgrammingError should be raised if this
method is called outside of a TPC transaction.
After calling .tpc_prepare(), no statements can be executed
until .tpc_commit() or .tpc_rollback() has been called.
.tpc_commit([xid])
When called with no arguments, .tpc_commit() commits a TPC
transaction previously prepared with .tpc_prepare().
If .tpc_commit() is called prior to .tpc_prepare(), a single
phase commit is performed. A transaction manager may choose
to do this if only a single resource is participating in the
global transaction.
When called with a transaction ID xid, the database commits
the given transaction. If an invalid transaction ID is
provided, a ProgrammingError will be raised. This form should
be called outside of a transaction, and is intended for use in
recovery.
On return, the TPC transaction is ended.
.tpc_rollback([xid])
When called with no arguments, .tpc_rollback() rolls back a
TPC transaction. It may be called before or after
.tpc_prepare().
When called with a transaction ID xid, it rolls back the given
transaction. If an invalid transaction ID is provided, a
ProgrammingError is raised. This form should be called
outside of a transaction, and is intended for use in recovery.
On return, the TPC transaction is ended.
.tpc_recover()
Returns a list of pending transaction IDs suitable for use
with .tpc_commit(xid) or .tpc_rollback(xid).
If the database does not support transaction recovery, it may
return an empty list or raise NotSupportedError.
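To show how a transaction manager might drive the methods above,
here is a rough sketch. StubConnection is a hypothetical in-memory
stand-in that only records which TPC methods were called, and
run_global_transaction is likewise an invented helper. A real
coordinator would also need durable logging and recovery through
.tpc_recover(), which this example leaves out.

```python
class StubConnection:
    """Hypothetical stand-in that records TPC calls instead of using a database."""

    def __init__(self, name, fail_prepare=False):
        self.name = name
        self.fail_prepare = fail_prepare
        self.calls = []

    def xid(self, format_id, global_transaction_id, branch_qualifier):
        return (format_id, global_transaction_id, branch_qualifier)

    def tpc_begin(self, xid):
        self.calls.append("begin")

    def tpc_prepare(self):
        if self.fail_prepare:
            raise RuntimeError("prepare failed on " + self.name)
        self.calls.append("prepare")

    def tpc_commit(self, xid=None):
        self.calls.append("commit")

    def tpc_rollback(self, xid=None):
        self.calls.append("rollback")


def run_global_transaction(connections, gtrid, work):
    # Same format ID and global transaction ID for every resource,
    # but a different branch qualifier for each one.
    for index, conn in enumerate(connections):
        bqual = b"branch-" + str(index).encode()
        conn.tpc_begin(conn.xid(1, gtrid, bqual))
    try:
        for conn in connections:
            work(conn)
        for conn in connections:  # first phase
            conn.tpc_prepare()
    except Exception:
        for conn in connections:  # allowed before or after .tpc_prepare()
            conn.tpc_rollback()
        return False
    for conn in connections:  # second phase
        conn.tpc_commit()
    return True


ok_a, ok_b = StubConnection("a"), StubConnection("b")
committed = run_global_transaction([ok_a, ok_b], b"gtx-1", lambda conn: None)

good, bad = StubConnection("good"), StubConnection("bad", fail_prepare=True)
aborted = run_global_transaction([good, bad], b"gtx-2", lambda conn: None)
```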
Frequently Asked Questions
The database SIG often sees reoccurring questions about the DB API
@ -846,7 +975,7 @@ Frequently Asked Questions
Question:
How can I construct a dictionary out of the tuples returned by
.fetchxxx():
.fetch*():
Answer:
@ -856,7 +985,7 @@ Frequently Asked Questions
as basis for the keys in the row dictionary.
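A minimal sketch of that approach, assuming the standard library
sqlite3 module and a made-up person table, uses the first item of
each .description entry as the dictionary key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE person (id INTEGER, name TEXT)")  # hypothetical table
cur.execute("INSERT INTO person VALUES (1, 'guido')")
cur.execute("SELECT id, name FROM person")

# Item 0 of each .description entry is the column name.
names = [col[0] for col in cur.description]
rows_as_dicts = [dict(zip(names, row)) for row in cur.fetchall()]

conn.close()
```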
Note that the reason for not extending the DB API specification
to also support dictionary return values for the .fetchxxx()
to also support dictionary return values for the .fetch*()
methods is that this approach has several drawbacks:
* Some databases don't support case-sensitive column names or
@ -891,8 +1020,8 @@ Major Changes from Version 1.0 to Version 2.0
found in modern SQL databases.
* New constants (apilevel, threadlevel, paramstyle) and
methods (executemany, nextset) were added to provide better
database bindings.
methods (.executemany(), .nextset()) were added to provide
better database bindings.
* The semantics of .callproc() needed to call stored
procedures are now clearly defined.
@ -927,8 +1056,8 @@ Open Issues
* Define a useful return value for .nextset() for the case where
a new result set is available.
* Create a fixed point numeric type for use as loss-less
monetary and decimal interchange format.
* Integrate the decimal module Decimal object for use as
loss-less monetary and decimal interchange format.
Footnotes
@ -937,15 +1066,15 @@ Footnotes
implemented as keyword parameters for more intuitive use and
follow this order of parameters:
dsn Data source name as string
user User name as string (optional)
password Password as string (optional)
host Hostname (optional)
database Database name (optional)
dsn Data source name as string
user User name as string (optional)
password Password as string (optional)
host Hostname (optional)
database Database name (optional)
E.g. a connect could look like this:
connect(dsn='myhost:MYDB',user='guido',password='234$')
connect(dsn='myhost:MYDB',user='guido',password='234$')
[2] Module implementors should prefer 'numeric', 'named' or
'pyformat' over the other formats because these offer more
@ -970,7 +1099,7 @@ Footnotes
[4] a database interface may choose to support named cursors by
allowing a string argument to the method. This feature is
not part of the specification, since it complicates
semantics of the .fetchXXX() methods.
semantics of the .fetch*() methods.
[5] The module will use the __getitem__ method of the parameters
object to map either positions (integers) or names (strings)
@ -992,7 +1121,11 @@ Footnotes
[7] The rowcount attribute may be coded in a way that updates
its value dynamically. This can be useful for databases that
return usable rowcount values only after the first call to
a .fetchXXX() method.
a .fetch*() method.
[8] Implementation Note: Python C extensions will have to
implement the tp_iter slot on the cursor object instead of the
.__iter__() method.
Acknowledgements
@ -1000,6 +1133,9 @@ Acknowledgements
Database API Specification 2.0 from the original HTML format into
the PEP format.
Many thanks to James Henstridge for leading the discussion which
led to the standardization of the two-phase commit API extensions.
Copyright
This document has been placed in the Public Domain.