Trying to bring the PEP up to date with discussions on the mailing list. I
hope I have not misinterpreted the conclusions.

* The dialect argument is now either a string identifying one of the
  internally defined parameter sets or an object whose attributes
  correspond to the parameter set.
* Altered set_dialect() to take a dialect name and a dialect object.
* Altered get_dialect() to take a dialect name and return a dialect object.
* Fleshed out formatting parameters, adding escapechar, lineterminator,
  and quoting.
parent 59fc3f4bc8
commit e1e46e817b
47
pep-0305.txt
@@ -105,19 +105,42 @@ Dialects
 --------
 
 Readers and writers support a dialect argument which is just a
-convenient (string) handle on a group of lower level parameters.
+convenient handle on a group of lower level parameters.
+
+When dialect is a string it identifies one of the dialect which is
+known to the module, otherwise it is processed as a dialect class as
+described below.
 
 Dialects will generally be named after applications or organizations
 which define specific sets of format constraints. The initial dialect
-is "excel2000", which describes the format constraints of Excel 2000's
+is excel2000, which describes the format constraints of Excel 2000's
 CSV format. Another possible dialect (used here only as an example)
 might be "gnumeric".
 
+Dialects are implemented as attribute only classes to enable user to
+construct variant dialects by subclassing. The excel2000 dialect is
+implemented as follows::
+
+    class excel2000:
+        quotechar = '"'
+        delimiter = ','
+        escapechar = None
+        skipinitialspace = False
+        lineterminator = '\r\n'
+        quoting = 'minimal'
+
+An excel tab separated dialect can then be defined in user code as
+follows::
+
+    class exceltsv(csv.excel2000):
+        delimiter = '\t'
+
 Two functions are defined in the API to set and retrieve dialects::
 
-    set_dialect(dialect, pdict)
-    pdict = get_dialect(dialect)
+    set_dialect(name, dialect)
+    dialect = get_dialect(name)
 
-The pdict parameter is a dictionary whose keys are the names the
+The dialect parameter is a class or instance whose attributes are the
 formatting parameters defined in the next section.
 
 
@@ -135,6 +158,9 @@ for the set_dialect() and get_dialect() module functions.
 - delimiter specifies a one-character string to use as the field
   separator. It defaults to ','.
 
+- escapechar specifies a one character string used to escape the
+  delimiter when quotechar is set to None.
+
 - skipinitialspace specifies how to interpret whitespace which
   immediately follows a delimiter. It defaults to False, which means
   that whitespace immediate following a delimiter is part of the
@@ -143,6 +169,17 @@ for the set_dialect() and get_dialect() module functions.
 - lineterminator specifies the character sequence which should
   terminate rows.
 
+- quoting controls when quotes should be generated by the
+  writer.
+
+    "minimal" means only when required, for example, when a field
+    contains either the quotechar or the delimiter
+
+    "always" means that quotes are always placed around fields.
+
+    "nonnumeric" means that quotes are always placed around fields
+    which contain characters other than [+-0-9.].
+
 ... XXX More to come XXX ...
 
 When processing a dialect setting and one or more of the other
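
For readers comparing this draft with the csv module that later shipped in
the Python standard library, the sketch below exercises the same ideas with
the shipped names. Treat the mapping as an assumption, not part of this
commit: the draft's set_dialect(name, dialect) roughly corresponds to
csv.register_dialect(), the excel2000 class to csv.excel, and the string
quoting values to the csv.QUOTE_* constants. The class name ExcelTSV and the
helper write_rows() are made up for illustration.

```python
import csv
import io


# Illustrative stand-in for the draft's "exceltsv": subclass the built-in
# Excel dialect and override only the delimiter.
class ExcelTSV(csv.excel):
    delimiter = "\t"


# Shipped counterpart of the draft's set_dialect(name, dialect).
csv.register_dialect("exceltsv", ExcelTSV)


def write_rows(rows, **fmt):
    """Write rows to an in-memory buffer and return the resulting text."""
    buf = io.StringIO(newline="")
    csv.writer(buf, **fmt).writerows(rows)
    return buf.getvalue()


rows = [["id", "note"], [1, 'say "hi", ok']]

# Select a dialect by its registered name (a string handle).
tsv_text = write_rows(rows, dialect="exceltsv")

# Override a single formatting parameter on top of the default dialect.
minimal_text = write_rows(rows, quoting=csv.QUOTE_MINIMAL)
all_text = write_rows(rows, quoting=csv.QUOTE_ALL)
nonnumeric_text = write_rows(rows, quoting=csv.QUOTE_NONNUMERIC)

for label, text in [
    ("exceltsv", tsv_text),
    ("minimal", minimal_text),
    ("always", all_text),
    ("nonnumeric", nonnumeric_text),
]:
    print(label, repr(text))
```

Passing a registered name selects a whole parameter set at once, while a
keyword such as quoting overrides one parameter, which is the layering the
dialect and formatting-parameter sections describe.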