Trying to bring PEP up to date with discussions on mailing list. I hope I
have not misinterpreted the conclusions.

* dialect argument is now either a string to identify one of the
  internally defined parameter sets, otherwise it is an object which
  contains attributes which correspond to the parameter set.
* Altered set_dialect() to take dialect name and dialect object.
* Altered get_dialect() to take dialect name and return dialect object.
* Fleshed out formatting parameters, adding escapechar, lineterminator,
  quoting.
parent 59fc3f4bc8
commit e1e46e817b
47
pep-0305.txt
@@ -105,19 +105,42 @@ Dialects
 --------
 
 Readers and writers support a dialect argument which is just a
-convenient (string) handle on a group of lower level parameters.
+convenient handle on a group of lower level parameters.
+
+When dialect is a string it identifies one of the dialect which is
+known to the module, otherwise it is processed as a dialect class as
+described below.
 
 Dialects will generally be named after applications or organizations
 which define specific sets of format constraints. The initial dialect
-is "excel2000", which describes the format constraints of Excel 2000's
+is excel2000, which describes the format constraints of Excel 2000's
 CSV format. Another possible dialect (used here only as an example)
 might be "gnumeric".
 
+Dialects are implemented as attribute only classes to enable user to
+construct variant dialects by subclassing. The excel2000 dialect is
+implemented as follows::
+
+    class excel2000:
+        quotechar = '"'
+        delimiter = ','
+        escapechar = None
+        skipinitialspace = False
+        lineterminator = '\r\n'
+        quoting = 'minimal'
+
+An excel tab separated dialect can then be defined in user code as
+follows::
+
+    class exceltsv(csv.excel2000):
+        delimiter = '\t'
+
 Two functions are defined in the API to set and retrieve dialects::
 
-    set_dialect(dialect, pdict)
-    pdict = get_dialect(dialect)
+    set_dialect(name, dialect)
+    dialect = get_dialect(name)
 
-The pdict parameter is a dictionary whose keys are the names the
+The dialect parameter is a class or instance whose attributes are the
 formatting parameters defined in the next section.
 
 
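The dialect classes and the two registry functions introduced in the hunk above can be sketched roughly as follows. This is only an illustration of the proposed interface, not the module's implementation: the `_dialects` dictionary and the `resolve_dialect()` helper are hypothetical names, and `exceltsv` subclasses a locally defined `excel2000` rather than `csv.excel2000` so that the sketch is self-contained.

```python
# Sketch only. The draft specifies the class attributes and the two
# function signatures; the storage and the resolver are assumptions.


class excel2000:
    quotechar = '"'
    delimiter = ','
    escapechar = None
    skipinitialspace = False
    lineterminator = '\r\n'
    quoting = 'minimal'


class exceltsv(excel2000):
    # Only the delimiter differs; everything else is inherited.
    delimiter = '\t'


_dialects = {}  # hypothetical storage: name -> dialect class or instance


def set_dialect(name, dialect):
    """Register a dialect object under a string name."""
    _dialects[name] = dialect


def get_dialect(name):
    """Return the dialect object registered under name."""
    return _dialects[name]


def resolve_dialect(dialect):
    """Hypothetical helper: accept a registered name or a dialect object."""
    if isinstance(dialect, str):
        return get_dialect(dialect)
    return dialect


set_dialect("excel2000", excel2000)
set_dialect("exceltsv", exceltsv)
```

Because the formatting parameters are plain class attributes, a variant dialect overrides only what differs: `resolve_dialect("exceltsv").delimiter` is a tab, while `quotechar` still comes from `excel2000`.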
@@ -135,6 +158,9 @@ for the set_dialect() and get_dialect() module functions.
 - delimiter specifies a one-character string to use as the field
   separator. It defaults to ','.
 
+- escapechar specifies a one character string used to escape the
+  delimiter when quotechar is set to None.
+
 - skipinitialspace specifies how to interpret whitespace which
   immediately follows a delimiter. It defaults to False, which means
   that whitespace immediate following a delimiter is part of the
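The effect of escapechar and skipinitialspace can be tried out with the csv module that later shipped in the Python standard library. Note the assumption: that module is not the draft API in this diff. In particular, it disables quoting with `quoting=csv.QUOTE_NONE` rather than by setting quotechar to None as the draft text describes.

```python
import csv
import io

# With quoting disabled, the writer must escape a delimiter that appears
# inside a field; escapechar supplies the escape character.
buf = io.StringIO()
writer = csv.writer(buf, delimiter=',', quoting=csv.QUOTE_NONE,
                    escapechar='\\', lineterminator='\r\n')
writer.writerow(['a,b', 'c'])
escaped = buf.getvalue()

# skipinitialspace=False (the default) keeps the space after the
# delimiter as part of the field; True discards it.
rows_default = list(csv.reader(['x, y']))
rows_skip = list(csv.reader(['x, y'], skipinitialspace=True))
```

Here `escaped` holds the row with a backslash in front of the embedded comma, and the two reader calls differ only in whether the second field keeps its leading space.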
@@ -143,6 +169,17 @@ for the set_dialect() and get_dialect() module functions.
 - lineterminator specifies the character sequence which should
   terminate rows.
 
+- quoting controls when quotes should be generated by the
+  writer.
+
+    "minimal" means only when required, for example, when a field
+    contains either the quotechar or the delimiter
+
+    "always" means that quotes are always placed around fields.
+
+    "nonnumeric" means that quotes are always placed around fields
+    which contain characters other than [+-0-9.].
+
 ... XXX More to come XXX ...
 
 When processing a dialect setting and one or more of the other
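The three quoting modes added in the hunk above could be applied to a single field roughly as shown here. This is a sketch under stated assumptions, not the writer's actual logic: `quote_field()` and `NUMERIC_CHARS` are hypothetical names, `[+-0-9.]` is read literally as the characters `+`, `-`, `.` and the digits, embedded quote characters are doubled in the Excel manner (the hunk does not say how they are handled), and "minimal" checks only the two conditions the text lists.

```python
# Assumed literal reading of [+-0-9.] from the draft text.
NUMERIC_CHARS = set("+-.0123456789")


def quote_field(field, quoting="minimal", quotechar='"', delimiter=','):
    """Hypothetical helper applying one of the three draft quoting modes."""
    if quoting == "always":
        needs_quotes = True
    elif quoting == "nonnumeric":
        # Quote as soon as any character falls outside the numeric set.
        needs_quotes = any(ch not in NUMERIC_CHARS for ch in field)
    elif quoting == "minimal":
        # Only the two conditions named in the draft are checked here.
        needs_quotes = quotechar in field or delimiter in field
    else:
        raise ValueError("unknown quoting mode: %r" % (quoting,))
    if not needs_quotes:
        return field
    # Assumption: embedded quote characters are doubled, Excel style.
    return quotechar + field.replace(quotechar, quotechar * 2) + quotechar
```

For example, `quote_field("a,b")` adds quotes because of the embedded delimiter, while `quote_field("-12.5", "nonnumeric")` leaves the field bare.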