Trying to bring PEP up to date with discussions on mailing list. I hope I
have not misinterpreted the conclusions.
* dialect argument is now either a string identifying one of the
  internally defined parameter sets, or an object whose attributes
  correspond to the parameter set.
* Altered set_dialect() to take dialect name and dialect object.
* Altered get_dialect() to take dialect name and return dialect object.
* Fleshed out formatting parameters, adding escapechar, lineterminator,
  quoting.
Dave Cole 2003-01-30 12:11:27 +00:00
parent 59fc3f4bc8
commit e1e46e817b
1 changed files with 42 additions and 5 deletions


@@ -105,19 +105,42 @@ Dialects
--------
Readers and writers support a dialect argument which is just a
convenient (string) handle on a group of lower level parameters.
convenient handle on a group of lower level parameters.
When dialect is a string it identifies one of the dialects
known to the module; otherwise it is processed as a dialect class as
described below.
Dialects will generally be named after applications or organizations
which define specific sets of format constraints. The initial dialect
is "excel2000", which describes the format constraints of Excel 2000's
is excel2000, which describes the format constraints of Excel 2000's
CSV format. Another possible dialect (used here only as an example)
might be "gnumeric".
Dialects are implemented as attribute-only classes to enable users to
construct variant dialects by subclassing. The excel2000 dialect is
implemented as follows::
class excel2000:
    quotechar = '"'
    delimiter = ','
    escapechar = None
    skipinitialspace = False
    lineterminator = '\r\n'
    quoting = 'minimal'
An Excel tab-separated dialect can then be defined in user code as
follows::
class exceltsv(csv.excel2000):
    delimiter = '\t'
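A minimal, self-contained sketch of the subclassing idea described above. It assumes a local stand-in for the draft's excel2000 class rather than anything importable from a csv module, and dialect_params() is a hypothetical helper introduced only to show that a subclass inherits every parameter it does not override.

```python
# Sketch only: a local stand-in for the excel2000 dialect shown in the draft.
class excel2000:
    quotechar = '"'
    delimiter = ','
    escapechar = None
    skipinitialspace = False
    lineterminator = '\r\n'
    quoting = 'minimal'


# A variant dialect overrides only the attributes that differ.
class exceltsv(excel2000):
    delimiter = '\t'


# Names of the formatting parameters listed in the draft.
PARAMETER_NAMES = ('quotechar', 'delimiter', 'escapechar',
                   'skipinitialspace', 'lineterminator', 'quoting')


def dialect_params(dialect):
    """Hypothetical helper: collect the parameters of a dialect class or instance."""
    return {name: getattr(dialect, name) for name in PARAMETER_NAMES}


params = dialect_params(exceltsv)
```

Because the parameters are plain class attributes, the same helper works on a class or on an instance of it, which matches the draft's statement that a dialect may be either.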
Two functions are defined in the API to set and retrieve dialects::
set_dialect(dialect, pdict)
pdict = get_dialect(dialect)
set_dialect(name, dialect)
dialect = get_dialect(name)
The pdict parameter is a dictionary whose keys are the names the
The dialect parameter is a class or instance whose attributes are the
formatting parameters defined in the next section.
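One way the name-based lookup could work is sketched below. This is an assumption about the implementation, not the module's actual code: the _dialects dictionary, the KeyError on an unknown name, and the resolve_dialect() helper are all hypothetical, and the dialect classes are local stand-ins.

```python
# Sketch only: a module-level registry mapping dialect names to dialect objects.
_dialects = {}


def set_dialect(name, dialect):
    """Register a dialect class or instance under a name."""
    _dialects[name] = dialect


def get_dialect(name):
    """Return the dialect registered under name (KeyError if unknown, by assumption)."""
    return _dialects[name]


def resolve_dialect(dialect):
    """Hypothetical helper: accept either a dialect name or a dialect object."""
    if isinstance(dialect, str):
        return get_dialect(dialect)
    return dialect


# Local stand-ins for the dialect classes discussed in the draft.
class excel2000:
    quotechar = '"'
    delimiter = ','
    escapechar = None
    skipinitialspace = False
    lineterminator = '\r\n'
    quoting = 'minimal'


class exceltsv(excel2000):
    delimiter = '\t'


set_dialect('excel2000', excel2000)
set_dialect('exceltsv', exceltsv)
```

A reader or writer could then call resolve_dialect() on its dialect argument and treat the result uniformly as an object with parameter attributes.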
@@ -135,6 +158,9 @@ for the set_dialect() and get_dialect() module functions.
- delimiter specifies a one-character string to use as the field
separator. It defaults to ','.
- escapechar specifies a one-character string used to escape the
delimiter when quotechar is set to None.
- skipinitialspace specifies how to interpret whitespace which
immediately follows a delimiter. It defaults to False, which means
that whitespace immediately following a delimiter is part of the
@@ -143,6 +169,17 @@ for the set_dialect() and get_dialect() module functions.
- lineterminator specifies the character sequence which should
terminate rows.
- quoting controls when quotes should be generated by the
writer.
"minimal" means only when required, for example, when a field
contains either the quotechar or the delimiter.
"always" means that quotes are always placed around fields.
"nonnumeric" means that quotes are always placed around fields
which contain characters other than [-+0-9.].
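The three quoting modes can be read as a per-field decision made by the writer. The sketch below is one possible interpretation, not the module's implementation: needs_quotes() is a hypothetical function, "minimal" covers only the two cases named in the text (quotechar or delimiter present), and an empty field is assumed to need no quotes under "nonnumeric".

```python
import re

# Matches fields made up solely of the characters +, -, 0-9 and '.'.
# The hyphen is placed first so it is a literal, not a range operator.
_NUMERIC_CHARS = re.compile(r'[-+0-9.]*')


def needs_quotes(field, quoting, delimiter=',', quotechar='"'):
    """Hypothetical helper: decide whether the writer should quote a field."""
    if quoting == 'always':
        return True
    if quoting == 'nonnumeric':
        # Quote when the field contains any character outside the numeric set.
        return _NUMERIC_CHARS.fullmatch(field) is None
    if quoting == 'minimal':
        # Only the two cases named in the text; a full writer would likely
        # also have to consider embedded line terminators.
        return delimiter in field or quotechar in field
    raise ValueError('unknown quoting mode: %r' % (quoting,))
```

Note that "nonnumeric" as described is a character test, so a field such as "1.2.3" or "+-" would be left unquoted even though it is not a valid number.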
... XXX More to come XXX ...
When processing a dialect setting and one or more of the other