diff --git a/pep-0305.txt b/pep-0305.txt
index 229f6210b..142817d45 100644
--- a/pep-0305.txt
+++ b/pep-0305.txt
@@ -105,19 +105,42 @@
 Dialects
 --------
 Readers and writers support a dialect argument which is just a
-convenient (string) handle on a group of lower level parameters.
+convenient handle on a group of lower level parameters.
+
+When dialect is a string it identifies one of the dialect which is
+known to the module, otherwise it is processed as a dialect class as
+described below.
+
 Dialects will generally be named after applications or organizations
 which define specific sets of format constraints. The initial dialect
-is "excel2000", which describes the format constraints of Excel 2000's
+is excel2000, which describes the format constraints of Excel 2000's
 CSV format. Another possible dialect (used here only as an example)
 might be "gnumeric".
 
+Dialects are implemented as attribute only classes to enable user to
+construct variant dialects by subclassing. The excel2000 dialect is
+implemented as follows::
+
+    class excel2000:
+        quotechar = '"'
+        delimiter = ','
+        escapechar = None
+        skipinitialspace = False
+        lineterminator = '\r\n'
+        quoting = 'minimal'
+
+An excel tab separated dialect can then be defined in user code as
+follows::
+
+    class exceltsv(csv.excel2000):
+        delimiter = '\t'
+
 Two functions are defined in the API to set and retrieve dialects::
 
-    set_dialect(dialect, pdict)
-    pdict = get_dialect(dialect)
+    set_dialect(name, dialect)
+    dialect = get_dialect(name)
 
-The pdict parameter is a dictionary whose keys are the names the
+The dialect parameter is a class or instance whose attributes are the
 formatting parameters defined in the next section.
 
 
@@ -135,6 +158,9 @@ for the set_dialect() and get_dialect() module functions.
 - delimiter specifies a one-character string to use as the field
   separator. It defaults to ','.
 
+- escapechar specifies a one character string used to escape the
+  delimiter when quotechar is set to None.
+
 - skipinitialspace specifies how to interpret whitespace which
   immediately follows a delimiter. It defaults to False, which means
   that whitespace immediate following a delimiter is part of the
@@ -143,6 +169,17 @@ for the set_dialect() and get_dialect() module functions.
 - lineterminator specifies the character sequence which should
   terminate rows.
 
+- quoting controls when quotes should be generated by the
+  writer.
+
+    "minimal" means only when required, for example, when a field
+    contains either the quotechar or the delimiter
+
+    "always" means that quotes are always placed around fields.
+
+    "nonnumeric" means that quotes are always placed around fields
+    which contain characters other than [+-0-9.].
+
 ... XXX More to come XXX ...
 
 When processing a dialect setting and one or more of the other