PEP: 391
Title: Dictionary-Based Configuration For Logging
Version: $Revision$
Last-Modified: $Date$
Author: Vinay Sajip <vinay_sajip at red-dove.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Oct-2009
Python-Version: 2.7, 3.2
Post-History:


Abstract
========

This PEP describes a new way of configuring logging using a dictionary
to hold configuration information.


Rationale
=========

The present means for configuring Python's logging package is either
by using the logging API to configure logging programmatically, or
else by means of ConfigParser-based configuration files.

Programmatic configuration, while offering maximal control, fixes the
configuration in Python code. This does not facilitate changing it
easily at runtime, and, as a result, the ability to flexibly turn the
verbosity of logging up and down for different parts of an application
that uses logging is lost. This limits the usability of logging as an
aid to diagnosing problems - and sometimes, logging is the only
diagnostic aid available in production environments.

The ConfigParser-based configuration system is usable, but does not
allow its users to configure all aspects of the logging package. For
example, Filters cannot be configured using this system. Furthermore,
the ConfigParser format appears to engender dislike (sometimes strong
dislike) in some quarters. Though it was chosen because it was the
only configuration format supported in the Python standard library at
that time, many people regard it (or perhaps just the particular
schema chosen for logging's configuration) as 'crufty' or 'ugly', in
some cases apparently on purely aesthetic grounds.

Recent versions of Python include JSON support in the standard
library, and this is also usable as a configuration format. In other
environments, such as Google App Engine, YAML is used to configure
applications, and usually the configuration of logging would be
considered an integral part of the application configuration.
Although the standard library does not contain YAML support at
present, support for both JSON and YAML can be provided in a common
way because both of these serialization formats allow deserialization
to Python dictionaries.

By providing a way to configure logging by passing the configuration
in a dictionary, logging will be easier to configure not only for
users of JSON and/or YAML, but also for users of bespoke configuration
methods, by providing a common format in which to describe the desired
configuration.

Another drawback of the current ConfigParser-based configuration
system is that it does not support incremental configuration: a new
configuration completely replaces the existing configuration.
Although full flexibility for incremental configuration is difficult
to provide in a multi-threaded environment, the new configuration
mechanism will allow the provision of limited support for incremental
configuration.


Specification
=============

The specification consists of two parts: the API and the format of the
dictionary used to convey configuration information (i.e. the schema
to which it must conform).


Naming
------

Historically, the logging package has not been PEP 8 conformant [1]_.
At some future time, this will be corrected by changing method and
function names in the package in order to conform with PEP 8.
However, in the interests of uniformity, the proposed additions to the
API use a naming scheme which is consistent with the present scheme
used by logging.


API
---

The logging.config module will have the following additions:

* A function, called ``dictConfig()``, which takes a single argument:
  the dictionary holding the configuration. Nothing will be
  returned, though exceptions will be raised if there are errors
  while processing the dictionary.

It will be possible to customize this API - see the section on `API
Customization`_.

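As a minimal sketch of the call described above, assuming the
``dictConfig()`` function that ships in the standard library's
``logging.config`` module and the mandatory ``version`` key described
under `Dictionary Schema - Detail`_, the smallest useful call might
look like this. The choice of ``WARNING`` for the root level is purely
illustrative.

```python
import logging
import logging.config

# Smallest plausible configuration: the schema version plus a root level.
config = {
    'version': 1,
    'root': {'level': 'WARNING'},
}

# dictConfig() takes the dictionary as its only argument and returns nothing.
result = logging.config.dictConfig(config)
root_level = logging.getLogger().level
```

The call works purely by side effect on the logging package, so
problems are reported through exceptions rather than a return value.
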
Dictionary Schema - Overview
----------------------------

Before describing the schema in detail, it is worth saying a few words
about object connections, support for user-defined objects and access
to external and internal objects.


Object connections
''''''''''''''''''

The schema is intended to describe a set of logging objects - loggers,
handlers, formatters, filters - which are connected to each other in
an object graph. Thus, the schema needs to represent connections
between the objects. For example, say that, once configured, a
particular logger has attached to it a particular handler. For the
purposes of this discussion, we can say that the logger represents the
source, and the handler the destination, of a connection between the
two. Of course in the configured objects this is represented by the
logger holding a reference to the handler. In the configuration dict,
this is done by giving each destination object an id which identifies
it unambiguously, and then using the id in the source object's
configuration to indicate that a connection exists between the source
and the destination object with that id.

So, for example, consider the following YAML snippet::

    handlers:
      h1: #This is an id
        # configuration of handler with id h1 goes here
      h2: #This is another id
        # configuration of handler with id h2 goes here
    loggers:
      foo.bar.baz:
        # other configuration for logger 'foo.bar.baz'
        handlers: [h1, h2]

(Note: YAML will be used in this document as it is a little more
readable than the equivalent Python source form for the dictionary.)

The ids for loggers are the logger names which would be used
programmatically to obtain a reference to those loggers, e.g.
``foo.bar.baz``. The ids for other objects can be any string value
(such as ``h1``, ``h2`` above) and they are transient, in that they
are only meaningful for processing the configuration dictionary and
used to determine connections between objects, and are not persisted
anywhere when the configuration call is complete.

The above snippet indicates that the logger named ``foo.bar.baz``
should have two handlers attached to it, which are described by the
handler ids ``h1`` and ``h2``.


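The connection scheme above can be sketched in Python source form as
follows. This assumes the standard library's
``logging.config.dictConfig()``. The handler classes chosen for ``h1``
and ``h2`` are illustrative only, since the snippet above leaves their
configuration open.

```python
import logging
import logging.config

config = {
    'version': 1,
    'handlers': {
        # 'h1' and 'h2' are ids, used only while the dict is processed.
        'h1': {'class': 'logging.StreamHandler'},
        'h2': {'class': 'logging.NullHandler'},
    },
    'loggers': {
        # The logger id is the ordinary logger name.
        'foo.bar.baz': {'handlers': ['h1', 'h2']},
    },
}
logging.config.dictConfig(config)

logger = logging.getLogger('foo.bar.baz')
handler_types = [type(h).__name__ for h in logger.handlers]
```

After the call, the logger holds references to two handler instances;
the ids themselves appear only in the configuration.
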
User-defined objects
''''''''''''''''''''

The schema should support user-defined objects for handlers, filters
and formatters. (Loggers do not need to have different types for
different instances, so there is no support - in the configuration -
for user-defined logger classes.)

Objects to be configured will typically be described by dictionaries
which detail their configuration. In some places, the logging system
will be able to infer from the context how an object is to be
instantiated, but when a user-defined object is to be instantiated,
the system will not know how to do this. In order to provide complete
flexibility for user-defined object instantiation, the user will need
to provide a 'factory' - a callable which is called with a
configuration dictionary and which returns the instantiated object.
This will be signalled by the factory being made available under the
special key ``'()'``. Here's a concrete example::

    formatters:
      brief:
        format: '%(message)s'
      default:
        format: '%(asctime)s %(levelname)-8s %(name)-15s %(message)s'
        datefmt: '%Y-%m-%d %H:%M:%S'
      custom:
        (): my.package.customFormatterFactory
        bar: baz
        spam: 99.9
        answer: 42

The above YAML snippet defines three formatters. The first, with id
``brief``, is a standard ``logging.Formatter`` instance with the
specified format string. The second, with id ``default``, has a
longer format and also defines the time format explicitly, and will
result in a ``logging.Formatter`` initialized with those two format
strings. Shown in Python source form, the ``brief`` and ``default``
formatters have configuration sub-dictionaries::

    {
      'format' : '%(message)s'
    }

and::

    {
      'format' : '%(asctime)s %(levelname)-8s %(name)-15s %(message)s',
      'datefmt' : '%Y-%m-%d %H:%M:%S'
    }

respectively, and as these dictionaries do not contain the special key
``'()'``, the instantiation is inferred from the context: as a result,
standard ``logging.Formatter`` instances are created. The
configuration sub-dictionary for the third formatter, with id
``custom``, is::

    {
      '()' : 'my.package.customFormatterFactory',
      'bar' : 'baz',
      'spam' : 99.9,
      'answer' : 42
    }

and this contains the special key ``'()'``, which means that
user-defined instantiation is wanted. In this case, the specified
factory callable will be located using normal import mechanisms and
called with the *remaining* items in the configuration sub-dictionary
as keyword arguments. In the above example, the formatter with id
``custom`` will be assumed to be returned by the call::

    my.package.customFormatterFactory(bar='baz', spam=99.9, answer=42)

The key ``'()'`` has been used as the special key because it is not a
valid keyword parameter name, and so will not clash with the names of
the keyword arguments used in the call. The ``'()'`` also serves as a
mnemonic that the corresponding value is a callable.


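The factory mechanism can be tried out with a small sketch. It assumes
the standard library's ``logging.config.dictConfig()``, which accepts
a callable object as well as a dotted name under ``'()'``. The
function ``custom_formatter_factory`` and the format string it builds
are hypothetical, standing in for ``my.package.customFormatterFactory``.

```python
import logging
import logging.config

def custom_formatter_factory(bar, spam, answer):
    """Hypothetical factory: receives the remaining items as keywords."""
    fmt = '[{} {} {}] %(message)s'.format(bar, spam, answer)
    return logging.Formatter(fmt)

config = {
    'version': 1,
    'formatters': {
        'custom': {
            '()': custom_formatter_factory,  # signals custom instantiation
            'bar': 'baz',
            'spam': 99.9,
            'answer': 42,
        },
    },
    'handlers': {
        'console': {'class': 'logging.StreamHandler', 'formatter': 'custom'},
    },
    'root': {'handlers': ['console']},
}
logging.config.dictConfig(config)

# Format a record by hand to see what the factory-built formatter does.
formatter = logging.getLogger().handlers[0].formatter
record = logging.LogRecord('demo', logging.INFO, 'demo.py', 1,
                           'hello', None, None)
formatted = formatter.format(record)
```

Passing the callable directly is convenient when the dictionary is
built in Python code; a text-based configuration would use the dotted
name instead.
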
Access to external objects
''''''''''''''''''''''''''

There are times when a configuration will need to refer to objects
external to the configuration, for example ``sys.stderr``. If the
configuration dict is constructed using Python code then this is
straightforward, but a problem arises when the configuration is
provided via a text file (e.g. JSON, YAML). In a text file, there is
no standard way to distinguish ``sys.stderr`` from the literal string
``'sys.stderr'``. To facilitate this distinction, the configuration
system will look for certain special prefixes in string values and
treat them specially. For example, if the literal string
``'ext://sys.stderr'`` is provided as a value in the configuration,
then the ``ext://`` will be stripped off and the remainder of the
value processed using normal import mechanisms.

The handling of such prefixes will be done in a way analogous to
protocol handling: there will be a generic mechanism to look for
prefixes which match the regular expression
``^(?P<prefix>[a-z]+)://(?P<suffix>.*)$`` whereby, if the ``prefix``
is recognised, the ``suffix`` is processed in a prefix-dependent
manner and the result of the processing replaces the string value. If
the prefix is not recognised, then the string value will be left
as-is.

The implementation will provide for a set of standard prefixes such as
``ext://`` but it will be possible to disable the mechanism completely
or provide additional or different prefixes for special handling.

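The prefix handling described above can be sketched as follows. This
is not the implementation, only an illustration under stated
assumptions: the names ``PREFIX_RE``, ``CONVERTERS``,
``resolve_external`` and ``convert`` are hypothetical, and "normal
import mechanisms" is interpreted here as importing the leading module
and then walking attributes, importing submodules where needed.

```python
import importlib
import re
import sys

# The regular expression quoted in the text above.
PREFIX_RE = re.compile(r'^(?P<prefix>[a-z]+)://(?P<suffix>.*)$')

def resolve_external(dotted_name):
    """Import the leading module, then walk the remaining attributes."""
    parts = dotted_name.split('.')
    used = parts[0]
    obj = importlib.import_module(used)
    for part in parts[1:]:
        used += '.' + part
        try:
            obj = getattr(obj, part)
        except AttributeError:
            # The attribute may be a submodule that is not yet imported.
            importlib.import_module(used)
            obj = getattr(obj, part)
    return obj

# Hypothetical registry mapping each recognised prefix to its handler.
CONVERTERS = {'ext': resolve_external}

def convert(value):
    """Replace a prefixed string with the object it refers to."""
    if not isinstance(value, str):
        return value
    match = PREFIX_RE.match(value)
    if match is None:
        return value
    converter = CONVERTERS.get(match.group('prefix'))
    if converter is None:
        return value  # unrecognised prefix: leave the string as-is
    return converter(match.group('suffix'))

stream = convert('ext://sys.stderr')
plain = convert('sys.stderr')
unknown = convert('foo://bar')
```

Keeping the prefix-to-handler mapping in one table is one way to allow
prefixes to be added, replaced or removed, as the last paragraph above
requires.
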
Access to internal objects
''''''''''''''''''''''''''

As well as external objects, there is sometimes also a need to refer
to objects in the configuration. This will be done implicitly by the
configuration system for things that it knows about. For example, the
string value ``'DEBUG'`` for a ``level`` in a logger or handler will
automatically be converted to the value ``logging.DEBUG``, and the
``handlers``, ``filters`` and ``formatter`` entries will take an
object id and resolve to the appropriate destination object.

However, a more generic mechanism needs to be provided for the case
of user-defined objects which are not known to logging. For example,
consider ``logging.handlers.MemoryHandler``, which takes a ``target``
which is another handler to delegate to. Since the system already
knows about this class, then in the configuration, the given
``target`` just needs to be the object id of the relevant target
handler, and the system will resolve to the handler from the id. If,
however, a user defines a ``my.package.MyHandler`` which has an
``alternate`` handler, the configuration system would not know that
the ``alternate`` referred to a handler. To cater for this, a
generic resolution system will be provided which allows the user to
specify::

    handlers:
      file:
        # configuration of file handler goes here

      custom:
        (): my.package.MyHandler
        alternate: int://handlers.file

The literal string ``'int://handlers.file'`` will be resolved in an
analogous way to the strings with the ``ext://`` prefix, but looking
in the configuration itself rather than the import namespace. The
mechanism will allow access by dot or by index, in a similar way to
that provided by ``str.format``. Thus, given the following snippet::

    handlers:
      email:
        class: logging.handlers.SMTPHandler
        mailhost: localhost
        fromaddr: my_app@domain.tld
        toaddrs:
          - support_team@domain.tld
          - dev_team@domain.tld
        subject: Houston, we have a problem.

in the configuration, the string ``'int://handlers'`` would resolve to
the dict with key ``handlers``, the string ``'int://handlers.email'``
would resolve to the dict with key ``email`` in the ``handlers`` dict,
and so on. The string ``'int://handlers.email.toaddrs[1]'`` would
resolve to ``'dev_team@domain.tld'`` and the string
``'int://handlers.email.toaddrs[0]'`` would resolve to the value
``'support_team@domain.tld'``. The ``subject`` value could be accessed
using either ``'int://handlers.email.subject'`` or, equivalently,
``'int://handlers.email[subject]'``. The latter form only needs to be
used if the key contains spaces or non-alphanumeric characters. If an
index value consists only of decimal digits, access will be attempted
using the corresponding integer value, falling back to the string
value if needed.

Given a string ``int://handlers.myhandler.mykey.123``, this will
resolve to ``config_dict['handlers']['myhandler']['mykey']['123']``.
If the string is specified as ``int://handlers.myhandler.mykey[123]``,
the system will attempt to retrieve the value from
``config_dict['handlers']['myhandler']['mykey'][123]``, and fall back
to ``config_dict['handlers']['myhandler']['mykey']['123']`` if that
fails.

Note: the ``ext`` and ``int`` prefixes are provisional. If better
alternatives are suggested during the PEP review process, they will be
used.

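The dot and index rules above can be illustrated with a small
stand-alone resolver. This is a sketch under stated assumptions rather
than the proposed implementation: the function ``resolve_internal``
and the helper patterns are hypothetical names, and the choice of
which exceptions trigger the fallback to a string key is a guess.

```python
import re

INT_PREFIX = 'int://'
WORD = re.compile(r'^\s*(\w+)\s*')
DOT = re.compile(r'^\.\s*(\w+)\s*')
INDEX = re.compile(r'^\[\s*([^\]]+?)\s*\]\s*')
DIGITS = re.compile(r'^\d+$')

def resolve_internal(config, value):
    """Resolve an 'int://...' string against the configuration dict."""
    if not value.startswith(INT_PREFIX):
        raise ValueError('Not an internal reference: %r' % value)
    path = value[len(INT_PREFIX):]
    match = WORD.match(path)
    if match is None:
        raise ValueError('Unable to convert %r' % value)
    current = config[match.group(1)]
    rest = path[match.end():]
    while rest:
        match = DOT.match(rest)
        if match:
            # Access by dot always uses the string key.
            current = current[match.group(1)]
        else:
            match = INDEX.match(rest)
            if match is None:
                raise ValueError('Unable to convert %r at %r' % (value, rest))
            key = match.group(1)
            if DIGITS.match(key):
                # All-digit index: try the integer, then the string.
                try:
                    current = current[int(key)]
                except (KeyError, IndexError, TypeError):
                    current = current[key]
            else:
                current = current[key]
        rest = rest[match.end():]
    return current

config = {
    'handlers': {
        'email': {
            'toaddrs': ['support_team@domain.tld', 'dev_team@domain.tld'],
            'subject': 'Houston, we have a problem.',
        },
        'myhandler': {'mykey': {'123': 'string key'}},
    },
}
second_address = resolve_internal(config, 'int://handlers.email.toaddrs[1]')
subject = resolve_internal(config, 'int://handlers.email[subject]')
fallback = resolve_internal(config, 'int://handlers.myhandler.mykey[123]')
```

In the last lookup the integer key ``123`` is not present, so the
resolver falls back to the string key ``'123'``, as the text
describes.
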
Dictionary Schema - Detail
--------------------------

The dictionary passed to ``dictConfig()`` must contain the following
keys:

* `version` - to be set to an integer value representing the schema
  version. The only valid value at present is 1, but having this key
  allows the schema to evolve while still preserving backwards
  compatibility.

All other keys are optional, but if present they will be interpreted
as described below. In all cases below where a 'configuring dict' is
mentioned, it will be checked for the special ``'()'`` key to see if a
custom instantiation is required. If so, the mechanism described
above is used to instantiate; otherwise, the context is used to
determine how to instantiate.

* `formatters` - the corresponding value will be a dict in which each
  key is a formatter id and each value is a dict describing how to
  configure the corresponding Formatter instance.

  The configuring dict is searched for keys ``format`` and ``datefmt``
  (with defaults of ``None``) and these are used to construct a
  ``logging.Formatter`` instance.

* `filters` - the corresponding value will be a dict in which each key
  is a filter id and each value is a dict describing how to configure
  the corresponding Filter instance.

  The configuring dict is searched for key ``name`` (defaulting to the
  empty string) and this is used to construct a ``logging.Filter``
  instance.

* `handlers` - the corresponding value will be a dict in which each
  key is a handler id and each value is a dict describing how to
  configure the corresponding Handler instance.

  The configuring dict is searched for the following keys:

  * ``class`` (mandatory). This is the fully qualified name of the
    handler class.

  * ``level`` (optional). The level of the handler.

  * ``formatter`` (optional). The id of the formatter for this
    handler.

  * ``filters`` (optional). A list of ids of the filters for this
    handler.

  All *other* keys are passed through as keyword arguments to the
  handler's constructor. For example, given the snippet::

      handlers:
        console:
          class : logging.StreamHandler
          formatter: brief
          level : INFO
          filters: [allow_foo]
          stream : ext://sys.stdout
        file:
          class : logging.handlers.RotatingFileHandler
          formatter: precise
          filename: logconfig.log
          maxBytes: 1024
          backupCount: 3

  the handler with id ``console`` is instantiated as a
  ``logging.StreamHandler``, using ``sys.stdout`` as the underlying
  stream. The handler with id ``file`` is instantiated as a
  ``logging.handlers.RotatingFileHandler`` with the keyword arguments
  ``filename='logconfig.log', maxBytes=1024, backupCount=3``.

* `loggers` - the corresponding value will be a dict in which each key
  is a logger name and each value is a dict describing how to
  configure the corresponding Logger instance.

  The configuring dict is searched for the following keys:

  * ``level`` (optional). The level of the logger.

  * ``propagate`` (optional). The propagation setting of the logger.

  * ``filters`` (optional). A list of ids of the filters for this
    logger.

  * ``handlers`` (optional). A list of ids of the handlers for this
    logger.

  The specified loggers will be configured according to the level,
  propagation, filters and handlers specified.

* `root` - this will be the configuration for the root logger.
  Processing of the configuration will be as for any logger, except
  that the ``propagate`` setting will not be applicable.

* `incremental` - whether the configuration is to be interpreted as
  incremental to the existing configuration. This value defaults to
  ``False``, which means that the specified configuration replaces the
  existing configuration with the same semantics as used by the
  existing ``fileConfig()`` API.

  If the specified value is ``True``, the configuration is processed
  as described in the section on `Incremental Configuration`_, below.


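To show how the handler keys listed above fit together, here is the
``console`` handler from the snippet in Python source form. This is a
sketch assuming the standard library's
``logging.config.dictConfig()``. The ``brief`` formatter's format
string is an illustrative choice, and the handler is attached to the
root logger only so that it can be inspected afterwards.

```python
import logging
import logging.config
import sys

config = {
    'version': 1,
    'formatters': {'brief': {'format': '%(message)s'}},
    'filters': {'allow_foo': {'name': 'foo'}},
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',  # mandatory
            'formatter': 'brief',              # id of a formatter
            'level': 'INFO',                   # converted to logging.INFO
            'filters': ['allow_foo'],          # ids of filters
            'stream': 'ext://sys.stdout',      # passed on as stream=...
        },
    },
    'root': {'level': 'DEBUG', 'handlers': ['console']},
}
logging.config.dictConfig(config)

console = logging.getLogger().handlers[0]
uses_stdout = console.stream is sys.stdout
filter_names = [f.name for f in console.filters]
```

Only ``stream`` reaches the ``StreamHandler`` constructor as a keyword
argument; the other keys are consumed by the configuration system
itself.
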
A Working Example
-----------------

The following is an actual working configuration in YAML format
(except that the email addresses are bogus)::

    formatters:
      brief:
        format: '%(levelname)-8s: %(name)-15s: %(message)s'
      precise:
        format: '%(asctime)s %(name)-15s %(levelname)-8s %(message)s'
    filters:
      allow_foo:
        name: foo
    handlers:
      console:
        class : logging.StreamHandler
        formatter: brief
        level : INFO
        stream : ext://sys.stdout
        filters: [allow_foo]
      file:
        class : logging.handlers.RotatingFileHandler
        formatter: precise
        filename: logconfig.log
        maxBytes: 1024
        backupCount: 3
      debugfile:
        class : logging.FileHandler
        formatter: precise
        filename: logconfig-detail.log
        mode: a
      email:
        class: logging.handlers.SMTPHandler
        mailhost: localhost
        fromaddr: my_app@domain.tld
        toaddrs:
          - support_team@domain.tld
          - dev_team@domain.tld
        subject: Houston, we have a problem.
    loggers:
      foo:
        level : ERROR
        handlers: [debugfile]
      spam:
        level : CRITICAL
        handlers: [debugfile]
        propagate: no
      bar.baz:
        level: WARNING
    root:
      level : DEBUG
      handlers : [console, file]


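Since JSON is in the standard library, a cut-down version of such a
configuration can also be loaded from JSON text. The following is a
sketch, assuming the standard library's ``json`` module and
``logging.config.dictConfig()``. The file-based and email handlers are
left out so that nothing is written to disk, and a ``version`` key is
included because the schema requires it.

```python
import json
import logging
import logging.config

CONFIG_JSON = """
{
  "version": 1,
  "formatters": {
    "precise": {
      "format": "%(asctime)s %(name)-15s %(levelname)-8s %(message)s"
    }
  },
  "handlers": {
    "console": {
      "class": "logging.StreamHandler",
      "formatter": "precise",
      "level": "INFO",
      "stream": "ext://sys.stdout"
    }
  },
  "loggers": {
    "spam": {"level": "CRITICAL", "propagate": false},
    "bar.baz": {"level": "WARNING"}
  },
  "root": {"level": "DEBUG", "handlers": ["console"]}
}
"""

# Deserialize to a plain dict, then hand it to dictConfig().
config = json.loads(CONFIG_JSON)
logging.config.dictConfig(config)

spam = logging.getLogger('spam')
bar_baz = logging.getLogger('bar.baz')
```

The same dictionary could equally come from a YAML parser or from any
bespoke format, which is the point of making the dictionary the common
interface.
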
Incremental Configuration
=========================

It is difficult to provide complete flexibility for incremental
configuration. For example, because objects such as handlers, filters
and formatters are anonymous, once a configuration is set up, it is
not possible to refer to such anonymous objects when augmenting a
configuration. For example, if an initial call is made to configure
the system where logger ``foo`` has a handler with id ``console``
attached, then a subsequent call to configure a logger ``bar`` with a
handler with id ``console`` would create a new handler instance, as
the id ``console`` from the first call isn't kept.

Furthermore, there is not a compelling case for arbitrarily altering
the object graph of loggers, handlers, filters, formatters at
run-time, once a configuration is set up; the verbosity of loggers can
be controlled just by setting levels (and perhaps propagation flags).

Thus, when the ``incremental`` key of a configuration dict is present
and is ``True``, the system will ignore any ``formatters``,
``filters``, ``handlers`` entries completely, and process only the
``level`` and ``propagate`` settings in the ``loggers`` and ``root``
entries.

It's certainly possible to provide incremental configuration by other
means, for example making ``dictConfig()`` take an ``incremental``
keyword argument which defaults to ``False``. The reason for
suggesting that a flag in the configuration dict be used is that it
allows for configurations to be sent over the wire as pickled dicts
to a socket listener. Thus, the logging verbosity of a long-running
application can be altered over time with no need to stop and
restart the application.

Note: Feedback on incremental configuration needs based on your
practical experience will be particularly welcome.


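The intended use of the ``incremental`` flag can be sketched as two
successive calls. This assumes the standard library's
``logging.config.dictConfig()``. The second dictionary deliberately
contains only ``loggers`` settings, which is the part of the behaviour
described above that the example relies on.

```python
import logging
import logging.config

# First call: a full configuration that builds the object graph.
initial = {
    'version': 1,
    'handlers': {'console': {'class': 'logging.StreamHandler'}},
    'loggers': {'foo': {'level': 'ERROR', 'handlers': ['console']}},
}
logging.config.dictConfig(initial)
foo = logging.getLogger('foo')
handlers_before = list(foo.handlers)

# Later call: only adjust verbosity, leaving handlers untouched.
update = {
    'version': 1,
    'incremental': True,
    'loggers': {'foo': {'level': 'DEBUG', 'propagate': False}},
}
logging.config.dictConfig(update)
handlers_after = list(foo.handlers)
```

Because the second dictionary is plain data, it could just as well
have arrived over a socket, which is the scenario the text above has
in mind.
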
API Customization
=================

The bare-bones ``dictConfig()`` API will not be sufficient for all
use cases. Provision for customization of the API will be made by
providing the following:

* A class, called ``DictConfigurator``, whose constructor is passed
  the dictionary used for configuration, and which has a
  ``configure()`` method.

* A callable, called ``dictConfigClass``, which will (by default) be
  set to ``DictConfigurator``. This is provided so that if desired,
  ``DictConfigurator`` can be replaced with a suitable user-defined
  implementation.

The ``dictConfig()`` function will call ``dictConfigClass`` passing
the specified dictionary, and then call the ``configure()`` method on
the returned object to actually put the configuration into effect::

    def dictConfig(config):
        dictConfigClass(config).configure()

This should cater to all customization needs. For example, a subclass
of ``DictConfigurator`` could call ``DictConfigurator.__init__()`` in
its own ``__init__()``, then set up custom prefixes which would be
usable in the subsequent ``configure()`` call. The ``dictConfigClass``
would be bound to the subclass, and then ``dictConfig()`` could be
called exactly as in the default, uncustomized state.


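The rebinding of ``dictConfigClass`` can be sketched as below,
assuming the ``DictConfigurator`` class and ``dictConfigClass`` name
as they exist in the standard library's ``logging.config`` module. The
text above does not say how custom prefixes are registered, so this
hypothetical ``RecordingConfigurator`` does something simpler: it
keeps a record of each dictionary it is given. The ``applied`` list is
likewise an illustrative name.

```python
import logging
import logging.config

applied = []  # hypothetical audit trail of configurations

class RecordingConfigurator(logging.config.DictConfigurator):
    def __init__(self, config):
        # Run the standard initialization first, then customize.
        logging.config.DictConfigurator.__init__(self, config)
        applied.append(config)

original_class = logging.config.dictConfigClass
logging.config.dictConfigClass = RecordingConfigurator
try:
    # Callers use dictConfig() exactly as in the uncustomized state.
    logging.config.dictConfig({'version': 1, 'root': {'level': 'ERROR'}})
finally:
    # Restore the default so other code is unaffected by this sketch.
    logging.config.dictConfigClass = original_class
```

The same pattern would apply to a subclass that sets up custom
prefixes in its ``__init__()``: only the binding of
``dictConfigClass`` changes, not the calling code.
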
Configuration Errors
====================

If an error is encountered during configuration, the system will raise
a ``ValueError``, ``TypeError``, ``AttributeError`` or ``ImportError``
with a suitably descriptive message. The following is a (possibly
incomplete) list of conditions which will raise an error:

* A ``level`` which is not a string or which is a string not
  corresponding to an actual logging level

* A ``propagate`` value which is not a boolean

* An id which does not have a corresponding destination

* An invalid logger name

* Inability to resolve to an internal or external object


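Two of the error conditions listed above can be exercised with a short
sketch, assuming the standard library's
``logging.config.dictConfig()``. The helper ``try_config`` and the
level name ``NOISY`` are hypothetical; the helper catches exactly the
four exception types named above and reports which one was raised.

```python
import logging.config

def try_config(config):
    """Return the name of the exception raised, or None on success."""
    try:
        logging.config.dictConfig(config)
    except (ValueError, TypeError, AttributeError, ImportError) as exc:
        return type(exc).__name__
    return None

# A level string that does not correspond to an actual logging level.
bad_level = try_config({'version': 1, 'root': {'level': 'NOISY'}})

# A handler id with no corresponding destination in the dictionary.
missing_handler = try_config({'version': 1,
                              'root': {'handlers': ['nope']}})

# A well-formed configuration raises nothing.
ok = try_config({'version': 1, 'root': {'level': 'INFO'}})
```

Catching the four types together is a reasonable pattern for code that
accepts configurations from untrusted or hand-edited sources.
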
References
==========

.. [1] PEP 8, Style Guide for Python Code, van Rossum, Warsaw
   (http://www.python.org/dev/peps/pep-0008)


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: