335 lines
9.9 KiB
ReStructuredText
335 lines
9.9 KiB
ReStructuredText
PEP: 584
|
||
Title: Add + and - operators to the built-in dict class.
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Steven D'Aprano <steve@pearwood.info>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 01-Mar-2019
|
||
Post-History:
|
||
|
||
=====================================
|
||
PEP-584 Dict addition and subtraction
|
||
=====================================
|
||
|
||
**DRAFT** -- This is a draft document for discussion.
|
||
|
||
Abstract
|
||
--------
|
||
|
||
This PEP suggests adding merge ``+`` and difference ``-`` operators to
|
||
the built-in ``dict`` class.
|
||
|
||
The merge operator will have the same relationship to the
|
||
``dict.update`` method as the list concatenation operator has to
|
||
``list.extend``, with dict difference being defined analogously.
|
||
|
||
|
||
Examples
|
||
--------
|
||
|
||
Dict addition will return a new dict containing the left operand
|
||
merged with the right operand.
|
||
|
||
>>> d = {'spam': 1, 'eggs': 2, 'cheese': 3}
|
||
>>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
|
||
>>> d + e
|
||
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
|
||
>>> e + d
|
||
{'cheese': 3, 'aardvark': 'Ethel', 'spam': 1, 'eggs': 2}
|
||
|
||
The augmented assignment version operates in-place.
|
||
|
||
>>> d += e
|
||
>>> print(d)
|
||
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
|
||
|
||
Analogously with list addition, the operator version is more
|
||
restrictive, and requires that both arguments are dicts, while the
|
||
augmented assignment version allows anything the ``update`` method
|
||
allows, such as iterables of key/value pairs.
|
||
|
||
>>> d + [('spam', 999)]
|
||
Traceback (most recent call last):
|
||
...
|
||
TypeError: can only merge dict (not "list") to dict
|
||
|
||
>>> d += [('spam', 999)]
|
||
>>> print(d)
|
||
{'spam': 999, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
|
||
|
||
|
||
Dict difference ``-`` will return a new dict containing the items from
|
||
the left operand which are not in the right operand.
|
||
|
||
>>> d = {'spam': 1, 'eggs': 2, 'cheese': 3}
|
||
>>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
|
||
>>> d - e
|
||
{'spam': 1, 'eggs': 2}
|
||
>>> e - d
|
||
{'aardvark': 'Ethel'}
|
||
|
||
Augmented assignment will operate in place.
|
||
|
||
>>> d -= e
|
||
>>> print(d)
|
||
{'spam': 1, 'eggs': 2}
|
||
|
||
|
||
Like the merge operator and list concatenation, the difference
|
||
operator requires both operands to be dicts, while the augmented
|
||
version allows any iterable of keys.
|
||
|
||
>>> d - {'spam', 'parrot'}
|
||
Traceback (most recent call last):
|
||
...
|
||
TypeError: cannot take the difference of dict and set
|
||
|
||
>>> d -= {'spam', 'parrot'}
|
||
>>> print(d)
|
||
{'eggs': 2}
|
||
|
||
|
||
Semantics
|
||
---------
|
||
|
||
For the merge operator, if a key appears in both operands, the
|
||
last-seen value (i.e. that from the right-hand operand) wins. This
|
||
shows that dict addition is not commutative, in general ``d + e`` will
|
||
not equal ``e + d``. This joins a number of other non-commutative
|
||
addition operators among the builtins, including lists, tuples,
|
||
strings and bytes.
|
||
|
||
Having the last-seen value wins makes the merge operator match the
|
||
semantics of the ``update`` method, so that ``d + e`` is an operator
|
||
version of ``d.update(e)``.
|
||
|
||
The error messages shown above are not part of the API, and may change
|
||
at any time.
|
||
|
||
|
||
Rejected semantics
|
||
~~~~~~~~~~~~~~~~~~
|
||
|
||
Rejected alternatives semantics for ``d + e`` include:
|
||
|
||
- Add only new keys from ``e``, without overwriting existing keys in
|
||
``d``. This may be done by reversing the operands ``e + d``, or
|
||
using dict difference first, ``d + (e - d)``. The later is
|
||
especially useful for the in-place version ``d += (e - d)``.
|
||
|
||
- Raise an exception if there are duplicate keys. This seems
|
||
unnecessarily restrictive and is not likely to be useful in
|
||
practice. For example, updating default configuration values with
|
||
user-supplied values would most often fail under the requirement
|
||
that keys are unique::
|
||
|
||
prefs = site_defaults + user_defaults + document_prefs
|
||
|
||
- Add the values of d2 to the corresponding values of d1. This is the
|
||
behaviour implemented by ``collections.Counter``.
|
||
|
||
|
||
Syntax
|
||
------
|
||
|
||
An alternative to the ``+`` operator is the pipe ``|`` operator, which
|
||
is used for set union. This suggestion did not receive much support
|
||
on Python-Ideas.
|
||
|
||
The ``+`` operator was strongly preferred on Python-Ideas.[1] It is
|
||
more familiar than the pipe operator, matches nicely with ``-`` as a
|
||
pair, and the Counter subclass already uses ``+`` for merging.
|
||
|
||
|
||
Current Alternatives
|
||
--------------------
|
||
|
||
To create a new dict containing the merged items of two (or more)
|
||
dicts, one can currently write::
|
||
|
||
{**d1, **d2}
|
||
|
||
but this is neither obvious nor easily discoverable. It is only
|
||
guaranteed to work if the keys are all strings. If the keys are not
|
||
strings, it currently works in CPython, but it may not work with other
|
||
implementations, or future versions of CPython[2].
|
||
|
||
It is also limited to returning a built-in dict, not a subclass,
|
||
unless re-written as ``MyDict(**d1, **d2)``, in which case non-string
|
||
keys will raise TypeError.
|
||
|
||
There is currently no way to perform dict subtraction except through a
|
||
manual loop.
|
||
|
||
|
||
Implementation
|
||
--------------
|
||
|
||
The implementation will be in C. (The author of this PEP would like
|
||
to make it known that he is not able to write the implementation.)
|
||
|
||
An approximate pure-Python implementation of the merge operator will
|
||
be::
|
||
|
||
def __add__(self, other):
|
||
if isinstance(other, dict):
|
||
new = type(self)() # May be a subclass of dict.
|
||
new.update(self)
|
||
new.update(other)
|
||
return new
|
||
return NotImplemented
|
||
def __radd__(self, other):
|
||
if isinstance(other, dict):
|
||
new = type(other)()
|
||
new.update(other)
|
||
new.update(self)
|
||
return new
|
||
return NotImplemented
|
||
|
||
Note that the result type will be the type of the left operand; in the
|
||
event of matching keys, the winner is the right operand.
|
||
|
||
Augmented assignment will just call the ``update`` method. This is
|
||
analogous to the way ``list +=`` calls the ``extend`` method, which
|
||
accepts any iterable, not just lists.
|
||
|
||
def __iadd__(self, other):
|
||
self.update(other)
|
||
|
||
|
||
An approximate pure-Python implementation of the difference operator will be::
|
||
|
||
def __sub__(self, other):
|
||
if isinstance(other, dict):
|
||
new = type(self)()
|
||
for k in self:
|
||
if k not in other:
|
||
new[k] = self[k]
|
||
return new
|
||
return NotImplemented
|
||
def __rsub__(self, other):
|
||
if isinstance(other, dict):
|
||
new = type(other)()
|
||
for k in other:
|
||
if k not in self:
|
||
new[k] = other[k]
|
||
return new
|
||
return NotImplemented
|
||
|
||
Augmented assignment will operate on equivalent terms to ``update``.
|
||
If the operand has a key method, it will be used, otherwise the
|
||
operand will be iterated over::
|
||
|
||
def __isub__(self, other):
|
||
if hasattr(other, 'keys'):
|
||
for k in other.keys():
|
||
if k in self:
|
||
del self[k]
|
||
else:
|
||
for k in other:
|
||
if k in self:
|
||
del self[k]
|
||
|
||
|
||
These semantics are intended to match those of ``update`` as closely
|
||
as possible. For the dict built-in itself, calling ``keys`` is
|
||
redundant as iteration over a dict iterates over its keys; but for
|
||
subclasses or other mappings, ``update`` prefers to use the keys
|
||
method.
|
||
|
||
.. attention:: The above paragraph may be inaccurate.
|
||
Although the dict docstring states that ``keys``
|
||
will be called if it exists, this does not seem to
|
||
be the case for dict subclasses. Bug or feature?
|
||
|
||
|
||
Contra-indications
|
||
------------------
|
||
|
||
(Or when to avoid using these new operators.)
|
||
|
||
For merging multiple dicts, the ``d1 + d2 + d3 + d4 + ...`` idiom will
|
||
suffer from the same unfortunate O(N\*\*2) Big Oh performance as does
|
||
list and tuple addition, and for similar reasons. If one expects to
|
||
be merging a large number of dicts where performance is an issue, it
|
||
may be better to use an explicit loop and in-place merging::
|
||
|
||
new = {}
|
||
for d in many_dicts:
|
||
new += d
|
||
|
||
This is unlikely to be a problem in practice as most uses of the merge
|
||
operator are expected to only involve a small number of dicts.
|
||
Similarly, most uses of list and tuple concatenation only use a few
|
||
objects.
|
||
|
||
Using the dict augmented assignment operators on a dict inside a tuple
|
||
(or other immutable data structure) will lead to the same problem that
|
||
occurs with list concatenation[3], namely the in-place addition will
|
||
succeed, but the operation will raise an exception.
|
||
|
||
>>> a_tuple = ({'spam': 1, 'eggs': 2}, None)
|
||
>>> a_tuple[0] += {'spam': 999}
|
||
Traceback (most recent call last):
|
||
...
|
||
TypeError: 'tuple' object does not support item assignment
|
||
>>> a_tuple[0]
|
||
{'spam': 999, 'eggs': 2}
|
||
|
||
Similar remarks apply to the ``-`` operator.
|
||
|
||
|
||
Other discussions
|
||
-----------------
|
||
|
||
`Latest discussion which motivated this PEP
|
||
<https://mail.python.org/pipermail/python-ideas/2019-February/055509.html>`_
|
||
|
||
`Ticket on the bug tracker <https://bugs.python.org/issue36144>`_
|
||
|
||
`A previous discussion
|
||
<https://mail.python.org/pipermail/python-ideas/2015-February/031748.html>`_
|
||
and `commentary on it <https://lwn.net/Articles/635397/>`_. Note that
|
||
the author of this PEP was skeptical of this proposal at the time.
|
||
|
||
`How to merge dictionaries
|
||
<https://treyhunner.com/2016/02/how-to-merge-dictionaries-in-python/>`_
|
||
in idiomatic Python.
|
||
|
||
|
||
Open questions
|
||
--------------
|
||
|
||
Should these operators be part of the ABC ``Mapping`` API?
|
||
|
||
|
||
References
|
||
----------
|
||
|
||
[1] Guido's declaration that plus wins over pipe:
|
||
https://mail.python.org/pipermail/python-ideas/2019-February/055519.html
|
||
|
||
[2] Non-string keys: https://bugs.python.org/issue35105 and
|
||
https://mail.python.org/pipermail/python-dev/2018-October/155435.html
|
||
|
||
[3] Behaviour in tuples:
|
||
https://docs.python.org/3/faq/programming.html#why-does-a-tuple-i-item-raise-an-exception-when-the-addition-works
|
||
|
||
|
||
Copyright
|
||
---------
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|