394 lines
15 KiB
Plaintext
394 lines
15 KiB
Plaintext
PEP: 3144
|
|
Title: IP Address Manipulation Library for the Python Standard Library
|
|
Version: $Revision$
|
|
Last-Modified: $Date$
|
|
Author: Peter Moody <peter@hda3.com>
|
|
Discussions-To: ipaddr-py-dev@googlegroups.com
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Content-Type: text/plain
|
|
Created: 13-Aug-2009
|
|
Python-Version: 3.2
|
|
|
|
|
|
Abstract:
|
|
|
|
This PEP proposes a design for a lightweight ip address manipulation module
|
|
for python.
|
|
|
|
|
|
Motivation:
|
|
|
|
Many network administrators use python in their day to day jobs. Finding a
|
|
library to assist with the common ip address manipulation tasks is easy.
|
|
Finding a good library for performing those tasks can be somewhat more
|
|
difficult. For this reason, I (like many before me) scratched an itch and
|
|
wrote my own with an emphasis on being easy to understand and fast for the
|
|
most common operations.
|
|
|
|
For context, a previous version of this library was up for inclusion in
|
|
python 3.1, see issue 3959 [1] for more information.
|
|
|
|
|
|
Rationale:
|
|
|
|
ipaddr was designed with a few basic principals in mind:
|
|
|
|
- IPv4 and IPv6 objects are distinct.
|
|
- IP addresses and IP networks are distinct.
|
|
- the library should be useful and the assumptions obvious to the network
|
|
programmer.
|
|
- IP networks should be treated as lists (as opposed to some other
|
|
python intrinsic) in so far as it makes sense.
|
|
- the library should be lightweight and fast without sacrificing
|
|
expected functionality.
|
|
|
|
- Distinct IPV4 and IPV6 objects.
|
|
|
|
While there are many similarities, IPV4 and IPV6 objects are fundamentally
|
|
different. The similarities allow for easy abstraction of certain
|
|
operations which affect the bits from both in the same manner, but their
|
|
differences mean attempts to combine them into one object yield unexpected
|
|
results. According to Vint Cerf, "I have seen a substantial amount of
|
|
traffic about IPv4 and IPv6 comparisons and the general consensus is that
|
|
these are not comparable." (Vint Cerf [2]). For python versions >= 3.0,
|
|
this means that (<, >, <=, >=) comparison operations between IPv4 and IPv6
|
|
objects raise a TypeError per the Ordering Comparisons [3].
|
|
|
|
- Distinct network and address objects.
|
|
|
|
An IPV4 address is a single 32 bit number while the IPV4 address assigned
|
|
to a networked computer is a 32 bit address and associated network.
|
|
Similarly, an IPV6 address is a 128 bit number while an IPV6 address
|
|
assigned to a networked computer is a 128 bit number and associated network
|
|
information. The similarities leads to easy abstraction of some methods
|
|
and properties, but there are obviously a number of address/network
|
|
specific properties which require they be distinct. For instance, IP
|
|
networks contain a network address (the base address of the network),
|
|
broadcast address (the upper end of the network, also the address to
|
|
which every machine on a given network is supposed listen, hence the name
|
|
broadcast), supernetworks and subnetworks, etc. The individual property
|
|
addresses in an IP network obviously don't have the same properties,
|
|
they're simply 32 or 128 bit numbers.
|
|
|
|
- Principal of least confusion for network programmers.
|
|
|
|
It should be understood that, above all, this module is designed with the
|
|
network administrator in mind. In practice, this means that a number of
|
|
assumptions are made with regards to common usage and the library prefers
|
|
the usefulness of accepted practice over strict adherence to RFCs. For
|
|
example, ipaddr accepts '192.168.1.1/24' as a network definition because
|
|
this is a very common way of describing an address + netmask despite the
|
|
fact that 192.168.1.1 is actually an IP address on the network
|
|
192.168.1.0/24. Strict adherence would require that networks have all of
|
|
the host bits masked to zero, which would require two objects to describe
|
|
that IP + network. In practice, a looser interpretation of a network is
|
|
a very useful if common abstraction, so ipaddr prefers to make this
|
|
available. For the developer who is concerned with strict adherence,
|
|
ipaddr provides an optional 'strict' boolean argument to the
|
|
IPv(4|6)Network constructors which guarantees that all host bits are masked
|
|
down.
|
|
|
|
- Treat network elements as lists (in so far as it's possible).
|
|
|
|
Treating IP networks as lists is a natural extension from viewing the
|
|
network as a series of individual ip addresses. Most of the standard list
|
|
methods should be implemented and should behave in a manner that would be
|
|
consistent if the IP network object were actually a list of strings or
|
|
integers. The methods which actually modify a lists contents don't extend
|
|
as well to this model (__add__, __iadd__, __sub__, __isub__, etc) but
|
|
others (__contains__, __iter__, etc) work quite nicely. It should be noted
|
|
that __len__ doesn't work as expected since python internals has this
|
|
limited to a 32 bit integer and it would need to be at least 128 bits to
|
|
work with IPV6.
|
|
|
|
- Lightweight.
|
|
|
|
While some network programmers will undoubtedly want more than this library
|
|
provides, keeping the functionality to strictly what's required from a IP
|
|
address manipulation module is critical to keeping the code fast, easily
|
|
comprehensible and extensible. It is a goal to provide enough options in
|
|
terms of functionality to allow the developer to easily do their work
|
|
without needlessly cluttering the library. Finally, It's important to note
|
|
that this design doesn't prevent subclassing or otherwise extending to meet
|
|
the unforeseen needs.
|
|
|
|
|
|
Specification:
|
|
|
|
A slightly more detailed look at the library follows.
|
|
|
|
- Design
|
|
|
|
ipaddr has four main classes most people will use:
|
|
|
|
1. IPv4Address. (eg, '192.168.1.1')
|
|
2. IPv4Network (eg, '192.168.0.0/16')
|
|
3. IPv6Address (eg, '::1')
|
|
4. IPv6Network (eg, '2001::/32')
|
|
|
|
Most of the operations a network administrator performs on networks are
|
|
similar for both IPv4 and IPv6 networks. Ie. finding subnets, supernets,
|
|
determining if an address is contained in a given network, etc. Similarly,
|
|
both addresses and networks (of the same ip version!) have much in common;
|
|
the process for turning a given 32 or 128 bit number into a human readable
|
|
string notation, determining if the ip is within the valid specified range,
|
|
etc. Finally, there are some pythonic abstractions which are valid for all
|
|
addresses and networks, both IPv4 and IPv6. In short, there is common
|
|
functionality shared between (ipaddr class names in parentheses):
|
|
|
|
1. all IP addresses and networks, both IPv4 and IPv6. (_IPAddrBase)
|
|
|
|
2. all IP addresses of both versions. (_BaseIP)
|
|
|
|
3. all IP networks of both version. (_BaseNet)
|
|
|
|
4. all IPv4 objects, both addresses and networks. (_BaseV4)
|
|
|
|
5. all IPv6 objects, both addresses and networks. (_BaseV6)
|
|
|
|
Seeing this as a clear hierarchy is important for recognizing how much
|
|
code is common between the four main classes. For this reason, ipaddr uses
|
|
class inheritance to abstract out as much common code is possible and
|
|
appropriate. This lack of duplication and very clean layout also makes
|
|
the job of the developer much easier should they need to debug code (either
|
|
theirs or mine).
|
|
|
|
Knowing that there might be cases where the developer doesn't so much care
|
|
as to the types of IP they might be receiving, ipaddr comes with two
|
|
important helper functions, IPAddress() and IPNetwork(). These, as you
|
|
might guess, return the appropriately typed address or network objects for
|
|
the given argument.
|
|
|
|
Finally, as mentioned earlier, there is no meaningful natural ordering
|
|
between IPv4 and IPv6 addresses and networks [2]. Rather than invent a
|
|
standard, ipaddr follows Ordering Comparisons and returns a TypeError
|
|
when asked to compare objects of differing IP versions. In practice, there
|
|
are many ways a programmer may wish to order the addresses, so this this
|
|
shouldn't pose a problem for the developer who can easily write:
|
|
|
|
v4 = [x for x in mixed_list if x._version == 4]
|
|
v6 = [x for x in mixed_list if x._version == 6]
|
|
|
|
# perform operations on v4 and v6 here.
|
|
|
|
return v4_return + v6_return
|
|
|
|
- Multiple ways of displaying an IP Address.
|
|
|
|
Not everyone will want to display the same information in the same format;
|
|
IP addresses in cisco syntax are represented by network/hostmask, junipers
|
|
are (network/IP)/prefixlength and IPTables are (network/IP)/(prefixlength/
|
|
netmask). The ipaddr library provides multiple ways to display an address.
|
|
|
|
In [1]: IPNetwork('1.1.1.1').with_prefixlen
|
|
Out[1]: '1.1.1.1/32'
|
|
|
|
In [1]: IPNetwork('1.1.1.1').with_netmask
|
|
Out[1]: '1.1.1.1/255.255.255.255'
|
|
|
|
In [1]: IPNetwork('1.1.1.1').with_hostmask
|
|
Out[1]: '1.1.1.1/0.0.0.0'
|
|
|
|
the same applies to IPv6. It should be noted that netmasks and hostmasks
|
|
are not commonly used in IPv6, the methods exist for compatibility with
|
|
IPv4.
|
|
|
|
- Lazy evaluation combined with aggressive caching of network elements.
|
|
|
|
(the following example is for IPv6Network objects but the exact same
|
|
properties apply to IPv6Network objects).
|
|
|
|
As mentioned, an IP network object is defined by a number of properties.
|
|
The object
|
|
|
|
In [1]: IPv4Network('1.1.1.0/24')
|
|
|
|
has a number of IPv4Address properties
|
|
|
|
In [1]: o = IPv4Network('1.1.1.0/24')
|
|
|
|
In [2]: o.network
|
|
Out[2]: IPv4Address('1.1.1.0')
|
|
|
|
In [3]: o.broadcast
|
|
Out[3]: IPv4Address('1.1.1.255')
|
|
|
|
In [4]: o.hostmask
|
|
Out[4]: IPv4Address('0.0.0.255')
|
|
|
|
If we were to compute them all at object creation time, we would incur a
|
|
non-negligible performance hit. Since these properties are required to
|
|
define the object completely but their values aren't always of interest to
|
|
the programmer, their computation should be done only when requested.
|
|
However, in order to avoid the performance hit in the case where one
|
|
attribute for a particular object is requested repeatedly (and continuously
|
|
recomputed), the results of the computation should be cached.
|
|
|
|
- Address list summarization.
|
|
|
|
ipaddr supports easy summarization of lists of possibly contiguous
|
|
addresses, as this is something network administrators constantly find
|
|
themselves doing. This currently works in a number of ways.
|
|
|
|
1. collapse_address_list([list]):
|
|
|
|
Given a list of networks, ipaddr will collapse the list into the smallest
|
|
possible list of networks that wholey contain the addresses supplied.
|
|
|
|
In [1]: collapse_address_list([IPNetwork('1.1.0.0/24'),
|
|
...: IPNetwork('1.1.1.0/24')])
|
|
Out[1]: [IPv4Network('1.1.0.0/23')]
|
|
|
|
more elaborately:
|
|
|
|
In [1]: collapse_address_list([IPNetwork(x) for x in
|
|
...: IPNetwork('1.1.0.0/23')])
|
|
Out[1]: [IPv4Network('1.1.0.0/23')]
|
|
|
|
2. summarize_address_range(first, last).
|
|
|
|
Given a start and end address, ipaddr will provide the smallest number of
|
|
networks to cover the given range.
|
|
|
|
|
|
In [1]: summarize_address_range(IPv4Address('1.1.1.0'),
|
|
...: IPv4Address('2.2.2.0'))
|
|
Out[1]:
|
|
[IPv4Network('1.1.1.0/24'),
|
|
IPv4Network('1.1.2.0/23'),
|
|
IPv4Network('1.1.4.0/22'),
|
|
IPv4Network('1.1.8.0/21'),
|
|
IPv4Network('1.1.16.0/20'),
|
|
IPv4Network('1.1.32.0/19'),
|
|
IPv4Network('1.1.64.0/18'),
|
|
IPv4Network('1.1.128.0/17'),
|
|
IPv4Network('1.2.0.0/15'),
|
|
IPv4Network('1.4.0.0/14'),
|
|
IPv4Network('1.8.0.0/13'),
|
|
IPv4Network('1.16.0.0/12'),
|
|
IPv4Network('1.32.0.0/11'),
|
|
IPv4Network('1.64.0.0/10'),
|
|
IPv4Network('1.128.0.0/9'),
|
|
IPv4Network('2.0.0.0/15'),
|
|
IPv4Network('2.2.0.0/23'),
|
|
IPv4Network('2.2.2.0/32')]
|
|
|
|
- Address Exclusion.
|
|
|
|
Used somewhat less often, but all the more annoying, is the case where an
|
|
programmer would want "all of the addresses in a newtork *except* these".
|
|
ipaddr performs this exclusion equally well for IPv4 and IPv6 networks
|
|
and collapses the resulting address list.
|
|
|
|
In [1]: IPNetwork('1.1.0.0/15').address_exclude(IPNetwork('1.1.1.0/24'))
|
|
Out[1]:
|
|
[IPv4Network('1.0.0.0/16'),
|
|
IPv4Network('1.1.0.0/24'),
|
|
IPv4Network('1.1.2.0/23'),
|
|
IPv4Network('1.1.4.0/22'),
|
|
IPv4Network('1.1.8.0/21'),
|
|
IPv4Network('1.1.16.0/20'),
|
|
IPv4Network('1.1.32.0/19'),
|
|
IPv4Network('1.1.64.0/18'),
|
|
IPv4Network('1.1.128.0/17')]
|
|
|
|
In [1]: IPNewtork('::1/96').address_exclude(IPNetwork('::1/112'))
|
|
Out[1]:
|
|
[IPv6Network('::1:0/112'),
|
|
IPv6Network('::2:0/111'),
|
|
IPv6Network('::4:0/110'),
|
|
IPv6Network('::8:0/109'),
|
|
IPv6Network('::10:0/108'),
|
|
IPv6Network('::20:0/107'),
|
|
IPv6Network('::40:0/106'),
|
|
IPv6Network('::80:0/105'),
|
|
IPv6Network('::100:0/104'),
|
|
IPv6Network('::200:0/103'),
|
|
IPv6Network('::400:0/102'),
|
|
IPv6Network('::800:0/101'),
|
|
IPv6Network('::1000:0/100'),
|
|
IPv6Network('::2000:0/99'),
|
|
IPv6Network('::4000:0/98'),
|
|
IPv6Network('::8000:0/97')]
|
|
|
|
- IPv6 address compression.
|
|
|
|
By default, IPv6 addresses are compressed internally (see the method
|
|
BaseV6._compress_hextets), but ipaddr makes both the compressed and the
|
|
exploded representations available.
|
|
|
|
In [1]: IPNetwork('::1').compressed
|
|
Out[1]: '::1/128'
|
|
|
|
In [2]: IPNetwork('::1').exploded
|
|
Out[2]: '0000:0000:0000:0000:0000:0000:0000:1/128'
|
|
|
|
In [3]: IPv6Address('::1').exploded
|
|
Out[3]: '0000:0000:0000:0000:0000:0000:0000:0001'
|
|
|
|
In [4]: IPv6Address('::1').compressed
|
|
Out[4]: '::1'
|
|
|
|
(the same methods exist for IPv4 networks and addresses, but they're
|
|
just stubs for returning the normal __str__ representation).
|
|
|
|
- Most other common operations.
|
|
|
|
It is a design goal to support all of the common operation expected from
|
|
an IP address manipulation module. As such, finding supernets, subnets,
|
|
address and network containment etc are all supported.
|
|
|
|
Reference Implementation:
|
|
|
|
A reference implementation is available at:
|
|
http://ipaddr-py.googlecode.com/svn/trunk
|
|
|
|
|
|
References:
|
|
|
|
[1] http://bugs.python.org/issue3959
|
|
[2] Appealing to authority is a logical fallacy, but Vint Cerf is an
|
|
an authority who can't be ignored. Full text of the email follows:
|
|
|
|
"""
|
|
I have seen a substantial amount of traffic about IPv4 and IPv6
|
|
comparisons and the general consensus is that these are not comparable.
|
|
|
|
If we were to take a very simple minded view, we might treat these as
|
|
pure integers in which case there is an ordering but not a useful one.
|
|
|
|
In the IPv4 world, "length" is important because we take longest (most
|
|
specific) address first for routing. Length is determine by the mask,
|
|
as you know.
|
|
|
|
Assuming that the same style of argument works in IPv6, we would have
|
|
to conclude that treating an IPv6 value purely as an integer for
|
|
comparison with IPv4 would lead to some really strange results.
|
|
|
|
All of IPv4 space would lie in the host space of 0::0/96 prefix of
|
|
IPv6. For any useful interpretation of IPv4, this is a non-starter.
|
|
|
|
I think the only sensible conclusion is that IPv4 values and IPv6 values
|
|
should be treated as non-comparable.
|
|
|
|
Vint
|
|
"""
|
|
|
|
[3] http://docs.python.org/dev/3.0/whatsnew/3.0.html#ordering-comparisons
|
|
|
|
|
|
Copyright:
|
|
|
|
This document has been placed in the public domain.
|
|
|
|
|
|
|
|
Local Variables:
|
|
mode: indented-text
|
|
indent-tabs-mode: nil
|
|
sentence-end-double-space: t
|
|
fill-column: 70
|
|
coding: utf-8
|
|
End:
|