updated PEP 376 to reflect the prototype API + more details
parent
51b7df09e2
commit
58cbf93c36
300
pep-0376.txt
|
@@ -18,18 +18,20 @@ This PEP proposes various enhancements for Distutils:
|
|||
|
||||
- A new format for the .egg-info structure.
|
||||
- Some APIs to read the meta-data of a project
|
||||
- Replace PEP 262
|
||||
- An uninstall feature
|
||||
|
||||
Definitions
|
||||
===========
|
||||
|
||||
A **project** is a Python application composed of one or several files, which can
|
||||
be Python modules, extensions or data. It is distributed using a `setup.py` script
|
||||
with Distutils and/or Setuptools. The `setup.py` script indicates where each
|
||||
be Python modules, extensions or data. It is distributed using a `setup.py` script
|
||||
with Distutils and/or Setuptools. The `setup.py` script indicates where each
|
||||
element should be installed.
|
||||
|
||||
Once installed, the elements are located in various places in the system, like:
|
||||
|
||||
- in Python's site-packages (Python modules, Python modules organized into packages,
|
||||
- in Python's site-packages (Python modules, Python modules organized into packages,
|
||||
Extensions, etc.)
|
||||
- in Python's `include` directory.
|
||||
- in Python's `bin` or `Scripts` directory.
|
||||
|
@@ -46,16 +48,16 @@ There are two problems right now in the way projects are installed in Python:
|
|||
How projects are installed
|
||||
--------------------------
|
||||
|
||||
Right now, when a project is installed in Python, every element it contains
|
||||
is installed in various directories.
|
||||
Right now, when a project is installed in Python, every element it contains
|
||||
is installed in various directories.
|
||||
|
||||
The pure Python code for instance is installed in the `purelib` directory,
|
||||
which is located in the Python installation in `lib/python2.6/site-packages`
|
||||
for example under unix-like systems or Mac OS X, and in `Lib/site-packages`
|
||||
for example under unix-like systems or Mac OS X, and in `Lib/site-packages`
|
||||
under Windows. This is done with the Distutils `install` command, which calls
|
||||
various subcommands.
|
||||
|
||||
The `install_egg_info` subcommand is called during this process, in order to
|
||||
The `install_egg_info` subcommand is called during this process, in order to
|
||||
create an `.egg-info` file in the `purelib` directory.
|
||||
|
||||
For example, if the `zlib` project (which contains one package) is installed,
|
||||
|
@@ -67,17 +69,17 @@ two elements will be installed in `site-packages`::
|
|||
Where `zlib` is a Python package, and `zlib-2.5.2-py2.4.egg-info` is
|
||||
a file containing the project metadata as described in PEP 314 [#pep314]_.
|
||||
|
||||
This file corresponds to the file called `PKG-INFO`, built by
|
||||
This file corresponds to the file called `PKG-INFO`, built by
|
||||
the `sdist` command.
|
||||
|
||||
The problem is that many people use `easy_install` (setuptools [#setuptools]_)
|
||||
or `pip` [#pip]_ to install their packages, and these third-party tools do not
|
||||
The problem is that many people use `easy_install` (setuptools [#setuptools]_)
|
||||
or `pip` [#pip]_ to install their packages, and these third-party tools do not
|
||||
install packages in the same way that Distutils does:
|
||||
|
||||
- `easy_install` creates an `EGG-INFO` directory inside an `.egg` directory,
|
||||
and adds a `PKG-INFO` file inside this directory. The `.egg` directory
|
||||
- `easy_install` creates an `EGG-INFO` directory inside an `.egg` directory,
|
||||
and adds a `PKG-INFO` file inside this directory. The `.egg` directory
|
||||
contains in that case all the elements of the project that are supposed to
|
||||
be installed in `site-packages`, and is placed in the `site-packages`
|
||||
be installed in `site-packages`, and is placed in the `site-packages`
|
||||
directory.
|
||||
|
||||
- `pip` creates an `.egg-info` directory inside the `site-packages` directory
|
||||
|
@@ -97,12 +99,12 @@ were installed. Then look over the `.pth` file to clean them if necessary.
|
|||
And the process differs, depending on the tools you have used to install the
|
||||
project, and if the project's `setup.py` uses Distutils or Setuptools.
|
||||
|
||||
Under some circumstances, you might not be able to know for sure that you
|
||||
Under some circumstances, you might not be able to know for sure that you
|
||||
have removed everything, or that you didn't break another project by
|
||||
removing a file that was shared among the two projects.
|
||||
removing a file that was shared among several projects.
|
||||
|
||||
But there's common behavior: when you install a project, files are copied
|
||||
in your system. And there's a way to keep track of these files, so as to remove
|
||||
But there's common behavior: when you install a project, files are copied
|
||||
in your system. And there's a way to keep track of these files, so as to remove
|
||||
them.
|
||||
|
||||
What this PEP proposes
|
||||
|
@@ -110,23 +112,29 @@ What this PEP proposes
|
|||
|
||||
To address those issues, this PEP proposes a few changes:
|
||||
|
||||
- a new `.egg-info` structure using a directory;
|
||||
- a list of elements this directory holds;
|
||||
- new functions in `pkgutil` to be able to query the information
|
||||
of installed projects.
|
||||
- a new `.egg-info` structure using a directory, based on the `EggFormats`
|
||||
standard from `setuptools` [#eggformats]_.
|
||||
- new APIs in `pkgutil` to be able to query the information of installed
|
||||
projects.
|
||||
- a de-facto replacement for PEP 262
|
||||
- an uninstall function in Distutils.
|
||||
|
||||
|
||||
.egg-info becomes a directory
|
||||
=============================
|
||||
|
||||
The first change would be to make `.egg-info` a directory and let it
|
||||
hold the `PKG-INFO` file built by the `write_pkg_file` method of
|
||||
hold the `PKG-INFO` file built by the `write_pkg_file` method of
|
||||
the `Distribution` class in Distutils.
|
||||
|
||||
This change will not impact Python itself, because `egg-info` files are not
|
||||
used anywhere yet in the standard library besides Distutils.
|
||||
Notice that this change is based on the standard proposed by `EggFormats`.
|
||||
You may refer to its documentation for more information.
|
||||
|
||||
It will impact the `setuptools` and `pip` projects, but given
|
||||
the fact that they already work with a directory that contains a `PKG-INFO`
|
||||
This change will not impact Python itself, because `egg-info` files are not
|
||||
used anywhere yet in the standard library besides Distutils.
|
||||
|
||||
It will impact the `setuptools` and `pip` projects, but given
|
||||
the fact that they already work with a directory that contains a `PKG-INFO`
|
||||
file, the change will have no deep consequences.
|
||||
|
||||
For example, if the `zlib` package is installed, the elements that
|
||||
|
@@ -136,32 +144,53 @@ will be installed in `site-packages` will become::
|
|||
- zlib-2.5.2.egg-info/
|
||||
PKG-INFO
|
||||
|
||||
The Python version will also be removed from the `.egg-info` directory
|
||||
name.
|
||||
The syntax of the egg-info directory name is as follows::
|
||||
|
||||
Adding a RECORD in the .egg-info directory
|
||||
==========================================
|
||||
name + '-' + version + '.egg-info'
|
||||
|
||||
The egg-info directory name is created using a new function called
|
||||
``egg_info_dirname(name, version)`` added to ``pkgutil``. ``name`` is
|
||||
converted to a standard distribution name: any runs of non-alphanumeric
|
||||
characters are replaced with a single '-'. ``version`` is converted
|
||||
to a standard version string. Spaces become dots, and all other
|
||||
non-alphanumeric characters become dashes, with runs of multiple dashes
|
||||
condensed to a single dash. Both attributes are then converted into their
|
||||
filename-escaped form. Any '-' characters are currently replaced with '_'.
|
||||
|
||||
Examples::
|
||||
|
||||
>>> egg_info_dirname('zlib', '2.5.2')
|
||||
'zlib-2.5.2.egg-info'
|
||||
|
||||
>>> egg_info_dirname('python-ldap', '2.5')
|
||||
'python_ldap-2.5.egg-info'
|
||||
|
||||
>>> egg_info_dirname('python-ldap', '2.5 a---5')
|
||||
'python_ldap-2.5.a_5.egg-info'
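
The naming rules above can be sketched as follows. This is only an illustration, not the prototype's code: it is written for Python 3 (the examples in this PEP use Python 2 syntax), the helper ``_to_filename`` is a hypothetical name, and it assumes that '.' is preserved during normalization, which the ``'2.5.2'`` example requires.

```python
import re


def _to_filename(value):
    # Filename-escaped form: '-' characters are replaced with '_'.
    return value.replace('-', '_')


def egg_info_dirname(name, version):
    # Runs of characters other than letters, digits and '.' become one '-'.
    # (Keeping '.' is an assumption; see the lead-in.)
    safe_name = re.sub(r'[^A-Za-z0-9.]+', '-', name)
    # Spaces become dots first, then the same normalization is applied.
    safe_version = re.sub(r'[^A-Za-z0-9.]+', '-', version.replace(' ', '.'))
    return '%s-%s.egg-info' % (_to_filename(safe_name),
                               _to_filename(safe_version))


print(egg_info_dirname('zlib', '2.5.2'))
print(egg_info_dirname('python-ldap', '2.5 a---5'))
```

Under these assumptions the sketch reproduces the three example directory names listed above.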
|
||||
|
||||
Adding a RECORD file in the .egg-info directory
|
||||
===============================================
|
||||
|
||||
A `RECORD` file will be added inside the `.egg-info` directory at installation
|
||||
time. The `RECORD` file will hold the list of installed files. These correspond
|
||||
to the files listed by the `record` option of the `install` command, and will
|
||||
always be generated. This will allow uninstallation, as explained later in this
|
||||
time. The `RECORD` file will hold the list of installed files. These correspond
|
||||
to the files listed by the `record` option of the `install` command, and will
|
||||
always be generated. This will allow uninstallation, as explained later in this
|
||||
PEP. This RECORD file is inspired by PEP 262 FILES [#pep262]_.
|
||||
|
||||
The RECORD format
|
||||
-----------------
|
||||
|
||||
The `RECORD` file is composed of records, one line per installed file.
|
||||
Each record is composed of three elements separated by a <tab> character:
|
||||
The `RECORD` file is a CSV-like file, composed of records, one line per
|
||||
installed file. Each record is composed of three elements.
|
||||
|
||||
- the file's full **path**
|
||||
|
||||
- if the installed file is located in the directory where the .egg-info
|
||||
directory of the package is located, it will be a '/'-separated relative
|
||||
path, no matter what the target system is. This makes this information
|
||||
directory of the package is located, it will be a '/'-separated relative
|
||||
path, no matter what the target system is. This makes this information
|
||||
cross-compatible and allows simple installation to be relocatable.
|
||||
|
||||
- if the installed file is located elsewhere in the system, a
|
||||
- if the installed file is located elsewhere in the system, a
|
||||
'/'-separated absolute path is used.
|
||||
|
||||
- the **MD5** hash of the file, encoded in hex. Notice that `pyc` and `pyo`
|
||||
|
@@ -169,6 +198,10 @@ Each record is composed of three elements separated by a <tab> character:
|
|||
|
||||
- the file's size in bytes
|
||||
|
||||
The ``csv`` module with its default options will be used to generate this file,
|
||||
so the field separator will be ",". Any "," characters found within a field
|
||||
will be escaped automatically by ``csv``.
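
As a rough illustration of the format described in this section, the sketch below builds RECORD content with ``csv.writer`` and ``hashlib.md5``. The function names ``record_row`` and ``build_record`` are hypothetical, the code targets Python 3, and it assumes the RECORD file lists itself by path only, as in the example in this PEP.

```python
import csv
import hashlib
import io
import os
import tempfile


def record_row(base_dir, relpath):
    """Return (path, md5 hex digest, size in bytes) for one installed file."""
    with open(os.path.join(base_dir, *relpath.split('/')), 'rb') as f:
        data = f.read()
    return (relpath, hashlib.md5(data).hexdigest(), len(data))


def build_record(base_dir, relpaths, record_relpath):
    """Build the RECORD text using the csv module's default options."""
    out = io.StringIO()
    writer = csv.writer(out)  # default field separator is ','
    for relpath in relpaths:
        writer.writerow(record_row(base_dir, relpath))
    # Assumption: RECORD cannot hash itself, so only its path is written.
    writer.writerow((record_relpath,))
    return out.getvalue()


# Small demo in a temporary directory standing in for site-packages.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, 'pkg'))
for filename in ('plain.txt', 'with,comma.txt'):
    with open(os.path.join(base, 'pkg', filename), 'wb') as f:
        f.write(b'hello')

record_text = build_record(
    base, ['pkg/plain.txt', 'pkg/with,comma.txt'], 'pkg-1.0.egg-info/RECORD')
record_lines = record_text.splitlines()
print(record_text)
```

The second file name contains a comma on purpose: with the default options, ``csv`` quotes that field so the line still parses back into three elements.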
|
||||
|
||||
Example
|
||||
-------
|
||||
|
||||
|
@@ -181,116 +214,166 @@ Back to our `zlib` example, we will have::
|
|||
|
||||
And the RECORD file will contain::
|
||||
|
||||
zlib/include/zconf.h b690274f621402dda63bf11ba5373bf2 9544
|
||||
zlib/include/zlib.h 9c4b84aff68aa55f2e9bf70481b94333 66188
|
||||
zlib/lib/libz.a e6d43fb94292411909404b07d0692d46 91128
|
||||
zlib/share/man/man3/zlib.3 785dc03452f0508ff0678fba2457e0ba 4486
|
||||
zlib-2.5.2.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195
|
||||
zlib/include/zconf.h,b690274f621402dda63bf11ba5373bf2,9544
|
||||
zlib/include/zlib.h,9c4b84aff68aa55f2e9bf70481b94333,66188
|
||||
zlib/lib/libz.a,e6d43fb94292411909404b07d0692d46,91128
|
||||
zlib/share/man/man3/zlib.3,785dc03452f0508ff0678fba2457e0ba,4486
|
||||
zlib-2.5.2.egg-info/PKG-INFO,6fe57de576d749536082d8e205b77748,195
|
||||
zlib-2.5.2.egg-info/RECORD
|
||||
|
||||
Notice that:
|
||||
|
||||
- the `RECORD` file can't contain a hash of itself and is just mentioned here
|
||||
- `zlib` and `zlib-2.5.2.egg-info` are located in `site-packages` so the file
|
||||
- `zlib` and `zlib-2.5.2.egg-info` are located in `site-packages` so the file
|
||||
paths are relative to it.
|
||||
|
||||
New functions in pkgutil
|
||||
========================
|
||||
New APIs in pkgutil
|
||||
===================
|
||||
|
||||
To use the `.egg-info` directory content, we need to add in the standard
|
||||
To use the `.egg-info` directory content, we need to add in the standard
|
||||
library a set of APIs. The best place to put these APIs seems to be `pkgutil`.
|
||||
|
||||
The new functions added in the package are :
|
||||
EggInfo class
|
||||
-------------
|
||||
|
||||
- get_projects() -> iterator
|
||||
A new class called ``EggInfo`` is created, which provides the following
|
||||
attributes:
|
||||
|
||||
Provides an iterator that will return (name, path) tuples, where `name`
|
||||
is the name of a registered project and `path` the path to its `egg-info`
|
||||
directory.
|
||||
- ``name``: The name of the project
|
||||
|
||||
- get_egg_info(project_name) -> path or None
|
||||
- ``metadata``: A ``DistributionMetadata`` instance loaded with the project's
|
||||
PKG-INFO file
|
||||
|
||||
Scans all elements in `sys.path` and looks for all directories ending with
|
||||
`.egg-info`. Returns the directory path that contains a PKG-INFO that matches
|
||||
`project_name` for the `name` metadata. Notice that there should be at most
|
||||
one result. The first result found will be returned.
|
||||
The following methods are provided:
|
||||
|
||||
If the directory is not found, returns None.
|
||||
- ``get_installed_files(local=False)`` -> iterator of (path, md5, size)
|
||||
|
||||
XXX The implementation of `get_egg_info` will focus on minimizing the I/O
|
||||
accesses.
|
||||
Iterates over the `RECORD` entries and returns a tuple ``(path, md5, size)``
|
||||
for each line. If ``local`` is ``True``, the path is transformed into a
|
||||
local absolute path. Otherwise the raw value from `RECORD` is returned.
|
||||
|
||||
- get_metadata(project_name) -> DistributionMetadata or None
|
||||
- ``uses(path)`` -> Boolean
|
||||
|
||||
Uses `get_egg_info` to get the `PKG-INFO` file, and returns a
|
||||
`DistributionMetadata` instance that contains the metadata.
|
||||
Returns ``True`` if ``path`` is listed in `RECORD`. ``path``
|
||||
can be a local absolute path or a relative '/'-separated path.
|
||||
|
||||
- get_files(project_name, local=False) -> iterator of (path, hash, size,
|
||||
other_projects)
|
||||
- ``owns(path)`` -> Boolean
|
||||
|
||||
Uses `get_egg_info` to get the `RECORD` file, and returns an iterator.
|
||||
Returns ``True`` if ``path`` is owned by the project.
|
||||
Owned means that the path is used only by this project and is not used
|
||||
by any other project. ``path`` can be a local absolute path or a relative
|
||||
'/'-separated path.
|
||||
|
||||
Each returned element is a tuple `(path, hash, size, other_projects)` where
|
||||
``path``, ``hash``, ``size`` are the values found in the RECORD file.
|
||||
- ``get_file(path, binary=False)`` -> file object
|
||||
|
||||
`path` is the raw value found in the RECORD file. If `local` is
|
||||
set to True, `path` will be translated to its real absolute path, using
|
||||
the local path separator.
|
||||
Returns a ``file`` instance for the file pointed to by ``path``. ``path`` can be
|
||||
a local absolute path or a relative '/'-separated path. If ``binary`` is
|
||||
``True``, opens the file in binary mode.
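
To make the method descriptions above more concrete, here is a minimal sketch of how ``get_installed_files`` and ``uses`` might read a CSV-style `RECORD` file. The class name ``EggInfoSketch`` and the helper ``_to_local`` are hypothetical, the code targets Python 3, and it is not the prototype implementation. It assumes that missing hash or size fields are reported as ``None``.

```python
import csv
import os
import tempfile


class EggInfoSketch:
    """Hypothetical stand-in for the proposed EggInfo class (RECORD access only)."""

    def __init__(self, path):
        self.path = os.path.abspath(path)       # the .egg-info directory
        self.base = os.path.dirname(self.path)  # relative RECORD paths start here

    def _to_local(self, path):
        # Absolute '/'-separated paths are kept; relative ones are joined to base.
        if path.startswith('/'):
            return os.path.normpath(path)
        return os.path.normpath(os.path.join(self.base, *path.split('/')))

    def get_installed_files(self, local=False):
        with open(os.path.join(self.path, 'RECORD'), newline='') as f:
            for row in csv.reader(f):
                if not row:
                    continue
                row = row + [''] * (3 - len(row))  # pad rows such as RECORD's own
                path, md5, size = row[:3]
                if local:
                    path = self._to_local(path)
                yield path, md5 or None, int(size) if size else None

    def uses(self, path):
        # Accept either a local absolute path or a relative '/'-separated one.
        target = path if os.path.isabs(path) else self._to_local(path)
        target = os.path.normpath(target)
        return any(p == target
                   for p, _, _ in self.get_installed_files(local=True))


# Demo with a throwaway directory standing in for site-packages.
site = tempfile.mkdtemp()
egg_dir = os.path.join(site, 'zlib-2.5.2.egg-info')
os.makedirs(egg_dir)
with open(os.path.join(egg_dir, 'RECORD'), 'w', newline='') as f:
    f.write('zlib/include/zlib.h,9c4b84aff68aa55f2e9bf70481b94333,66188\r\n')
    f.write('zlib-2.5.2.egg-info/RECORD\r\n')

info = EggInfoSketch(egg_dir)
entries = list(info.get_installed_files())
print(entries)
print(info.uses('zlib/include/zlib.h'))
```

``owns`` is left out of the sketch because it needs the `RECORD` files of every other installed project as well.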
|
||||
|
||||
`other_projects` is a tuple containing the name of the projects that are
|
||||
also referring to this file in their own RECORD file (same path).
|
||||
.egg-info functions
|
||||
-------------------
|
||||
|
||||
If `other_projects` is empty, it means that the file is only referred by the
|
||||
current project. In other words, it can be removed if the project is removed.
|
||||
The new functions added in ``pkgutil`` are:
|
||||
|
||||
- get_egg_info_file(project_name, path, binary=False) -> file object or None
|
||||
- ``get_egg_infos()`` -> iterator
|
||||
|
||||
Uses `get_egg_info` and gets any element inside the directory,
|
||||
pointed by its relative path. `get_egg_info_file` will perform
|
||||
an `os.path.join` on `get_egg_info(project_name)` and `path` to build the
|
||||
whole path.
|
||||
Provides an iterator that looks for ``.egg-info`` directories in ``sys.path``
|
||||
and returns ``EggInfo`` instances for each one of them.
|
||||
|
||||
`path` can be a '/'-separated path or can use the local separator.
|
||||
`get_egg_info_file` will automatically convert it using the platform path
|
||||
separator, to look for the file.
|
||||
- ``get_egg_info(project_name)`` -> ``EggInfo`` or None
|
||||
|
||||
If `binary` is set True, the file will be opened using the binary mode.
|
||||
Scans all elements in ``sys.path`` and looks for all directories ending with
|
||||
``.egg-info``. Returns an ``EggInfo`` corresponding to the ``.egg-info``
|
||||
directory that contains a PKG-INFO that matches `project_name` for the `name`
|
||||
metadata.
|
||||
|
||||
Let's use it with our `zlib` example::
|
||||
Notice that there should be at most one result. The first result found
|
||||
will be returned. If the directory is not found, returns None.
|
||||
|
||||
- ``get_file_users(path)`` -> iterator of ``EggInfo`` instances.
|
||||
|
||||
Iterates over all projects to find out which project uses ``path``.
|
||||
``path`` can be a local absolute path or a relative '/'-separated path.
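
The scanning behaviour described for these functions could look roughly like the sketch below. It is only an illustration in Python 3: ``find_egg_info_dirs``, ``recorded_paths`` and ``file_users`` are hypothetical names, the search path is passed explicitly instead of reading ``sys.path``, directory paths are returned instead of ``EggInfo`` instances, and only relative '/'-separated paths are compared.

```python
import csv
import os
import tempfile


def find_egg_info_dirs(search_path):
    """Yield every '.egg-info' directory found directly under the given entries."""
    for entry in search_path:
        if not os.path.isdir(entry):
            continue
        for name in sorted(os.listdir(entry)):
            full = os.path.join(entry, name)
            if name.endswith('.egg-info') and os.path.isdir(full):
                yield full


def recorded_paths(egg_info_dir):
    """Return the set of paths listed in a project's RECORD file."""
    record = os.path.join(egg_info_dir, 'RECORD')
    if not os.path.exists(record):
        return set()
    with open(record, newline='') as f:
        return {row[0] for row in csv.reader(f) if row}


def file_users(path, search_path):
    """Return the '.egg-info' directories whose RECORD lists ``path``."""
    return [d for d in find_egg_info_dirs(search_path)
            if path in recorded_paths(d)]


# Demo: two made-up projects sharing one file.
site = tempfile.mkdtemp()
records = {
    'a-1.0.egg-info': ['a/__init__.py', 'shared/data.txt'],
    'b-2.0.egg-info': ['b/__init__.py', 'shared/data.txt'],
}
for dirname, paths in records.items():
    os.makedirs(os.path.join(site, dirname))
    with open(os.path.join(site, dirname, 'RECORD'), 'w', newline='') as f:
        csv.writer(f).writerows([p, '', ''] for p in paths)

shared_users = [os.path.basename(d)
                for d in file_users('shared/data.txt', [site])]
single_users = [os.path.basename(d)
                for d in file_users('a/__init__.py', [site])]
print(shared_users, single_users)
```

A file with exactly one user corresponds to what the ``owns`` method calls an owned path.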
|
||||
|
||||
Cache functions
|
||||
---------------
|
||||
|
||||
The functions from the previous section work with a global memory cache to
|
||||
reduce the number of I/O accesses and speed up the lookups.
|
||||
|
||||
The cache can be managed with these functions:
|
||||
|
||||
- ``purge_cache``: removes all entries from cache.
|
||||
- ``cache_enabled``: returns ``True`` if the cache is enabled.
|
||||
- ``enable_cache``: enables the cache.
|
||||
- ``disable_cache``: disables the cache.
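
One possible shape for such a cache is sketched below, in Python 3. The four function names come from the list above; the module-level dictionary, the ``cached`` helper and the ``slow_lookup`` function are assumptions made for the illustration, not part of the proposal.

```python
_cache = {}
_cache_enabled = True


def enable_cache():
    global _cache_enabled
    _cache_enabled = True


def disable_cache():
    global _cache_enabled
    _cache_enabled = False


def cache_enabled():
    return _cache_enabled


def purge_cache():
    _cache.clear()


def cached(key, loader):
    """Hypothetical helper: return loader(key), reusing a stored result if allowed."""
    if _cache_enabled and key in _cache:
        return _cache[key]
    value = loader(key)
    if _cache_enabled:
        _cache[key] = value
    return value


calls = []


def slow_lookup(name):
    # Stands in for scanning sys.path; records each real lookup.
    calls.append(name)
    return name.upper()


cached('zlib', slow_lookup)
cached('zlib', slow_lookup)      # served from the cache
calls_with_cache = len(calls)
purge_cache()
cached('zlib', slow_lookup)      # looked up again after the purge
disable_cache()
cached('zlib', slow_lookup)      # cache bypassed
cached('zlib', slow_lookup)      # cache bypassed
print(calls_with_cache, len(calls))
```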
|
||||
|
||||
Example
|
||||
-------
|
||||
|
||||
Let's use some of the new APIs with our `zlib` example::
|
||||
|
||||
>>> from pkgutil import get_egg_info, get_file_users
|
||||
>>> egg_info = get_egg_info('zlib')
|
||||
>>> egg_info.name
|
||||
'zlib'
|
||||
>>> egg_info.metadata.version
|
||||
'2.5.2'
|
||||
|
||||
>>> from pkgutil import (get_egg_info, get_metadata, get_egg_info_file,
|
||||
... get_files)
|
||||
>>> get_egg_info('zlib')
|
||||
'/opt/local/lib/python2.6/site-packages/zlib-2.5.2.egg-info'
|
||||
>>> metadata = get_metadata('zlib')
|
||||
>>> metadata.version
|
||||
'2.5.2'
|
||||
>>> get_egg_info_file('zlib', 'PKG-INFO').read()
|
||||
some
|
||||
...
|
||||
files
|
||||
>>> for path, hash, size, other_projects in get_files('zlib'):
|
||||
... print '%s %s %d %s' % (path, hash, size, ','.join(other_projects))
|
||||
|
||||
>>> for path, hash, size in egg_info.get_installed_files():
|
||||
...     print '%s %s %s' % (path, hash, size)
|
||||
...
|
||||
zlib/include/zconf.h b690274f621402dda63bf11ba5373bf2 9544
|
||||
zlib/include/zlib.h 9c4b84aff68aa55f2e9bf70481b94333 66188
|
||||
zlib/lib/libz.a e6d43fb94292411909404b07d0692d46 91128
|
||||
zlib/share/man/man3/zlib.3 785dc03452f0508ff0678fba2457e0ba 4486
|
||||
zlib-2.5.2.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195
|
||||
zlib-2.5.2.egg-info/RECORD None None
|
||||
zlib/lib/libz.a e6d43fb94292411909404b07d0692d46 91128
|
||||
zlib/share/man/man3/zlib.3 785dc03452f0508ff0678fba2457e0ba 4486
|
||||
zlib-2.5.2.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195
|
||||
zlib-2.5.2.egg-info/RECORD None None
|
||||
|
||||
>>> egg_info.uses('zlib/include/zlib.h')
|
||||
True
|
||||
>>> egg_info.owns('zlib/include/zlib.h')
|
||||
True
|
||||
|
||||
>>> egg_info.get_file('zlib/include/zlib.h')
|
||||
<open file at ...>
|
||||
|
||||
PEP 262 replacement
|
||||
===================
|
||||
|
||||
In the past an attempt was made to create an installation database (see PEP 262
|
||||
[#pep262]_).
|
||||
|
||||
Extract from PEP 262 Requirements:
|
||||
|
||||
" We need a way to figure out what distributions, and what versions of
|
||||
those distributions, are installed on a system..."
|
||||
|
||||
|
||||
Since the APIs proposed in the current PEP provide everything needed to meet
|
||||
this requirement, PEP 376 will replace PEP 262 and will become the official
|
||||
`installation database` standard.
|
||||
|
||||
The new version of PEP 345 (XXX work in progress) will extend the Metadata
|
||||
standard and will fulfill the requirements described in PEP 262, like the
|
||||
`REQUIRES` section.
|
||||
|
||||
Adding an Uninstall function
|
||||
============================
|
||||
|
||||
Distutils provides a very basic way to install a project, which is running
|
||||
Distutils already provides a very basic way to install a project, which is running
|
||||
the `install` command over the `setup.py` script of the distribution.
|
||||
|
||||
Distutils will provide a very basic ``uninstall`` function, that will be added
|
||||
in ``distutils.util`` and will take the name of the project to uninstall as
|
||||
its argument. ``uninstall`` will use ``pkgutil.get_files`` and remove all
|
||||
Distutils will provide a very basic ``uninstall`` function, that will be added
|
||||
in ``distutils.util`` and will take the name of the project to uninstall as
|
||||
its argument. ``uninstall`` will use the APIs described earlier and remove all
|
||||
unique files, as long as their hash didn't change. Then it will remove
|
||||
directories where it removed the last elements.
|
||||
empty directories left behind.
|
||||
|
||||
``uninstall`` will return a list of uninstalled files::
|
||||
|
||||
|
@@ -301,9 +384,9 @@ directories where it removed the last elements.
|
|||
|
||||
If the project is not found, a ``DistutilsUninstallError`` will be raised.
|
||||
|
||||
To make it a reference API for third-party projects that wish to control
|
||||
how `uninstall` works, a second callable argument can be used. It will be
|
||||
called for each file that is removed. If the callable returns `True`, the
|
||||
To make it a reference API for third-party projects that wish to control
|
||||
how `uninstall` works, a second callable argument can be used. It will be
|
||||
called for each file that is removed. If the callable returns `True`, the
|
||||
file will be removed. If it returns False, it will be left alone.
|
||||
|
||||
Examples::
|
||||
|
@@ -320,7 +403,7 @@ Examples::
|
|||
...
|
||||
>>> uninstall('zlib', _dry_run)
|
||||
|
||||
Of course, a third-party tool can use ``pkgutil.get_files``, to implement
|
||||
Of course, a third-party tool can use ``pkgutil`` APIs to implement
|
||||
its own uninstall feature.
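
The uninstall behaviour described in this section can be approximated as in the sketch below. It is written for Python 3 and is not the Distutils implementation: ``uninstall_sketch`` and its parameters are hypothetical, the project is identified by its `.egg-info` directory name instead of its project name, and the set of paths shared with other projects is passed in by hand instead of being computed with the ``pkgutil`` APIs.

```python
import csv
import hashlib
import os
import tempfile


def _md5(path):
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()


def uninstall_sketch(base_dir, egg_info_name, should_remove=None, shared=()):
    """Remove unmodified, unshared files listed in RECORD; return their paths."""
    base_dir = os.path.abspath(base_dir)
    record = os.path.join(base_dir, egg_info_name, 'RECORD')
    with open(record, newline='') as f:
        rows = [row for row in csv.reader(f) if row]
    removed = []
    for row in rows:
        relpath = row[0]
        md5 = row[1] if len(row) > 1 else ''
        full = os.path.join(base_dir, *relpath.split('/'))
        if relpath in shared or not os.path.isfile(full):
            continue
        if md5 and _md5(full) != md5:
            continue  # the file changed since installation: leave it alone
        if should_remove is not None and not should_remove(full):
            continue  # the callback vetoed the removal
        os.remove(full)
        removed.append(relpath)
        # Remove directories that became empty, never going above base_dir.
        parent = os.path.dirname(full)
        while parent != base_dir and not os.listdir(parent):
            os.rmdir(parent)
            parent = os.path.dirname(parent)
    return removed


# Demo: install a made-up project, modify one file, then uninstall.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, 'pkg'))
os.makedirs(os.path.join(base, 'demo-1.0.egg-info'))
files = {
    'pkg/a.py': b'x = 1\n',
    'pkg/b.py': b'y = 2\n',
    'demo-1.0.egg-info/PKG-INFO': b'Name: demo\n',
}
for rel, data in files.items():
    with open(os.path.join(base, *rel.split('/')), 'wb') as f:
        f.write(data)
with open(os.path.join(base, 'demo-1.0.egg-info', 'RECORD'), 'w', newline='') as f:
    writer = csv.writer(f)
    for rel, data in files.items():
        writer.writerow((rel, hashlib.md5(data).hexdigest(), len(data)))
    writer.writerow(('demo-1.0.egg-info/RECORD',))
with open(os.path.join(base, 'pkg', 'b.py'), 'wb') as f:
    f.write(b'y = 3\n')  # simulate a local modification

dry_run = uninstall_sketch(base, 'demo-1.0.egg-info',
                           should_remove=lambda path: False)
removed = uninstall_sketch(base, 'demo-1.0.egg-info')
print(dry_run, removed)
```

In the demo, the callback that always returns ``False`` acts as a dry run, and the modified ``pkg/b.py`` survives the real uninstall because its hash no longer matches the recorded one.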
|
||||
|
||||
Backward compatibility and roadmap
|
||||
|
@@ -349,6 +432,9 @@ References
|
|||
.. [#pip]
|
||||
http://pypi.python.org/pypi/pip
|
||||
|
||||
.. [#eggformats]
|
||||
http://peak.telecommunity.com/DevCenter/EggFormats
|
||||
|
||||
Acknowledgments
|
||||
===============
|
||||
|
||||
|
|