updated PEP 376 to reflect the prototype API + more details

Tarek Ziadé 2009-05-25 10:22:46 +00:00
parent 51b7df09e2
commit 58cbf93c36
1 changed files with 193 additions and 107 deletions

@ -18,6 +18,8 @@ This PEP proposes various enhancements for Distutils:
- A new format for the .egg-info structure. - A new format for the .egg-info structure.
- Some APIs to read the meta-data of a project - Some APIs to read the meta-data of a project
- Replace PEP 262
- An uninstall feature
Definitions Definitions
=========== ===========
@ -99,7 +101,7 @@ project, and if the project's `setup.py` uses Distutils or Setuptools.
Under some circumstances, you might not be able to know for sure that you Under some circumstances, you might not be able to know for sure that you
have removed everything, or that you didn't break another project by have removed everything, or that you didn't break another project by
removing a file that was shared among the two projects. removing a file that was shared among several projects.
But there's common behavior: when you install a project, files are copied But there's common behavior: when you install a project, files are copied
in your system. And there's a way to keep track of these files, so to remove in your system. And there's a way to keep track of these files, so to remove
@ -110,10 +112,13 @@ What this PEP proposes
To address those issues, this PEP proposes a few changes: To address those issues, this PEP proposes a few changes:
- a new `.egg-info` structure using a directory; - a new `.egg-info` structure using a directory, based on the `EggFormats`
- a list of elements this directory holds; standard from `setuptools` [#eggformats]_.
- new functions in `pkgutil` to be able to query the information - new APIs in `pkgutil` to be able to query the information of installed
of installed projects. projects.
- a de-facto replacement for PEP 262
- an uninstall function in Distutils.
.egg-info becomes a directory .egg-info becomes a directory
============================= =============================
@ -122,6 +127,9 @@ The first change would be to make `.egg-info` a directory and let it
hold the `PKG-INFO` file built by the `write_pkg_file` method of hold the `PKG-INFO` file built by the `write_pkg_file` method of
the `Distribution` class in Distutils. the `Distribution` class in Distutils.
Notice that this change is based on the standard proposed by `EggFormats`.
You may refer to its documentation for more information.
This change will not impact Python itself, because `egg-info` files are not This change will not impact Python itself, because `egg-info` files are not
used anywhere yet in the standard library besides Distutils. used anywhere yet in the standard library besides Distutils.
@ -136,11 +144,32 @@ will be installed in `site-packages` will become::
- zlib-2.5.2.egg-info/ - zlib-2.5.2.egg-info/
PKG-INFO PKG-INFO
The Python version will also be removed from the `.egg-info` directory The syntax of the egg-info directory name is as follows::
name.
Adding a RECORD in the .egg-info directory name + '-' + version + '.egg-info'
==========================================
The egg-info directory name is created using a new function called
``egg_info_dirname(name, version)`` added to ``pkgutil``. ``name`` is
converted to a standard distribution name: any runs of non-alphanumeric
characters are replaced with a single '-'. ``version`` is converted
to a standard version string. Spaces become dots, and all other
non-alphanumeric characters become dashes, with runs of multiple dashes
condensed to a single dash. Both attributes are then converted into their
filename-escaped form. Any '-' characters are currently replaced with '_'.
Examples::
>>> egg_info_dirname('zlib', '2.5.2')
'zlib-2.5.2.egg-info'
>>> egg_info_dirname('python-ldap', '2.5')
'python_ldap-2.5.egg-info'
>>> egg_info_dirname('python-ldap', '2.5 a---5')
'python_ldap-2.5.a_5.egg-info'
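The naming rules above can be sketched as follows. This is a minimal illustration written for current Python 3, not the prototype's actual code. It reads the text literally for ``name`` (every run of non-alphanumeric characters, dots included, becomes a single '-'), and it assumes dots are preserved in ``version``, which the examples require::

```python
import re


def egg_info_dirname(name, version):
    """Build the .egg-info directory name (illustrative sketch only)."""
    # Name: any run of non-alphanumeric characters collapses to one '-'.
    safe_name = re.sub(r'[^A-Za-z0-9]+', '-', name)
    # Version: spaces become dots; any other run of characters that is
    # neither alphanumeric nor a dot collapses to one '-'.
    safe_version = re.sub(r'[^A-Za-z0-9.]+', '-', version.replace(' ', '.'))
    # Filename-escape both parts, since '-' is the field separator.
    return '%s-%s.egg-info' % (safe_name.replace('-', '_'),
                               safe_version.replace('-', '_'))
```

Because '-' separates the name from the version, escaping it to '_' inside each part keeps the directory name unambiguous to split.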
Adding a RECORD file in the .egg-info directory
===============================================
A `RECORD` file will be added inside the `.egg-info` directory at installation A `RECORD` file will be added inside the `.egg-info` directory at installation
time. The `RECORD` file will hold the list of installed files. These correspond time. The `RECORD` file will hold the list of installed files. These correspond
@ -151,8 +180,8 @@ PEP. This RECORD file is inspired from PEP 262 FILES [#pep262]_.
The RECORD format The RECORD format
----------------- -----------------
The `RECORD` file is composed of records, one line per installed file. The `RECORD` file is a CSV-like file, composed of records, one line per
Each record is composed of three elements separated by a <tab> character: installed file. Each record is composed of three elements.
- the file's full **path** - the file's full **path**
@ -169,6 +198,10 @@ Each record is composed of three elements separated by a <tab> character:
- the file's size in bytes - the file's size in bytes
The ``csv`` module with its default options will be used to generate this file,
so the field separator will be ",". Any "," characters found within a field
will be escaped automatically by ``csv``.
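As an illustration of the format described above, the sketch below writes records with the ``csv`` module's default options. The helper names ``record_row`` and ``write_record`` are hypothetical and are not part of the proposed API; the code is written for current Python 3::

```python
import csv
import hashlib


def record_row(path, data):
    """Return a (path, md5, size) record for a file's content (bytes)."""
    return (path, hashlib.md5(data).hexdigest(), len(data))


def write_record(rows, fileobj):
    """Write records to ``fileobj`` using csv's default dialect."""
    # The default dialect separates fields with ',' and quotes any field
    # that itself contains a ','.
    writer = csv.writer(fileobj)
    for row in rows:
        writer.writerow(row)
```

A path containing a comma is written as a quoted field, so readers using ``csv`` recover it unchanged.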
Example Example
------- -------
@ -181,11 +214,11 @@ Back to our `zlib` example, we will have::
And the RECORD file will contain:: And the RECORD file will contain::
zlib/include/zconf.h b690274f621402dda63bf11ba5373bf2 9544 zlib/include/zconf.h,b690274f621402dda63bf11ba5373bf2,9544
zlib/include/zlib.h 9c4b84aff68aa55f2e9bf70481b94333 66188 zlib/include/zlib.h,9c4b84aff68aa55f2e9bf70481b94333,66188
zlib/lib/libz.a e6d43fb94292411909404b07d0692d46 91128 zlib/lib/libz.a,e6d43fb94292411909404b07d0692d46,91128
zlib/share/man/man3/zlib.3 785dc03452f0508ff0678fba2457e0ba 4486 zlib/share/man/man3/zlib.3,785dc03452f0508ff0678fba2457e0ba,4486
zlib-2.5.2.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195 zlib-2.5.2.egg-info/PKG-INFO,6fe57de576d749536082d8e205b77748,195
zlib-2.5.2.egg-info/RECORD zlib-2.5.2.egg-info/RECORD
Notice that: Notice that:
@ -194,83 +227,106 @@ Notice that:
- `zlib` and `zlib-2.5.2.egg-info` are located in `site-packages` so the file - `zlib` and `zlib-2.5.2.egg-info` are located in `site-packages` so the file
paths are relative to it. paths are relative to it.
New functions in pkgutil New APIs in pkgutil
======================== ===================
To use the `.egg-info` directory content, we need to add in the standard To use the `.egg-info` directory content, we need to add in the standard
library a set of APIs. The best place to put these APIs seems to be `pkgutil`. library a set of APIs. The best place to put these APIs seems to be `pkgutil`.
The new functions added in the package are : EggInfo class
-------------
- get_projects() -> iterator A new class called ``EggInfo`` is created, which provides the following
attributes:
Provides an iterator that will return (name, path) tuples, where `name` - ``name``: The name of the project
is the name of a registered project and `path` the path to its `egg-info`
directory.
- get_egg_info(project_name) -> path or None - ``metadata``: A ``DistributionMetadata`` instance loaded with the project's
PKG-INFO file
Scans all elements in `sys.path` and looks for all directories ending with The following methods are provided:
`.egg-info`. Returns the directory path that contains a PKG-INFO that matches
`project_name` for the `name` metadata. Notice that there should be at most
one result. The first result found will be returned.
If the directory is not found, returns None. - ``get_installed_files(local=False)`` -> iterator of (path, md5, size)
XXX The implementation of `get_egg_info` will focus on minimizing the I/O Iterates over the `RECORD` entries and returns a tuple ``(path, md5, size)``
accesses. for each line. If ``local`` is ``True``, the path is transformed into a
local absolute path. Otherwise the raw value from `RECORD` is returned.
- get_metadata(project_name) -> DistributionMetadata or None - ``uses(path)`` -> Boolean
Uses `get_egg_info` to get the `PKG-INFO` file, and returns a Returns ``True`` if ``path`` is listed in `RECORD`. ``path``
`DistributionMetadata` instance that contains the metadata. can be a local absolute path or a relative '/'-separated path.
- get_files(project_name, local=False) -> iterator of (path, hash, size, - ``owns(path)`` -> Boolean
other_projects)
Uses `get_egg_info` to get the `RECORD` file, and returns an iterator. Returns ``True`` if ``path`` is owned by the project.
Owned means that the path is used only by this project and is not used
by any other project. ``path`` can be a local absolute path or a relative
'/'-separated path.
Each returned element is a tuple `(path, hash, size, other_projects)` where - ``get_file(path, binary=False)`` -> file object
``path``, ``hash``, ``size`` are the values found in the RECORD file.
`path` is the raw value found in the RECORD file. If `local` is Returns a ``file`` instance for the file pointed to by ``path``. ``path`` can be
set to True, `path` will be translated to its real absolute path, using a local absolute path or a relative '/'-separated path. If ``binary`` is
the local path separator. ``True``, opens the file in binary mode.
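The behaviour described for ``get_installed_files`` could look roughly like the function below. It is a sketch under assumptions: ``iter_record`` and its ``site_packages`` argument are hypothetical names, the size is converted to an integer, and missing hash or size fields (as for the `RECORD` entry itself) are reported as ``None``::

```python
import csv
import os


def iter_record(record_file, site_packages=None):
    """Yield (path, md5, size) tuples from an open RECORD file.

    If ``site_packages`` is given, each '/'-separated path is turned
    into a local path under that directory; otherwise the raw value
    is returned.
    """
    for row in csv.reader(record_file):
        if not row:
            continue  # skip blank lines
        path = row[0]
        # Assumption: absent or empty fields are reported as None.
        md5 = row[1] if len(row) > 1 and row[1] else None
        size = int(row[2]) if len(row) > 2 and row[2] else None
        if site_packages is not None:
            path = os.path.join(site_packages, *path.split('/'))
        yield path, md5, size
```

A ``uses(path)`` check can then be built by comparing ``path`` against the first element of each yielded tuple.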
`other_projects` is a tuple containing the name of the projects that are .egg-info functions
also referring to this file in their own RECORD file (same path). -------------------
If `other_projects` is empty, it means that the file is only referred by the The new functions added in ``pkgutil`` are:
current project. In other words, it can be removed if the project is removed.
- get_egg_info_file(project_name, path, binary=False) -> file object or None - ``get_egg_infos()`` -> iterator
Uses `get_egg_info` and gets any element inside the directory, Provides an iterator that looks for ``.egg-info`` directories in ``sys.path``
pointed by its relative path. `get_egg_info_file` will perform and returns ``EggInfo`` instances for each one of them.
an `os.path.join` on `get_egg_info(project_name)` and `path` to build the
whole path.
`path` can be a '/'-separated path or can use the local separator. - ``get_egg_info(project_name)`` -> ``EggInfo`` or None
`get_egg_info_file` will automatically convert it using the platform path
separator, to look for the file.
If `binary` is set True, the file will be opened using the binary mode. Scans all elements in ``sys.path`` and looks for all directories ending with
``.egg-info``. Returns an ``EggInfo`` corresponding to the ``.egg-info``
directory that contains a PKG-INFO that matches `project_name` for the `name`
metadata.
Let's use it with our `zlib` example:: Notice that there should be at most one result. The first result found
will be returned. If the directory is not found, returns None.
- ``get_file_users(path)`` -> iterator of ``EggInfo`` instances.
Iterates over all projects to find out which project uses ``path``.
``path`` can be a local absolute path or a relative '/'-separated path.
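The relationship between "uses" and "owns" can be illustrated on in-memory data. In this sketch, ``records`` is a hypothetical mapping from project name to the set of relative paths listed in that project's `RECORD`; the real functions would work on ``EggInfo`` instances instead::

```python
def file_users(path, records):
    """Yield the names of all projects whose RECORD lists ``path``."""
    for project, paths in records.items():
        if path in paths:
            yield project


def owns(project, path, records):
    """Return True if ``project`` is the only project using ``path``."""
    return list(file_users(path, records)) == [project]
```

A file is only safe to remove on uninstall when ``owns`` is true for the project being removed.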
Cache functions
---------------
The functions from the previous section work with a global memory cache to
reduce the number of I/O accesses and speed up the lookups.
The cache can be managed with these functions:
- ``purge_cache``: removes all entries from cache.
- ``cache_enabled``: returns ``True`` if the cache is enabled.
- ``enable_cache``: enables the cache.
- ``disable_cache``: disables the cache.
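One possible shape for such a cache is a module-level dictionary guarded by a flag. The four management functions below use the names listed above; ``cached_lookup`` and its ``loader`` argument are hypothetical helpers added for illustration, and whether disabling the cache should also purge it is left open here::

```python
_cache = {}
_cache_enabled = True


def enable_cache():
    global _cache_enabled
    _cache_enabled = True


def disable_cache():
    global _cache_enabled
    _cache_enabled = False


def cache_enabled():
    return _cache_enabled


def purge_cache():
    _cache.clear()


def cached_lookup(key, loader):
    """Return loader(key), reusing a stored result when caching is on."""
    if not _cache_enabled:
        return loader(key)
    if key not in _cache:
        _cache[key] = loader(key)
    return _cache[key]
```

With this layout, a lookup function only touches the file system the first time a given key is requested while the cache is enabled.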
Example
-------
Let's use some of the new APIs with our `zlib` example::
>>> from pkgutil import get_egg_info, get_file_users
>>> egg_info = get_egg_info('zlib')
>>> egg_info.name
'zlib'
>>> egg_info.metadata.version
'2.5.2'
>>> from pkgutil import (get_egg_info, get_metadata, get_egg_info_file,
... get_files)
>>> get_egg_info('zlib')
'/opt/local/lib/python2.6/site-packages/zlib-2.5.2.egg-info' '/opt/local/lib/python2.6/site-packages/zlib-2.5.2.egg-info'
>>> metadata = get_metadata('zlib') >>> metadata = get_metadata('zlib')
>>> metadata.version >>> metadata.version
'2.5.2' '2.5.2'
>>> get_egg_info_file('zlib', 'PKG-INFO').read()
some >>> for path, hash, size in egg_info.get_installed_files():
... ... print '%s %s %s' % (path, hash, size)
files
>>> for path, hash, size, other_projects in get_files('zlib'):
... print '%s %s %d %s' % (path, hash, size, ','.join(other_projects))
... ...
zlib/include/zconf.h b690274f621402dda63bf11ba5373bf2 9544 zlib/include/zconf.h b690274f621402dda63bf11ba5373bf2 9544
zlib/include/zlib.h 9c4b84aff68aa55f2e9bf70481b94333 66188 zlib/include/zlib.h 9c4b84aff68aa55f2e9bf70481b94333 66188
@ -279,18 +335,45 @@ Let's use it with our `zlib` example::
zlib-2.5.2.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195 zlib-2.5.2.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195
zlib-2.5.2.egg-info/RECORD None None zlib-2.5.2.egg-info/RECORD None None
>>> egg_info.uses('zlib/include/zlib.h')
True
>>> egg_info.owns('zlib/include/zlib.h')
True
>>> egg_info.get_file('zlib/include/zlib.h')
<open file at ...>
PEP 262 replacement
===================
In the past an attempt was made to create an installation database (see PEP 262
[#pep262]_).
Extract from PEP 262 Requirements:
" We need a way to figure out what distributions, and what versions of
those distributions, are installed on a system..."
Since the APIs proposed in the current PEP provide everything needed to meet
this requirement, PEP 376 will replace PEP 262 and will become the official
`installation database` standard.
The new version of PEP 345 (XXX work in progress) will extend the Metadata
standard and will fulfill the requirements described in PEP 262, like the
`REQUIRES` section.
Adding an Uninstall function Adding an Uninstall function
============================ ============================
Distutils provides a very basic way to install a project, which is running Distutils already provides a very basic way to install a project, which is running
the `install` command over the `setup.py` script of the distribution. the `install` command over the `setup.py` script of the distribution.
Distutils will provide a very basic ``uninstall`` function, that will be added Distutils will provide a very basic ``uninstall`` function, that will be added
in ``distutils.util`` and will take the name of the project to uninstall as in ``distutils.util`` and will take the name of the project to uninstall as
its argument. ``uninstall`` will use ``pkgutil.get_files`` and remove all its argument. ``uninstall`` will use the APIs described earlier and remove all
unique files, as long as their hash didn't change. Then it will remove unique files, as long as their hash didn't change. Then it will remove
directories where it removed the last elements. empty directories left behind.
``uninstall`` will return a list of uninstalled files:: ``uninstall`` will return a list of uninstalled files::
@ -320,7 +403,7 @@ Examples::
... ...
>>> uninstall('zlib', _dry_run) >>> uninstall('zlib', _dry_run)
Of course, a third-party tool can use ``pkgutil.get_files``, to implement Of course, a third-party tool can use ``pkgutil`` APIs to implement
its own uninstall feature. its own uninstall feature.
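A third-party tool could implement the core of this behaviour along the following lines. This is a simplified sketch, not the proposed ``uninstall`` function: ``remove_recorded_files`` is a hypothetical name, it takes already-parsed ``(path, md5, size)`` records plus the installation base directory, and it does not check whether a file is shared with another project::

```python
import hashlib
import os


def remove_recorded_files(records, base):
    """Remove recorded files whose MD5 hash is unchanged.

    Returns the list of relative paths that were removed.
    """
    base = os.path.abspath(base)
    removed = []
    for rel_path, md5, _size in records:
        local = os.path.join(base, *rel_path.split('/'))
        if not md5 or not os.path.isfile(local):
            continue
        with open(local, 'rb') as f:
            if hashlib.md5(f.read()).hexdigest() != md5:
                continue  # modified since installation: leave it alone
        os.remove(local)
        removed.append(rel_path)
        # Remove directories left empty, but never ``base`` itself.
        parent = os.path.dirname(local)
        while parent != base and not os.listdir(parent):
            os.rmdir(parent)
            parent = os.path.dirname(parent)
    return removed
```

Files edited after installation keep their place on disk, which is why their parent directories survive as well.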
Backward compatibility and roadmap Backward compatibility and roadmap
@ -349,6 +432,9 @@ References
.. [#pip] .. [#pip]
http://pypi.python.org/pypi/pip http://pypi.python.org/pypi/pip
.. [#eggformats]
http://peak.telecommunity.com/DevCenter/EggFormats
Acknowledgments Acknowledgments
=============== ===============