changes after feedback from python-dev

Tarek Ziadé 2009-05-16 16:02:06 +00:00
parent c1b662b7e8
commit b07af3ab3a
1 changed files with 144 additions and 65 deletions

@ -22,28 +22,41 @@ This PEP proposes various enhancements for Distutils:
Definitions
===========
A **project** is a Python application composed of one or many Python packages.
It is distributed using a `setup.py` script with Distutils and/or Setuptools.
A **project** is a Python application composed of one or several files, which can
be Python modules, extensions or data. It is distributed using a `setup.py` script
with Distutils and/or Setuptools. The `setup.py` script indicates where each
element should be installed.
Once installed, one or several **packages** are added in Python's site-packages.
Once installed, the elements are located in various places in the system, like:
- in Python's site-packages (Python modules, Python modules organized into packages,
Extensions, etc.)
- in Python's `include` directory.
- in Python's `bin` or `Scripts` directory.
- etc.
Rationale
=========
There are two problems right now in the way projects are installed in Python:
- There are too many ways to install a project in Python.
- There are too many ways to do it.
- There is no API to get the metadata of installed projects.
How projects are installed
--------------------------
Right now, when a project is installed in Python, every package it contains
is installed in the `site-packages` directory with the Distutils `install`
command.
Right now, when a project is installed in Python, every element it contains
is installed in various directories.
The pure Python code for instance is installed in the `purelib` directory,
which is located in the Python installation in `lib/python2.6/site-packages`
for example under unix-like systems or Mac OS X, and in `Lib\site-packages`
under Windows. This is done with the Distutils `install` command, which calls
various subcommands.
The `install_egg_info` subcommand is called during this process, in order to
create an `.egg-info` file in the `site-packages` directory.
create an `.egg-info` file in the `purelib` directory.
For example, if the `zlib` project (which contains one package) is installed,
two elements will be installed in `site-packages`::
@ -51,8 +64,8 @@ two elements will be installed in `site-packages`::
- zlib
- zlib-2.5.2-py2.4.egg-info
Where `zlib` is the package, and `zlib-2.5.2-py2.4.egg-info` is
a file containing the package metadata as described in PEP 314.
Where `zlib` is a Python package, and `zlib-2.5.2-py2.4.egg-info` is
a file containing the project metadata as described in PEP 314 [#pep314]_.
This file corresponds to the file called `PKG-INFO`, built by
the `sdist` command.
@ -63,11 +76,13 @@ packages in the same way that Distutils does:
- `easy_install` creates an `EGG-INFO` directory inside an `.egg` directory,
and adds a `PKG-INFO` file inside this directory. The `.egg` directory
contains in that case the packages of the project.
contains in that case all the elements of the project that are supposed to
be installed in `site-packages`, and is placed in the `site-packages`
directory.
- `pip` creates an `.egg-info` directory inside the site-packages directory
and adds a `PKG-INFO` file inside it. Packages are installed in
site-packages directory in a regular way.
- `pip` creates an `.egg-info` directory inside the `site-packages` directory
and adds a `PKG-INFO` file inside it. Elements of the project are then
installed in various places like Distutils does.
They both add other files in the `EGG-INFO` or `.egg-info` directory, and
create or modify `.pth` files.
@ -76,16 +91,19 @@ Uninstall information
---------------------
Distutils doesn't provide any `uninstall` command. If you want to uninstall
a project, you have to be a power user and remove the various package
directories from the right `site-packages` directory, then look over the right
`pth` files. And this method differs, depending on the tools you are using.
a project, you have to be a power user and remove the various elements that
were installed, then look over the `.pth` files and clean them up if necessary.
The worst issue is that you depend on the way the packager created his package.
When you call `python setup.py install`, it will not be installed the same way
depending on the tool used by the packager (mainly Distutils or Setuptools).
And the process differs depending on the tools you have used to install the
project, and on whether the project's `setup.py` uses Distutils or Setuptools.
But there's common behavior: files are copied in your installation.
And there's a way to keep track of these files, so that they can be removed.
Under some circumstances, you might not be able to know for sure that you
have removed everything, or that you didn't break another project by
removing a file that was shared among the two projects.
But there's common behavior: when you install a project, files are copied
in your system. And there's a way to keep track of these files, so that they
can be removed.
What this PEP proposes
----------------------
@ -101,40 +119,42 @@ To address those issues, this PEP proposes a few changes:
=============================
The first change would be to make `.egg-info` a directory and let it
hold the `PKG-INFO` file built by the `write_pkg_file` method.
hold the `PKG-INFO` file built by the `write_pkg_file` method of
the `Distribution` class in Distutils.
This change will not impact Python itself, because this file is not
used anywhere yet in the standard library. So there's no need of
deprecation.
This change will not impact Python itself, because `egg-info` files are not
used anywhere yet in the standard library besides Distutils.
It will impact the `setuptools` and `pip` projects, but given
the fact that they already work with a directory that contains a
`PKG-INFO` file, the change will be small.
the fact that they already work with a directory that contains a `PKG-INFO`
file, the change will have no deep consequences.
For example, if the `zlib` package is installed, two elements
will be installed in `site-packages`::
For example, if the `zlib` package is installed, the elements that
will be installed in `site-packages` will become::
- zlib
- zlib-2.5.2.egg-info/
PKG-INFO
The Python version will also be removed from the .egg-info directory
name. To be able to implement this change, the impacted code in Distutils
is the `install_egg_info` command, and the various third-party projects.
The Python version will also be removed from the `.egg-info` directory
name.
Adding a RECORD in the .egg-info directory
==========================================
A `RECORD` file will be added inside the `.egg-info` directory at installation
time.
time. The `RECORD` file will hold the list of installed files. These correspond
to the files listed by the `record` option of the `install` command, and will
always be generated. This will allow uninstallation, as explained later in this
PEP. This RECORD file is inspired by PEP 262 FILES [#pep262]_.
- the `RECORD` file will hold the list of installed files. These
correspond to the files listed by the `record` option of the `install`
command, and will always be generated. This will allow uninstallation, as
explained later in this PEP.
The RECORD format
-----------------
The `install` command will record by default installed files in the
RECORD file, using these rules:
The `RECORD` file is composed of records, one line per installed file.
Each record is composed of three elements separated by a `;` character:
- the file's full **path**
- if the installed file is located in a directory in `site-packages`,
it will be a '/'-separated relative path, no matter what is the target
@ -144,8 +164,13 @@ RECORD file, using these rules:
- if the installed file is located elsewhere in the system, a
'/'-separated absolute path is used.
This will require changing the way the `install` command writes the record
file, so the old `record` behavior will be deprecated.
- the **MD5** hash of the file, encoded in hex. Notice that `pyc` and `pyo`
generated files will not have a hash.
- the file's size in bytes
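As an illustration only, the three fields of a record could be produced like this. The helper name ``make_record_line`` is hypothetical and not part of this PEP, the code is written for a current Python 3 interpreter, and leaving the hash field empty for `pyc` and `pyo` files is an assumption, since the text only says these files have no hash.

```python
import hashlib
import os


def make_record_line(path, recorded_name):
    """Return 'name;md5;size' for the file stored at `path`.

    `recorded_name` is the '/'-separated name to write in RECORD
    (relative to site-packages, or absolute for files living elsewhere).
    """
    if recorded_name.endswith(('.pyc', '.pyo')):
        # Assumption: compiled files get an empty hash field.
        digest = ''
    else:
        with open(path, 'rb') as f:
            digest = hashlib.md5(f.read()).hexdigest()
    size = os.path.getsize(path)
    return '%s;%s;%d' % (recorded_name, digest, size)
```

The caller is expected to have already converted the path to its relative or absolute '/'-separated form, following the rules listed above.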
Example
-------
Back to our `zlib` example, we will have::
@ -154,6 +179,21 @@ Back to our `zlib` example, we will have::
PKG-INFO
RECORD
And the RECORD file will contain::
zlib/include/zconf.h;b690274f621402dda63bf11ba5373bf2;9544
zlib/include/zlib.h;9c4b84aff68aa55f2e9bf70481b94333;66188
zlib/lib/libz.a;e6d43fb94292411909404b07d0692d46;91128
zlib/share/man/man3/zlib.3;785dc03452f0508ff0678fba2457e0ba;4486
zlib-2.5.2.egg-info/PKG-INFO;6fe57de576d749536082d8e205b77748;195
zlib-2.5.2.egg-info/RECORD
Notice that:
- the `RECORD` file can't contain a hash of itself and is just mentioned here
- `zlib` and `zlib-2.5.2.egg-info` are located in `site-packages` so the file
paths are relative to it.
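A minimal sketch of reading such a file back, assuming that paths never contain a `;` character and that missing fields (as on the `RECORD` line itself) are reported as ``None``. The function name ``parse_record`` is hypothetical, and the code targets a current Python 3 interpreter.

```python
def parse_record(text):
    """Yield (path, hash, size) tuples from the content of a RECORD file.

    hash and size are None when the line does not provide them.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(';')
        path = fields[0]
        md5 = fields[1] if len(fields) > 1 and fields[1] else None
        size = int(fields[2]) if len(fields) > 2 and fields[2] else None
        yield path, md5, size
```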
New functions in pkgutil
========================
@ -178,16 +218,31 @@ The new functions added in the package are :
Uses `get_egg_info` to get the `PKG-INFO` file, and returns a
`DistributionMetadata` instance that contains the metadata.
This will require a small change in `DistributionMetadata` (see #4908).
- get_egg_info_file(project_name, filename) -> file object or None
- get_files(project_name) -> iterator of (path, hash, size, other_projects)
Uses `get_egg_info` and gets any file inside the directory,
pointed by filename.
Uses `get_egg_info` to get the `RECORD` file, and returns an iterator.
Each returned element is a tuple `(path, hash, size, other_projects)` where
``path``, ``hash``, ``size`` are the values found in the RECORD file.
`other_projects` is a tuple containing the name of the projects that are
also referring to this file in their own RECORD file (same path).
If `other_projects` is empty, it means that the file is only referred to by the
current project. In other words, it can be removed if the project is removed.
- get_egg_info_file(project_name, path) -> file object or None
Uses `get_egg_info` and gets any element inside the directory,
pointed to by its relative path. `get_egg_info_file` will perform
an `os.path.join` on `get_egg_info(project_name)` and `path` to build the
whole path.
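The `other_projects` value described for `get_files` can be sketched as follows. This is not the proposed implementation: the function name ``files_with_sharing`` is hypothetical, and it assumes the records of every installed project have already been loaded into a dictionary instead of being read from the `.egg-info` directories.

```python
def files_with_sharing(project_name, records):
    """Yield (path, hash, size, other_projects) for one project.

    `records` maps each installed project name to a list of
    (path, hash, size) tuples taken from its RECORD file.
    """
    for path, md5, size in records[project_name]:
        # Names of the other projects whose RECORD lists the same path.
        others = tuple(sorted(
            name for name, entries in records.items()
            if name != project_name
            and any(other_path == path for other_path, _, _ in entries)
        ))
        yield path, md5, size, others
```

An empty `other_projects` tuple then marks a file that only the current project refers to.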
Let's use it with our `zlib` example::
>>> from pkgutil import get_egg_info, get_metadata, get_egg_info_file
>>> from pkgutil import (get_egg_info, get_metadata, get_egg_info_file,
... get_files)
>>> get_egg_info('zlib')
'/opt/local/lib/python2.6/site-packages/zlib-2.5.2.egg-info'
>>> metadata = get_metadata('zlib')
@ -197,6 +252,16 @@ Let's use it with our `zlib` example::
some
...
files
>>> for path, hash, size, other_projects in get_files('zlib'):
... print '%s %s %d %s' % (path, hash, size, ','.join(other_projects))
...
zlib/include/zconf.h b690274f621402dda63bf11ba5373bf2 9544
zlib/include/zlib.h 9c4b84aff68aa55f2e9bf70481b94333 66188
zlib/lib/libz.a e6d43fb94292411909404b07d0692d46 91128
zlib/share/man/man3/zlib.3 785dc03452f0508ff0678fba2457e0ba 4486
zlib-2.5.2.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195
zlib-2.5.2.egg-info/RECORD None None
Adding an Uninstall function
============================
@ -204,14 +269,13 @@ Adding an Uninstall function
Distutils provides a very basic way to install a project, which is running
the `install` command over the `setup.py` script of the distribution.
Distutils will provide a very basic `uninstall` command that will remove
all files listed in the `RECORD` file of a project, as long as they are not
mentioned in another `RECORD` file and as long as the package is installed
using the standard described earlier.
Distutils will provide a very basic ``uninstall`` function, that will be added
in ``distutils.util`` and will take the name of the project to uninstall as
its argument. ``uninstall`` will use ``pkgutil.get_files`` and remove all
unique files, as long as their hash didn't change. Then it will remove
directories where it removed the last elements.
This command will be added in ``distutils.util`` and will take the name
of the project to uninstall as its argument. A call to uninstall will return a
list of uninstalled files::
``uninstall`` will return a list of uninstalled files::
>>> from distutils.util import uninstall
>>> uninstall('zlib')
@ -220,23 +284,28 @@ list of uninstalled files::
If the project is not found, a ``DistutilsUninstallError`` will be raised.
To make it a reference API for third-party projects that wish to provide
an `uninstall feature`. The ``uninstall`` function can also be invoked with a
second callable argument, that will be invoked for each file to be removed.
If this callable returns `True`, the file will be removed.
To make it a reference API for third-party projects that wish to control
how `uninstall` works, a second callable argument can be used. It will be
called for each file that is about to be removed. If the callable returns
`True`, the file will be removed. If it returns `False`, it will be left alone.
Examples::
>>> def _remove_and_log(path):
... logging.info('Removing %s' % path)
... return True
...
>>> uninstall('zlib', _remove_and_log)
>>> def _dry_run(path):
... logging.info('Removing %s (dry run)' % path)
... return False
...
>>> uninstall('zlib', _dry_run)
Of course, a third-party tool can use ``pkgutil.get_files`` directly for maximum
control when implementing its own uninstall feature.
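A rough sketch of the removal logic described in this section, under stated assumptions: the name ``uninstall_files`` and its signature are hypothetical, the tuples are those a ``get_files``-like iterator would yield, `base_dir` stands for the `site-packages` directory, and only the immediate parent directory is removed when it becomes empty. The code targets a current Python 3 interpreter.

```python
import hashlib
import os


def uninstall_files(files, base_dir, should_remove=None):
    """Remove unique, unmodified files and return the removed paths.

    `files` is an iterable of (path, hash, size, other_projects).
    `should_remove`, if given, is called with each path about to be
    removed; the file is kept when it returns a false value.
    """
    removed = []
    for path, md5, size, other_projects in files:
        if other_projects:
            continue  # shared with another project: leave it alone
        full = os.path.join(base_dir, path)
        if not os.path.isfile(full):
            continue
        if md5 is not None:
            with open(full, 'rb') as f:
                if hashlib.md5(f.read()).hexdigest() != md5:
                    continue  # changed since installation
        if should_remove is not None and not should_remove(full):
            continue
        os.remove(full)
        removed.append(full)
        # Drop the parent directory if this was its last element.
        parent = os.path.dirname(full)
        if parent != base_dir and not os.listdir(parent):
            os.rmdir(parent)
    return removed
```

Passing a callable that always returns `False` gives the dry-run behavior shown in the example above: nothing is deleted and the returned list is empty.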
Backward compatibility and roadmap
==================================
@ -244,10 +313,20 @@ These changes will not introduce any compatibility problems with the previous
version of Distutils, and will also work with existing third-party tools.
However, a backport of the new Distutils for 2.5, 2.6, 3.0 and 3.1 will be
provided so people can benefit from the new features.
provided so people can benefit from these new features.
The plan is to integrate them for Python 2.7 and Python 3.2.
References
==========
.. [#pep262]
http://www.python.org/dev/peps/pep-0262
.. [#pep314]
http://www.python.org/dev/peps/pep-0314
Acknowledgments
===============