Terminology change (package -> distribution); specify the digest field for .pyc files

Andrew M. Kuchling 2004-03-08 13:51:59 +00:00
parent 102b3b00af
commit 8644a9a8e2
1 changed file with 82 additions and 71 deletions


@@ -9,28 +9,34 @@ Post-History: 27-Mar-2002
Introduction Introduction
This PEP describes a format for a database of Python packages This PEP describes a format for a database of the Python software
installed on a system. installed on a system.
(In this document, the term "distribution" is used to mean a set
of code that's developed and distributed together. A "distribution"
is the same as a Red Hat or Debian package, but the term "package"
already has a meaning in Python terminology, meaning "a directory
with an __init__.py file in it.")
Requirements Requirements
We need a way to figure out what packages, and what versions of We need a way to figure out what distributions, and what versions of
those packages, are installed on a system. We want to provide those distributions, are installed on a system. We want to provide
features similar to CPAN, APT, or RPM. Required use cases that features similar to CPAN, APT, or RPM. Required use cases that
should be supported are: should be supported are:
* Is package X on a system? * Is distribution X on a system?
* What version of package X is installed? * What version of distribution X is installed?
* Where can the new version of package X be found? (This can * Where can the new version of distribution X be found? (This can
be defined as either "a home page where the user can go and be defined as either "a home page where the user can go and
find a download link", or "a place where a program can find find a download link", or "a place where a program can find
the newest version?" Both should probably be supported.) the newest version?" Both should probably be supported.)
* What files did package X put on my system? * What files did distribution X put on my system?
* What package did the file x/y/z.py come from? * What distribution did the file x/y/z.py come from?
* Has anyone modified x/y/z.py locally? * Has anyone modified x/y/z.py locally?
* What other packages does this package need? * What other distributions does this software need?
* What Python modules does this package provide? * What Python modules does this distribution provide?
Database Location Database Location
@@ -41,12 +47,12 @@ Database Location
The structure of the database is deliberately kept simple; each The structure of the database is deliberately kept simple; each
file in this directory or its subdirectories (if any) describes a file in this directory or its subdirectories (if any) describes a
single package. Binary packages of Python software such as RPMs single distribution. Binary packagings of Python software such as
can then update Python's database by just installing the RPMs can then update Python's database by just installing the
corresponding file into the INSTALLDB directory. corresponding file into the INSTALLDB directory.
The rationale for scanning subdirectories is that we can move to a The rationale for scanning subdirectories is that we can move to a
directory-based indexing scheme if the package directory contains directory-based indexing scheme if the database directory contains
too many entries. For example, this would let us transparently too many entries. For example, this would let us transparently
switch from INSTALLDB/Numeric to INSTALLDB/N/Nu/Numeric or some switch from INSTALLDB/Numeric to INSTALLDB/N/Nu/Numeric or some
similar hashing scheme. similar hashing scheme.
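
For illustration only, the directory-based indexing idea could be sketched as below. The function name `hashed_db_path` and its `depth` parameter are hypothetical and not part of the PEP; the PEP only gives INSTALLDB/N/Nu/Numeric as one possible layout.

```python
import posixpath


def hashed_db_path(installdb, name, depth=2):
    """Return a nested database path such as INSTALLDB/N/Nu/Numeric.

    Hypothetical helper: each directory level is a progressively longer
    prefix of the distribution name. Names too short to supply a prefix
    simply get fewer levels.
    """
    prefixes = [name[:i] for i in range(1, depth + 1) if i < len(name)]
    return posixpath.join(installdb, *prefixes, name)


print(hashed_db_path("INSTALLDB", "Numeric"))
```

A reader of the database would have to scan subdirectories either way, which is why the PEP text requires that from the start.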
@@ -55,7 +61,7 @@ Database Location
Database Contents Database Contents
Each file in INSTALLDB or its subdirectories describes a single Each file in INSTALLDB or its subdirectories describes a single
package, and has the following contents: distribution, and has the following contents:
An initial line listing the sections in this file, separated An initial line listing the sections in this file, separated
by whitespace. Currently this will always be 'PKG-INFO FILES by whitespace. Currently this will always be 'PKG-INFO FILES
@@ -64,29 +70,28 @@ Database Contents
we'd add a DOCS section and list it in the contents. Sections we'd add a DOCS section and list it in the contents. Sections
are always separated by blank lines. are always separated by blank lines.
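
As a rough sketch of how a reader might split such a file, assuming that every section named on the first line is present and non-empty, and that no section contains a blank line internally (neither assumption is stated by the PEP): the function name `split_sections` and the sample text are made up for illustration.

```python
def split_sections(text):
    """Map each section name from the first line to that section's lines.

    Assumes sections appear in the order listed and that none is empty;
    an empty section would shift the later names onto the wrong blocks.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    names = lines[0].split()          # section names, whitespace-separated
    blocks, current = [], []
    for line in lines[1:]:
        if line.strip():
            current.append(line)
        elif current:                 # a blank line closes the open section
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return dict(zip(names, blocks))


sample = "PKG-INFO FILES\n\nName: Numeric\nVersion: 1.0\n\n/usr/lib/x.py\t10\n"
print(split_sections(sample))
```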
A package that uses the Distutils for installation should A distribution that uses the Distutils for installation should
automatically update the database. Packages that roll their own automatically update the database. Distributions that roll their
installation will have to use the database's API to manually own installation will have to use the database's API to
add or update their own entry. System package managers such as manually add or update their own entry. System package managers
RPM or pkgadd can just create the new 'package name' file in the such as RPM or pkgadd can just create the new file in the
INSTALLDB directory. INSTALLDB directory.
Each section of the file is used for a different purpose. Each section of the file is used for a different purpose.
PKG-INFO section PKG-INFO section
An initial set of RFC-822 headers containing the package An initial set of RFC-822 headers containing the distribution
information for a file, as described in PEP 241, "Metadata for information for a file, as described in PEP 241, "Metadata for
Python Software Packages". Python Software Packages".
A blank line indicating the end of the PKG-INFO section.
FILES section FILES section
An entry for each file installed by the package. Generated files An entry for each file installed by the
such as .pyc and .pyo files are on this list as well as the original distribution. Generated files such as .pyc and .pyo files are
.py files installed by a package; their checksums won't be stored or on this list as well as the original .py files installed by a
checked, though. distribution; their checksums won't be stored or checked,
though.
Each file's entry is a single tab-delimited line that contains Each file's entry is a single tab-delimited line that contains
the following fields: the following fields:
@@ -101,13 +106,16 @@ Database Contents
* The owner and group of the file, separated by a tab. * The owner and group of the file, separated by a tab.
On Windows, these fields will both be 'unknown'. On Windows, these fields will both be 'unknown'.
* A SHA1 digest of the file, encoded in hex. * A SHA1 digest of the file, encoded in hex. For generated files
such as *.pyc files, this field must contain the string "-",
which indicates that the file's checksum should not be verified.
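
A minimal sketch of how a checker might honour the new "-" digest value, assuming the digest is stored as a hex string next to the file's path. The function name `digest_matches` is hypothetical; the PEP itself only names a check_file method without giving its implementation.

```python
import hashlib

SKIP_DIGEST = "-"  # marker described above for generated files such as *.pyc


def digest_matches(path, recorded_digest):
    """Return True if the file matches its recorded SHA1 hex digest.

    A recorded digest of "-" means the checksum should not be verified,
    so the file is reported as matching without being read.
    """
    if recorded_digest == SKIP_DIGEST:
        return True
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            sha1.update(chunk)
    return sha1.hexdigest() == recorded_digest.lower()
```

Treating "-" as an automatic match keeps regenerated bytecode files from being reported as locally modified.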
REQUIRES section REQUIRES section
This section is a list of strings giving the services required for This section is a list of strings giving the services required for
this module distribution to run properly. This list includes the this module distribution to run properly. This list includes the
package name ("python-stdlib") and module names ("rfc822", distribution name ("python-stdlib") and module names ("rfc822",
"htmllib", "email", "email.Charset"). It will be specified "htmllib", "email", "email.Charset"). It will be specified
by an extra 'requires' argument to the distutils.core.setup() by an extra 'requires' argument to the distutils.core.setup()
function. For example: function. For example:
@@ -123,7 +131,7 @@ Database Contents
PROVIDES section PROVIDES section
This section is a list of strings giving the services provided by This section is a list of strings giving the services provided by
an installed package. This list includes the package name an installed distribution. This list includes the distribution name
("python-stdlib") and module names ("rfc822", "htmllib", "email", ("python-stdlib") and module names ("rfc822", "htmllib", "email",
"email.Charset"). "email.Charset").
@@ -147,12 +155,12 @@ API Description
suggestions for alternate locations in the standard library, or an suggestions for alternate locations in the standard library, or an
alternate module name?) alternate module name?)
The InstallationDatabase returns instances of Package that contain The InstallationDatabase returns instances of Distribution that contain
all the information about an installed package. all the information about an installed distribution.
XXX Several of the fields in Package are duplicates of ones in XXX Several of the fields in Distribution are duplicates of ones in
distutils.dist.Distribution. Probably they should be factored out distutils.dist.Distribution. Probably they should be factored out
into the Package class proposed here, but can this be done in a into the Distribution class proposed here, but can this be done in a
backward-compatible way? backward-compatible way?
InstallationDatabase has the following interface: InstallationDatabase has the following interface:
@@ -164,38 +172,38 @@ class InstallationDatabase:
If path is None, INSTALLDB is used as the default. If path is None, INSTALLDB is used as the default.
""" """
def get_package (self, package_name): def get_distribution (self, distribution_name):
"""get_package(package_name:string) : Package """get_distribution(distribution_name:string) : Distribution
Get the object corresponding to a single package. Get the object corresponding to a single distribution.
""" """
def list_packages (self): def list_distributions (self):
"""list_packages() : [Package] """list_distributions() : [Distribution]
Return a list of all packages installed on the system, Return a list of all distributions installed on the system,
enumerated in no particular order. enumerated in no particular order.
""" """
def find_package (self, path): def find_distribution (self, path):
"""find_file(path:string) : Package """find_distribution(path:string) : Distribution
Search and return the package containing the file 'path'. Search and return the distribution containing the file 'path'.
Returns None if the file doesn't belong to any package Returns None if the file doesn't belong to any distribution
that the InstallationDatabase knows about. that the InstallationDatabase knows about.
XXX should this work for directories? XXX should this work for directories?
""" """
class Package: class Distribution:
"""Instance attributes: """Instance attributes:
name : string name : string
Package name Distribution name
files : {string : (size:int, perms:int, owner:string, group:string, files : {string : (size:int, perms:int, owner:string, group:string,
digest:string)} digest:string)}
Dictionary mapping the path of a file installed by this package Dictionary mapping the path of a file installed by this distribution
to information about the file. to information about the file.
The following fields all come from PEP 241. The following fields all come from PEP 241.
version : distutils.version.Version version : distutils.version.Version
Version of this package Version of this distribution
platform : [string] platform : [string]
summary : string summary : string
description : string description : string
@@ -216,7 +224,7 @@ class Package:
def has_file (self, path): def has_file (self, path):
"""has_file(path:string) : Boolean """has_file(path:string) : Boolean
Returns true if the specified path belongs to a file in this Returns true if the specified path belongs to a file in this
package. distribution.
""" """
def check_file (self, path): def check_file (self, path):
@@ -232,7 +240,7 @@ Deliverables
A description of the database API, to be added to this PEP. A description of the database API, to be added to this PEP.
Patches to the Distutils that 1) implement an InstallationDatabase Patches to the Distutils that 1) implement an InstallationDatabase
class, 2) Update the database when a new package is installed. 3) class, 2) Update the database when a new distribution is installed. 3)
add a simple package management tool, features to be added to this add a simple package management tool, features to be added to this
PEP. (Or should that be a separate PEP?) See [2] for the current PEP. (Or should that be a separate PEP?) See [2] for the current
patch. patch.
@@ -240,18 +248,18 @@ Deliverables
Rejected Suggestions Rejected Suggestions
Instead of using one text file per package, one large text file or Instead of using one text file per distribution, one large text
an anydbm file could be used. This has been rejected for a few file or an anydbm file could be used. This has been rejected for
reasons. First, performance is probably not an extremely pressing a few reasons. First, performance is probably not an extremely
concern as the package database is only used when installing or pressing concern as the database is only used when installing or
removing packages, a relatively infrequent task. Scalability also removing software, a relatively infrequent task. Scalability also
likely isn't a problem, as people may have hundreds of Python likely isn't a problem, as people may have hundreds of Python
packages installed, but thousands seems unlikely. Finally, packages installed, but thousands or tens of thousands seems
individual text files are compatible with installers such as RPM unlikely. Finally, individual text files are compatible with
or DPKG because a package can just drop the new database file into installers such as RPM or DPKG because a binary packager can just
the database directory. If one large text file or a binary file drop the new database file into the database directory. If one
were used, the Python database would then have to be updated by large text file or a binary file were used, the Python database
running a postinstall script. would then have to be updated by running a postinstall script.
On Windows, the permissions and owner/group of a file aren't On Windows, the permissions and owner/group of a file aren't
stored. Windows does in fact support ownership and access stored. Windows does in fact support ownership and access
@@ -267,13 +275,16 @@ References
[2] A patch to implement this PEP will be tracked as [2] A patch to implement this PEP will be tracked as
patch #562100 on SourceForge. patch #562100 on SourceForge.
http://www.python.org/sf/562100 http://www.python.org/sf/562100 .
Code implementing the installation database is currently in
Python CVS in the nondist/sandbox/pep262 directory.
Acknowledgements Acknowledgements
Ideas for this PEP originally came from postings by Greg Ward, Ideas for this PEP originally came from postings by Greg Ward,
Fred L. Drake Jr., Thomas Heller, Mats Wichmann, and others. Fred L. Drake Jr., Thomas Heller, Mats Wichmann, Phillip J. Eby,
and others.
Many changes and rewrites to this document were suggested by the Many changes and rewrites to this document were suggested by the
readers of the Distutils SIG. readers of the Distutils SIG.