PEP: 262
Title: A Database of Installed Python Packages
Version: $Revision$
Author: A.M. Kuchling <akuchlin@mems-exchange.org>
Type: Standards Track
Created: 08-Jul-2001
Status: Deferred
Post-History: 27-Mar-2002
Introduction

This PEP describes a format for a database of Python packages
installed on a system.
Requirements

We need a way to figure out what packages, and what versions of
those packages, are installed on a system. We want to provide
features similar to CPAN, APT, or RPM. Required use cases that
should be supported are:

* Is package X on a system?
* What version of package X is installed?
* Where can the new version of package X be found? (This can
  be defined as either "a home page where the user can go and
  find a download link", or "a place where a program can find
  the newest version". Both should probably be supported.)
* What files did package X put on my system?
* What package did the file x/y/z.py come from?
* Has anyone modified x/y/z.py locally?
Database Location

The database is stored as a set of files under
<prefix>/lib/python<version>/install/. This location will be
called INSTALLDB through the remainder of this PEP.
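As a rough illustration of how a tool might compute this location, the
sketch below builds the path from sys.prefix. The helper name
default_installdb is hypothetical, and the sketch assumes that <version>
is the major.minor string (for example "2.2"), which this PEP does not
state explicitly.

```python
import os
import sys


def default_installdb(prefix=None, version=None):
    """Return <prefix>/lib/python<version>/install (hypothetical helper)."""
    if prefix is None:
        prefix = sys.prefix
    if version is None:
        # Assumption: <version> is the major.minor string, e.g. "2.2".
        version = "%d.%d" % sys.version_info[:2]
    return os.path.join(prefix, "lib", "python" + version, "install")


if __name__ == "__main__":
    # Show the location for the running interpreter.
    print(default_installdb())
```

Passing prefix and version explicitly makes it possible to inspect the
database of an interpreter other than the one running the tool.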
The structure of the database is deliberately kept simple; each
file in this directory or its subdirectories (if any) describes a
single package.

The rationale for scanning subdirectories is that we can move to a
directory-based indexing scheme if the package directory contains
too many entries. For example, this would let us transparently
switch from INSTALLDB/Numeric to INSTALLDB/N/Nu/Numeric or some
similar hashing scheme.
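The reason such a switch can be transparent is that a reader which walks
the whole tree does not care how deep a package file sits. A minimal
sketch, assuming that the name of each file is the package name (the
function iter_package_files is hypothetical, not part of the proposed
API):

```python
import os


def iter_package_files(installdb):
    """Yield (package_name, path) for every file at or below installdb.

    Assumption: each file is named after the package it describes, so
    INSTALLDB/Numeric and INSTALLDB/N/Nu/Numeric both yield 'Numeric'.
    """
    for dirpath, dirnames, filenames in os.walk(installdb):
        for filename in filenames:
            yield filename, os.path.join(dirpath, filename)
```

A flat directory and a prefix-based directory tree produce the same set
of package names, so code built on such a scan would not need to change
if the layout did.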
Database Contents

Each file in INSTALLDB or its subdirectories describes a single
package, and has the following contents:

An initial line listing the sections in this file, separated
by whitespace. Currently this will always be 'PKG-INFO
FILES'. This is for future-proofing; if we add a new section,
for example to list documentation files, then we'd add a DOCS
section and list it in the contents. Sections are always
separated by blank lines.
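One way a reader could use the initial line is sketched below. The
function split_sections and the sample data are illustrative only; the
sketch assumes that no section contains a blank line of its own, and it
accepts files with or without a blank line after the initial line.

```python
def split_sections(text):
    """Return {section_name: [lines]} for the text of one database file."""
    lines = text.splitlines()
    names = lines[0].split()          # e.g. ['PKG-INFO', 'FILES']
    chunks, current = [], []
    for line in lines[1:]:
        if line.strip():
            current.append(line)
        elif current:                 # a blank line closes the open section
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    if len(chunks) != len(names):
        raise ValueError("expected %d sections, found %d"
                         % (len(names), len(chunks)))
    return dict(zip(names, chunks))


# Invented sample content, for illustration only.
SAMPLE = (
    "PKG-INFO FILES\n"
    "\n"
    "Metadata-Version: 1.0\n"
    "Name: Numeric\n"
    "Version: 21.0\n"
    "\n"
    "/usr/lib/python2.2/site-packages/Numeric/__init__.py\t0\t644\troot\troot\t"
    "d41d8cd98f00b204e9800998ecf8427e\n"
)
```

Because the section names come from the file itself, a reader written
this way would pick up a future DOCS section without modification.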
PKG-INFO section

An initial set of RFC-822 headers containing the package
information for a file, as described in PEP 241, "Metadata for
Python Software Packages".

A blank line indicating the end of the PKG-INFO section.
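Since the section consists of RFC-822 headers, the standard email
package can read it. The following is a sketch rather than the proposed
implementation; read_pkg_info is a hypothetical helper and the sample
values are invented. Note that a plain dictionary keeps only one value
for a header that appears more than once, such as Platform.

```python
from email import message_from_string

# Invented sample headers, for illustration only.
PKG_INFO = (
    "Metadata-Version: 1.0\n"
    "Name: Numeric\n"
    "Version: 21.0\n"
    "Summary: Numerical extension to Python\n"
    "Home-page: http://example.com/numeric\n"
)


def read_pkg_info(text):
    """Return the PKG-INFO headers as a dict with attribute-style keys.

    'Home-page' becomes 'home_page', matching the attribute names used
    by the Package class described later in this PEP.
    """
    message = message_from_string(text)
    return {key.lower().replace("-", "_"): value
            for key, value in message.items()}
```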
FILES section

An entry for each file installed by the package. Generated files
such as .pyc and .pyo files are on this list as well as the original
.py files installed by a package; checksums for the generated files
won't be stored or checked, though.

Each file's entry is a single tab-delimited line that contains
the following fields:

* The file's full path, as installed on the system.
* The file's size.
* The file's permissions. On Windows, this field will always be
  'unknown'.
* The owner and group of the file, separated by a tab.
  On Windows, these fields will both be 'unknown'.
* An MD5 digest of the file, encoded in hex.
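Taken together, the fields above give six tab-separated columns per
line. A minimal sketch of reading and writing such a line follows; the
names FileEntry, parse_entry, and format_entry are hypothetical. The
permissions are kept as a string here because the field may hold
'unknown', and the PEP does not fix their textual form.

```python
from collections import namedtuple

# Field order follows the list above: path, size, permissions,
# owner, group, MD5 digest.
FileEntry = namedtuple("FileEntry", "path size perms owner group digest")


def parse_entry(line):
    """Parse one tab-delimited FILES line into a FileEntry.

    Raises ValueError if the line does not have exactly six fields.
    """
    path, size, perms, owner, group, digest = line.rstrip("\n").split("\t")
    return FileEntry(path, int(size), perms, owner, group, digest)


def format_entry(entry):
    """Return the tab-delimited line (without newline) for a FileEntry."""
    return "\t".join(str(field) for field in entry)
```

Parsing a line and formatting the result again returns the original
text, which makes the pair convenient for tools that rewrite a single
entry in place.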
A package that uses the Distutils for installation should
automatically update the database. Packages that roll their own
installation will have to use the database's API to manually
add or update their own entry. System package managers such as
RPM or pkgadd can just create the new 'package name' file in the
INSTALLDB directory.
API Description

There's a single fundamental class, InstallationDatabase. The
code for it lives in distutils/install_db.py. (XXX any
suggestions for alternate locations in the standard library, or an
alternate module name?)

The InstallationDatabase returns instances of Package that contain
all the information about an installed package.

XXX Several of the fields in Package are duplicates of ones in
distutils.dist.Distribution. Probably they should be factored out
into the Package class proposed here, but can this be done in a
backward-compatible way?

InstallationDatabase has the following interface:
    class InstallationDatabase:
        def __init__(self, path=None):
            """InstallationDatabase(path:string)
            Read the installation database rooted at the specified path.
            If path is None, INSTALLDB is used as the default.
            """

        def get_package(self, package_name):
            """get_package(package_name:string) : Package
            Get the object corresponding to a single package.
            """

        def list_packages(self):
            """list_packages() : [Package]
            Return a list of all packages installed on the system,
            enumerated in no particular order.
            """

        def find_package(self, path):
            """find_package(path:string) : Package
            Search and return the package containing the file 'path'.
            Returns None if the file doesn't belong to any package
            that the InstallationDatabase knows about.
            XXX should this work for directories?
            """

    class Package:
        """Instance attributes:
        name : string
          Package name
        files : {string : (size:int, perms:int, owner:string, group:string,
                           digest:string)}
          Dictionary mapping the path of a file installed by this package
          to information about the file.

        The following fields all come from PEP 241.

        version : distutils.version.Version
          Version of this package
        platform : [string]
        summary : string
        description : string
        keywords : string
        home_page : string
        author : string
        author_email : string
        license : string
        """

        def add_file(self, path):
            """add_file(path:string):None
            Record the size, ownership, &c., information for an installed file.
            XXX as written, this would stat() the file.  Should the size/perms/
            checksum all be provided as parameters to this method instead?
            """

        def has_file(self, path):
            """has_file(path:string) : Boolean
            Returns true if the specified path belongs to a file in this
            package.
            """

        def check_file(self, path):
            """check_file(path:string) : Boolean
            Checks whether the file's size, checksum, and ownership match,
            returning true if they do.
            """
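To make the intended behaviour of check_file and find_package more
concrete, here is a standalone sketch written as plain functions rather
than as the methods proposed above. It is not the reference
implementation: it compares only the size and the MD5 digest, leaving
out ownership and permissions so that it also runs on Windows, and the
packages argument (a dictionary mapping package names to files
dictionaries) is an assumption made for the example.

```python
import hashlib
import os


def check_file(path, recorded):
    """Return True if the file at path still matches its recorded entry.

    recorded is the (size, perms, owner, group, digest) tuple stored in
    Package.files.  Only the size and the MD5 digest are compared here.
    """
    size, perms, owner, group, digest = recorded
    try:
        if os.path.getsize(path) != size:
            return False
        with open(path, "rb") as stream:
            actual = hashlib.md5(stream.read()).hexdigest()
    except OSError:
        # A missing or unreadable file counts as a mismatch.
        return False
    return actual == digest


def find_package(packages, path):
    """Return the name of the package whose files include path, or None.

    packages maps a package name to that package's files dictionary
    (an assumption of this sketch, not part of the proposed API).
    """
    for name, files in packages.items():
        if path in files:
            return name
    return None
```

These two functions correspond to the use cases "Has anyone modified
x/y/z.py locally?" and "What package did the file x/y/z.py come from?"
listed under Requirements.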
Deliverables

A description of the database API, to be added to this PEP.

Patches to the Distutils that 1) implement an InstallationDatabase
class, 2) update the database when a new package is installed, and 3)
add a simple package management tool, with features to be added to
this PEP. (Or should that be a separate PEP?) See [2] for the current
patch.
Rejected Suggestions

Instead of using one text file per package, one large text file or
an anydbm file could be used. This has been rejected for a few
reasons. First, performance is probably not an extremely pressing
concern as the package database is only used when installing or
removing packages, a relatively infrequent task. Scalability also
likely isn't a problem, as people may have hundreds of Python
packages installed, but thousands seems unlikely. Finally,
individual text files are compatible with installers such as RPM
or DPKG because a package can just drop the new database file into
the database directory. If one large text file or a binary file
were used, the Python database would then have to be updated by
running a postinstall script.

On Windows, the permissions and owner/group of a file aren't
stored. Windows does in fact support ownership and access
permissions, but reading and setting them requires the win32all
extensions, and they aren't present in the basic Python installer
for Windows.
References

[1] Michael Muller's patch (posted to the Distutils-SIG around 28
    Dec 1999) generates a list of installed files.

[2] A patch to implement this PEP will be tracked as
    patch #562100 on SourceForge.
    http://www.python.org/sf/562100
Acknowledgements

Ideas for this PEP originally came from postings by Greg Ward,
Fred L. Drake Jr., Thomas Heller, Mats Wichmann, and others.

Many changes and rewrites to this document were suggested by the
readers of the Distutils SIG.
Copyright

This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: