diff --git a/pep-0262.txt b/pep-0262.txt index eae1187d2..fd517c8fd 100644 --- a/pep-0262.txt +++ b/pep-0262.txt @@ -9,28 +9,34 @@ Post-History: 27-Mar-2002 Introduction - This PEP describes a format for a database of Python packages + This PEP describes a format for a database of the Python software installed on a system. + (In this document, the term "distribution" is used to mean a set + of code that's developed and distributed together. A "distribution" + is the same as a Red Hat or Debian package, but the term "package" + already has a meaning in Python terminology, meaning "a directory + with an __init__.py file in it.") + Requirements - We need a way to figure out what packages, and what versions of - those packages, are installed on a system. We want to provide + We need a way to figure out what distributions, and what versions of + those distributions, are installed on a system. We want to provide features similar to CPAN, APT, or RPM. Required use cases that should be supported are: - * Is package X on a system? - * What version of package X is installed? - * Where can the new version of package X be found? (This can + * Is distribution X on a system? + * What version of distribution X is installed? + * Where can the new version of distribution X be found? (This can be defined as either "a home page where the user can go and find a download link", or "a place where a program can find the newest version?" Both should probably be supported.) - * What files did package X put on my system? - * What package did the file x/y/z.py come from? + * What files did distribution X put on my system? + * What distribution did the file x/y/z.py come from? * Has anyone modified x/y/z.py locally? - * What other packages does this package need? - * What Python modules does this package provide? + * What other distributions does this software need? + * What Python modules does this distribution provide? 
Database Location @@ -41,12 +47,12 @@ Database Location The structure of the database is deliberately kept simple; each file in this directory or its subdirectories (if any) describes a - single package. Binary packages of Python software such as RPMs - can then update Python's database by just installing the + single distribution. Binary packagings of Python software such as + RPMs can then update Python's database by just installing the corresponding file into the INSTALLDB directory. The rationale for scanning subdirectories is that we can move to a - directory-based indexing scheme if the package directory contains + directory-based indexing scheme if the database directory contains too many entries. For example, this would let us transparently switch from INSTALLDB/Numeric to INSTALLDB/N/Nu/Numeric or some similar hashing scheme. @@ -55,7 +61,7 @@ Database Location Database Contents Each file in INSTALLDB or its subdirectories describes a single - package, and has the following contents: + distribution, and has the following contents: An initial line listing the sections in this file, separated by whitespace. Currently this will always be 'PKG-INFO FILES @@ -64,50 +70,52 @@ Database Contents we'd add a DOCS section and list it in the contents. Sections are always separated by blank lines. - A package that uses the Distutils for installation should - automatically update the database. Packages that roll their own - installation will have to use the database's API to to manually - add or update their own entry. System package managers such as - RPM or pkgadd can just create the new 'package name' file in the + A distribution that uses the Distutils for installation should + automatically update the database. Distributions that roll their + own installation will have to use the database's API to + manually add or update their own entry. System package managers + such as RPM or pkgadd can just create the new file in the INSTALLDB directory. 
Each section of the file is used for a different purpose. PKG-INFO section - An initial set of RFC-822 headers containing the package + An initial set of RFC-822 headers containing the distribution information for a file, as described in PEP 241, "Metadata for Python Software Packages". - A blank line indicating the end of the PKG-INFO section. - FILES section - An entry for each file installed by the package. Generated files - such as .pyc and .pyo files are on this list as well as the original - .py files installed by a package; their checksums won't be stored or - checked, though. + An entry for each file installed by the + distribution. Generated files such as .pyc and .pyo files are + on this list as well as the original .py files installed by a + distribution; their checksums won't be stored or checked, + though. - Each file's entry is a single tab-delimited line that contains - the following fields: + Each file's entry is a single tab-delimited line that contains + the following fields: - * The file's full path, as installed on the system. + * The file's full path, as installed on the system. - * The file's size + * The file's size - * The file's permissions. On Windows, this field will always be - 'unknown' + * The file's permissions. On Windows, this field will always be + 'unknown' + + * The owner and group of the file, separated by a tab. + On Windows, these fields will both be 'unknown'. + + * A SHA1 digest of the file, encoded in hex. For generated files + such as *.pyc files, this field must contain the string "-", + which indicates that the file's checksum should not be verified. - * The owner and group of the file, separated by a tab. - On Windows, these fields will both be 'unknown'. - - * A SHA1 digest of the file, encoded in hex. REQUIRES section This section is a list of strings giving the services required for this module distribution to run properly. 
This list includes the - package name ("python-stdlib") and module names ("rfc822", + distribution name ("python-stdlib") and module names ("rfc822", "htmllib", "email", "email.Charset"). It will be specified by an extra 'requires' argument to the distutils.core.setup() function. For example: @@ -123,7 +131,7 @@ Database Contents PROVIDES section This section is a list of strings giving the services provided by - an installed package. This list includes the package name + an installed distribution. This list includes the distribution name ("python-stdlib") and module names ("rfc822", "htmllib", "email", "email.Charset"). @@ -147,12 +155,12 @@ API Description suggestions for alternate locations in the standard library, or an alternate module name?) - The InstallationDatabase returns instances of Package that contain - all the information about an installed package. + The InstallationDatabase returns instances of Distribution that contain + all the information about an installed distribution. - XXX Several of the fields in Package are duplicates of ones in + XXX Several of the fields in Distribution are duplicates of ones in distutils.dist.Distribution. Probably they should be factored out - into the Package class proposed here, but can this be done in a + into the Distribution class proposed here, but can this be done in a backward-compatible way? InstallationDatabase has the following interface: @@ -164,38 +172,38 @@ class InstallationDatabase: If path is None, INSTALLDB is used as the default. """ - def get_package (self, package_name): - """get_package(package_name:string) : Package - Get the object corresponding to a single package. + def get_distribution (self, distribution_name): + """get_distribution(distribution_name:string) : Distribution + Get the object corresponding to a single distribution. 
""" - def list_packages (self): - """list_packages() : [Package] - Return a list of all packages installed on the system, + def list_distributions (self): + """list_distributions() : [Distribution] + Return a list of all distributions installed on the system, enumerated in no particular order. """ - def find_package (self, path): - """find_file(path:string) : Package - Search and return the package containing the file 'path'. - Returns None if the file doesn't belong to any package + def find_distribution (self, path): + """find_file(path:string) : Distribution + Search and return the distribution containing the file 'path'. + Returns None if the file doesn't belong to any distribution that the InstallationDatabase knows about. XXX should this work for directories? """ -class Package: +class Distribution: """Instance attributes: name : string - Package name + Distribution name files : {string : (size:int, perms:int, owner:string, group:string, digest:string)} - Dictionary mapping the path of a file installed by this package + Dictionary mapping the path of a file installed by this distribution to information about the file. The following fields all come from PEP 241. version : distutils.version.Version - Version of this package + Version of this distribution platform : [string] summary : string description : string @@ -216,7 +224,7 @@ class Package: def has_file (self, path): """has_file(path:string) : Boolean Returns true if the specified path belongs to a file in this - package. + distribution. """ def check_file (self, path): @@ -232,7 +240,7 @@ Deliverables A description of the database API, to be added to this PEP. Patches to the Distutils that 1) implement an InstallationDatabase - class, 2) Update the database when a new package is installed. 3) + class, 2) Update the database when a new distribution is installed. 3) add a simple package management tool, features to be added to this PEP. (Or should that be a separate PEP?) See [2] for the current patch. 
@@ -240,18 +248,18 @@ Deliverables Rejected Suggestions - Instead of using one text file per package, one large text file or - an anydbm file could be used. This has been rejected for a few - reasons. First, performance is probably not an extremely pressing - concern as the package database is only used when installing or - removing packages, a relatively infrequent task. Scalability also + Instead of using one text file per distribution, one large text + file or an anydbm file could be used. This has been rejected for + a few reasons. First, performance is probably not an extremely + pressing concern as the database is only used when installing or + removing software, a relatively infrequent task. Scalability also likely isn't a problem, as people may have hundreds of Python - packages installed, but thousands seems unlikely. Finally, - individual text files are compatible with installers such as RPM - or DPKG because a package can just drop the new database file into - the database directory. If one large text file or a binary file - were used, the Python database would then have to be updated by - running a postinstall script. + packages installed, but thousands or tens of thousands seems + unlikely. Finally, individual text files are compatible with + installers such as RPM or DPKG because a binary packager can just + drop the new database file into the database directory. If one + large text file or a binary file were used, the Python database + would then have to be updated by running a postinstall script. On Windows, the permissions and owner/group of a file aren't stored. Windows does in fact support ownership and access @@ -267,13 +275,16 @@ References [2] A patch to implement this PEP will be tracked as patch #562100 on SourceForge. - http://www.python.org/sf/562100 + http://www.python.org/sf/562100 . + Code implementing the installation database is currently in + Python CVS in the nondist/sandbox/pep262 directory. 
Acknowledgements Ideas for this PEP originally came from postings by Greg Ward, - Fred L. Drake Jr., Thomas Heller, Mats Wichmann, and others. + Fred L. Drake Jr., Thomas Heller, Mats Wichmann, Phillip J. Eby, + and others. Many changes and rewrites to this document were suggested by the readers of the Distutils SIG.
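Note on the FILES section changed above: the patch specifies one tab-delimited line per file (path, size, permissions, owner, group, SHA1 hex digest) and reserves "-" as the digest for generated files. The following is a minimal sketch of how a reader of the database might parse and verify such a line. It is not taken from the patch or from the sandbox code; the names FileEntry, parse_files_entry and digest_matches are hypothetical, and the permissions field is kept as raw text because the PEP text in this diff does not say how it is encoded on disk.

```python
import hashlib
from typing import NamedTuple, Optional


class FileEntry(NamedTuple):
    """One line of the FILES section (hypothetical helper type)."""
    path: str
    size: int
    perms: str             # raw text; 'unknown' on Windows per the PEP
    owner: str             # 'unknown' on Windows per the PEP
    group: str             # 'unknown' on Windows per the PEP
    digest: Optional[str]  # None when the field is "-" (do not verify)


def parse_files_entry(line: str) -> FileEntry:
    """Split one tab-delimited FILES line into its six fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError("expected 6 tab-separated fields, got %d" % len(fields))
    path, size, perms, owner, group, digest = fields
    return FileEntry(
        path=path,
        size=int(size),
        perms=perms,
        owner=owner,
        group=group,
        digest=None if digest == "-" else digest.lower(),
    )


def digest_matches(entry: FileEntry, data: bytes) -> bool:
    """Compare file contents against the recorded SHA1 digest.

    Entries whose digest is "-" (generated files such as *.pyc)
    are never verified, so they always pass.
    """
    if entry.digest is None:
        return True
    return hashlib.sha1(data).hexdigest() == entry.digest


if __name__ == "__main__":
    contents = b"x = 1\n"
    line = "\t".join([
        "/tmp/demo.py",                     # illustrative path only
        str(len(contents)),
        "0644",
        "root",
        "root",
        hashlib.sha1(contents).hexdigest(),
    ])
    entry = parse_files_entry(line)
    print(entry.path, entry.size, digest_matches(entry, contents))
```

Representing the "-" digest as None keeps the "skip verification" case explicit, so a caller cannot mistake it for a real checksum.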