diff --git a/pep-0008.txt b/pep-0008.txt index b385fbeef..7d14be902 100644 --- a/pep-0008.txt +++ b/pep-0008.txt @@ -158,9 +158,21 @@ The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash -for line continuation. Make sure to indent the continued line -appropriately. The preferred place to break around a binary operator -is *after* the operator, not before it. Some examples:: +for line continuation. + +Backslashes may still be appropriate at times. For example, long, +multiple ``with``-statements cannot use implicit continuation, so +backslashes are acceptable:: + + with open('/path/to/some/file/you/want/to/read') as file_1, \ + open('/path/to/some/file/being/written', 'w') as file_2: + file_2.write(file_1.read()) + +Another such case is with ``assert`` statements. + +Make sure to indent the continued line appropriately. The preferred +place to break around a binary operator is *after* the operator, not +before it. Some examples:: class Rectangle(Blob): diff --git a/pep-0315.txt b/pep-0315.txt index 6bf5571e5..cc638621c 100644 --- a/pep-0315.txt +++ b/pep-0315.txt @@ -4,7 +4,7 @@ Version: $Revision$ Last-Modified: $Date$ Author: Raymond Hettinger W Isaac Carroll -Status: Deferred +Status: Rejected Type: Standards Track Content-Type: text/plain Created: 25-Apr-2003 @@ -21,19 +21,32 @@ Abstract Notice - Deferred; see + Rejected; see + http://mail.python.org/pipermail/python-ideas/2013-June/021610.html + + This PEP has been deferred since 2006; see http://mail.python.org/pipermail/python-dev/2006-February/060718.html Subsequent efforts to revive the PEP in April 2009 did not meet with success because no syntax emerged that could - compete with a while-True and an inner if-break. 
+ compete with the following form: - A syntax was found for a basic do-while loop but it found - had little support because the condition was at the top: + while True: + + if not : + break + + + A syntax alternative to the one proposed in the PEP was found for + a basic do-while loop but it gained little support because the + condition was at the top: do ... while : + Users of the language are advised to use the while-True form with + an inner if-break when a do-while loop would have been appropriate. + Motivation diff --git a/pep-0426.txt b/pep-0426.txt index fdf6253f2..f790cd6fc 100644 --- a/pep-0426.txt +++ b/pep-0426.txt @@ -12,7 +12,8 @@ Type: Standards Track Content-Type: text/x-rst Requires: 440 Created: 30 Aug 2012 -Post-History: 14 Nov 2012, 5 Feb 2013, 7 Feb 2013, 9 Feb 2013, 27-May-2013 +Post-History: 14 Nov 2012, 5 Feb 2013, 7 Feb 2013, 9 Feb 2013, + 27 May 2013, 20 Jun 2013, 23 Jun 2013 Replaces: 345 @@ -21,8 +22,7 @@ Abstract This PEP describes a mechanism for publishing and exchanging metadata related to Python distributions. It includes specifics of the field names, -and their semantics and -usage. +and their semantics and usage. This document specifies version 2.0 of the metadata format. Version 1.0 is specified in PEP 241. @@ -42,7 +42,9 @@ identification scheme. "I" in this doc refers to Nick Coghlan. Daniel and Donald either wrote or contributed to earlier versions, and have been providing feedback as this - initial draft of the JSON-based rewrite has taken shape. + JSON-based rewrite has taken shape. Daniel and Donald have also been + vetting the proposal as we go to ensure it is practical to implement for + both clients and index servers. Metadata 2.0 represents a major upgrade to the Python packaging ecosystem, and attempts to incorporate experience gained over the 15 years(!) since @@ -63,8 +65,7 @@ identification scheme. 
* a new PEP to define v2.0 of the sdist format * an updated wheel PEP (v1.1) to add pymeta.json * an updated installation database PEP both for pymeta.json and to add - a linking scheme to better support runtime selection of dependencies, - as well as recording which extras are currently available + a linking scheme to better support runtime selection of dependencies * a new static config PEP to standardise metadata generation and creation of sdists * PEP 439, covering a bootstrapping mechanism for ``pip`` @@ -83,138 +84,241 @@ identification scheme. "rationale" section at the end of the document, as it would otherwise be an irrelevant distraction for future readers. +Purpose +======= -Definitions -=========== +The purpose of this PEP is to define a common metadata interchange format +for communication between software publication tools and software integration +tools in the Python ecosystem. One key aim is to support full dependency +analysis in that ecosystem without requiring the execution of arbitrary +Python code by those doing the analysis. Another aim is to encourage good +software distribution practices by default, while continuing to support the +current practices of almost all existing users of the Python Package Index +(both publishers and integrators). + +The design draws on the Python community's 15 years of experience with +distutils based software distribution, and incorporates ideas and concepts +from other distribution systems, including Python's setuptools, pip and +other projects, Ruby's gems, Perl's CPAN, Node.js's npm, PHP's composer +and Linux packaging systems such as RPM and APT. + + +Development, Distribution and Deployment of Python Software +=========================================================== + +The metadata design in this PEP is based on a particular conceptual model +of the software development and distribution process. 
This model consists of +the following phases: + +* Software development: this phase involves working with a source checkout + for a particular application to add features and fix bugs. It is + expected that developers in this phase will need to be able to build the + software, run the software's automated test suite, run project specific + utility scripts and publish the software. + +* Software publication: this phase involves taking the developed software + and making it available for use by software integrators. This includes + creating the descriptive metadata defined in this PEP, as well as making the + software available (typically by uploading it to an index server). + +* Software integration: this phase involves taking published software + components and combining them into a coherent, integrated system. This + may be done directly using Python specific cross-platform tools, or it may + be handled through conversion to development language neutral platform + specific packaging systems. + +* Software deployment: this phase involves taking integrated software + components and deploying them on to the target system where the software + will actually execute. + +The publication and integration phases are collectively referred to as +the distribution phase, and the individual software components distributed +in that phase are referred to as "distributions". + +The exact details of these phases will vary greatly for particular use cases. +Deploying a web application to a public Platform-as-a-Service provider, +publishing a new release of a web framework or scientific library, +creating an integrated Linux distribution or upgrading a custom application +running in a secure enclave are all situations this metadata design should +be able to handle. + +The complexity of the metadata described in this PEP thus arises directly +from the actual complexities associated with software development, +distribution and deployment in a wide range of scenarios. 
+ + +Supporting definitions +---------------------- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119. -"Distributions" are deployable software components published through an index -server or otherwise made available for installation. +"Projects" are software components that are made available for integration. +Projects include Python libraries, frameworks, scripts, plugins, +applications, collections of data or other resources, and various +combinations thereof. Public Python projects are typically registered on +the `Python Package Index`_. -"Versions" are uniquely identified snapshots of a distribution. +"Releases" are uniquely identified snapshots of a project. -"Distribution archives" are the packaged files which are used to publish -and distribute the software. +"Distributions" are the packaged files which are used to publish +and distribute a release. -"Source archives" require build tools to be available on the target -system. +"Source archive" and "VCS checkout" both refer to the raw source code for +a release, prior to creation of an sdist or binary archive. + +An "sdist" is a publication format providing the distribution metadata and +any source files that are essential to creating a binary archive for +the distribution. Creating a binary archive from an sdist requires that +the appropriate build tools be available on the system. "Binary archives" only require that prebuilt files be moved to the correct location on the target system. As Python is a dynamically bound -cross-platform language, many "binary" archives will contain only pure -Python source code. +cross-platform language, many so-called "binary" archives will contain only +pure Python source code. + +"Contributors" are individuals and organizations that work together to +develop a software component. 
+ +"Publishers" are individuals and organizations that make software components +available for integration (typically by uploading distributions to an +index server). + +"Integrators" are individuals and organizations that incorporate published +distributions as components of an application or larger system. "Build tools" are automated tools intended to run on development systems, producing source and binary distribution archives. Build tools may also be -invoked by installation tools in order to install software distributed as -source archives rather than prebuilt binary archives. +invoked by integration tools in order to build software distributed as +sdists rather than prebuilt binary archives. "Index servers" are active distribution registries which publish version and dependency metadata and place constraints on the permitted metadata. +"Public index servers" are index servers which allow distribution uploads +from untrusted third parties. The `Python Package Index`_ is a public index +server. + "Publication tools" are automated tools intended to run on development systems and upload source and binary distribution archives to index servers. -"Installation tools" are automated tools intended to run on production -systems, consuming source and binary distribution archives from an index -server or other designated location and deploying them to the target system. +"Integration tools" are automated tools that consume the metadata and +distribution archives published by an index server or other designated +source, and make use of them in some fashion, such as installing them or +converting them to a platform specific packaging format. + +"Installation tools" are integration tools specifically intended to run on +deployment targets, consuming source and binary distribution archives from +an index server or other designated location and deploying them to the target +system. 
"Automated tools" is a collective term covering build tools, index servers, -publication tools, installation tools and any other software that produces +publication tools, integration tools and any other software that produces or consumes distribution version and dependency metadata. -"Projects" refers to the developers that manage the creation of a particular -distribution. - "Legacy metadata" refers to earlier versions of this metadata specification, along with the supporting metadata file formats defined by the ``setuptools`` project. +"Entry points" are a scheme for identifying Python callables or other +objects as strings consisting of a Python module name and a module +attribute name, separated by a colon. For example: ``"test.regrtest:main"``. -Development and distribution activities -======================================= +"Distros" is used as the preferred term for Linux distributions, to help +avoid confusion with the Python-specific meaning of the term. -Making effective use of a common metadata format requires a common -understanding of the most complex development and distribution model -the format is intended to support. The metadata format described in this -PEP is based on the following activities: -* Development: during development, a user is operating from a - source checkout (or equivalent) for the current project. Dependencies must - be available in order to build, test and create a source archive of the - distribution. +Integration and deployment of distributions +------------------------------------------- - .. note:: - As a generated file, the full distribution metadata often won't be - available in a raw source checkout or tarball. In such cases, the - relevant distribution metadata is generally obtained from another - location, such as the last published release, or by generating it - based on a command given in a standard input file. 
This spec - deliberately avoids handling that scenario, instead falling back on - the existing ``setup.py`` functionality. +The primary purpose of the distribution metadata is to support integration +and deployment of distributions as part of larger applications and systems. -* Build: the build step is the process of turning a source archive into a - binary archive. Dependencies must be available in order to build and - create a binary archive of the distribution (including any documentation - that is installed on target systems). +Integration and deployment can in turn be broken down into further substeps. -* Deployment: the deployment phase consists of two subphases: +* Build: the build step is the process of turning a VCS checkout, source + archive or sdist into a binary archive. Dependencies must be available + in order to build and create a binary archive of the distribution + (including any documentation that is installed on target systems). - * Installation: the installation phase involves getting the distribution - and all of its runtime dependencies onto the target system. In this - phase, the distribution may already be on the system (when upgrading or - reinstalling) or else it may be a completely new installation. +* Installation: the installation step involves getting the distribution + and all of its runtime dependencies onto the target system. In this + step, the distribution may already be on the system (when upgrading or + reinstalling) or else it may be a completely new installation. - * Usage: the usage phase, also referred to as "runtime", is normal usage - of the distribution after it has been installed on the target system. +* Runtime: this is normal usage of a distribution after it has been + installed on the target system. -The metadata format described in this PEP is designed to enable the -following: +These three steps may all occur directly on the target system. 
Alternatively +the build step may be separated out by using binary archives provided by the +publisher of the distribution, or by creating the binary archives on a +separate system prior to deployment. -* It should be practical to have separate development systems, build systems - and deployment systems. -* It should be practical to install dependencies needed specifically to - build source archives only on development systems. -* It should be practical to install dependencies needed specifically to - build the software only on development and build systems, as well as - optionally on deployment systems if installation from source archives - is needed. -* It should be practical to install dependencies needed to run the - distribution only on development and deployment systems. -* It should be practical to install the dependencies needed to run a - distribution's test suite only on development systems, as well as - optionally on deployment systems. -* It should be practical for repackagers to separate out the build - dependencies needed to build the application itself from those required - to build its documentation (as the documentation often doesn't need to - be rebuilt when porting an application to a different platform). +The published metadata for distributions SHOULD allow integrators, with the +aid of build and integration tools, to: -.. 
note:: +* obtain the original source code that was used to create a distribution +* identify and retrieve the dependencies (if any) required to use a + distribution +* identify and retrieve the dependencies (if any) required to build a + distribution from source +* identify and retrieve the dependencies (if any) required to run a + distribution's test suite +* find resources on using and contributing to the project +* access sufficiently rich metadata to support contacting distribution + publishers through appropriate channels, as well as finding distributions + that are relevant to particular problems - This "most complex supported scenario" is almost *exactly* what has to - happen to get an upstream Python package into a Linux distribution, and - is why the current crop of automatic Python metadata -> Linux distro - metadata converters have some serious issues, at least from the point of - view of complying with distro packaging policies: the information - they need to comply with those policies isn't available from the - upstream projects, and all current formats for publishing it are - distro specific. This means either upstreams have to maintain metadata - for multiple distributions (which rarely happens) or else repackagers - have to do a lot of work manually in order to separate out these - dependencies in a way that complies with those policies. - One thing this PEP aims to do is define a metadata format that at least - has the *potential* to provide the info repackagers need, thus allowing - upstream Python projects and Linux distro repackagers to collaborate more - effectively (and, ideally, make it possible to reliably automate - the process of converting upstream Python distributions into policy - compliant distro packages). 
+Development and publication of distributions +-------------------------------------------- - Some items in this section (and the contents of this note) will likely - end up moving down to the "Rationale for changes from PEP 345" section. +The secondary purpose of the distribution metadata is to support effective +collaboration amongst software contributors and publishers during the +development phase. + +The published metadata for distributions SHOULD allow contributors +and publishers, with the aid of build and publication tools, to: + +* perform all the same activities needed to effectively integrate and + deploy the distribution +* identify and retrieve the additional dependencies needed to develop and + publish the distribution +* specify the dependencies (if any) required to use the distribution +* specify the dependencies (if any) required to build the distribution + from source +* specify the dependencies (if any) required to run the distribution's + test suite +* specify the additional dependencies (if any) required to develop and + publish the distribution + + +Standard build system +--------------------- + +Both development and integration of distributions rely on the ability to +build extension modules and perform other operations in a distribution +independent manner. 
+ +The current iteration of the metadata relies on the +``distutils``/``setuptools`` commands system to support these necessary +development and integration activities: + +* ``python setup.py dist_info``: generate distribution metadata in place + given a source archive or VCS checkout +* ``python setup.py sdist``: create an sdist from a source archive + or VCS checkout +* ``python setup.py build_ext --inplace``: build extension modules in place + given an sdist, source archive or VCS checkout +* ``python setup.py test``: run the distribution's test suite in place + given an sdist, source archive or VCS checkout +* ``python setup.py bdist_wheel``: create a binary archive from an sdist, + source archive or VCS checkout + +Future iterations of the metadata and associated PEPs may aim to replace +these ``distutils``/``setuptools`` dependent commands with build system +independent entry points. Metadata format @@ -247,14 +351,22 @@ is not provided for an optional field, that field MUST be omitted entirely. Automated tools MAY automatically derive valid values from other information sources (such as a version control system). +Automated tools, especially public index servers, MAY impose additional +length restrictions on metadata beyond those enumerated in this PEP. Such +limits SHOULD be imposed where necessary to protect the integrity of a +service, based on the available resources and the service provider's +judgment of reasonable metadata capacity requirements. + Metadata files -------------- The information defined in this PEP is serialised to ``pymeta.json`` -files for some use cases. As indicated by the extension, these -are JSON-encoded files. Each file consists of a single serialised mapping, -with fields as described in this PEP. +files for some use cases. These are files containing UTF-8 encoded JSON +metadata. + +Each metadata file consists of a single serialised mapping, with fields as +described in this PEP. 
There are three standard locations for these metadata files: @@ -270,21 +382,16 @@ There are three standard locations for these metadata files: These locations are to be confirmed, since they depend on the definition of sdist 2.0 and the revised installation database standard. There will also be a wheel 1.1 format update after this PEP is approved that - mandates 2.0+ metadata. + mandates provision of 2.0+ metadata. Other tools involved in Python distribution may also use this format. -It is expected that these metadata files will be generated by build tools -based on other input formats (such as ``setup.py``) rather than being -edited by hand. - -.. note:: - - It may be appropriate to add a "./setup.py dist_info" command to - setuptools to allow just the sdist metadata files to be generated - without having to build the full sdist archive. This would be - similar to the existing "./setup.py egg_info" command in setuptools, - which would continue to emit the legacy metadata format. +As JSON files are generally awkward to edit by hand, it is RECOMMENDED +that these metadata files be generated by build tools based on other +input formats (such as ``setup.py``) rather than being used directly as +a data input format. Generating the metadata as part of the publication +process also helps to deal with version specific fields (including the +source URL and the version field itself). For backwards compatibility with older installation tools, metadata 2.0 files MAY be distributed alongside legacy metadata. @@ -292,6 +399,10 @@ files MAY be distributed alongside legacy metadata. Index servers MAY allow distributions to be uploaded and installation tools MAY allow distributions to be installed with only legacy metadata. +Automated tools MAY attempt to automatically translate legacy metadata to +the format described in this PEP. Advice for doing so effectively is given +in Appendix A. 
+ Essential dependency resolution metadata ---------------------------------------- @@ -304,13 +415,18 @@ The essential dependency resolution metadata consists of the following fields: * ``metadata_version`` +* ``generator`` * ``name`` * ``version`` -* ``build_label`` -* ``version_url`` +* ``source_label`` +* ``source_url`` * ``extras`` -* ``requires`` -* ``may_require`` +* ``meta_requires`` +* ``meta_may_require`` +* ``run_requires`` +* ``run_may_require`` +* ``test_requires`` +* ``test_may_require`` * ``build_requires`` * ``build_may_require`` * ``dev_requires`` @@ -320,23 +436,34 @@ fields: * ``supports_environments`` When serialised to a file, the name used for this metadata set SHOULD -be ``pymeta-minimal.json``. +be ``pymeta-dependencies.json``. -Abbreviated metadata --------------------- -Some metadata fields have the potential to contain a lot of information -that will rarely be referenced, greatly increasing storage requirements -without providing significant benefits. +Included documents +------------------ -The abbreviated metadata for a distribution consists of all fields -*except* the following: +Rather than being incorporated directly into the structured metadata, some +supporting documents are included alongside the metadata file in the +``dist-info`` metadata directory. -* ``description`` -* ``contributors`` +To accommodate the variety of existing naming conventions for these files, +they are explicitly identified in the ``document_names`` field, rather +than expecting index servers and other automated tools to identify them +automatically. -When serialised to a file, the name used for this metadata set SHOULD -be ``pymeta-short.json``. + +Metadata validation +------------------- + +A `jsonschema `__ description of +the distribution metadata is `available +`__. + +This schema does NOT currently handle validation of some of the more complex +string fields (instead treating them as opaque strings). 
+ +Except where otherwise noted, all URL fields in the metadata MUST comply +with RFC 3986. Core metadata @@ -376,6 +503,17 @@ Example:: "metadata_version": "2.0" +Generator +--------- + +Name (and optional version) of the program that generated the file, +if any. A manually produced file would omit this field. + +Example:: + + "generator": "setuptools (0.8)" + + Name ---- @@ -391,7 +529,7 @@ permitted characters are constrained to: * hyphens (``-``) * periods (``.``) -Distributions named MUST start and end with an ASCII letter or digit. +Distribution names MUST start and end with an ASCII letter or digit. Automated tools MUST reject non-compliant names. @@ -399,14 +537,14 @@ All comparisons of distribution names MUST be case insensitive, and MUST consider hyphens and underscores to be equivalent. Index servers MAY consider "confusable" characters (as defined by the -Unicode Consortium in `TR39: Unicode Security Mechanisms `__) to be +Unicode Consortium in `TR39: Unicode Security Mechanisms `_) to be equivalent. Index servers that permit arbitrary distribution name registrations from untrusted sources SHOULD consider confusable characters to be equivalent when registering new distributions (and hence reject them as duplicates). -Installation tools MUST NOT silently accept a confusable alternate +Integration tools MUST NOT silently accept a confusable alternate spelling as matching a requested distribution name. At time of writing, the characters in the ASCII subset designated as @@ -421,45 +559,6 @@ Example:: "name": "ComfyChair" -.. note:: - - Debian doesn't actually permit underscores in names, but that seems - unduly restrictive for this spec given the common practice of using - valid Python identifiers as Python distribution names. A Debian side - policy of converting underscores to hyphens seems easy enough to - implement (and the requirement to consider hyphens and underscores as - equivalent ensures that doing so won't introduce any conflicts). 
- - We're deliberately *not* following Python 3 down the path of arbitrary - unicode identifiers at this time. The security implications of doing so - are substantially worse in the software distribution use case (it opens - up far more interesting attack vectors than mere code obfuscation), the - existing tooling really only works properly if you abide by the stated - restrictions and changing it would require a *lot* of work for all - the automated tools in the chain. - - PyPI has recently been updated to reject non-compliant names for newly - registered projects, but existing non-compliant names are still - tolerated when using legacy metadata formats. Affected distributions - will need to change their names (typically be replacing spaces with - hyphens) before they can migrate to the new metadata formats. - - Donald Stufft ran an analysis, and the new restrictions impact less - than 230 projects out of the ~31k already on PyPI. This isn't that - surprising given the fact that many existing tools could already - exhibit odd behaviour when attempting to deal with non-compliant - names, implicitly discouraging the use of more exotic names. - - Of those projects, ~200 have the only non-compliant character as an - internal space (e.g. "Twisted Web"). These will be automatically - migrated by replacing the spaces with hyphens (e.g. "Twisted-Web"), - which is what you have to actually type to install these distributions - with ``setuptools`` (which powers both ``easy_install`` and ``pip``). - - The remaining ~30 will be investigated manually and decided upon on a - case by case basis how to migrate them to the new naming rules (in - consultation with the maintainers of those projects where possible). - Version ------- @@ -469,11 +568,33 @@ versions are designed for consumption by automated tools and support a variety of flexible version specification mechanisms (see PEP 440 for details). +Version identifiers MUST comply with the format defined in PEP 440. 
+ +Version identifiers MUST be unique within each project. + Example:: "version": "1.0a2" +Summary +------- + +A short summary of what the distribution does. + +This field SHOULD contain fewer than 512 characters and MUST contain fewer +than 2048. + +This field SHOULD NOT contain any line breaks. + +A more complete description SHOULD be included as a separate file in the +sdist for the distribution. See `Document names`_ for details. + +Example:: + + "summary": "A module that is more fiendish than soft cushions." + + Source code metadata ==================== @@ -489,9 +610,13 @@ Source label ------------ A constrained identifying text string, as defined in PEP 440. Source labels -cannot be used in ordered version comparisons, but may be used to select -an exact version (see PEP 440 for details). +cannot be used in version specifiers - they are included for information +purposes only. +Source labels MUST meet the character restrictions defined in PEP 440. + +Source labels MUST be unique within each project and MUST NOT match any +defined version for the project. Examples:: @@ -508,19 +633,23 @@ Source URL ---------- A string containing a full URL where the source for this specific version of -the distribution can be downloaded. (This means that the URL can't be -something like ``"https://github.com/pypa/pip/archive/master.zip"``, but -instead must be ``"https://github.com/pypa/pip/archive/1.3.1.zip"``.) +the distribution can be downloaded. -Some appropriate targets for a source URL are a source tarball, an sdist -archive or a direct reference to a tag or specific commit in an online -version control system. +Source URLs MUST be unique within each project. This means that the URL +can't be something like ``"https://github.com/pypa/pip/archive/master.zip"``, +but instead must be ``"https://github.com/pypa/pip/archive/1.3.1.zip"``. 
-All source URL references SHOULD either specify a secure transport -mechanism (such as ``https``) or else include an expected hash value in the -URL for verification purposes. If an insecure transport is specified without -any hash information (or with hash information that the tool doesn't -understand), automated tools SHOULD at least emit a warning and MAY +The source URL MUST reference either a source archive or a tag or specific +commit in an online version control system that permits creation of a +suitable VCS checkout. It is intended primarily for integrators that +wish to recreate the distribution from the original source form. + +All source URL references SHOULD specify a secure transport +mechanism (such as ``https``), include an expected hash value in the +URL for verification purposes, or both. If an insecure transport is specified +without any hash information, with hash information that the tool doesn't +understand, or with a selected hash algorithm that the tool considers too +weak to trust, automated tools SHOULD at least emit a warning and MAY refuse to rely on the URL. It is RECOMMENDED that only hashes which are unconditionally provided by @@ -530,7 +659,7 @@ for source archive hashes. At time of writing, that list consists of ``'sha512'``. For source archive references, an expected hash value may be specified by -including a ``=`` as part of the URL +including a ``=`` entry as part of the URL fragment. For version control references, the ``VCS+protocol`` scheme SHOULD be @@ -546,29 +675,6 @@ Example:: "source_url": "http://github.com/pypa/pip/archive/1.3.1.zip#sha1=da9234ee9982d4bbb3c72346a6de940a148ea686" "source_url": "git+https://github.com/pypa/pip.git@1.3.1" -.. note:: - - This was called "Download-URL" in previous versions of the metadata. 
It - has been renamed, since there are plenty of other download locations and - this URL is meant to be a way to get the original source for development - purposes (or to generate an SRPM or other platform specific equivalent). - - For extra fun and games, it appears that unlike "svn+ssh://", - neither "git+ssh://" nor "hg+ssh://" natively support direct linking to a - particular tag (hg does support direct links to bookmarks through the URL - fragment, but that doesn't help for git and doesn't appear to be what I - want anyway). - - However pip does have a `defined convention - `__ for - this kind of link, which effectively splits a "URL" into "@". - - The PEP simply adopts pip's existing solution to this problem. - - This field is separate from the project URLs, as it's expected to - change for each version, while the project URLs are expected to - be fairly stable. - Additional descriptive metadata =============================== @@ -580,74 +686,29 @@ All of these fields are optional. Automated tools MUST operate correctly if a distribution does not provide them, including failing cleanly when an operation depending on one of these fields is requested. -Summary + +License ------- -A one-line summary of what the distribution does. +A short string summarising the license used for this distribution. -Publication tools SHOULD emit a warning if this field is not provided. Index -servers MAY require that this field be present before allowing a -distribution to be uploaded. +Note that distributions that provide this field should still specify any +applicable license Trove classifiers in the `Classifiers`_ field. Even +when an appropriate Trove classifier is available, the license summary can +be a good way to specify a particular version of that license, or to +indicate any variations or exception to the license. + +This field SHOULD contain fewer than 512 characters and MUST contain fewer +than 2048. + +This field SHOULD NOT contain any line breaks. 
+ +The full license text SHOULD be included as a separate file in the source +archive for the distribution. See `Document names`_ for details. Example:: - "summary": "A module that is more fiendish than soft cushions." - -.. note:: - - This used to be mandatory, and it's still highly recommended, but really, - nothing should break even when it's missing. - - -Description ------------ - -The distribution metadata should include a longer description of the -distribution that may run to several paragraphs. Software that deals -with metadata should not assume any maximum size for the description. - -The distribution description can be written using reStructuredText -markup [1]_. For programs that work with the metadata, supporting -markup is optional; programs may also display the contents of the -field as plain text without any special formatting. This means that -authors should be conservative in the markup they use. - -Example:: - - "description": "The ComfyChair module replaces SoftCushions.\\n\\nUse until lunchtime, but pause for a cup of coffee at eleven." - -.. note:: - - The difficulty of editing this field in a raw JSON file is one of the - main reasons this metadata interchange format is NOT recommended for - use as an input format for build tools. - - -Description Format ------------------- - -A field indicating the intended format of the text in the description field. -This allows index servers to render the description field correctly and -provide feedback on rendering errors, rather than having to guess the -intended format. - -If this field is omitted, or contains an unrecognised value, the default -rendering format MUST be plain text. 
- -The following format names SHOULD be used for the specified markup formats: - -* ``txt``: Plain text (default handling if field is omitted) -* ``rst``: reStructured Text -* ``md``: Markdown (exact syntax variant will be implementation dependent) -* ``adoc``: AsciiDoc -* ``html``: HTML - -Automated tools MAY render one or more of the listed formats as plain -text and MAY accept other markup formats beyond those listed. - -Example:: - - "description_format": "rst" + "license": "GPL version 3, excluding DRM provisions" Keywords @@ -661,40 +722,6 @@ Example:: "keywords": ["comfy", "chair", "cushions", "too silly", "monty python"] -License -------- - -A string indicating the license covering the distribution where the license -is not a simple selection from the "License" Trove classifiers. See -Classifiers" below. This field may also be used to specify a -particular version of a license which is named via the ``Classifier`` -field, or to indicate a variation or exception to such a license. - -Example:: - - "license": "GPL version 3, excluding DRM provisions" - - -License URL ------------ - -A specific URL referencing the full licence text for this version of the -distribution. - -Example:: - - "license_url": "https://github.com/pypa/pip/blob/1.3.1/LICENSE.txt" - -.. note:: - - Like Version URL, this is handled separately from the project URLs - as it is important that it remain accurate for this *specific* - version of the distribution, even if the project later switches to a - different license. - - The project URLs field is intended for more stable references. - - Classifiers ----------- @@ -704,11 +731,60 @@ for the distribution. Classifiers are described in PEP 301 [2]. 
Example:: "classifiers": [ - "Development Status :: 4 - Beta", - "Environment :: Console (Text Based)" + "Development Status :: 4 - Beta", + "Environment :: Console (Text Based)", + "License :: OSI Approved :: GNU General Public License v3 (GPLv3)" ] +Document names +-------------- + +Filenames for supporting documents included in the distribution's +``dist-info`` metadata directory. + +The following supporting documents can be named: + +* ``description``: a file containing a long description of the distribution +* ``license``: a file with the full text of the distribution's license +* ``changelog``: a file describing changes made to the distribution + +Supporting documents MUST be included directly in the ``dist-info`` +directory. Directory separators are NOT permitted in document names. + +The markup format (if any) for the file is indicated by the file extension. +This allows index servers and other automated tools to render included +text documents correctly and provide feedback on rendering errors, rather +than having to guess the intended format. + +If the filename has no extension, or the extension is not recognised, the +default rendering format MUST be plain text. + +The following markup renderers SHOULD be used for the specified file +extensions: + +* Plain text: ``.txt``, no extension, unknown extension +* reStructured Text: ``.rst`` +* Markdown: ``.md`` +* AsciiDoc: ``.adoc``, ``.asc``, ``.asciidoc`` +* HTML: ``.html``, ``.htm`` + +Automated tools MAY render one or more of the specified formats as plain +text and MAY render other markup formats beyond those listed. + +Automated tools SHOULD NOT make any assumptions regarding the maximum length +of supporting document content, except as necessary to protect the +integrity of a service. 
+ +Example:: + + "document_names": { + "description": "README.rst", + "license": "LICENSE.rst", + "changelog": "NEWS" + } + + Contributor metadata ==================== @@ -726,6 +802,12 @@ The ``name`` subfield is required, the other subfields are optional. If no specific role is stated, the default is ``contributor``. +Email addresses must be in the form ``local-part@domain`` where the +local-part may be up to 64 characters long and the entire email address +contains no more than 254 characters. The formal specification of the +format is in RFC 5322 (sections 3.2.3 and 3.4.1) and RFC 5321, with a more +readable form given in the informational RFC 3696 and the associated errata. + The defined contributor roles are as follows: * ``author``: the original creator of a distribution @@ -734,15 +816,6 @@ The defined contributor roles are as follows: * ``contributor``: any other individuals or organizations involved in the creation of the distribution -.. note:: - - The contributor role field is included primarily to replace the - Author, Author-Email, Maintainer, Maintainer-Email fields from metadata - 1.2 in a way that allows those distinctions to be fully represented for - lossless translation, while allowing future distributions to pretty - much ignore everything other than the contact/contributor distinction - if they so choose. - Contact and contributor metadata is optional. Automated tools MUST operate correctly if a distribution does not provide it, including failing cleanly when an operation depending on one of these fields is requested. @@ -789,12 +862,12 @@ are the same as those for the main contact field. Example:: "contributors": [ - {"name": "John C."}, - {"name": "Erik I."}, - {"name": "Terry G."}, - {"name": "Mike P."}, - {"name": "Graeme C."}, - {"name": "Terry J."} + {"name": "John C."}, + {"name": "Erik I."}, + {"name": "Terry G."}, + {"name": "Mike P."}, + {"name": "Graeme C."}, + {"name": "Terry J."} ] @@ -815,34 +888,45 @@ permitted as a URL label. 
Example:: "project_urls": { - "Documentation": "https://distlib.readthedocs.org" - "Home": "https://bitbucket.org/pypa/distlib" - "Repository": "https://bitbucket.org/pypa/distlib/src" - "Tracker": "https://bitbucket.org/pypa/distlib/issues" + "Documentation": "https://distlib.readthedocs.org" + "Home": "https://bitbucket.org/pypa/distlib" + "Repository": "https://bitbucket.org/pypa/distlib/src" + "Tracker": "https://bitbucket.org/pypa/distlib/issues" } -Dependency metadata -=================== +Semantic dependencies +===================== Dependency metadata allows distributions to make use of functionality provided by other distributions, without needing to bundle copies of those distributions. +Semantic dependencies allow publishers to indicate not only which other +distributions are needed, but also *why* they're needed. This additional +information allows integrators to install just the dependencies they need +for specific activities, making it easier to minimise installation +footprints in constrained environments (regardless of the reasons for +those constraints). + +Distributions may declare five differents kinds of dependency: + +* "Meta" dependencies: subdistributions that are grouped together into a + single larger metadistribution for ease of reference and installation. +* Runtime dependencies: other distributions that are needed to actually use + this distribution (but are not considered subdistributions). +* Test dependencies: other distributions that are needed to run the + automated test suite for this distribution (but are not needed just to + use it). +* Build dependencies: other distributions that are needed to build this + distribution. +* Development dependencies: other distributions that are needed when + working on this distribution (but do not fit into one of the other + dependency categories). + Dependency management is heavily dependent on the version identification and specification scheme defined in PEP 440. -.. 
note:: - - This substantially changes the old two-phase setup vs runtime dependency - model in metadata 1.2 (which was in turn derived from the setuptools - dependency parameters). The translation is that ``dev_requires`` and - ``build_requires`` both map to ``Setup-Requires-Dist`` - in 1.2, while ``requires`` and ``distributes`` map to ``Requires-Dist``. - To go the other way, ``Setup-Requires-Dist`` maps to ``build_requires`` - and ``Requires-Dist`` maps to ``distributes`` (for exact comparisons) - and ``requires`` (for all other version specifiers). - All of these fields are optional. Automated tools MUST operate correctly if a distribution does not provide them, by assuming that a missing field indicates "Not applicable for this distribution". @@ -854,10 +938,11 @@ Dependency specifications Individual dependencies are typically defined as strings containing a distribution name (as found in the ``name`` field). The dependency name may be followed by an extras specifier (enclosed in square -brackets) and by a version specification (within parentheses). +brackets) and by a version specifier or direct reference (within +parentheses). See `Extras (optional dependencies)`_ for details on extras and PEP 440 -for details on version specifiers. +for details on version specifiers and direct references. The distribution names should correspond to names as found on the `Python Package Index`_; while these names are often the same as the module names @@ -903,13 +988,6 @@ Note that the same extras and environment markers MAY appear in multiple conditional dependencies. This may happen, for example, if an extra itself only needs some of its dependencies in specific environments. -.. note:: - - Technically, you could store the conditional and unconditional - dependencies in a single list and switch based on the entry type - (string or mapping), but the ``*requires`` vs ``*may-require`` two - list design seems easier to understand and work with. 
- Mapping dependencies to development and distribution activities --------------------------------------------------------------- @@ -918,26 +996,34 @@ The different categories of dependency are based on the various distribution and development activities identified above, and govern which dependencies should be installed for the specified activities: -* Deployment dependencies: +* Implied runtime dependencies: - * ``distributes`` - * ``requires`` - * ``may_require`` - * Request the ``test`` extra to also install + * ``meta_requires`` + * ``meta_may_require`` + * ``run_requires`` + * ``run_may_require`` - * ``test_requires`` - * ``test_may_require`` - -* Build dependencies: +* Implied build dependencies: * ``build_requires`` * ``build_may_require`` + * If running the distribution's test suite as part of the build process, + request the ``:meta:``, ``:run:`` and ``:test:`` extras to also + install: -* Development dependencies: + * ``meta_requires`` + * ``meta_may_require`` + * ``run_requires`` + * ``run_may_require`` + * ``test_requires`` + * ``test_may_require`` - * ``distributes`` - * ``requires`` - * ``may_require`` +* Implied development and publication dependencies: + + * ``meta_requires`` + * ``meta_may_require`` + * ``run_requires`` + * ``run_may_require`` * ``build_requires`` * ``build_may_require`` * ``test_requires`` @@ -945,139 +1031,146 @@ should be installed for the specified activities: * ``dev_requires`` * ``dev_may_require`` - -To ease compatibility with existing two phase setup/deployment toolchains, -installation tools MAY treat ``dev_requires`` and ``dev_may_require`` as -additions to ``build_requires`` and ``build_may_require`` rather than -as separate fields. - -Installation tools SHOULD allow users to request at least the following -operations for a named distribution: - -* Install the distribution and any deployment dependencies. 
-* Install just the build dependencies without installing the distribution -* Install just the development dependencies without installing - the distribution -* Install just the development dependencies without installing - the distribution or any dependencies listed in ``distributes`` - -The notation described in `Extras (optional dependencies)`_ SHOULD be used to -request additional optional dependencies when installing deployment -or build dependencies. +The notation described in `Extras (optional dependencies)`_ SHOULD be used +to determine exactly what gets installed for various operations. Installation tools SHOULD report an error if dependencies cannot be found, MUST at least emit a warning, and MAY allow the user to force the installation to proceed regardless. -.. note:: - - As an example of mapping this to Linux distro packages, assume an - example project without any extras defined is split into 2 RPMs - in a SPEC file: example and example-devel - - The ``distributes``, ``requires`` and applicable ``may_require`` - dependencies would be mapped to the Requires dependencies for the - "example" RPM (a mapping from environment markers to SPEC file - conditions would also allow those to be handled correctly) - - The ``build_requires`` and ``build_may_require`` dependencies would be - mapped to the BuildRequires dependencies for the "example" RPM. - - All defined dependencies relevant to Linux, including those in - ``dev_requires`` and ``test_requires``, would become Requires - dependencies for the "example-devel" RPM. - - If a project defines any extras, those would be mapped to additional - virtual RPMs with appropriate BuildRequires and Requires entries based - on the details of the dependency specifications. 
- - A documentation toolchain dependency like Sphinx would either go in - ``build_requires`` (for example, if man pages were included in the - built distribution) or in ``dev_requires`` (for example, if the - documentation is published solely through ReadTheDocs or the - project website). This would be enough to allow an automated converter - to map it to an appropriate dependency in the spec file. - - -Distributes ------------ - -A list of subdistributions that can easily be installed and used together -by depending on this metadistribution. - -Automated tools MUST allow strict version matching and source reference -clauses in this field and MUST NOT allow more permissive version specifiers. - -Example:: - - "distributes": ["ComfyUpholstery (== 1.0a2)", - "ComfySeatCushion (== 1.0a2)"] - - -Requires --------- - -A list of other distributions needed when this distribution is deployed. - -Automated tools MAY disallow strict version matching clauses and source -references in this field and SHOULD at least emit a warning for such clauses. - -Example:: - - "requires": ["SciPy", "PasteDeploy", "zope.interface (>3.5.0)"] +See Appendix B for an overview of mapping these dependencies to an RPM +spec file. Extras ------ A list of optional sets of dependencies that may be used to define -conditional dependencies in ``"may_require"`` and similar fields. See -`Extras (optional dependencies)`_ for details. +conditional dependencies in ``"may_distribute"``, ``"run_may_require"`` and +similar fields. See `Extras (optional dependencies)`_ for details. -The extra name``"test"`` is reserved for requesting the dependencies -specified in ``test_requires`` and ``test_may_require`` and is NOT -permitted in this field. +The names of extras MUST abide by the same restrictions as those for +distribution names. 
Example:: "extras": ["warmup"] -May require ------------ +Meta requires +------------- -A list of other distributions that may be needed when this distribution -is deployed, based on the extras requested and the target deployment +An abbreviation of "metadistribution requires". This is a list of +subdistributions that can easily be installed and used together by +depending on this metadistribution. + +In this field, automated tools: + +* MUST allow strict version matching +* MUST NOT allow more permissive version specifiers. +* MAY allow direct references + +Public index servers SHOULD NOT allow the use of direct references in +uploaded distributions. Direct references are intended primarily as a +tool for software integrators rather than publishers. + +Distributions that rely on direct references to platform specific binary +archives SHOULD define appropriate constraints in their +``supports_environments`` field. + +Example:: + + "meta_requires": ["ComfyUpholstery (== 1.0a2)", + "ComfySeatCushion (== 1.0a2)"] + + +Meta may require +---------------- + +An abbreviation of "metadistribution may require". This is a list of +subdistributions that can easily be installed and used together by +depending on this metadistribution, but are not required in all +circumstances. + +Any extras referenced from this field MUST be named in the `Extras`_ field. + +In this field, automated tools: + +* MUST allow strict version matching +* MUST NOT allow more permissive version specifiers. +* MAY allow direct references + +Public index servers SHOULD NOT allow the use of direct references in +uploaded distributions. Direct references are intended primarily as a +tool for software integrators rather than publishers. + +Distributions that rely on direct references to platform specific binary +archives SHOULD defined appropriate constraints in their +``supports_environments`` field. 
+ +Example:: + + "meta_may_require": [ + { + "dependencies": ["CupOfTeaAtEleven (== 1.0a2)"], + "environment": "'linux' in sys.platform" + } + ] + + +Run requires +------------ + +A list of other distributions needed to actually run this distribution. + +Automated tools MUST NOT allow strict version matching clauses or direct +references in this field - if permitted at all, such clauses should appear +in ``meta_requires`` instead. + +Example:: + + "run_requires": ["SciPy", "PasteDeploy", "zope.interface (>3.5.0)"] + + +Run may require +--------------- + +A list of other distributions that may be needed to actually run this +distribution, based on the extras requested and the target deployment environment. Any extras referenced from this field MUST be named in the `Extras`_ field. -Automated tools MAY disallow strict version matching clauses and source -references in this field and SHOULD at least emit a warning for such clauses. +Automated tools MUST NOT allow strict version matching clauses or direct +references in this field - if permitted at all, such clauses should appear +in ``meta_may_require`` instead. Example:: - "may_require": [ - { - "dependencies": ["pywin32 (>1.0)"], - "environment": "sys.platform == 'win32'" - }, - { - "dependencies": ["SoftCushions"], - "extra": "warmup" - } - ] + "run_may_require": [ + { + "dependencies": ["pywin32 (>1.0)"], + "environment": "sys.platform == 'win32'" + }, + { + "dependencies": ["SoftCushions"], + "extra": "warmup" + } + ] + Test requires ------------- A list of other distributions needed in order to run the automated tests -for this distribution, either during development or when running the -``test_installed_dist`` metabuild when deployed. +for this distribution.. -Automated tools MAY disallow strict version matching clauses and source +Automated tools MAY disallow strict version matching clauses and direct references in this field and SHOULD at least emit a warning for such clauses. 
+Public index servers SHOULD NOT allow strict version matching clauses or +direct references in this field. + Example:: "test_requires": ["unittest2"] @@ -1087,41 +1180,45 @@ Test may require ---------------- A list of other distributions that may be needed in order to run the -automated tests for this distribution, either during development or when -running the ``test_installed_dist`` metabuild when deployed, based on the -extras requested and the target deployment environment. +automated tests for this distribution. Any extras referenced from this field MUST be named in the `Extras`_ field. -Automated tools MAY disallow strict version matching clauses and source +Automated tools MAY disallow strict version matching clauses and direct references in this field and SHOULD at least emit a warning for such clauses. +Public index servers SHOULD NOT allow strict version matching clauses or +direct references in this field. + Example:: - "test_may_require": [ - { - "dependencies": ["pywin32 (>1.0)"], - "environment": "sys.platform == 'win32'" - }, - { - "dependencies": ["CompressPadding"], - "extra": "warmup" - } - ] + "test_may_require": [ + { + "dependencies": ["pywin32 (>1.0)"], + "environment": "sys.platform == 'win32'" + }, + { + "dependencies": ["CompressPadding"], + "extra": "warmup" + } + ] Build requires -------------- A list of other distributions needed when this distribution is being built -(creating a binary archive from a source archive). +(creating a binary archive from an sdist, source archive or VCS checkout). Note that while these are build dependencies for the distribution being built, the installation is a *deployment* scenario for the dependencies. -Automated tools MAY disallow strict version matching clauses and source +Automated tools MAY disallow strict version matching clauses and direct references in this field and SHOULD at least emit a warning for such clauses. 
+Public index servers SHOULD NOT allow strict version matching clauses or +direct references in this field. + Example:: "build_requires": ["setuptools (>= 0.7)"] @@ -1131,8 +1228,8 @@ Build may require ----------------- A list of other distributions that may be needed when this distribution -is built (creating a binary archive from a source archive), based on the -features requested and the build environment. +is built (creating a binary archive from an sdist, source archive or +VCS checkout), based on the features requested and the build environment. Note that while these are build dependencies for the distribution being built, the installation is a *deployment* scenario for the dependencies. @@ -1142,21 +1239,24 @@ Any extras referenced from this field MUST be named in the `Extras`_ field. Automated tools MAY assume that all extras are implicitly requested when installing build dependencies. -Automated tools MAY disallow strict version matching clauses and source +Automated tools MAY disallow strict version matching clauses and direct references in this field and SHOULD at least emit a warning for such clauses. +Public index servers SHOULD NOT allow strict version matching clauses or +direct references in this field. + Example:: - "build_may_require": [ - { - "dependencies": ["pywin32 (>1.0)"], - "environment": "sys.platform == 'win32'" - }, - { - "dependencies": ["cython"], - "extra": "c-accelerators" - } - ] + "build_may_require": [ + { + "dependencies": ["pywin32 (>1.0)"], + "environment": "sys.platform == 'win32'" + }, + { + "dependencies": ["cython"], + "extra": "c-accelerators" + } + ] Dev requires @@ -1168,17 +1268,16 @@ dependencies. 
Additional dependencies that may be listed in this field include: -* tools needed to create a source archive +* tools needed to create an sdist from a source archive or VCS checkout * tools needed to generate project documentation that is published online rather than distributed along with the rest of the software -* additional test dependencies for tests which are not executed when the - test is invoked through the ``test_installed_dist`` metabuild hook (for - example, tests that require a local database server and web server and - may not work when fully installed on a production system) -Automated tools MAY disallow strict version matching clauses and source +Automated tools MAY disallow strict version matching clauses and direct references in this field and SHOULD at least emit a warning for such clauses. +Public index servers SHOULD NOT allow strict version matching clauses or +direct references in this field. + Example:: "dev_requires": ["hgtools", "sphinx (>= 1.0)"] @@ -1199,17 +1298,20 @@ Any extras referenced from this field MUST be named in the `Extras`_ field. Automated tools MAY assume that all extras are implicitly requested when installing development dependencies. -Automated tools MAY disallow strict version matching clauses and source +Automated tools MAY disallow strict version matching clauses and direct references in this field and SHOULD at least emit a warning for such clauses. +Public index servers SHOULD NOT allow strict version matching clauses or +direct references in this field. 
+ Example:: - "dev_may_require": [ - { - "dependencies": ["pywin32 (>1.0)"], - "environment": "sys.platform == 'win32'" - } - ] + "dev_may_require": [ + { + "dependencies": ["pywin32 (>1.0)"], + "environment": "sys.platform == 'win32'" + } + ] Provides @@ -1231,6 +1333,19 @@ For instance, with distribute merged back into setuptools, the merged project is able to include a ``"provides": ["distribute"]`` entry to satisfy any projects that require the now obsolete distribution's name. +To avoid malicious hijacking of names, when interpreting metadata retrieved +from a public index server, automated tools MUST prefer the distribution +named in a version specifier over other distributions using that +distribution's name in a ``"provides"`` entry. Index servers MAY drop such +entries from the metadata they republish, but SHOULD NOT refuse to publish +such distributions. + +However, to appropriately handle project forks and mergers, automated tools +MUST accept ``"provides"`` entries that name other distributions when the +entry is retrieved from a local installation database or when there is a +corresponding ``"obsoleted_by"`` entry in the metadata for the named +distribution. + A distribution may also provide a "virtual" project name, which does not correspond to any separately distributed project: such a name might be used to indicate an abstract capability which could be supplied @@ -1291,63 +1406,66 @@ distribution supports any platform supported by Python. Individual entries are environment markers, as described in `Environment markers`_. -Installation tools SHOULD report an error if supported platforms are +Installation tools SHOULD report an error if supported environments are specified by the distribution and the current platform fails to match any of them, MUST at least emit a warning, and MAY allow the user to force the installation to proceed regardless. 
-Examples:: +The two main uses of this field are to declare which versions of Python +and which underlying operating systems are supported. +Examples indicating supported Python versions:: + + # Supports Python 2.6+ + "supports_environments": ["python_version >= '2.6'"] + + # Supports Python 2.6+ (for 2.x) or 3.3+ (for 3.x) + "supports_environments": ["python_version >= '3.3'", + "'3.0' > python_version >= '2.6'"] + +Examples indicating supported operating systems:: + + # Windows only "supports_environments": ["sys_platform == 'win32'"] + + # Anything except Windows "supports_environments": ["sys_platform != 'win32'"] + + # Linux or BSD only "supports_environments": ["'linux' in sys_platform", "'bsd' in sys_platform"] +Example where the supported Python version varies by platform:: -.. note:: - - This field replaces the old Platform, Requires-Platform and - Requires-Python fields and has been redesigned with environment - marker based semantics that should make it possible to reliably flag, - for example, Unix specific or Windows specific distributions, as well - as Python 2 only and Python 3 only distributions. + # The standard library's os module has long supported atomic renaming + # on POSIX systems, but only gained atomic renaming on Windows in Python + # 3.3. A distribution that needs atomic renaming support for reliable + # operation might declare the following supported environments. + "supports_environments": ["python_version >= '2.6' and sys_platform != 'win32'", + "python_version >= '3.3' and sys_platform == 'win32'"] -Metabuild system -================ +Install hooks +============= -The ``metabuild_hooks`` field is used to define various operations that -may be invoked on a distribution in a platform independent manner. 
- -The metabuild system currently defines three operations as part of the -deployment of a distribution: +The ``install_hooks`` field is used to define operations to be +invoked on the distribution in the following situations: * Installing to a deployment system * Uninstalling from a deployment system -* Running the distribution's test suite on a deployment system (hence the - ``test`` runtime extra) -Distributions may define handles for each of these operations as an -"entry point", a reference to a Python callable, with the module name -separated from the reference within the module by a colon (``:``). +Distributions may define handlers for each of these operations as an +"entry point", which is a reference to a Python callable, with the module +name separated from the reference within the module by a colon (``:``). -Example metabuild hooks:: +Example install hooks:: - "metabuild_hooks": { - "postinstall": "myproject.build_hooks:postinstall", - "preuininstall": "myproject.build_hooks:preuninstall", - "test_installed_dist": "some_test_harness:metabuild_hook" + "install_hooks": { + "postinstall": "ComfyChair.install_hooks:postinstall", + "preuininstall": "ComfyChair.install_hooks:preuninstall" } -Build and installation tools MAY offer additional operations beyond the -core metabuild operations. These operations SHOULD be composed from the -defined metabuild operations where appropriate. - -Build and installation tools SHOULD support the legacy ``setup.py`` based -commands for metabuild operations not yet defined as metabuild hooks. - -The metabuild hooks are gathered together into a single top level -``metabuild_hooks`` field. The individual hooks are: +The currently defined install hooks are: * ``postinstall``: run after the distribution has been installed to a target deployment system (or after it has been upgraded). 
If the hook is @@ -1357,18 +1475,15 @@ The metabuild hooks are gathered together into a single top level deployment system (or before it is upgraded). If the hook is not defined, it indicates no distribution specific actions are needed prior to uninstallation. -* ``test_installed_dist``: test an installed distribution is working. If the - hook is not defined, it indicates the distribution does not support - execution of the test suite after deployment. -The expected signatures of these hooks are as follows:: +The required signatures of these hooks are as follows:: def postinstall(current_meta, previous_meta=None): """Run following installation or upgrade of the distribution *current_meta* is the distribution metadata for the version now installed on the current system - *previous_meta* is either missing or ``None`` (indicating a fresh + *previous_meta* is either omitted or ``None`` (indicating a fresh install) or else the distribution metadata for the version that was previously installed (indicating an upgrade or downgrade). """ @@ -1378,30 +1493,65 @@ The expected signatures of these hooks are as follows:: *current_meta* is the distribution metadata for the version now installed on the current system - *next_meta* is either missing or ``None`` (indicating complete + *next_meta* is either omitted or ``None`` (indicating complete uninstallation) or else the distribution metadata for the version that is about to be installed (indicating an upgrade or downgrade). """ - def test_installed_dist(current_meta): - """Check an installed distribution is working correctly +When install hooks are defined, it is assumed that they MUST be executed +to obtain a properly working installation of the distribution, and to +properly remove the distribution from a system. - Note that this check should always be non-destructive as it may be - invoked automatically by some tools. 
+Install hooks SHOULD NOT be used to provide functionality that is +expected to be provided by installation tools (such as rewriting of +shebang lines and generation of executable wrappers for Windows). - Requires that the distribution's test dependencies be installed - (indicated by the ``test`` runtime extra). +Installation tools MUST ensure the distribution is fully installed, and +available through the import system and installation database when invoking +install hooks. - Returns ``True`` if the check passes, ``False`` otherwise. - """ +Installation tools MUST call install hooks with full metadata, rather than +only the essential dependency resolution metadata. -Metabuild hooks MUST be called with at least abbreviated metadata, and MAY -be called with full metadata. +The given parameter names are considered part of the hook signature. +Installation tools MUST call install hooks solely with keyword arguments. +Install hook implementations MUST use the given parameter names. -Where necessary, metabuild hooks check for the presence or absence of -optional dependencies defined as extras using the same techniques used -during normal operation of the distribution (for example, checking for -import failures for optional dependencies). +Installation tools SHOULD invoke install hooks automatically after +installing a distribution from a binary archive. + +When installing from an sdist, source archive or VCS checkout, installation +tools SHOULD create a binary archive using ``setup.py bdist_wheel`` and +then install the binary archive normally (including invocation of any install +hooks). Installation tools SHOULD NOT invoke ``setup.py install`` directly. + +Installation tools SHOULD treat an exception thrown by a postinstall hook +as a failure of the installation and revert any other changes made to the +system. + +Installation tools SHOULD treat an exception thrown by a preuninstall hook +as an indication the removal of the distribution should be aborted.
+ +Installation tools MUST NOT silently ignore install hooks, as failing +to call these hooks may result in a misconfigured installation that fails +unexpectedly at runtime. Installation tools MAY refuse to install +distributions that define install hooks, or require that users +explicitly opt in to permitting the execution of such hooks. + +Install hook implementations MUST NOT make any assumptions regarding the +current working directory when they are invoked, and MUST NOT make +persistent alterations to the working directory or any other process global +state (other than potentially importing additional modules, or other +expected side effects of running the distribution). + +Install hooks have access to the full metadata for the release being +installed, that of the previous/next release (as appropriate), as well as +to all the normal runtime information (such as available imports). Hook +implementations can use this information to perform additional platform +specific installation steps. To check for the presence or absence of +"extras", hook implementations should use the same runtime checks that +would be used during normal operation (such as checking for the availability +of the relevant dependencies). Metadata Extensions @@ -1413,14 +1563,20 @@ distribution names, while the values may be any type natively supported in JSON:: "extensions" : { - "chili" : { "type" : "Poblano", "heat" : "Mild" }, - "languages" : [ "French", "Italian", "Hebrew" ] + "chili" : { "type" : "Poblano", "heat" : "Mild" }, + "languages" : [ "French", "Italian", "Hebrew" ] } -To avoid name conflicts, it is recommended that distribution names be used +To avoid name conflicts, it is RECOMMENDED that distribution names be used to identify metadata extensions. This practice will also make it easier to find authoritative documentation for metadata extensions. 
+Metadata extensions allow development tools to record information in the +metadata that may be useful during later phases of distribution. For +example, a build tool could include default build options in a metadata +extension when creating an sdist, and use those when creating the wheel +files later. + Extras (optional dependencies) ============================== @@ -1440,7 +1596,7 @@ Example of a distribution with optional dependencies:: "name": "ComfyChair", "extras": ["warmup", "c-accelerators"] - "may_require": [ + "run_may_require": [ { "dependencies": ["SoftCushions"], "extra": "warmup" @@ -1457,15 +1613,32 @@ Other distributions require the additional dependencies by placing the relevant extra names inside square brackets after the distribution name when specifying the dependency. -Extra specifications MUST support the following additional syntax: +Extra specifications MUST allow the following additional syntax: -* Multiple features can be requested by separating them with a comma within +* Multiple extras can be requested by separating them with a comma within the brackets. -* All explicitly defined extras may be requested with the ``*`` wildcard - character. Note that this does NOT request the implicitly defined - ``test`` extra - that must always be requested explicitly when it is - desired. -* Extras may be explicitly excluded by prefixing their name with a hyphen. +* The following special extras request processing of the corresponding + lists of dependencies: + + * ``:meta:``: ``meta_requires`` and ``meta_may_require`` + * ``:run:``: ``run_requires`` and ``run_may_require`` + * ``:test:``: ``test_requires`` and ``test_may_require`` + * ``:build:``: ``build_requires`` and ``build_may_require`` + * ``:dev:``: ``dev_requires`` and ``dev_may_require`` + * ``:*:``: process *all* dependency lists + +* The ``*`` character as an extra is a wild card that enables all of the + entries defined in the distribution's ``extras`` field. 
+* Extras may be explicitly excluded by prefixing their name with a ``-`` + character (this is useful in conjunction with ``*`` to exclude only + particular extras that are definitely not wanted, while enabling all + others). + +* The ``-`` character as an extra specification indicates that the + distribution itself should NOT be installed, and also disables the + normally implied processing of ``:meta:`` and ``:run:`` dependencies + (those may still be requested explicitly using the appropriate extra + specifications). Command line based installation tools SHOULD support this same syntax to allow extras to be requested explicitly. @@ -1473,15 +1646,31 @@ allow extras to be requested explicitly. The full set of dependency requirements is then based on the top level dependencies, along with those of any requested extras. -Example:: +Dependency examples:: - "requires": ["ComfyChair[warmup]"] + "run_requires": ["ComfyChair[warmup]"] -> requires ``ComfyChair`` and ``SoftCushions`` at run time - "requires": ["ComfyChair[*]"] + "run_requires": ["ComfyChair[*]"] -> requires ``ComfyChair`` and ``SoftCushions`` at run time, but - will also pick up any new optional dependencies other than those - needed solely to run the tests + will also pick up any new extras defined in later versions + +Command line examples:: + + pip install ComfyChair + -> installs ComfyChair with applicable :meta: and :run: dependencies + + pip install ComfyChair[*] + -> as above, but also installs all extra dependencies + + pip install ComfyChair[-,:build:,*] + -> installs just the build dependencies with all extras + + pip install ComfyChair[-,:build:,:run:,:meta:,:test:,*] + -> as above, but also installs dependencies needed to run the tests + + pip install ComfyChair[-,:*:,*] + -> installs the full set of development dependencies Environment markers @@ -1504,7 +1693,7 @@ And here's an example of some conditional metadata for a distribution that requires PyWin32 both at runtime and buildtime when 
using Windows:: "name": "ComfyChair", - "may_require": [ + "run_may_require": [ { "dependencies": ["pywin32 (>1.0)"], "environment": "sys.platform == 'win32'" @@ -1539,6 +1728,12 @@ where ``SUBEXPR`` is either a Python string (such as ``'2.4'``, or * ``platform_version``: ``platform.version()`` * ``platform_machine``: ``platform.machine()`` * ``platform_python_implementation``: ``platform.python_implementation()`` +* ``implementation_name``: ``sys.implementation.name`` +* ``implementation_version``: see definition below + +If a particular value is not available (such as the ``sys.implementation`` +subattributes in versions of Python prior to 3.3), the corresponding marker +variable MUST be considered equivalent to the empty string. Note that all subexpressions are restricted to strings or one of the marker variable names (which refer to string values), meaning that it is @@ -1548,17 +1743,24 @@ side of the ``in`` and ``not in`` operators. Chaining of comparison operations is permitted using the normal Python semantics of an implied ``and``. -The ``python_full_version`` marker variable is derived from -``sys.version_info()`` in accordance with the following algorithm:: +The ``python_full_version`` and ``implementation_version`` marker variables +are derived from ``sys.version_info`` and ``sys.implementation.version`` +respectively, in accordance with the following algorithm:: - def format_full_version(): - info = sys.version_info + def format_full_version(info): version = '{0.major}.{0.minor}.{0.micro}'.format(info) kind = info.releaselevel if kind != 'final': version += kind[0] + str(info.serial) return version + python_full_version = format_full_version(sys.version_info) + implementation_version = format_full_version(sys.implementation.version) + +``python_full_version`` will typically correspond to the leading segment +of ``sys.version``.
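As a runnable illustration of the algorithm above, the sketch below applies ``format_full_version`` to two hand-built version tuples and adds the empty-string fallback for interpreters without ``sys.implementation``. The ``VersionInfo`` namedtuple and the ``marker_versions`` helper are assumptions made for this example only; they are not defined by the PEP.

```python
import sys
from collections import namedtuple


def format_full_version(info):
    # Same algorithm as in the PEP text: "major.minor.micro", plus the first
    # letter of the release level and the serial for non-final releases.
    version = '{0.major}.{0.minor}.{0.micro}'.format(info)
    kind = info.releaselevel
    if kind != 'final':
        version += kind[0] + str(info.serial)
    return version


def marker_versions():
    """Return (python_full_version, implementation_version) for this interpreter."""
    python_full_version = format_full_version(sys.version_info)
    impl = getattr(sys, 'implementation', None)
    # Unavailable values are treated as the empty string.
    implementation_version = format_full_version(impl.version) if impl else ''
    return python_full_version, implementation_version


# Hypothetical stand-in with the same attributes as sys.version_info.
VersionInfo = namedtuple('VersionInfo', 'major minor micro releaselevel serial')

print(format_full_version(VersionInfo(3, 3, 0, 'final', 0)))  # 3.3.0
print(format_full_version(VersionInfo(3, 4, 0, 'beta', 2)))   # 3.4.0b2
```

Note that ``sys.version_info`` is an attribute rather than a function, so it is passed to ``format_full_version`` without being called.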
+ + Updating the metadata specification =================================== @@ -1570,8 +1772,68 @@ changing the meaning of existing fields, requires a new metadata version defined in a new PEP. -Summary of differences from \PEP 345 -==================================== +Appendix A: Conversion notes for legacy metadata +================================================ + +The reference implementations for converting from legacy metadata to +metadata 2.0 are: + +* the `wheel project `__, which + adds the ``bdist_wheel`` command to ``setuptools`` +* the `Warehouse project `__, which + will eventually be migrated to the Python Packaging Authority as the next + generation Python Package Index implementation +* the `distlib project `__ which is + derived from the core packaging infrastructure created for the + ``distutils2`` project and + +While it is expected that there may be some edge cases where manual +intervention is needed for clean conversion, the specification has been +designed to allow fully automated conversion of almost all projects on +PyPI. + +Metadata conversion (especially on the part of the index server) is a +necessary step to allow installation and analysis tools to start +benefiting from the new metadata format, without having to wait for +developers to upgrade to newer build systems. + + +Appendix B: Mapping dependency declarations to an RPM SPEC file +=============================================================== + + +As an example of mapping this PEP to Linux distro packages, assume an +example project without any extras defined is split into 2 RPMs +in a SPEC file: ``example`` and ``example-devel``. 
+ +The ``meta_requires``, ``run_requires`` and applicable +``meta_may_require`` and ``run_may_require`` dependencies would be mapped +to the Requires dependencies for the "example" RPM (a mapping from +environment markers relevant to Linux to SPEC file conditions would +also allow those to be handled correctly). + +The ``build_requires`` and ``build_may_require`` dependencies would be +mapped to the BuildRequires dependencies for the "example" RPM. + +All defined dependencies relevant to Linux, including those in +``dev_requires``, ``test_requires``, ``dev_may_require``, and +``test_may_require`` would become Requires dependencies for the +"example-devel" RPM. + +If the project did define any extras, those would likely be mapped to +additional virtual RPMs with appropriate BuildRequires and Requires +entries based on the details of the dependency specifications. + +A documentation toolchain dependency like Sphinx would either go in +``build_requires`` (for example, if man pages were included in the +built distribution) or in ``dev_requires`` (for example, if the +documentation is published solely through ReadTheDocs or the +project website). This would be enough to allow an automated converter +to map it to an appropriate dependency in the spec file.
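The mapping described in this appendix might be sketched roughly as below. This is a simplified illustration rather than a real converter: ``map_to_rpm`` and ``_deps`` are hypothetical helpers, the sample metadata is invented, and conditional ``*_may_require`` entries (those carrying an ``extra`` or ``environment`` key) are simply skipped instead of being translated into SPEC file conditions.

```python
KINDS = ("meta", "run", "build", "test", "dev")


def _deps(metadata, kind):
    """Collect unconditional dependencies of one kind from a metadata dict."""
    deps = list(metadata.get(kind + "_requires", []))
    for entry in metadata.get(kind + "_may_require", []):
        # Assumption: only entries without conditions are "applicable" here.
        if "extra" not in entry and "environment" not in entry:
            deps.extend(entry["dependencies"])
    return deps


def map_to_rpm(metadata):
    """Group dependencies for the 'example' and 'example-devel' RPMs."""
    requires = _deps(metadata, "meta") + _deps(metadata, "run")
    build_requires = _deps(metadata, "build")
    devel = []
    for kind in KINDS:
        for dep in _deps(metadata, kind):
            if dep not in devel:  # keep order, drop duplicates
                devel.append(dep)
    return {
        "example": {"Requires": requires, "BuildRequires": build_requires},
        "example-devel": {"Requires": devel},
    }


# Invented sample metadata for the hypothetical "example" project.
sample = {
    "run_requires": ["SoftCushions"],
    "build_requires": ["Sphinx"],
    "test_requires": ["nose"],
    "run_may_require": [
        {"dependencies": ["pywin32 (>1.0)"],
         "environment": "sys_platform == 'win32'"},
    ],
}
rpms = map_to_rpm(sample)
```

A fuller converter would evaluate the environment markers against the target platform and emit virtual RPMs for any extras.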
+ + +Appendix C: Summary of differences from \PEP 345 +================================================= * Metadata-Version is now 2.0, with semantics specified for handling version changes @@ -1592,21 +1854,21 @@ Summary of differences from \PEP 345 * Changed the version scheme to be based on PEP 440 rather than PEP 386 -* Added the build label mechanism as described in PEP 440 +* Added the source label mechanism as described in PEP 440 -* Support for different development, build, test and deployment dependencies +* Support for different kinds of dependencies * The "Extras" optional dependency mechanism * A well-defined metadata extension mechanism -* Metabuild hook system +* Install hook system * Clarify and simplify various aspects of environment markers: * allow use of parentheses for grouping in the pseudo-grammar * consistently use underscores instead of periods in the variable names - * clarify that chained comparisons are not permitted + * allow ordered string comparisons and chained comparisons * More flexible system for defining contact points and contributors @@ -1616,9 +1878,11 @@ Summary of differences from \PEP 345 * Updated obsolescence mechanism -* Added "License URL" field +* Identification of supporting documents in the ``dist-info`` directory: -* Explicit declaration of description markup format + * Allows markup formats to be indicated through file extensions + * Standardises the common practice of taking the description from README + * Also supports inclusion of license files and changelogs * With all due respect to Charles Schulz and Peanuts, many of the examples have been updated to be more `thematically appropriate`_ for Python ;) @@ -1667,7 +1931,7 @@ eventually even the option to embed arbitrary JSON inside particular subfields. 
The old serialisation format also wasn't amenable to easy conversion to -standard Python data structures for use in the new metabuild hook APIs, or +standard Python data structures for use in the new install hook APIs, or in future extensions to the importer APIs to allow them to provide information for inclusion in the installation database. @@ -1691,33 +1955,47 @@ Build labels See PEP 440 for the rationale behind the addition of this field. -Development, build and deployment dependencies ----------------------------------------------- +Support for different kinds of dependencies +------------------------------------------- -The separation of the ``requires``, ``build_requires`` and ``dev_requires`` -fields allows a distribution to indicate whether a dependency is needed -specifically to develop, build or deploy the distribution. +The separation of the five different kinds of dependency allows a +distribution to indicate whether a dependency is needed specifically to +develop, build, test or use the distribution. -As distribution metadata improves, this should allow much greater control -over where particular dependencies end up being installed . +To allow for metadistributions like PyObjC, while still actively +discouraging overly strict dependency specifications, the separate +``meta`` dependency fields are used to separate out those dependencies +where exact version specifications are appropriate. + +The advantage of having these distinctions supported in the upstream Python +specific metadata is that even if a project doesn't care about these +distinctions themselves, they may be more amenable to patches from +downstream redistributors that separate the fields appropriately. Over time, +this should allow much greater control over where and when particular +dependencies end up being installed.
+ +The names for the dependency fields have been deliberately chosen to avoid +conflicting with the existing terminology in setuptools and previous +versions of the metadata standard. Specifically, the names ``requires``, +``install_requires`` and ``setup_requires`` are not used, which will +hopefully reduce confusion when converting legacy metadata to the new +standard. Support for optional dependencies for distributions --------------------------------------------------- The new extras system allows distributions to declare optional -features, and to use the ``may_require`` and ``build_may_require`` fields -to indicate when particular dependencies are needed only to support those -features. It is derived from the equivalent system that is already in -widespread use as part of ``setuptools`` and allows that aspect of the -legacy ``setuptools`` metadata to be accurately represented in the new -metadata format. +behaviour, and to use the ``*may_require`` fields to indicate when +particular dependencies are needed only to support that behaviour. It is +derived from the equivalent system that is already in widespread use as +part of ``setuptools`` and allows that aspect of the legacy ``setuptools`` +metadata to be accurately represented in the new metadata format. -The ``test`` extra is implicitly defined for all distributions, as it -ties in with the new metabuild hook offering a standard way to request -execution of a distribution's test suite. Identifying test suite -dependencies is already one of the most popular uses of the extras system -in ``setuptools``. +The additions to the extras syntax relative to setuptools are defined to +make it easier to express the various possible combinations of dependencies, +in particular those associated with build systems (with optional support +for running the test suite) and development systems.
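To make the extended extras syntax more concrete, here is a minimal parsing sketch for specifications such as ``ComfyChair[-,:build:,*]``. It is an illustration under stated assumptions, not a reference implementation: ``parse_requirement`` and the shape of its return value are hypothetical, version specifiers are not handled, and unrecognised names are simply treated as ordinary extras.

```python
# The special extras that select whole dependency lists.
SPECIAL = {":meta:", ":run:", ":test:", ":build:", ":dev:"}


def parse_requirement(spec, defined_extras=()):
    """Split 'Name[extra,...]' into the pieces an installer would act on.

    *defined_extras* is the distribution's ``extras`` field, needed to
    expand the ``*`` wildcard.
    """
    name, bracket, rest = spec.partition("[")
    entries = []
    if bracket:
        if not rest.endswith("]"):
            raise ValueError("unterminated extras list in %r" % (spec,))
        entries = [e.strip() for e in rest[:-1].split(",") if e.strip()]

    # A bare "-" means: do not install the distribution itself, and do not
    # imply the :meta: and :run: dependency lists.
    install_self = "-" not in entries
    kinds = {":meta:", ":run:"} if install_self else set()
    extras, excluded = set(), set()
    for entry in entries:
        if entry == "-":
            continue
        if entry == ":*:":
            kinds |= SPECIAL
        elif entry in SPECIAL:
            kinds.add(entry)
        elif entry == "*":
            extras |= set(defined_extras)
        elif entry.startswith("-"):
            excluded.add(entry[1:])
        else:
            extras.add(entry)
    return {
        "name": name.strip(),
        "install_self": install_self,
        "dependency_kinds": sorted(kinds),
        "extras": sorted(extras - excluded),
    }


build_only = parse_requirement("ComfyChair[-,:build:,*]",
                               defined_extras=["warmup", "c-accelerators"])
```

With the sample call above, ``build_only`` requests only the ``:build:`` dependency list together with every defined extra, and marks the distribution itself as not to be installed.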
Support for metadata extensions @@ -1734,21 +2012,39 @@ the chosen extension, and the new extras mechanism, allowing support for particular extensions to be provided as optional features. -Support for metabuild hooks +Support for install hooks --------------------------- -The new metabuild system is designed to allow the wheel format to fully -replace direct installation on deployment targets, by allowing projects like -Twisted to still execute code following installation from a wheel file. +The new install hook system is designed to allow the wheel format to fully +replace direct installation on deployment targets, by allowing projects to +explicitly define code that should be executed following installation from +a wheel file. -Falling back to invoking ``setup.py`` directly rather than using a -metabuild hook will remain an option when relying on version 1.x metadata, -and is also used as the interim solution for installation from source -archives. +This may be something relatively simple, like the `two line +refresh `__ +of the Twisted plugin caches that the Twisted developers recommend for +any project that provides Twisted plugins, to more complex platform +dependent behaviour, potentially in conjunction with appropriate +metadata extensions and ``supports_environments`` entries. -The ``test_installed_dist`` metabuild hook is included in order to integrate -with build systems that can automatically invoke test suites, and as -a complement to the ability to explicitly specify test dependencies. +For example, upstream declaration of external dependencies for various +Linux distributions in a distribution neutral format may be supported by +defining an appropriate metadata extension that is read by a postinstall +hook and converted into an appropriate invocation of the system package +manager. 
Other operations (such as registering COM DLLs on Windows, +registering services for automatic startup on any platform, or altering +firewall settings) may need to be undertaken with elevated privileges, +meaning they cannot be deferred to implicit execution on first use of the +distribution. + +The install hook and metadata extension systems allow support for such +activities to be pursued independently by the individual platform +communities, while still interoperating with the cross-platform Python +tools. + +Legacy packages that expect to be able to run code on target systems using +``setup.py install`` will no longer work correctly. Such packages will +already break when pip 1.4+ is configured to use a wheel cache directory. Changes to environment markers @@ -1805,8 +2101,9 @@ platforms in a way that is actually amenable to automated processing. This has been used to replace several older fields with poorly defined semantics. For the moment, the old ``Requires-External`` field has been removed -entirely. Possible replacements may be explored through the metadata -extension mechanism. +entirely. The combination of explicit support for post install hooks and the +metadata extension mechanism will hopefully prove to be a more useful +replacement. Updated obsolescence mechanism @@ -1824,22 +2121,55 @@ The ``Obsoletes-Dist`` header is removed rather than deprecated as it is not widely supported, and so removing it does not present any significant barrier to tools and projects adopting the new metadata format. -Explicit markup for description -------------------------------- -Currently, PyPI attempts to detect the markup format by rendering it as -reStructuredText, and if that fails, treating it as plain text. Allowing -the intended format to be stated explicitly will allow this guessing to be -removed, and more informative error reports to be provided to users when -a rendering error occurs.
+Included text documents +----------------------- -This is especially necessary since PyPI applies additional restrictions to +Currently, PyPI attempts to determine the description's markup format by +rendering it as reStructuredText, and if that fails, treating it as plain +text. + +Furthermore, many projects simply read their long description in from an +existing README file in ``setup.py``. The popularity of this practice is +only expected to increase, as many online version control systems +(including both GitHub and BitBucket) automatically display such files +on the landing page for the project. + +Standardising on the inclusion of the long description as a separate +file in the ``dist-info`` directory allows this to be simplified: + +* An existing file can just be copied into the ``dist-info`` directory as + part of creating the sdist +* The expected markup format can be determined by inspecting the file + extension of the specified path + +Allowing the intended format to be stated explicitly in the path allows +the format guessing to be removed and more informative error reports to be +provided to users when a rendering error occurs. + +This is especially helpful since PyPI applies additional restrictions to the rendering process for security reasons, thus a description that renders correctly on a developer's system may still fail to render on the server. +The document naming system used to achieve this then makes it relatively +straightforward to allow declaration of alternative markup formats like +HTML, Markdown and AsciiDoc through the use of appropriate file +extensions, as well as to define similar included documents for the +project's license and changelog. -Deferred features -================= +Grouping the included document names into a single top level field gives +automated tools the option of treating them as arbitrary documents without +worrying about their contents. 
+ +Requiring that the included documents be added to the ``dist-info`` metadata +directory means that the complete metadata for the distribution can be +extracted from an sdist or binary archive simply by extracting that +directory, without needing to check for references to other files in the +sdist. + + +Appendix D: Deferred features +============================= Several potentially useful features have been deliberately deferred in order to better prioritise our efforts in migrating to the new metadata @@ -1847,15 +2177,26 @@ standard. These all reflect information that may be nice to have in the new metadata, but which can be readily added in metadata 2.1 without breaking any use cases already supported by metadata 2.0. -Once the ``pypi``, ``setuptools``, ``pip`` and ``distlib`` projects -support creation and consumption of metadata 2.0, then we may revisit -the creation of metadata 2.1 with these additional features. +Once the ``pypi``, ``setuptools``, ``pip``, ``wheel`` and ``distlib`` +projects support creation and consumption of metadata 2.0, then we may +revisit the creation of metadata 2.1 with some or all of these additional +features. -.. note:: - Given the nature of this PEP as an interoperability specification, - this section will probably be removed before the PEP is accepted. - However, it's useful to have it here while discussion is ongoing. +MIME type registration +---------------------- + +At some point after acceptance of the PEP, I will likely submit the +following MIME type registration requests to IANA: + +* Full metadata: ``application/vnd.python.pymeta+json`` +* Abbreviated metadata: ``application/vnd.python.pymeta-short+json`` +* Essential dependency resolution metadata: + ``application/vnd.python.pymeta-dependencies+json`` + +It's even possible we may be able to just register the ``vnd.python`` +namespace under the banner of the PSF rather than having to register +the individual subformats. 
String methods in environment markers @@ -1870,61 +2211,82 @@ and the fact that 64-bit Windows still shows up as ``win32`` is more than a little strange. -Module listing --------------- +Module and file listings +------------------------ -A top level ``"module"`` key, referencing a list of strings, with each -giving the fully qualified name of a public package or module provided -by the distribution. - -A flat list would be used in order to correctly accommodate namespace -packages (where a distribution may provide subpackages or submodules without -explicitly providing the parent namespace package). - -Example:: - - "modules": [ - "comfy.chair" - ] +Derived metadata giving the modules and files included in built +distributions may be useful at some point in the future. (At least RPM +provides this, and I believe the APT equivalent does as well) Explicitly providing a list of public module names will likely help with enabling features in RPM like "Requires: python(requests)", as well as providing richer static metadata for analysis from PyPI. -However, this is just extra info that doesn't impact installing from wheels, -so it is a good candidate for postponing to metadata 2.1. +However, this is just extra info that doesn't impact reliably installing +from wheels, so it is a good candidate for postponing to metadata 2.1 +(at the earliest). -Additional metabuild hooks --------------------------- +Additional install hooks +------------------------ -The following draft metabuild operations have been deferred for now: +In addition to the postinstall and preuninstall hooks described in the PEP, +other distribution systems (like RPM) include the notion of preinstall +and postuninstall hooks. These hooks would run with the runtime dependencies +installed, but without the distribution itself. These have been deliberately +omitted, as they're well suited to being explored further as metadata +extensions. 
+ +Similarly, the idea of "optional" postinstall and preuninstall hooks can +be pursued as a metadata extension. + +By contrast, the mandatory postinstall and preuninstall hooks have been +included directly in the PEP, specifically to ensure installation tools +don't silently ignore them. This ensures users will either be able to +install such distributions, or else receive an explicit error at installation +time. + + +Metabuild system +---------------- + +This version of the metadata specification continues to use ``setup.py`` +and the distutils command syntax to invoke build and test related +operations on a source archive or VCS checkout. + +It may be desirable to replace these in the future with tool independent +entry points that support: * Generating the metadata file on a development system -* Generating a source archive on a development system +* Generating an sdist on a development system * Generating a binary archive on a build system +* Running the test suite on a built (but not installed) distribution Metadata 2.0 deliberately focuses on wheel based installation, leaving -tarball and sdist based installation to use the existing ``setup.py`` -based ``distutils`` command interface. +sdist, source archive, and VCS checkout based installation to use the +existing ``setup.py`` based ``distutils`` command interface. 
-In the meantime, the above three operations will continue to be handled -through the ``distutils``/``setuptools`` command system: +In the meantime, the above operations will be handled through the +``distutils``/``setuptools`` command system: * ``python setup.py dist_info`` * ``python setup.py sdist`` +* ``python setup.py build_ext --inplace`` +* ``python setup.py test`` * ``python setup.py bdist_wheel`` -The following additional metabuild hooks may be added in metadata 2.1 to +The following metabuild hooks may be defined in metadata 2.1 to cover these operations without relying on ``setup.py``: -* ``make_dist_info``: generate the source archive's dist_info directory -* ``make_sdist``: construct a source archive -* ``build_wheel``: construct a binary wheel archive from an sdist source - archive +* ``make_dist_info``: generate the sdist's dist_info directory +* ``make_sdist``: create the contents of an sdist +* ``build_dist``: create the contents of a binary wheel archive from an + unpacked sdist +* ``test_built_dist``: run the test suite for a built distribution -Tentative signatures have been designed for those hooks, but they will -not be pursued further until 2.1:: +Tentative signatures have been designed for those hooks, but in order to +better focus initial development efforts on the integration and installation +use cases, they will not be pursued further until metadata 2.1:: def make_dist_info(source_dir, info_dir): """Generate the contents of dist_info for an sdist archive @@ -1949,11 +2311,11 @@ not be pursued further until 2.1:: Returns the distribution metadata as a dictionary. 
""" - def build_wheel(sdist_dir, contents_dir, info_dir, compatibility=None): - """Generate the contents of a wheel archive + def build_dist(sdist_dir, built_dir, info_dir, compatibility=None): + """Generate the contents of a binary wheel archive - *source_dir* points to an unpacked source archive - *contents_dir* is the destination where the wheel contents should be + *sdist_dir* points to an unpacked sdist + *built_dir* is the destination where the wheel contents should be written (note that archiving the contents is the responsibility of the metabuild tool rather than the hook function) *info_dir* is the destination where the wheel metadata files should @@ -1965,36 +2327,93 @@ not be pursued further until 2.1:: Returns the actual compatibility tag for the build """ -As with the existing metabuild hooks, checking for extras would be done + def test_built_dist(sdist_dir, built_dir, info_dir): + """Check a built (but not installed) distribution works as expected + + *sdist_dir* points to an unpacked sdist + *built_dir* points to a platform appropriate unpacked wheel archive + (which may be missing the wheel metadata directory) + *info_dir* points to the appropriate wheel metadata directory + + Requires that the distribution's test dependencies be installed + (indicated by the ``:test:`` extra). + + Returns ``True`` if the check passes, ``False`` otherwise. + """ + +As with the existing install hooks, checking for extras would be done using the same import based checks as are used for runtime extras. That way it doesn't matter if the additional dependencies were requested explicitly or just happen to be available on the system. +There are still a number of open questions with this design, such as whether +a single build hook is sufficient to cover both "build for testing" and +"prep for deployment", as well as various complexities like support for +cross-compilation of binaries, specification of target platforms and +Python versions when creating wheel files, etc. 
-Rejected Features -================= +Opting to retain the status quo for now allows us to make progress on +improved metadata publication and binary installation support, rather than +having to delay that awaiting the creation of a viable metabuild framework. + + +Appendix E: Rejected features +============================= The following features have been explicitly considered and rejected as introducing too much additional complexity for too small a gain in expressiveness. -.. note:: - Given the nature of this PEP as an interoperability specification, - this section will probably be removed before the PEP is accepted. - However, it's useful to have it here while discussion is ongoing. +Disallowing underscores in distribution names +--------------------------------------------- + +Debian doesn't actually permit underscores in names, but that seems +unduly restrictive for this spec given the common practice of using +valid Python identifiers as Python distribution names. A Debian side +policy of converting underscores to hyphens seems easy enough to +implement (and the requirement to consider hyphens and underscores as +equivalent ensures that doing so won't introduce any conflicts). -Detached metadata ------------------ +Allowing the use of Unicode in distribution names +------------------------------------------------- -Rather than allowing some large items (such as the description field) to -be distributed separately, this PEP instead defines two metadata subsets -that should support more reasonable caching and API designs (for example, -only the essential dependency resolution metadata would be distributed -through TUF, and it is entirely possible the updated sdist, wheel and -installation database specs will use the abbreviated metadata, leaving -the full metadata as the province of index servers). 
+This PEP deliberately avoids following Python 3 down the path of arbitrary +Unicode identifiers, as the security implications of doing so are +substantially worse in the software distribution use case (it opens +up far more interesting attack vectors than mere code obfuscation). + +In addition, the existing tools really only work properly if you restrict +names to ASCII and changing that would require a *lot* of work for all +the automated tools in the chain. + +It may be reasonable to revisit this question at some point in the (distant) +future, but setting up a more reliable software distribution system is +challenging enough without adding more general Unicode identifier support +into the mix. + + +Single list for conditional and unconditional dependencies +---------------------------------------------------------- + +It's technically possible to store the conditional and unconditional +dependencies of each kind in a single list and switch the handling based on +the entry type (string or mapping). + +However, the current ``*requires`` vs ``*may-require`` two list design seems +easier to understand and work with, since it's only the conditional +dependencies that need to be checked against the requested extras list and +the target installation environment. + + +Depending on source labels +-------------------------- + +There is no mechanism to express a dependency on a source label - they +are included in the metadata for internal project reference only. Instead, +dependencies must be expressed in terms of either public versions or else +direct URL references. Alternative dependencies @@ -2019,7 +2438,7 @@ case would arguably be better served by an SQL Alchemy defined "supported database driver" metadata extension where a project depends on SQL Alchemy, and then declares in the extension which database drivers are checked for compatibility by the upstream project (similar to the advisory -``supports-platform`` field in the main metadata). 
+``supports_environments`` field in the main metadata). We're also getting better support for "virtual provides" in this version of the metadata standard, so this may end up being an installer and index @@ -2047,9 +2466,67 @@ Conditional provides Under the revised metadata design, conditional "provides" based on runtime features or the environment would go in a separate "may_provide" field. -However, I'm not convinced there's a great use case for that, so the idea +However, it isn't clear there's any use case for doing that, so the idea is rejected unless someone can present a compelling use case (and even then -the idea wouldn't be reconsidered until metadata 2.1 at the earliest). +the idea won't be reconsidered until metadata 2.1 at the earliest). + + +A hook to run tests against installed distributions +--------------------------------------------------- + +Earlier drafts of this PEP defined a hook for running automated +tests against an *installed* distribution. This isn't actually what you +generally want - you want the ability to test a *built* distribution, +potentially relying on files which won't be included in the binary archives. + +RPM's "check" step also runs between the build step and the install step, +rather than after the install step. + +Accordingly, the ``test_installed_dist`` hook has been removed, and the +``test_built_dist`` metabuild hook has been tentatively defined. However, +along with the rest of the metabuild hooks, further consideration has been +deferred until metadata 2.1 at the earliest. + + +Extensible signatures for the install hooks +------------------------------------------- + +The install hooks have been deliberately designed to NOT accept arbitrary +keyword arguments that the hook implementation is then expected to ignore.
+ +The argument in favour of that API design technique is to allow the addition +of new optional arguments in the future, without requiring the definition +of a new install hook, or migration to version 3.0 of the metadata +specification. It is a technique very commonly seen in function wrappers +which merely pass arguments along to the inner function rather than +processing them directly. + +However, the install hooks are already designed to have access to the full +metadata for the distribution (including all metadata extensions and +the previous/next version when appropriate), as well as to the full target +deployment environment. + +This means there are two candidates for additional information that +could be passed as arbitrary keyword arguments: + +* installer dependent settings +* user provided installation options + +The first of those runs explicitly counter to one of the core goals of the +metadata 2.0 specification: decoupling the software developer's choice of +development and publication tools from the software integrator's choice of +integration and deployment tools. + +The second is a complex problem that has a readily available workaround in +the form of operating system level environment variables (this is also +one way to interoperate with platform specific installation tools). + +Alternatively, installer developers may either implicitly inject an +additional metadata extension when invoking the install hook, or else +define an alternate hook signature as a distinct metadata extension to be +provided by the distribution. Either of these approaches makes the +reliance on installer-dependent behaviour suitably explicit in either +the install hook implementation or the distribution metadata. 
References diff --git a/pep-0426/pymeta-schema.json b/pep-0426/pymeta-schema.json new file mode 100644 index 000000000..2b67dafa8 --- /dev/null +++ b/pep-0426/pymeta-schema.json @@ -0,0 +1,249 @@ +{ + "id": "http://www.python.org/dev/peps/pep-0426/", + "$schema": "http://json-schema.org/draft-04/schema#", + "title": "Metadata for Python Software Packages 2.0", + "type": "object", + "properties": { + "metadata_version": { + "description": "Version of the file format", + "type": "string", + "pattern": "^(\\d+(\\.\\d+)*)$" + }, + "generator": { + "description": "Name and version of the program that produced this file.", + "type": "string", + "pattern": "^[0-9A-Za-z]([0-9A-Za-z_.-]*[0-9A-Za-z])( \\((\\d+(\\.\\d+)*)((a|b|c|rc)(\\d+))?(\\.(post)(\\d+))?(\\.(dev)(\\d+))\\))?$" + }, + "name": { + "description": "The name of the distribution.", + "type": "string", + "pattern": "^[0-9A-Za-z]([0-9A-Za-z_.-]*[0-9A-Za-z])?$" + }, + "version": { + "description": "The distribution's public version identifier", + "type": "string", + "pattern": "^(\\d+(\\.\\d+)*)((a|b|c|rc)(\\d+))?(\\.(post)(\\d+))?(\\.(dev)(\\d+))?$" + }, + "source_label": { + "description": "A constrained identifying text string", + "type": "string", + "pattern": "^[0-9a-z_.-+]+$" + }, + "source_url": { + "description": "A string containing a full URL where the source for this specific version of the distribution can be downloaded.", + "type": "string", + "format": "uri" + }, + "summary": { + "description": "A one-line summary of what the distribution does.", + "type": "string" + }, + "document_names": { + "description": "Names of supporting metadata documents", + "type": "object", + "properties": { + "description": { + "type": "string", + "$ref": "#/definitions/document_name" + }, + "changelog": { + "type": "string", + "$ref": "#/definitions/document_name" + }, + "license": { + "type": "string", + "$ref": "#/definitions/document_name" + } + }, + "additionalProperties": false + }, + "keywords": { + "description": 
"A list of additional keywords to be used to assist searching for the distribution in a larger catalog.", + "type": "array", + "items": { + "type": "string" + } + }, + "license": { + "description": "A string indicating the license covering the distribution.", + "type": "string" + }, + "classifiers": { + "description": "A list of strings, with each giving a single classification value for the distribution.", + "type": "array", + "items": { + "type": "string" + } + }, + "contacts": { + "description": "A list of contributor entries giving the recommended contact points for getting more information about the project.", + "type": "array", + "items": { + "type": "object", + "$ref": "#/definitions/contact" + } + }, + "contributors": { + "description": "A list of contributor entries for other contributors not already listed as current project points of contact.", + "type": "array", + "items": { + "type": "object", + "$ref": "#/definitions/contact" + } + }, + "project_urls": { + "description": "A mapping of arbitrary text labels to additional URLs relevant to the project.", + "type": "object" + }, + "extras": { + "description": "A list of optional sets of dependencies that may be used to define conditional dependencies in \"may_require\" and similar fields.", + "type": "array", + "items": { + "type": "string", + "$ref": "#/definitions/extra_name" + } + }, + "distributes": { + "description": "A list of subdistributions made available through this metadistribution.", + "type": "array", + "$ref": "#/definitions/dependencies" + }, + "may_distribute": { + "description": "A list of subdistributions that may be made available through this metadistribution, based on the extras requested and the target deployment environment.", + "$ref": "#/definitions/conditional_dependencies" + }, + "run_requires": { + "description": "A list of other distributions needed when to run this distribution.", + "type": "array", + "$ref": "#/definitions/dependencies" + }, + "run_may_require": { + 
"description": "A list of other distributions that may be needed when this distribution is deployed, based on the extras requested and the target deployment environment.", + "$ref": "#/definitions/conditional_dependencies" + }, + "test_requires": { + "description": "A list of other distributions needed when this distribution is tested.", + "type": "array", + "$ref": "#/definitions/dependencies" + }, + "test_may_require": { + "description": "A list of other distributions that may be needed when this distribution is tested, based on the extras requested and the target deployment environment.", + "type": "array", + "$ref": "#/definitions/conditional_dependencies" + }, + "build_requires": { + "description": "A list of other distributions needed when this distribution is built.", + "type": "array", + "$ref": "#/definitions/dependencies" + }, + "build_may_require": { + "description": "A list of other distributions that may be needed when this distribution is built, based on the extras requested and the target deployment environment.", + "type": "array", + "$ref": "#/definitions/conditional_dependencies" + }, + "dev_requires": { + "description": "A list of other distributions needed when this distribution is developed.", + "type": "array", + "$ref": "#/definitions/dependencies" + }, + "dev_may_require": { + "description": "A list of other distributions that may be needed when this distribution is developed, based on the extras requested and the target deployment environment.", + "type": "array", + "$ref": "#/definitions/conditional_dependencies" + }, + "provides": { + "description": "A list of strings naming additional dependency requirements that are satisfied by installing this distribution. These strings must be of the form Name or Name (Version), as for the requires field.", + "type": "array", + "items": { + "type": "string" + } + }, + "obsoleted_by": { + "description": "A string that indicates that this project is no longer being developed. 
The named project provides a substitute or replacement.", + "type": "string", + "$ref": "#/definitions/version_specifier" + }, + "supports_environments": { + "description": "A list of strings specifying the environments that the distribution explicitly supports.", + "type": "array", + "items": { + "type": "string", + "$ref": "#/definitions/environment_marker" + } + }, + "metabuild_hooks": { + "description": "The metabuild_hooks field is used to define various operations that may be invoked on a distribution in a platform independent manner.", + "type": "object" + }, + "extensions": { + "description": "Extensions to the metadata may be present in a mapping under the 'extensions' key.", + "type": "object" + } + }, + + "required": ["metadata_version", "name", "version"], + "additionalProperties": false, + + "definitions": { + "contact": { + "type": "object", + "properties": { + "name": { + "type": "string" + }, + "email": { + "type": "string" + }, + "url": { + "type": "string" + }, + "role": { + "type": "string" + } + }, + "required": ["name"], + "additionalProperties": false + }, + "dependencies": { + "type": "array", + "items": { + "type": "string", + "$ref": "#/definitions/version_specifier" + } + }, + "conditional_dependencies": { + "type": "array", + "items": { + "type": "object", + "properties": { + "extra": { + "type": "string", + "$ref": "#/definitions/extra_name" + }, + "environment": { + "type": "string", + "$ref": "#/definitions/environment_marker" + }, + "dependencies": { + "type": "array", + "$ref": "#/definitions/dependencies" + } + }, + "required": ["dependencies"], + "additionalProperties": false + } + }, + "version_specifier": { + "type": "string" + }, + "extra_name": { + "type": "string" + }, + "environment_marker": { + "type": "string" + }, + "document_name": { + "type": "string" + } + } +} diff --git a/pep-0435.txt b/pep-0435.txt index 8ded74927..41739a51d 100644 --- a/pep-0435.txt +++ b/pep-0435.txt @@ -5,7 +5,7 @@ Last-Modified: $Date$ Author: 
Barry Warsaw , Eli Bendersky , Ethan Furman -Status: Accepted +Status: Final Type: Standards Track Content-Type: text/x-rst Created: 2013-02-23 @@ -467,6 +467,10 @@ assignment to ``Animal`` is equivalent to:: ... cat = 3 ... dog = 4 +The reason for defaulting to ``1`` as the starting number and not ``0`` is +that ``0`` is ``False`` in a boolean sense, but enum members all evaluate +to ``True``. + Proposed variations =================== diff --git a/pep-0440.txt b/pep-0440.txt index 391c07705..79e8bc1eb 100644 --- a/pep-0440.txt +++ b/pep-0440.txt @@ -9,7 +9,7 @@ Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 18 Mar 2013 -Post-History: 30 Mar 2013, 27-May-2013 +Post-History: 30 Mar 2013, 27 May 2013, 20 Jun 2013 Replaces: 386 @@ -27,7 +27,7 @@ standardised approach to versioning, as described in PEP 345 and PEP 386. This PEP was broken out of the metadata 2.0 specification in PEP 426. Unlike PEP 426, the notes that remain in this document are intended as - part of the final specification. + part of the final specification (except for this one). Definitions @@ -40,7 +40,7 @@ document are to be interpreted as described in RFC 2119. The following terms are to be interpreted as described in PEP 426: * "Distributions" -* "Versions" +* "Releases" * "Build tools" * "Index servers" * "Publication tools" @@ -52,9 +52,13 @@ The following terms are to be interpreted as described in PEP 426: Version scheme ============== -Distribution versions are identified by both a public version identifier, -which supports all defined version comparison operations, and a build -label, which supports only strict equality comparisons. +Distributions are identified by a public version identifier which +supports all defined version comparison operations. + +Distributions may also define a source label, which is not used by +automated tools.
Source labels are useful when a project internal +versioning scheme requires translation to create a compliant public +version identifier. The version scheme is used both to describe the distribution version provided by a particular distribution archive, as well as to place @@ -84,7 +88,7 @@ Public version identifiers are separated into up to four segments: * Post-release segment: ``.postN`` * Development release segment: ``.devN`` -Any given version will be a "release", "pre-release", "post-release" or +Any given release will be a "final release", "pre-release", "post-release" or "developmental release" as defined in the following sections. .. note:: @@ -105,28 +109,37 @@ Source labels Source labels are text strings with minimal defined semantics. To ensure source labels can be readily incorporated as part of file names -and URLs, they MUST be comprised of only ASCII alphanumerics, plus signs, -periods and hyphens. +and URLs, and to avoid formatting inconsistencies in hexadecimal hash +representations, they MUST be limited to the following set of permitted +characters: -In addition, source labels MUST be unique within a given distribution. +* Lowercase ASCII letters (``[a-z]``) +* ASCII digits (``[0-9]``) +* underscores (``_``) +* hyphens (``-``) +* periods (``.``) +* plus signs (``+``) -As with distribution names, all comparisons of source labels MUST be case -insensitive. +Source labels MUST start and end with an ASCII letter or digit. + +Source labels MUST be unique within each project and MUST NOT match any +defined version for the project. -Releases --------- +Final releases +-------------- -A version identifier that consists solely of a release segment is termed -a "release". +A version identifier that consists solely of a release segment is +termed a "final release".
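The permitted-character rules for source labels added in this hunk can be checked mechanically. The following is a minimal sketch only: the pattern and the ``is_valid_source_label`` helper are illustrative names that do not appear in the PEP, and the uniqueness requirements (unique within a project, not matching any defined version) are not covered because they need project-wide state.

```python
import re

# Hypothetical pattern derived from the permitted-character list:
# lowercase ASCII letters, digits, '_', '-', '.', '+', with a letter
# or digit required at both ends. The hyphen is placed last in the
# character class so it is read literally rather than as a range.
_SOURCE_LABEL = re.compile(r"[0-9a-z](?:[0-9a-z_.+-]*[0-9a-z])?")


def is_valid_source_label(label):
    """Return True if *label* uses only the permitted characters."""
    return _SOURCE_LABEL.fullmatch(label) is not None
```

Placing the hyphen at the end of the character class matters: a class written as ``[0-9a-z_.-+]`` would be read as containing a reversed range from ``.`` to ``+`` and is rejected by most regular expression engines.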
-The release segment consists of one or more non-negative integer values, -separated by dots:: +The release segment consists of one or more non-negative integer +values, separated by dots:: N[.N]+ -Releases within a project will typically be numbered in a consistently -increasing fashion. +Final releases within a project MUST be numbered in a consistently +increasing fashion, otherwise automated tools will not be able to upgrade +them correctly. Comparison and ordering of release segments considers the numeric value of each component of the release segment in turn. When comparing release @@ -157,8 +170,8 @@ For example:: 2.0 2.0.1 -A release series is any set of release numbers that start with a common -prefix. For example, ``3.3.1``, ``3.3.5`` and ``3.3.9.45`` are all +A release series is any set of final release numbers that start with a +common prefix. For example, ``3.3.1``, ``3.3.5`` and ``3.3.9.45`` are all part of the ``3.3`` release series. .. note:: @@ -206,8 +219,8 @@ of both ``c`` and ``rc`` releases for a common release segment. Post-releases ------------- -Some projects use post-releases to address minor errors in a release that -do not affect the distributed software (for example, correcting an error +Some projects use post-releases to address minor errors in a final release +that do not affect the distributed software (for example, correcting an error in the release notes). If used as part of a project's development cycle, these post-releases are @@ -371,7 +384,7 @@ are permitted and MUST be ordered as shown:: .devN, aN, bN, cN, rcN, , .postN Note that `rc` will always sort after `c` (regardless of the numeric -component) although they are semantically equivalent. Tools are free to +component) although they are semantically equivalent. Tools MAY reject this case as ambiguous and remain in compliance with the PEP. 
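The suffix ordering quoted above (``.devN``, ``aN``, ``bN``, ``cN``, ``rcN``, no suffix, ``.postN``) can be illustrated with a sort key. This is a simplified sketch under stated assumptions, not the PEP's full comparison algorithm: it accepts at most one suffix per identifier, it treats trailing zeros in the release segment as insignificant so that ``2.0`` and ``2.0.0`` compare equal, and ``sort_key`` is a hypothetical helper name.

```python
import re

# Rank of each suffix within a single release segment, in the order
# the text requires. The empty string stands for "no suffix".
_SUFFIX_RANK = {".dev": 0, "a": 1, "b": 2, "c": 3, "rc": 4, "": 5, ".post": 6}

# Simplified grammar: a dotted release segment plus one optional suffix.
_VERSION = re.compile(r"(\d+(?:\.\d+)*)(?:(\.dev|a|b|c|rc|\.post)(\d+))?")


def sort_key(version):
    """Return a tuple that orders simple version identifiers."""
    match = _VERSION.fullmatch(version)
    if match is None:
        raise ValueError("unsupported version identifier: %r" % version)
    release, suffix, number = match.groups()
    parts = [int(part) for part in release.split(".")]
    # Assumption: drop trailing zeros so "2.0" and "2.0.0" are equal.
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return (tuple(parts), _SUFFIX_RANK[suffix or ""], int(number or 0))
```

Because the suffix rank is compared before the numeric component, ``1.0rc1`` sorts after ``1.0c5``, which matches the note that ``rc`` always sorts after ``c`` regardless of the number.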
Within an alpha (``1.0a1``), beta (``1.0b1``), or release candidate @@ -506,6 +519,22 @@ numbering based on API compatibility, as well as triggering more appropriate version comparison semantics. +Olson database versioning +~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``pytz`` project inherits its versioning scheme from the corresponding +Olson timezone database versioning scheme: the year followed by a lowercase +character indicating the version of the database within that year. + +This can be translated to a compliant 3-part version identifier as +``0..``, where the serial starts at zero (for the 'a' +release) and is incremented with each subsequent database update within the +year. + +As with other translated version identifiers, the corresponding Olson +database version would be recorded in the source label field. + + Version specifiers ================== @@ -521,7 +550,6 @@ clause: * ``~=``: `Compatible release`_ clause * ``==``: `Version matching`_ clause * ``!=``: `Version exclusion`_ clause -* ``is``: `Source reference`_ clause * ``<=``, ``>=``: `Inclusive ordered comparison`_ clause * ``<``, ``>``: `Exclusive ordered comparison`_ clause @@ -605,6 +633,11 @@ version. The *only* substitution performed is the zero padding of the release segment to ensure the release segments are compared with the same length. +Whether or not strict version matching is appropriate depends on the specific +use case for the version specifier. Automated tools SHOULD at least issue +warnings and MAY reject them entirely when strict version matches are used +inappropriately. + Prefix matching may be requested instead of strict comparison, by appending a trailing ``.*`` to the version identifier in the version matching clause. 
This means that additional trailing segments will be ignored when @@ -645,75 +678,6 @@ match or not as shown:: != 1.1.* # Same prefix, so 1.1.post1 does not match clause -Source reference ----------------- - -A source reference includes the source reference operator ``is`` and -a source label or a source URL. - -Installation tools MAY also permit direct references to a platform -appropriate binary archive in a source reference clause. - -Publication tools and public index servers SHOULD NOT permit direct -references to a platform appropriate binary archive in a source -reference clause. - -Source label matching works solely on strict equality comparisons: the -candidate source label must be exactly the same as the source label in the -version clause for the clause to match the candidate distribution. - -For example, a source reference could be used to depend directly on a -version control hash based identifier rather than the translated public -version:: - - exact-dependency (is 1.3.7+build.11.e0f985a) - -A source URL is distinguished from a source label by the presence of -``:`` and ``/`` characters in the source reference. As these characters -are not permitted in source labels, they indicate that the reference uses -a source URL. - -Some appropriate targets for a source URL are a source tarball, an sdist -archive or a direct reference to a tag or specific commit in an online -version control system. The exact URLs and -targets supported will be installation tool specific. - -For example, a local source archive may be referenced directly:: - - pip (is file:///localbuilds/pip-1.3.1.zip) - -All source URL references SHOULD either specify a local file URL, a secure -transport mechanism (such as ``https``) or else include an expected hash -value in the URL for verification purposes. 
If an insecure network -transport is specified without any hash information (or with hash -information that the tool doesn't understand), automated tools SHOULD -at least emit a warning and MAY refuse to rely on the URL. - -It is RECOMMENDED that only hashes which are unconditionally provided by -the latest version of the standard library's ``hashlib`` module be used -for source archive hashes. At time of writing, that list consists of -``'md5'``, ``'sha1'``, ``'sha224'``, ``'sha256'``, ``'sha384'``, and -``'sha512'``. - -For source archive references, an expected hash value may be -specified by including a ``=`` as part of -the URL fragment. - -For version control references, the ``VCS+protocol`` scheme SHOULD be -used to identify both the version control system and the secure transport. - -To support version control systems that do not support including commit or -tag references directly in the URL, that information may be appended to the -end of the URL using the ``@`` notation. - -The use of ``is`` when defining dependencies for published distributions -is strongly discouraged as it greatly complicates the deployment of -security fixes. The source label matching operator is intended primarily -for use when defining dependencies for repeatable *deployments of -applications* while using a shared distribution index, as well as to -reference dependencies which are not published through an index server. - - Inclusive ordered comparison ---------------------------- @@ -752,62 +716,108 @@ Handling of pre-releases ------------------------ Pre-releases of any kind, including developmental releases, are implicitly -excluded from all version specifiers, *unless* a pre-release or developmental -release is explicitly mentioned in one of the clauses. 
For example, these -specifiers implicitly exclude all pre-releases and development -releases of later versions:: - - 2.2 - >= 1.0 - -While these specifiers would include at least some of them:: - - 2.2.dev0 - 2.2, != 2.3b2 - >= 1.0a1 - >= 1.0c1 - >= 1.0, != 1.0b2 - >= 1.0, < 2.0.dev123 +excluded from all version specifiers, *unless* they are already present +on the system, explicitly requested by the user, or if the only available +version that satisfies the version specifier is a pre-release. By default, dependency resolution tools SHOULD: * accept already installed pre-releases for all version specifiers -* accept remotely available pre-releases for version specifiers which - include at least one version clauses that references a pre-release +* accept remotely available pre-releases for version specifiers where + there is no final or post release that satisfies the version specifier * exclude all other pre-releases from consideration +Dependency resolution tools MAY issue a warning if a pre-release is needed +to satisfy a version specifier. + Dependency resolution tools SHOULD also allow users to request the following alternative behaviours: * accepting pre-releases for all version specifiers * excluding pre-releases for all version specifiers (reporting an error or - warning if a pre-release is already installed locally) + warning if a pre-release is already installed locally, or if a + pre-release is the only way to satisfy a particular specifier) Dependency resolution tools MAY also allow the above behaviour to be controlled on a per-distribution basis. -Post-releases and purely numeric releases receive no special treatment in -version specifiers - they are always included unless explicitly excluded. +Post-releases and final releases receive no special treatment in version +specifiers - they are always included unless explicitly excluded. Examples -------- -* ``3.1``: version 3.1 or later, but not - version 4.0 or later. 
Excludes pre-releases and developmental releases. -* ``3.1.2``: version 3.1.2 or later, but not - version 3.2.0 or later. Excludes pre-releases and developmental releases. -* ``3.1a1``: version 3.1a1 or later, but not - version 4.0 or later. Allows pre-releases like 3.2a4 and developmental - releases like 3.2.dev1. +* ``3.1``: version 3.1 or later, but not version 4.0 or later. +* ``3.1.2``: version 3.1.2 or later, but not version 3.2.0 or later. +* ``3.1a1``: version 3.1a1 or later, but not version 4.0 or later. * ``== 3.1``: specifically version 3.1 (or 3.1.0), excludes all pre-releases, post releases, developmental releases and any 3.1.x maintenance releases. -* ``== 3.1.*``: any version that starts with 3.1, excluding pre-releases and - developmental releases. Equivalent to the ``3.1.0`` compatible release - clause. +* ``== 3.1.*``: any version that starts with 3.1. Equivalent to the + ``3.1.0`` compatible release clause. * ``3.1.0, != 3.1.3``: version 3.1.0 or later, but not version 3.1.3 and - not version 3.2.0 or later. Excludes pre-releases and developmental - releases. + not version 3.2.0 or later. + + +Direct references +================= + +Some automated tools may permit the use of a direct reference as an +alternative to a normal version specifier. A direct reference consists of +the word ``from`` and an explicit URL. + +Whether or not direct references are appropriate depends on the specific +use case for the version specifier. Automated tools SHOULD at least issue +warnings and MAY reject them entirely when direct references are used +inappropriately. + +Public index servers SHOULD NOT allow the use of direct references in +uploaded distributions. Direct references are intended as a tool for +software integrators rather than publishers. + +Depending on the use case, some appropriate targets for a direct URL +reference may be a valid ``source_url`` entry (see PEP 426), an sdist, or +a wheel binary archive. 
The exact URLs and targets supported will be tool +dependent. + +For example, a local source archive may be referenced directly:: + + pip (from file:///localbuilds/pip-1.3.1.zip) + +Alternatively, a prebuilt archive may also be referenced:: + + pip (from file:///localbuilds/pip-1.3.1-py33-none-any.whl) + +All direct references that do not refer to a local file URL SHOULD +specify a secure transport mechanism (such as ``https``), include an +expected hash value in the URL for verification purposes, or both. If an +insecure transport is specified without any hash information, with hash +information that the tool doesn't understand, or with a selected hash +algorithm that the tool considers too weak to trust, automated tools +SHOULD at least emit a warning and MAY refuse to rely on the URL. + +It is RECOMMENDED that only hashes which are unconditionally provided by +the latest version of the standard library's ``hashlib`` module be used +for source archive hashes. At time of writing, that list consists of +``'md5'``, ``'sha1'``, ``'sha224'``, ``'sha256'``, ``'sha384'``, and +``'sha512'``. + +For source archive and wheel references, an expected hash value may be +specified by including a ``=`` entry as +part of the URL fragment. + +For version control references, the ``VCS+protocol`` scheme SHOULD be +used to identify both the version control system and the secure transport. + +To support version control systems that do not support including commit or +tag references directly in the URL, that information may be appended to the +end of the URL using the ``@`` notation. 
+ +Remote URL examples:: + + pip (from https://github.com/pypa/pip/archive/1.3.1.zip) + pip (from http://github.com/pypa/pip/archive/1.3.1.zip#sha1=da9234ee9982d4bbb3c72346a6de940a148ea686) + pip (from git+https://github.com/pypa/pip.git@1.3.1) Updating the versioning specification @@ -825,28 +835,30 @@ Summary of differences from \PEP 386 * Moved the description of version specifiers into the versioning PEP -* added the "source label" concept to better handle projects that wish to +* Added the "source label" concept to better handle projects that wish to use a non-compliant versioning scheme internally, especially those based on DVCS hashes - -* added the "compatible release" clause -* added the "source reference" clause +* Added the "direct reference" concept as a standard notation for direct + references to resources (rather than each tool needing to invent its own) -* added the trailing wildcard syntax for prefix based version matching +* Added the "compatible release" clause + +* Added the trailing wildcard syntax for prefix based version matching and exclusion -* changed the top level sort position of the ``.devN`` suffix +* Changed the top level sort position of the ``.devN`` suffix -* allowed single value version numbers +* Allowed single value version numbers -* explicit exclusion of leading or trailing whitespace +* Explicit exclusion of leading or trailing whitespace -* explicit criterion for the exclusion of date based versions +* Explicit criterion for the exclusion of date based versions -* implicitly exclude pre-releases unless explicitly requested +* Implicitly exclude pre-releases unless they're already present or + needed to satisfy a dependency -* treat post releases the same way as unqualified releases +* Treat post releases the same way as unqualified releases * Discuss ordering and dependencies across metadata versions @@ -995,11 +1007,12 @@ The previous interpretation also excluded post-releases from some version specifiers for no adequately
justified reason. The updated interpretation is intended to make it difficult to accidentally -accept a pre-release version as satisfying a dependency, while allowing -pre-release versions to be explicitly requested when needed. +accept a pre-release version as satisfying a dependency, while still +allowing pre-release versions to be retrieved automatically when that's the +only way to satisfy a dependency. The "some forward compatibility assumed" default version constraint is -taken directly from the Ruby community's "pessimistic version constraint" +derived from the Ruby community's "pessimistic version constraint" operator [2]_ to allow projects to take a cautious approach to forward compatibility promises, while still easily setting a minimum required version for their dependencies. It is made the default behaviour rather @@ -1022,16 +1035,26 @@ improved tools for dynamic path manipulation. The trailing wildcard syntax to request prefix based version matching was added to make it possible to sensibly define both compatible release clauses -and the desired pre-release handling semantics for ``<`` and ``>`` ordered -comparison clauses. +and the desired pre- and post-release handling semantics for ``<`` and ``>`` +ordered comparison clauses. -Source references are added for two purposes. In conjunction with source -labels, they allow hash based references to exact versions that aren't -compliant with the fully ordered public version scheme, such as those -generated from version control. In combination with source URLs, they -also allow the new metadata standard to natively support an existing -feature of ``pip``, which allows arbitrary URLs like -``file:///localbuilds/exampledist-1.0-py33-none-any.whl``. + +Adding direct references +------------------------ + +Direct references are added as an "escape clause" to handle messy real +world situations that don't map neatly to the standard distribution model. 
+This includes dependencies on unpublished software for internal use, as well +as handling the more complex compatibility issues that may arise when +wrapping third party libraries as C extensions (this is of especial concern +to the scientific community). + +Index servers are deliberately given a lot of freedom to disallow direct +references, since they're intended primarily as a tool for integrators +rather than publishers. PyPI in particular is currently going through the +process of *eliminating* dependencies on external references, as unreliable +external services have the effect of slowing down installation operations, +as well as reducing PyPI's own apparent reliability. References diff --git a/pep-0442.txt b/pep-0442.txt index c6d7f66ff..e746cb8ca 100644 --- a/pep-0442.txt +++ b/pep-0442.txt @@ -4,13 +4,13 @@ Version: $Revision$ Last-Modified: $Date$ Author: Antoine Pitrou BDFL-Delegate: Benjamin Peterson -Status: Draft +Status: Accepted Type: Standards Track Content-Type: text/x-rst Created: 2013-05-18 Python-Version: 3.4 Post-History: 2013-05-18 -Resolution: TBD +Resolution: http://mail.python.org/pipermail/python-dev/2013-June/126746.html Abstract @@ -201,8 +201,7 @@ Predictability -------------- Following this scheme, an object's finalizer is always called exactly -once. The only exception is if an object is resurrected: the finalizer -will be called again when the object becomes unreachable again. +once, even if it was resurrected afterwards. For CI objects, the order in which finalizers are called (step 2 above) is undefined. diff --git a/pep-0443.txt b/pep-0443.txt index 909fe0e58..88362cb35 100644 --- a/pep-0443.txt +++ b/pep-0443.txt @@ -4,7 +4,7 @@ Version: $Revision$ Last-Modified: $Date$ Author: Ɓukasz Langa Discussions-To: Python-Dev -Status: Accepted +Status: Final Type: Standards Track Content-Type: text/x-rst Created: 22-May-2013 @@ -193,48 +193,37 @@ handling of old-style classes and Zope's ExtensionClasses. 
More importantly, it introduces support for Abstract Base Classes (ABC). When a generic function implementation is registered for an ABC, the -dispatch algorithm switches to a mode of MRO calculation for the -provided argument which includes the relevant ABCs. The algorithm is as -follows:: +dispatch algorithm switches to an extended form of C3 linearization, +which includes the relevant ABCs in the MRO of the provided argument. +The algorithm inserts ABCs where their functionality is introduced, i.e. +``issubclass(cls, abc)`` returns ``True`` for the class itself but +returns ``False`` for all its direct base classes. Implicit ABCs for +a given class (either registered or inferred from the presence of +a special method like ``__len__()``) are inserted directly after the +last ABC explicitly listed in the MRO of said class. - def _compose_mro(cls, haystack): - """Calculates the MRO for a given class `cls`, including relevant - abstract base classes from `haystack`.""" - bases = set(cls.__mro__) - mro = list(cls.__mro__) - for regcls in haystack: - if regcls in bases or not issubclass(cls, regcls): - continue # either present in the __mro__ or unrelated - for index, base in enumerate(mro): - if not issubclass(base, regcls): - break - if base in bases and not issubclass(regcls, base): - # Conflict resolution: put classes present in __mro__ - # and their subclasses first. - index += 1 - mro.insert(index, regcls) - return mro - -In its most basic form, it returns the MRO for the given type:: +In its most basic form, this linearization returns the MRO for the given +type:: >>> _compose_mro(dict, []) [, ] -When the haystack consists of ABCs that the specified type is a subclass -of, they are inserted in a predictable order:: +When the second argument contains ABCs that the specified type is +a subclass of, they are inserted in a predictable order:: >>> _compose_mro(dict, [Sized, MutableMapping, str, ... 
Sequence, Iterable]) [, , - , , + , , + , , ] While this mode of operation is significantly slower, all dispatch decisions are cached. The cache is invalidated on registering new implementations on the generic function or when user code calls -``register()`` on an ABC to register a new virtual subclass. In the -latter case, it is possible to create a situation with ambiguous -dispatch, for instance:: +``register()`` on an ABC to implicitly subclass it. In the latter case, +it is possible to create a situation with ambiguous dispatch, for +instance:: >>> from collections import Iterable, Container >>> class P: @@ -261,20 +250,38 @@ guess:: RuntimeError: Ambiguous dispatch: or -Note that this exception would not be raised if ``Iterable`` and -``Container`` had been provided as base classes during class definition. -In this case dispatch happens in the MRO order:: +Note that this exception would not be raised if one or more ABCs had +been provided explicitly as base classes during class definition. In +this case dispatch happens in the MRO order:: >>> class Ten(Iterable, Container): ... def __iter__(self): ... for i in range(10): ... yield i ... def __contains__(self, value): - ... return value in range(10) + ... return value in range(10) ... >>> g(Ten()) 'iterable' +A similar conflict arises when subclassing an ABC is inferred from the +presence of a special method like ``__len__()`` or ``__contains__()``:: + + >>> class Q: + ... def __contains__(self, value): + ... return False + ... + >>> issubclass(Q, Container) + True + >>> Iterable.register(Q) + >>> g(Q()) + Traceback (most recent call last): + ... + RuntimeError: Ambiguous dispatch: + or + +An early version of the PEP contained a custom approach that was simpler +but created a number of edge cases with surprising results [#why-c3]_. Usage Patterns ============== @@ -378,6 +385,8 @@ References a particular annotation style". (http://www.python.org/dev/peps/pep-0008) +.. 
[#why-c3] http://bugs.python.org/issue18244 + .. [#pep-3124] http://www.python.org/dev/peps/pep-3124/ .. [#peak-rules] http://peak.telecommunity.com/DevCenter/PEAK_2dRules diff --git a/pep-0445.txt b/pep-0445.txt new file mode 100644 index 000000000..0b2821500 --- /dev/null +++ b/pep-0445.txt @@ -0,0 +1,773 @@ +PEP: 445 +Title: Add new APIs to customize Python memory allocators +Version: $Revision$ +Last-Modified: $Date$ +Author: Victor Stinner +BDFL-Delegate: Antoine Pitrou +Status: Accepted +Type: Standards Track +Content-Type: text/x-rst +Created: 15-june-2013 +Python-Version: 3.4 +Resolution: http://mail.python.org/pipermail/python-dev/2013-July/127222.html + +Abstract +======== + +This PEP proposes new Application Programming Interfaces (API) to customize +Python memory allocators. The only implementation required to conform to +this PEP is CPython, but other implementations may choose to be compatible, +or to re-use a similar scheme. + + +Rationale +========= + +Use cases: + +* Applications embedding Python which want to isolate Python memory from + the memory of the application, or want to use a different memory + allocator optimized for its Python usage +* Python running on embedded devices with low memory and slow CPU. + A custom memory allocator can be used for efficiency and/or to get + access to all the memory of the device.
+* Debug tools for memory allocators: + + - track the memory usage (find memory leaks) + - get the location of a memory allocation: Python filename and line + number, and the size of a memory block + - detect buffer underflow, buffer overflow and misuse of Python + allocator APIs (see `Redesign Debug Checks on Memory Block + Allocators as Hooks`_) + - force memory allocations to fail to test handling of the + ``MemoryError`` exception + + +Proposal +======== + +New Functions and Structures +---------------------------- + +* Add a new GIL-free (no need to hold the GIL) memory allocator: + + - ``void* PyMem_RawMalloc(size_t size)`` + - ``void* PyMem_RawRealloc(void *ptr, size_t new_size)`` + - ``void PyMem_RawFree(void *ptr)`` + - The newly allocated memory will not have been initialized in any + way. + - Requesting zero bytes returns a distinct non-*NULL* pointer if + possible, as if ``PyMem_Malloc(1)`` had been called instead. + +* Add a new ``PyMemAllocator`` structure:: + + typedef struct { + /* user context passed as the first argument to the 3 functions */ + void *ctx; + + /* allocate a memory block */ + void* (*malloc) (void *ctx, size_t size); + + /* allocate or resize a memory block */ + void* (*realloc) (void *ctx, void *ptr, size_t new_size); + + /* release a memory block */ + void (*free) (void *ctx, void *ptr); + } PyMemAllocator; + +* Add a new ``PyMemAllocatorDomain`` enum to choose the Python + allocator domain. 
Domains: + + - ``PYMEM_DOMAIN_RAW``: ``PyMem_RawMalloc()``, ``PyMem_RawRealloc()`` + and ``PyMem_RawFree()`` + + - ``PYMEM_DOMAIN_MEM``: ``PyMem_Malloc()``, ``PyMem_Realloc()`` and + ``PyMem_Free()`` + + - ``PYMEM_DOMAIN_OBJ``: ``PyObject_Malloc()``, ``PyObject_Realloc()`` + and ``PyObject_Free()`` + +* Add new functions to get and set memory block allocators: + + - ``void PyMem_GetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)`` + - ``void PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)`` + - The new allocator must return a distinct non-*NULL* pointer when + requesting zero bytes + - For the ``PYMEM_DOMAIN_RAW`` domain, the allocator must be + thread-safe: the GIL is not held when the allocator is called. + +* Add a new ``PyObjectArenaAllocator`` structure:: + + typedef struct { + /* user context passed as the first argument to the 2 functions */ + void *ctx; + + /* allocate an arena */ + void* (*alloc) (void *ctx, size_t size); + + /* release an arena */ + void (*free) (void *ctx, void *ptr, size_t size); + } PyObjectArenaAllocator; + +* Add new functions to get and set the arena allocator used by + *pymalloc*: + + - ``void PyObject_GetArenaAllocator(PyObjectArenaAllocator *allocator)`` + - ``void PyObject_SetArenaAllocator(PyObjectArenaAllocator *allocator)`` + +* Add a new function to reinstall the debug checks on memory allocators when + a memory allocator is replaced with ``PyMem_SetAllocator()``: + + - ``void PyMem_SetupDebugHooks(void)`` + - Install the debug hooks on all memory block allocators. The function can be + called more than once; hooks are only installed once. + - The function does nothing if Python is not compiled in debug mode. + +* Memory block allocators always return *NULL* if *size* is greater than + ``PY_SSIZE_T_MAX``. The check is done before calling the inner + function. + +.. note:: + The *pymalloc* allocator is optimized for objects smaller than 512 bytes + with a short lifetime.
It uses memory mappings with a fixed size of 256 + KB called "arenas". + +Here is how the allocators are set up by default: + +* ``PYMEM_DOMAIN_RAW``, ``PYMEM_DOMAIN_MEM``: ``malloc()``, + ``realloc()`` and ``free()``; call ``malloc(1)`` when requesting zero + bytes +* ``PYMEM_DOMAIN_OBJ``: *pymalloc* allocator which falls back on + ``PyMem_Malloc()`` for allocations larger than 512 bytes +* *pymalloc* arena allocator: ``VirtualAlloc()`` and ``VirtualFree()`` on + Windows, ``mmap()`` and ``munmap()`` when available, or ``malloc()`` + and ``free()`` + + +Redesign Debug Checks on Memory Block Allocators as Hooks +--------------------------------------------------------- + +Since Python 2.3, Python has implemented various checks on memory +allocators in debug mode: + +* Newly allocated memory is filled with the byte ``0xCB``, freed memory + is filled with the byte ``0xDB``. +* Detect API violations, e.g. ``PyObject_Free()`` called on a memory + block allocated by ``PyMem_Malloc()`` +* Detect write before the start of the buffer (buffer underflow) +* Detect write after the end of the buffer (buffer overflow) + +In Python 3.3, the checks are installed by replacing ``PyMem_Malloc()``, +``PyMem_Realloc()``, ``PyMem_Free()``, ``PyObject_Malloc()``, +``PyObject_Realloc()`` and ``PyObject_Free()`` using macros. The new +allocator allocates a larger buffer and writes a pattern to detect buffer +underflow, buffer overflow and use after free (by filling the buffer with +the byte ``0xDB``). It uses the original ``PyObject_Malloc()`` +function to allocate memory. So ``PyMem_Malloc()`` and +``PyMem_Realloc()`` indirectly call ``PyObject_Malloc()`` and +``PyObject_Realloc()``. + +This PEP redesigns the debug checks as hooks on the existing allocators +in debug mode.
Examples of call traces without the hooks: + +* ``PyMem_RawMalloc()`` => ``_PyMem_RawMalloc()`` => ``malloc()`` +* ``PyMem_Realloc()`` => ``_PyMem_RawRealloc()`` => ``realloc()`` +* ``PyObject_Free()`` => ``_PyObject_Free()`` + +Call traces when the hooks are installed (debug mode): + +* ``PyMem_RawMalloc()`` => ``_PyMem_DebugMalloc()`` + => ``_PyMem_RawMalloc()`` => ``malloc()`` +* ``PyMem_Realloc()`` => ``_PyMem_DebugRealloc()`` + => ``_PyMem_RawRealloc()`` => ``realloc()`` +* ``PyObject_Free()`` => ``_PyMem_DebugFree()`` + => ``_PyObject_Free()`` + +As a result, ``PyMem_Malloc()`` and ``PyMem_Realloc()`` now call +``malloc()`` and ``realloc()`` in both release mode and debug mode, +instead of calling ``PyObject_Malloc()`` and ``PyObject_Realloc()`` in +debug mode. + +When at least one memory allocator is replaced with +``PyMem_SetAllocator()``, the ``PyMem_SetupDebugHooks()`` function must +be called to reinstall the debug hooks on top of the new allocator. + + +Don't call malloc() directly anymore +------------------------------------ + +``PyObject_Malloc()`` falls back on ``PyMem_Malloc()`` instead of +``malloc()`` if size is greater than or equal to 512 bytes, and +``PyObject_Realloc()`` falls back on ``PyMem_Realloc()`` instead of +``realloc()``. + +Direct calls to ``malloc()`` are replaced with ``PyMem_Malloc()``, or +``PyMem_RawMalloc()`` if the GIL is not held. + +External libraries like zlib or OpenSSL can be configured to allocate memory +using ``PyMem_Malloc()`` or ``PyMem_RawMalloc()``. If the allocator of a +library can only be replaced globally (rather than on an object-by-object +basis), it shouldn't be replaced when Python is embedded in an application. + +For the "track memory usage" use case, it is important to track memory +allocated in external libraries to have accurate reports, because these +allocations can be large (e.g. they can raise a ``MemoryError`` exception) +and would otherwise be missed in memory usage reports. 
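As a rough illustration of the "track memory usage" use case, the sketch below wraps an allocator in a hook that counts the bytes currently allocated. It is not part of the PEP: ``Allocator`` is a local stand-in with the same fields as ``PyMemAllocator`` so that the code builds without ``Python.h``, and ``TrackHook``, ``BlockHeader`` and the ``track_*`` and ``libc_*`` functions are hypothetical names. Because the ``free`` callback only receives a pointer, the hook stores each block's size in a small header placed in front of the user data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in with the same fields as the PEP's PyMemAllocator (assumption:
   used here only so the sketch compiles without Python.h). */
typedef struct {
    void *ctx;
    void* (*malloc) (void *ctx, size_t size);
    void* (*realloc) (void *ctx, void *ptr, size_t new_size);
    void (*free) (void *ctx, void *ptr);
} Allocator;

/* Hypothetical hook state: the wrapped allocator plus a byte counter. */
typedef struct {
    Allocator inner;
    size_t used;
} TrackHook;

/* Header stored before each block; the union keeps user data aligned. */
typedef union {
    size_t size;
    max_align_t align;
} BlockHeader;

static void* track_malloc(void *ctx, size_t size)
{
    TrackHook *hook = (TrackHook *)ctx;
    BlockHeader *hdr;

    if (size > SIZE_MAX - sizeof(BlockHeader))
        return NULL;                      /* header would overflow size_t */
    hdr = hook->inner.malloc(hook->inner.ctx, sizeof(BlockHeader) + size);
    if (hdr == NULL)
        return NULL;
    hdr->size = size;
    hook->used += size;
    return hdr + 1;                       /* user data starts after header */
}

static void* track_realloc(void *ctx, void *ptr, size_t new_size)
{
    TrackHook *hook = (TrackHook *)ctx;
    BlockHeader *hdr;
    size_t old_size;

    if (ptr == NULL)
        return track_malloc(ctx, new_size);
    if (new_size > SIZE_MAX - sizeof(BlockHeader))
        return NULL;
    hdr = (BlockHeader *)ptr - 1;
    old_size = hdr->size;
    hdr = hook->inner.realloc(hook->inner.ctx, hdr,
                              sizeof(BlockHeader) + new_size);
    if (hdr == NULL)
        return NULL;                      /* old block is left untouched */
    hdr->size = new_size;
    hook->used = hook->used - old_size + new_size;
    return hdr + 1;
}

static void track_free(void *ctx, void *ptr)
{
    TrackHook *hook = (TrackHook *)ctx;
    BlockHeader *hdr;

    if (ptr == NULL)
        return;
    hdr = (BlockHeader *)ptr - 1;
    assert(hook->used >= hdr->size);      /* sanity check on the counter */
    hook->used -= hdr->size;
    hook->inner.free(hook->inner.ctx, hdr);
}

/* Inner allocator backed by the C library; ctx is unused. */
static void* libc_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

static void* libc_realloc(void *ctx, void *ptr, size_t new_size)
{
    (void)ctx;
    return realloc(ptr, new_size ? new_size : 1);
}

static void libc_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}
```

In an embedding application, the three ``track_*`` callbacks could presumably be installed for each domain with ``PyMem_SetAllocator()``, with ``inner`` filled in by ``PyMem_GetAllocator()``, in the same way as the hooks of use case 3.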
+ + +Examples +======== + +Use case 1: Replace Memory Allocators, keep pymalloc +---------------------------------------------------- + +Dummy example wasting 2 bytes per memory block, +and 10 bytes per *pymalloc* arena:: + + #include <stdlib.h> + + size_t alloc_padding = 2; + size_t arena_padding = 10; + + void* my_malloc(void *ctx, size_t size) + { + size_t padding = *(size_t *)ctx; + return malloc(size + padding); + } + + void* my_realloc(void *ctx, void *ptr, size_t new_size) + { + size_t padding = *(size_t *)ctx; + return realloc(ptr, new_size + padding); + } + + void my_free(void *ctx, void *ptr) + { + free(ptr); + } + + void* my_alloc_arena(void *ctx, size_t size) + { + size_t padding = *(size_t *)ctx; + return malloc(size + padding); + } + + void my_free_arena(void *ctx, void *ptr, size_t size) + { + free(ptr); + } + + void setup_custom_allocator(void) + { + PyMemAllocator alloc; + PyObjectArenaAllocator arena; + + alloc.ctx = &alloc_padding; + alloc.malloc = my_malloc; + alloc.realloc = my_realloc; + alloc.free = my_free; + + PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc); + PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc); + /* leave PYMEM_DOMAIN_OBJ unchanged, use pymalloc */ + + arena.ctx = &arena_padding; + arena.alloc = my_alloc_arena; + arena.free = my_free_arena; + PyObject_SetArenaAllocator(&arena); + + PyMem_SetupDebugHooks(); + } + + +Use case 2: Replace Memory Allocators, override pymalloc +-------------------------------------------------------- + +If you have a dedicated allocator optimized for allocations of objects +smaller than 512 bytes with a short lifetime, pymalloc can be overridden +(replace ``PyObject_Malloc()``).
+ +Dummy example wasting 2 bytes per memory block:: + + #include <stdlib.h> + + size_t padding = 2; + + void* my_malloc(void *ctx, size_t size) + { + size_t padding = *(size_t *)ctx; + return malloc(size + padding); + } + + void* my_realloc(void *ctx, void *ptr, size_t new_size) + { + size_t padding = *(size_t *)ctx; + return realloc(ptr, new_size + padding); + } + + void my_free(void *ctx, void *ptr) + { + free(ptr); + } + + void setup_custom_allocator(void) + { + PyMemAllocator alloc; + alloc.ctx = &padding; + alloc.malloc = my_malloc; + alloc.realloc = my_realloc; + alloc.free = my_free; + + PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc); + PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc); + PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc); + + PyMem_SetupDebugHooks(); + } + +The *pymalloc* arena does not need to be replaced, because it is no longer +used by the new allocator. + + +Use case 3: Setup Hooks On Memory Block Allocators +-------------------------------------------------- + +Example to set up hooks on all memory block allocators:: + + struct { + PyMemAllocator raw; + PyMemAllocator mem; + PyMemAllocator obj; + /* ... */ + } hook; + + static void* hook_malloc(void *ctx, size_t size) + { + PyMemAllocator *alloc = (PyMemAllocator *)ctx; + void *ptr; + /* ... */ + ptr = alloc->malloc(alloc->ctx, size); + /* ... */ + return ptr; + } + + static void* hook_realloc(void *ctx, void *ptr, size_t new_size) + { + PyMemAllocator *alloc = (PyMemAllocator *)ctx; + void *ptr2; + /* ... */ + ptr2 = alloc->realloc(alloc->ctx, ptr, new_size); + /* ... */ + return ptr2; + } + + static void hook_free(void *ctx, void *ptr) + { + PyMemAllocator *alloc = (PyMemAllocator *)ctx; + /* ... */ + alloc->free(alloc->ctx, ptr); + /* ... 
*/ + } + + void setup_hooks(void) + { + PyMemAllocator alloc; + static int installed = 0; + + if (installed) + return; + installed = 1; + + alloc.malloc = hook_malloc; + alloc.realloc = hook_realloc; + alloc.free = hook_free; + PyMem_GetAllocator(PYMEM_DOMAIN_RAW, &hook.raw); + PyMem_GetAllocator(PYMEM_DOMAIN_MEM, &hook.mem); + PyMem_GetAllocator(PYMEM_DOMAIN_OBJ, &hook.obj); + + alloc.ctx = &hook.raw; + PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc); + + alloc.ctx = &hook.mem; + PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc); + + alloc.ctx = &hook.obj; + PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc); + } + +.. note:: + ``PyMem_SetupDebugHooks()`` does not need to be called because + memory allocators are not replaced: the debug checks on memory + block allocators are installed automatically at startup. + + +Performances +============ + +The implementation of this PEP (issue #3329) has no visible overhead on +the Python benchmark suite. + +Results of the `Python benchmarks suite +`_ (-b 2n3): some tests are 1.04x +faster, some tests are 1.04x slower. Results of pybench microbenchmark: +"+0.1%" slower globally (diff between -4.9% and +5.6%). + +The full output of benchmarks is attached to issue #3329. 
+ + +Rejected Alternatives +===================== + +More specific functions to get/set memory allocators +---------------------------------------------------- + +A larger set of C API functions was originally proposed, with one pair +of functions for each allocator domain: + +* ``void PyMem_GetRawAllocator(PyMemAllocator *allocator)`` +* ``void PyMem_GetAllocator(PyMemAllocator *allocator)`` +* ``void PyObject_GetAllocator(PyMemAllocator *allocator)`` +* ``void PyMem_SetRawAllocator(PyMemAllocator *allocator)`` +* ``void PyMem_SetAllocator(PyMemAllocator *allocator)`` +* ``void PyObject_SetAllocator(PyMemAllocator *allocator)`` + +This alternative was rejected because it is not possible to write +generic code with more specific functions: code must be duplicated for +each memory allocator domain. + + +Make PyMem_Malloc() reuse PyMem_RawMalloc() by default +------------------------------------------------------ + +If ``PyMem_Malloc()`` called ``PyMem_RawMalloc()`` by default, +calling ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, alloc)`` would also +patch ``PyMem_Malloc()`` indirectly. + +This alternative was rejected because ``PyMem_SetAllocator()`` would +have a different behaviour depending on the domain. Always having the +same behaviour is less error-prone. + + +Add a new PYDEBUGMALLOC environment variable +-------------------------------------------- + +It was proposed to add a new ``PYDEBUGMALLOC`` environment variable to +enable debug checks on memory block allocators. It would have had the same +effect as calling ``PyMem_SetupDebugHooks()``, without the need +to write any C code. Another advantage would have been the ability to enable debug checks +even in release mode: debug checks would always be compiled in, but only +enabled when the environment variable is present and non-empty. + +This alternative was rejected because a new environment variable would +make Python initialization even more complex. `PEP 432 +`_ tries to simplify the +CPython startup sequence. 
+ + +Use macros to get customizable allocators +----------------------------------------- + +To have no overhead in the default configuration, customizable +allocators would be an optional feature enabled by a configuration +option or by macros. + +This alternative was rejected because the use of macros implies having +to recompile extension modules to use the new allocator and allocator +hooks. Not having to recompile Python or extension modules makes debug +hooks easier to use in practice. + + +Pass the C filename and line number +----------------------------------- + +Define allocator functions as macros using ``__FILE__`` and ``__LINE__`` +to get the C filename and line number of a memory allocation. + +Example of ``PyMem_Malloc`` macro with the modified +``PyMemAllocator`` structure:: + + typedef struct { + /* user context passed as the first argument + to the 3 functions */ + void *ctx; + + /* allocate a memory block */ + void* (*malloc) (void *ctx, const char *filename, int lineno, + size_t size); + + /* allocate or resize a memory block */ + void* (*realloc) (void *ctx, const char *filename, int lineno, + void *ptr, size_t new_size); + + /* release a memory block */ + void (*free) (void *ctx, const char *filename, int lineno, + void *ptr); + } PyMemAllocator; + + void* _PyMem_MallocTrace(const char *filename, int lineno, + size_t size); + + /* the function is still needed for the Python stable ABI */ + void* PyMem_Malloc(size_t size); + + #define PyMem_Malloc(size) \ + _PyMem_MallocTrace(__FILE__, __LINE__, size) + +The GC allocator functions would also have to be patched. For example, +``_PyObject_GC_Malloc()`` is used in many C functions and so objects of +different types would have the same allocation location. + +This alternative was rejected because passing a filename and a line +number to each allocator makes the API more complex: pass 3 new +arguments (ctx, filename, lineno) to each allocator function, instead of +just a context argument (ctx).
Having to also modify GC allocator +functions adds too much complexity for little gain. + + +GIL-free PyMem_Malloc() +----------------------- + +In Python 3.3, when Python is compiled in debug mode, ``PyMem_Malloc()`` +indirectly calls ``PyObject_Malloc()`` which requires the GIL to be +held (it isn't thread-safe). That's why ``PyMem_Malloc()`` must be called +with the GIL held. + +This PEP changes ``PyMem_Malloc()``: it now always calls ``malloc()`` +rather than ``PyObject_Malloc()``. The "GIL must be held" restriction +could therefore be removed from ``PyMem_Malloc()``. + +This alternative was rejected because allowing +``PyMem_Malloc()`` to be called without holding the GIL can break applications +which set up their own allocators or allocator hooks. Holding the GIL is +convenient to develop a custom allocator: no need to care about other +threads. It is also convenient for a debug allocator hook: Python +objects can be safely inspected, and the C API may be used for reporting. + +Moreover, calling ``PyGILState_Ensure()`` in a memory allocator has +unexpected behaviour, especially at Python startup and when creating a +new Python thread state. It is better to free custom allocators of +the responsibility of acquiring the GIL. + + +Don't add PyMem_RawMalloc() +--------------------------- + +Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is +held. Otherwise, keep ``malloc()`` unchanged. + +``PyMem_Malloc()`` is used without the GIL held in some Python +functions. For example, the ``main()`` and ``Py_Main()`` functions of +Python call ``PyMem_Malloc()`` while the GIL does not exist yet. In this +case, ``PyMem_Malloc()`` would be replaced with ``malloc()`` (or +``PyMem_RawMalloc()``). + +This alternative was rejected because ``PyMem_RawMalloc()`` is required +for accurate reports of the memory usage. When a debug hook is used to +track the memory usage, the memory allocated by direct calls to +``malloc()`` cannot be tracked.
``PyMem_RawMalloc()`` can be hooked and +so all the memory allocated by Python can be tracked, including +memory allocated without holding the GIL. + + +Use existing debug tools to analyze memory use +---------------------------------------------- + +There are many existing debug tools to analyze memory use. Some +examples: `Valgrind `_, `Purify +`_, `Clang AddressSanitizer +`_, `failmalloc +`_, etc. + +The problem is to retrieve the Python object related to a memory pointer +to read its type and/or its content. Another issue is to retrieve the +source of the memory allocation: the C backtrace is usually useless +(same reasoning as for macros using ``__FILE__`` and ``__LINE__``, see +`Pass the C filename and line number`_), whereas the Python filename and line +number (or even the Python traceback) are more useful. + +This alternative was rejected because classic tools are unable to +introspect Python internals to collect such information. Being able to +set up a hook on allocators called with the GIL held makes it possible to collect a +lot of useful data from Python internals. + + +Add a msize() function +---------------------- + +Add another function to ``PyMemAllocator`` and +``PyObjectArenaAllocator`` structures:: + + size_t msize(void *ptr); + +This function returns the size of a memory block or a memory mapping. +Return (size_t)-1 if the function is not implemented or if the pointer +is unknown (e.g. NULL pointer). + +On Windows, this function can be implemented using ``_msize()`` and +``VirtualQuery()``. + +The function can be used to implement a hook tracking the memory usage. +The ``free()`` method of an allocator only gets the address of a memory +block, whereas the size of the memory block is required to update the +memory usage. + +The additional ``msize()`` function was rejected because only a few +platforms implement it. For example, Linux with the GNU libc does not +provide a function to get the size of a memory block.
``msize()`` is not +currently used in the Python source code. The function would only be +used to track memory use, and would make the API more complex. A debug hook +can implement the function internally; there is no need to add it to +``PyMemAllocator`` and ``PyObjectArenaAllocator`` structures. + + +No context argument +------------------- + +Simplify the signature of allocator functions by removing the context +argument: + +* ``void* malloc(size_t size)`` +* ``void* realloc(void *ptr, size_t new_size)`` +* ``void free(void *ptr)`` + +It is likely for an allocator hook to be reused for +``PyMem_SetAllocator()`` and ``PyObject_SetAllocator()``, or even +``PyMem_SetRawAllocator()``, but the hook must call a different function +depending on the allocator. The context is a convenient way to reuse the +same custom allocator or hook for different Python allocators. + +In C++, the context can be used to pass *this*. + + +External Libraries +================== + +Examples of APIs used to customize memory allocators. + +Libraries used by Python: + +* OpenSSL: `CRYPTO_set_mem_functions() + `_ + to set memory management functions globally +* expat: `parserCreate() + `_ + has a per-instance memory handler +* zlib: `zlib 1.2.8 Manual `_, + pass an opaque pointer +* bz2: `bzip2 and libbzip2, version 1.0.5 + `_, + pass an opaque pointer +* lzma: `LZMA SDK - How to Use + `_, + pass an opaque pointer +* libmpdec: no opaque pointer (classic malloc API) + +Other libraries: + +* glib: `g_mem_set_vtable() + `_ +* libxml2: + `xmlGcMemSetup() `_, + global +* Oracle's OCI: `Oracle Call Interface Programmer's Guide, + Release 2 (9.2) + `_, + pass an opaque pointer + +The new *ctx* parameter of this PEP was inspired by the API of zlib and +Oracle's OCI libraries. + +See also the `GNU libc: Memory Allocation Hooks +`_ +which uses a different approach to hook memory allocators. + + +Memory Allocators +================= + +The C standard library provides the well-known ``malloc()`` function.
+Its implementation depends on the platform and on the C library. The GNU +C library uses a modified ptmalloc2, based on "Doug Lea's Malloc" +(dlmalloc). FreeBSD uses `jemalloc +`_. Google provides *tcmalloc* which +is part of `gperftools `_. + +``malloc()`` uses two kinds of memory: heap and memory mappings. Memory +mappings are usually used for large allocations (e.g. larger than 256 +KB), whereas the heap is used for small allocations. + +On UNIX, the heap is handled by ``brk()`` and ``sbrk()`` system calls, +and it is contiguous. On Windows, the heap is handled by +``HeapAlloc()`` and can be discontiguous. Memory mappings are handled by +``mmap()`` on UNIX and ``VirtualAlloc()`` on Windows; they can be +discontiguous. + +Releasing a memory mapping immediately gives the memory back to the +system. On UNIX, the heap memory is only given back to the system if the +released block is located at the end of the heap. Otherwise, the memory +will only be given back to the system when all the memory located after +the released memory is also released. + +To allocate memory on the heap, an allocator tries to reuse free space. +If there is no contiguous space big enough, the heap must be enlarged, +even if there is more free space than the required size. This issue is +called "memory fragmentation": the memory usage seen by the system +is higher than the real usage. On Windows, ``HeapAlloc()`` creates +a new memory mapping with ``VirtualAlloc()`` if there is not enough free +contiguous memory. + +CPython has a *pymalloc* allocator for allocations smaller than 512 +bytes. This allocator is optimized for small objects with a short +lifetime. It uses memory mappings called "arenas" with a fixed size of +256 KB. + +Other allocators: + +* Windows provides a `Low-fragmentation Heap + `_. + +* The Linux kernel uses `slab allocation + `_.
+ +* The glib library has a `Memory Slice API + `_: + efficient way to allocate groups of equal-sized chunks of memory + +This PEP makes it possible to choose exactly which memory allocator is used for your +application depending on its usage of the memory (number of allocations, +size of allocations, lifetime of objects, etc.). + + +Links +===== + +CPython issues related to memory allocation: + +* `Issue #3329: Add new APIs to customize memory allocators + `_ +* `Issue #13483: Use VirtualAlloc to allocate memory arenas + `_ +* `Issue #16742: PyOS_Readline drops GIL and calls PyOS_StdioReadline, + which isn't thread safe `_ +* `Issue #18203: Replace calls to malloc() with PyMem_Malloc() or + PyMem_RawMalloc() `_ +* `Issue #18227: Use Python memory allocators in external libraries like + zlib or OpenSSL `_ + +Projects analyzing the memory usage of Python applications: + +* `pytracemalloc + `_ +* `Meliae: Python Memory Usage Analyzer + `_ +* `Guppy-PE: umbrella package combining Heapy and GSL + `_ +* `PySizer (developed for Python 2.4) + `_ + + +Copyright +========= + +This document has been placed into the public domain. + diff --git a/pep-0446.txt b/pep-0446.txt new file mode 100644 index 000000000..c83c7eae4 --- /dev/null +++ b/pep-0446.txt @@ -0,0 +1,242 @@ +PEP: 446 +Title: Add new parameters to configure the inheritance of files and for non-blocking sockets +Version: $Revision$ +Last-Modified: $Date$ +Author: Victor Stinner +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 3-July-2013 +Python-Version: 3.4 + + +Abstract +======== + +This PEP proposes new portable parameters and functions to configure the +inheritance of file descriptors and the non-blocking flag of sockets. + + +Rationale +========= + +Inheritance of file descriptors +------------------------------- + +The inheritance of file descriptors in child processes can be configured +on each file descriptor using a *close-on-exec* flag. By default, the +close-on-exec flag is not set.
+ +On Windows, the close-on-exec flag is the inverse of ``HANDLE_FLAG_INHERIT``. File +descriptors are not inherited if the ``bInheritHandles`` parameter of +the ``CreateProcess()`` function is ``FALSE``, even if the +``HANDLE_FLAG_INHERIT`` flag is set. If ``bInheritHandles`` is ``TRUE``, +only file descriptors with the ``HANDLE_FLAG_INHERIT`` flag set are +inherited; others are not. + +On UNIX, the close-on-exec flag is ``O_CLOEXEC``. File descriptors with +the ``O_CLOEXEC`` flag set are closed at the execution of a new program +(e.g. when calling ``execv()``). + +The ``O_CLOEXEC`` flag has no effect on ``fork()``: all file descriptors +are inherited by the child process. Furthermore, most properties of file +descriptors are shared between the parent and the child processes, +except file attributes, which are duplicated (``O_CLOEXEC`` is the only +file attribute). Setting the ``O_CLOEXEC`` flag of a file descriptor in the +child process does not change the ``O_CLOEXEC`` flag of the file +descriptor in the parent process. + + +Issues of the inheritance of file descriptors +--------------------------------------------- + +Inheritance of file descriptors causes issues. For example, closing a +file descriptor in the parent process does not release the resource +(file, socket, ...), because the file descriptor is still open in the +child process. + +Leaking file descriptors is also a major security vulnerability. An +untrusted child process can read sensitive data like passwords and take +control of the parent process through leaked file descriptors. For +example, it is a known way to escape from a chroot. + + +Non-blocking sockets +-------------------- + +To handle multiple network clients in a single thread, a multiplexing +function like ``select()`` can be used. For best performance, sockets +must be configured as non-blocking. Operations like ``send()`` and +``recv()`` return an ``EAGAIN`` or ``EWOULDBLOCK`` error if the +operation would block.
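The would-block behaviour described above can be sketched in Python. This is only an illustration, not part of the proposal: it assumes a UNIX platform where ``socket.socketpair()`` returns a connected pair, and the helper name ``try_recv`` is hypothetical. Python 3.3 and later report ``EAGAIN`` and ``EWOULDBLOCK`` as ``BlockingIOError``.

```python
import socket


def try_recv(sock, size=4096):
    """Hypothetical helper: return received bytes, or None if recv() would block."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: no data is available right now.
        return None


# A connected pair of sockets stands in for a client/server connection.
reader, writer = socket.socketpair()
reader.setblocking(False)      # costs an extra system call after creation

first = try_recv(reader)       # nothing has been sent yet
writer.sendall(b"ping")
second = try_recv(reader)      # data is now waiting in the receive buffer

reader.close()
writer.close()
```

In a real server the ``None`` result would normally send the caller back to ``select()`` rather than being polled in a loop.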
+ +By default, newly created sockets are blocking. Setting the non-blocking +mode requires additional system calls. + +On UNIX, the blocking flag is ``O_NONBLOCK``: a pipe or a socket is +non-blocking if the ``O_NONBLOCK`` flag is set. + + +Setting flags at the creation of the file descriptor +---------------------------------------------------- + +Windows and recent versions of other operating systems like Linux +support setting the close-on-exec flag directly at the creation of file +descriptors, and close-on-exec and blocking flags at the creation of +sockets. + +Setting these flags at creation is atomic and avoids additional +system calls. + + +Proposal +======== + +New cloexec And blocking Parameters +----------------------------------- + +Add a new optional *cloexec* parameter to functions creating file descriptors: + +* ``io.FileIO`` +* ``io.open()`` +* ``open()`` +* ``os.dup()`` +* ``os.dup2()`` +* ``os.fdopen()`` +* ``os.open()`` +* ``os.openpty()`` +* ``os.pipe()`` +* ``select.devpoll()`` +* ``select.epoll()`` +* ``select.kqueue()`` + +Add new optional *cloexec* and *blocking* parameters to functions +creating sockets: + +* ``asyncore.dispatcher.create_socket()`` +* ``socket.socket()`` +* ``socket.socket.accept()`` +* ``socket.socket.dup()`` +* ``socket.socket.fromfd`` +* ``socket.socketpair()`` + +The default value of *cloexec* is ``False`` and the default value of +*blocking* is ``True``. + +Atomicity is not guaranteed. If the platform does not support +setting close-on-exec and blocking flags at the creation of the file +descriptor or socket, the flags are set using additional system calls.
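As a rough sketch of what "additional system calls" can look like on UNIX, the flags may be changed after creation with ``fcntl()``. The four helpers below are hypothetical local functions written for illustration; they are not the API proposed by this PEP, and the sketch assumes a platform providing the ``fcntl`` module. Note that ``fcntl`` names the descriptor flag ``FD_CLOEXEC``.

```python
import fcntl
import os


def get_cloexec(fd):
    """Return True if the close-on-exec flag is set on fd (illustrative helper)."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def set_cloexec(fd, cloexec=True):
    """Set or clear the close-on-exec flag: one read plus one write call."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def get_blocking(fd):
    """Return True if O_NONBLOCK is not set on fd (illustrative helper)."""
    return not (fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def set_blocking(fd, blocking=True):
    """Set or clear O_NONBLOCK on the open file description behind fd."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    if blocking:
        flags &= ~os.O_NONBLOCK
    else:
        flags |= os.O_NONBLOCK
    fcntl.fcntl(fd, fcntl.F_SETFL, flags)
```

Because the descriptor exists before the flag is changed, another thread may call ``fork()`` and ``exec()`` in between, which is why this fallback is not atomic.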
+ + +New Functions +------------- + +Add new functions to get and set the close-on-exec flag of a file +descriptor, available on all platforms: + +* ``os.get_cloexec(fd: int) -> bool`` +* ``os.set_cloexec(fd: int, cloexec: bool)`` + +Add new functions to get and set the blocking flag of a file +descriptor, only available on UNIX: + +* ``os.get_blocking(fd: int) -> bool`` +* ``os.set_blocking(fd: int, blocking: bool)`` + + +Other Changes +------------- + +The ``subprocess.Popen`` class must clear the close-on-exec flag of file +descriptors of the ``pass_fds`` parameter. The flag is cleared in the +child process before executing the program; this does not change +the flag in the parent process. + +The close-on-exec flag must also be set on private file descriptors and +sockets in the Python standard library. For example, on UNIX, +``os.urandom()`` opens ``/dev/urandom`` to read some random bytes and the +file descriptor is closed at function exit. The file descriptor is not +expected to be inherited by child processes. + + +Rejected Alternatives +===================== + +PEP 433 +------- + +PEP 433, entitled "Easier suppression of file descriptor inheritance", +is a previous attempt proposing various other alternatives, but no +consensus could be reached. + +This PEP has a well-defined behaviour (the default value of the new +*cloexec* parameter is not configurable), is more conservative (no +backward compatibility issue), and is much simpler. + + +Add blocking parameter for file descriptors and use Windows overlapped I/O +-------------------------------------------------------------------------- + +Windows supports non-blocking operations on files using an extension of +the Windows API called "Overlapped I/O". Using this extension requires +modifying the Python standard library and applications to pass an +``OVERLAPPED`` structure and to use an event loop to wait for the completion of +operations.
+ +This PEP only tries to expose portable flags on file descriptors and +sockets. Supporting overlapped I/O requires an abstraction providing a +high-level and portable API for asynchronous operations on files and +sockets. Overlapped I/O is out of the scope of this PEP. + +UNIX supports non-blocking files; moreover, recent versions of operating +systems support setting the non-blocking flag at the creation of a file +descriptor. It would be possible to add a new optional *blocking* +parameter to Python functions creating file descriptors. On Windows, +creating a file descriptor with ``blocking=False`` would raise a +``NotImplementedError``. This behaviour is not acceptable for the ``os`` +module, which is designed as a thin wrapper around the C functions of the +operating system. If a platform does not support a function, the +function should not be available on the platform. For example, +the ``os.fork()`` function is not available on Windows. + +For all these reasons, this alternative was rejected. PEP 3156 +proposes an abstraction for asynchronous I/O supporting non-blocking +files on Windows.
+ + +Links +===== + +Python issues: + +* `#10115: Support accept4() for atomic setting of flags at socket + creation `_ +* `#12105: open() does not able to set flags, such as O_CLOEXEC + `_ +* `#12107: TCP listening sockets created without FD_CLOEXEC flag + `_ +* `#16850: Add "e" mode to open(): close-and-exec + (O_CLOEXEC) / O_NOINHERIT `_ +* `#16860: Use O_CLOEXEC in the tempfile module + `_ +* `#16946: subprocess: _close_open_fd_range_safe() does not set + close-on-exec flag on Linux < 2.6.23 if O_CLOEXEC is defined + `_ +* `#17070: Use the new cloexec to improve security and avoid bugs + `_ + +Other links: + +* `Secure File Descriptor Handling + `_ (Ulrich Drepper, + 2008) +* `Ghosts of Unix past, part 2: Conflated designs + `_ (Neil Brown, 2010) explains the + history of ``O_CLOEXEC`` and ``O_NONBLOCK`` flags + + +Copyright +========= + +This document has been placed into the public domain. +