Add PEP 423 (Naming conventions and recipes related to packaging).

This commit is contained in:
Guido van Rossum 2012-06-26 10:25:12 -07:00
parent a4ef40c921
commit f1db51dc79
1 changed files with 845 additions and 0 deletions

845
pep-0423.txt Normal file
View File

@ -0,0 +1,845 @@
PEP: 423
Title: Naming conventions and recipes related to packaging
Version: $Revision$
Last-Modified: $Date$
Author: Benoit Bryon <benoit@marmelune.net>
Discussion-To: <distutils-sig@python.org>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 24-May-2012
Post-History:
Abstract
========
This document deals with:
* names of Python projects,
* names of Python packages or modules being distributed,
* namespace packages.
It provides guidelines and recipes for distribution authors:
* new projects should follow the `guidelines <#overview>`_ below.
* existing projects should be aware of these guidelines and can follow
`specific recipes for existing projects
<#how-to-apply-naming-guidelines-on-existing-projects>`_.
Terminology
===========
Reference is `packaging terminology in Python documentation`_.
Relationship with other PEPs
============================
* `PEP 8`_ deals with code style guide, including names of Python
packages and modules. It covers syntax of package/modules names.
* `PEP 345`_ deals with packaging metadata, and defines name argument
of the ``packaging.core.setup()`` function.
* `PEP 420`_ deals with namespace packages. It brings support of
namespace packages to Python core. Before, namespaces packages were
implemented by external libraries.
* `PEP 3108`_ deals with transition between Python 2.x and Python 3.x
applied to standard library: some modules to be deleted, some to be
renamed. It points out that naming conventions matter and is an
example of transition plan.
Overview
========
Here is a summarized list of guidelines you should follow to choose
names:
* `understand and respect namespace ownership <#respect-ownership>`_.
* if your project is related to another project or community:
* search for conventions in main project's documentation, because
`projects should organize community contributions
<#organize-community-contributions>`_.
* `follow specific project or related community conventions
<#follow-community-or-related-project-conventions-if-any>`_, if any.
* if there is no convention, `follow a standard naming pattern
<#use-standard-pattern-for-community-contributions>`_.
* make sure your project name is unique, i.e. avoid duplicates:
* `use top-level namespace for ownership
<#top-level-namespace-relates-to-code-ownership>`_,
* `check for name availability
<#how-to-check-for-name-availability>`_,
* `register names with PyPI`_.
* make sure distributed packages and modules names are unique, unless
you explicitely want to distribute alternatives to existing packages
or modules. `Using the same value for package/module name and
project name <#use-a-single-name>`_ is the recommended way to
achieve this.
* `distribute only one package or module at a time
<#multiple-packages-modules-should-be-rare>`_, unless you know what
you are doing. It makes it possible to apply the "`use a single
name`_" rule, and thus make names consistent.
* make it easy to discover and remember your project:
* `use as much memorable names as possible
<#pick-memorable-names>`_,
* `use as much meaningful names as possible
<#pick-meaningful-names>`_,
* `use other packaging metadata <#use-packaging-metadata>`_.
* `avoid deep nesting`_. Flat things are easier to use and remember
than nested ones:
* one or two namespace levels are recommended, because they are
almost always enough.
* even if not recommended, three levels are, de facto, a common
case.
* in most cases, you should not need more than three levels.
* `follow PEP 8
<#follow-pep-8-for-syntax-of-package-and-module-names>`_ for syntax
of package and module names.
* if you followed specific conventions, or if your project is intended
to receive contributions from the community, `organize community
contributions`_.
* `if still in doubt, ask <#if-in-doubt-ask>`_.
If in doubt, ask
================
If you feel unsure after reading this document, ask `Python
community`_ on IRC or on a mailing list.
Top-level namespace relates to code ownership
=============================================
This helps avoid clashes between project names.
Ownership could be:
* an individual.
Example: `gp.fileupload`_ is owned and maintained by Gael
Pasgrimaud.
* an organization.
Examples:
* `zest.releaser`_ is owned and maintained by Zest Software.
* `Django`_ is owned and maintained by the Django Software
Fundation.
* a group or community.
Example: `sphinx`_ is maintained by developers of the Sphinx
project, not only by its author, Georg Brandl.
* a group or community related to another package.
Example: `collective.recaptcha`_ is owned by its author: David
Glick, Groundwire. But the "collective" namespace is owned by Plone
community.
Respect ownership
-----------------
Understand the purpose of namespace before you use it.
Don't plug into a namespace you don't own, unless explicitely
authorized.
`If in doubt, ask`_.
As an example, don't plug in "django.contrib" namespace because it is
managed by Django's core contributors.
Exceptions can be defined by project authors. See `Organize community
contributions`_ below.
Also, this rule applies to non-Python projects.
As an example, don't use "apache" as top-level namespace: "Apache" is
the name of an existing project (in the case of "Apache", it is also a
trademark).
Private (including closed-source) projects use a namespace
----------------------------------------------------------
... because private projects are owned by somebody. So apply the
`ownership rule <#top-level-namespace-relates-to-code-ownership>`_.
For internal/customer projects, use your company name as the
namespace.
This rule applies to closed-source projects.
As an example, if you are creating a "climbing" project for the
"Python Sport" company: use "pythonsport.climbing" name, even if it is
closed source.
Individual projects use a namespace
-----------------------------------
... because they are owned by individuals. So apply the
`ownership rule <#top-level-namespace-relates-to-code-ownership>`_.
There is no shame in releasing a project as open source even if it has
an "internal" or "individual" name.
If the project comes to a point where the author wants to change
ownership (i.e. the project no longer belongs to an individual), keep
in mind `it is easy to rename the project
<#how-to-rename-a-project>`_.
Community-owned projects can avoid namespace packages
-----------------------------------------------------
If your project is generic enough (i.e. it is not a contrib to another
product or framework), you can avoid namespace packages. The base
condition is generally that your project is owned by a group (i.e. the
development team) which is dedicated to this project.
Only use a "shared" namespace if you really intend the code to be
community owned.
As an example, `sphinx`_ project belongs to the Sphinx development
team. There is no need to have some "sphinx" namespace package with
only one "sphinx.sphinx" project inside.
In doubt, use an individual/organization namespace
--------------------------------------------------
If your project is really experimental, best choice is to use an
individual or organization namespace:
* it allows projects to be released early.
* it won't block a name if the project is abandoned.
* it doesn't block future changes. When a project becomes mature and
there is no reason to keep individual ownership, `it remains
possible to rename the project <#how-to-rename-a-project>`_.
Use a single name
=================
Distribute only one package (or only one module) per project, and use
package (or module) name as project name.
* It avoids possible confusion between project name and distributed
package or module name.
* It makes the name consistent.
* It is explicit: when one sees project name, he guesses
package/module name, and vice versa.
* It also limits implicit clashes between package/module names.
By using a single name, when you register a project name to `PyPI`_,
you also perform a basic package/module name availability
verification.
As an example, `pipeline`_, `python-pipeline`_ and
`django-pipeline`_ all distribute a package or module called
"pipeline". So installing two of them leads to errors. This issue
wouldn't have occurred if these distributions used a single name.
Yes:
* Package name: "kheops.pyramid",
i.e. ``import kheops.pyramid``
* Project name: "kheops.pyramid",
i.e. ``pip install kheops.pyramid``
No:
* Package name: "kheops"
* Project name: "KheopsPyramid"
.. note::
For historical reasons, `PyPI`_ contains many distributions where
project and distributed package/module names differ.
Multiple packages/modules should be rare
----------------------------------------
Technically, Python distributions can provide multiple packages and/or
modules. See `setup script reference`_ for details.
Some distributions actually do.
As an example, `setuptools`_ and `distribute`_ are both declaring
"pkg_resources", "easy_install" and "site" modules in addition to
respective "setuptools" and "distribute" packages.
Consider this use case as exceptional. In most cases, you don't need
this feature. So a distribution should provide only one package or
module at a time.
Distinct names should be rare
-----------------------------
A notable exception to the `Use a single name`_ rule is when you
explicitely need distinct names.
As an example, the `Pillow`_ project provides an alternative to the
original `PIL`_ distribution. Both projects distribute a "PIL"
package.
Consider this use case as exceptional. In most cases, you don't need
this feature. So a distributed package name should be equal to project
name.
Follow PEP 8 for syntax of package and module names
===================================================
`PEP 8`_ applies to names of Python packages and modules.
If you `Use a single name`_, `PEP 8`_ also applies to project names.
The exceptions are namespace packages, where dots are required in
project name.
Pick memorable names
====================
One important thing about a project name is that it be memorable.
As an example, `celery`_ is not a meaningful name. At first, it is not
obvious that it deals with message queuing. But it is memorable,
partly because it can be used to feed a `RabbitMQ`_ server.
Pick meaningful names
=====================
Ask yourself "how would I describe in one sentence what this name is
for?", and then "could anyone have guessed that by looking at the
name?".
As an example, `DateUtils`_ is a meaningful name. It is obvious that
it deals with utilities for dates.
When you are using namespaces, try to make each part meaningful.
Use packaging metadata
======================
Consider project names as unique identifiers on PyPI:
* it is important that these identifiers remain human-readable.
* it is even better when these identifiers are meaningful.
* but the primary purpose of identifiers is not to classify or
describe projects.
**Classifiers and keywords metadata are made for categorization of
distributions.** Summary and description metadata are meant to
describe the project.
As an example, there is a "`Framework :: Twisted`_" classifier. Even
if names are quite heterogeneous (they don't follow a particular
pattern), we get the list.
In order to `Organize community contributions`_, conventions about
names and namespaces matter, but conventions about metadata should be
even more important.
As an example, we can find Plone portlets in many places:
* plone.portlet.*
* collective.portlet.*
* collective.portlets.*
* collective.*.portlets
* some vendor-related projects such as "quintagroup.portlet.cumulus"
* and even projects where "portlet" pattern doesn't appear in the
name.
Even if Plone community has conventions, using the name to categorize
distributions is inapropriate. It's impossible to get the full list of
distributions that provide portlets for Plone by filtering on names.
But it would be possible if all these distributions used
"Framework :: Plone" classifier and "portlet" keyword.
Avoid deep nesting
==================
`The Zen of Python`_ says "Flat is better than nested".
Two levels is almost always enough
----------------------------------
Don't define everything in deeply nested hierarchies: you will end up
with projects and packages like "pythonsport.common.maps.forest". This
type of name is both verbose and cumbersome (e.g. if you have many
imports from the package).
Furthermore, big hierarchies tend to break down over time as the
boundaries between different packages blur.
The consensus is that two levels of nesting are preferred.
For example, we have ``plone.principalsource`` instead of
``plone.source.principal`` or something like that. The name is
shorter, the package structure is simpler, and there would be very
little to gain from having three levels of nesting here. It would be
impractical to try to put all "core Plone" sources (a source is kind
of vocabulary) into the ``plone.source.*`` namespace, in part because
some sources are part of other packages, and in part because sources
already exist in other places. Had we made a new namespace, it would
be inconsistently used from the start.
Yes: "pyranha"
Yes: "pythonsport.climbing"
Yes: "pythonsport.forestmap"
No: "pythonsport.maps.forest"
Use only one level for ownership
--------------------------------
Don't use 3 levels to set individual/organization ownership in
a community namespace.
As an example, let's consider:
* you are pluging into a community namespace, such as "collective".
* and you want to add a more restrictive "ownership" level, to avoid
clashes inside the community.
In such a case, **you'd better use the most restrictive ownership
level as first level.**
As an example, where "collective" is a major community namespace that
"gergovie" belongs to, and "vercingetorix" it the name of "gergovie"
author:
No: "collective.vercingetorix.gergovie"
Yes: "vercingetorix.gergovie"
Don't use namespace levels for categorization
---------------------------------------------
`Use packaging metadata`_ instead.
Don't use more than 3 levels
----------------------------
Technically, you can create deeply nested hierarchies. However, in
most cases, you shouldn't need it.
.. note::
Even communities where namespaces are standard don't use more than
3 levels.
Conventions for communities or related projects
===============================================
Follow community or related project conventions, if any
-------------------------------------------------------
Projects or related communities can have specific conventions, which
may differ from those explained in this document.
In such a case, `they should declare specific conventions in
documentation <#organize-community-contributions>`_.
So, if your project belongs to another project or to a community,
first look for specific conventions in main project's documentation.
If there is no specific conventions, follow the ones declared in this
document.
As an example, `Plone community`_ releases community contributions in
the "collective" namespace package. It differs from the `standard
namespace for contributions
<#use-standard-pattern-for-community-contributions>`_ proposed here.
But since it is documented, there is no ambiguity and you should
follow this specific convention.
Use standard pattern for community contributions
------------------------------------------------
When no specific rule is defined, use the
``${MAINPROJECT}contrib.${PROJECT}`` pattern to store community
contributions for any product or framework, where:
* ``${MAINPROJECT}`` is the name of the related project. "pyranha" in
the example below.
* ``${PROJECT}`` is the name of your project. "giantteeth" in the
example below.
As an example:
* you are the author of "pyranha" project. You own the "pyranha"
namespace.
* you didn't defined specific naming conventions for community
contributions.
* a third-party developer wants to publish a "giantteeth" project
related to your "pyranha" project in a community namespace. So he
should publish it as "pyranhacontrib.giantteeth".
It is the simplest way to `Organize community contributions`_.
.. note::
Why ``${MAINPROJECT}contrib.*`` pattern?
* ``${MAINPROJECT}c.*`` is not explicit enough. As examples, "zc"
belongs to "Zope Corporation" whereas "z3c" belongs to "Zope 3
community".
* ``${MAINPROJECT}community`` is too long.
* ``${MAINPROJECT}community`` conflicts with existing namespaces
such as "iccommunity" or "PyCommunity".
* ``${MAINPROJECT}.contrib.*`` is inside ${MAINPROJECT} namespace,
i.e. it is owned by ${MAINPROJECT} authors. It breaks the
`Top-level namespace relates to code ownership`_ rule.
* ``${MAINPROJECT}.contrib.*`` breaks the `Avoid deep nesting`_
rule.
* names where ``${MAINPROJECT}`` doesn't appear are not explicit
enough, i.e. nobody can guess they are related to
``${MAINPROJECT}``. As an example, it is not obvious that
"collective.*" belongs to Plone community.
* ``{$DIST}contrib.*`` looks like existing ``sphinxcontrib-*``
packages. But ``sphinxcontrib-*`` is actually about Sphinx
contrib, so this is not a real conflict... In fact, the "contrib"
suffix was inspired by "sphinxcontrib".
Organize community contributions
--------------------------------
This is the counterpart of the `follow community conventions
<#follow-community-or-related-project-conventions-if-any>`_ and
`standard pattern for contributions
<#use-standard-pattern-for-community-contributions>`_ rules.
Actions:
* Choose a naming convention for community contributions.
* If it is not `the default
<#use-standard-pattern-for-community-contributions>`_, then
document it.
* if you use the `default convention
<#use-standard-pattern-for-community-contributions>`_, then this
document should be enough. Don't reapeat it. You may reference
it.
* else, tell users about custom conventions in project's
"contribute" or "create modules" documentation.
* Also recommend the use of additional metadata, such as
`classifiers and keywords <#use-packaging-metadata>`_.
About convention choices:
* New projects should choose the `default contrib pattern
<#use-standard-pattern-for-community-contributions>`_.
* Existing projects with community contributions should start with
custom conventions. Then they can `Promote migrations`_.
It means that existing community conventions don't have to be
changed. But, at least, they should be explicitely documented.
Example: "pyranha" is your project name and package name.
Tell contributors that:
* pyranha-related distributions should use the "pyranha" keyword
* pyranha-related distributions providing templates should also use
"templates" keyword.
* community contributions should be released under "pyranhacontrib"
namespace (i.e. use "pyranhacontrib.*" pattern).
Register names with PyPI
========================
`PyPI`_ is the central place for distributions in Python community.
So, it is also the place where to register project and package names.
See `Registering with the Package Index`_ for details.
Recipes
=======
The following recipes will help you follow the guidelines and
conventions above.
How to check for name availability?
-----------------------------------
Before you choose a project name, make sure it hasn't already been
registered in the following locations:
* `PyPI`_
* that's all. PyPI is the only official place.
As an example, you could also check in various locations such as
popular code hosting services, but keep in mind that PyPI is the only
place you can **register** for names in Python community.
That's why it is important you `register names with PyPI`_.
Also make sure the names of distributed packages or modules haven't
already been registered:
* in the `Python Standard Library`_.
* inside projects at `PyPI`. There is currently no helper for that.
Notice that the more projects follow the `use a single name`_ rule,
the easier is the verification.
* you may `ask the community <#if-in-doubt-ask>`_.
The `use a single name`_ rule also helps you avoid clashes with
package names: if a project name is available, then the package name
has good chances to be available too.
How to rename a project?
------------------------
Renaming a project is possible, but keep in mind that it will cause
some confusions. So, pay particular attention to README and
documentation, so that users understand what happened.
#. First of all, **do not remove legacy distributions from PyPI**.
Because some users may be using them.
#. Copy the legacy project, then change names (project and
package/module). Pay attention to, at least:
* packaging files,
* folder name that contains source files,
* documentation, including README,
* import statements in code.
#. Assign ``Obsoletes-Dist`` metadata to new distribution in setup.cfg
file. See `PEP 345 about Obsolete-Dist`_ and `setup.cfg
specification`_.
#. Release a new version of the renamed project, then publish it.
#. Edit legacy project:
* add dependency to new project,
* drop everything except packaging stuff,
* add the ``Development Status :: 7 - Inactive`` classifier in
setup script,
* publish a new release.
So, users of the legacy package:
* can continue using the legacy distributions at a deprecated version,
* can upgrade to last version of legacy distribution, which is
empty...
* ... and automatically download new distribution as a dependency of
the legacy one.
Users who discover the legacy project see it is inactive.
Improved handling of renamed projects on PyPI
'''''''''''''''''''''''''''''''''''''''''''''
If many projects follow `Renaming howto <#how-to-rename-a-project>`_
recipe, then many legacy distributions will have the following
characteristics:
* ``Development Status :: 7 - Inactive`` classifier.
* latest version is empty, except packaging stuff.
* lastest version "redirects" to another distribution. E.g. it has a
single dependency on the renamed project.
* referenced as ``Obsoletes-Dist`` in a newer distribution.
So it will be possible to detect renamed projects and improve
readability on PyPI. So that users can focus on active distributions.
But this feature is not required now. There is no urge. It won't be
covered in this document.
How to apply naming guidelines on existing projects?
----------------------------------------------------
**There is no obligation for existing projects to be renamed**. The
choice is left to project authors and mainteners for obvious reasons.
However, project authors are invited to:
* at least, `state about current naming`_.
* then `plan and promote migration <#promote-migrations>`_.
* optionally actually `rename existing project or distributed
packages/modules <#how-to-rename-a-project>`_.
State about current naming
''''''''''''''''''''''''''
The important thing, at first, is that you state about current
choices:
* Ask yourself "why did I choose the current name?", then document it.
* If there are differences with the guidelines provided in this
document, you should tell your users.
* If possible, create issues in the project's bugtracker, at least for
record. Then you are free to resolve them later, or maybe mark them
as "wontfix".
Projects that are meant to receive contributions from community should
also `organize community contributions`_.
Promote migrations
''''''''''''''''''
Every Python developer should migrate whenever possible, or promote
the migrations in their respective communities.
Apply these guidelines on your projects, then the community will see
it is safe.
In particular, "leaders" such as authors of popular projects are
influential, they have power and, thus, responsability over
communities.
Apply these guidelines on popular projects, then communities will
adopt the conventions too.
**Projects should promote migrations when they release a new (major)
version**, particularly `if this version introduces support for
Python 3.x, new standard library's packaging or namespace packages
<#opportunity>`_.
Opportunity
'''''''''''
As of Python 3.3 being developed:
* many projects are not Python 3.x compatible. It includes "big"
products or frameworks. It means that many projects will have to do
a migration to support Python 3.x.
* packaging (aka distutils2) is on the starting blocks. When it is
released, projects will be invited to migrate and use new packaging.
* `PEP 420`_ brings official support of namespace packages to Python.
It means that most active projects should be about to migrate in the
next year(s) to support Python 3.x, new packaging or new namespace
packages.
Such an opportunity is unique and won't come again soon!
So let's introduce and promote naming conventions as soon as possible
(i.e. **now**).
References
==========
Additional background:
* `Martin Aspeli's article about names`_. Some parts of this document
are quotes from this article.
* `in development official packaging documentation`_.
* `The Hitchhiker's Guide to Packaging`_, which has an empty
placeholder for "naming specification".
References and footnotes:
.. _`packaging terminology in Python documentation`:
http://docs.python.org/dev/packaging/introduction.html#general-python-terminology
.. _`PEP 8`:
http://www.python.org/dev/peps/pep-0008/#package-and-module-names
.. _`PEP 345`: http://www.python.org/dev/peps/pep-0345/
.. _`PEP 420`: http://www.python.org/dev/peps/pep-0420/
.. _`PEP 3108`: http://www.python.org/dev/peps/pep-3108/
.. _`Python community`: http://www.python.org/community/
.. _`gp.fileupload`: http://pypi.python.org/pypi/gp.fileupload/
.. _`zest.releaser`: http://pypi.python.org/pypi/zest.releaser/
.. _`django`: http://djangoproject.com/
.. _`sphinx`: http://sphinx.pocoo.org
.. _`pypi`: http://pypi.python.org
.. _`collective.recaptcha`:
http://pypi.python.org/pypi/collective.recaptcha/
.. _`pipeline`: http://pypi.python.org/pypi/pipeline/
.. _`python-pipeline`: http://pypi.python.org/pypi/python-pipeline/
.. _`django-pipeline`: http://pypi.python.org/pypi/django-pipeline/
.. _`setup script reference`:
http://docs.python.org/dev/packaging/setupscript.html
.. _`setuptools`: http://pypi.python.org/pypi/setuptools
.. _`distribute`: http://packages.python.org/distribute/
.. _`Pillow`: http://pypi.python.org/pypi/Pillow/
.. _`PIL`: http://pypi.python.org/pypi/PIL/
.. _`celery`: http://pypi.python.org/pypi/celery/
.. _`RabbitMQ`: http://www.rabbitmq.com
.. _`DateUtils`: http://pypi.python.org/pypi/DateUtils/
.. _`Framework :: Twisted`:
http://pypi.python.org/pypi?:action=browse&show=all&c=525
.. _`The Zen of Python`: http://www.python.org/dev/peps/pep-0020/
.. _`Plone community`: http://plone.org/community/develop
.. _`Registering with the Package Index`:
http://docs.python.org/dev/packaging/packageindex.html
.. _`Python Standard Library`:
http://docs.python.org/library/index.html
.. _`PEP 345 about Obsolete-Dist`:
http://www.python.org/dev/peps/pep-0345/#obsoletes-dist-multiple-use
.. _`setup.cfg specification`:
http://docs.python.org/dev/packaging/setupcfg.html
.. _`Martin Aspeli's article about names`:
http://www.martinaspeli.net/articles/the-naming-of-things-package-names-and-namespaces
.. _`in development official packaging documentation`:
http://docs.python.org/dev/packaging/
.. _`The Hitchhiker's Guide to Packaging`:
http://guide.python-distribute.org/specification.html#naming-specification
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: