Fix number and edit PEP a bit for clarity.

Georg Brandl 2009-03-22 09:01:46 +00:00
parent b5bf5d858d
commit 7cd0217332
1 changed files with 105 additions and 98 deletions


@ -1,4 +1,4 @@
PEP: 376 PEP: 381
Title: Mirroring infrastructure for PyPI Title: Mirroring infrastructure for PyPI
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
@ -15,38 +15,38 @@ Abstract
This PEP describes a mirroring infrastructure for PyPI. This PEP describes a mirroring infrastructure for PyPI.
Rationale Rationale
========= =========
PyPI is hosting over 4000 projects and is used on a daily basis PyPI is hosting over 4000 projects and is used on a daily basis by
by people to build applications. Especially systems like `easy_install` people to build applications. Especially systems like `easy_install`
and `zc.buildout` make intensive usage of PyPI. and `zc.buildout` make intensive usage of PyPI.
For people making intensive use of PyPI, it can act as a single point For people making intensive use of PyPI, it can act as a single point
of failure. People have started to set up some mirrors, both private and of failure. People have started to set up some mirrors, both private
public. Those mirrors are active mirrors, which means that they are and public. Those mirrors are active mirrors, which means that they
browsing PyPI to get synced. are browsing PyPI to get synced.
In order to make the system more reliable, this PEP describes: In order to make the system more reliable, this PEP describes:
- the mirror listing and registering at PyPI - the mirror listing and registering at PyPI
- the pages a public mirror should maintain. - the pages a public mirror should maintain. These pages will be used
these pages will be used by PyPI, in order to get by PyPI, in order to get hit counts and the last modified date.
hit counts and the last modified date.
- how a mirror should synchronize with PyPI - how a mirror should synchronize with PyPI
- how a client can implement a fail-over mechanism - how a client can implement a fail-over mechanism
- a contact form for Package maintainers - a contact form for Package maintainers
Mirror listing and registering Mirror listing and registering
============================== ==============================
A new text page will be added at `http://pypi.python.org/mirrors` A new text page will be added at http://pypi.python.org/mirrors that
that can be browsed like the simple index. This page gives a list of can be browsed like the simple index. This page gives a list of the
the mirrors through a list of links. mirrors through a list of links.
These links are the URL of the simple index of each mirror.
The page will look like this::
These links are the URL of the simple index of each mirror. The page
will look like this::
# PyPI mirrors # PyPI mirrors
# #
@ -73,15 +73,16 @@ The page will look like this::
http://example.com/pypi,index,last-modified,local-stats,stats,mirrors http://example.com/pypi,index,last-modified,local-stats,stats,mirrors
http://example2.com/pypi,index,last-modified,local-stats,stats,mirrors http://example2.com/pypi,index,last-modified,local-stats,stats,mirrors
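Editorial illustration, not part of the commit: a client might read the mirror list format shown above roughly as follows. The helper name `parse_mirror_list` is hypothetical, and treating the comma-separated fields as a base URL followed by five page names is an assumption drawn only from the two sample lines.

```python
def parse_mirror_list(text):
    """Parse the text of the mirror list page into a list of dicts.

    Hypothetical helper: the field order is assumed from the sample lines.
    """
    fields = ("url", "index", "last_modified", "local_stats", "stats", "mirrors")
    mirrors = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        parts = [part.strip() for part in line.split(",")]
        # Ignore lines that do not have the assumed number of fields.
        if len(parts) != len(fields):
            continue
        mirrors.append(dict(zip(fields, parts)))
    return mirrors
```

Lines that do not match the assumed layout are skipped rather than raising, so a partially malformed page still yields the usable entries.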
When a mirror is proposed on the mailing list, it is manually When a mirror is proposed on the mailing list, it is manually added in
added in the mirror list in the PyPI application after it the mirror list in the PyPI application after it has been checked to
has been checked to be compliant with the mirroring rules. be compliant with the mirroring rules.
The mirror list page is a simple text page that can be browsed by any
tool that wants to get a list of registered mirrors. Other package
indexes that are not mirrors of PyPI are not added in the mirror list
in PyPI, although they can provide themselves the same mirroring list
mechanism for their own mirrors.
The mirror list page is a simple text page that can be browsed
by any tool that wants to get a list of registered mirrors.
Other package indexes that are not mirrors of PyPI are not added in the
mirror list in PyPI. Although they can provide themselves the
same mirroring list mechanism for their own mirrors.
Special pages a mirror needs to provide Special pages a mirror needs to provide
======================================= =======================================
@ -96,35 +97,36 @@ A mirror needs to provide four pages, beside the index one:
Last modified date Last modified date
:::::::::::::::::: ::::::::::::::::::
CPAN uses a freshness date system where the mirror last synchronisation CPAN uses a freshness date system where the mirror's last
date is made available. synchronisation date is made available.
For PyPI, each mirror needs to maintain an url with a simple text content For PyPI, each mirror needs to maintain a URL with simple text content
that represents the last synchronisation date the mirror maintains. that represents the last synchronisation date the mirror maintains.
The date is provided in GMT time, using the ISO 8601 format The date is provided in GMT time, using the ISO 8601 format (see
(see http://en.wikipedia.org/wiki/ISO_8601) http://en.wikipedia.org/wiki/ISO_8601).
Each mirror will be responsible to maintain its last modified date. Each mirror will be responsible to maintain its last modified date.
Conventionaly, this page should be reachable at: `/last-modified`. Conventionally, this page should be reachable at: `/last-modified`.
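Editorial illustration, not part of the commit: a sketch of how a client could read the `/last-modified` content and compute how stale a mirror is. The PEP text only says "ISO 8601", so the exact layout `YYYY-MM-DDTHH:MM:SS` used here is an assumption, as are the function names.

```python
from datetime import datetime, timezone


def parse_last_modified(text):
    """Parse the page content into an aware UTC datetime.

    Assumed layout: 2009-03-22T09:01:46 (the PEP does not fix one).
    """
    stamp = datetime.strptime(text.strip(), "%Y-%m-%dT%H:%M:%S")
    return stamp.replace(tzinfo=timezone.utc)


def mirror_age_seconds(text, now=None):
    """Return how many seconds have passed since the mirror last synced."""
    now = now or datetime.now(timezone.utc)
    return (now - parse_last_modified(text)).total_seconds()
```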
Local statistics Local statistics
:::::::::::::::: ::::::::::::::::
Each mirror is responsible to count all the downloads Each mirror is responsible to count all the downloads that were done
that were done on it. This is used by PyPI to sum up all via it. This is used by PyPI to sum up all downloads, to be able to
downloads, to be able to display the grand total. display the grand total.
These statistics are in csv-like form, with a header at the first These statistics are in CSV-like form, with a header in the first
line. It needs to obey `PEP 305 <http://www.python.org/dev/peps/pep-0305/#id19>`_ line. It needs to obey PEP 305 [#pep305]_. Basically, it should be
Basically, it should be readable by Python `csv` module. readable by Python's `csv` module.
The fields in this file are: The fields in this file are:
- package: the distutils id of the package. - package: the distutils id of the package.
- filename: the filename that has been downloaded. - filename: the filename that has been downloaded.
- useragent: the User-Agent of the client that has downloaded the package. - useragent: the User-Agent of the client that has downloaded the
package.
- count: the number of downloads. - count: the number of downloads.
The content will look like this:: The content will look like this::
@ -133,9 +135,10 @@ The content will look like this::
zc.buildout,zc.buildout-1.6.0.tgz,MyAgent,142 zc.buildout,zc.buildout-1.6.0.tgz,MyAgent,142
... ...
The counting starts the day the mirror is launched, and there is one file per The counting starts the day the mirror is launched, and there is one
day, compressed using the `bzip2` format. Each file is named after the file per day, compressed using the `bzip2` format. Each file is named
day. For example `2008-11-06.bz2` is the file for the 6th of November 2008. like the day. For example `2008-11-06.bz2` is the file for the 6th of
November 2008.
They are then provided in a folder called `days`. For example: They are then provided in a folder called `days`. For example:
@ -143,21 +146,21 @@ They are then provided in a folder called `days`. For example:
- /local-stats/days/2008-11-07.bz2 - /local-stats/days/2008-11-07.bz2
- /local-stats/days/2008-11-08.bz2 - /local-stats/days/2008-11-08.bz2
Conventionally the name should be `local-stats` but it can be any name Conventionally the name should be `local-stats`, but it can be any
provided when the mirror is registered. name provided when the mirror is registered.
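Editorial illustration, not part of the commit: a minimal sketch of building the path of a daily file and reading its content with Python's `csv` and `bz2` modules. It assumes the header row uses the field names listed above (`package`, `filename`, `useragent`, `count`) and that the file is UTF-8 encoded; neither is stated in the text. Both function names are hypothetical.

```python
import bz2
import csv
import io
from datetime import date


def daily_stats_path(day, base="local-stats"):
    """Return the conventional path of the stats file for one day."""
    return "/%s/days/%s.bz2" % (base, day.isoformat())


def read_daily_stats(compressed):
    """Decompress one daily file and return (package, filename, useragent, count) rows."""
    text = bz2.decompress(compressed).decode("utf-8")  # encoding is an assumption
    reader = csv.DictReader(io.StringIO(text))  # first line is the header
    return [
        (row["package"], row["filename"], row["useragent"], int(row["count"]))
        for row in reader
    ]
```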
Statistics page Statistics page
::::::::::::::: :::::::::::::::
PyPI and each mirror are responsible to provide the grand total PyPI and each mirror are responsible to provide the grand total page
page at `/stats`. This page is calculated daily by PyPI, at `/stats`. This page is calculated daily by PyPI, by reading all
by reading all mirrors local stats and suming them. mirrors' local stats and summing them.
Therefore the mirrors should not try to rebuild this stat page but simply Therefore the mirrors should not try to rebuild this stat page but
get PyPI's one during each synchronization. simply get the one on PyPI during each synchronization.
It has the same structure than `local-stats` but also provides It has the same structure as `local-stats` but also provides counts
counts for months. for months.
Examples: Examples:
@ -167,42 +170,42 @@ Examples:
- /stats/months/2008-11.bz2 - /stats/months/2008-11.bz2
- /stats/months/2008-10.bz2 - /stats/months/2008-10.bz2
Conventionally the name should be `stats` but it can be any name Conventionally the name should be `stats`, but it can be any name
provided when the mirror is registered. provided when the mirror is registered.
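Editorial illustration, not part of the commit: the daily summing step described in this section could look roughly like this. The row layout `(package, filename, useragent, count)` follows the field list of the local statistics; the function name and the in-memory representation are assumptions.

```python
from collections import Counter


def sum_local_stats(per_mirror_rows):
    """Sum download counts over the local stats of several mirrors.

    `per_mirror_rows` is an iterable with one list of
    (package, filename, useragent, count) tuples per mirror.
    """
    totals = Counter()
    for rows in per_mirror_rows:
        for package, filename, useragent, count in rows:
            totals[(package, filename, useragent)] += count
    return totals
```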
Mirrors listing page Mirrors listing page
:::::::::::::::::::: ::::::::::::::::::::
Like `/stats`, each mirror should get and provide a copy of the `/mirrors` Like `/stats`, each mirror should get and provide a copy of the
page. `/mirrors` page.
Conventionally the name should be `mirrors` but it can be any name Conventionally the name should be `mirrors`, but it can be any name
provided when the mirror is registered. provided when the mirror is registered.
How a mirror should synchronize with PyPI How a mirror should synchronize with PyPI
========================================= =========================================
A mirroring protocol calls `Simple Index` was described A mirroring protocol called `Simple Index` was described and
and implemented by Martin v. Loewis and Jim Fulton, based on implemented by Martin v. Loewis and Jim Fulton, based on how
how `easy_install` works. This section synthesizes it `easy_install` works. This section synthesizes it and gives a few
and give a few relevant links, plus a small part about relevant links, plus a small part about `User-Agent`.
`User-Agent`.
The mirroring protocol The mirroring protocol
:::::::::::::::::::::: ::::::::::::::::::::::
XXX Need to describe the protocol here. XXX Need to describe the protocol here.
The `zc.pypimirror <http://pypi.python.org/pypi/z3c.pypimirror>`_ package The zc.pypimirror package [#zcpkg]_ provides an application that
provides an application that respects this protocol to browse PyPI. respects this protocol to browse PyPI.
User-agent request header User-agent request header
::::::::::::::::::::::::: :::::::::::::::::::::::::
In order to be able to differentiate actions taken by clients In order to be able to differentiate actions taken by clients over
over PyPI, a specific user agent name should be provided by all PyPI, a specific user agent name should be provided by all mirroring
mirroring softwares. softwares.
This is also true for all clients like: This is also true for all clients like:
@ -216,9 +219,8 @@ XXX user agent registering mechanism at PyPI ?
How a client can use PyPI and its mirrors How a client can use PyPI and its mirrors
::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::::::::::
Clients that are browsing PyPI should be able to use Clients that are browsing PyPI should be able to use alternative
alternative mirrors, by reading the `/mirrors` page mirrors, by reading the `/mirrors` page at PyPI.
at PyPI.
The clients so far that could use this mechanism: The clients so far that could use this mechanism:
@ -229,20 +231,18 @@ The clients so far that could use this mechanism:
Fail-over mechanism Fail-over mechanism
::::::::::::::::::: :::::::::::::::::::
Clients that are browsing PyPI should be able to use Clients that are browsing PyPI should be able to use a fail-over
a fail-over mechanism when PyPI or the used mirror mechanism when PyPI or the used mirror is not responding.
is not responding.
This can be done by parsing the `/mirrors` page of PyPI This can be done by parsing the `/mirrors` page of PyPI or the one
or the one located on any PyPI mirror. located on any PyPI mirror.
It is up to the client to decide which mirror should It is up to the client to decide which mirror should be used, maybe by
be used. Maybe by looking at its geographical location and looking at its geographical location and its responsiveness.
its responsiveness.
This PEP does not describe how this fail-over This PEP does not describe how this fail-over mechanism should work,
mechanism should work, but it is strongly encouraged but it is strongly encouraged that the clients try to use the nearest
that the clients try to use the nearest mirror. mirror.
The clients so far that could use this mechanism: The clients so far that could use this mechanism:
@ -253,44 +253,51 @@ The clients so far that could use this mechanism:
Extra package indexes Extra package indexes
::::::::::::::::::::: :::::::::::::::::::::
It is obvious that some package will not be uploaded It is obvious that some packages will not be uploaded to PyPI, whether
to PyPI. Wether because they are private or wether because because they are private or whether because the project maintainer
the project maintainer runs his own server where people runs his own server where people might get the project package.
might get the project package. Although, it is strongly However, it is strongly encouraged that a public package index follows
encouraged that a public package index follows PyPI PyPI and Distutils protocols.
and Distutils protocols.
In other words, the `register` and `upload` command In other words, the `register` and `upload` command should be
should be compatible with any package index server out compatible with any package index server out there.
there.
Softwares that are compatible with PyPI and Distutils so Softwares that are compatible with PyPI and Distutils so far:
far:
- `PloneSoftwareCenter <http://plone.org/products/plonesoftwarecenter>`_ - `PloneSoftwareCenter <http://plone.org/products/plonesoftwarecenter>`_
which is used to run plone.org products section. which is used to run plone.org products section.
- `EggBasket <http://www.chrisarndt.de/projects/eggbasket>`_ - `EggBasket <http://www.chrisarndt.de/projects/eggbasket>`_
**An extra package index is not a mirror or PyPI but can have itself **An extra package index is not a mirror of PyPI, but can have some
some mirrors** mirrors itself.**
Merging several indexes Merging several indexes
::::::::::::::::::::::: :::::::::::::::::::::::
When a client needs to get some packages from several When a client needs to get some packages from several distinct
distinct indexes, it should be able to use each one of them indexes, it should be able to use each one of them as a potential
as a potential source of packages. Different indexes source of packages. Different indexes should be defined as a sorted
should be defined as a sorted list for the client to list for the client to look for a package.
look for a package.
Each independent index can of course provide a list of Each independent index can of course provide a list of its mirrors, if
its mirrors, if the `/mirrors` page is available. the `/mirrors` page is available.
That permits all combinations at client level, for a reliable That permits all combinations at client level, for a reliable
packaging system with all levels of privacy. packaging system with all levels of privacy.
It is up to the client to deal with the merging.
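Editorial illustration, not part of the commit: one way a client might treat several indexes as a sorted list of package sources. Representing each index as a name plus a set of package names is a simplifying assumption, and `find_package` is a hypothetical helper.

```python
def find_package(name, indexes):
    """Return the name of the first index that provides `name`, or None.

    `indexes` is an ordered list of (index_name, package_names) pairs;
    earlier entries take priority over later ones.
    """
    for index_name, package_names in indexes:
        if name in package_names:
            return index_name
    return None
```

For example, listing a private index before PyPI makes the private copy of a package win when both provide it.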
References
==========
.. [#pep305]
http://www.python.org/dev/peps/pep-0305/#id19
.. [#zcpkg]
http://pypi.python.org/pypi/z3c.pypimirror
Copyright Copyright
========= =========