PEP: 381
Title: Mirroring infrastructure for PyPI
Version: $Revision$
Last-Modified: $Date$
Author: Tarek Ziadé <tarek@ziade.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 21-March-2009
Python-Version: N.A.
Post-History:

Abstract
========

This PEP describes a mirroring infrastructure for PyPI.

Rationale
=========

PyPI hosts over 4000 projects and is used daily by people who build
applications. Tools such as `easy_install` and `zc.buildout` in
particular make intensive use of PyPI.

For these heavy users, PyPI can act as a single point of failure.
People have started to set up mirrors, both private and public.
These are active mirrors, which means that they browse PyPI to stay
in sync.

In order to make the system more reliable, this PEP describes:

- how mirrors are listed and registered at PyPI
- the pages a public mirror should maintain. PyPI will use these
  pages to get hit counts and the last modified date.
- how a mirror should synchronize with PyPI
- how a client can implement a fail-over mechanism
- a contact form for package maintainers

Mirror listing and registering
==============================

A new text page will be added at http://pypi.python.org/mirrors that
can be browsed like the simple index. This page lists the mirrors as
a list of links.

These links are the URLs of the simple index of each mirror. The page
will look like this::

    # PyPI mirrors
    #
    # If you want to register a new mirror, send an email
    # to catalog-SIG@python.org with:
    #
    # - The urls of your mirror:
    # - the root of the server
    # - the index page
    # - the last modified page
    # - the local stats page
    # - the global stats page
    # - the mirrors page
    #
    # - The name and email of the maintainer.
    #
    # The registering is done manually and to become a
    # mirror, you need to strictly follow the mirror protocol
    # described here:
    #
    # http://wiki.python.org/PEP_374
    #
    # root,index,last-modified,local-stats,stats,mirrors
    http://example.com/pypi,index,last-modified,local-stats,stats,mirrors
    http://example2.com/pypi,index,last-modified,local-stats,stats,mirrors

When a mirror is proposed on the mailing list, it is checked for
compliance with the mirroring rules and then manually added to the
mirror list in the PyPI application.

The mirror list page is a simple text page that can be browsed by any
tool that wants to get a list of registered mirrors. Package indexes
that are not mirrors of PyPI are not added to the mirror list in
PyPI, although they can provide the same mirror list mechanism for
their own mirrors.

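A tool could read the mirror list along the following lines. This is
only a sketch: the function name `parse_mirrors_page` is hypothetical,
and it assumes that the page names on each line are relative to the
root URL given in the first column, which this PEP does not state
explicitly.

```python
def parse_mirrors_page(text):
    """Parse the text of a /mirrors page into a list of URL mappings."""
    fields = ("index", "last-modified", "local-stats", "stats", "mirrors")
    mirrors = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the commented header.
        if not line or line.startswith("#"):
            continue
        root, *pages = [part.strip() for part in line.split(",")]
        if len(pages) != len(fields):
            continue  # malformed line; a real tool might report it
        root = root.rstrip("/")
        entry = {"root": root}
        for field, page in zip(fields, pages):
            # Assumption: page names are relative to the mirror root.
            entry[field] = root + "/" + page.lstrip("/")
        mirrors.append(entry)
    return mirrors


SAMPLE = """\
# PyPI mirrors
#
# root,index,last-modified,local-stats,stats,mirrors
http://example.com/pypi,index,last-modified,local-stats,stats,mirrors
http://example2.com/pypi,index,last-modified,local-stats,stats,mirrors
"""

mirrors = parse_mirrors_page(SAMPLE)
print(mirrors[0]["last-modified"])
```

Keeping the parser independent of any network call lets the same
function handle the page served by PyPI and the copy served by a
mirror.
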
Special pages a mirror needs to provide
=======================================

A mirror needs to provide four pages besides the index page:

- last-modified
- local-stats
- stats
- mirrors

Last modified date
::::::::::::::::::

CPAN uses a freshness date system where the mirror's last
synchronisation date is made available.

For PyPI, each mirror needs to maintain a URL with simple text content
that gives the date of the mirror's last synchronisation.

The date is provided in GMT, using the ISO 8601 format (see
http://en.wikipedia.org/wiki/ISO_8601).

Each mirror is responsible for maintaining its last modified date.

Conventionally, this page should be reachable at `/last-modified`.

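As an illustration, the content of the `/last-modified` page could be
produced and read back as sketched below. ISO 8601 allows several
representations and this PEP does not pick one, so the
`YYYY-MM-DDTHH:MM:SS` form used here is an assumption, as are the
helper names.

```python
from datetime import datetime, timedelta, timezone

# Assumed representation; the PEP only requires ISO 8601 in GMT.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def format_last_modified(moment):
    """Return the page content for a timezone-aware datetime."""
    return moment.astimezone(timezone.utc).strftime(ISO_FORMAT)


def parse_last_modified(text):
    """Turn the page content back into an aware datetime in GMT."""
    parsed = datetime.strptime(text.strip(), ISO_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


# A synchronisation that finished at 05:01:46 in a UTC-4 timezone.
local = datetime(2009, 3, 22, 5, 1, 46,
                 tzinfo=timezone(timedelta(hours=-4)))
content = format_last_modified(local)
print(content)
```

A client can compare the parsed value with the current time to decide
whether a mirror is fresh enough for its needs.
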
Local statistics
::::::::::::::::

Each mirror is responsible for counting all the downloads that were
done via it. PyPI uses these counts to sum up all downloads and
display the grand total.

These statistics are in CSV-like form, with a header in the first
line. The file needs to obey PEP 305 [#pep305]_; in practice, it
should be readable by Python's `csv` module.

The fields in this file are:

- package: the distutils id of the package.
- filename: the filename that has been downloaded.
- useragent: the User-Agent of the client that has downloaded the
  package.
- count: the number of downloads.

The content will look like this::

    # package,filename,useragent,count
    zc.buildout,zc.buildout-1.6.0.tgz,MyAgent,142
    ...

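Reading such content with the `csv` module could look like the sketch
below. It assumes that the header is the line starting with `#`, as
in the example above; the function name `read_stats` is hypothetical.

```python
import csv
import io


def read_stats(stream):
    """Read statistics rows from a text stream of CSV lines."""
    rows = []
    for record in csv.reader(stream):
        # Skip empty lines and the commented header line.
        if not record or record[0].startswith("#"):
            continue
        package, filename, useragent, count = record
        rows.append({
            "package": package,
            "filename": filename,
            "useragent": useragent,
            "count": int(count),
        })
    return rows


sample = io.StringIO(
    "# package,filename,useragent,count\n"
    "zc.buildout,zc.buildout-1.6.0.tgz,MyAgent,142\n"
)
rows = read_stats(sample)
print(rows[0]["package"], rows[0]["count"])
```
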
The counting starts the day the mirror is launched, and there is one
file per day, compressed using the `bzip2` format. Each file is named
after its day. For example, `2008-11-06.bz2` is the file for the 6th
of November 2008.

The files are served from a folder called `days`. For example:

- /local-stats/days/2008-11-06.bz2
- /local-stats/days/2008-11-07.bz2
- /local-stats/days/2008-11-08.bz2

Conventionally the page should be named `local-stats`, but it can be
any name provided when the mirror is registered.

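The naming and compression rules above can be sketched with the
standard `bz2` module. The helper names are hypothetical, and the
UTF-8 encoding of the file is an assumption, since this PEP does not
specify one.

```python
import bz2
import csv
import io
from datetime import date


def stats_path(day, base="/local-stats"):
    """Return the path of the statistics file for a given day."""
    return "{}/days/{}.bz2".format(base, day.isoformat())


def compress_stats(rows):
    """Serialize rows as CSV with a header and compress them."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["# package", "filename", "useragent", "count"])
    writer.writerows(rows)
    # Assumption: the file is encoded as UTF-8 before compression.
    return bz2.compress(buffer.getvalue().encode("utf-8"))


def decompress_stats(payload):
    """Return the text lines stored in a compressed statistics file."""
    return bz2.decompress(payload).decode("utf-8").splitlines()


path = stats_path(date(2008, 11, 6))
payload = compress_stats(
    [("zc.buildout", "zc.buildout-1.6.0.tgz", "MyAgent", 142)]
)
lines = decompress_stats(payload)
print(path)
```
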
Statistics page
:::::::::::::::

PyPI and each mirror are responsible for providing the grand total
page at `/stats`. PyPI calculates this page daily by reading the
local stats of all mirrors and summing them.

Therefore the mirrors should not try to rebuild this stats page but
simply get the one on PyPI during each synchronization.

It has the same structure as `local-stats` but also provides counts
for months.

Examples:

- /stats/days/2008-11-06.bz2
- /stats/days/2008-11-07.bz2
- /stats/days/2008-11-08.bz2
- /stats/months/2008-11.bz2
- /stats/months/2008-10.bz2

Conventionally the page should be named `stats`, but it can be any
name provided when the mirror is registered.

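The summing step could be sketched as follows. This PEP does not say
how rows are grouped, so treating the combination of package,
filename and user agent as the key is an assumption, as are the
function names and the sample counts.

```python
from collections import Counter


def sum_stats(per_mirror_rows):
    """Sum the download counts found in several mirrors' rows."""
    totals = Counter()
    for rows in per_mirror_rows:
        for package, filename, useragent, count in rows:
            totals[(package, filename, useragent)] += int(count)
    return totals


def monthly_totals(daily_totals):
    """Group per-day totals, keyed by 'YYYY-MM-DD', into months."""
    months = {}
    for day, totals in daily_totals.items():
        month = day[:7]  # "2008-11-06" becomes "2008-11"
        months.setdefault(month, Counter()).update(totals)
    return months


key = ("zc.buildout", "zc.buildout-1.6.0.tgz", "MyAgent")
mirror_a = [key + ("142",)]
mirror_b = [key + ("58",)]

day_6 = sum_stats([mirror_a, mirror_b])
day_7 = sum_stats([mirror_a])
months = monthly_totals({"2008-11-06": day_6, "2008-11-07": day_7})
print(day_6[key], months["2008-11"][key])
```
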
Mirrors listing page
::::::::::::::::::::

As with `/stats`, each mirror should get the `/mirrors` page and
provide a copy of it.

Conventionally the page should be named `mirrors`, but it can be any
name provided when the mirror is registered.

How a mirror should synchronize with PyPI
=========================================

A mirroring protocol called `Simple Index` was described and
implemented by Martin v. Loewis and Jim Fulton, based on how
`easy_install` works. This section summarizes it and gives a few
relevant links, plus a short part about `User-Agent`.

The mirroring protocol
::::::::::::::::::::::

XXX Need to describe the protocol here.

The z3c.pypimirror package [#zcpkg]_ provides an application that
respects this protocol to browse PyPI.

User-agent request header
:::::::::::::::::::::::::

To make it possible to differentiate the actions that clients take on
PyPI, all mirroring software should provide a specific user agent
name.

This is also true for all clients like:

- `zc.buildout <http://pypi.python.org/pypi/zc.buildout>`_
- `setuptools <http://pypi.python.org/pypi/setuptools>`_
- `pip <http://pypi.python.org/pypi/pip>`_
- etc.

XXX user agent registering mechanism at PyPI ?

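A mirroring tool or client written in Python could identify itself by
setting the `User-Agent` request header, for instance with
`urllib.request`. The agent name `example-mirror/0.1` below is purely
hypothetical; this PEP does not define a naming scheme. The sketch
only builds the request and does not send it.

```python
from urllib.request import Request

# Hypothetical agent name; pick one that identifies your tool.
USER_AGENT = "example-mirror/0.1"


def build_request(url, user_agent=USER_AGENT):
    """Build a request that carries a specific User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


request = build_request("http://pypi.python.org/simple/")
# urllib normalizes header names with str.capitalize().
print(request.get_header("User-agent"))
```
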
How a client can use PyPI and its mirrors
:::::::::::::::::::::::::::::::::::::::::

Clients that browse PyPI should be able to use alternative mirrors by
reading the `/mirrors` page at PyPI.

The clients that could use this mechanism so far:

- setuptools
- zc.buildout (through setuptools)
- pip

Fail-over mechanism
:::::::::::::::::::

Clients that browse PyPI should be able to use a fail-over mechanism
when PyPI or the mirror in use is not responding.

This can be done by parsing the `/mirrors` page of PyPI or the copy
located on any PyPI mirror.

It is up to the client to decide which mirror should be used, for
example by looking at its geographical location and its
responsiveness.

This PEP does not describe how this fail-over mechanism should work,
but clients are strongly encouraged to use the nearest mirror.

The clients that could use this mechanism so far:

- setuptools
- zc.buildout (through setuptools)
- pip

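Since this PEP leaves the fail-over strategy to the client, the
following is only one possible sketch: try the servers in the order
given and return the first answer. The function name and the `fetch`
parameter are hypothetical; `fetch` stands for whatever the client
uses to retrieve a URL, and it is assumed to raise `OSError` (which
includes `urllib.error.URLError`) when a server is unreachable.

```python
def fetch_with_failover(path, servers, fetch):
    """Try each server in order and return the first successful result."""
    errors = []
    for server in servers:
        url = server.rstrip("/") + "/" + path.lstrip("/")
        try:
            return url, fetch(url)
        except OSError as exc:
            # Remember the failure and move on to the next server.
            errors.append((url, exc))
    raise OSError("all servers failed: {!r}".format(errors))


def fake_fetch(url):
    """Stand-in for a real download; the first server is 'down'."""
    if url.startswith("http://example.com/"):
        raise OSError("unreachable")
    return b"2009-03-22T09:01:46"


servers = ["http://example.com/pypi", "http://example2.com/pypi"]
used_url, content = fetch_with_failover("last-modified", servers, fake_fetch)
print(used_url)
```

A client that prefers the nearest mirror would sort `servers` by its
own measure of distance or responsiveness before calling the function.
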
Extra package indexes
:::::::::::::::::::::

It is obvious that some packages will not be uploaded to PyPI, either
because they are private or because the project maintainer runs their
own server where people can get the project package. However, a
public package index is strongly encouraged to follow the PyPI and
Distutils protocols.

In other words, the `register` and `upload` commands should be
compatible with any package index server out there.

Software compatible with PyPI and Distutils so far:

- `PloneSoftwareCenter <http://plone.org/products/plonesoftwarecenter>`_
  which is used to run the plone.org products section.
- `EggBasket <http://www.chrisarndt.de/projects/eggbasket>`_

**An extra package index is not a mirror of PyPI, but can have some
mirrors itself.**

Merging several indexes
:::::::::::::::::::::::

When a client needs to get packages from several distinct indexes, it
should be able to use each of them as a potential source of packages.
The indexes should be defined as a sorted list in which the client
looks for a package.

Each independent index can of course provide a list of its mirrors, if
the `/mirrors` page is available.

That permits all combinations at client level, for a reliable
packaging system with all levels of privacy.

It is up to the client to deal with the merging.

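One way a client might walk such a sorted list is sketched below. The
merging policy is left to the client by this PEP, so the "first index
that has the package wins" rule, the function name and the index URLs
are all assumptions made for illustration.

```python
def find_package(name, indexes):
    """Return (index_url, files) from the first index listing `name`.

    `indexes` is a sorted list of (index_url, lookup) pairs, where
    `lookup` is a callable returning the files an index has for a
    package name.
    """
    for index_url, lookup in indexes:
        files = lookup(name)
        if files:
            return index_url, files
    return None


# Hypothetical index contents, standing in for real index queries.
private = {"acme.tools": ["acme.tools-0.1.tar.gz"]}
public = {"zc.buildout": ["zc.buildout-1.6.0.tgz"]}

indexes = [
    ("http://private.example.com/simple", lambda n: private.get(n, [])),
    ("http://pypi.python.org/simple", lambda n: public.get(n, [])),
]

print(find_package("zc.buildout", indexes))
```
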
References
==========

.. [#pep305]
   http://www.python.org/dev/peps/pep-0305/#id19

.. [#zcpkg]
   http://pypi.python.org/pypi/z3c.pypimirror

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: