diff --git a/pep-0381.txt b/pep-0381.txt index 22b320b77..59272488a 100644 --- a/pep-0381.txt +++ b/pep-0381.txt @@ -1,4 +1,4 @@ -PEP: 376 +PEP: 381 Title: Mirroring infrastructure for PyPI Version: $Revision$ Last-Modified: $Date$ @@ -15,38 +15,38 @@ Abstract This PEP describes a mirroring infrastructure for PyPI. + Rationale ========= -PyPI is hosting over 4000 projects and is used on a daily basis -by people to build applications. Especially systems like `easy_install` +PyPI is hosting over 4000 projects and is used on a daily basis by +people to build applications. Especially systems like `easy_install` and `zc.buildout` make intensive usage of PyPI. For people making intensive use of PyPI, it can act as a single point -of failure. People have started to set up some mirrors, both private and -public. Those mirrors are active mirrors, which means that they are -browsing PyPI to get synced. +of failure. People have started to set up some mirrors, both private +and public. Those mirrors are active mirrors, which means that they +are browsing PyPI to get synced. In order to make the system more reliable, this PEP describes: -- the mirror listing and registering at PyPI -- the pages a public mirror should maintain. - these pages will be used by PyPI, in order to get - hit counts and the last modified date. +- the mirror listing and registering at PyPI +- the pages a public mirror should maintain. These pages will be used + by PyPI, in order to get hit counts and the last modified date. - how a mirror should synchronize with PyPI - how a client can implement a fail-over mechanism - a contact form for Package maintainers + Mirror listing and registering ============================== -A new text page will be added at `http://pypi.python.org/mirrors` -that can be browsed like the simple index. This page gives a list of -the mirrors through a list of links. - -These links are the URL of the simple index of each mirror. 
-The page will look like this:: +A new text page will be added at http://pypi.python.org/mirrors that +can be browsed like the simple index. This page gives a list of the +mirrors through a list of links. +These links are the URL of the simple index of each mirror. The page +will look like this:: # PyPI mirrors # @@ -73,15 +73,16 @@ The page will look like this:: http://example.com/pypi,index,last-modified,local-stats,stats,mirrors http://example2.com/pypi,index,last-modified,local-stats,stats,mirrors -When a mirror is proposed on the mailing list, it is manually -added in the mirror list in the PyPI application after it -has been checked to be compliant with the mirroring rules. +When a mirror is proposed on the mailing list, it is manually added in +the mirror list in the PyPI application after it has been checked to +be compliant with the mirroring rules. + +The mirror list page is a simple text page that can be browsed by any +tool that wants to get a list of registered mirrors. Other package +indexes that are not mirrors of PyPI are not added in the mirror list +in PyPI, although they can provide themselves the same mirroring list +mechanism for their own mirrors. -The mirror list page is a simple text page that can be browsed -by any tool that wants to get a list of registered mirrors. -Other package indexes that are not mirrors of PyPI are not added in the -mirror list in PyPI. Although they can provide themselve the -same mirroring list mechanism for their own mirrors. Special pages a mirror needs to provide ======================================= @@ -96,35 +97,36 @@ A mirror needs to provide four pages, beside the index one: Last modified date :::::::::::::::::: -CPAN uses a freshness date system where the mirror last synchronisation -date is made available. +CPAN uses a freshness date system where the mirror's last +synchronisation date is made available. 
-For PyPI, each mirror needs to maintain an url with a simple text content +For PyPI, each mirror needs to maintain a URL with simple text content that represents the last synchronisation date the mirror maintains. -The date is provided in GMT time, using the ISO 8601 format -(see http://en.wikipedia.org/wiki/ISO_8601) +The date is provided in GMT time, using the ISO 8601 format (see +http://en.wikipedia.org/wiki/ISO_8601). -Each mirror will be responsible to maintain its last modified date. +Each mirror will be responsible to maintain its last modified date. -Conventionaly, this page should be reachable at: `/last-modified`. +Conventionally, this page should be reachable at: `/last-modified`. Local statistics :::::::::::::::: -Each mirror is responsible to count all the downloads -that where done on it. This is used by PyPI to sum up all -downloads, to be able to display the grand total. +Each mirror is responsible to count all the downloads that were done +via it. This is used by PyPI to sum up all downloads, to be able to +display the grand total. -These statistics are in csv-like form, with a header at the first -line. It needs to obey `PEP 305 `_ -Basically, it should be readable by Python `csv` module. +These statistics are in CSV-like form, with a header in the first +line. It needs to obey PEP 305 [#pep305]_. Basically, it should be +readable by Python's `csv` module. The fields in this file are: - package: the distutils id of the package. - filename: the filename that has been downloaded. -- useragent: the User-Agent of the client that has downloaded the package. +- useragent: the User-Agent of the client that has downloaded the + package. - count: the number of downloads. The content will look like this:: @@ -133,9 +135,10 @@ The content will look like this:: zc.buildout,zc.buildout-1.6.0.tgz,MyAgent,142 ... -The counting starts the day the mirror is launched, and there is one file per -day, compressed using the `bzip2` format. 
Each file is named after the -day. For example `2008-11-06.bz2` is the file for the 6th of November 2008. +The counting starts the day the mirror is launched, and there is one +file per day, compressed using the `bzip2` format. Each file is named +like the day. For example `2008-11-06.bz2` is the file for the 6th of +November 2008. They are then provided in a folder called `days`. For example: @@ -143,21 +146,21 @@ They are then provided in a folder called `days`. For example: - /local-stats/days/2008-11-07.bz2 - /local-stats/days/2008-11-08.bz2 -Conventionally the name should be `local-stats` but it can be any name -provided when the mirror is registered. +Conventionally the name should be `local-stats`, but it can be any +name provided when the mirror is registered. Statistics page ::::::::::::::: -PyPI and each mirror are responsible to provide the grand total -page at `/stats`. This page is calculated daily by PyPI, -by reading all mirrors local stats and suming them. +PyPI and each mirror are responsible to provide the grand total page +at `/stats`. This page is calculated daily by PyPI, by reading all +mirrors' local stats and summing them. -Therefore the mirrors should not try to rebuild this stat page but simply -get PyPI's one during each synchronization. +Therefore the mirrors should not try to rebuild this stat page but +simply get the one on PyPI during each synchronization. -It has the same structure than `local-stats` but also provides -counts for months. +It has the same structure as `local-stats` but also provides counts +for months. Examples: @@ -167,42 +170,42 @@ Examples: - /stats/months/2008-11.bz2 - /stats/months/2008-10.bz2 -Conventionally the name should be `stats` but it can be any name +Conventionally the name should be `stats`, but it can be any name provided when the mirror is registered. Mirrors listing page :::::::::::::::::::: -Like `/stats`, each mirror should get and provide a copy of the `/mirrors` -page. 
+Like `/stats`, each mirror should get and provide a copy of the +`/mirrors` page. -Conventionally the name should be `mirrors` but it can be any name +Conventionally the name should be `mirrors`, but it can be any name provided when the mirror is registered. + How a mirror should synchronize with PyPI ========================================= -A mirroring protocol calls `Simple Index` was described -and implemented by Martin v. Loewis and Jim Fulton, based on -how `easy_install` works. This section synthesizes it -and give a few relevant links, plus a small part about -`User-Agent`. +A mirroring protocol called `Simple Index` was described and +implemented by Martin v. Loewis and Jim Fulton, based on how +`easy_install` works. This section synthesizes it and gives a few +relevant links, plus a small part about `User-Agent`. The mirroring protocol :::::::::::::::::::::: XXX Need to describe the protocol here. -The `zc.pypimirror `_ package -provides an application that respects this protocol to browse PyPI. +The zc.pypimirror package [#zcpkg]_ provides an application that +respects this protocol to browse PyPI. User-agent request header ::::::::::::::::::::::::: -In order to be able to differentiate actions taken by clients -over PyPI, a specific user agent name should be provided by all -mirroring softwares. +In order to be able to differentiate actions taken by clients over +PyPI, a specific user agent name should be provided by all mirroring +softwares. This is also true for all clients like: @@ -216,9 +219,8 @@ XXX user agent registering mechanism at PyPI ? How a client can use PyPI and its mirrors ::::::::::::::::::::::::::::::::::::::::: -Clients that are browsing PyPI should be able to use -alternative mirrors, by reading the `/mirrors` page -at PyPI. +Clients that are browsing PyPI should be able to use alternative +mirrors, by reading the `/mirrors` page at PyPI. 
The clients so far that could use this mechanism: @@ -229,20 +231,18 @@ The clients so far that could use this mechanism: Fail-over mechanism ::::::::::::::::::: -Clients that are browsing PyPI should be able to use -a fail-over mechanism when PyPI or the used mirror -is not responding. +Clients that are browsing PyPI should be able to use a fail-over +mechanism when PyPI or the used mirror is not responding. -This can be done by parsing the `/mirrors` page of PyPI -or the one located on any PyPI mirror. +This can be done by parsing the `/mirrors` page of PyPI or the one +located on any PyPI mirror. -It is up to the client to decide wich mirror should -be used. Maybe by looking at its geographical location and -its responsivness. +It is up to the client to decide which mirror should be used, maybe by +looking at its geographical location and its responsiveness. -This PEP does not describe how this fail-over -mechanism should work, but it is strongly encouraged -that the clients try to use the nearest mirror. +This PEP does not describe how this fail-over mechanism should work, +but it is strongly encouraged that the clients try to use the nearest +mirror. The clients so far that could use this mechanism: @@ -253,44 +253,51 @@ The clients so far that could use this mechanism: Extra package indexes ::::::::::::::::::::: -It is obvious that some package will not be uploaded -to PyPI. Wether because they are private or wether because -the project maintainer runs his own server where people -might get the project package. Although, it is strongly -encouraged that a public package index follows PyPI -and Distutils protocols. +It is obvious that some packages will not be uploaded to PyPI, whether +because they are private or whether because the project maintainer +runs his own server where people might get the project package. +However, it is strongly encouraged that a public package index follows +PyPI and Distutils protocols. 
-In other words, the `register` and `upload` command -should be compatible with any package index server out -there. +In other words, the `register` and `upload` commands should be +compatible with any package index server out there. -Softwares that are compatible with PyPI and Distutils so -far: +Softwares that are compatible with PyPI and Distutils so far: - `PloneSoftwareCenter `_ wich is used to run plone.org products section. - `EggBasket `_ -**An extra package index is not a mirror or PyPI but can have itself -some mirrors** +**An extra package index is not a mirror of PyPI, but can have some +mirrors itself.** Merging several indexes ::::::::::::::::::::::: -When a client needs to get some packages from several -distinct indexes, it should be able to use each one of them -as a potential source of packages. Different indexes -should be defined as a sorted list for the client to -look for a package. +When a client needs to get some packages from several distinct +indexes, it should be able to use each one of them as a potential +source of packages. Different indexes should be defined as a sorted +list for the client to look for a package. -Each independant index can of course provide a list of -its mirrors, if the `/mirrors` page is available. +Each independent index can of course provide a list of its mirrors, if +the `/mirrors` page is available. That permits all combinations at client level, for a reliable packaging system with all levels of privacy. It is up the client to deal with the merging. + +References +========== + +.. [#pep305] + http://www.python.org/dev/peps/pep-0305/#id19 + +.. [#zcpkg] + http://pypi.python.org/pypi/z3c.pypimirror + + Copyright =========