PEP: 347
Title: Migrating the Python CVS to Subversion
Version: $Revision$
Last-Modified: $Date$
Author: Martin v. Löwis <martin@v.loewis.de>
Discussions-To: <python-dev@python.org>
Status: Draft
Type: Process
Content-Type: text/x-rst
Created: 14-Jul-2004
Post-History: 14-Jul-2004


Abstract
========

The Python source code is currently managed in a CVS repository on
sourceforge.net. This PEP proposes to move it to a Subversion
repository on svn.python.org.


Rationale
=========

This change has two aspects: moving from CVS to Subversion, and moving
from SourceForge to python.org. For each, a rationale will be given.


Moving to Subversion
--------------------

CVS has a number of limitations that have been eliminated by
Subversion. For the development of Python, the most notable
improvements are:

- the ability to rename files and directories, and to remove
  directories, while keeping the history of these files.

- support for change sets (sets of correlated changes to multiple
  files) through global revision numbers. Change sets are
  transactional.

- atomic, fast tagging: a cvs tag might take many minutes; a
  Subversion tag (svn cp) will complete quickly, and atomically.
  Likewise, branches are very efficient.

- support for offline diffs, which is useful when creating patches.


Moving to python.org
--------------------

SourceForge has kindly provided an important infrastructure for the
past years. Unfortunately, the attention that SF received has also
caused repeated overload situations in the past, to which the SF
operators could not always respond in a timely manner. In particular,
for CVS, they had to reduce the load on the primary CVS server by
introducing a second, read-only CVS server for anonymous access. This
server is regularly synchronized, but lags behind the read-write
CVS repository between synchronizations. As a result, users without
commit access can see recent changes to the repository only after a
delay.

On python.org, it would be possible to make the repository accessible
for anonymous access.


Migration Procedure
===================

To move the Python CVS repository, the following steps need to be
executed. The steps are elaborated upon in the following sections.

1. Collect SSH keys for all current committers, along with usernames
   to appear in commit messages.

2. At the beginning of the migration, announce that the repository on
   SourceForge is closed.

3. 24 hours after the last commit, download the CVS repository.

4. Convert the CVS repository into a Subversion repository.

5. Publish the repository with write access for committers, and
   read-only anonymous access.

6. Disable CVS access on SF.


Collect SSH keys
----------------

After some discussion, svn+ssh was selected as the best method
for write access to the repository. Developers can continue to
use their SSH keys, but the keys must be installed on python.org.

In order to avoid having to create a new Unix user for each
developer, a single account should be used, with command=
attributes in the authorized_keys files.

The lines in the authorized_keys file should read like this
(wrapped for better readability)::

  command="/usr/bin/svnserve --root=/svnroot -t
  --tunnel-user='<username>'",no-port-forwarding,
  no-X11-forwarding,no-agent-forwarding,no-pty
  ssh-dss <key> <comment>

Real names should be used as the usernames, instead of the SF
account names, so that people can be better identified in log
messages.
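
The entry format above can be produced mechanically. The following is
a minimal sketch, not part of the proposed setup: the function name
authorized_keys_entry and its two arguments are hypothetical, while
the svnserve path, the repository root and the option list are copied
from the example above. It assumes the username contains no quote
characters.

```shell
#!/bin/sh
# Hypothetical helper: print one authorized_keys line for a committer.
#   $1 = username to record in commits (passed to --tunnel-user)
#   $2 = public key line from the developer ("ssh-dss <key> <comment>")
# Assumption: the username contains neither ' nor " characters.
authorized_keys_entry() {
    username=$1
    pubkey=$2
    # The complete entry must end up on a single line of the file.
    printf 'command="/usr/bin/svnserve --root=/svnroot -t --tunnel-user='\''%s'\''",' "$username"
    printf 'no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty %s\n' "$pubkey"
}
```

An administrator could then append a new committer with something
like authorized_keys_entry "Jane Doe" "$(cat jane.pub)", redirecting
the output to the end of the authorized_keys file; the name and the
key file are placeholders.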

Administrator Access
--------------------

Administrator access to the pythondev account should be granted
to all current admins of the Python SF project. To distinguish
between shell login and svnserve login, admins need to maintain
two keys. Using OpenSSH, the following procedure can be
used to create a second key::

  cd .ssh
  ssh-keygen -t dsa -f pythondev -C <user>@pythondev
  vi config

In the config file, the following lines need to be added::

  Host pythondev
    Hostname dinsdale.python.org
    User pythondev
    IdentityFile ~/.ssh/pythondev

Then, shell login becomes possible through "ssh pythondev".

Downloading the CVS Repository
------------------------------

The CVS repository can be downloaded from

  http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.bz2

Since this tarball is generated only once a day, some time must pass
after the repository freeze before the tarball can be picked up. It
should be verified that the last commit, as recorded on the
python-commits mailing list, is indeed included in the tarball.
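
One possible way to perform this check is sketched below, under
stated assumptions: every file in the tarball is an RCS ",v" file
whose header contains a line of the form "head <revision>;", so the
revision reported in the last python-commits message can be compared
against it. The function name rcs_head_revision is hypothetical, and
the check only covers commits on the trunk.

```shell
#!/bin/sh
# Hypothetical helper: print the head (trunk) revision of an RCS file.
#   $1 = path to a ",v" file from the unpacked tarball
# Assumption: the admin header contains a line such as "head 1.42;".
rcs_head_revision() {
    # Print only the revision number of the first matching line.
    sed -n 's/^head[[:space:]]\{1,\}\([0-9.]\{1,\}\);.*$/\1/p' "$1" | head -n 1
}
```

After unpacking the tarball, calling rcs_head_revision on the ",v"
file named in the last commit message should print the revision from
that message.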

After the conversion, the CVS tarball that was converted should be
kept permanently at www.python.org/archive/python-cvsroot-<date>.tar.bz2


Converting the CVS Repository
-----------------------------

The Python CVS repository contains two modules: distutils and python.
The python module is further structured into dist and nondist,
where dist only contains src (the python code proper). nondist
contains various subdirectories.

These should be reorganized in the Subversion repository to get
shorter URLs, following the <project>/{trunk,tags,branches}
structure. A project will be created for each nondist directory,
one for src (called python), and one for distutils. Reorganizing the
repository is best done in the CVS tree, as shown below.

The fsfs backend should be used as the repository format (which
requires Subversion 1.1). The fsfs backend has the advantage of being
more backup-friendly, as it allows incremental repository backups,
without requiring any dump commands to be run.

The conversion should be done using the cvs2svn utility, available
e.g. in the cvs2svn Debian package. As cvs2svn does not currently
support the project/trunk structure, each project needs to be
converted separately. To get each conversion result into a separate
directory in the target repository, svnadmin load must be used.

Subversion has a different view on binary-vs-text files than CVS.
To correctly carry the CVS semantics forward, svn:eol-style should
be set to native on all files that are not marked binary in
CVS.
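
The rule can be illustrated with a small check against the RCS files.
This is only a sketch: it assumes that CVS records a binary file
(added with -kb) through an "expand @b@;" line in the header of its
",v" file, the function names are hypothetical, and the actual
conversion may apply the property by other means.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the RCS file is marked binary.
# Assumption: binary files carry the keyword-expansion mode "b" in the
# admin header, written as a line "expand @b@;".
is_cvs_binary() {
    grep -q '^expand[[:space:]]*@b@;' "$1"
}

# Print the svn:eol-style value implied by the rule above:
# "native" for text files, nothing at all for binary files.
eol_style_for() {
    if is_cvs_binary "$1"; then
        return 0
    fi
    echo native
}
```

In a Subversion working copy, the property itself would be set with
svn propset svn:eol-style native <file>.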

In summary, the conversion script is::

  #!/bin/sh
  # Start from a clean slate; -f keeps rm quiet if nothing is left over.
  rm -f cvs2svn-*
  rm -rf python py.new
  tar xjf python-cvsroot.tar.bz2
  rm -rf python/CVSROOT
  svnadmin create --fs-type fsfs py.new
  # Reorganize the CVS tree: one top-level directory per project.
  mv python/python python/orig
  mv python/orig/dist/src python/python
  mv python/orig/nondist/* python
  # nondist/nondist is empty
  rmdir python/nondist
  rm -rf python/orig
  # Convert each project separately and load it under its own directory.
  for a in python/*
  do
    b=`basename "$a"`
    cvs2svn -q --dump-only --encoding=latin1 --force-branch=cnri-16-start \
    --force-branch=descr-branch --force-branch=release152p1-patches \
    --force-tag=r16b1 "$a"
    svn mkdir -m"Conversion to SVN" "file:///`pwd`/py.new/$b"
    svnadmin load -q --parent-dir "$b" py.new < cvs2svn-dump
    rm cvs2svn-dump
  done

Sample results of this conversion are available at

  http://www.dcl.hpi.uni-potsdam.de/pysvn/


Publish the Repository
----------------------

The repository should be published at http://svn.python.org/projects.
Read-write access should be granted to all current SF committers
through svn+ssh://pythondev@svn.python.org/projects;
read-only anonymous access through WebDAV should also be
granted.
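
Combined with the <project>/{trunk,tags,branches} layout described
earlier, the two access paths would lead to checkout URLs such as the
ones built below. This is a sketch only: the function name trunk_url
and its "rw"/"ro" arguments are hypothetical, and the URLs simply
join the base URLs above with a project name and "trunk".

```shell
#!/bin/sh
# Hypothetical helper: print the URL of a project's trunk.
#   $1 = "rw" for committers (svn+ssh) or "ro" for anonymous access
#   $2 = project name, e.g. python or distutils
trunk_url() {
    case $1 in
        rw) base=svn+ssh://pythondev@svn.python.org/projects ;;
        ro) base=http://svn.python.org/projects ;;
        *)  echo "usage: trunk_url rw|ro <project>" >&2; return 2 ;;
    esac
    printf '%s/%s/trunk\n' "$base" "$2"
}
```

A working copy could then be obtained with, for example,
svn checkout "$(trunk_url ro python)" python-trunk.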

As an option, websvn (available e.g. from the Debian websvn package)
could be provided. Unfortunately, in the test installation, websvn
breaks because it runs out of memory.

The current SF project admins should get write access to the
authorized_keys2 file of the pythondev account.


Disable CVS
-----------

It appears that CVS cannot be disabled entirely. Only the user
interface can be removed from the project page; the repository itself
remains available. If desired, write access to the python and
distutils modules can be disabled through a CVS commitinfo entry.
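
Such an entry could look like the sketch below. CVS runs the filter
named in CVSROOT/commitinfo for every directory whose path matches
the regular expression, and aborts the commit if the filter exits
with a non-zero status. The function name deny_commit, the message
text and the filter path are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical commitinfo filter: reject every commit with an explanation.
#
# Assumed CVSROOT/commitinfo entries (regular expression, then the
# path of the filter script, which is a placeholder here):
#   ^python     /path/to/deny_commit
#   ^distutils  /path/to/deny_commit
deny_commit() {
    echo "This repository has moved to Subversion; see svn.python.org." >&2
    # A non-zero status makes CVS abort the commit.
    return 1
}
```

The installed filter script would consist of this function followed
by a call to deny_commit as its last command, so that the script
exits with the function's status.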


Discussion
==========

Several alternatives to the procedure above have been suggested.
The rejected alternatives are briefly discussed here:

- Create multiple repositories, one for python and one for
  distutils. This would have allowed even shorter URLs, but
  was rejected because a single repository supports moving code
  across projects.

- Several people suggested creating the project/trunk structure
  through standard cvs2svn, followed by renames. This would have
  the disadvantage that old revisions use different path names
  than recent revisions; the suggested approach through dump files
  works without renames.

- Several people also expressed concern about the administrative
  overhead that hosting the repository on python.org would cause
  to pydotorg admins. As a specific alternative, BerliOS has been
  suggested. The pydotorg admins themselves haven't objected
  to the additional workload; migrating the repository again if
  they get overworked is an option.

- Different authentication strategies were discussed. The
  following alternatives to svn+ssh were suggested:

  * Subversion over WebDAV, using SSL and basic authentication,
    with pydotorg-generated passwords mailed to the user. People
    did not like that approach, since they would need to store
    the password on disk (because they can't remember it); this
    is a security risk.

  * Subversion over WebDAV, using SSL client certificates. This would
    work, but would require us to administer a certificate authority.

- Instead of hosting this on python.org, people suggested hosting
  it elsewhere. One issue is whether this alternative should be
  free or commercial; several people suggested that a commercial
  service would be preferable, to reduce the load on the
  volunteers. In particular:

  * Greg Stein suggested http://www.wush.net/subversion.php. They
    offer 5 GB for $90/month, with 200 GB download/month.
    The data is on a RAID drive and fully backed up. Anonymous
    access and email commit notifications are supported. wush.net
    elaborated the following details:

    - The machine would be a Virtuozzo Virtual Private Server (VPS),
      hosted at PowerVPS.

    - The default repository URL would be
      http://python.wush.net/svn/projectname/,
      but anything else could be arranged.

    - We would get SSH login to the machine, with sudo capabilities.

    - They have a Web interface for management of the various SVN
      repositories that we want to host, and to manage user accounts.
      While svn+ssh would be supported, the user interface does not
      yet support it.

    - For offsite mirroring/backup, they suggest using rsync
      instead of downloading repository tarballs.

    Bob Ippolito reported that they had used wush.net for a
    commercial project for about 6 months, after which time they
    left wush.net, because the service was down for three days,
    with nobody reachable, and no explanation when it came back.


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: