PEP: 347
Title: Migrating the Python CVS to Subversion
Version: $Revision$
Last-Modified: $Date$
Author: Martin von Löwis <martin@v.loewis.de>
Discussions-To: python-dev@python.org
Status: Final
Type: Process
Content-Type: text/x-rst
Created: 14-Jul-2004
Post-History: 14-Jul-2004


Abstract
========

The Python source code is currently managed in a CVS repository on
sourceforge.net. This PEP proposes to move it to a Subversion
repository on svn.python.org.


Rationale
=========

This change has two aspects: moving from CVS to Subversion, and moving
from SourceForge to python.org. For each, a rationale will be given.


Moving to Subversion
--------------------

CVS has a number of limitations that have been eliminated by
Subversion. For the development of Python, the most notable
improvements are:

- the ability to rename files and directories, and to remove
  directories, while keeping the history of these files.

- support for change sets (sets of correlated changes to multiple
  files) through global revision numbers. Change sets are
  transactional.

- atomic, fast tagging: a cvs tag might take many minutes; a
  Subversion tag (svn cp) will complete quickly, and atomically.
  Likewise, branches are very efficient.

- support for offline diffs, which is useful when creating patches.


Moving to python.org
--------------------

SourceForge has kindly provided important infrastructure over the
past years. Unfortunately, the attention that SF received has also
caused repeated overload situations in the past, to which the SF
operators could not always respond in a timely manner. In particular,
for CVS, they had to reduce the load on the primary CVS server by
introducing a second, read-only CVS server for anonymous access. This
server is regularly synchronized, but lags behind the read-write CVS
repository between synchronizations. As a result, users without
commit access can see recent changes to the repository only after a
delay.

On python.org, it would be possible to make the repository itself
accessible for anonymous access, so that anonymous users would see
changes without this delay.


Migration Procedure
===================

To move the Python CVS repository, the following steps need to be
executed. The steps are elaborated upon in the following sections.

1. Collect SSH keys for all current committers, along with usernames
   to appear in commit messages.

2. At the beginning of the migration, announce that the repository on
   SourceForge is closed.

3. 24 hours after the last commit, download the CVS repository.

4. Convert the CVS repository into a Subversion repository.

5. Publish the repository with write access for committers, and
   read-only anonymous access.

6. Disable CVS access on SF.


Collect SSH keys
----------------

After some discussion, svn+ssh was selected as the best method
for write access to the repository. Developers can continue to
use their SSH keys, but they must be installed on python.org.

In order to avoid having to create a new Unix user for each
developer, a single account should be used, with command=
attributes in the authorized_keys files.

The lines in the authorized_keys file should read like this
(wrapped for better readability)::

  command="/usr/bin/svnserve --root=/svnroot -t
  --tunnel-user='<username>'",no-port-forwarding,
  no-X11-forwarding,no-agent-forwarding,no-pty
  ssh-dss <key> <comment>

As the usernames, the real names should be used instead of
the SF account names, so that people can be better identified
in log messages.

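Because every committer gets one such entry, generating the entries
with a small helper may be less error-prone than editing the file by
hand. The following is only a sketch: the function name
``authorized_keys_line`` and its argument order are assumptions made for
illustration, and the helper assumes the username contains no single or
double quote characters.

```sh
#!/bin/sh
# Hypothetical helper (not part of the PEP): print one unwrapped
# authorized_keys entry in the format shown above.
#   $1 = name to appear in commit logs (assumed to contain no quotes)
#   $2 = key type, e.g. ssh-dss
#   $3 = base64 key data
#   $4 = key comment
authorized_keys_line() {
    printf "command=\"/usr/bin/svnserve --root=/svnroot -t --tunnel-user='%s'\",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty %s %s %s\n" \
        "$1" "$2" "$3" "$4"
}
```

An administrator could then append an entry with, for example,
``authorized_keys_line "<username>" ssh-dss "<key>" "<comment>" >> authorized_keys``.
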
Administrator Access
--------------------

Administrator access to the pythondev account should be granted
to all current admins of the Python SF project. To distinguish
between shell login and svnserve login, admins need to maintain
two keys. Using OpenSSH, the following procedure can be
used to create a second key::

  cd .ssh
  ssh-keygen -t DSA -f pythondev -C <user>@pythondev
  vi config

In the config file, the following lines need to be added::

  Host pythondev
    Hostname dinsdale.python.org
    User pythondev
    IdentityFile ~/.ssh/pythondev

Then, shell login becomes possible through "ssh pythondev".

Downloading the CVS Repository
------------------------------

The CVS repository can be downloaded from

  http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.bz2

Since this tarball is generated only once a day, some time must pass
after the repository freeze before the tarball can be picked up. It
should be verified that the last commit, as recorded on the
python-commits mailing list, is indeed included in the tarball.

After the conversion, the converted CVS tarball should be kept
forever on www.python.org/archive/python-cvsroot-<date>.tar.bz2


Converting the CVS Repository
-----------------------------

The Python CVS repository contains two modules: distutils and python.
The python module is further structured into dist and nondist,
where dist only contains src (the python code proper). nondist
contains various subdirectories.

These should be reorganized in the Subversion repository to get
shorter URLs, following the <project>/{trunk,tags,branches}
structure. A project will be created for each nondist directory,
plus for src (called python), plus distutils. Reorganizing the
repository is best done in the CVS tree, as shown below.

The fsfs backend should be used as the repository format (which
requires Subversion 1.1). The fsfs backend has the advantage of being
more backup-friendly, as it allows incremental repository backups,
without requiring any dump commands to be run.

The conversion should be done using the cvs2svn utility, available
e.g. in the cvs2svn Debian package. As cvs2svn does not currently
support the project/trunk structure, each project needs to be
converted separately. To get each conversion result into a separate
directory in the target repository, svnadmin load must be used.

Subversion has a different view on binary-vs-text files than CVS.
To correctly carry the CVS semantics forward, svn:eol-style should
be set to native on all files that are not marked binary in CVS.

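To illustrate the rule above, the following sketch decides which files
would receive ``svn:eol-style native``. It rests on an assumption that
is not stated in this PEP: that CVS records the binary flag (``-kb``) as
an ``expand @b@;`` entry in the header of the RCS ``,v`` file. The
function names are hypothetical, and the conversion tool may well
determine this differently.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the RCS file $1 is marked binary.
# Assumption: the -kb flag is stored as "expand @b@;" in the RCS
# admin header, which ends at the first blank line.
is_cvs_binary() {
    sed '/^$/q' "$1" | grep -q '^expand[[:space:]]*@b@;'
}

# Print "native" for files that should get svn:eol-style native,
# and "binary" for files that should be left untouched.
eol_style_for() {
    if is_cvs_binary "$1"; then
        echo binary
    else
        echo native
    fi
}
```

Running ``eol_style_for`` over every ``,v`` file in the extracted
tarball would give a list that can be compared against the properties
found in the converted repository.
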
In summary, the conversion script is::

  #!/bin/sh
  # Start from a clean slate: remove leftovers of a previous run.
  rm -f cvs2svn-*
  rm -rf python py.new
  tar xjf python-cvsroot.tar.bz2
  rm -rf python/CVSROOT
  svnadmin create --fs-type fsfs py.new
  # Flatten the CVS tree so that each future project is a top-level
  # directory: dist/src becomes python, nondist directories move up.
  mv python/python python/orig
  mv python/orig/dist/src python/python
  mv python/orig/nondist/* python
  # nondist/nondist is empty
  rmdir python/nondist
  rm -rf python/orig
  # Convert each project separately and load it below its own directory.
  for a in python/*
  do
    b=`basename $a`
    cvs2svn -q --dump-only --encoding=latin1 --force-branch=cnri-16-start \
    --force-branch=descr-branch --force-branch=release152p1-patches \
    --force-tag=r16b1 $a
    svn mkdir -m"Conversion to SVN" file:///`pwd`/py.new/$b
    svnadmin load -q --parent-dir $b py.new < cvs2svn-dump
    rm cvs2svn-dump
  done

Sample results of this conversion are available at

  http://www.dcl.hpi.uni-potsdam.de/pysvn/


Publish the Repository
------------------------

The repository should be published at http://svn.python.org/projects.
Read-write access should be granted to all current SF committers
through svn+ssh://pythondev@svn.python.org/;
read-only anonymous access through WebDAV should also be
granted.

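Combined with the <project>/{trunk,tags,branches} layout, the two
access methods lead to URLs of the following shape. This is a sketch
only: the helper names are hypothetical, and the assumption that
projects sit directly below the svn+ssh root follows from the
--root=/svnroot setting rather than from an explicit statement in this
PEP.

```sh
#!/bin/sh
# Hypothetical helpers: build the trunk URL of a project.
#   $1 = project name, e.g. python or distutils

# Read-only anonymous access through WebDAV.
anon_url() {
    echo "http://svn.python.org/projects/$1/trunk"
}

# Read-write access for committers through svn+ssh
# (assumes projects live directly below the svnserve root).
commit_url() {
    echo "svn+ssh://pythondev@svn.python.org/$1/trunk"
}
```

A committer would then run something like
``svn checkout "$(commit_url python)"``, while anonymous users would use
``svn checkout "$(anon_url python)"``.
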
As an option, websvn (available e.g. from the Debian websvn package)
could be provided. Unfortunately, in the test installation, websvn
breaks because it runs out of memory.

The current SF project admins should get write access to the
authorized_keys2 file of the pythondev account.


Disable CVS
-----------

It appears that CVS cannot be disabled entirely. Only the user
interface can be removed from the project page; the repository itself
remains available. If desired, write access to the python and
distutils modules can be disabled through a CVS commitinfo entry.

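A commitinfo entry pairs a regular expression for the directory with a
program to run; CVS aborts the commit if that program exits with a
non-zero status. The following sketch shows what such a hook could look
like. The script path, the function name, and the message text are
assumptions made for illustration.

```sh
#!/bin/sh
# Hypothetical CVSROOT/commitinfo entries that would call this script
# (shown as comments; the path is an assumption):
#   ^python     /path/to/reject-commit
#   ^distutils  /path/to/reject-commit

# Refuse the commit: print an explanation and return a failure status.
reject_commit() {
    echo "Commits are disabled; the repository has moved to Subversion." >&2
    return 1
}

# In the installed hook script, the last line would be:
#   reject_commit "$@"
```

Since the hook only ever fails, any attempted commit to the matching
modules would be rejected while read access stays available.
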
Discussion
==========

Several alternatives had been suggested to the procedure above.
The rejected alternatives are briefly discussed here:

- create multiple repositories, one for python and one for
  distutils. This would have allowed even shorter URLs, but
  was rejected because a single repository supports moving code
  across projects.

- Several people suggested creating the project/trunk structure
  through standard cvs2svn, followed by renames. This would have
  the disadvantage that old revisions use different path names
  than recent revisions; the suggested approach through dump files
  works without renames.

- Several people also expressed concern about the administrative
  overhead that hosting the repository on python.org would cause
  to pydotorg admins. As a specific alternative, BerliOS has been
  suggested. The pydotorg admins themselves haven't objected
  to the additional workload; migrating the repository again if
  they get overworked is an option.

- Different authentication strategies were discussed. The following
  alternatives to svn+ssh were suggested:

  * Subversion over WebDAV, using SSL and basic authentication,
    with pydotorg-generated passwords mailed to the user. People
    did not like that approach, since they would need to store
    the password on disk (because they can't remember it); this
    is a security risk.

  * Subversion over WebDAV, using SSL client certificates. This would
    work, but would require us to administer a certificate authority.

- Instead of hosting this on python.org, people suggested hosting
  it elsewhere. One issue is whether this alternative should be
  free or commercial; several people suggested that a commercial
  service would be better, to reduce the load on the volunteers. In
  particular:

  * Greg Stein suggested http://www.wush.net/subversion.php. They
    offer 5 GB for $90/month, with 200 GB download/month.
    The data is on a RAID drive and fully backed up. Anonymous
    access and email commit notifications are supported. wush.net
    provided the following details:

    - The machine would be a Virtuozzo Virtual Private Server (VPS),
      hosted at PowerVPS.

    - The default repository URL would be http://python.wush.net/svn/projectname/,
      but anything else could be arranged.

    - We would get SSH login to the machine, with sudo capabilities.

    - They have a Web interface for management of the various SVN
      repositories that we want to host, and to manage user accounts.
      While svn+ssh would be supported, the user interface does not
      yet support it.

    - For offsite mirroring/backup, they suggest using rsync
      instead of downloading repository tarballs.

    Bob Ippolito reported that they had used wush.net for a
    commercial project for about 6 months, after which time they
    left wush.net, because the service was down for three days,
    with nobody reachable, and no explanation when it came back.


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: