PEP: 374
Title: Choosing a distributed VCS for the Python project
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon <brett@python.org>,
        Stephen J. Turnbull <stephen@xemacs.org>,
        Alexandre Vassalotti <alexandre@peadrop.com>,
        Barry Warsaw <barry@python.org>,
        Dirkjan Ochtman <dirkjan@ochtman.nl>
Status: Final
Type: Process
Content-Type: text/x-rst
Created: 07-Nov-2008
Post-History: 07-Nov-2008,
              22-Jan-2009


Rationale
=========

Python has been using a centralized version control system (VCS;
first CVS, now Subversion) for years to great effect. Having a master
copy of the official version of Python provides people with a single
place to always get the official Python source code. It has also
allowed for the storage of the history of the language, mostly for
help with development, but also for posterity. And of course the V in
VCS is very helpful when developing.

But a centralized version control system has its drawbacks. First and
foremost, in order to have the benefits of version control with
Python in a seamless fashion, one must be a "core developer" (i.e.
someone with commit privileges on the master copy of Python). People
who are not core developers but who wish to work with Python's
revision tree, e.g. anyone writing a patch for Python or creating a
custom version, do not have direct tool support for revisions. This
can be quite a limitation, since these non-core developers cannot
easily do basic tasks such as reverting changes to a previously
saved state, creating branches, publishing their changes with full
revision history, etc. For non-core developers, the last safe tree
state is one the Python developers happen to set, and this prevents
safe development. This second-class citizenship is a hindrance to
people who wish to contribute to Python with a patch of any
complexity and want a way to incrementally save their progress to
make their development lives easier.

There is also the issue of having to be online to be able to commit
one's work. Because centralized VCSs keep a central copy that stores
all revisions, one must have Internet access in order for their
revisions to be stored; no Net, no commit. This can be annoying if
you happen to be traveling and lack Internet access. There is also
the situation of someone wishing to contribute to Python but having
a bad Internet connection, where committing is time-consuming and
expensive and it might work out better to do it in a single step.

Another drawback of a centralized VCS shows up in a common use case:
a developer revising patches in response to review comments. This is
more difficult with a centralized model because there's no place to
contain intermediate work. It's either all checked in or none of it
is checked in. In the centralized VCS, it's also very difficult to
track changes to the trunk as they are committed, while you're
working on your feature or bug fix branch. This increases the risk
that such branches will grow stale or outdated, or that merging them
into the trunk will generate too many conflicts to be easily
resolved.

Lastly, there is the issue of maintenance of Python. At any one time
there is at least one major version of Python under development (at
the time of this writing there are two). For each major version of
Python under development there is at least the maintenance version
of the last minor version and the in-development minor version (e.g.
with 2.6 just released, that means that both 2.6 and 2.7 are being
worked on). Once a release is done, a branch is created between the
code bases where changes in one version do not (but could) belong in
the other version. As of right now there is no natural support for
this branch in time in central VCSs; you must use tools that
simulate the branching. Tracking merges is similarly painful for
developers, as revisions often need to be merged between four active
branches (e.g. 2.6 maintenance, 3.0 maintenance, 2.7 development,
3.1 development). In this case, VCSs such as Subversion only handle
this through arcane third-party tools.

Distributed VCSs (DVCSs) solve all of these problems. While one can
keep a master copy of a revision tree, anyone is free to copy that
tree for their own use. This gives everyone the power to commit
changes to their copy, online or offline. It also more naturally
ties into the idea of branching in the history of a revision tree
for maintenance and the development of new features bound for
Python. DVCSs also provide a great many additional features that
centralized VCSs don't or can't provide.

This PEP explores the possibility of changing Python's use of Subversion
to any of the currently popular DVCSs, in order to gain
the benefits outlined above. This PEP does not guarantee that a switch
to a DVCS will occur at the conclusion of this PEP. It is quite
possible that no clear winner will be found and that svn will continue
to be used. If this happens, this PEP will be revisited and revised in
the future as the state of DVCSs evolves.


Terminology
===========

Agreeing on a common terminology is surprisingly difficult,
primarily because each VCS uses these terms when describing subtly
different tasks, objects, and concepts. Where possible, we try to
provide a generic definition of the concepts, but you should consult
the individual systems' glossaries for details. Here are some basic
references for terminology, from some of the standard web-based
references on each VCS. You can also refer to glossaries for each
DVCS:

* Subversion : http://svnbook.red-bean.com/en/1.5/svn.basic.html
* Bazaar : http://bazaar-vcs.org/BzrGlossary
* Mercurial : http://www.selenic.com/mercurial/wiki/index.cgi/UnderstandingMercurial
* git : http://book.git-scm.com/1_the_git_object_model.html


branch
    A line of development; a collection of revisions, ordered by
    time.

checkout/working copy/working tree
    A tree of code the developer can edit, linked to a branch.

index
    A "staging area" where a revision is built (unique to git).

repository
    A collection of revisions, organized into branches.

clone
    A complete copy of a branch or repository.

commit
    To record a revision in a repository.

merge
    Applying all the changes and history from one branch/repository
    to another.

pull
    To update a checkout/clone from the original branch/repository,
    which can be remote or local.

push/publish
    To copy a revision, and all revisions it depends on, from one
    repository to another.

cherry-pick
    To merge one or more specific revisions from one branch to
    another, possibly in a different repository, possibly without its
    dependent revisions.

rebase
    To "detach" a branch, and move it to a new branch point; move
    commits to the beginning of a branch instead of where they
    happened in time.


Typical Workflow
================

At the moment, the typical workflow for a Python core developer is:


* Edit code in a checkout until it is stable enough to commit/push.
* Commit to the master repository.

It is a rather simple workflow, but it has drawbacks. For one,
because any work that involves the repository takes time thanks to
the network, commits/pushes tend not to be as atomic as they could
be. There is also the drawback that there is not necessarily a cheap
way to create new checkouts beyond a recursive copy of the checkout
directory.

A DVCS would lead to a workflow more like this:

* Branch off of a local clone of the master repository.
* Edit code, committing in atomic pieces.
* Merge the branch into the mainline, and
* Push all commits to the master repository.

While there are more possible steps, the workflow is much more
independent of the master repository than is currently possible. By
being able to commit locally at the speed of your disk, a core
developer is able to do atomic commits much more frequently,
minimizing commits that do multiple things to the code. Also,
by using a branch, the changes are isolated (if desired) from other
changes being made by other developers. Because branches are cheap,
it is easy to create and maintain many smaller branches that address
one specific issue, e.g. one bug or one new feature. More
sophisticated features of DVCSs allow the developer to more easily
track long-running development branches as the official mainline
progresses.


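The four DVCS steps listed above can be sketched with git, used here
purely for illustration (the other DVCSs have analogous commands).
The temporary directory, the bare ``master-repo.git`` stand-in for the
master repository, the branch name ``issue-fix``, and the file
contents are all assumptions made for this example::

    set -e
    work=$(mktemp -d)
    cd "$work"
    # Hypothetical stand-in for the master repository.
    git init -q --bare master-repo.git
    git clone -q master-repo.git clone
    cd clone
    git config user.name 'Firstname Lastname'
    git config user.email 'email.address@example.com'
    # Seed the mainline so there is something to branch from.
    echo 'spam' > README
    git add README
    git commit -q -m 'Initial import'
    trunk=$(git symbolic-ref --short HEAD)
    git push -q origin "$trunk"
    # 1. Branch off of the local clone.
    git checkout -q -b issue-fix
    # 2. Edit code, committing in atomic pieces.
    echo 'eggs' >> README
    git commit -q -a -m 'Fix the bug'
    echo 'ham' >> README
    git commit -q -a -m 'Document the fix'
    # 3. Merge the branch into the mainline.
    git checkout -q "$trunk"
    git merge -q issue-fix
    # 4. Push all commits to the master repository.
    git push -q origin "$trunk"

Only the first and last steps touch the master repository; everything
in between happens at the speed of the local disk.
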
Contenders
==========

========== ========== ======= =================================== ==========================================
Name       Short Name Version 2.x Trunk Mirror                    3.x Trunk Mirror
========== ========== ======= =================================== ==========================================
Bazaar_    bzr        1.12    http://code.python.org/python/trunk http://code.python.org/python/3.0
Mercurial_ hg         1.2.0   http://code.python.org/hg/trunk/    http://code.python.org/hg/branches/py3k/
git_       N/A        1.6.1   git://code.python.org/python/trunk  git://code.python.org/python/branches/py3k
========== ========== ======= =================================== ==========================================

.. _Bazaar: http://bazaar-vcs.org/
.. _Mercurial: http://www.selenic.com/mercurial/
.. _git: http://www.git-scm.com/

This PEP does not consider darcs, arch, or monotone. The main
problem with these DVCSs is that they are simply not popular enough
to bother supporting when they do not offer compelling features
beyond what the other DVCSs provide. Arch and darcs also have
significant performance problems which seem unlikely to be addressed
in the near future.


Interoperability
================

For those who have already decided which DVCSs they want to use, and
are willing to maintain local mirrors themselves, all three DVCSs
support interchange via the git "fast-import" changeset format. git
does so natively, of course; native support for Bazaar is under
active development and, as of mid-February 2009, getting good early
reviews. Mercurial has idiosyncratic support for importing via its *hg
convert* command, and `third-party fast-import support`_ is available
for exporting. Also, the Tailor_ tool supports automatic maintenance
of mirrors based on an official repository in any of the candidate
formats with a local mirror in any format.

.. _third-party fast-import support: http://repo.or.cz/r/fast-export.git/.git/description
.. _Tailor: http://progetti.arstecnica.it/tailor/


Usage Scenarios
===============

Probably the best way to help decide on whether/which DVCS should
replace Subversion is to see what it takes to perform some
real-world usage scenarios that developers (core and non-core) have
to work with. Each usage scenario outlines what it is, a bullet list
of what the basic steps are (which can vary slightly per VCS), and
how to perform the usage scenario in the various VCSs
(including Subversion).

Each VCS had a single author in charge of writing implementations
for each scenario (unless otherwise noted).

========= ===
Name      VCS
========= ===
Brett     svn
Barry     bzr
Alexandre hg
Stephen   git
========= ===


Initial Setup
-------------

Some DVCSs offer perks if you do some initial setup up front.
This section covers what can be done before any of the usage
scenarios are run in order to take better advantage of the tools.

All of the DVCSs support configuring your project identification.
Unlike the centralized systems, they use your email address to
identify your commits. (Access control is generally done by
mechanisms external to the DVCS, such as ssh or console login.)
This identity may be associated with a full name.

All of the DVCSs will query the system to get some approximation of
this information, but that may not be what you want. They also
support setting this information on a per-user basis, and on a
per-project basis. Convenience commands to set these attributes vary,
but all allow direct editing of configuration files.

Some VCSs support end-of-line (EOL) conversions on checkout/checkin.


svn
'''

None required, but it is recommended you follow the
`guidelines <http://www.python.org/dev/faq/#what-configuration-settings-should-i-use>`_
in the dev FAQ.


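As a sketch of the kind of EOL setting Subversion supports, and not a
copy of the FAQ (which remains the authority on the recommended
values), automatic properties can be enabled in
``~/.subversion/config``. The file patterns below are illustrative
assumptions::

    [miscellany]
    enable-auto-props = yes

    [auto-props]
    *.c = svn:eol-style=native
    *.h = svn:eol-style=native
    *.py = svn:eol-style=native

With ``svn:eol-style`` set to ``native``, newly added files matching
these patterns are checked out with the platform's own line endings.

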
bzr
'''

No setup is required, but for much quicker and space-efficient local
branching, you should create a shared repository to hold all your
Python branches. A shared repository is really just a parent
directory containing a .bzr directory. When bzr commits a revision,
it searches from the local directory on up the file system for a .bzr
directory to hold the revision. By sharing revisions across multiple
branches, you cut down on the amount of disk space used. Do this::

    cd ~/projects
    bzr init-repo python
    cd python

Now, all your Python branches should be created inside of
``~/projects/python``.

There are also some settings you can put in your
``~/.bazaar/bazaar.conf``
and ``~/.bazaar/locations.conf`` files to set up defaults for
interacting with Python code. None of them are required, although some
are recommended. For example, I would suggest GPG-signing all commits,
but that might be too high a barrier for developers. Also, you can set
up default push locations depending on where you want to push branches
by default. If you have write access to the master branches, that
push location could be code.python.org. Otherwise, it might be a
free Bazaar code hosting service such as Launchpad. If Bazaar is
chosen, we should decide what the policies and recommendations are.

At a minimum, I would set up your email address::

    bzr whoami "Firstname Lastname <email.address@example.com>"

As with hg and git below, there are ways to set your email address (or really,
just about any parameter) on a
per-repository basis. You do this with settings in your
``$HOME/.bazaar/locations.conf`` file, which has an ini-style format, as do
the configuration files of the other DVCSs. See the Bazaar documentation
for details, which mostly aren't relevant for this discussion.


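As a minimal sketch of such a per-location override, the two files
might look like this. The directory ``/home/me/projects/python`` and
both addresses are hypothetical; consult the Bazaar documentation for
the full set of options::

    # ~/.bazaar/bazaar.conf: defaults for every branch
    [DEFAULT]
    email = Firstname Lastname <email.address@example.com>

    # ~/.bazaar/locations.conf: overrides for branches under one directory
    [/home/me/projects/python]
    email = Firstname Lastname <email.address@python.example.com>

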
hg
''

Minimally, you should set your user name. To do so, create the file
``.hgrc`` in your home directory and add the following::

    [ui]
    username = Firstname Lastname <email.address@example.com>

If you are using Windows and your tools do not support Unix-style newlines,
you can enable automatic newline translation by adding to your configuration::

    [extensions]
    win32text =

These options can also be set locally to a given repository by
customizing ``<repo>/.hg/hgrc``, instead of ``~/.hgrc``.


git
'''

None needed. However, git supports a number of features that can
smooth your work, with a little preparation. git supports setting
defaults at the workspace, user, and system levels. The system
level is out of scope of this PEP. The user configuration file is
``$HOME/.gitconfig`` on Unix-like systems, and the workspace
configuration file is ``$REPOSITORY/.git/config``.

You can use the ``git config`` tool to set preferences for user.name and
user.email either globally (for your system login account) or
locally (to a given git working copy), or you can edit the
configuration files (which have the same format as shown in the
Mercurial section above)::

    # my full name doesn't change
    # note "--global" flag means per user
    # (system-wide configuration is set with "--system")
    git config --global user.name 'Firstname Lastname'
    # but use my Pythonic email address
    cd /path/to/python/repository
    git config user.email email.address@python.example.com

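Edited by hand, the same two settings would look roughly as follows.
The repository path is the placeholder used in the commands above,
not a real location::

    # $HOME/.gitconfig (per user)
    [user]
        name = Firstname Lastname

    # /path/to/python/repository/.git/config (per workspace)
    [user]
        email = email.address@python.example.com

A value in the workspace file takes precedence over the same key in
the per-user file.
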
If you are using Windows, you probably want to set the core.autocrlf
and core.safecrlf preferences to true using ``git config``::

    # check out files with CRLF line endings rather than Unix-style LF only,
    # and check them back in with the reverse transformation
    git config --global core.autocrlf true
    # scream if a transformation would be ambiguous
    # (e.g., a working file contains both naked LF and CRLF)
    git config --global core.safecrlf true

Although the repository will usually contain a .gitignore file
specifying file names that rarely if ever should be registered in the
VCS, you may have personal conventions (e.g., always editing log
messages in a temporary file named ".msg") that you may wish to
specify::

    # tell git where my personal ignores are
    git config --global core.excludesfile ~/.gitignore
    # I use .msg for my long commit logs, and Emacs makes backups in
    # files ending with ~
    # these are globs, not regular expressions
    echo '*~' >> ~/.gitignore
    echo '.msg' >> ~/.gitignore

If you use multiple branches, as with the other VCSs, you can save a
lot of space by putting all objects in a common object store. This
also can save download time, if the origins of the branches were in
different repositories, because objects are shared across branches in
your repository even if they were not present in the upstream
repositories. git is very space- and time-efficient and applies a
number of optimizations automatically, so this configuration is
optional. (Examples are omitted.)


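One way to share objects between clones, offered here as an
assumption rather than as the setup the authors had in mind, is to let
a second clone borrow from an existing one with
``git clone --reference``. The sketch below uses a throwaway local
repository named ``upstream`` in place of the real mirrors; all
directory and branch names are hypothetical::

    set -e
    work=$(mktemp -d)
    cd "$work"
    # Stand-in for the upstream repository, with a second branch.
    git init -q upstream
    cd upstream
    git config user.name 'Firstname Lastname'
    git config user.email 'email.address@example.com'
    echo 'spam' > README
    git add README
    git commit -q -m 'Initial import'
    git branch py3k
    cd ..
    # The first clone copies all objects.
    git clone -q upstream trunk
    # The second clone borrows objects already present in the first.
    git clone -q --reference trunk --branch py3k upstream py3k
    # The borrowing is recorded in the alternates file.
    cat py3k/.git/objects/info/alternates

A clone made this way depends on the repository it references, so the
referenced clone must not be deleted while the borrower is in use.

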
One-Off Checkout
----------------

As a non-core developer, I want to create and publish a one-off patch
that fixes a bug, so that a core developer can review it for
inclusion in the mainline.

* Checkout/branch/clone trunk.
* Edit some code.
* Generate a patch (based on what is best supported by the VCS, e.g.
  branch history).
* Receive reviewer comments and address the issues.
* Generate a second patch for the core developer to commit.


svn
'''
::

    svn checkout http://svn.python.org/projects/python/trunk
    cd trunk
    # Edit some code.
    echo "The cake is a lie!" > README
    # Since svn lacks support for local commits, we fake it with patches.
    svn diff > commit-1.diff
    svn diff > patch-1.diff
    # Upload patch-1 to bugs.python.org.
    # Receive reviewer comments.
    # Edit some code.
    echo "The cake is real!" > README
    # Since svn lacks support for local commits, we fake it with patches.
    svn diff > commit-2.diff
    svn diff > patch-2.diff
    # Upload patch-2 to bugs.python.org.


bzr
'''
::

    bzr branch http://code.python.org/python/trunk
    cd trunk
    # Edit some code.
    bzr commit -m 'Stuff I did'
    bzr send -o bundle
    # Upload bundle to bugs.python.org
    # Receive reviewer comments
    # Edit some code
    bzr commit -m 'Respond to reviewer comments'
    bzr send -o bundle
    # Upload updated bundle to bugs.python.org

The ``bundle`` file is like a super-patch. It can be read by ``patch(1)`` but
it contains additional metadata so that it can be fed to ``bzr merge`` to
produce a fully usable branch complete with history. See the `Patch Review`_
section below.


hg
''
::

    hg clone http://code.python.org/hg/trunk
    cd trunk
    # Edit some code.
    hg commit -m "Stuff I did"
    hg outgoing -p > fixes.patch
    # Upload patch to bugs.python.org
    # Receive reviewer comments
    # Edit some code
    hg commit -m "Address reviewer comments."
    hg outgoing -p > additional-fixes.patch
    # Upload patch to bugs.python.org

While ``hg outgoing`` does not have the flag for it, most Mercurial
commands support git's extended patch format through a ``--git``
option. This can be set in one's ``.hgrc`` file so that all commands
that generate a patch use the extended format.


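A minimal sketch of that setting, assuming the ``[diff]`` section of
the Mercurial configuration file, is::

    [diff]
    git = True

As with the other ``.hgrc`` options, this can go in ``~/.hgrc`` or in
a single repository's ``.hg/hgrc``.

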
git
'''

The patches could be created with
``git diff master > stuff-i-did.patch``, too, but
``git format-patch | git am`` knows some tricks
(empty files, renames, etc.) that ordinary patch can't handle. git
grabs "Stuff I did" out of the commit message to create the file
name 0001-Stuff-I-did.patch. See `Patch Review`_ below for a
description of the git-format-patch format.
::

    # Get the mainline code.
    git clone git://code.python.org/python/trunk
    cd trunk
    # Edit some code.
    git commit -a -m 'Stuff I did.'
    # Create patch for my changes (i.e., relative to master).
    git format-patch master
    git tag stuff-v1
    # Upload 0001-Stuff-I-did.patch to bugs.python.org.
    # Time passes ... receive reviewer comments.
    # Edit more code.
    git commit -a -m 'Address reviewer comments.'
    # Make an add-on patch to apply on top of the original.
    git format-patch stuff-v1
    # Upload 0001-Address-reviewer-comments.patch to bugs.python.org.


Backing Out Changes
-------------------

As a core developer, I want to undo a change that was not ready for
inclusion in the mainline.

* Back out the unwanted change.
* Push patch to server.


svn
'''
::

    # Assume the change to revert is in revision 40
    svn merge -c -40 .
    # Resolve conflicts, if any.
    svn commit -m "Reverted revision 40"


bzr
'''
::

    # Assume the change to revert is in revision 40
    bzr merge -r 40..39
    # Resolve conflicts, if any.
    bzr commit -m "Reverted revision 40"

Note that if the change you want to revert is the last one that was
made, you can just use ``bzr uncommit``.


hg
''
::

    # Assume the change to revert is in revision 9150dd9c6d30
    hg backout --merge -r 9150dd9c6d30
    # Resolve conflicts, if any.
    hg commit -m "Reverted changeset 9150dd9c6d30"
    hg push

Note that you can use ``hg rollback`` and ``hg strip`` to revert changes
you committed in your local repository but have not yet pushed to other
repositories.


git
'''
::

    # Assume the change to revert is the grandfather of a revision tagged "newhotness".
    git revert newhotness~2
    # Resolve conflicts, if any.  If there are no conflicts, "git revert"
    # prompts for a log message and commits automatically, so the next
    # command is only needed after resolving conflicts.
    git commit -m "Reverted newhotness~2."
    git push


Patch Review
------------

As a core developer, I want to review patches submitted by other
people, so that I can make sure that only approved changes are added
to Python.

Core developers have to review patches as submitted by other people.
This requires applying the patch, testing it, and then tossing away
the changes. The assumption can be made that a core developer already
has a checkout/branch/clone of the trunk.

* Branch off of trunk.
* Apply patch w/o any comments as generated by the patch submitter.
* Push patch to server.
* Delete now-useless branch.


svn
'''

Subversion does not fit into this development style very well,
as there is no such thing as a "branch" as has been defined in this
PEP. Instead a developer either needs to create another checkout for
testing a patch or create a branch on the server. Up to this point,
core developers have not taken the "branch on the server" approach to
dealing with individual patches. For this scenario the assumption
will be the developer creates a local checkout of the trunk to work
with::

    cp -r trunk issue0000
    cd issue0000
    patch -p0 < __patch__
    # Review patch.
    svn commit -m "Some patch."
    cd ..
    rm -r issue0000

Another option is to only have a single checkout running at any one
time and use ``svn diff`` along with ``svn revert -R`` to store away
independent changes you may have made.


bzr
'''
::

    bzr branch trunk issueNNNN
    # Download `patch` bundle from Roundup
    bzr merge patch
    # Review patch
    bzr commit -m'Patch NNNN by So N. So' --fixes python:NNNN
    bzr push bzr+ssh://me@code.python.org/trunk
    rm -rf ../issueNNNN

Alternatively, since you're probably going to commit these changes to
the trunk, you could just do a checkout. That would give you a local
working tree while the branch (i.e. all revisions) would continue to
live on the server. This is similar to the svn model and might allow
you to more quickly review the patch. There's no need for the push
in this case::

    bzr checkout trunk issueNNNN
    # Download `patch` bundle from Roundup
    bzr merge patch
    # Review patch
    bzr commit -m'Patch NNNN by So N. So' --fixes python:NNNN
    rm -rf ../issueNNNN


hg
''
::

    hg clone trunk issue0000
    cd issue0000
    # If the patch was generated using hg export, the user name of the
    # submitter is automatically recorded. Otherwise,
    # use hg import --no-commit submitted.diff and commit with
    # hg commit -u "Firstname Lastname <email.address@example.com>"
    hg import submitted.diff
    # Review patch.
    hg push ssh://alexandre@code.python.org/hg/trunk/


git
'''

We assume a patch created by git-format-patch. This is a Unix mbox
file containing one or more patches, each formatted as an :rfc:`2822`
message. git-am interprets each message as a commit as follows. The
author of the patch is taken from the From: header, the date from the
Date: header. The commit log is created by concatenating the content
of the subject line, a blank line, and the message body up to the
start of the patch::

    cd trunk
    # Create a branch in case we don't like the patch.
    # This checkout takes zero time, since the workspace is left in
    # the same state as the master branch.
    git checkout -b patch-review
    # Download patch from bugs.python.org to submitted.patch.
    git am < submitted.patch
    # Review and approve patch.
    # Merge into master and push.
    git checkout master
    git merge patch-review
    git push


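For orientation, a file produced by ``git format-patch`` has roughly
the following shape. Everything in angle brackets is a placeholder,
and the file name and diffstat are illustrative assumptions, not
output captured from a real run::

    From <commit-sha> Mon Sep 17 00:00:00 2001
    From: Firstname Lastname <email.address@example.com>
    Date: <commit date>
    Subject: [PATCH] Stuff I did.

    <remainder of the commit message, if any>
    ---
     README | 2 +-
     1 file changed, 1 insertion(+), 1 deletion(-)

    diff --git a/README b/README
    <hunks of the patch>
    --
    <git version>

The ``---`` line marks the end of the text that ``git am`` records as
the commit log.

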
Backport
--------

As a core developer, I want to apply a patch to 2.6, 2.7, 3.0, and 3.1
so that I can fix a problem in all four versions.

Thanks to always having the cutting-edge and the latest release
version under development, Python currently has four branches being
worked on simultaneously. That makes it important for a change to
propagate easily through various branches.

svn
'''

Because of Python's use of svnmerge, changes start with the trunk
(2.7) and then get merged to the release version of 2.6. To get the
change into the 3.x series, the change is merged into 3.1, fixed up,
and then merged into 3.0 (2.7 -> 2.6; 2.7 -> 3.1 -> 3.0).

This is in contrast to a port-forward strategy where the patch would
have been added to 2.6 and then pulled forward into newer versions
(2.6 -> 2.7 -> 3.0 -> 3.1).

::

    # Assume patch applied to 2.7 in revision 0000.
    cd release26-maint
    svnmerge merge -r 0000
    # Resolve merge conflicts and make sure patch works.
    svn commit -F svnmerge-commit-message.txt  # revision 0001.
    cd ../py3k
    svnmerge merge -r 0000
    # Same as for 2.6, except Misc/NEWS changes are reverted.
    svn revert Misc/NEWS
    svn commit -F svnmerge-commit-message.txt  # revision 0002.
    cd ../release30-maint
    svnmerge merge -r 0002
    svn commit -F svnmerge-commit-message.txt  # revision 0003.


bzr
'''

Bazaar is pretty straightforward here, since it supports cherry
picking revisions manually. In the example below, we could have
given a revision id instead of a revision number, but that's usually
not necessary. Martin Pool suggests "We'd generally recommend doing
the fix first in the oldest supported branch, and then merging it
forward to the later releases."::

    # Assume patch applied to 2.7 in revision 0000
    cd release26-maint
    bzr merge ../trunk -c 0000
    # Resolve conflicts and make sure patch works
    bzr commit -m 'Back port patch NNNN'
    bzr push bzr+ssh://me@code.python.org/trunk
    cd ../py3k
    bzr merge ../trunk -r 0000
    # Same as for 2.6 except Misc/NEWS changes are reverted
    bzr revert Misc/NEWS
    bzr commit -m 'Forward port patch NNNN'
    bzr push bzr+ssh://me@code.python.org/py3k


hg
''

Mercurial, like other DVCSs, does not support the current workflow
used by Python core developers to backport patches well. Right
now, bug fixes are first applied to the development mainline
(i.e., trunk), then back-ported to the maintenance branches and
forward-ported, as necessary, to the py3k branch. This workflow
requires the ability to cherry-pick individual changes. Mercurial's
transplant extension provides this ability. Here is an example of
the scenario using this workflow::

    cd release26-maint
    # Assume patch applied to 2.7 in revision 0000
    hg transplant -s ../trunk 0000
    # Resolve conflicts, if any.
    cd ../py3k
    hg pull ../trunk
    hg merge
    hg revert Misc/NEWS
    hg commit -m "Merged trunk"
    hg push

In the above example, transplant acts much like the current svnmerge
command. When transplant is invoked without the revision, the command
launches an interactive loop useful for transplanting multiple
changes. Another useful feature is the --filter option which can be
used to modify changesets programmatically (e.g., it could be used
for removing changes to Misc/NEWS automatically).

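The transplant extension ships with Mercurial but is not enabled by
default. A minimal configuration sketch, added to ``~/.hgrc`` in the
same way as the ``win32text`` example in `Initial Setup`_, would be::

    [extensions]
    transplant =
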
As an alternative to the traditional workflow, we could avoid
transplanting changesets by committing bug fixes to the oldest
supported release, then merging these fixes upward to the more recent
branches.
::

    cd release25-maint
    hg import fix_some_bug.diff
    # Review patch and run test suite. Revert if failure.
    hg push
    cd ../release26-maint
    hg pull ../release25-maint
    hg merge
    # Resolve conflicts, if any. Then, review patch and run test suite.
    hg commit -m "Merged patches from release25-maint."
    hg push
    cd ../trunk
    hg pull ../release26-maint
    hg merge
    # Resolve conflicts, if any, then review.
    hg commit -m "Merged patches from release26-maint."
    hg push

Although this approach makes the history non-linear and slightly
more difficult to follow, it encourages fixing bugs across all
supported releases. Furthermore, it scales better when there are many
changes to backport, because we do not need to seek the specific
revision IDs to merge.


git
'''

In git I would have a workspace which contains all of
the relevant master repository branches. git cherry-pick doesn't
work across repositories; you need to have the branches in the same
repository.
::

    # Assume patch applied to 2.7 in revision release27~3 (4th patch back from tip).
    cd integration
    git checkout release26
    git cherry-pick release27~3
    # If there are conflicts, resolve them, and commit those changes.
    # git commit -a -m "Resolve conflicts."
    # Run test suite. If fixes are necessary, record as a separate commit.
    # git commit -a -m "Fix code causing test failures."
    git checkout master
    git cherry-pick release27~3
    # Do any conflict resolution and test failure fixups.
    # Revert Misc/NEWS changes.
    git checkout HEAD^ -- Misc/NEWS
    git commit -m 'Revert cherry-picked Misc/NEWS changes.' Misc/NEWS
    # Push both ports (assuming the public repository is the remote "origin").
    git push origin release26 master
|
|
|
|
If you are regularly merging (rather than cherry-picking) from a
given branch, then you can block a given commit from being
accidentally merged in the future by merging, then reverting it.
This does not prevent a cherry-pick from pulling in the unwanted
patch, and this technique requires blocking everything that you don't
want merged. I'm not sure if this differs from svn on this point.
::

    cd trunk
    # Merge in the alpha tested code.
    git merge experimental-branch
    # We don't want the 3rd-to-last commit from the experimental-branch,
    # and we don't want it to ever be merged.
    # The notation "^N" means Nth parent of the current commit. Thus HEAD^2^1^1
    # means the first parent of the first parent of the second parent of HEAD.
    git revert HEAD^2^1^1
    # Propagate the merge and the prohibition to the public repository.
    git push
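
The ``^N`` notation used in the comments above can be illustrated with a toy
model. This is a sketch over a hypothetical, hand-written history (the commit
names ``M``, ``T1``, ``E1`` and so on are invented); it is not how git itself
resolves revisions, and it ignores other suffixes such as ``~N``.

```python
def resolve(commit, spec, parents):
    """Follow a chain of ^N suffixes (for example "^2^1^1") from `commit`.

    `parents` maps a commit name to the ordered list of its parents.
    """
    for step in spec.split("^")[1:]:
        n = int(step) if step else 1  # a bare "^" means "^1"
        commit = parents[commit][n - 1]
    return commit


# Hypothetical history: M is the merge of experimental tip E3 into trunk
# commit T2, so M's first parent is T2 and its second parent is E3.
parents = {
    "M": ["T2", "E3"],
    "T2": ["T1"],
    "E3": ["E2"],
    "E2": ["E1"],
    "E1": ["T1"],
    "T1": [],
}

print(resolve("M", "^2^1^1", parents))
```

Starting from the merge commit, ``^2`` selects the tip of the merged branch
and each following ``^1`` steps one commit further back along it, which is
why ``HEAD^2^1^1`` names the third-to-last commit of the experimental branch.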
|
|
|
|
|
|
Coordinated Development of a New Feature
----------------------------------------

Sometimes core developers end up working on a major feature with
several developers. As a core developer, I want to be able to
publish feature branches to a common public location so that I can
collaborate with other developers.

This requires creating a branch on a server that other developers
can access. All of the DVCSs support creating new repositories on
hosts where the developer is already able to commit, with
appropriate configuration of the repository host. This is
similar in concept to the existing sandbox in svn, although details
of repository initialization may differ.

For non-core developers, there are various more-or-less public-access
repository-hosting services.
Bazaar has Launchpad_, Mercurial has `bitbucket.org`_, and git has
GitHub_. All also have easy-to-use
CGI interfaces for developers who maintain their own servers.

.. _Launchpad: http://www.launchpad.net/
.. _bitbucket.org: http://www.bitbucket.org/
.. _GitHub: http://www.github.com/

* Branch trunk.
* Pull from branch on the server.
* Pull from trunk.
* Push merge to trunk.
|
|
|
|
|
|
svn
'''
::

    # Create branch.
    svn copy svn+ssh://pythondev@svn.python.org/python/trunk svn+ssh://pythondev@svn.python.org/python/branches/NewHotness
    svn checkout svn+ssh://pythondev@svn.python.org/python/branches/NewHotness
    cd NewHotness
    svnmerge init
    svn commit -m "Initialize svnmerge."
    # Pull in changes from other developers.
    svn update
    # Pull in trunk and merge to the branch.
    svnmerge merge
    svn commit -F svnmerge-commit-message.txt


This scenario is incomplete as the decision for what DVCS to go with
was made before the work was complete.
|
|
|
|
|
|
Separation of Issue Dependencies
--------------------------------

Sometimes, while working on an issue, it becomes apparent that the
problem being worked on is actually a compound issue of various
smaller issues. Being able to set the current work aside and begin
working on a separate issue makes it possible to split the work
into individual units instead of compounding them into a
single, large unit.

* Create a branch A (e.g. urllib has a bug).
* Edit some code.
* Create a new branch B that branch A depends on (e.g. the urllib
  bug exposes a socket bug).
* Edit some code in branch B.
* Commit branch B.
* Edit some code in branch A.
* Commit branch A.
* Clean up.
|
|
|
|
|
|
svn
'''

To make up for svn's lack of cheap branching, it has a changelist
option to associate a file with a single changelist. This is not as
powerful as being able to associate at the commit level. There is
also no way to express dependencies between changelists.
::

    cp -r trunk issue0000
    cd issue0000
    # Edit some code.
    echo "The cake is a lie!" > README
    svn changelist A README
    # Edit some other code.
    echo "I own Python!" > LICENSE
    svn changelist B LICENSE
    svn ci -m "Tell it how it is." --changelist B
    # Edit changelist A some more.
    svn ci -m "Speak the truth." --changelist A
    cd ..
    rm -rf issue0000
|
|
|
|
|
|
bzr
'''

Here's an approach that uses bzr shelf (now a standard part of bzr)
to squirrel away some changes temporarily while you take a detour to
fix the socket bugs.
::

    bzr branch trunk bug-0000
    cd bug-0000
    # Edit some code. Dang, we need to fix the socket module.
    bzr shelve --all
    # Edit some code.
    bzr commit -m "Socket module fixes"
    # Detour over, now resume fixing urllib
    bzr unshelve
    # Edit some code
|
|
|
|
Another approach uses the loom plugin. Looms can
greatly simplify working on dependent branches because they
automatically take care of the stacking dependencies for you.
Imagine looms as a stack of dependent branches (called "threads" in
loom parlance), with easy ways to move up and down the stack of
threads, merge changes up the stack to descendant threads, create
diffs between threads, etc. Occasionally, you may need or want to
export your loom threads into separate branches, either for review
or commit. Higher threads incorporate all the changes in the lower
threads, automatically.
::

    bzr branch trunk bug-0000
    cd bug-0000
    bzr loomify --base trunk
    bzr create-thread fix-urllib
    # Edit some code. Dang, we need to fix the socket module first.
    bzr commit -m "Checkpointing my work so far"
    bzr down-thread
    bzr create-thread fix-socket
    # Edit some code
    bzr commit -m "Socket module fixes"
    bzr up-thread
    # Manually resolve conflicts if necessary
    bzr commit -m 'Merge in socket fixes'
    # Edit me some more code
    bzr commit -m "Now that socket is fixed, complete the urllib fixes"
    bzr record done
|
|
|
|
For bonus points, let's say someone else fixes the socket module in
exactly the same way you just did. Perhaps this person even grabbed your
fix-socket thread and applied just that to the trunk. You'd like to
be able to merge their changes into your loom and delete your
now-redundant fix-socket thread.
::

    bzr down-thread trunk
    # Get all new revisions to the trunk. If you've done things
    # correctly, this will succeed without conflict.
    bzr pull
    bzr up-thread
    # See? The fix-socket thread is now identical to the trunk
    bzr commit -m 'Merge in trunk changes'
    bzr diff -r thread: | wc -l # returns 0
    bzr combine-thread
    bzr up-thread
    # Resolve any conflicts
    bzr commit -m 'Merge trunk'
    # Now our top-thread has an up-to-date trunk and just the urllib fix.
|
|
|
|
|
|
hg
''

One approach is to use the shelve extension; this extension is not included
with Mercurial, but it is easy to install. With shelve, you can select changes
to put temporarily aside.
::

    hg clone trunk issue0000
    cd issue0000
    # Edit some code (e.g. urllib).
    hg shelve
    # Select changes to put aside
    # Edit some other code (e.g. socket).
    hg commit
    hg unshelve
    # Complete initial fix.
    hg commit
    cd ../trunk
    hg pull ../issue0000
    hg merge
    hg commit
    rm -rf ../issue0000

There are several other ways to approach this scenario with Mercurial.
Alexander Solovyov presented a few `alternative approaches`_ on
Mercurial's mailing list.

.. _alternative approaches: http://selenic.com/pipermail/mercurial/2009-January/023710.html
|
|
|
|
git
'''

::

    cd trunk
    # Edit some code in urllib.
    # Discover a bug in socket, want to fix that first.
    # So save away our current work.
    git stash
    # Edit some code, commit some changes.
    git commit -a -m "Completed fix of socket."
    # Restore the in-progress work on urllib.
    git stash apply
    # Edit me some more code, commit some more fixes.
    git commit -a -m "Complete urllib fixes."
    # And push both patches to the public repository.
    git push
|
|
|
|
Bonus points: suppose you took your time, and someone else fixed
socket in the same way you just did, and landed that in the trunk. In
that case, your push will fail because your branch is not up-to-date.
If the fix was a one-liner, there's a very good chance that it's
*exactly* the same, character for character. git would notice that,
and you are done; git will silently merge them.

Suppose we're not so lucky::

    # Update your branch.
    git pull git://code.python.org/public/trunk master

    # git has fetched all the necessary data, but reports that the
    # merge failed. We discover the nearly-duplicated patch.
    # Neither our version of the master branch nor the workspace has
    # been touched. Revert our socket patch and pull again:
    git revert HEAD^
    git pull git://code.python.org/public/trunk master
|
|
|
|
Like Bazaar and Mercurial, git has extensions to manage stacks of
patches. You can use the original Quilt by Andrew Morton, or there is
StGit ("stacked git") which integrates patch-tracking for large sets
of patches into the VCS in a way similar to Mercurial Queues or Bazaar
looms.
|
|
|
|
|
|
Doing a Python Release
----------------------

How does :pep:`101` change when using a DVCS?


bzr
'''

It will change, but not substantially so. When doing the
maintenance branch, we'll just push to the new location instead of
doing an svn cp. Tags are totally different, since in svn they are
directory copies, but in bzr (and I'm guessing hg), they are just
symbolic names for revisions on a particular branch. The release.py
script will have to change to use bzr commands instead. Because a
DVCS (in particular, bzr) does cherry picking and merging well, it's
possible that we'll be able to create the maint branches sooner.
It would be a useful exercise to try to do a
release off the bzr/hg mirrors.
|
|
|
|
|
|
hg
''

Clearly, details specific to Subversion in :pep:`101` and in the
release script will need to be updated. In particular, the release
tagging and maintenance branch creation processes will have to be
modified to use Mercurial's features; this will simplify and
streamline certain aspects of the release process. For example,
tagging and re-tagging a release will become a trivial operation
since a tag, in Mercurial, is simply a symbolic name for a given
revision.
|
|
|
|
|
|
git
'''

It will change, but not substantially so. When doing the
maintenance branch, we'll just git push to the new location instead
of doing an svn cp. Tags are totally different, since in svn they
are directory copies, but in git they are just symbolic names for
revisions, as are branches. (The difference between a tag and a
branch is that tags refer to a particular commit, and will never
change unless you use git tag -f to force them to move. The
checked-out branch, on the other hand, is automatically updated by
git commit.) The release.py script will have to change to use git
commands instead. With git I would create a (local) maintenance
branch as soon as the release engineer is chosen. Then I'd "git
pull" until I didn't like a patch, when it would be "git pull; git
revert ugly-patch", until it started to look like the sensible thing
is to fork off, and start doing "git cherry-pick" on the good
patches.
|
|
|
|
|
|
Platform/Tool Support
=====================

Operating Systems
-----------------

==== ======================================= ============================================= =============================
DVCS Windows                                 OS X                                          UNIX
==== ======================================= ============================================= =============================
bzr  yes (installer) w/ tortoise             yes (installer, fink or MacPorts)             yes (various package formats)
hg   yes (third-party installer) w/ tortoise yes (third-party installer, fink or MacPorts) yes (various package formats)
git  yes (third-party installer)             yes (third-party installer, fink or MacPorts) yes (.deb or .rpm)
==== ======================================= ============================================= =============================

As the above table shows, all three DVCSs are available on all three
major OS platforms. But what it also shows is that Bazaar is the
only DVCS that directly supports Windows with a binary installer
while Mercurial and git require you to rely on a third party for
binaries. Both bzr and hg have a tortoise version while git does not.

Bazaar and Mercurial also have the benefit of being available in pure
Python with optional extensions available for performance.
|
|
|
|
|
|
CRLF -> LF Support
------------------

bzr
  My understanding is that support for this is being worked on as
  I type, landing in a version RSN. I will try to dig up details.

hg
  Supported via the win32text extension.

git
  I can't say from personal experience, but it looks like there's
  pretty good support via the core.autocrlf and core.safecrlf
  configuration attributes.
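
What such support amounts to can be sketched as a pair of byte-level
conversions: CRLF is normalized to LF when content enters the repository,
and LF is optionally expanded to CRLF when files are written out on Windows.
The function names below are hypothetical, and the sketch ignores details
the real tools must handle, such as leaving binary files alone.

```python
def to_repository(data: bytes) -> bytes:
    """Normalize CRLF line endings to LF, as done when committing."""
    return data.replace(b"\r\n", b"\n")


def to_windows_checkout(data: bytes) -> bytes:
    """Convert line endings to CRLF, as done when checking out on Windows."""
    # Normalize first so that existing CRLF pairs are not doubled.
    return to_repository(data).replace(b"\n", b"\r\n")


mixed = b"first\r\nsecond\nthird\r\n"
print(to_repository(mixed))
print(to_windows_checkout(mixed))
```

Normalizing before expanding makes the checkout conversion safe to apply to
content that already contains some CRLF line endings.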
|
|
|
|
|
|
Case-insensitive filesystem support
-----------------------------------

bzr
  Should be OK. I share branches between Linux and OS X all the
  time. I've done case changes (e.g. ``bzr mv Mailman mailman``) and
  as long as I did it on Linux (obviously), when I pulled in the
  changes on OS X everything was hunky dory.

hg
  Mercurial uses a case safe repository mechanism and detects case
  folding collisions.

git
  Since OS X preserves case, you can do case changes there too.
  git does not have a problem with renames in either direction.
  However, case-insensitive filesystem support is usually taken
  to mean complaining about collisions on case-sensitive file
  systems. git does not do that.
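
To make "case folding collision" concrete, here is a minimal sketch of how
such a check could work: paths are grouped by their case-folded form, and any
group with more than one member would clash on a case-insensitive file
system. The function ``case_collisions`` is hypothetical and is not taken
from any of the three tools.

```python
from collections import defaultdict


def case_collisions(paths):
    """Return groups of paths that differ only by case."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(case_collisions(["Lib/Mailman.py", "Lib/mailman.py", "README"]))
```

A tool that runs a check like this before committing can warn on a
case-sensitive system about files that users on Windows or OS X could not
check out side by side.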
|
|
|
|
|
|
Tools
-----

In terms of code review tools such as `Review Board`_ and Rietveld_,
the former supports all three while the latter supports hg and git but
not bzr. Bazaar does not yet have an online review board, but it
has several ways to manage email-based reviews and trunk merging.
There's `Bundle Buggy`_, `Patch Queue Manager`_ (PQM), and
`Launchpad's code reviews <https://launchpad.net/+tour/code-review>`_.

.. _Review Board: http://www.review-board.org/
.. _Rietveld: http://code.google.com/p/rietveld/

.. _Bundle Buggy: http://code.aaronbentley.com/bundlebuggy/
.. _Patch Queue Manager: http://bazaar-vcs.org/PatchQueueManager

All three have a web site that provides basic hosting
support for people who want to put a repository online. Bazaar has
Launchpad, Mercurial has bitbucket.org, and git has GitHub. Google
Code also has instructions on how to use git with the service, both
to hold a repository and to act as a read-only mirror.

All three also `appear to be supported
<http://buildbot.net/repos/release/docs/buildbot.html#How-Different-VC-Systems-Specify-Sources>`_
by Buildbot_.

.. _Buildbot: http://buildbot.net
|
|
|
|
|
|
Usage On Top Of Subversion
==========================

==== ============
DVCS svn support
==== ============
bzr  bzr-svn_ (third-party)
hg   `multiple third-parties <http://www.selenic.com/mercurial/wiki/index.cgi/WorkingWithSubversion>`__
git  git-svn_
==== ============

.. _bzr-svn: http://bazaar-vcs.org/BzrForeignBranches/Subversion
.. _git-svn: http://www.kernel.org/pub/software/scm/git/docs/git-svn.html

All three DVCSs have svn support, although git is the only one to
come with that support out-of-the-box.
|
|
|
|
|
|
Server Support
==============

==== ==================
DVCS Web page interface
==== ==================
bzr  loggerhead_
hg   hgweb_
git  gitweb_
==== ==================

.. _loggerhead: https://launchpad.net/loggerhead
.. _hgweb: http://www.selenic.com/mercurial/wiki/index.cgi/HgWebDirStepByStep
.. _gitweb: http://git.or.cz/gitwiki/Gitweb

All three DVCSs support various hooks on the client and server side
for e.g. pre/post-commit verifications.
|
|
|
|
|
|
Development
===========

All three projects are under active development. Git seems to be on a
monthly release schedule. Bazaar is on a time-based monthly release
schedule. Mercurial is on a 4-month, timed release schedule.
|
|
|
|
|
|
Special Features
================

bzr
---

Martin Pool adds: "bzr has a stable Python scripting interface, with
a distinction between public and private interfaces and a
deprecation window for APIs that are changing. Some plugins are
listed in https://edge.launchpad.net/bazaar and
http://bazaar-vcs.org/Documentation".


hg
--

Alexander Solovyov comments:

  Mercurial has easy to use extensive API with hooks for main events
  and ability to extend commands. Also there is the mq (mercurial
  queues) extension, distributed with Mercurial, which simplifies
  work with patches.


git
---

git has a cvsserver mode, i.e., you can check out a tree from git
using CVS. You can even commit to the tree, but features like
merging are absent, and branches are handled as CVS modules, which
is likely to shock a veteran CVS user.
|
|
|
|
|
|
Tests/Impressions
=================

Since I (Brett Cannon), and not my co-authors, am left with the task
of making the final decision on which DVCS, if any, to go with, I felt
it only fair to write down what tests I ran and my impressions as I
evaluated the various tools so as to be as transparent as possible.


Barrier to Entry
----------------

The amount of time and effort it takes to get a checkout of Python's
repository is critical. If the difficulty or time is too great then a
person wishing to contribute to Python may very well give up. That
cannot be allowed to happen.

I measured the checking out of the 2.x trunk as if I were a non-core
developer. Timings were done using the ``time`` command in zsh and
space was calculated with ``du -c -h``.

======= ================ ========= =====
DVCS    San Francisco    Vancouver Space
======= ================ ========= =====
svn     1:04             2:59      139 M
bzr     10:45            16:04     276 M
hg      2:30             5:24      171 M
git     2:54             5:28      134 M
======= ================ ========= =====

When comparing these numbers to svn, it is important to realize that
it is not a 1:1 comparison. Svn does not pull down the entire revision
history like all of the DVCSs do. That means svn can perform an
initial checkout much faster than the DVCSs purely based on the fact
that it has less information to download over the network.
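
To put the checkout times in proportion, the San Francisco column can be
converted to seconds and expressed relative to svn. This is just arithmetic
on the figures in the table, assuming the times are in minutes:seconds
format; the helper ``to_seconds`` is hypothetical.

```python
def to_seconds(clock):
    """Convert an "m:ss" string to a number of seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


# San Francisco checkout times copied from the table.
san_francisco = {"svn": "1:04", "bzr": "10:45", "hg": "2:30", "git": "2:54"}

baseline = to_seconds(san_francisco["svn"])
for tool, clock in san_francisco.items():
    elapsed = to_seconds(clock)
    print(f"{tool}: {elapsed} s, {elapsed / baseline:.1f}x svn")
```

By this measure hg and git take between two and three times as long as svn
for the initial checkout, while bzr takes roughly ten times as long.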
|
|
|
|
|
|
Performance of basic information functionality
----------------------------------------------

To see how the tools did for performing a command that required
querying the history, the log for the ``README`` file was timed.

==== =====
DVCS Time
==== =====
bzr  4.5 s
hg   1.1 s
git  1.5 s
==== =====

One thing of note during this test was that it took me longer with
git than with the other tools to figure out how to get the log without
a pager. While the pager use is a nice touch in general, working out
how to keep it from turning on automatically took some time (it turns
out the main ``git`` command has a ``--no-pager`` flag to disable use
of the pager).
|
|
|
|
|
|
Figuring out what command to use from built-in help
----------------------------------------------------

I ended up trying to find out what the command was to see what URL the
repository was cloned from. To do this I used nothing more than the
help provided by the tool itself or its man pages.

Bzr was the easiest: ``bzr info``. Running ``bzr help`` didn't show
what I wanted, but mentioned ``bzr help commands``. That list had the
command with a description that made sense.

Git was the second easiest. The command ``git help`` didn't show much
and did not have a way of listing all commands. That is when I viewed
the man page. Reading through the various commands I discovered ``git
remote``. The command itself spit out nothing more than ``origin``.
Trying ``git remote origin`` said it was an error and printed out the
command usage. That is when I noticed ``git remote show``. Running
``git remote show origin`` gave me the information I wanted.

For hg, I never found the information I wanted on my own. It turns out
I wanted ``hg paths``, but that was not obvious from the description
of "show definition of symbolic path names" as printed by ``hg help``
(it should be noted that reporting this in the PEP did lead the
Mercurial developers to clarify the wording to make the use of the
``hg paths`` command clearer).
|
|
|
|
|
|
Updating a checkout
---------------------

To see how long it takes to update an outdated repository I timed both
updating a repository 700 commits behind and 50 commits behind (three
weeks stale and one week stale, respectively).

==== =========== ==========
DVCS 700 commits 50 commits
==== =========== ==========
bzr  39 s        7 s
hg   17 s        3 s
git  N/A         4 s
==== =========== ==========

.. note::
   Git lacks a value for the *700 commits* scenario as it does
   not seem to allow checking out a repository at a specific
   revision.

Git deserves special mention for its output from ``git pull``. It
not only lists the delta change information for each file but also
color-codes the information.
|
|
|
|
|
|
Decision
=========

At PyCon 2009 the decision was made to go with Mercurial.


Why Mercurial over Subversion
-----------------------------

While svn has served the development team well, it needs to be
admitted that svn does not serve the needs of non-committers as well
as a DVCS does. Because svn only provides its features such as version
control, branching, etc. to people with commit privileges on the
repository, it can be a hindrance for people who lack commit
privileges. But DVCSs have no such limitation, as anyone can create a
local branch of Python and perform their own local commits without the
burden that comes with cloning the entire svn repository. Allowing
anyone to have the same workflow as the core developers was the key
reason to switch from svn to hg.

Orthogonal to the benefit of allowing anyone to easily commit locally
to their own branches are fast, offline operations. Because hg stores
all data locally, there is no need to send requests to a remote
server; operations work off of the local disk instead. This improves
response times tremendously. It also allows for offline usage when
one lacks an Internet connection. But this benefit is minor and
considered simply a side-effect benefit instead of a driving factor
for switching off of Subversion.
|
|
|
|
|
|
Why Mercurial over other DVCSs
------------------------------

Git was not chosen for three key reasons (see the `PyCon 2009
lightning talk <http://pycon.blip.tv/file/1947231/>`_ where Brett
Cannon lists these exact reasons; the talk starts at 3:45). First, git's
Windows support is the weakest out of the three DVCSs being considered,
which is unacceptable as Python needs to support development on any
platform it runs on. Since Python runs on Windows and some people do
develop on the platform, it needs solid support. And while git's
support is improving, as of this moment it is the weakest by a large
enough margin to warrant considering it a problem.

Second, and just as important as the first issue, is that the Python
core developers liked git the least out of the three DVCS options by a
wide margin. The following table shows the results of a survey taken
of the core developers, in which git is the least favorite version
control system by a large margin.

==== == ===== == ==========
DVCS ++ equal -- Uninformed
==== == ===== == ==========
git  5  1     8  13
bzr  10 3     2  12
hg   15 1     1  10
==== == ===== == ==========
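
The margin can be made explicit with a little arithmetic on the survey table.
The sketch below assumes that the ``++`` column counts developers in favour
of a tool and the ``--`` column counts those against it; that reading, and
the "net preference" measure itself, are interpretations rather than
something the survey defines.

```python
# Survey rows as (in favour, equal, against, uninformed), copied from the table.
survey = {
    "git": (5, 1, 8, 13),
    "bzr": (10, 3, 2, 12),
    "hg": (15, 1, 1, 10),
}


def net_preference(row):
    """In-favour votes minus against votes for one survey row."""
    favour, _equal, against, _uninformed = row
    return favour - against


ranking = sorted(
    survey, key=lambda tool: net_preference(survey[tool]), reverse=True
)
for tool in ranking:
    print(tool, net_preference(survey[tool]), sum(survey[tool]))
```

Each row sums to the same number of responses, so the net figures are
directly comparable, and git is the only tool with more votes against it
than for it.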
|
|
|
|
Lastly, all things being equal (which they are not,
as shown by the previous two issues), it is preferable to
use and support a tool written in Python and not one written in C and
shell. We are pragmatic enough to not choose a tool simply because it
is written in Python, but we do see the usefulness in promoting tools
that do use it when it is reasonable to do so, as it is in this case.

As for why Mercurial was chosen over Bazaar, it came down to
popularity. As the core developer survey shows, hg was preferred over
bzr. But the community also appears to prefer hg, as was shown at PyCon
after git's removal from consideration was announced. Many people came
up to Brett and said in various ways that they wanted hg to be chosen.
While no one said they did not want bzr chosen, no one said they did
either.

Based on all of this information, Guido and Brett decided Mercurial
was to be the next version control system for Python.
|
|
|
|
|
|
Transition Plan
===============

:pep:`385` outlines the transition from svn to hg.


Copyright
=========

This document has been placed in the public domain.
|
|
|