PEP 385: a bunch of updates after python-dev discussion.

Dirkjan Ochtman 2009-08-03 11:38:44 +00:00
parent 7da366131d
commit ad94c1d76e
1 changed files with 84 additions and 55 deletions

@@ -59,27 +59,25 @@ client. The latter makes it a little easier to switch between branches, but
 often has somewhat unintuitive results for people (though this has been
 getting better in recent versions of Mercurial).

-I'm still a bit on the fence about whether Python should adopt cloned
-branches and named branches. Since it usually makes more sense to tag releases
-on the maintenance branch, for example, mainline history would not contain
-release tags if we used cloned branches. Also, Mercurial 1.2 and 1.3 have the
-necessary tools to make named branches less painful (because they can be
-properly closed and closed heads are no longer considered in relevant cases).
-
-A disadvantage might be that the used clones will be a good bit larger (since
-they essentially contain all other branches as well). This can me mitigated by
-keeping non-release (feature) branches in separate clones. Also note that it's
-still possible to clone a single named branch from a combined clone, by
-specifying the branch as in hg clone http://hg.python.org/main/#2.6-maint.
-Keeping the py3k history in a separate clone problably also makes sense.
-
-XXX To do: size comparison for selected separation scenarios.
+The current proposal is to use named branches for release branches and adopt
+cloned branches for feature branches, with one exception to this rule: the 3.x
+branches will be kept in separate clones from the 2.x branches. I think this
+provides an optimal hybrid approach for Python's uses of branching.
+
+Differences between named branches and cloned branches:
+
+* Tags in a different (maintenance) clone aren't available in the local clone
+* Clones with named branches will be larger, since they contain more data
+
+(The Mercurial book discourages the use of named branches, but it is, in this
+respect, somewhat outdated. Named branches have gotten much easier to use
+since that comment was written, due to improvements in hg.)

 Converting branches
 -------------------

 There are quite a lot of branches in SVN's branches directory. I propose to
-clean this up a bit, by employing the following the strategy:
+clean this up a bit, by following this basic strategy:

 * Keep all release (maintenance) branches
 * Discard branches that haven't been touched in 18 months, unless somone
@@ -87,6 +85,21 @@ clean this up a bit, by employing the following the strategy:
 * Keep branches that have been touched in the last 18 months, unless someone
 indicates the branch can be deprecated

+There's a `branch map`_ available that shows info about each branch:
+
+* keep-clone means we'll keep that branch in a separate clone
+* keep-named means we'll keep that branch as a named branch in one of the clones
+* strip means we won't keep that branch
+* streamed-merge means that it got merged by committing several new revisions
+to the other branch
+* merged-r* means the branch got merged in the named revision
+* merges? means I haven't checked/found out yet whether that branch was ever
+merged
+* ? means that your input would be even more helpful than for the other items
+* some items have no action yet, feel free to treat that as just '?'
+
+.. _branch map: http://hg.python.org/pymigr/file/tip/all-branches.txt
+
 Converting tags
 ---------------
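The status keywords added in the hunk above lend themselves to a small classifier. The sketch below is illustrative only: the function name classify, the wording of the descriptions, and the assumption that merged-r* is followed by a numeric SVN revision are not taken from the branch map file itself, whose exact layout is not shown in this commit.

```python
import re

# Status keywords as listed in the PEP text; the descriptions paraphrase it.
FIXED_STATUSES = {
    "keep-clone": "keep in a separate clone",
    "keep-named": "keep as a named branch",
    "strip": "do not keep",
    "streamed-merge": "merged via several new revisions on the other branch",
    "merges?": "merge status not yet checked",
    "?": "input wanted",
}


def classify(status):
    """Return a readable action for one branch-map status string."""
    status = status.strip()
    if not status:
        # The PEP says items without an action can be treated as '?'.
        status = "?"
    if status in FIXED_STATUSES:
        return FIXED_STATUSES[status]
    # Assumption: 'merged-r*' carries a numeric SVN revision, e.g. merged-r12345.
    match = re.fullmatch(r"merged-r(\d+)", status)
    if match:
        return "merged in revision r" + match.group(1)
    return "unknown status: " + status
```

For example, classify("merged-r12345") reports the merge revision, while an empty status falls back to the same treatment as '?'.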
@@ -95,8 +108,8 @@ fact, full tags, but contain only a smaller subset of the repository. I think
 we should keep all release tags, and consider other tags for inclusion based
 on requests from the developer community. I'd like to consider unifying the
 release tag naming scheme to make some things more consistent, if people feel
-that won't create too many problems. For example, Mercurial itself just uses
-'1.2.1' as a tag, where CPython would currently use r121.
+that won't create too many problems. The current proposal is to bring old
+release tags in line with the current practice of release tag naming.

 Author map
 ----------
@@ -119,17 +132,19 @@ that are not eligible for version control. It does this by employing several
 possible forms of pattern matching. The current Python repository already
 includes a rudimentary .hgignore file to help with using the hg mirrors.

-It might be useful to have the .hgignore be generated automatically from
-svn:ignore properties. This would make sure all historic revisions also have
-useful ignore information (though one could argue ignoring isn't really
-relevant to just checking out an old revision).
+Since the current Python repository already includes a .hgignore file (for use
+with hg mirrors), we'll just use that. Generating full history of the file
+was debated but deemed impractical (because it's relatively hard with fairly
+little gain, since ignoring is less important for older revisions).

 Revlog reordering
 -----------------

-As an optional optimization technique, we should consider trying a reordering
-pass on the revlogs (internal Mercurial files) resulting from the conversion.
-In some cases this results in dramatic decreases in on-disk repository size.
+As an optional optimization technique, I have performed a reordering pass on
+the revlogs (internal Mercurial files) resulting from the conversion. In some
+cases this results in dramatic decreases in on-disk repository size. This
+especially makes sense for the manifest (where it really helps out quite a lot)
+and oft-edited files like NEWS.txt (with an admittedly smaller effect).

 Other repositories
 ------------------
@@ -138,6 +153,13 @@ Richard Tew has indicated that he'd like the Stackless repository to also be
 converted. What other projects in the svn.python.org repository should be
 converted? Do we want to convert the peps repository? distutils? others?

+There's now an initial stab at converting the Jython repository. The current
+tip of hgsubversion unfortunately fails at some point. Pending investigation.
+
+Other repositories that would like to converted to Mercurial can announce
+themselves to me after the main Python migration is done, and I'll take care
+of their needs.
+
 Infrastructure
 ==============
@@ -165,17 +187,18 @@ developed and deployed. The following hooks are being used:
 lines. Open issue: do we check only the tip after each push, or do we check
 every commit in a changegroup?

-* commit mails: we can leverage the notify extension for this
+* commit mails: we can leverage the notify extension for this. Emails will
+include diffs for each changeset committed against the repository.

 * buildbots: both the regular and the community build masters must be notified.
 Fortunately buildbot includes support for hg. I've also implemented this for
 Mercurial itself, so I don't expect problems here.

 * check contributors: in the current setup, all changesets bear the username of
-committers, who must have signed the contributor agreement. In a DVCS, the
-committers are not necessarily the same people who push, and so we can't
-check if the committer is a contributor. We could use a hook to check if the
-committer is a contributor if we keep a list of registered contributors.
+committers, who must have signed the contributor agreement. We might want to
+use a hook to check if the committer is a contributor if we keep a list of
+registered contributors. Then, the hook might warn users that push a group
+of revisions containing changesets from unknown contributors.

 hgwebdir
 --------
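The "check contributors" item in the hunk above describes a hook that warns about changesets from unknown contributors. A minimal sketch of the core check follows; it deliberately avoids Mercurial's hook API and works on plain strings. The function names, the case-insensitive comparison, and the warning text are assumptions for illustration, not part of the PEP.

```python
def unknown_contributors(changeset_authors, registered):
    """Return authors, in first-seen order, that are not registered."""
    # Assumption: names are compared case-insensitively, ignoring padding.
    known = {name.strip().lower() for name in registered}
    seen = set()
    unknown = []
    for author in changeset_authors:
        key = author.strip().lower()
        if key not in known and key not in seen:
            seen.add(key)
            unknown.append(author.strip())
    return unknown


def warning_for_push(changeset_authors, registered):
    """Build the warning a hook could print, or None if everyone is known."""
    missing = unknown_contributors(changeset_authors, registered)
    if not missing:
        return None
    return "warning: changesets from unknown contributors: " + ", ".join(missing)
```

A real hook would collect the author of every changeset in the pushed group and pass them in; returning a message rather than failing matches the PEP's wording that the hook "might warn" rather than reject.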
@@ -185,6 +208,15 @@ come up with a style to match the Python website. It may also be useful to
 build a quick extension to augment the URL rev parser so that it can also take
 r[0-9]+ args and come up with the matching hg revision.

+roundup
+-------
+
+We'll come up with an auto-linking plugin for roundup, which can match a
+changeset identifier (possibly with a branch prefix), and link it to the
+appropriate revision in the hgwebdir instance. Second, the script above (in
+the hgwebdir section) will make sure that old links to revision should continue
+to work (by pointing to the hg changeset that reflects the svn revision).
+
 After migration
 ===============
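The hgwebdir and roundup text in the hunk above describes two lookups: turning a changeset identifier in issue text into a link, and resolving an old r[0-9]+ argument to an hg revision. The sketch below shows one way both could work. Everything specific in it is an assumption: the base URL is a placeholder, the "branch:hash" prefix syntax and the 12-digit short hash are guesses at the identifier format, and the svn-to-hg mapping is a plain dict standing in for whatever the conversion records.

```python
import re

BASE_URL = "http://hg.example.org/repo"  # placeholder, not the real server

# Assumption: a changeset is a 12-digit short hash, optionally 'branch:' prefixed.
CHANGESET_RE = re.compile(r"\b(?:(?P<branch>[\w.-]+):)?(?P<node>[0-9a-f]{12})\b")
SVN_REV_RE = re.compile(r"^r([0-9]+)$")


def link_changesets(text):
    """Wrap each changeset identifier in text in an HTML link."""
    def to_link(match):
        url = "%s/rev/%s" % (BASE_URL, match.group("node"))
        return '<a href="%s">%s</a>' % (url, match.group(0))
    return CHANGESET_RE.sub(to_link, text)


def resolve_svn_rev(arg, svn_to_hg):
    """Map an 'r12345' style argument to an hg node, or None if unknown."""
    match = SVN_REV_RE.match(arg)
    if match is None:
        return None
    # svn_to_hg is a hypothetical {svn revision number: hg node} mapping.
    return svn_to_hg.get(int(match.group(1)))
```

The regular expression is intentionally strict; full 40-digit hashes and other identifier forms would need their own patterns.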
@@ -222,36 +254,31 @@ on the outcome of debate about this PEP (for example, the branching strategy).
 .. _wiki: http://www.selenic.com/mercurial/wiki/
 .. _parts of the developer FAQ: http://www.python.org/dev/faq/#version-control

-Think first, commit later?
---------------------------
-
-In recent history, old versions of Python have been maintained by a select
-group of people backporting patches from trunk to release branches. While
-this may not scale so well as the development pace grows, it also runs into
-some problems with the current crop of distributed versioning tools. These
-tools (I believe similar problems would exist for either git, bzr, or hg,
-though some may cope better than others) are based on the idea of a Directed
-Acyclic Graph (or DAG), meaning they keep track of relations of changesets.
-
-Mercurial itself has a stable branch which is a ''strict'' subset of the
-unstable branch. This means that generally all fixes for the stable branch
-get committed against the tip of the stable branch, then they get merged into
-the unstable branch (which already contains the parent of the new cset). This
-provides a largely frictionless environment for moving changes from stable to
-unstable branches. Mistakes, where a change that should go on stable goes on
-unstable first, do happen, but they're usually easy to fix. That can be done by
-copying the change over to the stable branch, then trivial-merging with
-unstable -- meaning the merge in fact ignores the parent from the stable
-branch).
-
-This strategy means a little more work for regular committers, because they
-have to think about whether their change should go on stable or unstable; they
-may even have to ask someone else (the RM) before committing. But it also
-relieves a dedicated group of committers of regular backporting duty, in
-addition to making it easier to work with the tool.
-
-Now would be a good time to consider changing strategies in this regard,
-although it would be relatively easy to switch to such a model later on.
+Proposed workflow
+-----------------
+
+I propose two workflows for the migration of patches between several branches.
+
+For migration within 2.x or 3.x branches, I propose a patch always gets
+committed to the oldest branch where it applies first. Then, the resulting
+changeset can be merged using hg merge to all newer branches within that
+series (2.x or 3.x). If it does not apply as-is to the newer branch, hg revert
+can be used to easily revert to the new-branch-native head, patch in some
+alternative version of the patch (or none, if it's not applicable), then commit
+the merge. The premise here is that all changesets from an older branch within
+the series are eventually merged to all newer branches within the series.
+
+The upshot is that this provides for the most painless merging procedure. The
+downside is that in the general case, people have to think about the oldest
+branch to which the patch should be applied before actually applying it.
+
+For migration between 2.x and 3.x branches (which should all be in the same
+direction, though I'm not sure what direction is most appropriate here),
+changesets should be transplanted (not merged) in some other way. The
+transplant extension, import/export and bundle/unbundle work equally well here.
+
+Choosing this approach allows 3.x not to carry all of the 2.x history-since-it-
+was-branched, meaning the clone is not as big and the merges not as complicated.

 The future of Subversion
 ------------------------
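The within-series workflow in the hunk above (commit on the oldest applicable branch, then merge forward) can be sketched as a merge plan. The branch names, the function name merge_plan, and the choice to chain merges from each branch into the next newer one (rather than merging from the oldest branch into each newer branch directly) are assumptions; the PEP text only says the changeset is merged "to all newer branches within that series".

```python
def merge_plan(series, oldest):
    """List the (source, target) merges after a commit on `oldest`.

    `series` holds the branches of one series (2.x or 3.x only),
    ordered from oldest to newest.
    """
    start = series.index(oldest)  # raises ValueError for an unknown branch
    # Pair each branch with its immediate successor, from `oldest` onward.
    return list(zip(series[start:], series[start + 1:]))
```

With a hypothetical series ["2.5", "2.6", "2.7"], a fix committed on "2.5" yields two merges, and a fix committed on the newest branch yields none. Moving a change between the 2.x and 3.x series is outside this plan, since the PEP proposes transplanting there instead of merging.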
@@ -281,7 +308,9 @@ to Python code as part of sys.version:
 I propose that the revision identifier will be the short version of hg's
 revision hash, for example 'dd3ebf81af43', augmented with '+' (instead of 'M')
 if the working directory from which it was built was modified. This mirrors
-the output of the hg id command, which is intended for this kind of usage.
+the output of the hg id command, which is intended for this kind of usage. The
+sys.subversion value will also be renamed to sys.mercurial to reflect the
+change in VCS.
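The revision identifier rule described above is simple enough to state as code. This is a sketch of the formatting rule only, not of how the build would obtain the hash; the function name is hypothetical, and the 12-digit length is the short hash form that hg id prints.

```python
def revision_identifier(full_hash, modified):
    """Format a revision identifier: short hash, plus '+' if modified."""
    short = full_hash[:12]  # hg's short form is the first 12 hex digits
    return (short + "+") if modified else short
```

A clean build from the example revision would report 'dd3ebf81af43', and a build from a modified working directory would report 'dd3ebf81af43+'.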
 For the tag/branch identifier, I propose that hg will check for tags on the
 currently checked out revision, use the tag if there is one ('tip' doesn't