From 3a5e2bdbfbd26c4d79c38c8889af6ecb42ac340d Mon Sep 17 00:00:00 2001
From: Donald Stufft
Date: Sat, 29 Nov 2014 18:12:06 -0500
Subject: [PATCH] Add PEP 481 - Migrate Some Supporting Repositories to Git and Github

---
 pep-0481.txt | 284 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 284 insertions(+)
 create mode 100644 pep-0481.txt

diff --git a/pep-0481.txt b/pep-0481.txt
new file mode 100644
index 000000000..3517e5ddf
--- /dev/null
+++ b/pep-0481.txt
@@ -0,0 +1,284 @@
+PEP: 481
+Title: Migrate Some Supporting Repositories to Git and Github
+Version: $Revision$
+Last-Modified: $Date$
+Author: Donald Stufft
+Status: Draft
+Type: Process
+Content-Type: text/x-rst
+Created: 29-Nov-2014
+Post-History: 29-Nov-2014
+
+
+Abstract
+========
+
+This PEP proposes migrating to Git and Github for certain supporting
+repositories (such as the repository for Python Enhancement Proposals) in a way
+that is more accessible to new contributors, and easier to manage for core
+developers. This is offered as an alternative to PEP 474, which aims to achieve
+the same overall benefits while continuing to use the Mercurial DVCS and
+without relying on a commercial entity.
+
+In particular this PEP proposes changes to the following repositories:
+
+* https://hg.python.org/devguide/
+* https://hg.python.org/devinabox/
+* https://hg.python.org/peps/
+
+
+This PEP does not propose any changes to the core development workflow for
+CPython itself.
+
+
+Rationale
+=========
+
+As PEP 474 mentions, there are currently a number of repositories hosted on
+hg.python.org which are not directly used for the development of CPython but
+instead are supporting or ancillary repositories. These supporting repositories
+do not typically have complex workflows, and often have no branches at all
+other than the primary integration branch. This simplicity makes them very good
+targets for the "Pull Request" workflow that is commonly found on sites like
+Github.
+
+However, where PEP 474 wants to continue to use Mercurial and restricts itself
+to OSS, self-hosted solutions, this PEP expands the scope to include migrating
+to Git and using Github.
+
+The existing method of contributing to these repositories generally includes
+generating a patch and either uploading it to bugs.python.org or emailing it to
+peps@python.org. This process is unfriendly towards non-committer contributors
+and makes it harder than it needs to be for committers to accept the patches
+sent by users. In addition to the benefits in the pull request workflow itself,
+this style of workflow also enables non-technical contributors, especially
+those who do not know their way around the DVCS of choice, to contribute using
+the web-based editor. On the committer side, Pull Requests enable them to
+tell, before merging, whether or not a particular Pull Request will break
+anything. It also enables them to do a simple "push button" merge which does
+not require them to check out the changes locally. Another such feature that is
+useful in particular for docs is the ability to view a "prose" diff. This
+Github-specific feature enables a committer to view a diff of the rendered
+output, which will hide things like reformatting a paragraph and show what the
+"meat" of the change actually is.
+
+
+Why Git?
+--------
+
+Looking at the variety of DVCS which are available today it becomes fairly
+clear that git has captured the vast majority of the mindshare among people who
+are currently using one. The Open Hub (previously Ohloh) statistics
+[#openhub-stats]_ show that 37% of the repositories Open Hub currently indexes
+use git, which is second only to SVN (which has 48%), while Mercurial has just
+2% of the indexed repositories (beating only bazaar, which has 1%).
+In addition to the Open Hub statistics, a look at the top 100 projects on PyPI
+(ordered by total download counts) shows us that within the Python space
+itself there is a majority of projects using git:
+
+=== ========= ========== ====== === ====
+Git Mercurial Subversion Bazaar CVS None
+=== ========= ========== ====== === ====
+62  22        7          4      1   1
+=== ========= ========== ====== === ====
+
+
+Choosing a DVCS which has the larger mindshare will make it more likely that
+any particular person who has experience with DVCS at all will be able to
+meaningfully use the DVCS that we have chosen without having to learn a new
+tool.
+
+In addition to simply making it more likely that any individual will already
+know how to use git, the number of projects and people using it means that the
+resources for learning the tool are likely to be more fully fleshed out, and
+when you run into problems it is far likelier that someone else had the same
+problem, posted a question, and received an answer.
+
+Thirdly, by using a more popular tool you also increase your options for
+tooling *around* the DVCS itself. Looking at the various options for hosting
+repositories, it's extremely rare to find a hosting solution (whether OSS or
+commercial) that supports Mercurial but does not support Git. On the flip side,
+there are a number of tools which support Git but do not support Mercurial.
+Therefore the popularity of git increases the flexibility of our options going
+into the future for what toolchain these projects use.
+
+Also, by moving to the more popular DVCS we increase the likelihood that the
+knowledge a person gains in contributing to these support repositories will
+transfer to projects outside of the immediate CPython project, such as the
+larger Python community, which is primarily using Git hosted on Github.
+
+In previous years there was concern about how well supported git was on Windows
+in comparison to Mercurial.
+However, git has grown to support Windows as a first class citizen. In
+addition to that, for Windows users who are not well acquainted with the
+Windows command line there are GUI options as well.
+
+On a technical level git and Mercurial are fairly similar; however, the git
+branching model is significantly better than Mercurial "Named Branches" for
+non-committer contributors. Mercurial does have a "Bookmarks" extension;
+however, this isn't quite as good as git's branching model. All bookmarks live
+in the same namespace, so individual users must namespace the branch names
+themselves lest they risk collision. It also is an extension, which requires
+new users to first discover they need an extension at all and then figure out
+what they need to do in order to enable that extension. Since it is an
+extension, it also means that in general support for bookmarks outside of
+Mercurial core is going to be less than 100%, in comparison to git where the
+feature is built in and core to using git at all. Finally, users who are not
+used to Mercurial are unlikely to discover bookmarks on their own; instead they
+will likely attempt to use Mercurial's "Named Branches" which, given the fact
+they live "forever", are not often what a project wants its contributors to
+use.
+
+
+Why Github?
+-----------
+
+There are a number of software projects or web services which offer
+functionality similar to that of Github. These range from commercial web
+services such as Bitbucket to self-hosted OSS solutions such as Kallithea or
+Gitlab. This PEP proposes that we move these repositories to Github.
+
+There are two primary reasons for selecting Github: Popularity and
+Quality/Polish.
+
+Github is currently the most popular repository hosting service according to
+Alexa, where it has a global rank of 121.
+Much like for Git itself, by choosing the most popular tool we gain benefits
+in increasing the likelihood that a new contributor will have already
+experienced the toolchain, the quality and availability of the help, more and
+better tooling being built around it, and the knowledge transfer to other
+projects. A look again at the top 100 projects by download counts on PyPI shows
+the following hosting locations:
+
+====== ========= =========== ========= =========== ==========
+GitHub BitBucket Google Code Launchpad SourceForge Other/Self
+====== ========= =========== ========= =========== ==========
+62     18        6           4         3           7
+====== ========= =========== ========= =========== ==========
+
+In addition to all of those reasons, Github also has the benefit that while
+many of the options have similar features when you look at them in a feature
+matrix, the Github version of each of those features tends to work better and
+be far more polished. This is hard to quantify objectively; however, it is a
+fairly common sentiment if you go around and ask people who are using these
+services often.
+
+Finally, a reason to choose a web service at all over something that is
+self-hosted is to be able to more efficiently use volunteer time and donated
+resources. Every additional service hosted on the PSF infrastructure by the
+PSF infrastructure team further spreads out the amount of time that the
+volunteers on that team have to spend and uses some chunk of resources that
+could potentially be used for something where there is no free or affordable
+hosted solution available.
+
+One concern that people do have with using a hosted service is that there is a
+lack of control and that at some point in the future the service may no longer
+be suitable.
+It is the opinion of this PEP that Github does not currently and has not in
+the past engaged in any attempts to lock people into their platform, and that
+if at some point in the future Github is no longer suitable for one reason or
+another, then at that point we can look at migrating away from Github onto a
+different solution. In other words, we'll cross that bridge if and when we come
+to it.
+
+
+Example: Scientific Python
+--------------------------
+
+One of the key ideas behind the move to both git and Github is that a feature
+of a DVCS, the repository hosting, and the workflow used is the social network
+and size of the community using said tools. We can see this is true by looking
+at an example from a sub-community of the Python community: the Scientific
+Python community. They have already migrated most of the key pieces of the
+SciPy stack onto Github using the Pull Request based workflow, starting with
+IPython, and as more projects moved over it became a natural default for new
+projects.
+
+They claim to have seen a great benefit from this move, where it enables casual
+contributors to easily move between different projects within their
+sub-community without having to learn a special, bespoke workflow and a
+different toolchain for each project. They've found that when people can spend
+their limited time on actually contributing instead of learning different
+tools and workflows, not only do they contribute more to one project, they
+also expand out and contribute to other projects. This move is also credited
+with making it commonplace for members of that community to go so far as
+publishing their research and educational materials on Github as well.
+
+This showcases the real power behind moving to a highly popular toolchain and
+workflow, as each variance introduces yet another hurdle for new and casual
+contributors to get past and it makes the time spent learning that workflow
+less reusable with other projects.
+
+
+Migration
+=========
+
+Through the use of hg-git [#hg-git]_ we can easily convert a Mercurial
+repository to a Git repository by simply pushing the Mercurial repository to
+the Git repository. People who wish to continue to use Mercurial locally can
+then use hg-git going into the future using the new Github URL; however, they
+will need to re-clone their repositories, as using Git as the server seems to
+trigger a one-time change of the changeset ids.
+
+As none of the selected repositories have any tags, branches, or bookmarks
+other than the ``default`` branch, the migration will simply map the
+``default`` branch in Mercurial to the ``master`` branch in git.
+
+In addition, since none of the selected projects have any great need of a
+complex bug tracker, they will also migrate their issue handling to GitHub
+issues.
+
+In addition to the migration of the repository hosting itself there are a
+number of locations for each particular repository which will require updating.
+The bulk of these will simply be changing commands from the hg equivalent to
+the git equivalent.
+
+In particular this will include:
+
+* Updating www.python.org to generate PEPs using a git clone and link to
+  Github.
+* Updating docs.python.org to pull from Github instead of hg.python.org for the
+  devguide.
+* Enabling the ability to send an email to python-checkins@python.org for each
+  push.
+* Enabling the ability to send an IRC message to #python-dev on Freenode for
+  each push.
+* Migrating any issues for these projects to their respective bug tracker on
+  Github.
+
+This will restore these repositories to functionality similar to what they
+currently have. In addition to this, the migration will also include enabling
+testing for each pull request using Travis CI [#travisci]_ where possible to
+ensure that a new PR does not break the ability to render the documentation or
+PEPs.
+
+
+User Access
+===========
+
+Moving to Github would involve adding an additional user account that will need
+to be managed, however it also offers finer grained control, allowing the
+ability to grant someone access to only one particular repository instead of
+the coarser grained ACLs available on hg.python.org.
+
+
+References
+==========
+
+.. [#openhub-stats] `Open Hub Statistics `
+.. [#hg-git] `hg-git `
+.. [#travisci] `Travis CI `
+
+
+Copyright
+=========
+
+This document has been placed in the public domain.
+
+
+
+..
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End: