merge
commit
3cdb11485b
15
pep-0101.txt
|
@@ -424,11 +424,12 @@ How to Make A Release
|
||||||
that directory. Note though that if you're releasing a maintenance
|
that directory. Note though that if you're releasing a maintenance
|
||||||
release for an older version, don't change the current link.
|
release for an older version, don't change the current link.
|
||||||
|
|
||||||
___ If this is a final release (even a maintenance release), also unpack
|
___ If this is a final release (even a maintenance release), also
|
||||||
the HTML docs to /srv/docs.python.org/release/X.Y.Z on
|
unpack the HTML docs to /srv/docs.python.org/release/X.Y.Z on
|
||||||
docs.iad1.psf.io. Make sure the files are in group "docs". If it is a
|
docs.iad1.psf.io. Make sure the files are in group "docs" and are
|
||||||
release of a security-fix-only version, tell the DE to build a version
|
group-writeable. If it is a release of a security-fix-only version,
|
||||||
with the "version switcher" and put it there.
|
tell the DE to build a version with the "version switcher"
|
||||||
|
and put it there.
|
||||||
|
|
||||||
___ Let the DE check if the docs are built and work all right.
|
___ Let the DE check if the docs are built and work all right.
|
||||||
|
|
||||||
|
@@ -484,6 +485,10 @@ How to Make A Release
|
||||||
Note that the easiest thing is probably to copy fields from
|
Note that the easiest thing is probably to copy fields from
|
||||||
an existing Python release "page", editing as you go.
|
an existing Python release "page", editing as you go.
|
||||||
|
|
||||||
|
There should only be one "page" for a release (e.g. 3.5.0, 3.5.1).
|
||||||
|
Reuse the same page for all pre-releases, changing the version
|
||||||
|
number and the documentation as you go.
|
||||||
|
|
||||||
___ If this isn't the first release for a version, open the existing
|
___ If this isn't the first release for a version, open the existing
|
||||||
"page" for editing and update it to the new release. Don't save yet!
|
"page" for editing and update it to the new release. Don't save yet!
|
||||||
|
|
||||||
|
|
|
@@ -0,0 +1,951 @@
|
||||||
|
PEP: 103
|
||||||
|
Title: Collecting information about git
|
||||||
|
Version: $Revision$
|
||||||
|
Last-Modified: $Date$
|
||||||
|
Author: Oleg Broytman <phd@phdru.name>
|
||||||
|
Status: Draft
|
||||||
|
Type: Informational
|
||||||
|
Content-Type: text/x-rst
|
||||||
|
Created: 01-Jun-2015
|
||||||
|
Post-History: 12-Sep-2015
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
This Informational PEP collects information about git. There is, of
|
||||||
|
course, a lot of documentation for git, so the PEP concentrates on
|
||||||
|
more complex (and more related to Python development) issues,
|
||||||
|
scenarios and examples.
|
||||||
|
|
||||||
|
The plan is to extend the PEP in the future collecting information
|
||||||
|
about equivalence of Mercurial and git scenarios to help migrating
|
||||||
|
Python development from Mercurial to git.
|
||||||
|
|
||||||
|
The author of the PEP doesn't currently plan to write a Process PEP on
|
||||||
|
migrating Python development from Mercurial to git.
|
||||||
|
|
||||||
|
|
||||||
|
Documentation
|
||||||
|
=============
|
||||||
|
|
||||||
|
Git is accompanied by a lot of documentation, both online and
|
||||||
|
offline.
|
||||||
|
|
||||||
|
|
||||||
|
Documentation for starters
|
||||||
|
--------------------------
|
||||||
|
|
||||||
|
Git Tutorial: `part 1
|
||||||
|
<https://www.kernel.org/pub/software/scm/git/docs/gittutorial.html>`_,
|
||||||
|
`part 2
|
||||||
|
<https://www.kernel.org/pub/software/scm/git/docs/gittutorial-2.html>`_.
|
||||||
|
|
||||||
|
`Git User's manual
|
||||||
|
<https://www.kernel.org/pub/software/scm/git/docs/user-manual.html>`_.
|
||||||
|
`Everyday GIT With 20 Commands Or So
|
||||||
|
<https://www.kernel.org/pub/software/scm/git/docs/giteveryday.html>`_.
|
||||||
|
`Git workflows
|
||||||
|
<https://www.kernel.org/pub/software/scm/git/docs/gitworkflows.html>`_.
|
||||||
|
|
||||||
|
|
||||||
|
Advanced documentation
|
||||||
|
----------------------
|
||||||
|
|
||||||
|
`Git Magic
|
||||||
|
<http://www-cs-students.stanford.edu/~blynn/gitmagic/index.html>`_,
|
||||||
|
with a number of translations.
|
||||||
|
|
||||||
|
`Pro Git <https://git-scm.com/book>`_. The Book about git. Buy it at
|
||||||
|
Amazon or download in PDF, mobi, or ePub form. It has translations to
|
||||||
|
many different languages. Download Russian translation from `GArik
|
||||||
|
<https://github.com/GArik/progit/wiki>`_.
|
||||||
|
|
||||||
|
`Git Wiki <https://git.wiki.kernel.org/index.php/Main_Page>`_.
|
||||||
|
|
||||||
|
|
||||||
|
Offline documentation
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
Git has builtin help: run ``git help $TOPIC``. For example, run
|
||||||
|
``git help git`` or ``git help help``.
|
||||||
|
|
||||||
|
|
||||||
|
Quick start
|
||||||
|
===========
|
||||||
|
|
||||||
|
Download and installation
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
Unix users: `download and install using your package manager
|
||||||
|
<https://git-scm.com/download/linux>`_.
|
||||||
|
|
||||||
|
Microsoft Windows: download `git-for-windows
|
||||||
|
<https://github.com/git-for-windows/git/releases>`_ or `msysGit
|
||||||
|
<https://github.com/msysgit/msysgit/releases>`_.
|
||||||
|
|
||||||
|
MacOS X: use git installed with `XCode
|
||||||
|
<https://developer.apple.com/xcode/downloads/>`_ or download from
|
||||||
|
`MacPorts <https://www.macports.org/ports.php?by=name&substr=git>`_ or
|
||||||
|
`git-osx-installer
|
||||||
|
<http://sourceforge.net/projects/git-osx-installer/files/>`_ or
|
||||||
|
install git with `Homebrew <http://brew.sh/>`_: ``brew install git``.
|
||||||
|
|
||||||
|
`git-cola <https://git-cola.github.io/index.html>`_ is a Git GUI
|
||||||
|
written in Python and GPL licensed. Linux, Windows, MacOS X.
|
||||||
|
|
||||||
|
`TortoiseGit <https://tortoisegit.org/>`_ is a Windows Shell Interface
|
||||||
|
to Git based on TortoiseSVN; open source.
|
||||||
|
|
||||||
|
|
||||||
|
Initial configuration
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
This simple code often appears in documentation, but it is
|
||||||
|
important, so let's repeat it here. Git stores author and committer
|
||||||
|
names/emails in every commit, so configure your real name and
|
||||||
|
preferred email::
|
||||||
|
|
||||||
|
$ git config --global user.name "User Name"
|
||||||
|
$ git config --global user.email user.name@example.org
|
||||||
|
|
||||||
|
|
||||||
|
Examples in this PEP
|
||||||
|
====================
|
||||||
|
|
||||||
|
Examples of git commands in this PEP use the following approach. It is
|
||||||
|
supposed that you, the user, work with a local repository named
|
||||||
|
``python`` that has an upstream remote repo named ``origin``. Your
|
||||||
|
local repo has two branches ``v1`` and ``master``. For most examples
|
||||||
|
the currently checked out branch is ``master``. That is, it's assumed
|
||||||
|
you have done something like this::
|
||||||
|
|
||||||
|
$ git clone https://git.python.org/python.git
|
||||||
|
$ cd python
|
||||||
|
$ git branch v1 origin/v1
|
||||||
|
|
||||||
|
The first command clones the remote repository into the local directory
|
||||||
|
``python``, creates a new local branch master, sets
|
||||||
|
remotes/origin/master as its upstream remote-tracking branch and
|
||||||
|
checks it out into the working directory.
|
||||||
|
|
||||||
|
The last command creates a new local branch v1 and sets
|
||||||
|
remotes/origin/v1 as its upstream remote-tracking branch.
|
||||||
|
|
||||||
|
The same result can be achieved with commands::
|
||||||
|
|
||||||
|
$ git clone -b v1 https://git.python.org/python.git
|
||||||
|
$ cd python
|
||||||
|
$ git checkout --track origin/master
|
||||||
|
|
||||||
|
The last command creates a new local branch master, sets
|
||||||
|
remotes/origin/master as its upstream remote-tracking branch and
|
||||||
|
checks it out into the working directory.
|
||||||
|
|
||||||
|
|
||||||
|
Branches and branches
|
||||||
|
=====================
|
||||||
|
|
||||||
|
Git terminology can be a bit misleading. Take, for example, the term
|
||||||
|
"branch". In git it has two meanings. A branch is a directed line of
|
||||||
|
commits (possibly with merges). And a branch is a label or a pointer
|
||||||
|
assigned to a line of commits. It is important to distinguish when you
|
||||||
|
talk about commits and when about their labels. Lines of commits are
|
||||||
|
by themselves unnamed and are usually only lengthening and merging.
|
||||||
|
Labels, on the other hand, can be created, moved, renamed and deleted
|
||||||
|
freely.
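
The distinction can be tried out in a throwaway repository. The following is a minimal sketch, not part of any Python workflow: the scratch directory, the file ``file.txt`` and the branch names ``topic`` and ``renamed`` are assumptions made up for illustration. It creates, renames and deletes a label and checks that the commit it pointed to is unaffected:

```shell
set -eu
# Identity for the scratch commits (illustrative values).
export GIT_AUTHOR_NAME="User Name" GIT_AUTHOR_EMAIL=user.name@example.org
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"                        # scratch directory, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/master  # call the initial branch "master"

echo one > file.txt
git add file.txt
git commit -q -m "first commit"
tip=$(git rev-parse HEAD)                # the commit the labels point to

git branch topic                         # create a second label on that commit
git branch -m topic renamed              # rename the label
git branch -d renamed                    # delete the label

# None of the label operations touched the commit itself.
after=$(git rev-parse master)
labels=$(git branch | wc -l | tr -d ' ')
echo "master still points to $after; $labels label(s) left"
```

Only the pointer named ``renamed`` is gone at the end; the line of commits is exactly what it was before.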
|
||||||
|
|
||||||
|
|
||||||
|
Remote repositories and remote branches
|
||||||
|
=======================================
|
||||||
|
|
||||||
|
Remote-tracking branches are branches (pointers to commits) in your
|
||||||
|
local repository. They are there for git (and for you) to remember
|
||||||
|
what branches and commits have been pulled from and pushed to what
|
||||||
|
remote repos (you can pull from and push to many remotes).
|
||||||
|
Remote-tracking branches live under ``remotes/$REMOTE`` namespaces,
|
||||||
|
e.g. ``remotes/origin/master``.
|
||||||
|
|
||||||
|
To see the status of remote-tracking branches run::
|
||||||
|
|
||||||
|
$ git branch -rv
|
||||||
|
|
||||||
|
To see local and remote-tracking branches (and tags) pointing to
|
||||||
|
commits::
|
||||||
|
|
||||||
|
$ git log --decorate
|
||||||
|
|
||||||
|
You never do your own development on remote-tracking branches. You
|
||||||
|
create a local branch that has a remote branch as upstream and do
|
||||||
|
development on that local branch. On push git pushes commits to the
|
||||||
|
remote repo and updates remote-tracking branches, on pull git fetches
|
||||||
|
commits from the remote repo, updates remote-tracking branches and
|
||||||
|
fast-forwards, merges or rebases local branches.
|
||||||
|
|
||||||
|
When you do an initial clone like this::
|
||||||
|
|
||||||
|
$ git clone -b v1 https://git.python.org/python.git
|
||||||
|
|
||||||
|
git clones remote repository ``https://git.python.org/python.git`` to
|
||||||
|
directory ``python``, creates a remote named ``origin``, creates
|
||||||
|
remote-tracking branches, creates a local branch ``v1``, configures it
|
||||||
|
to track upstream remotes/origin/v1 branch and checks out ``v1`` into
|
||||||
|
the working directory.
|
||||||
|
|
||||||
|
|
||||||
|
Updating local and remote-tracking branches
|
||||||
|
-------------------------------------------
|
||||||
|
|
||||||
|
There is a major difference between
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
$ git fetch $REMOTE $BRANCH
|
||||||
|
|
||||||
|
and
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
$ git fetch $REMOTE $BRANCH:$BRANCH
|
||||||
|
|
||||||
|
The first command fetches commits from the named $BRANCH in the
|
||||||
|
$REMOTE repository that are not in your repository, updates
|
||||||
|
remote-tracking branch and leaves the id (the hash) of the head commit
|
||||||
|
in file .git/FETCH_HEAD.
|
||||||
|
|
||||||
|
The second command fetches commits from the named $BRANCH in the
|
||||||
|
$REMOTE repository that are not in your repository and updates both
|
||||||
|
the local branch $BRANCH and its upstream remote-tracking branch. But
|
||||||
|
it refuses to update branches in case of non-fast-forward. And it
|
||||||
|
refuses to update the current branch (currently checked out branch,
|
||||||
|
where HEAD is pointing to).
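
The difference can be observed with two local repositories. This is a sketch under stated assumptions: the directory ``upstream`` stands in for the remote repository, ``python`` is its clone, and the file name and commit messages are invented for the example. It compares where the local branch ``v1`` points after each form of the command:

```shell
set -eu
# Identity for the scratch commits (illustrative values).
export GIT_AUTHOR_NAME="User Name" GIT_AUTHOR_EMAIL=user.name@example.org
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"

# A stand-in for the upstream repository with branches master and v1.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
git branch v1
cd ..

# The local clone; master is checked out, v1 is a non-current local branch.
git clone -q upstream python
cd python
git branch v1 origin/v1
old_tip=$(git rev-parse v1)

# Somebody publishes a new commit on v1 upstream.
cd ../upstream
git checkout -q v1
echo two >> file.txt
git commit -q -am "second commit on v1"
new_tip=$(git rev-parse v1)
cd ../python

git fetch -q origin v1                   # first form
fetch_head=$(git rev-parse FETCH_HEAD)   # the fetched head commit
local_after_first=$(git rev-parse v1)    # local v1 has not moved

git fetch -q origin v1:v1                # second form
local_after_second=$(git rev-parse v1)   # local v1 was fast-forwarded

echo "FETCH_HEAD:               $fetch_head"
echo "v1 after the first form:  $local_after_first"
echo "v1 after the second form: $local_after_second"
```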
|
||||||
|
|
||||||
|
The first command is used internally by ``git pull``.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
$ git pull $REMOTE $BRANCH
|
||||||
|
|
||||||
|
is equivalent to
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
$ git fetch $REMOTE $BRANCH
|
||||||
|
$ git merge FETCH_HEAD
|
||||||
|
|
||||||
|
Certainly, $BRANCH in that case should be your current branch. If you
|
||||||
|
want to merge a different branch into your current branch first update
|
||||||
|
that non-current branch and then merge::
|
||||||
|
|
||||||
|
$ git fetch origin v1:v1 # Update v1
|
||||||
|
$ git pull --rebase origin master # Update the current branch master
|
||||||
|
# using rebase instead of merge
|
||||||
|
$ git merge v1
|
||||||
|
|
||||||
|
If you have not yet pushed commits on ``v1``, though, the scenario has
|
||||||
|
to become a bit more complex. Git refuses to update
|
||||||
|
a non-fast-forwardable branch, and you don't want to do a force-pull
|
||||||
|
because that would remove your non-pushed commits and you would need
|
||||||
|
to recover. So you want to rebase ``v1`` but you cannot rebase
|
||||||
|
a non-current branch. Hence, check out ``v1`` and rebase it before
|
||||||
|
merging::
|
||||||
|
|
||||||
|
$ git checkout v1
|
||||||
|
$ git pull --rebase origin v1
|
||||||
|
$ git checkout master
|
||||||
|
$ git pull --rebase origin master
|
||||||
|
$ git merge v1
|
||||||
|
|
||||||
|
It is possible to configure git to make it fetch/pull a few branches
|
||||||
|
or all branches at once, so you can simply run
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
$ git pull origin
|
||||||
|
|
||||||
|
or even
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
$ git pull
|
||||||
|
|
||||||
|
Default remote repository for fetching/pulling is ``origin``. Default
|
||||||
|
set of references to fetch is calculated using matching algorithm: git
|
||||||
|
fetches all branches having the same name on both ends.
|
||||||
|
|
||||||
|
|
||||||
|
Push
|
||||||
|
''''
|
||||||
|
|
||||||
|
Pushing is a bit simpler. There is only one command ``push``. When you
|
||||||
|
run
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
$ git push origin v1 master
|
||||||
|
|
||||||
|
git pushes local v1 to remote v1 and local master to remote master.
|
||||||
|
The same as::
|
||||||
|
|
||||||
|
$ git push origin v1:v1 master:master
|
||||||
|
|
||||||
|
Git pushes commits to the remote repo and updates remote-tracking
|
||||||
|
branches. Git refuses to push commits that aren't fast-forwardable.
|
||||||
|
You can force-push anyway, but please remember - you can force-push to
|
||||||
|
your own repositories but don't force-push to public or shared repos.
|
||||||
|
If you find git refuses to push commits that aren't fast-forwardable,
|
||||||
|
better fetch and merge commits from the remote repo (or rebase your
|
||||||
|
commits on top of the fetched commits), then push. Only force-push if
|
||||||
|
you know what you are doing and why. See the section `Commit
|
||||||
|
editing and caveats`_ below.
|
||||||
|
|
||||||
|
It is possible to configure git to make it push a few branches or all
|
||||||
|
branches at once, so you can simply run
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
$ git push origin
|
||||||
|
|
||||||
|
or even
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
$ git push
|
||||||
|
|
||||||
|
Default remote repository for pushing is ``origin``. Default set of
|
||||||
|
references to push in git before 2.0 is calculated using matching
|
||||||
|
algorithm: git pushes all branches having the same name on both ends.
|
||||||
|
Default set of references to push in git 2.0+ is calculated using
|
||||||
|
simple algorithm: git pushes the current branch back to its
|
||||||
|
@{upstream}.
|
||||||
|
|
||||||
|
To configure git before 2.0 to the new behaviour run::
|
||||||
|
|
||||||
|
$ git config push.default simple
|
||||||
|
|
||||||
|
To configure git 2.0+ to the old behaviour run::
|
||||||
|
|
||||||
|
$ git config push.default matching
|
||||||
|
|
||||||
|
Git doesn't allow you to push a branch if it's the current branch in the
|
||||||
|
remote non-bare repository: git refuses to update remote working
|
||||||
|
directory. You really should push only to bare repositories. For
|
||||||
|
non-bare repositories git prefers pull-based workflow.
|
||||||
|
|
||||||
|
When you want to deploy code on a remote host and can only use push
|
||||||
|
(because your workstation is behind a firewall and you cannot pull
|
||||||
|
from it) you do that in two steps using two repositories: you push
|
||||||
|
from the workstation to a bare repo on the remote host, ssh to the
|
||||||
|
remote host and pull from the bare repo to a non-bare deployment repo.
|
||||||
|
|
||||||
|
That changed in git 2.3, but see `the blog post
|
||||||
|
<https://github.com/blog/1957-git-2-3-has-been-released#push-to-deploy>`_
|
||||||
|
for caveats; in 2.4 the push-to-deploy feature was `further improved
|
||||||
|
<https://github.com/blog/1994-git-2-4-atomic-pushes-push-to-deploy-and-more#push-to-deploy-improvements>`_.
|
||||||
|
|
||||||
|
|
||||||
|
Tags
|
||||||
|
''''
|
||||||
|
|
||||||
|
Git automatically fetches tags that point to commits being fetched
|
||||||
|
during fetch/pull. To fetch all tags (and commits they point to) run
|
||||||
|
``git fetch --tags origin``. To fetch some specific tags fetch them
|
||||||
|
explicitly::
|
||||||
|
|
||||||
|
$ git fetch origin tag $TAG1 tag $TAG2...
|
||||||
|
|
||||||
|
For example::
|
||||||
|
|
||||||
|
$ git fetch origin tag 1.4.2
|
||||||
|
$ git fetch origin v1:v1 tag 2.1.7
|
||||||
|
|
||||||
|
Git doesn't automatically push tags. That allows you to have private
|
||||||
|
tags. To push tags list them explicitly::
|
||||||
|
|
||||||
|
$ git push origin tag 1.4.2
|
||||||
|
$ git push origin v1 master tag 2.1.7
|
||||||
|
|
||||||
|
Or push all tags at once::
|
||||||
|
|
||||||
|
$ git push --tags origin
|
||||||
|
|
||||||
|
Don't move tags with ``git tag -f`` or remove tags with ``git tag -d``
|
||||||
|
after they have been published.
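
That tags stay private until pushed explicitly can be checked against a scratch bare repository. This is a sketch only: ``upstream.git`` stands in for the remote, and the file name and tag name are assumptions chosen for the example:

```shell
set -eu
# Identity for the scratch commits (illustrative values).
export GIT_AUTHOR_NAME="User Name" GIT_AUTHOR_EMAIL=user.name@example.org
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"
git init -q --bare upstream.git          # stand-in for the remote repository
git clone -q upstream.git python         # git warns that the clone is empty
cd python
git symbolic-ref HEAD refs/heads/master

echo one > file.txt
git add file.txt
git commit -q -m "first commit"
git tag 1.4.2                            # a lightweight tag on that commit

git push -q origin master                # pushes the branch only
tags_before=$(git ls-remote --tags origin | wc -l | tr -d ' ')

git push -q origin tag 1.4.2             # pushes the tag explicitly
tags_after=$(git ls-remote --tags origin | wc -l | tr -d ' ')

echo "tags on the remote: $tags_before before, $tags_after after the explicit push"
```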
|
||||||
|
|
||||||
|
|
||||||
|
Private information
|
||||||
|
'''''''''''''''''''
|
||||||
|
|
||||||
|
When cloning/fetching/pulling/pushing git copies only database objects
|
||||||
|
(commits, trees, files and tags) and symbolic references (branches and
|
||||||
|
lightweight tags). Everything else is private to the repository and
|
||||||
|
never cloned, updated or pushed. It's your config, your hooks, your
|
||||||
|
private exclude file.
|
||||||
|
|
||||||
|
If you want to distribute hooks, copy them to the working tree, add,
|
||||||
|
commit, push and instruct the team to update and install the hooks
|
||||||
|
manually.
|
||||||
|
|
||||||
|
|
||||||
|
Commit editing and caveats
|
||||||
|
==========================
|
||||||
|
|
||||||
|
A warning not to edit published (pushed) commits also appears in
|
||||||
|
documentation but it's repeated here anyway as it's very important.
|
||||||
|
|
||||||
|
It is possible to recover from a forced push but it's a PITA for the
|
||||||
|
entire team. Please avoid it.
|
||||||
|
|
||||||
|
To see what commits have not been published yet compare the head of the
|
||||||
|
branch with its upstream remote-tracking branch::
|
||||||
|
|
||||||
|
$ git log origin/master.. # from origin/master to HEAD (of master)
|
||||||
|
$ git log origin/v1..v1 # from origin/v1 to the head of v1
|
||||||
|
|
||||||
|
For every branch that has an upstream remote-tracking branch git
|
||||||
|
maintains an alias @{upstream} (short version @{u}), so the commands
|
||||||
|
above can be given as::
|
||||||
|
|
||||||
|
$ git log @{u}..
|
||||||
|
$ git log v1@{u}..v1
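
As a rough illustration of the range ``@{u}..``, the sketch below counts unpublished commits in a scratch clone. The bare repository ``upstream.git``, the file name and the commit messages are assumptions made up for the example:

```shell
set -eu
# Identity for the scratch commits (illustrative values).
export GIT_AUTHOR_NAME="User Name" GIT_AUTHOR_EMAIL=user.name@example.org
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"
git init -q --bare upstream.git          # stand-in for the remote repository
git clone -q upstream.git python         # git warns that the clone is empty
cd python
git symbolic-ref HEAD refs/heads/master

echo one > file.txt
git add file.txt
git commit -q -m "published commit"
git push -q -u origin master             # -u records origin/master as upstream

echo two >> file.txt
git commit -q -am "first unpublished commit"
echo three >> file.txt
git commit -q -am "second unpublished commit"

# Commits reachable from HEAD but not from the upstream branch.
unpublished=$(git rev-list --count '@{u}..')
git log --oneline '@{u}..'
echo "$unpublished commit(s) not pushed yet"
```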
|
||||||
|
|
||||||
|
To see the status of all branches::
|
||||||
|
|
||||||
|
$ git branch -avv
|
||||||
|
|
||||||
|
To compare the status of local branches with a remote repo::
|
||||||
|
|
||||||
|
$ git remote show origin
|
||||||
|
|
||||||
|
Read `how to recover from upstream rebase
|
||||||
|
<https://git-scm.com/docs/git-rebase#_recovering_from_upstream_rebase>`_.
|
||||||
|
It is in ``git help rebase``.
|
||||||
|
|
||||||
|
On the other hand don't be too afraid about commit editing. You can
|
||||||
|
safely edit, reorder, remove, combine and split commits that haven't
|
||||||
|
been pushed yet. You can even push commits to your own (backup) repo,
|
||||||
|
edit them later and force-push edited commits to replace what has
|
||||||
|
already been pushed. Not a problem until commits are in a public
|
||||||
|
or shared repository.
|
||||||
|
|
||||||
|
|
||||||
|
Undo
|
||||||
|
====
|
||||||
|
|
||||||
|
Whatever you do, don't panic. Almost anything in git can be undone.
|
||||||
|
|
||||||
|
|
||||||
|
git checkout: restore file's content
|
||||||
|
------------------------------------
|
||||||
|
|
||||||
|
``git checkout``, for example, can be used to restore the content of
|
||||||
|
file(s) to that of a commit. Like this::
|
||||||
|
|
||||||
|
$ git checkout HEAD~ README
|
||||||
|
|
||||||
|
The command restores the contents of the README file to the last but one
|
||||||
|
commit in the current branch. By default the commit ID is simply HEAD;
|
||||||
|
i.e. ``git checkout README`` restores README to the latest commit.
|
||||||
|
|
||||||
|
(Do not use ``git checkout`` to view the content of a file in a commit,
|
||||||
|
use ``git cat-file -p``; e.g. ``git cat-file -p HEAD~:path/to/README``).
|
||||||
|
|
||||||
|
|
||||||
|
git reset: remove (non-pushed) commits
|
||||||
|
--------------------------------------
|
||||||
|
|
||||||
|
``git reset`` moves the head of the current branch. The head can be
|
||||||
|
moved to point to any commit but it's often used to remove a commit or
|
||||||
|
a few (preferably, non-pushed ones) from the top of the branch - that
|
||||||
|
is, to move the branch backward in order to undo a few (non-pushed)
|
||||||
|
commits.
|
||||||
|
|
||||||
|
``git reset`` has three modes of operation - soft, hard and mixed.
|
||||||
|
Default is mixed. ProGit `explains
|
||||||
|
<https://git-scm.com/book/en/Git-Tools-Reset-Demystified>`_ the
|
||||||
|
difference very clearly. Bare repositories don't have indices or
|
||||||
|
working trees so in a bare repo only soft reset is possible.
|
||||||
|
|
||||||
|
|
||||||
|
Unstaging
|
||||||
|
'''''''''
|
||||||
|
|
||||||
|
Mixed mode reset with a path or paths can be used to unstage changes -
|
||||||
|
that is, to remove from index changes added with ``git add`` for
|
||||||
|
committing. See `The Book
|
||||||
|
<https://git-scm.com/book/en/Git-Basics-Undoing-Things>`_ for details
|
||||||
|
about unstaging and other undo tricks.
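
A minimal sketch of unstaging in a scratch repository follows; the file name and its contents are assumptions made up for the example. The change is removed from the index but stays in the working tree:

```shell
set -eu
# Identity for the scratch commits (illustrative values).
export GIT_AUTHOR_NAME="User Name" GIT_AUTHOR_EMAIL=user.name@example.org
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo one > file.txt
git add file.txt
git commit -q -m "first commit"

echo two >> file.txt
git add file.txt                          # stage the change
staged_before=$(git diff --cached --name-only)

git reset -q -- file.txt                  # mixed reset with a path: unstage it
staged_after=$(git diff --cached --name-only | wc -l | tr -d ' ')
modified=$(git diff --name-only)          # the edit is still in the working tree

echo "staged before: $staged_before; staged after: $staged_after; modified: $modified"
```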
|
||||||
|
|
||||||
|
|
||||||
|
git reflog: reference log
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
Removing commits with ``git reset`` or moving the head of a branch
|
||||||
|
sounds dangerous and it is. But there is a way to undo: another
|
||||||
|
reset back to the original commit. Git doesn't remove commits
|
||||||
|
immediately; unreferenced commits (in git terminology they are called
|
||||||
|
"dangling commits") stay in the database for some time (default is two
|
||||||
|
weeks) so you can reset back to it or create a new branch pointing to
|
||||||
|
the original commit.
|
||||||
|
|
||||||
|
For every move of a branch's head - with ``git commit``, ``git
|
||||||
|
checkout``, ``git fetch``, ``git pull``, ``git rebase``, ``git reset``
|
||||||
|
and so on - git stores a reference log (reflog for short). For every
|
||||||
|
move git stores where the head was. Command ``git reflog`` can be used
|
||||||
|
to view (and manipulate) the log.
|
||||||
|
|
||||||
|
In addition to the moves of the head of every branch git stores the
|
||||||
|
moves of the HEAD - a symbolic reference that (usually) names the
|
||||||
|
current branch. HEAD is changed with ``git checkout $BRANCH``.
|
||||||
|
|
||||||
|
By default ``git reflog`` shows the moves of the HEAD, i.e. the
|
||||||
|
command is equivalent to ``git reflog HEAD``. To show the moves of the
|
||||||
|
head of a branch use the command ``git reflog $BRANCH``.
|
||||||
|
|
||||||
|
So to undo a ``git reset`` look up the original commit in ``git
|
||||||
|
reflog``, verify it with ``git show`` or ``git log`` and run ``git
|
||||||
|
reset $COMMIT_ID``. Git stores the move of the branch's head in
|
||||||
|
reflog, so you can undo that undo later again.
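
The recovery can be rehearsed safely in a scratch repository. In this sketch the file name and commit messages are invented, and ``--hard`` is an assumption of the example (it makes the working tree follow the branch head); the commit is looked up as ``HEAD@{1}``, the position of HEAD before the last move:

```shell
set -eu
# Identity for the scratch commits (illustrative values).
export GIT_AUTHOR_NAME="User Name" GIT_AUTHOR_EMAIL=user.name@example.org
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
echo two >> file.txt
git commit -q -am "second commit"
original=$(git rev-parse HEAD)

git reset -q --hard HEAD~                 # "remove" the second commit
after_reset=$(git rev-parse HEAD)

git reflog -n 2                           # the reset is recorded in the reflog
recovered=$(git rev-parse 'HEAD@{1}')     # where HEAD was before the reset
git show --no-patch --oneline "$recovered"  # verify it is the right commit

git reset -q --hard "$recovered"          # undo the reset
restored=$(git rev-parse HEAD)
echo "restored $restored"
```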
|
||||||
|
|
||||||
|
In a more complex situation you'd want to move some commits along with
|
||||||
|
resetting the head of the branch. Cherry-pick them to the new branch.
|
||||||
|
For example, if you want to reset the branch ``master`` back to the
|
||||||
|
original commit but preserve two commits created in the current branch
|
||||||
|
do something like::
|
||||||
|
|
||||||
|
$ git branch save-master # create a new branch saving master
|
||||||
|
$ git reflog # find the original place of master
|
||||||
|
$ git reset $COMMIT_ID
|
||||||
|
$ git cherry-pick save-master~ save-master
|
||||||
|
$ git branch -D save-master # remove temporary branch
|
||||||
|
|
||||||
|
|
||||||
|
git revert: revert a commit
|
||||||
|
---------------------------
|
||||||
|
|
||||||
|
``git revert`` reverts a commit or commits, that is, it creates a new
|
||||||
|
commit or commits that revert(s) the effects of the given commits.
|
||||||
|
It's the only way to undo published commits (``git commit --amend``,
|
||||||
|
``git rebase`` and ``git reset`` change the branch in
|
||||||
|
non-fast-forwardable ways so they should only be used for non-pushed
|
||||||
|
commits.)
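
A small sketch of the effect, run in a scratch repository with made-up file contents and commit messages: the history grows by one commit while the file returns to its earlier state.

```shell
set -eu
# Identity for the scratch commits (illustrative values).
export GIT_AUTHOR_NAME="User Name" GIT_AUTHOR_EMAIL=user.name@example.org
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
echo two >> file.txt
git commit -q -am "second commit"

git revert --no-edit HEAD                 # new commit undoing "second commit"

count=$(git rev-list --count HEAD)        # history only grows
content=$(cat file.txt)                   # content is back to the first commit
git log --oneline
echo "commits: $count; file.txt contains: $content"
```

Because nothing is rewritten, the branch stays fast-forwardable and can be pushed normally.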
|
||||||
|
|
||||||
|
There is a problem with reverting a merge commit. ``git revert`` can
|
||||||
|
undo the code created by the merge commit but it cannot undo the fact
|
||||||
|
of merge. See the discussion `How to revert a faulty merge
|
||||||
|
<https://www.kernel.org/pub/software/scm/git/docs/howto/revert-a-faulty-merge.html>`_.
|
||||||
|
|
||||||
|
|
||||||
|
One thing that cannot be undone
|
||||||
|
-------------------------------
|
||||||
|
|
||||||
|
Whatever you undo, there is one thing that cannot be undone -
|
||||||
|
overwritten uncommitted changes. Uncommitted changes don't belong to
|
||||||
|
git so git cannot help preserve them.
|
||||||
|
|
||||||
|
Most of the time git warns you when you're going to execute a command
|
||||||
|
that overwrites uncommitted changes. Git doesn't allow you to switch
|
||||||
|
branches with ``git checkout`` if that would overwrite them. It stops you when you're going to
|
||||||
|
rebase with non-clean working tree. It refuses to pull new commits
|
||||||
|
over non-committed files.
|
||||||
|
|
||||||
|
But there are commands that do exactly that - overwrite files in the
|
||||||
|
working tree. Commands like ``git checkout $PATHs`` or ``git reset
|
||||||
|
--hard`` silently overwrite files including your uncommitted changes.
|
||||||
|
|
||||||
|
With that in mind you can understand the stance "commit early, commit
|
||||||
|
often". Commit as often as possible. Commit on every save in your
|
||||||
|
editor or IDE. You can edit your commits before pushing - edit commit
|
||||||
|
messages, change commits, reorder, combine, split, remove. But save
|
||||||
|
your changes in git database, either commit changes or at least stash
|
||||||
|
them with ``git stash``.
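
As an illustration of the last point, the sketch below stashes an uncommitted edit and brings it back. The scratch repository and file name are assumptions of the example:

```shell
set -eu
# Identity for the scratch commits (illustrative values).
export GIT_AUTHOR_NAME="User Name" GIT_AUTHOR_EMAIL=user.name@example.org
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo one > file.txt
git add file.txt
git commit -q -m "first commit"

echo "work in progress" >> file.txt       # an uncommitted change
dirty_before=$(git status --porcelain | wc -l | tr -d ' ')

git stash                                 # save the change in the git database
dirty_stashed=$(git status --porcelain | wc -l | tr -d ' ')

git stash pop                             # bring the change back
dirty_after=$(git status --porcelain | wc -l | tr -d ' ')

echo "modified files: $dirty_before before, $dirty_stashed stashed, $dirty_after after pop"
```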
|
||||||
|
|
||||||
|
|
||||||
|
Merge or rebase?
|
||||||
|
================
|
||||||
|
|
||||||
|
The Internet is full of heated discussions on the topic: "merge or
|
||||||
|
rebase?" Most of them are meaningless. When a DVCS is being used in a
|
||||||
|
big team with a big and complex project with many branches there is
|
||||||
|
simply no way to avoid merges. So the question's diminished to
|
||||||
|
"whether to use rebase, and if yes - when to use rebase?" Considering
|
||||||
|
that it is very much recommended not to rebase published commits the
|
||||||
|
question's diminished even further: "whether to use rebase on
|
||||||
|
non-pushed commits?"
|
||||||
|
|
||||||
|
That small question is for the team to decide. The author of the PEP
|
||||||
|
recommends using rebase when pulling, i.e. always do ``git pull
|
||||||
|
--rebase`` or even configure automatic setup of rebase for every new
|
||||||
|
branch::
|
||||||
|
|
||||||
|
$ git config branch.autosetuprebase always
|
||||||
|
|
||||||
|
and configure rebase for existing branches::
|
||||||
|
|
||||||
|
$ git config branch.$NAME.rebase true
|
||||||
|
|
||||||
|
For example::
|
||||||
|
|
||||||
|
$ git config branch.v1.rebase true
|
||||||
|
$ git config branch.master.rebase true
|
||||||
|
|
||||||
|
After that ``git pull origin master`` becomes equivalent to ``git pull
|
||||||
|
--rebase origin master``.
|
||||||
|
|
||||||
|
It is recommended to create new commits in a separate feature or topic
|
||||||
|
branch while using rebase to update the mainline branch. When the
|
||||||
|
topic branch is ready merge it into mainline. To avoid a tedious task
|
||||||
|
of resolving a large number of conflicts at once you can merge the topic
|
||||||
|
branch to the mainline from time to time and switch back to the topic
|
||||||
|
branch to continue working on it. The entire workflow would be
|
||||||
|
something like::
|
||||||
|
|
||||||
|
$ git checkout -b issue-42 # create a new issue branch and switch to it
|
||||||
|
...edit/test/commit...
|
||||||
|
$ git checkout master
|
||||||
|
$ git pull --rebase origin master # update master from the upstream
|
||||||
|
$ git merge issue-42
|
||||||
|
$ git branch -d issue-42 # delete the topic branch
|
||||||
|
$ git push origin master
|
||||||
|
|
||||||
|
When the topic branch is deleted only the label is removed, commits
|
||||||
|
stay in the database; they are now merged into master::
|
||||||
|
|
||||||
|
o--o--o--o--o--M--< master - the mainline branch
|
||||||
|
    \         /
|
||||||
|
     --*--*--* - the topic branch, now unnamed
|
||||||
|
|
||||||
|
The topic branch is deleted to avoid cluttering branch namespace with
|
||||||
|
small topic branches. Information on what issue was fixed or what
|
||||||
|
feature was implemented should be in the commit messages.
|
||||||
|
|
||||||
|
|
||||||
|
Null-merges
|
||||||
|
===========
|
||||||
|
|
||||||
|
Git has a builtin merge strategy for what Python core developers call
|
||||||
|
"null-merge"::
|
||||||
|
|
||||||
|
$ git merge -s ours v1 # null-merge v1 into master
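
What the strategy does can be checked in a scratch repository. In this sketch the file names and commit messages are invented for the example; it verifies that the merge records ``v1`` as merged while leaving the tree of ``master`` unchanged:

```shell
set -eu
# Identity for the scratch commits (illustrative values).
export GIT_AUTHOR_NAME="User Name" GIT_AUTHOR_EMAIL=user.name@example.org
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > file.txt
git add file.txt
git commit -q -m "base"

git branch v1
git checkout -q v1
echo "fix for v1 only" > fix.txt
git add fix.txt
git commit -q -m "v1-only fix"            # must not leak into master

git checkout -q master
echo more >> file.txt
git commit -q -am "work on master"

tree_before=$(git rev-parse 'HEAD^{tree}')
git merge -q -s ours -m "null-merge v1 into master" v1
tree_after=$(git rev-parse 'HEAD^{tree}')

# v1 now counts as merged although none of its changes were taken.
if git merge-base --is-ancestor v1 master; then v1_merged=yes; else v1_merged=no; fi
echo "tree unchanged: $tree_before = $tree_after; v1 merged: $v1_merged"
```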
|
||||||
|
|
||||||
|
|
||||||
|
Branching models
|
||||||
|
================
|
||||||
|
|
||||||
|
Git doesn't assume any particular development model regarding
|
||||||
|
branching and merging. Some projects prefer to graduate patches from
|
||||||
|
the oldest branch to the newest, some prefer to cherry-pick commits
|
||||||
|
backwards, some use squashing (combining a number of commits into
|
||||||
|
one). Anything is possible.
|
||||||
|
|
||||||
|
There are a few examples to start with. `git help workflows
|
||||||
|
<https://www.kernel.org/pub/software/scm/git/docs/gitworkflows.html>`_
|
||||||
|
describes how the git authors themselves develop git.
|
||||||
|
|
||||||
|
ProGit book has a few chapters devoted to branch management in
|
||||||
|
different projects: `Git Branching - Branching Workflows
|
||||||
|
<https://git-scm.com/book/en/Git-Branching-Branching-Workflows>`_ and
|
||||||
|
`Distributed Git - Contributing to a Project
|
||||||
|
<https://git-scm.com/book/en/Distributed-Git-Contributing-to-a-Project>`_.
|
||||||
|
|
||||||
|
There is also a well-known article `A successful Git branching model
|
||||||
|
<http://nvie.com/posts/a-successful-git-branching-model/>`_ by Vincent
|
||||||
|
Driessen. It recommends a set of very detailed rules on creating and
|
||||||
|
managing mainline, topic and bugfix branches. To support the model the
|
||||||
|
author implemented `git flow <https://github.com/nvie/gitflow>`_
|
||||||
|
extension.
|
||||||
|
|
||||||
|
|
||||||
|
Advanced configuration
|
||||||
|
======================
|
||||||
|
|
||||||
|
Line endings
|
||||||
|
------------
|
||||||
|
|
||||||
|
Git has builtin mechanisms to handle line endings between platforms
|
||||||
|
with different end-of-line styles. To allow git to do CRLF conversion
|
||||||
|
assign ``text`` attribute to files using `.gitattributes
|
||||||
|
<https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html>`_.
|
||||||
|
For files that must have specific line endings assign the ``eol``
|
||||||
|
attribute. For binary files the attribute is, naturally, ``binary``.
|
||||||
|
|
||||||
|
For example::
|
||||||
|
|
||||||
|
$ cat .gitattributes
|
||||||
|
*.py text
|
||||||
|
*.txt text
|
||||||
|
*.png binary
|
||||||
|
/readme.txt eol=crlf
|
||||||
|
|
||||||
|
To check what attributes git uses for files use the ``git check-attr``
|
||||||
|
command. For example::
|
||||||
|
|
||||||
|
$ git check-attr -a -- \*.py
|
||||||
|
|
||||||
|
|
||||||
|
Advanced topics
|
||||||
|
===============
|
||||||
|
|
||||||
|
Staging area
|
||||||
|
------------
|
||||||
|
|
||||||
|
The staging area (aka index, aka cache) is a distinguishing feature of git.
|
||||||
|
The staging area is where git collects patches before committing them.
|
||||||
|
Separating the patch-collecting and commit phases provides a
|
||||||
|
very useful feature of git: you can review collected patches before
|
||||||
|
commit and even edit them - remove some hunks, add new hunks and
|
||||||
|
review again.
|
||||||
|
|
||||||
|
To add files to the index use ``git add``. Collecting patches before
|
||||||
|
committing means you need to do that for every change, not only to add
|
||||||
|
new (untracked) files. To simplify committing in case you just want to
|
||||||
|
commit everything without reviewing run ``git commit --all`` (or just
|
||||||
|
``-a``) - the command adds every changed tracked file to the index and
|
||||||
|
then commits. To commit a file or files regardless of patches collected
|
||||||
|
in the index run ``git commit [--only|-o] -- $FILE...``.
|
||||||
|
|
||||||
|
To add hunks of patches to the index use ``git add --patch`` (or just
|
||||||
|
``-p``). To remove collected files from the index use ``git reset HEAD
|
||||||
|
-- $FILE...``. To add/inspect/remove collected hunks use ``git add
|
||||||
|
--interactive`` (``-i``).
|
||||||
|
|
||||||
|
To see the diff between the index and the last commit (i.e., collected
|
||||||
|
patches) use ``git diff --cached``. To see the diff between the
|
||||||
|
working tree and the index (i.e., uncollected patches) use just ``git
|
||||||
|
diff``. To see the diff between the working tree and the last commit
|
||||||
|
(i.e., both collected and uncollected patches) run ``git diff HEAD``.
|
||||||
|
|
||||||
|
See `WhatIsTheIndex
|
||||||
|
<https://git.wiki.kernel.org/index.php/WhatIsTheIndex>`_ and
|
||||||
|
`IndexCommandQuickref
|
||||||
|
<https://git.wiki.kernel.org/index.php/IndexCommandQuickref>`_ in Git
|
||||||
|
Wiki.
|
||||||
|
|
||||||
|
|
||||||
|
ReReRe
|
||||||
|
======
|
||||||
|
|
||||||
|
Rerere is a mechanism that helps to resolve repeated merge conflicts.
|
||||||
|
The most frequent source of recurring merge conflicts is topic
|
||||||
|
branches that are merged into mainline and then the merge commits are
|
||||||
|
removed; that's often performed to test the topic branches and train
|
||||||
|
rerere; merge commits are removed to have clean linear history and
|
||||||
|
finish the topic branch with only one last merge commit.
|
||||||
|
|
||||||
|
Rerere works by remembering the states of the tree before and after a
|
||||||
|
successful commit. That way rerere can automatically resolve conflicts
|
||||||
|
if they appear in the same files.
|
||||||
|
|
||||||
|
Rerere can be used manually with the ``git rerere`` command but most often
|
||||||
|
it's used automatically. Enable rerere with these commands in a
|
||||||
|
working tree::
|
||||||
|
|
||||||
|
$ git config rerere.enabled true
|
||||||
|
$ git config rerere.autoupdate true
|
||||||
|
|
||||||
|
You don't need to turn rerere on globally - you don't want rerere in
|
||||||
|
bare repositories or single-branch repositories; you only need rerere
|
||||||
|
in repos where you often perform merges and resolve merge conflicts.
|
||||||
|
|
||||||
|
See `Rerere <https://git-scm.com/book/en/Git-Tools-Rerere>`_ in The
|
||||||
|
Book.
|
||||||
|
|
||||||
|
|
||||||
|
Database maintenance
|
||||||
|
====================
|
||||||
|
|
||||||
|
Git object database and other files/directories under ``.git`` require
|
||||||
|
periodic maintenance and cleanup. For example, commit editing leaves
|
||||||
|
unreferenced objects (dangling objects, in git terminology) and these
|
||||||
|
objects should be pruned to avoid collecting cruft in the DB. The
|
||||||
|
command ``git gc`` is used for maintenance. Git automatically runs
|
||||||
|
``git gc --auto`` as a part of some commands to do quick maintenance.
|
||||||
|
Users are recommended to run ``git gc --aggressive`` from time to
|
||||||
|
time; ``git help gc`` recommends running it every few hundred
|
||||||
|
changesets; for more intensive projects it should be something like
|
||||||
|
once a week and less frequently (biweekly or monthly) for less
|
||||||
|
active projects.
|
||||||
|
|
||||||
|
``git gc --aggressive`` not only removes dangling objects, it also
|
||||||
|
repacks the object database into indexed and better optimized pack(s); it
|
||||||
|
also packs symbolic references (branches and tags). Another way to do
|
||||||
|
it is to run ``git repack``.
|
||||||
|
|
||||||
|
There is a well-known `message
|
||||||
|
<https://gcc.gnu.org/ml/gcc/2007-12/msg00165.html>`_ from Linus
|
||||||
|
Torvalds regarding "stupidity" of ``git gc --aggressive``. The message
|
||||||
|
can safely be ignored now. It is old and outdated; ``git gc
|
||||||
|
--aggressive`` has become much better since that time.
|
||||||
|
|
||||||
|
For those who still prefer ``git repack`` over ``git gc --aggressive``
|
||||||
|
the recommended parameters are ``git repack -a -d -f --depth=20
|
||||||
|
--window=250``. See `this detailed experiment
|
||||||
|
<http://vcscompare.blogspot.ru/2008/06/git-repack-parameters.html>`_
|
||||||
|
for an explanation of the effects of these parameters.
|
||||||
|
|
||||||
|
From time to time run ``git fsck [--strict]`` to verify integrity of
|
||||||
|
the database. ``git fsck`` may produce a list of dangling objects;
|
||||||
|
that's not an error, just a reminder to perform regular maintenance.
|
||||||
|
|
||||||
|
|
||||||
|
Tips and tricks
|
||||||
|
===============
|
||||||
|
|
||||||
|
Command-line options and arguments
|
||||||
|
----------------------------------
|
||||||
|
|
||||||
|
`git help cli
|
||||||
|
<https://www.kernel.org/pub/software/scm/git/docs/gitcli.html>`_
|
||||||
|
recommends not combining short options/flags. Most of the time
|
||||||
|
combining works: ``git commit -av`` works perfectly, but there are
|
||||||
|
situations when it doesn't. E.g., ``git log -p -5`` cannot be combined
|
||||||
|
as ``git log -p5``.
|
||||||
|
|
||||||
|
Some options have arguments, some even have default arguments. In that
|
||||||
|
case the argument for such an option must be spelled in a sticky way:
|
||||||
|
``-Oarg``, never ``-O arg`` because for an option that has a default
|
||||||
|
argument the latter means "use default value for option ``-O`` and
|
||||||
|
pass ``arg`` further to the option parser". For example, ``git grep``
|
||||||
|
has an option ``-O`` that passes a list of names of the found files to
|
||||||
|
a program; the default program for ``-O`` is a pager (usually ``less``),
|
||||||
|
but you can use your editor::
|
||||||
|
|
||||||
|
$ git grep -Ovim # but not -O vim
|
||||||
|
|
||||||
|
BTW, if git is instructed to use ``less`` as the pager (i.e., if pager
|
||||||
|
is not configured in git at all it uses ``less`` by default, or if it
|
||||||
|
gets ``less`` from GIT_PAGER or PAGER environment variables, or if it
|
||||||
|
was configured with ``git config --global core.pager less``, or
|
||||||
|
``less`` is used in the command ``git grep -Oless``) ``git grep``
|
||||||
|
passes ``+/$pattern`` option to ``less`` which is quite convenient.
|
||||||
|
Unfortunately, ``git grep`` doesn't pass the pattern if the pager is
|
||||||
|
not exactly ``less``, even if it's ``less`` with parameters (something
|
||||||
|
like ``git config --global core.pager 'less -FRSXgimq'``); fortunately,
|
||||||
|
``git grep -Oless`` always passes the pattern.
|
||||||
|
|
||||||
|
|
||||||
|
bash/zsh completion
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
It's a bit hard to type ``git rebase --interactive --preserve-merges
|
||||||
|
HEAD~5`` manually even for those who are happy to use the command-line,
|
||||||
|
and this is where shell completion is of great help. Bash/zsh come
|
||||||
|
with programmable completion, often automatically installed and
|
||||||
|
enabled, so if you have bash/zsh and git installed, chances are you
|
||||||
|
are already done - just go and use it at the command-line.
|
||||||
|
|
||||||
|
If you don't have the necessary bits installed, install and enable the
|
||||||
|
bash_completion package. If you want to upgrade your git completion to
|
||||||
|
the latest and greatest, download the necessary file from `git contrib
|
||||||
|
<https://git.kernel.org/cgit/git/git.git/tree/contrib/completion>`_.
|
||||||
|
|
||||||
|
Git-for-windows comes with git-bash for which bash completion is
|
||||||
|
installed and enabled.
|
||||||
|
|
||||||
|
|
||||||
|
bash/zsh prompt
|
||||||
|
---------------
|
||||||
|
|
||||||
|
For command-line lovers the shell prompt can carry a lot of useful
|
||||||
|
information. To include git information in the prompt use
|
||||||
|
`git-prompt.sh
|
||||||
|
<https://git.kernel.org/cgit/git/git.git/tree/contrib/completion/git-prompt.sh>`_.
|
||||||
|
Read the detailed instructions in the file.
|
||||||
|
|
||||||
|
Search the Net for "git prompt" to find other prompt variants.
|
||||||
|
|
||||||
|
|
||||||
|
git on server
|
||||||
|
=============
|
||||||
|
|
||||||
|
The simplest way to publish a repository or a group of repositories is
|
||||||
|
``git daemon``. The daemon provides anonymous access; by default it is
|
||||||
|
read-only. The repositories are accessible by git protocol (git://
|
||||||
|
URLs). Write access can be enabled but the protocol lacks any
|
||||||
|
authentication means, so it should be enabled only within a trusted
|
||||||
|
LAN. See ``git help daemon`` for details.
|
||||||
|
|
||||||
|
Git over ssh provides authentication and repo-level authorisation as
|
||||||
|
repositories can be made user- or group-writeable (see parameter
|
||||||
|
``core.sharedRepository`` in ``git help config``). If that's too
|
||||||
|
permissive or too restrictive for some project's needs there is a
|
||||||
|
wrapper `gitolite <http://gitolite.com/gitolite/index.html>`_ that can
|
||||||
|
be configured to allow access with great granularity; gitolite is
|
||||||
|
written in Perl and has a lot of documentation.
|
||||||
|
|
||||||
|
A web interface to browse repositories can be created using `gitweb
|
||||||
|
<https://git.kernel.org/cgit/git/git.git/tree/gitweb>`_ or `cgit
|
||||||
|
<http://git.zx2c4.com/cgit/about/>`_. Both are CGI scripts (written in
|
||||||
|
Perl and C, respectively). In addition to the web interface both provide read-only dumb
|
||||||
|
http access for git (http(s):// URLs).
|
||||||
|
|
||||||
|
There are also more advanced web-based development environments that
|
||||||
|
include the ability to manage users, groups and projects; private,
|
||||||
|
group-accessible and public repositories; they often include issue
|
||||||
|
trackers, wiki pages, pull requests and other tools for development
|
||||||
|
and communication. Among these environments are `Kallithea
|
||||||
|
<https://kallithea-scm.org/>`_ and `pagure <https://pagure.io/>`_,
|
||||||
|
both are written in Python; pagure was written by Fedora developers
|
||||||
|
and is being used to develop some Fedora projects. `Gogs
|
||||||
|
<http://gogs.io/>`_ is written in Go; there is a fork `Gitea
|
||||||
|
<http://gitea.io/>`_.
|
||||||
|
|
||||||
|
And last but not least, `Gitlab <https://about.gitlab.com/>`_. It's
|
||||||
|
perhaps the most advanced web-based development environment for git.
|
||||||
|
It is written in Ruby; the community edition is free and open source (MIT
|
||||||
|
license).
|
||||||
|
|
||||||
|
|
||||||
|
From Mercurial to git
|
||||||
|
=====================
|
||||||
|
|
||||||
|
There are many tools to convert Mercurial repositories to git. The
|
||||||
|
most famous are, probably, `hg-git <https://hg-git.github.io/>`_ and
|
||||||
|
`fast-export <http://repo.or.cz/w/fast-export.git>`_ (many years ago
|
||||||
|
it was known under the name ``hg2git``).
|
||||||
|
|
||||||
|
But a better tool, perhaps the best, is `git-remote-hg
|
||||||
|
<https://github.com/felipec/git-remote-hg>`_. It provides transparent
|
||||||
|
bidirectional (pull and push) access to Mercurial repositories from
|
||||||
|
git. Its author wrote a `comparison of alternatives
|
||||||
|
<https://github.com/felipec/git/wiki/Comparison-of-git-remote-hg-alternatives>`_
|
||||||
|
that seems to be mostly objective.
|
||||||
|
|
||||||
|
To use git-remote-hg, install or clone it, add to your PATH (or copy
|
||||||
|
script ``git-remote-hg`` to a directory that's already in PATH) and
|
||||||
|
prepend ``hg::`` to Mercurial URLs. For example::
|
||||||
|
|
||||||
|
$ git clone https://github.com/felipec/git-remote-hg.git
|
||||||
|
$ PATH=$PATH:"`pwd`"/git-remote-hg
|
||||||
|
$ git clone hg::https://hg.python.org/peps/ PEPs
|
||||||
|
|
||||||
|
To work with the repository just use regular git commands including
|
||||||
|
``git fetch/pull/push``.
|
||||||
|
|
||||||
|
To start converting your Mercurial habits to git see the page
|
||||||
|
`Mercurial for Git users
|
||||||
|
<https://mercurial.selenic.com/wiki/GitConcepts>`_ at the Mercurial wiki.
|
||||||
|
In the second half of the page there is a table that lists
|
||||||
|
corresponding Mercurial and git commands. Should work perfectly in
|
||||||
|
both directions.
|
||||||
|
|
||||||
|
The Python Developer's Guide also has a chapter `Mercurial for git
|
||||||
|
developers <https://docs.python.org/devguide/gitdevs.html>`_ that
|
||||||
|
documents a few differences between git and hg.
|
||||||
|
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
..
|
||||||
|
Local Variables:
|
||||||
|
mode: indented-text
|
||||||
|
indent-tabs-mode: nil
|
||||||
|
sentence-end-double-space: t
|
||||||
|
fill-column: 70
|
||||||
|
coding: utf-8
|
||||||
|
End:
|
||||||
|
vim: set fenc=us-ascii tw=70 :
|
|
@ -66,7 +66,7 @@ Features for 3.5
|
||||||
* PEP 479, change StopIteration handling inside generators
|
* PEP 479, change StopIteration handling inside generators
|
||||||
* PEP 484, the typing module, a new standard for type annotations
|
* PEP 484, the typing module, a new standard for type annotations
|
||||||
* PEP 485, math.isclose(), a function for testing approximate equality
|
* PEP 485, math.isclose(), a function for testing approximate equality
|
||||||
* PEP 486, making the Widnows Python launcher aware of virtual environments
|
* PEP 486, making the Windows Python launcher aware of virtual environments
|
||||||
* PEP 488, eliminating .pyo files
|
* PEP 488, eliminating .pyo files
|
||||||
* PEP 489, a new and improved mechanism for loading extension modules
|
* PEP 489, a new and improved mechanism for loading extension modules
|
||||||
* PEP 492, coroutines with async and await syntax
|
* PEP 492, coroutines with async and await syntax
|
||||||
|
|
13
pep-0495.txt
|
@ -404,6 +404,19 @@ where ``delta`` is the size of the fold or the gap.
|
||||||
Temporal Arithmetic and Comparison Operators
|
Temporal Arithmetic and Comparison Operators
|
||||||
============================================
|
============================================
|
||||||
|
|
||||||
|
.. epigraph::
|
||||||
|
|
||||||
|
| In *mathematicks* he was greater
|
||||||
|
| Than Tycho Brahe, or Erra Pater:
|
||||||
|
| For he, by geometric scale,
|
||||||
|
| Could take the size of pots of ale;
|
||||||
|
| Resolve, by sines and tangents straight,
|
||||||
|
| If bread or butter wanted weight,
|
||||||
|
| And wisely tell what hour o' th' day
|
||||||
|
| The clock does strike by algebra.
|
||||||
|
|
||||||
|
-- "Hudibras" by Samuel Butler
|
||||||
|
|
||||||
The value of the ``fold`` attribute will be ignored in all operations
|
The value of the ``fold`` attribute will be ignored in all operations
|
||||||
with naive datetime instances. As a consequence, naive
|
with naive datetime instances. As a consequence, naive
|
||||||
``datetime.datetime`` or ``datetime.time`` instances that differ only
|
``datetime.datetime`` or ``datetime.time`` instances that differ only
|
||||||
|
|
19
pep-0498.txt
|
@ -8,7 +8,7 @@ Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
Created: 01-Aug-2015
|
Created: 01-Aug-2015
|
||||||
Python-Version: 3.6
|
Python-Version: 3.6
|
||||||
Post-History: 07-Aug-2015, 30-Aug-2015, 04-Sep-2015
|
Post-History: 07-Aug-2015, 30-Aug-2015, 04-Sep-2015, 19-Sep-2015
|
||||||
Resolution: https://mail.python.org/pipermail/python-dev/2015-September/141526.html
|
Resolution: https://mail.python.org/pipermail/python-dev/2015-September/141526.html
|
||||||
|
|
||||||
Abstract
|
Abstract
|
||||||
|
@ -201,6 +201,11 @@ braces ``'{{'`` or ``'}}'`` inside literal portions of an f-string are
|
||||||
replaced by the corresponding single brace. Doubled opening braces do
|
replaced by the corresponding single brace. Doubled opening braces do
|
||||||
not signify the start of an expression.
|
not signify the start of an expression.
|
||||||
|
|
||||||
|
Note that ``__format__()`` is not called directly on each value. The
|
||||||
|
actual code uses the equivalent of ``type(value).__format__(value,
|
||||||
|
format_spec)``, or ``format(value, format_spec)``. See the
|
||||||
|
documentation of the builtin ``format()`` function for more details.
|
||||||
|
|
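The lookup rule in the note above can be checked with a small experiment. This is only a sketch: the ``Loud`` class and the instance-level attribute are hypothetical names made up for the illustration, and it assumes Python 3.6 or later so that f-strings are available.

```python
class Loud:
    """Hypothetical class whose type-level __format__ upper-cases its text."""

    def __init__(self, text):
        self.text = text

    def __format__(self, format_spec):
        # Delegate alignment and width handling to str's own formatting.
        return format(self.text.upper(), format_spec)


value = Loud("spam")

# An attribute named __format__ stored on the instance is NOT consulted by
# format() or by f-strings; only the method found on type(value) is used.
value.__format__ = lambda format_spec: "instance-level"

via_fstring = f"{value:>6}"
via_builtin = format(value, ">6")
via_type = type(value).__format__(value, ">6")
```

All three spellings go through the method defined on the class, while calling ``value.__format__(">6")`` directly would reach the instance attribute instead.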
||||||
Comments, using the ``'#'`` character, are not allowed inside an
|
Comments, using the ``'#'`` character, are not allowed inside an
|
||||||
expression.
|
expression.
|
||||||
|
|
||||||
|
@ -209,7 +214,7 @@ specified. The allowed conversions are ``'!s'``, ``'!r'``, or
|
||||||
``'!a'``. These are treated the same as in ``str.format()``: ``'!s'``
|
``'!a'``. These are treated the same as in ``str.format()``: ``'!s'``
|
||||||
calls ``str()`` on the expression, ``'!r'`` calls ``repr()`` on the
|
calls ``str()`` on the expression, ``'!r'`` calls ``repr()`` on the
|
||||||
expression, and ``'!a'`` calls ``ascii()`` on the expression. These
|
expression, and ``'!a'`` calls ``ascii()`` on the expression. These
|
||||||
conversions are applied before the call to ``__format__``. The only
|
conversions are applied before the call to ``format()``. The only
|
||||||
reason to use ``'!s'`` is if you want to specify a format specifier
|
reason to use ``'!s'`` is if you want to specify a format specifier
|
||||||
that applies to ``str``, not to the type of the expression.
|
that applies to ``str``, not to the type of the expression.
|
||||||
|
|
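As an illustration of why ``'!s'`` can matter, the sketch below formats a ``datetime.date`` with and without a conversion. The variable names are assumptions chosen for the example, and Python 3.6 or later is assumed.

```python
import datetime

day = datetime.date(2015, 9, 19)

# No conversion: the spec is handed to date's own formatting, which
# interprets it strftime-style.
by_type = f"{day:%Y}"

# With !s, str(day) is taken first, so the spec is an ordinary str spec
# (here: right-align in a field of width 12).
by_str = f"{day!s:>12}"

# !r follows the same pattern: repr() is applied before formatting.
by_repr = f"{day!r:<30}"
```

In each case the conversion runs first and the format specifier is then applied to the converted string, which matches ``format(str(day), ">12")`` and ``format(repr(day), "<30")``.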
||||||
|
@ -222,9 +227,9 @@ So, an f-string looks like::
|
||||||
|
|
||||||
f ' <text> { <expression> <optional !s, !r, or !a> <optional : format specifier> } <text> ... '
|
f ' <text> { <expression> <optional !s, !r, or !a> <optional : format specifier> } <text> ... '
|
||||||
|
|
||||||
The resulting expression's ``__format__`` method is called with the
|
The expression is then formatted using the ``__format__`` protocol,
|
||||||
format specifier as an argument. The resulting value is used when
|
using the format specifier as an argument. The resulting value is
|
||||||
building the value of the f-string.
|
used when building the value of the f-string.
|
||||||
|
|
||||||
Expressions cannot contain ``':'`` or ``'!'`` outside of strings or
|
Expressions cannot contain ``':'`` or ``'!'`` outside of strings or
|
||||||
parentheses, brackets, or braces. The exception is that the ``'!='``
|
parentheses, brackets, or braces. The exception is that the ``'!='``
|
||||||
|
@ -293,7 +298,7 @@ For example, this code::
|
||||||
|
|
||||||
Might be evaluated as::
|
Might be evaluated as::
|
||||||
|
|
||||||
'abc' + expr1.__format__(spec1) + repr(expr2).__format__(spec2) + 'def' + str(expr3).__format__('') + 'ghi'
|
'abc' + format(expr1, spec1) + format(repr(expr2)) + 'def' + format(str(expr3)) + 'ghi'
|
||||||
|
|
||||||
Expression evaluation
|
Expression evaluation
|
||||||
---------------------
|
---------------------
|
||||||
|
@ -371,7 +376,7 @@ yields the value::
|
||||||
While the exact method of this run time concatenation is unspecified,
|
While the exact method of this run time concatenation is unspecified,
|
||||||
the above code might evaluate to::
|
the above code might evaluate to::
|
||||||
|
|
||||||
'ab' + x.__format__('') + '{c}' + 'str<' + y.__format__('^4') + '>de'
|
'ab' + format(x) + '{c}' + 'str<' + format(y, '^4') + '>de'
|
||||||
|
|
||||||
Each f-string is entirely evaluated before being concatenated to
|
Each f-string is entirely evaluated before being concatenated to
|
||||||
adjacent f-strings. That means that this::
|
adjacent f-strings. That means that this::
|
||||||
|
|
445
pep-0502.txt
445
pep-0502.txt
|
@ -1,43 +1,45 @@
|
||||||
PEP: 502
|
PEP: 502
|
||||||
Title: String Interpolation Redux
|
Title: String Interpolation - Extended Discussion
|
||||||
Version: $Revision$
|
Version: $Revision$
|
||||||
Last-Modified: $Date$
|
Last-Modified: $Date$
|
||||||
Author: Mike G. Miller
|
Author: Mike G. Miller
|
||||||
Status: Draft
|
Status: Draft
|
||||||
Type: Standards Track
|
Type: Informational
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
Created: 10-Aug-2015
|
Created: 10-Aug-2015
|
||||||
Python-Version: 3.6
|
Python-Version: 3.6
|
||||||
|
|
||||||
Note: Open issues below are stated with a question mark (?),
|
|
||||||
and are therefore searchable.
|
|
||||||
|
|
||||||
|
|
||||||
Abstract
|
Abstract
|
||||||
========
|
========
|
||||||
|
|
||||||
This proposal describes a new string interpolation feature for Python,
|
PEP 498: *Literal String Interpolation*, which proposed "formatted strings", was
|
||||||
called an *expression-string*,
|
accepted September 9th, 2015.
|
||||||
that is both concise and powerful,
|
Additional background and rationale given during its design phase is detailed
|
||||||
improves readability in most cases,
|
below.
|
||||||
yet does not conflict with existing code.
|
|
||||||
|
To recap that PEP,
|
||||||
|
a string prefix was introduced that marks the string as a template to be
|
||||||
|
rendered.
|
||||||
|
These formatted strings may contain one or more expressions
|
||||||
|
built on `the existing syntax`_ of ``str.format()``.
|
||||||
|
The formatted string expands at compile-time into a conventional string format
|
||||||
|
operation,
|
||||||
|
with the given expressions from its text extracted and passed instead as
|
||||||
|
positional arguments.
|
||||||
|
|
||||||
To achieve this end,
|
|
||||||
a new string prefix is introduced,
|
|
||||||
which expands at compile-time into an equivalent expression-string object,
|
|
||||||
with requested variables from its context passed as keyword arguments.
|
|
||||||
At runtime,
|
At runtime,
|
||||||
the new object uses these passed values to render a string to given
|
the resulting expressions are evaluated to render a string to given
|
||||||
specifications, building on `the existing syntax`_ of ``str.format()``::
|
specifications::
|
||||||
|
|
||||||
>>> location = 'World'
|
>>> location = 'World'
|
||||||
>>> e'Hello, {location} !' # new prefix: e''
|
>>> f'Hello, {location} !' # new prefix: f''
|
||||||
'Hello, World !' # interpolated result
|
'Hello, World !' # interpolated result
|
||||||
|
|
||||||
.. _the existing syntax: https://docs.python.org/3/library/string.html#format-string-syntax
|
Format-strings may be thought of as merely syntactic sugar to simplify traditional
|
||||||
|
calls to ``str.format()``.
|
||||||
|
|
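The "syntactic sugar" reading can be illustrated by writing the same message both ways. This sketch only compares results; it makes no claim about how the compiler actually expands the literal. The ``count`` variable is an assumption added for the example, and Python 3.6 or later is assumed.

```python
location = "World"
count = 3

# f-string: the expressions are written inline in the literal.
via_fstring = f"Hello, {location} ! ({count * 2} greetings)"

# Traditional spelling: the same expressions passed as positional arguments.
via_format = "Hello, {} ! ({} greetings)".format(location, count * 2)
```

Both spellings produce the same string; the f-string simply avoids repeating the expressions outside the literal.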
||||||
This PEP does not recommend to remove or deprecate any of the existing string
|
.. _the existing syntax: https://docs.python.org/3/library/string.html#format-string-syntax
|
||||||
formatting mechanisms.
|
|
||||||
|
|
||||||
|
|
||||||
Motivation
|
Motivation
|
||||||
|
@ -50,12 +52,16 @@ In comparison to other dynamic scripting languages
|
||||||
with similar use cases,
|
with similar use cases,
|
||||||
the amount of code necessary to build similar strings is substantially higher,
|
the amount of code necessary to build similar strings is substantially higher,
|
||||||
while at times offering lower readability due to verbosity, dense syntax,
|
while at times offering lower readability due to verbosity, dense syntax,
|
||||||
or identifier duplication. [1]_
|
or identifier duplication.
|
||||||
|
|
||||||
|
These difficulties are described at moderate length in the original
|
||||||
|
`post to python-ideas`_
|
||||||
|
that started the snowball (that became PEP 498) rolling. [1]_
|
||||||
|
|
||||||
Furthermore, replacement of the print statement with the more consistent print
|
Furthermore, replacement of the print statement with the more consistent print
|
||||||
function of Python 3 (PEP 3105) has added one additional minor burden,
|
function of Python 3 (PEP 3105) has added one additional minor burden,
|
||||||
an additional set of parentheses to type and read.
|
an additional set of parentheses to type and read.
|
||||||
Combined with the verbosity of current formatting solutions,
|
Combined with the verbosity of current string formatting solutions,
|
||||||
this puts an otherwise simple language at an unfortunate disadvantage to its
|
this puts an otherwise simple language at an unfortunate disadvantage to its
|
||||||
peers::
|
peers::
|
||||||
|
|
||||||
|
@ -66,7 +72,7 @@ peers::
|
||||||
# Python 3, str.format with named parameters
|
# Python 3, str.format with named parameters
|
||||||
print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(**locals()))
|
print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(**locals()))
|
||||||
|
|
||||||
# Python 3, variation B, worst case
|
# Python 3, worst case
|
||||||
print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(user=user,
|
print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(user=user,
|
||||||
id=id,
|
id=id,
|
||||||
hostname=
|
hostname=
|
||||||
|
@ -74,7 +80,7 @@ peers::
|
||||||
|
|
||||||
In Python, the formatting and printing of a string with multiple variables in a
|
In Python, the formatting and printing of a string with multiple variables in a
|
||||||
single line of code of standard width is noticeably harder and more verbose,
|
single line of code of standard width is noticeably harder and more verbose,
|
||||||
indentation often exacerbating the issue.
|
with indentation exacerbating the issue.
|
||||||
|
|
||||||
For use cases such as smaller projects, systems programming,
|
For use cases such as smaller projects, systems programming,
|
||||||
shell script replacements, and even one-liners,
|
shell script replacements, and even one-liners,
|
||||||
|
@ -82,36 +88,17 @@ where message formatting complexity has yet to be encapsulated,
|
||||||
this verbosity has likely led a significant number of developers and
|
this verbosity has likely led a significant number of developers and
|
||||||
administrators to choose other languages over the years.
|
administrators to choose other languages over the years.
|
||||||
|
|
||||||
|
.. _post to python-ideas: https://mail.python.org/pipermail/python-ideas/2015-July/034659.html
|
||||||
|
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
=========
|
=========
|
||||||
|
|
||||||
|
|
||||||
Naming
|
|
||||||
------
|
|
||||||
|
|
||||||
The term expression-string was chosen because other applicable terms,
|
|
||||||
such as format-string and template are already well used in the Python standard
|
|
||||||
library.
|
|
||||||
|
|
||||||
The string prefix itself, ``e''`` was chosen to demonstrate that the
|
|
||||||
specification enables expressions,
|
|
||||||
is not limited to ``str.format()`` syntax,
|
|
||||||
and also does not lend itself to `the shorthand term`_ "f-string".
|
|
||||||
It is also slightly easier to type than other choices such as ``_''`` and
|
|
||||||
``i''``,
|
|
||||||
while perhaps `less odd-looking`_ to C-developers.
|
|
||||||
``printf('')`` vs. ``print(f'')``.
|
|
||||||
|
|
||||||
.. _the shorthand term: reference_needed
|
|
||||||
.. _less odd-looking: https://mail.python.org/pipermail/python-dev/2015-August/141147.html
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Goals
|
Goals
|
||||||
-------------
|
-------------
|
||||||
|
|
||||||
The design goals of expression-strings are as follows:
|
The design goals of format strings are as follows:
|
||||||
|
|
||||||
#. Eliminate need to pass variables manually.
|
#. Eliminate need to pass variables manually.
|
||||||
#. Eliminate repetition of identifiers and redundant parentheses.
|
#. Eliminate repetition of identifiers and redundant parentheses.
|
||||||
|
@ -133,40 +120,44 @@ Python specified both single (``'``) and double (``"``) ASCII quote
|
||||||
characters to enclose strings.
|
characters to enclose strings.
|
||||||
It is not reasonable to choose one of them now to enable interpolation,
|
It is not reasonable to choose one of them now to enable interpolation,
|
||||||
while leaving the other for uninterpolated strings.
|
while leaving the other for uninterpolated strings.
|
||||||
"Backtick" characters (`````) are also `constrained by history`_ as a shortcut
|
Other characters,
|
||||||
for ``repr()``.
|
such as the "Backtick" (or grave accent `````) are also
|
||||||
|
`constrained by history`_
|
||||||
|
as a shortcut for ``repr()``.
|
||||||
|
|
||||||
This leaves a few remaining options for the design of such a feature:
|
This leaves a few remaining options for the design of such a feature:
|
||||||
|
|
||||||
* An operator, as in printf-style string formatting via ``%``.
|
* An operator, as in printf-style string formatting via ``%``.
|
||||||
* A class, such as ``string.Template()``.
|
* A class, such as ``string.Template()``.
|
||||||
* A function, such as ``str.format()``.
|
* A method or function, such as ``str.format()``.
|
||||||
* New syntax
|
* New syntax, or
|
||||||
* A new string prefix marker, such as the well-known ``r''`` or ``u''``.
|
* A new string prefix marker, such as the well-known ``r''`` or ``u''``.
|
||||||
|
|
||||||
The first three options above currently work well.
|
The first three options above are mature.
|
||||||
Each has specific use cases and drawbacks,
|
Each has specific use cases and drawbacks,
|
||||||
yet also suffer from the verbosity and visual noise mentioned previously.
|
yet also suffer from the verbosity and visual noise mentioned previously.
|
||||||
All are discussed in the next section.
|
All options are discussed in the next sections.
|
||||||
|
|
||||||
.. _constrained by history: https://mail.python.org/pipermail/python-ideas/2007-January/000054.html
|
.. _constrained by history: https://mail.python.org/pipermail/python-ideas/2007-January/000054.html
|
||||||
|
|
||||||
|
|
||||||
Background
|
Background
|
||||||
-------------
|
-------------
|
||||||
|
|
||||||
This proposal builds on several existing techniques and proposals and what
|
Formatted strings build on several existing techniques and proposals and what
|
||||||
we've collectively learned from them.
|
we've collectively learned from them.
|
||||||
|
In keeping with the design goals of readability and error-prevention,
|
||||||
|
the following examples therefore use named,
|
||||||
|
not positional arguments.
|
||||||
|
|
||||||
The following examples focus on the design goals of readability and
|
|
||||||
error-prevention using named parameters.
|
|
||||||
Let's assume we have the following dictionary,
|
Let's assume we have the following dictionary,
|
||||||
and would like to print out its items as an informative string for end users::
|
and would like to print out its items as an informative string for end users::
|
||||||
|
|
||||||
>>> params = {'user': 'nobody', 'id': 9, 'hostname': 'darkstar'}
|
>>> params = {'user': 'nobody', 'id': 9, 'hostname': 'darkstar'}
|
||||||
|
|
||||||
|
|
||||||
Printf-style formatting
|
Printf-style formatting, via operator
|
||||||
'''''''''''''''''''''''
|
'''''''''''''''''''''''''''''''''''''
|
||||||
|
|
||||||
This `venerable technique`_ continues to have its uses,
|
This `venerable technique`_ continues to have its uses,
|
||||||
such as with byte-based protocols,
|
such as with byte-based protocols,
|
||||||
|
@ -178,7 +169,7 @@ and familiarity to many programmers::
|
||||||
|
|
||||||
In this form, considering the prerequisite dictionary creation,
|
In this form, considering the prerequisite dictionary creation,
|
||||||
the technique is verbose, a tad noisy,
|
the technique is verbose, a tad noisy,
|
||||||
and relatively readable.
|
yet relatively readable.
|
||||||
Additional issues are that an operator can only take one argument besides the
|
Additional issues are that an operator can only take one argument besides the
|
||||||
original string,
|
original string,
|
||||||
meaning multiple parameters must be passed in a tuple or dictionary.
|
meaning multiple parameters must be passed in a tuple or dictionary.
|
||||||
|
@ -190,8 +181,8 @@ or forget the trailing type, e.g. (``s`` or ``d``).
|
||||||
.. _venerable technique: https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
|
.. _venerable technique: https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
|
||||||
|
|
||||||
|
|
||||||
string.Template
|
string.Template Class
|
||||||
'''''''''''''''
|
'''''''''''''''''''''
|
||||||
|
|
||||||
The ``string.Template`` `class from`_ PEP 292
|
The ``string.Template`` `class from`_ PEP 292
|
||||||
(Simpler String Substitutions)
|
(Simpler String Substitutions)
|
||||||
|
@ -202,7 +193,7 @@ that finds its main use cases in shell and internationalization tools::
|
||||||
|
|
||||||
Template('Hello, user: $user, id: ${id}, on host: $hostname').substitute(params)
|
Template('Hello, user: $user, id: ${id}, on host: $hostname').substitute(params)
|
||||||
|
|
||||||
Also verbose, however the string itself is readable.
|
While also verbose, the string itself is readable.
|
||||||
Though functionality is limited,
|
Though functionality is limited,
|
||||||
it meets its requirements well.
|
it meets its requirements well.
|
||||||
It isn't powerful enough for many cases,
|
It isn't powerful enough for many cases,
|
||||||
|
@ -232,8 +223,8 @@ and likely contributed to the PEP's lack of acceptance.
|
||||||
It was superseded by the following proposal.
|
It was superseded by the following proposal.
|
||||||
|
|
||||||
|
|
||||||
str.format()
|
str.format() Method
|
||||||
''''''''''''
|
'''''''''''''''''''
|
||||||
|
|
||||||
The ``str.format()`` `syntax of`_ PEP 3101 is the most recent and modern of the
|
The ``str.format()`` `syntax of`_ PEP 3101 is the most recent and modern of the
|
||||||
existing options.
|
existing options.
|
||||||
|
@ -253,36 +244,32 @@ string literals::
|
||||||
host=hostname)
|
host=hostname)
|
||||||
'Hello, user: nobody, id: 9, on host: darkstar'
|
'Hello, user: nobody, id: 9, on host: darkstar'
|
||||||
|
|
||||||
|
The verbosity of the method-based approach is illustrated here.
|
||||||
|
|
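For comparison, the three mature options can be run side by side against the ``params`` dictionary defined earlier. This is only a minimal sketch; the variable names ``printf_style``, ``template_style`` and ``format_style`` are illustrative and do not come from the PEP.

```python
from string import Template

params = {'user': 'nobody', 'id': 9, 'hostname': 'darkstar'}

# 1. Operator: printf-style formatting with a mapping on the right-hand side.
printf_style = 'Hello, user: %(user)s, id: %(id)s, on host: %(hostname)s' % params

# 2. Class: string.Template with $-placeholders.
template_style = Template(
    'Hello, user: $user, id: ${id}, on host: $hostname').substitute(params)

# 3. Method: str.format() with the mapping unpacked into keyword arguments.
format_style = 'Hello, user: {user}, id: {id}, on host: {hostname}'.format(**params)

print(printf_style)
```

All three calls build the same string; the differences lie in how much scaffolding surrounds the template text.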
||||||
.. _syntax of: https://docs.python.org/3/library/string.html#format-string-syntax
|
.. _syntax of: https://docs.python.org/3/library/string.html#format-string-syntax
|
||||||
|
|
||||||
|
|
||||||
PEP 498 -- Literal String Formatting
|
PEP 498 -- Literal String Formatting
|
||||||
''''''''''''''''''''''''''''''''''''
|
''''''''''''''''''''''''''''''''''''
|
||||||
|
|
||||||
PEP 498 discusses and delves partially into implementation details of
|
PEP 498 defines and discusses format strings,
|
||||||
expression-strings,
|
as also described in the `Abstract`_ above.
|
||||||
which it calls f-strings,
|
|
||||||
the idea and syntax
|
|
||||||
(with exception of the prefix letter)
|
|
||||||
of which is identical to that discussed here.
|
|
||||||
The resulting compile-time transformation however
|
|
||||||
returns a string joined from parts at runtime,
|
|
||||||
rather than an object.
|
|
||||||
|
|
||||||
It also, somewhat controversially to those first exposed to it,
|
|
||||||
introduces the idea that these strings shall be augmented with support for
|
|
||||||
arbitrary expressions,
|
|
||||||
which is discussed further in the following sections.
|
|
||||||
|
|
||||||
|
It also, somewhat controversially to those first exposed,
|
||||||
|
introduces the idea that format-strings shall be augmented with support for
|
||||||
|
arbitrary expressions.
|
||||||
|
This is discussed further in the
|
||||||
|
Restricting Syntax section under
|
||||||
|
`Rejected Ideas`_.
|
||||||
|
|
||||||
PEP 501 -- Translation ready string interpolation
|
PEP 501 -- Translation ready string interpolation
|
||||||
'''''''''''''''''''''''''''''''''''''''''''''''''
|
'''''''''''''''''''''''''''''''''''''''''''''''''
|
||||||
|
|
||||||
The complementary PEP 501 brings internationalization into the discussion as a
|
The complementary PEP 501 brings internationalization into the discussion as a
|
||||||
first-class concern, with its proposal of i-strings,
|
first-class concern, with its proposal of the i-prefix,
|
||||||
``string.Template`` syntax integration compatible with ES6 (Javascript),
|
``string.Template`` syntax integration compatible with ES6 (Javascript),
|
||||||
deferred rendering,
|
deferred rendering,
|
||||||
and a similar object return value.
|
and an object return value.
|
||||||
|
|
||||||
|
|
||||||
Implementations in Other Languages
|
Implementations in Other Languages
|
||||||
|
@ -374,7 +361,8 @@ ES6 (Javascript)
|
||||||
Designers of `Template strings`_ faced the same issue as Python where single
|
Designers of `Template strings`_ faced the same issue as Python where single
|
||||||
and double quotes were taken.
|
and double quotes were taken.
|
||||||
Unlike Python however, "backticks" were not.
|
Unlike Python however, "backticks" were not.
|
||||||
They were chosen as part of the ECMAScript 2015 (ES6) standard::
|
Despite `their issues`_,
|
||||||
|
they were chosen as part of the ECMAScript 2015 (ES6) standard::
|
||||||
|
|
||||||
console.log(`Fifteen is ${a + b} and\nnot ${2 * a + b}.`);
|
console.log(`Fifteen is ${a + b} and\nnot ${2 * a + b}.`);
|
||||||
|
|
||||||
|
@ -391,8 +379,10 @@ as the tag::
|
||||||
* User implemented prefixes supported.
|
* User implemented prefixes supported.
|
||||||
* Arbitrary expressions are supported.
|
* Arbitrary expressions are supported.
|
||||||
|
|
||||||
|
.. _their issues: https://mail.python.org/pipermail/python-ideas/2007-January/000054.html
|
||||||
.. _Template strings: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/template_strings
|
.. _Template strings: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/template_strings
|
||||||
|
|
||||||
|
|
||||||
C#, Version 6
|
C#, Version 6
|
||||||
'''''''''''''
|
'''''''''''''
|
||||||
|
|
||||||
|
@ -428,13 +418,14 @@ Arbitrary `interpolation under Swift`_ is available on all strings::
|
||||||
Additional examples
|
Additional examples
|
||||||
'''''''''''''''''''
|
'''''''''''''''''''
|
||||||
|
|
||||||
A number of additional examples may be `found at Wikipedia`_.
|
A number of additional examples of string interpolation may be
|
||||||
|
`found at Wikipedia`_.
|
||||||
|
|
||||||
|
Now that background and history have been covered,
|
||||||
|
let's continue on for a solution.
|
||||||
|
|
||||||
.. _found at Wikipedia: https://en.wikipedia.org/wiki/String_interpolation#Examples
|
.. _found at Wikipedia: https://en.wikipedia.org/wiki/String_interpolation#Examples
|
||||||
|
|
||||||
Now that background and implementation history have been covered,
|
|
||||||
let's continue on for a solution.
|
|
||||||
|
|
||||||
|
|
||||||
New Syntax
|
New Syntax
|
||||||
----------
|
----------
|
||||||
|
@ -442,178 +433,47 @@ New Syntax
|
||||||
This should be an option of last resort,
|
This should be an option of last resort,
|
||||||
as every new syntax feature has a cost in terms of real-estate in a brain it
|
as every new syntax feature has a cost in terms of real-estate in a brain it
|
||||||
inhabits.
|
inhabits.
|
||||||
There is one alternative left on our list of possibilities,
|
There is however one alternative left on our list of possibilities,
|
||||||
which follows.
|
which follows.
|
||||||
|
|
||||||
|
|
||||||
New String Prefix
|
New String Prefix
|
||||||
-----------------
|
-----------------
|
||||||
|
|
||||||
Given the history of string formatting in Python,
|
Given the history of string formatting in Python and backwards-compatibility,
|
||||||
backwards-compatibility,
|
|
||||||
implementations in other languages,
|
implementations in other languages,
|
||||||
and the avoidance of new syntax unless necessary,
|
and avoidance of new syntax unless necessary,
|
||||||
an acceptable design is reached through elimination
|
an acceptable design is reached through elimination
|
||||||
rather than unique insight.
|
rather than unique insight.
|
||||||
Therefore, we choose to explicitly mark interpolated string literals with a
|
Therefore, marking interpolated string literals with a string prefix is chosen.
|
||||||
string prefix.
|
|
||||||
|
|
||||||
We also choose an expression syntax that reuses and builds on the strongest of
|
We also choose an expression syntax that reuses and builds on the strongest of
|
||||||
the existing choices,
|
the existing choices,
|
||||||
``str.format()`` to avoid further duplication.
|
``str.format()`` to avoid further duplication of functionality::
|
||||||
|
|
||||||
|
|
||||||
Specification
|
|
||||||
=============
|
|
||||||
|
|
||||||
String literals with the prefix of ``e`` shall be converted at compile-time to
|
|
||||||
the construction of an ``estr`` (perhaps ``types.ExpressionString``?) object.
|
|
||||||
Strings and values are parsed from the literal and passed as tuples to the
|
|
||||||
constructor::
|
|
||||||
|
|
||||||
>>> location = 'World'
|
>>> location = 'World'
|
||||||
>>> e'Hello, {location} !'
|
>>> f'Hello, {location} !' # new prefix: f''
|
||||||
|
'Hello, World !' # interpolated result
|
||||||
|
|
||||||
# becomes
|
PEP 498 -- Literal String Formatting delves into the mechanics and
|
||||||
# estr('Hello, {location} !', # template
|
implementation of this design.
|
||||||
('Hello, ', ' !'), # string fragments
|
|
||||||
('location',), # expressions
|
|
||||||
('World',), # values
|
|
||||||
)
|
|
||||||
|
|
||||||
The object interpolates its result immediately at run-time::
|
|
||||||
|
|
||||||
'Hello, World !'
|
|
||||||
|
|
||||||
|
|
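As a rough illustration of the chosen design, the sketch below shows the ``f''`` prefix reusing ``str.format()`` style format specifications while also accepting an expression inside the braces. It assumes an interpreter that implements PEP 498 (Python 3.6 or later); the variable names are made up for the example.

```python
location = 'World'
age = 41
elapsed = 0.25

# Plain name lookup, as in the example above.
greeting = f'Hello, {location} !'

# An arbitrary expression plus a str.format()-style format spec.
report = f'My age is {age + 1} years; elapsed {elapsed:.3f} s.'

# The same text built with str.format(), for comparison.
same = 'My age is {} years; elapsed {:.3f} s.'.format(age + 1, elapsed)

print(greeting)
print(report)
```

The prefix form keeps each value next to the place where it is rendered, which is the readability gain the proposal is after.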
||||||
ExpressionString Objects
|
Additional Topics
|
||||||
------------------------
|
=================
|
||||||
|
|
||||||
The ExpressionString object supports both immediate and deferred rendering of
|
|
||||||
its given template and parameters.
|
|
||||||
It does this by immediately rendering its inputs to its internal string and
|
|
||||||
``.rendered`` string member (still necessary?),
|
|
||||||
useful in the majority of use cases.
|
|
||||||
To allow for deferred rendering and caller-specified escaping,
|
|
||||||
all inputs are saved for later inspection,
|
|
||||||
with convenience methods available.
|
|
||||||
|
|
||||||
Notes:
|
|
||||||
|
|
||||||
* Inputs are saved to the object as ``.template`` and ``.context`` members
|
|
||||||
for later use.
|
|
||||||
* No explicit ``str(estr)`` call is necessary to render the result,
|
|
||||||
though doing so might be desired to free resources if significant.
|
|
||||||
* Additional or deferred rendering is available through the ``.render()``
|
|
||||||
method, which allows template and context to be overridden for flexibility.
|
|
||||||
* Manual escaping of potentially dangerous input is available through the
|
|
||||||
``.escape(escape_function)`` method,
|
|
||||||
the rules of which may therefore be specified by the caller.
|
|
||||||
The given function should both accept and return a single modified string.
|
|
||||||
|
|
||||||
* A sample Python implementation can be `found at Bitbucket`_:
|
|
||||||
|
|
||||||
.. _found at Bitbucket: https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py
|
|
||||||
|
|
||||||
|
|
||||||
Inherits From ``str`` Type
|
|
||||||
'''''''''''''''''''''''''''
|
|
||||||
|
|
||||||
Inheriting from the ``str`` class is one of the techniques available to improve
|
|
||||||
compatibility with code expecting a string object,
|
|
||||||
as it will pass an ``isinstance(obj, str)`` test.
|
|
||||||
ExpressionString implements this and also renders its result into the "raw"
|
|
||||||
string of its string superclass,
|
|
||||||
providing compatibility with a majority of code.
|
|
||||||
|
|
||||||
|
|
||||||
Interpolation Syntax
|
|
||||||
--------------------
|
|
||||||
|
|
||||||
The strongest of the existing string formatting syntaxes is chosen,
|
|
||||||
``str.format()`` as a base to build on. [10]_ [11]_
|
|
||||||
|
|
||||||
..
|
|
||||||
|
|
||||||
* Additionally, single arbitrary expressions shall also be supported inside
|
|
||||||
braces as an extension::
|
|
||||||
|
|
||||||
>>> e'My age is {age + 1} years.'
|
|
||||||
|
|
||||||
See below for section on safety.
|
|
||||||
|
|
||||||
* Triple quoted strings with multiple lines shall be supported::
|
|
||||||
|
|
||||||
>>> e'''Hello,
|
|
||||||
{location} !'''
|
|
||||||
'Hello,\n World !'
|
|
||||||
|
|
||||||
* Adjacent implicit concatenation shall be supported;
|
|
||||||
interpolation does `not bleed into`_ other strings::
|
|
||||||
|
|
||||||
>>> 'Hello {1, 2, 3} ' e'{location} !'
|
|
||||||
'Hello {1, 2, 3} World !'
|
|
||||||
|
|
||||||
* Additional implementation details,
|
|
||||||
for example expression and error-handling,
|
|
||||||
are specified in the compatible PEP 498.
|
|
||||||
|
|
||||||
.. _not bleed into: https://mail.python.org/pipermail/python-ideas/2015-July/034763.html
|
|
||||||
|
|
||||||
|
|
||||||
Composition with Other Prefixes
|
|
||||||
-------------------------------
|
|
||||||
|
|
||||||
* Expression-strings apply to unicode objects only,
|
|
||||||
therefore ``u''`` is never needed.
|
|
||||||
Should it be prevented?
|
|
||||||
|
|
||||||
* Bytes objects are not included here and do not compose with e'' as they
|
|
||||||
do not support ``__format__()``.
|
|
||||||
|
|
||||||
* Complementary to raw strings,
|
|
||||||
backslash codes shall not be converted in the expression-string,
|
|
||||||
when combined with ``r''`` as ``re''``.
|
|
||||||
|
|
||||||
|
|
||||||
Examples
|
|
||||||
--------
|
|
||||||
|
|
||||||
A more complicated example follows::
|
|
||||||
|
|
||||||
n = 5; # t0, t1 = … TODO
|
|
||||||
a = e"Sliced {n} onions in {t1-t0:.3f} seconds."
|
|
||||||
# returns the equivalent of
|
|
||||||
estr("Sliced {n} onions in {t1-t0:.3f} seconds", # template
|
|
||||||
('Sliced ', ' onions in ', ' seconds'), # strings
|
|
||||||
('n', 't1-t0:.3f'), # expressions
|
|
||||||
(5, 0.555555) # values
|
|
||||||
)
|
|
||||||
|
|
||||||
With expressions only::
|
|
||||||
|
|
||||||
b = e"Three random numbers: {rand()}, {rand()}, {rand()}."
|
|
||||||
# returns the equivalent of
|
|
||||||
estr("Three random numbers: {rand():f}, {rand():f}, {rand():}.", # template
|
|
||||||
('Three random numbers: ', ', ', ', ', '.'), # strings
|
|
||||||
('rand():f', 'rand():f', 'rand():f'), # expressions
|
|
||||||
(rand(), rand(), rand()) # values
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
Safety
|
Safety
|
||||||
-----------
|
-----------
|
||||||
|
|
||||||
In this section we will describe the safety situation and precautions taken
|
In this section we will describe the safety situation and precautions taken
|
||||||
in support of expression-strings.
|
in support of format-strings.
|
||||||
|
|
||||||
#. Only string literals shall be considered here,
|
#. Only string literals have been considered for format-strings,
|
||||||
not variables to be taken as input or passed around,
|
not variables to be taken as input or passed around,
|
||||||
making external attacks difficult to accomplish.
|
making external attacks difficult to accomplish.
|
||||||
|
|
||||||
* ``str.format()`` `already handles`_ this use-case.
|
``str.format()`` and alternatives `already handle`_ this use-case.
|
||||||
* Direct instantiation of the ExpressionString object with non-literal input
|
|
||||||
shall not be allowed. (Practicality?)
|
|
||||||
|
|
||||||
#. Neither ``locals()`` nor ``globals()`` is necessary or used during the
|
#. Neither ``locals()`` nor ``globals()`` is necessary or used during the
|
||||||
transformation,
|
transformation,
|
||||||
|
@ -622,37 +482,72 @@ in support of expression-strings.
|
||||||
#. To eliminate complexity as well as ``RuntimeError`` (s) due to recursion
|
#. To eliminate complexity as well as ``RuntimeError`` (s) due to recursion
|
||||||
depth, recursive interpolation is not supported.
|
depth, recursive interpolation is not supported.
|
||||||
|
|
||||||
#. Restricted characters or expression classes?, such as ``=`` for assignment.
|
|
||||||
|
|
||||||
However,
|
However,
|
||||||
mistakes or malicious code could be missed inside string literals.
|
mistakes or malicious code could be missed inside string literals.
|
||||||
Though that can be said of code in general,
|
Though that can be said of code in general,
|
||||||
that these expressions are inside strings means they are a bit more likely
|
that these expressions are inside strings means they are a bit more likely
|
||||||
to be obscured.
|
to be obscured.
|
||||||
|
|
||||||
.. _already handles: https://mail.python.org/pipermail/python-ideas/2015-July/034729.html
|
.. _already handle: https://mail.python.org/pipermail/python-ideas/2015-July/034729.html
|
||||||
|
|
||||||
|
|
||||||
Mitigation via tools
|
Mitigation via Tools
|
||||||
''''''''''''''''''''
|
''''''''''''''''''''
|
||||||
|
|
||||||
The idea is that tools or linters such as pyflakes, pylint, or PyCharm,
|
The idea is that tools or linters such as pyflakes, pylint, or PyCharm,
|
||||||
could check inside strings for constructs that exceed project policy.
|
may check inside strings with expressions and mark them up appropriately.
|
||||||
As this is a common task with languages these days,
|
As this is a common task with programming languages today,
|
||||||
tools won't have to implement this feature solely for Python,
|
multi-language tools won't have to implement this feature solely for Python,
|
||||||
significantly shortening time to implementation.
|
significantly shortening time to implementation.
|
||||||
|
|
||||||
Additionally the Python interpreter could check(?) and warn with appropriate
|
Further in the future,
|
||||||
command-line parameters passed.
|
strings might also be checked for constructs that exceed the safety policy of
|
||||||
|
a project.
|
||||||
|
|
||||||
|
|
||||||
|
Style Guide/Precautions
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
As arbitrary expressions may accomplish anything a Python expression is
|
||||||
|
able to,
|
||||||
|
it is highly recommended to avoid constructs inside format-strings that could
|
||||||
|
cause side effects.
|
||||||
|
|
||||||
|
Further guidelines may be written once usage patterns and true problems are
|
||||||
|
known.
|
||||||
|
|
||||||
|
|
||||||
|
Reference Implementation(s)
|
||||||
|
---------------------------
|
||||||
|
|
||||||
|
The `say module on PyPI`_ implements string interpolation as described here
|
||||||
|
with the small burden of a callable interface::
|
||||||
|
|
||||||
|
> pip install say
|
||||||
|
|
||||||
|
from say import say
|
||||||
|
nums = list(range(4))
|
||||||
|
say("Nums has {len(nums)} items: {nums}")
|
||||||
|
|
||||||
|
A Python implementation of Ruby interpolation `is also available`_.
|
||||||
|
It uses the codecs module to do its work::
|
||||||
|
|
||||||
|
> pip install interpy
|
||||||
|
|
||||||
|
# coding: interpy
|
||||||
|
location = 'World'
|
||||||
|
print("Hello #{location}.")
|
||||||
|
|
||||||
|
.. _say module on PyPI: https://pypi.python.org/pypi/say/
|
||||||
|
.. _is also available: https://github.com/syrusakbary/interpy
|
||||||
|
|
||||||
|
|
||||||
Backwards Compatibility
|
Backwards Compatibility
|
||||||
-----------------------
|
-----------------------
|
||||||
|
|
||||||
By using existing syntax and avoiding use of current or historical features,
|
By using existing syntax and avoiding current or historical features,
|
||||||
expression-strings (and any associated sub-features),
|
format strings were designed so as to not interfere with existing code and are
|
||||||
were designed so as to not interfere with existing code and is not expected
|
not expected to cause any issues.
|
||||||
to cause any issues.
|
|
||||||
|
|
||||||
|
|
||||||
Postponed Ideas
|
Postponed Ideas
|
||||||
|
@ -666,20 +561,12 @@ Though it was highly desired to integrate internationalization support,
|
||||||
the finer details diverge at almost every point,
|
the finer details diverge at almost every point,
|
||||||
making a common solution unlikely: [15]_
|
making a common solution unlikely: [15]_
|
||||||
|
|
||||||
* Use-cases
|
* Use-cases differ
|
||||||
* Compile and run-time tasks
|
* Compile vs. run-time tasks
|
||||||
* Interpolation Syntax
|
* Interpolation syntax needs
|
||||||
* Intended audience
|
* Intended audience
|
||||||
* Security policy
|
* Security policy
|
||||||
|
|
||||||
Rather than try to fit a "square peg in a round hole,"
|
|
||||||
this PEP attempts to allow internationalization to be supported in the future
|
|
||||||
by not preventing it.
|
|
||||||
In this proposal,
|
|
||||||
expression-string inputs are saved for inspection and re-rendering at a later
|
|
||||||
time,
|
|
||||||
allowing for their use by an external library of any sort.
|
|
||||||
|
|
||||||
|
|
||||||
Rejected Ideas
|
Rejected Ideas
|
||||||
--------------
|
--------------
|
||||||
|
@ -687,18 +574,25 @@ Rejected Ideas
|
||||||
Restricting Syntax to ``str.format()`` Only
|
Restricting Syntax to ``str.format()`` Only
|
||||||
'''''''''''''''''''''''''''''''''''''''''''
|
'''''''''''''''''''''''''''''''''''''''''''
|
||||||
|
|
||||||
This was deemed not enough of a solution to the problem.
|
The common `arguments against`_ support of arbitrary expressions were:
|
||||||
It can be seen in the `Implementations in Other Languages`_ section that the
|
|
||||||
developer community at large tends to agree.
|
|
||||||
|
|
||||||
The common `arguments against`_ arbitrary expressions were:
|
#. `YAGNI`_, "You aren't gonna need it."
|
||||||
|
#. The feature is not congruent with historical Python conservatism.
|
||||||
#. YAGNI, "You ain't gonna need it."
|
|
||||||
#. The change is not congruent with historical Python conservatism.
|
|
||||||
#. Postpone - can implement in a future version if need is demonstrated.
|
#. Postpone - can implement in a future version if need is demonstrated.
|
||||||
|
|
||||||
|
.. _YAGNI: https://en.wikipedia.org/wiki/You_aren't_gonna_need_it
|
||||||
.. _arguments against: https://mail.python.org/pipermail/python-ideas/2015-August/034913.html
|
.. _arguments against: https://mail.python.org/pipermail/python-ideas/2015-August/034913.html
|
||||||
|
|
||||||
|
Support of only ``str.format()`` syntax however,
|
||||||
|
was deemed not enough of a solution to the problem.
|
||||||
|
Often a simple length or increment of an object, for example,
|
||||||
|
is desired before printing.
|
||||||
|
|
||||||
|
It can be seen in the `Implementations in Other Languages`_ section that the
|
||||||
|
developer community at large tends to agree.
|
||||||
|
String interpolation with arbitrary expressions is becoming an industry
|
||||||
|
standard in modern languages due to its utility.
|
||||||
|
|
||||||
|
|
||||||
Additional/Custom String-Prefixes
|
Additional/Custom String-Prefixes
|
||||||
'''''''''''''''''''''''''''''''''
|
'''''''''''''''''''''''''''''''''
|
||||||
|
@ -720,7 +614,7 @@ this was thought to create too much uncertainty of when and where string
|
||||||
expressions could be used safely or not.
|
expressions could be used safely or not.
|
||||||
The concept was also difficult to describe to others. [12]_
|
The concept was also difficult to describe to others. [12]_
|
||||||
|
|
||||||
Always consider expression-string variables to be unescaped,
|
Always consider format string variables to be unescaped,
|
||||||
unless the developer has explicitly escaped them.
|
unless the developer has explicitly escaped them.
|
||||||
|
|
||||||
|
|
||||||
|
@ -735,33 +629,13 @@ and looking too much like bash/perl,
|
||||||
which could encourage bad habits. [13]_
|
which could encourage bad habits. [13]_
|
||||||
|
|
||||||
|
|
||||||
Reference Implementation(s)
|
|
||||||
===========================
|
|
||||||
|
|
||||||
An expression-string implementation is currently attached to PEP 498,
|
|
||||||
under the ``f''`` prefix,
|
|
||||||
and may be available in nightly builds.
|
|
||||||
|
|
||||||
A Python implementation of Ruby interpolation `is also available`_,
|
|
||||||
which is similar to this proposal.
|
|
||||||
It uses the codecs module to do its work::
|
|
||||||
|
|
||||||
> pip install interpy
|
|
||||||
|
|
||||||
# coding: interpy
|
|
||||||
location = 'World'
|
|
||||||
print("Hello #{location}.")
|
|
||||||
|
|
||||||
.. _is also available: https://github.com/syrusakbary/interpy
|
|
||||||
|
|
||||||
|
|
||||||
Acknowledgements
|
Acknowledgements
|
||||||
================
|
================
|
||||||
|
|
||||||
* Eric V. Smith for providing invaluable implementation work and design
|
* Eric V. Smith for the authoring and implementation of PEP 498.
|
||||||
opinions, helping to focus this PEP.
|
* Everyone on the python-ideas mailing list for rejecting the various crazy
|
||||||
* Others on the python-ideas mailing list for rejecting the craziest of ideas,
|
ideas that came up,
|
||||||
also helping to achieve focus.
|
helping to keep the final design in focus.
|
||||||
|
|
||||||
|
|
||||||
References
|
References
|
||||||
|
@ -771,7 +645,6 @@ References
|
||||||
|
|
||||||
(https://mail.python.org/pipermail/python-ideas/2015-July/034659.html)
|
(https://mail.python.org/pipermail/python-ideas/2015-July/034659.html)
|
||||||
|
|
||||||
|
|
||||||
.. [2] Briefer String Format
|
.. [2] Briefer String Format
|
||||||
|
|
||||||
(https://mail.python.org/pipermail/python-ideas/2015-July/034669.html)
|
(https://mail.python.org/pipermail/python-ideas/2015-July/034669.html)
|
||||||
|
|
|
@ -0,0 +1,396 @@
|
||||||
|
PEP: 504
|
||||||
|
Title: Using the System RNG by default
|
||||||
|
Version: $Revision$
|
||||||
|
Last-Modified: $Date$
|
||||||
|
Author: Nick Coghlan <ncoghlan@gmail.com>
|
||||||
|
Status: Withdrawn
|
||||||
|
Type: Standards Track
|
||||||
|
Content-Type: text/x-rst
|
||||||
|
Created: 15-Sep-2015
|
||||||
|
Python-Version: 3.6
|
||||||
|
Post-History: 15-Sep-2015
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
Python currently defaults to using the deterministic Mersenne Twister random
|
||||||
|
number generator for the module level APIs in the ``random`` module, requiring
|
||||||
|
users to know that when they're performing "security sensitive" work, they
|
||||||
|
should instead switch to using the cryptographically secure ``os.urandom`` or
|
||||||
|
``random.SystemRandom`` interfaces or a third party library like
|
||||||
|
``cryptography``.
|
||||||
|
|
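To make the distinction concrete, the following sketch generates a token with the interfaces named above rather than with the module level functions. The helper name ``make_token`` and the ``ALPHABET`` constant are hypothetical and only serve the illustration.

```python
import os
import random
import string

# Characters allowed in a generated token (an assumption for this example).
ALPHABET = string.ascii_letters + string.digits


def make_token(length=16):
    """Build a token from the operating system's random source."""
    rng = random.SystemRandom()  # backed by os.urandom, not Mersenne Twister
    return ''.join(rng.choice(ALPHABET) for _ in range(length))


# Raw bytes straight from the system RNG.
raw_bytes = os.urandom(16)

print(make_token())
```

Calling ``random.choice`` directly would look almost identical, which is how the deterministic default ends up in security sensitive code unnoticed.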
||||||
|
Unfortunately, this approach has resulted in a situation where developers that
|
||||||
|
aren't aware that they're doing security sensitive work use the default module
|
||||||
|
level APIs, and thus expose their users to unnecessary risks.
|
||||||
|
|
||||||
|
This isn't an acute problem, but it is a chronic one, and the often long
|
||||||
|
delays between the introduction of security flaws and their exploitation mean
|
||||||
|
that it is difficult for developers to naturally learn from experience.
|
||||||
|
|
||||||
|
In order to provide an eventually pervasive solution to the problem, this PEP
|
||||||
|
proposes that Python switch to using the system random number generator by
|
||||||
|
default in Python 3.6, and require developers to opt-in to using the
|
||||||
|
deterministic random number generator process wide either by using a new
|
||||||
|
``random.ensure_repeatable()`` API, or by explicitly creating their own
|
||||||
|
``random.Random()`` instance.
|
||||||
|
|
||||||
|
To minimise the impact on existing code, module level APIs that require
|
||||||
|
determinism will implicitly switch to the deterministic PRNG.
|
||||||
|
|
||||||
|
PEP Withdrawal
|
||||||
|
==============
|
||||||
|
|
||||||
|
During discussion of this PEP, Steven D'Aprano proposed the simpler alternative
|
||||||
|
of offering a standardised ``secrets`` module that provides "one obvious way"
|
||||||
|
to handle security sensitive tasks like generating default passwords and other
|
||||||
|
tokens.
|
||||||
|
|
||||||
|
Steven's proposal has the desired effect of aligning the easy way to generate
|
||||||
|
such tokens and the right way to generate them, without introducing any
|
||||||
|
compatibility risks for the existing ``random`` module API, so this PEP has
|
||||||
|
been withdrawn in favour of further work on refining Steven's proposal as
|
||||||
|
PEP 506.
|
||||||
|
|
||||||
|
|
||||||
|
Proposal
|
||||||
|
========
|
||||||
|
|
||||||
|
Currently, it is never correct to use the module level functions in the
|
||||||
|
``random`` module for security sensitive applications. This PEP proposes to
|
||||||
|
change that admonition in Python 3.6+ to instead be that it is not correct to
|
||||||
|
use the module level functions in the ``random`` module for security sensitive
|
||||||
|
applications if ``random.ensure_repeatable()`` is ever called (directly or
|
||||||
|
indirectly) in that process.
|
||||||
|
|
||||||
|
To achieve this, rather than being bound methods of a ``random.Random``
|
||||||
|
instance as they are today, the module level callables in ``random`` would
|
||||||
|
change to be functions that delegate to the corresponding method of the
|
||||||
|
existing ``random._inst`` module attribute.
|
||||||
|
|
||||||
|
By default, this attribute will be bound to a ``random.SystemRandom`` instance.
|
||||||
|
|
||||||
|
A new ``random.ensure_repeatable()`` API will then rebind the ``random._inst``
|
||||||
|
attribute to a ``random.Random`` instance, restoring the same module level
|
||||||
|
API behaviour as existed in previous Python versions (aside from the
|
||||||
|
additional level of indirection)::
|
||||||
|
|
||||||
|
def ensure_repeatable():
|
||||||
|
"""Switch to using random.Random() for the module level APIs
|
||||||
|
|
||||||
|
This switches the default RNG instance from the cryptographically
|
||||||
|
secure random.SystemRandom() to the deterministic random.Random(),
|
||||||
|
enabling the seed(), getstate() and setstate() operations. This means
|
||||||
|
a particular random scenario can be replayed later by providing the
|
||||||
|
same seed value or restoring a previously saved state.
|
||||||
|
|
||||||
|
NOTE: Libraries implementing security sensitive operations should
|
||||||
|
always explicitly use random.SystemRandom() or os.urandom in order to
|
||||||
|
correctly handle applications that call this function.
|
||||||
|
"""
|
||||||
|
global _inst  # rebind the module level default instance
|
||||||
|
if isinstance(_inst, SystemRandom):  # SystemRandom subclasses Random
|
||||||
|
_inst = Random()
|
||||||
|
|
||||||
|
To minimise the impact on existing code, calling any of the following module
|
||||||
|
level functions will implicitly call ``random.ensure_repeatable()``:
|
||||||
|
|
||||||
|
* ``random.seed``
|
||||||
|
* ``random.getstate``
|
||||||
|
* ``random.setstate``
|
||||||
|
|
||||||
|
There are no changes proposed to the ``random.Random`` or
|
||||||
|
``random.SystemRandom`` class APIs - applications that explicitly instantiate
|
||||||
|
their own random number generators will be entirely unaffected by this
|
||||||
|
proposal.
|
||||||
|
|
||||||
|
Warning on implicit opt-in
|
||||||
|
--------------------------
|
||||||
|
|
||||||
|
In Python 3.6, implicitly opting in to the use of the deterministic PRNG will
|
||||||
|
emit a deprecation warning using the following check::
|
||||||
|
|
||||||
|
    if isinstance(_inst, SystemRandom):
        warnings.warn("Implicitly ensuring repeatability. "
                      "See help(random.ensure_repeatable) for details",
                      DeprecationWarning)
        ensure_repeatable()
|
||||||
|
|
||||||
|
The specific wording of the warning should have a suitable answer added to
|
||||||
|
Stack Overflow as was done for the custom error message that was added for
|
||||||
|
missing parentheses in a call to print [#print]_.
|
||||||
|
|
||||||
|
In the first Python 3 release after Python 2.7 switches to security fix only
|
||||||
|
mode, the deprecation warning will be upgraded to a RuntimeWarning so it is
|
||||||
|
visible by default.
|
||||||
|
|
||||||
|
This PEP does *not* propose ever removing the ability to ensure the default RNG
|
||||||
|
used process wide is a deterministic PRNG that will produce the same series of
|
||||||
|
outputs given a specific seed. That capability is widely used in modelling
|
||||||
|
and simulation scenarios, and requiring that ``ensure_repeatable()`` be called
|
||||||
|
either directly or indirectly is a sufficient enhancement to address the cases
|
||||||
|
where the module level random API is used for security sensitive tasks in web
|
||||||
|
applications without due consideration for the potential security implications
|
||||||
|
of using a deterministic PRNG.
|
||||||
|
|
||||||
|
Performance impact
|
||||||
|
------------------
|
||||||
|
|
||||||
|
Due to the large performance difference between ``random.Random`` and
|
||||||
|
``random.SystemRandom``, applications ported to Python 3.6 will encounter a
|
||||||
|
significant performance regression in cases where:
|
||||||
|
|
||||||
|
* the application is using the module level random API
|
||||||
|
* cryptographic quality randomness isn't needed
|
||||||
|
* the application doesn't already implicitly opt back in to the deterministic
|
||||||
|
PRNG by calling ``random.seed``, ``random.getstate``, or ``random.setstate``
|
||||||
|
* the application isn't updated to explicitly call ``random.ensure_repeatable``
|
||||||
|
|
||||||
|
This would be noted in the Porting section of the Python 3.6 What's New guide,
|
||||||
|
with the recommendation to include the following code in the ``__main__``
|
||||||
|
module of affected applications::
|
||||||
|
|
||||||
|
    if hasattr(random, "ensure_repeatable"):
        random.ensure_repeatable()
|
||||||
|
|
||||||
|
Applications that do need cryptographic quality randomness should be using the
|
||||||
|
system random number generator regardless of speed considerations, so in those
|
||||||
|
cases the change proposed in this PEP will fix a previously latent security
|
||||||
|
defect.
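
The size of the performance difference depends on the platform, so it is
worth measuring rather than assuming. A minimal sketch using ``timeit`` to
compare the two existing generator classes (the call count is an arbitrary
choice for illustration):

```python
import random
import timeit

deterministic = random.Random()
system = random.SystemRandom()

calls = 20000  # arbitrary sample size for this illustration

# Time the same operation on each generator.
t_det = timeit.timeit(deterministic.random, number=calls)
t_sys = timeit.timeit(system.random, number=calls)

print("random.Random:       %.4f s for %d calls" % (t_det, calls))
print("random.SystemRandom: %.4f s for %d calls" % (t_sys, calls))
```

No expected figures are given here, as the ratio varies with the operating
system and Python version.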
|
||||||
|
|
||||||
|
Documentation changes
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
The ``random`` module documentation would be updated to move the documentation
|
||||||
|
of the ``seed``, ``getstate`` and ``setstate`` interfaces later in the module,
|
||||||
|
along with the documentation of the new ``ensure_repeatable`` function and the
|
||||||
|
associated security warning.
|
||||||
|
|
||||||
|
That section of the module documentation would also gain a discussion of the
|
||||||
|
respective use cases for the deterministic PRNG enabled by
|
||||||
|
``ensure_repeatable`` (games, modelling & simulation, software testing) and the
|
||||||
|
system RNG that is used by default (cryptography, security token generation).
|
||||||
|
This discussion will also recommend the use of third party security libraries
|
||||||
|
for the latter task.
|
||||||
|
|
||||||
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
|
Writing secure software under deadline and budget pressures is a hard problem.
|
||||||
|
This is reflected in regular notifications of data breaches involving personally
|
||||||
|
identifiable information [#breaches]_, as well as with failures to take
|
||||||
|
security considerations into account when new systems, like motor vehicles
|
||||||
|
[#uconnect]_, are connected to the internet. It's also the case that a lot of
|
||||||
|
the programming advice readily available on the internet [#search]_ simply
|
||||||
|
doesn't take the mathematical arcana of computer security into account.
|
||||||
|
Compounding these issues is the fact that defenders have to cover *all* of
|
||||||
|
their potential vulnerabilities, as a single mistake can make it possible to
|
||||||
|
subvert other defences [#bcrypt]_.
|
||||||
|
|
||||||
|
One of the factors that contributes to making this last aspect particularly
|
||||||
|
difficult is APIs where using them inappropriately creates a *silent* security
|
||||||
|
failure - one where the only way to find out that what you're doing is
|
||||||
|
incorrect is for someone reviewing your code to say "that's a potential
|
||||||
|
security problem", or for a system you're responsible for to be compromised
|
||||||
|
through such an oversight (and you're not only still responsible for that
|
||||||
|
system when it is compromised, but your intrusion detection and auditing
|
||||||
|
mechanisms are good enough for you to be able to figure out after the event
|
||||||
|
how the compromise took place).
|
||||||
|
|
||||||
|
This kind of situation is a significant contributor to "security fatigue",
|
||||||
|
where developers (often rightly [#owasptopten]_) feel that security engineers
|
||||||
|
spend all their time saying "don't do that the easy way, it creates a
|
||||||
|
security vulnerability".
|
||||||
|
|
||||||
|
As the designers of one of the world's most popular languages [#ieeetopten]_,
|
||||||
|
we can help reduce that problem by making the easy way the right way (or at
|
||||||
|
least the "not wrong" way) in more circumstances, so developers and security
|
||||||
|
engineers can spend more time worrying about mitigating actually interesting
|
||||||
|
threats, and less time fighting with default language behaviours.
|
||||||
|
|
||||||
|
Discussion
|
||||||
|
==========
|
||||||
|
|
||||||
|
Why "ensure_repeatable" over "ensure_deterministic"?
|
||||||
|
----------------------------------------------------
|
||||||
|
|
||||||
|
This is a case where the meaning of a word as specialist jargon conflicts with
|
||||||
|
the typical meaning of the word, even though it's *technically* the same.
|
||||||
|
|
||||||
|
From a technical perspective, a "deterministic RNG" means that given knowledge
|
||||||
|
of the algorithm and the current state, you can reliably compute arbitrary
|
||||||
|
future states.
|
||||||
|
|
||||||
|
The problem is that "deterministic" on its own doesn't convey those qualifiers,
|
||||||
|
so it's likely to instead be interpreted as "predictable" or "not random" by
|
||||||
|
folks that are familiar with the conventional meaning, but aren't familiar with
|
||||||
|
the additional qualifiers on the technical meaning.
|
||||||
|
|
||||||
|
A second problem with "deterministic" as a description for the traditional RNG
|
||||||
|
is that it doesn't really tell you what you can *do* with the traditional RNG
|
||||||
|
that you can't do with the system one.
|
||||||
|
|
||||||
|
"ensure_repeatable" aims to address both of those problems, as its common
|
||||||
|
meaning accurately describes the main reason for preferring the deterministic
|
||||||
|
PRNG over the system RNG: ensuring you can repeat the same series of outputs
|
||||||
|
by providing the same seed value, or by restoring a previously saved PRNG state.
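
The repeatability property can be demonstrated with the existing
``random.Random`` and ``random.SystemRandom`` classes. This sketch uses an
arbitrary seed value and shows both ways of replaying a series of outputs,
along with the fact that the system RNG supports neither:

```python
import random

rng = random.Random(2015)
first_run = [rng.random() for _ in range(3)]

# Re-seeding with the same value replays the same series of outputs.
rng.seed(2015)
second_run = [rng.random() for _ in range(3)]

# Saving and restoring the state replays from an arbitrary point.
saved = rng.getstate()
expected = [rng.randint(1, 6) for _ in range(5)]
rng.setstate(saved)
replayed = [rng.randint(1, 6) for _ in range(5)]

# The system RNG has no state to save, so getstate() is not implemented
# and seed() is silently ignored.
system = random.SystemRandom()
system.seed(2015)
try:
    system.getstate()
    system_has_state = True
except NotImplementedError:
    system_has_state = False
```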
|
||||||
|
|
||||||
|
Only changing the default for Python 3.6+
|
||||||
|
-----------------------------------------
|
||||||
|
|
||||||
|
Some other recent security changes, such as upgrading the capabilities of the
|
||||||
|
``ssl`` module and switching to properly verifying HTTPS certificates by
|
||||||
|
default, have been considered critical enough to justify backporting the
|
||||||
|
change to all currently supported versions of Python.
|
||||||
|
|
||||||
|
The difference in this case is one of degree - the additional benefits from
|
||||||
|
rolling out this particular change a couple of years earlier than will
|
||||||
|
otherwise be the case aren't sufficient to justify either the additional effort
|
||||||
|
or the stability risks involved in making such an intrusive change in a
|
||||||
|
maintenance release.
|
||||||
|
|
||||||
|
Keeping the module level functions
|
||||||
|
----------------------------------
|
||||||
|
|
||||||
|
In addition to general backwards compatibility considerations, Python is
|
||||||
|
widely used for educational purposes, and we specifically don't want to
|
||||||
|
invalidate the wide array of educational material that assumes the availability
|
||||||
|
of the current ``random`` module API. Accordingly, this proposal ensures that
|
||||||
|
most of the public API can continue to be used not only without modification,
|
||||||
|
but without generating any new warnings.
|
||||||
|
|
||||||
|
Warning when implicitly opting in to the deterministic RNG
|
||||||
|
----------------------------------------------------------
|
||||||
|
|
||||||
|
It's necessary to implicitly opt in to the deterministic PRNG as Python is
|
||||||
|
widely used for modelling and simulation purposes where this is the right
|
||||||
|
thing to do, and in many cases, these software models won't have a dedicated
|
||||||
|
maintenance team tasked with ensuring they keep working on the latest versions
|
||||||
|
of Python.
|
||||||
|
|
||||||
|
Unfortunately, explicitly calling ``random.seed`` with data from ``os.urandom``
|
||||||
|
is also a mistake that appears in a number of the flawed "how to generate a
|
||||||
|
security token in Python" guides readily available online.
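
To make the distinction concrete, the sketch below contrasts the flawed
pattern with one that draws every character from the system RNG. The
function names and the alphabet are illustrative assumptions, and a private
``random.Random`` instance stands in for the module level ``random.seed``
call so the snippet does not disturb global state.

```python
import os
import random
import string

ALPHABET = string.ascii_letters + string.digits


def flawed_token(length=32):
    # The pattern the text warns about: the seed is unpredictable, but
    # every output still comes from the deterministic generator.
    rng = random.Random()
    rng.seed(os.urandom(16))
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def system_token(length=32):
    # Each character is drawn from the operating system's RNG instead.
    rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(length))
```

Both functions return strings that look equally random, which is exactly why
the first pattern is easy to copy without noticing the problem.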
|
||||||
|
|
||||||
|
Using first DeprecationWarning, and then eventually a RuntimeWarning, to
|
||||||
|
advise against implicitly switching to the deterministic PRNG aims to
|
||||||
|
nudge future users that need a cryptographically secure RNG away from
|
||||||
|
calling ``random.seed()`` and those that genuinely need a deterministic
|
||||||
|
generator towards explicitly calling ``random.ensure_repeatable()``.
|
||||||
|
|
||||||
|
Avoiding the introduction of a userspace CSPRNG
|
||||||
|
-----------------------------------------------
|
||||||
|
|
||||||
|
The original discussion of this proposal on python-ideas [#csprng]_ suggested
|
||||||
|
introducing a cryptographically secure pseudo-random number generator and using
|
||||||
|
that by default, rather than defaulting to the relatively slow system random
|
||||||
|
number generator.
|
||||||
|
|
||||||
|
The problem [#nocsprng]_ with this approach is that it introduces an additional
|
||||||
|
point of failure in security sensitive situations, for the sake of applications
|
||||||
|
where the random number generation may not even be on a critical performance
|
||||||
|
path.
|
||||||
|
|
||||||
|
Applications that do need cryptographic quality randomness should be using the
|
||||||
|
system random number generator regardless of speed considerations.
|
||||||
|
|
||||||
|
Isn't the deterministic PRNG "secure enough"?
|
||||||
|
---------------------------------------------
|
||||||
|
|
||||||
|
In a word, "No" - that's why there's a warning in the module documentation
|
||||||
|
that says not to use it for security sensitive purposes. While we're not
|
||||||
|
currently aware of any studies of Python's random number generator specifically,
|
||||||
|
studies of PHP's random number generator [#php]_ have demonstrated the ability
|
||||||
|
to use weaknesses in that subsystem to facilitate a practical attack on
|
||||||
|
password recovery tokens in popular PHP web applications.
|
||||||
|
|
||||||
|
However, one of the rules of secure software development is that "attacks only
|
||||||
|
get better, never worse", so it may be that by the time Python 3.6 is released
|
||||||
|
we will actually see a practical attack on Python's deterministic PRNG publicly
|
||||||
|
documented.
|
||||||
|
|
||||||
|
Security fatigue in the Python ecosystem
|
||||||
|
----------------------------------------
|
||||||
|
|
||||||
|
Over the past few years, the computing industry as a whole has been
|
||||||
|
making a concerted effort to upgrade the shared network infrastructure we all
|
||||||
|
depend on to a "secure by default" stance. As one of the most widely used
|
||||||
|
programming languages for network service development (including the OpenStack
|
||||||
|
Infrastructure-as-a-Service platform) and for systems administration
|
||||||
|
on Linux systems in general, a fair share of that burden has fallen on the
|
||||||
|
Python ecosystem, which is understandably frustrating for Pythonistas using
|
||||||
|
Python in other contexts where these issues aren't of as great a concern.
|
||||||
|
|
||||||
|
This consideration is one of the primary factors driving the substantial
|
||||||
|
backwards compatibility improvements in this proposal relative to the initial
|
||||||
|
draft concept posted to python-ideas [#draft]_.
|
||||||
|
|
||||||
|
Acknowledgements
|
||||||
|
================
|
||||||
|
|
||||||
|
* Theo de Raadt, for making the suggestion to Guido van Rossum that we
|
||||||
|
seriously consider defaulting to a cryptographically secure random number
|
||||||
|
generator
|
||||||
|
* Serhiy Storchaka, Terry Reedy, Petr Viktorin, and anyone else in the
|
||||||
|
python-ideas threads that suggested the approach of transparently switching
|
||||||
|
to the ``random.Random`` implementation when any of the functions that only
|
||||||
|
make sense for a deterministic RNG are called
|
||||||
|
* Nathaniel Smith for providing the reference on practical attacks against
|
||||||
|
PHP's random number generator when used to generate password reset tokens
|
||||||
|
* Donald Stufft for pursuing additional discussions with network security
|
||||||
|
experts that suggested the introduction of a userspace CSPRNG would mean
|
||||||
|
additional complexity for insufficient gain relative to just using the
|
||||||
|
system RNG directly
|
||||||
|
* Paul Moore for eloquently making the case for the current level of security
|
||||||
|
fatigue in the Python ecosystem
|
||||||
|
|
||||||
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
|
.. [#breaches] Visualization of data breaches involving more than 30k records (each)
|
||||||
|
(http://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/)
|
||||||
|
|
||||||
|
.. [#uconnect] Remote UConnect hack for Jeep Cherokee
|
||||||
|
(http://www.wired.com/2015/07/hackers-remotely-kill-jeep-highway/)
|
||||||
|
|
||||||
|
.. [#php] PRNG based attack against password reset tokens in PHP applications
|
||||||
|
(https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf)
|
||||||
|
|
||||||
|
.. [#search] Search link for "python password generator"
|
||||||
|
(https://www.google.com.au/search?q=python+password+generator)
|
||||||
|
|
||||||
|
.. [#csprng] python-ideas thread discussing using a userspace CSPRNG
|
||||||
|
(https://mail.python.org/pipermail/python-ideas/2015-September/035886.html)
|
||||||
|
|
||||||
|
.. [#draft] Initial draft concept that eventually became this PEP
|
||||||
|
(https://mail.python.org/pipermail/python-ideas/2015-September/036095.html)
|
||||||
|
|
||||||
|
.. [#nocsprng] Safely generating random numbers
|
||||||
|
(http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/)
|
||||||
|
|
||||||
|
.. [#ieeetopten] IEEE Spectrum 2015 Top Ten Programming Languages
|
||||||
|
(http://spectrum.ieee.org/computing/software/the-2015-top-ten-programming-languages)
|
||||||
|
|
||||||
|
.. [#owasptopten] OWASP Top Ten Web Security Issues for 2013
|
||||||
|
(https://www.owasp.org/index.php/OWASP_Top_Ten_Project#tab=OWASP_Top_10_for_2013)
|
||||||
|
|
||||||
|
.. [#print] Stack Overflow answer for missing parentheses in call to print
|
||||||
|
(http://stackoverflow.com/questions/25445439/what-does-syntaxerror-missing-parentheses-in-call-to-print-mean-in-python/25445440#25445440)
|
||||||
|
|
||||||
|
.. [#bcrypt] Bypassing bcrypt through an insecure data cache
|
||||||
|
(http://arstechnica.com/security/2015/09/once-seen-as-bulletproof-11-million-ashley-madison-passwords-already-cracked/)
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
..
|
||||||
|
Local Variables:
|
||||||
|
mode: indented-text
|
||||||
|
indent-tabs-mode: nil
|
||||||
|
sentence-end-double-space: t
|
||||||
|
fill-column: 70
|
||||||
|
coding: utf-8
|
||||||
|
End:
|
|
@ -0,0 +1,205 @@
|
||||||
|
PEP: 505
|
||||||
|
Title: None coalescing operators
|
||||||
|
Version: $Revision$
|
||||||
|
Last-Modified: $Date$
|
||||||
|
Author: Mark E. Haase <mehaase@gmail.com>
|
||||||
|
Status: Draft
|
||||||
|
Type: Standards Track
|
||||||
|
Content-Type: text/x-rst
|
||||||
|
Created: 18-Sep-2015
|
||||||
|
Python-Version: 3.6
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
Several modern programming languages have so-called "null coalescing" or
|
||||||
|
"null aware" operators, including C#, Dart, Perl, Swift, and PHP (starting in
|
||||||
|
version 7). These operators provide syntactic sugar for common patterns
|
||||||
|
involving null references. [1]_ [2]_
|
||||||
|
|
||||||
|
* The "null coalescing" operator is a binary operator that returns its first
|
||||||
|
non-null operand.
|
||||||
|
* The "null aware member access" operator is a binary operator that accesses
|
||||||
|
an instance member only if that instance is non-null. It returns null
|
||||||
|
otherwise.
|
||||||
|
* The "null aware index access" operator is a binary operator that accesses a
|
||||||
|
member of a collection only if that collection is non-null. It returns null
|
||||||
|
otherwise.
|
||||||
|
|
||||||
|
Python does not have any directly equivalent syntax. The ``or`` operator can
|
||||||
|
be used to similar effect but checks for a truthy value, not ``None``
|
||||||
|
specifically. The ternary operator ``... if ... else ...`` can be used for
|
||||||
|
explicit null checks but is more verbose and typically duplicates part of the
|
||||||
|
expression in between ``if`` and ``else``. The proposed ``None`` coalescing
|
||||||
|
and ``None`` aware operators offer an alternative syntax that is more
|
||||||
|
intuitive and concise.
|
||||||
|
|
||||||
|
|
||||||
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
|
Null Coalescing Operator
|
||||||
|
------------------------
|
||||||
|
|
||||||
|
The following code illustrates how the ``None`` coalescing operators would
|
||||||
|
work in Python::
|
||||||
|
|
||||||
|
>>> title = 'My Title'
|
||||||
|
>>> title ?? 'Default Title'
|
||||||
|
'My Title'
|
||||||
|
>>> title = None
|
||||||
|
>>> title ?? 'Default Title'
|
||||||
|
'Default Title'
|
||||||
|
|
||||||
|
Similar behavior can be achieved with the ``or`` operator, but ``or`` checks
|
||||||
|
whether its left operand is false-y, not specifically ``None``. This can lead
|
||||||
|
to surprising behavior. Consider the scenario of computing the price of some
|
||||||
|
products a customer has in his/her shopping cart::
|
||||||
|
|
||||||
|
>>> price = 100
|
||||||
|
>>> requested_quantity = 5
|
||||||
|
>>> default_quantity = 1
|
||||||
|
>>> (requested_quantity or default_quantity) * price
|
||||||
|
500
|
||||||
|
>>> requested_quantity = None
|
||||||
|
>>> (requested_quantity or default_quantity) * price
|
||||||
|
100
|
||||||
|
>>> requested_quantity = 0
|
||||||
|
>>> (requested_quantity or default_quantity) * price # oops!
|
||||||
|
100
|
||||||
|
|
||||||
|
This type of bug is not possible with the ``None`` coalescing operator,
|
||||||
|
because there is no implicit type coercion to ``bool``::
|
||||||
|
|
||||||
|
>>> price = 100
|
||||||
|
>>> requested_quantity = 0
|
||||||
|
>>> default_quantity = 1
|
||||||
|
>>> (requested_quantity ?? default_quantity) * price
|
||||||
|
0
|
||||||
|
|
||||||
|
The same correct behavior can be achieved with the ternary operator. Here is
|
||||||
|
an excerpt from the popular Requests package::
|
||||||
|
|
||||||
|
data = [] if data is None else data
|
||||||
|
files = [] if files is None else files
|
||||||
|
headers = {} if headers is None else headers
|
||||||
|
params = {} if params is None else params
|
||||||
|
hooks = {} if hooks is None else hooks
|
||||||
|
|
||||||
|
This particular formulation has the undesirable effect of putting the operands
|
||||||
|
in an unintuitive order: the brain thinks, "use ``data`` if possible and use
|
||||||
|
``[]`` as a fallback," but the code puts the fallback *before* the preferred
|
||||||
|
value.
|
||||||
|
|
||||||
|
The author of this package could have written it like this instead::
|
||||||
|
|
||||||
|
data = data if data is not None else []
|
||||||
|
files = files if files is not None else []
|
||||||
|
headers = headers if headers is not None else {}
|
||||||
|
params = params if params is not None else {}
|
||||||
|
hooks = hooks if hooks is not None else {}
|
||||||
|
|
||||||
|
This ordering of the operands is more intuitive, but it requires 4 extra
|
||||||
|
characters (for "not "). It also highlights the repetition of identifiers:
|
||||||
|
``data if data``, ``files if files``, etc. The ``None`` coalescing operator
|
||||||
|
improves readability::
|
||||||
|
|
||||||
|
data = data ?? []
|
||||||
|
files = files ?? []
|
||||||
|
headers = headers ?? {}
|
||||||
|
params = params ?? {}
|
||||||
|
hooks = hooks ?? {}
|
||||||
|
|
||||||
|
The ``None`` coalescing operator also has a corresponding assignment shortcut.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
data ?= []
|
||||||
|
files ?= []
|
||||||
|
headers ?= {}
|
||||||
|
params ?= {}
|
||||||
|
hooks ?= {}
|
||||||
|
|
||||||
|
The ``None`` coalescing operator is left-associative, which allows for easy
|
||||||
|
chaining::
|
||||||
|
|
||||||
|
>>> user_title = None
|
||||||
|
>>> local_default_title = None
|
||||||
|
>>> global_default_title = 'Global Default Title'
|
||||||
|
>>> user_title ?? local_default_title ?? global_default_title
|
||||||
|
'Global Default Title'
|
||||||
|
|
||||||
|
The direction of associativity is important because the ``None`` coalescing
|
||||||
|
operator short circuits: if its left operand is non-null, then the right
|
||||||
|
operand is not evaluated.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
>>> def get_default(): raise Exception()
|
||||||
|
>>> 'My Title' ?? get_default()
|
||||||
|
'My Title'
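
Since ``??`` is not valid in current Python, the intended semantics can be
approximated today with ordinary functions. The helpers below are
hypothetical names introduced only for illustration; they are not part of
this proposal. The second one takes a callable so the fallback is computed
only when needed, mirroring the short circuit behavior described above.

```python
def coalesce(*candidates):
    """Return the first candidate that is not None, or None if all are."""
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return None


def coalesce_lazy(value, make_default):
    """Call make_default() only when value is None."""
    return value if value is not None else make_default()


def get_default():
    raise Exception()


# 0 is false-y but not None, so it is preserved (unlike with ``or``).
quantity = coalesce(0, 1)

# The fallback is never called because the left operand is not None.
title = coalesce_lazy('My Title', get_default)
```

Note that ``coalesce`` evaluates all of its arguments before it runs, which
is one reason a function cannot fully replace a short circuiting operator.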
|
||||||
|
|
||||||
|
|
||||||
|
Null-Aware Member Access Operator
|
||||||
|
---------------------------------
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
>>> title = 'My Title'
|
||||||
|
>>> title.upper()
|
||||||
|
'MY TITLE'
|
||||||
|
>>> title = None
|
||||||
|
>>> title.upper()
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "<stdin>", line 1, in <module>
|
||||||
|
AttributeError: 'NoneType' object has no attribute 'upper'
|
||||||
|
>>> title?.upper()
|
||||||
|
None
|
||||||
|
|
||||||
|
|
||||||
|
Null-Aware Index Access Operator
|
||||||
|
---------------------------------
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
>>> person = {'name': 'Mark', 'age': 32}
|
||||||
|
>>> person['name']
|
||||||
|
'Mark'
|
||||||
|
>>> person = None
|
||||||
|
>>> person['name']
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "<stdin>", line 1, in <module>
|
||||||
|
TypeError: 'NoneType' object is not subscriptable
|
||||||
|
>>> person?['name']
|
||||||
|
None
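
The two ``None`` aware access operators can likewise be approximated in
current Python. The helper names below are assumptions made for this sketch
and do not appear in the proposal.

```python
def none_aware_call(obj, method_name, *args, **kwargs):
    """Approximate ``obj?.method(...)``: skip the call when obj is None."""
    if obj is None:
        return None
    return getattr(obj, method_name)(*args, **kwargs)


def none_aware_item(container, key):
    """Approximate ``container?[key]``: skip the lookup when it is None."""
    if container is None:
        return None
    return container[key]


upper_title = none_aware_call('My Title', 'upper')
missing_title = none_aware_call(None, 'upper')
name = none_aware_item({'name': 'Mark', 'age': 32}, 'name')
missing_name = none_aware_item(None, 'name')
```

As with the proposed operators, only a ``None`` receiver is handled: a
missing key or attribute on a real object still raises the usual exception.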
|
||||||
|
|
||||||
|
|
||||||
|
Specification
|
||||||
|
=============
|
||||||
|
|
||||||
|
|
||||||
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
|
.. [1] Wikipedia: Null coalescing operator
|
||||||
|
(https://en.wikipedia.org/wiki/Null_coalescing_operator)
|
||||||
|
|
||||||
|
.. [2] Seth Ladd's Blog: Null-aware operators in Dart
|
||||||
|
(http://blog.sethladd.com/2015/07/null-aware-operators-in-dart.html)
|
||||||
|
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
..
|
||||||
|
Local Variables:
|
||||||
|
mode: indented-text
|
||||||
|
indent-tabs-mode: nil
|
||||||
|
sentence-end-double-space: t
|
||||||
|
fill-column: 70
|
||||||
|
coding: utf-8
|
||||||
|
End:
|
|
@ -0,0 +1,356 @@
|
||||||
|
PEP: 506
|
||||||
|
Title: Adding A Secrets Module To The Standard Library
|
||||||
|
Version: $Revision$
|
||||||
|
Last-Modified: $Date$
|
||||||
|
Author: Steven D'Aprano <steve@pearwood.info>
|
||||||
|
Status: Draft
|
||||||
|
Type: Standards Track
|
||||||
|
Content-Type: text/x-rst
|
||||||
|
Created: 19-Sep-2015
|
||||||
|
Python-Version: 3.6
|
||||||
|
Post-History:
|
||||||
|
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
This PEP proposes the addition of a module for common security-related
|
||||||
|
functions such as generating tokens to the Python standard library.
|
||||||
|
|
||||||
|
|
||||||
|
Definitions
|
||||||
|
===========
|
||||||
|
|
||||||
|
Some common abbreviations used in this proposal:
|
||||||
|
|
||||||
|
* PRNG:
|
||||||
|
|
||||||
|
Pseudo Random Number Generator. A deterministic algorithm used
|
||||||
|
to produce random-looking numbers with certain desirable
|
||||||
|
statistical properties.
|
||||||
|
|
||||||
|
* CSPRNG:
|
||||||
|
|
||||||
|
Cryptographically Strong Pseudo Random Number Generator. An
|
||||||
|
algorithm used to produce random-looking numbers which are
|
||||||
|
resistant to prediction.
|
||||||
|
|
||||||
|
* MT:
|
||||||
|
|
||||||
|
Mersenne Twister. An extensively studied PRNG which is currently
|
||||||
|
used by the ``random`` module as the default.
|
||||||
|
|
||||||
|
|
||||||
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
|
This proposal is motivated by concerns that Python's standard library
|
||||||
|
makes it too easy for developers to inadvertently make serious security
|
||||||
|
errors. Theo de Raadt, the founder of OpenBSD, contacted Guido van Rossum
|
||||||
|
and expressed some concern [1]_ about the use of MT for generating sensitive
|
||||||
|
information such as passwords, secure tokens, session keys and similar.
|
||||||
|
|
||||||
|
Although the documentation for the random module explicitly states that
|
||||||
|
the default is not suitable for security purposes [2]_, it is strongly
|
||||||
|
believed that this warning may be missed, ignored or misunderstood by
|
||||||
|
many Python developers. In particular:
|
||||||
|
|
||||||
|
* developers may not have read the documentation and consequently
|
||||||
|
not seen the warning;
|
||||||
|
|
||||||
|
* they may not realise that their specific use of it has security
|
||||||
|
implications; or
|
||||||
|
|
||||||
|
* not realising that there could be a problem, they have copied code
|
||||||
|
(or learned techniques) from websites which don't offer best
|
||||||
|
practices.
|
||||||
|
|
||||||
|
The first [3]_ hit when searching for "python how to generate passwords" on
|
||||||
|
Google is a tutorial that uses the default functions from the ``random``
|
||||||
|
module [4]_. Although it is not intended for use in web applications, it is
|
||||||
|
likely that similar techniques find themselves used in that situation.
|
||||||
|
The second hit is to a StackOverflow question about generating
|
||||||
|
passwords [5]_. Most of the answers given, including the accepted one, use
|
||||||
|
the default functions. When one user warned that the default could be
|
||||||
|
easily compromised, they were told "I think you worry too much." [6]_
|
||||||
|
|
||||||
|
This strongly suggests that the existing ``random`` module is an attractive
|
||||||
|
nuisance when it comes to generating (for example) passwords or secure
|
||||||
|
tokens.
|
||||||
|
|
||||||
|
Additional motivation (of a more philosophical bent) can be found in the
|
||||||
|
post which first proposed this idea [7]_.
|
||||||
|
|
||||||
|
|
||||||
|
Proposal
|
||||||
|
========
|
||||||
|
|
||||||
|
Alternative proposals have focused on the default PRNG in the ``random``
|
||||||
|
module, with the aim of providing "secure by default" cryptographically
|
||||||
|
strong primitives that developers can build upon without thinking about
|
||||||
|
security. (See Alternatives below.) This proposes a different approach:
|
||||||
|
|
||||||
|
* The standard library already provides cryptographically strong
|
||||||
|
primitives, but many users don't know they exist or when to use them.
|
||||||
|
|
||||||
|
* Instead of requiring crypto-naive users to write secure code, the
|
||||||
|
standard library should include a set of ready-to-use "batteries" for
|
||||||
|
the most common needs, such as generating secure tokens. This code
|
||||||
|
will both directly satisfy a need ("How do I generate a password reset
|
||||||
|
token?"), and act as an example of acceptable practices which
|
||||||
|
developers can learn from [8]_.
|
||||||
|
|
||||||
|
To do this, this PEP proposes that we add a new module to the standard
|
||||||
|
library, with the suggested name ``secrets``. This module will contain a
|
||||||
|
set of ready-to-use functions for common activities with security
|
||||||
|
implications, together with some lower-level primitives.
|
||||||
|
|
||||||
|
The suggestion is that ``secrets`` becomes the go-to module for dealing
|
||||||
|
with anything which should remain secret (passwords, tokens, etc.)
|
||||||
|
while the ``random`` module remains backward-compatible.
|
||||||
|
|
||||||
|
|
||||||
|
API and Implementation
|
||||||
|
======================
|
||||||
|
|
||||||
|
The contents of the ``secrets`` module are expected to evolve over time, and
|
||||||
|
likely will evolve between the time of writing this PEP and actual release
|
||||||
|
in the standard library [9]_. At the time of writing, the following functions
|
||||||
|
have been suggested:
|
||||||
|
|
||||||
|
* A high-level function for generating secure tokens suitable for use
|
||||||
|
in (e.g.) password recovery, as session keys, etc.
|
||||||
|
|
||||||
|
* A limited interface to the system CSPRNG, using either ``os.urandom``
|
||||||
|
directly or ``random.SystemRandom``. Unlike the ``random`` module, this
|
||||||
|
does not need to provide methods for seeding, getting or setting the
|
||||||
|
state, or any non-uniform distributions. It should provide the
|
||||||
|
following:
|
||||||
|
|
||||||
|
- A function for choosing items from a sequence, ``secrets.choice``.
|
||||||
|
- A function for generating an integer within some range, such as
|
||||||
|
``secrets.randrange`` or ``secrets.randint``.
|
||||||
|
- A function for generating a given number of random bits and/or bytes
|
||||||
|
as an integer.
|
||||||
|
- A similar function which returns the value as a hex digit string.
|
||||||
|
|
||||||
|
* ``hmac.compare_digest`` under the name ``equal``.
|
||||||
|
|
||||||
|
The consensus appears to be that there is no need to add a new CSPRNG to
|
||||||
|
the ``random`` module to support these uses; ``SystemRandom`` will be
|
||||||
|
sufficient.
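
As a rough illustration of the functions listed above, the following is a minimal sketch built only on ``random.SystemRandom``, ``os.urandom`` and ``hmac.compare_digest``. Only ``choice``, ``randrange``, ``randint`` and ``equal`` are names mentioned in this PEP; ``random_int_from_bytes`` and ``random_hex`` are hypothetical names chosen for this example, and the final module may well look different.

```python
import binascii
import hmac
import os
import random

# One shared SystemRandom instance; it draws from the operating
# system's CSPRNG and has no seedable or restorable state.
_sysrand = random.SystemRandom()

# Choosing items and generating integers within a range.
choice = _sysrand.choice
randrange = _sysrand.randrange
randint = _sysrand.randint


def random_int_from_bytes(nbytes=32):
    """Return nbytes random bytes as an integer (hypothetical name)."""
    return int.from_bytes(os.urandom(nbytes), 'big')


def random_hex(nbytes=32):
    """Return nbytes random bytes as a hex digit string (hypothetical name)."""
    return binascii.hexlify(os.urandom(nbytes)).decode('ascii')


def equal(a, b):
    """Compare two strings or byte strings using hmac.compare_digest."""
    return hmac.compare_digest(a, b)
```

A password-recovery token could then be produced with ``random_hex(16)``, and a submitted token checked with ``equal(submitted, expected)``.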
|
||||||
|
|
||||||
|
Some illustrative implementations have been given by Nick Coghlan [10]_.
|
||||||
|
This idea has also been discussed on the issue tracker for the
|
||||||
|
"cryptography" module [11]_.
|
||||||
|
|
||||||
|
The ``secrets`` module itself will be pure Python, and other Python
|
||||||
|
implementations can easily make use of it unchanged, or adapt it as
|
||||||
|
necessary.
|
||||||
|
|
||||||
|
|
||||||
|
Alternatives
|
||||||
|
============
|
||||||
|
|
||||||
|
One alternative is to change the default PRNG provided by the ``random``
|
||||||
|
module [12]_. This received considerable scepticism and outright opposition:
|
||||||
|
|
||||||
|
* There is fear that a CSPRNG may be slower than the current PRNG (which
|
||||||
|
in the case of MT is already quite slow).
|
||||||
|
|
||||||
|
* Some applications (such as scientific simulations, and replaying
|
||||||
|
gameplay) require the ability to seed the PRNG into a known state,
|
||||||
|
which a CSPRNG lacks by design.
|
||||||
|
|
||||||
|
* Another major use of the ``random`` module is for simple "guess a number"
|
||||||
|
games written by beginners, and many people are loath to make any
|
||||||
|
change to the ``random`` module which may make that harder.
|
||||||
|
|
||||||
|
* Although there is no proposal to remove MT from the ``random`` module,
|
||||||
|
there was considerable hostility to the idea of having to opt in to
|
||||||
|
a non-CSPRNG or any backwards-incompatible changes.
|
||||||
|
|
||||||
|
* Demonstrated attacks against MT are typically against PHP applications.
|
||||||
|
It is believed that PHP's version of MT is a significantly softer target
|
||||||
|
than Python's version, due to a poor seeding technique [13]_. Consequently,
|
||||||
|
without a proven attack against Python applications, many people object
|
||||||
|
to a backwards-incompatible change.
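
The seeding objection above can be seen directly in the existing ``random`` module. The sketch below is illustrative only, and the seed value is arbitrary: two MT generators seeded alike replay the same sequence, while ``random.SystemRandom`` ignores its seed and cannot report its state.

```python
import random

# Two Mersenne Twister generators with the same seed produce the
# same sequence, which is what simulations and replays rely on.
first = random.Random(12345)
second = random.Random(12345)
replay_a = [first.random() for _ in range(5)]
replay_b = [second.random() for _ in range(5)]

# SystemRandom accepts a seed for API compatibility but ignores it,
# and it has no state that could be saved or restored.
sysrand = random.SystemRandom()
sysrand.seed(12345)
try:
    sysrand.getstate()
    state_available = True
except NotImplementedError:
    state_available = False
```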
|
||||||
|
|
||||||
|
Nick Coghlan made an earlier suggestion for a globally configurable PRNG
|
||||||
|
which uses the system CSPRNG by default [14]_, but has since hinted that he
|
||||||
|
may withdraw it in favour of this proposal [15]_.
|
||||||
|
|
||||||
|
|
||||||
|
Comparison To Other Languages
|
||||||
|
=============================
|
||||||
|
|
||||||
|
* PHP
|
||||||
|
|
||||||
|
PHP includes a function ``uniqid`` [16]_ which by default returns a
|
||||||
|
thirteen character string based on the current time in microseconds.
|
||||||
|
Translated into Python syntax, it has the following signature::
|
||||||
|
|
||||||
|
def uniqid(prefix='', more_entropy=False) -> str
|
||||||
|
|
||||||
|
The PHP documentation warns that this function is not suitable for
|
||||||
|
security purposes. Nevertheless, various mature, well-known PHP
|
||||||
|
applications use it for that purpose (citation needed).
|
||||||
|
|
||||||
|
PHP 5.3 and later also includes a function ``openssl_random_pseudo_bytes``
|
||||||
|
[17]_. Translated into Python syntax, it has roughly the following
|
||||||
|
signature::
|
||||||
|
|
||||||
|
def openssl_random_pseudo_bytes(length: int) -> Tuple[str, bool]
|
||||||
|
|
||||||
|
This function returns a pseudo-random string of bytes of the given
|
||||||
|
length, and a boolean flag giving whether the string is considered
|
||||||
|
cryptographically strong. The PHP manual suggests that returning
|
||||||
|
anything but True should be rare except for old or broken platforms.
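
To see why a time-based identifier such as ``uniqid`` is unsuitable for security purposes, here is a rough Python approximation of its default behaviour. It assumes the thirteen characters are eight hex digits of seconds followed by five hex digits of microseconds; the function name ``uniqid_sketch`` is hypothetical and the ``more_entropy`` option is not modelled.

```python
import time


def uniqid_sketch(prefix=''):
    """Approximate PHP's default uniqid(): hex of the current time.

    Assumption: 8 hex digits of seconds, then 5 hex digits of
    microseconds. Anyone who can estimate the clock can guess it.
    """
    microseconds = time.time_ns() // 1000
    sec, usec = divmod(microseconds, 1000000)
    return '%s%08x%05x' % (prefix, sec, usec)
```

The output depends only on the clock, so an attacker who knows roughly when a value was generated has a small search space to cover.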
|
||||||
|
|
||||||
|
* JavaScript
|
||||||
|
|
||||||
|
Based on a rather cursory search [18]_, there don't appear to be any
|
||||||
|
well-known standard functions for producing strong random values in
|
||||||
|
JavaScript, although there may be good quality third-party libraries.
|
||||||
|
Standard JavaScript doesn't seem to include an interface to the
|
||||||
|
system CSPRNG either, and people have extensively written about the
|
||||||
|
weaknesses of JavaScript's ``Math.random`` [19]_.
|
||||||
|
|
||||||
|
* Ruby
|
||||||
|
|
||||||
|
The Ruby standard library includes a module ``SecureRandom`` [20]_
|
||||||
|
which includes the following methods:
|
||||||
|
|
||||||
|
* base64 - returns a Base64 encoded random string.
|
||||||
|
|
||||||
|
* hex - returns a random hexadecimal string.
|
||||||
|
|
||||||
|
* random_bytes - returns a random byte string.
|
||||||
|
|
||||||
|
* random_number - depending on the argument, returns either a random
|
||||||
|
integer in the range(0, n), or a random float between 0.0 and 1.0.
|
||||||
|
|
||||||
|
* urlsafe_base64 - returns a random URL-safe Base64 encoded string.
|
||||||
|
|
||||||
|
* uuid - returns a version 4 random Universally Unique IDentifier.
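
For comparison, each of the Ruby methods listed above has a close counterpart that can already be assembled from the Python standard library. The sketch below is illustrative only: the function names are hypothetical, the default of 16 bytes is an assumption, and stripping the Base64 padding in the URL-safe variant is a choice made for this example.

```python
import base64
import os
import random
import uuid

_sysrand = random.SystemRandom()


def random_bytes(n=16):
    """Return n random bytes from the system CSPRNG."""
    return os.urandom(n)


def hex_string(n=16):
    """Return n random bytes as a hexadecimal string."""
    return os.urandom(n).hex()


def base64_string(n=16):
    """Return n random bytes as a Base64 encoded string."""
    return base64.b64encode(os.urandom(n)).decode('ascii')


def urlsafe_base64_string(n=16):
    """Return n random bytes as URL-safe Base64, padding removed."""
    encoded = base64.urlsafe_b64encode(os.urandom(n))
    return encoded.rstrip(b'=').decode('ascii')


def random_number(n=0):
    """Return an int in range(0, n), or a float in [0.0, 1.0) if n is 0."""
    if n:
        return _sysrand.randrange(n)
    return _sysrand.random()


def random_uuid():
    """Return a version 4 (random) UUID as a string."""
    return str(uuid.uuid4())
```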
|
||||||
|
|
||||||
|
|
||||||
|
What Should Be The Name Of The Module?
|
||||||
|
======================================
|
||||||
|
|
||||||
|
There was a proposal to add a "random.safe" submodule, quoting the Zen
|
||||||
|
of Python "Namespaces are one honking great idea" koan. However, the
|
||||||
|
author of the Zen, Tim Peters, has come out against this idea [21]_, and
|
||||||
|
recommends a top-level module.
|
||||||
|
|
||||||
|
In discussion on the python-ideas mailing list so far, the name "secrets"
|
||||||
|
has received some approval, and no strong opposition.
|
||||||
|
|
||||||
|
|
||||||
|
Frequently Asked Questions
|
||||||
|
==========================
|
||||||
|
|
||||||
|
* Q: Is this a real problem? Surely MT is random enough that nobody can
|
||||||
|
predict its output.
|
||||||
|
|
||||||
|
A: The consensus among security professionals is that MT is not safe
|
||||||
|
in security contexts. It is not difficult to reconstruct the internal
|
||||||
|
state of MT [22]_ [23]_ and so predict all past and future values. There
|
||||||
|
are a number of known, practical attacks on systems using MT for
|
||||||
|
randomness [24]_.
|
||||||
|
|
||||||
|
While there are currently no known direct attacks on applications
|
||||||
|
written in Python due to the use of MT, there is widespread agreement
|
||||||
|
that such usage is unsafe.
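
The state reconstruction mentioned above can be demonstrated against Python's own ``random`` module. This is a sketch of the well-known technique of inverting MT19937's output tempering, not an attack on any real application: it assumes the observer sees 624 consecutive raw 32-bit outputs, and the helper names are hypothetical.

```python
import random


def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    result = y
    for _ in range(32 // shift + 1):
        result = y ^ (result >> shift)
    return result


def undo_left_shift_xor_mask(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    result = y
    for _ in range(32 // shift + 1):
        result = y ^ ((result << shift) & mask)
    return result & 0xFFFFFFFF


def untemper(y):
    """Recover an internal MT19937 state word from one 32-bit output."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor_mask(y, 15, 0xEFC60000)
    y = undo_left_shift_xor_mask(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y


def clone_mt(observed):
    """Build a generator that continues after 624 observed outputs."""
    words = tuple(untemper(value) for value in observed)
    clone = random.Random()
    # State layout used by CPython: (version, 624 words + index, gauss_next).
    clone.setstate((3, words + (624,), None))
    return clone


victim = random.Random()  # seeded from the OS; the seed is never seen
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_mt(observed)
predicted = [clone.getrandbits(32) for _ in range(10)]
actual = [victim.getrandbits(32) for _ in range(10)]
```

Here ``predicted`` and ``actual`` agree even though the victim's seed was never disclosed, which is why MT output should not be used for tokens or passwords.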
|
||||||
|
|
||||||
|
* Q: Is this an alternative to specialised cryptographic software such as SSL?
|
||||||
|
|
||||||
|
A: No. This is a "batteries included" solution, not a full-featured
|
||||||
|
"nuclear reactor". It is intended to mitigate some basic
|
||||||
|
security errors, not be a solution to all security-related issues. To
|
||||||
|
quote Nick Coghlan referring to his earlier proposal [25]_::
|
||||||
|
|
||||||
|
"...folks really are better off learning to use things like
|
||||||
|
cryptography.io for security sensitive software, so this change
|
||||||
|
is just about harm mitigation given that it's inevitable that a
|
||||||
|
non-trivial proportion of the millions of current and future
|
||||||
|
Python developers won't do that."
|
||||||
|
|
||||||
|
|
||||||
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
|
.. [1] https://mail.python.org/pipermail/python-ideas/2015-September/035820.html
|
||||||
|
|
||||||
|
.. [2] https://docs.python.org/3/library/random.html
|
||||||
|
|
||||||
|
.. [3] As of the date of writing. Also, as Google search terms may be
|
||||||
|
automatically customised for the user without their knowledge, some
|
||||||
|
readers may see different results.
|
||||||
|
|
||||||
|
.. [4] http://interactivepython.org/runestone/static/everyday/2013/01/3_password.html
|
||||||
|
|
||||||
|
.. [5] http://stackoverflow.com/questions/3854692/generate-password-in-python
|
||||||
|
|
||||||
|
.. [6] http://stackoverflow.com/questions/3854692/generate-password-in-python/3854766#3854766
|
||||||
|
|
||||||
|
.. [7] https://mail.python.org/pipermail/python-ideas/2015-September/036238.html
|
||||||
|
|
||||||
|
.. [8] At least those who are motivated to read the source code and documentation.
|
||||||
|
|
||||||
|
.. [9] Tim Peters suggests that bike-shedding the contents of the module will
|
||||||
|
be 10000 times more time consuming than actually implementing the
|
||||||
|
module. Words do not begin to express how much I am looking forward to
|
||||||
|
this.
|
||||||
|
|
||||||
|
.. [10] https://mail.python.org/pipermail/python-ideas/2015-September/036271.html
|
||||||
|
|
||||||
|
.. [11] https://github.com/pyca/cryptography/issues/2347
|
||||||
|
|
||||||
|
.. [12] Link needed.
|
||||||
|
|
||||||
|
.. [13] By default PHP seeds the MT PRNG with the time (citation needed),
|
||||||
|
which is exploitable by attackers, while Python seeds the PRNG with
|
||||||
|
output from the system CSPRNG, which is believed to be much harder to
|
||||||
|
exploit.
|
||||||
|
|
||||||
|
.. [14] http://legacy.python.org/dev/peps/pep-0504/
|
||||||
|
|
||||||
|
.. [15] https://mail.python.org/pipermail/python-ideas/2015-September/036243.html
|
||||||
|
|
||||||
|
.. [16] http://php.net/manual/en/function.uniqid.php
|
||||||
|
|
||||||
|
.. [17] http://php.net/manual/en/function.openssl-random-pseudo-bytes.php
|
||||||
|
|
||||||
|
.. [18] Volunteers and patches are welcome.
|
||||||
|
|
||||||
|
.. [19] http://ifsec.blogspot.fr/2012/05/cross-domain-mathrandom-prediction.html
|
||||||
|
|
||||||
|
.. [20] http://ruby-doc.org/stdlib-2.1.2/libdoc/securerandom/rdoc/SecureRandom.html
|
||||||
|
|
||||||
|
.. [21] https://mail.python.org/pipermail/python-ideas/2015-September/036254.html
|
||||||
|
|
||||||
|
.. [22] https://jazzy.id.au/2010/09/22/cracking_random_number_generators_part_3.html
|
||||||
|
|
||||||
|
.. [23] https://mail.python.org/pipermail/python-ideas/2015-September/036077.html
|
||||||
|
|
||||||
|
.. [24] https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf
|
||||||
|
|
||||||
|
.. [25] https://mail.python.org/pipermail/python-ideas/2015-September/036157.html
|
||||||
|
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
..
|
||||||
|
Local Variables:
|
||||||
|
mode: indented-text
|
||||||
|
indent-tabs-mode: nil
|
||||||
|
sentence-end-double-space: t
|
||||||
|
fill-column: 70
|
||||||
|
coding: utf-8
|
||||||
|
End:
|
|
@ -2,7 +2,7 @@ PEP: 3140
|
||||||
Title: str(container) should call str(item), not repr(item)
|
Title: str(container) should call str(item), not repr(item)
|
||||||
Version: $Revision$
|
Version: $Revision$
|
||||||
Last-Modified: $Date$
|
Last-Modified: $Date$
|
||||||
Author: Oleg Broytmann <phd@phd.pp.ru>,
|
Author: Oleg Broytman <phd@phdru.name>,
|
||||||
Jim J. Jewett <jimjjewett@gmail.com>
|
Jim J. Jewett <jimjjewett@gmail.com>
|
||||||
Discussions-To: python-3000@python.org
|
Discussions-To: python-3000@python.org
|
||||||
Status: Rejected
|
Status: Rejected
|
||||||
|
|