PEP: 476 Title: Enabling certificate verification by default for stdlib http clients Version: $Revision$ Last-Modified: $Date$ Author: Alex Gaynor Status: Accepted Type: Standards Track Content-Type: text/x-rst Created: 28-August-2014 Abstract ======== Currently when a standard library http client (the ``urllib``, ``urllib2``, ``http``, and ``httplib`` modules) encounters an ``https://`` URL it will wrap the network HTTP traffic in a TLS stream, as is necessary to communicate with such a server. However, during the TLS handshake it will not actually check that the server has an X509 certificate is signed by a CA in any trust root, nor will it verify that the Common Name (or Subject Alternate Name) on the presented certificate matches the requested host. The failure to do these checks means that anyone with a privileged network position is able to trivially execute a man in the middle attack against a Python application using either of these HTTP clients, and change traffic at will. This PEP proposes to enable verification of X509 certificate signatures, as well as hostname verification for Python's HTTP clients by default, subject to opt-out on a per-call basis. This change would be applied to Python 2.7, Python 3.4, and Python 3.5. Rationale ========= The "S" in "HTTPS" stands for secure. When Python's users type "HTTPS" they are expecting a secure connection, and Python should adhere to a reasonable standard of care in delivering this. Currently we are failing at this, and in doing so, APIs which appear simple are misleading users. When asked, many Python users state that they were not aware that Python failed to perform these validations, and are shocked. The popularity of ``requests`` (which enables these checks by default) demonstrates that these checks are not overly burdensome in any way, and the fact that it is widely recommended as a major security improvement over the standard library clients demonstrates that many expect a higher standard for "security by default" from their tools. The failure of various applications to note Python's negligence in this matter is a source of *regular* CVE assignment [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_. .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-4340 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-3533 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-5822 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-5825 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-1909 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2037 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2073 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2191 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-4111 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-6396 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-6444 Technical Details ================= Python would use the system provided certificate database on all platforms. Failure to locate such a database would be an error, and users would need to explicitly specify a location to fix it. This will be achieved by adding a new ``ssl._create_default_https_context`` function, which is the same as ``ssl.create_default_context``. ``http.client`` can then replace its usage of ``ssl._create_stdlib_context`` with the ``ssl._create_default_https_context``. Additionally ``ssl._create_stdlib_context`` is renamed ``ssl._create_unverified_context`` (an alias is kept around for backwards compatibility reasons). Trust database -------------- This PEP proposes using the system-provided certificate database. Previous discussions have suggested bundling Mozilla's certificate database and using that by default. This was decided against for several reasons: * Using the platform trust database imposes a lower maintenance burden on the Python developers -- shipping our own trust database would require doing a release every time a certificate was revoked. * Linux vendors, and other downstreams, would unbundle the Mozilla certificates, resulting in a more fragmented set of behaviors. * Using the platform stores makes it easier to handle situations such as corporate internal CAs. OpenSSL also has a pair of environment variables, ``SSL_CERT_DIR`` and ``SSL_CERT_FILE`` which can be used to point Python at a different certificate database. Backwards compatibility ----------------------- This change will have the appearance of causing some HTTPS connections to "break", because they will now raise an Exception during handshake. This is misleading however, in fact these connections are presently failing silently, an HTTPS URL indicates an expectation of confidentiality and authentication. The fact that Python does not actually verify that the user's request has been made is a bug, further: "Errors should never pass silently." Nevertheless, users who have a need to access servers with self-signed or incorrect certificates would be able to do so by providing a context with custom trust roots or which disables validation (documentation should strongly recommend the former where possible). Users will also be able to add necessary certificates to system trust stores in order to trust them globally. Twisted's 14.0 release made this same change, and it has been met with almost no opposition. Opting out ---------- For users who wish to opt out of certificate verification, they can achieve this by providing the ``context`` argument to ``urllib.urlopen``:: import ssl # This restores the same behavior as before. context = ssl._create_unverified_context() urllib.urlopen("https://no-valid-cert", context=context) It is also possible **though highly discouraged** to globally disable verification by monkeypatching the ``ssl`` module:: import ssl ssl._create_default_https_context = ssl._create_unverified_context Other protocols =============== This PEP only proposes requiring this level of validation for HTTP clients, not for other protocols such as SMTP. This is because while a high percentage of HTTPS servers have correct certificates, as a result of the validation performed by browsers, for other protocols self-signed or otherwise incorrect certificates are far more common. Note that for SMTP at least, this appears to be changing and should be reviewed for a potential similar PEP in the future: * https://www.facebook.com/notes/protect-the-graph/the-current-state-of-smtp-starttls-deployment/1453015901605223 * https://www.facebook.com/notes/protect-the-graph/massive-growth-in-smtp-starttls-deployment/1491049534468526 Python Versions =============== This PEP describes changes that will occur on both the 3.4.x, 3.5 and 2.7.X branches. For 2.7.X this will require backporting the ``context`` (``SSLContext``) argument to ``httplib``, in addition to the features already backported in :pep:`466`. Implementation ============== * **LANDED**: `Issue 22366 `_ adds the ``context`` argument to ``urlib.request.urlopen``. * `Issue 22417 `_ implements the substance of this PEP. Copyright ========= This document has been placed into the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8