PEP: 594 Title: Removing dead batteries from the standard library Author: Christian Heimes Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 20-May-2019 Post-History: 21-May-2019 Abstract ======== This PEP proposed a list of standard library modules to be removed from the standard library. The modules are mostly historic data formats and APIs that have been superseded a long time ago, e.g. Mac OS 9 and Commodore. Rationale ========= Back in the early days of Python, the interpreter came with a large set of useful modules. This was often refrained to as "batteries included" philosophy and was one of the corner stones to Python's success story. Users didn't have to figure out how to download and install separate packages in order to write a simple web server or parse email. Times have changed. The introduction of the cheese shop (PyPI), setuptools, and later pip, it became simple and straight forward to download and install packages. Nowadays Python has a rich and vibrant ecosystem of third party packages. It's pretty much standard to either install packages from PyPI or use one of the many Python or Linux distributions. On the other hand, Python's standard library is piling up cruft, unnecessary duplication of functionality, and dispensable features. This is undesirable for several reasons. * Any additional module increases the maintenance cost for the Python core development team. The team has limited resources, reduced maintenance cost frees development time for other improvements. * Modules in the standard library are generally favored and seen as the de-facto solution for a problem. A majority of users only pick 3rd party modules to replace a stdlib module, when they have a compelling reason, e.g. lxml instead of `xml`. The removal of an unmaintained stdlib module increases the chances of a community contributed module to become widely used. * A lean and mean standard library benefits platforms with limited resources like devices with just a few hundred kilobyte of storage (e.g. BBC Micro:bit). Python on mobile platforms like BeeWare or WebAssembly (e.g. pyodide) also benefit from reduced download size. The modules in the PEP have been selected for deprecation because their removal is either least controversial or most beneficial. For example least controversial are 30 years old multimedia formats like ``sunau`` audio format, which was used on SPARC and NeXT workstations in the late 1980ties. The ``crypt`` module has fundamental flaws that are better solved outside the standard library. This PEP also designates some modules as not scheduled for removal. Some modules have been deprecated for several releases or seem unnecessary at first glance. However it is beneficial to keep the modules in the standard library, mostly for environments where installing a package from PyPI is not an option. This can be cooperate environments or class rooms where external code is not permitted without legal approval. * The usage of FTP is declining, but some files are still provided over the FTP protocol or hosters offer FTP to upload content. Therefore ``ftplib`` is going to stay. * The ``optparse`` and ``getopt`` module are widely used. They are mature modules with very low maintenance overhead. * According to David Beazley [5]_ the ``wave`` module is easy to teach to kids and can make crazy sounds. Making a computer generate crazy sounds is powerful and highly motivating exercise for a 9yo aspiring developer. It's a fun battery to keep. Deprecation schedule ==================== 3.8 --- This PEP targets Python 3.8. Version 3.8.0 final is scheduled to be released a few months before Python 2.7 will reach its end of lifetime. We expect that Python 3.8 will be targeted by users that migrate to Python 3 in 2019 and 2020. To reduce churn and to allow a smooth transition from Python 2, Python 3.8 will neither raise `DeprecationWarning` nor remove any modules that have been scheduled for removal. Instead deprecated modules will just be *documented* as deprecated. Optionally modules may emit a `PendingDeprecationWarning`. All deprecated modules will also undergo a feature freeze. No additional features should be added. Bug should still be fixed. 3.9 --- Starting with Python 3.9, deprecated modules will start issuing `DeprecationWarning`. The `parser`_ module is removed and potentially replaced with a new module. 3.10 ---- In 3.10 all deprecated modules will be removed from the CPython repository together with tests, documentation, and autoconf rules. PEP acceptance process ====================== 3.8.0b1 is scheduled to be release shortly after the PEP is officially submitted. Since it's improbable that the PEP will pass all stages of the PEP process in time, I propose a two step acceptance process that is analogous Python's two release deprecation process. The first *provisionally accepted* phase targets Python 3.8.0b1. In the first phase no code is changes or removed. Modules are only documented as deprecated. The only exception is the `parser`_ module. It has been documented as deprecated since Python 2.5 and is scheduled for removal for 3.9 to make place for a more advanced parser. The final decision, which modules will be removed and how the removed code is preserved, can be delayed for another year. Deprecated modules ================== The modules are grouped as data encoding, multimedia, network, OS interface, and misc modules. The majority of modules are for old data formats or old APIs. Some others are rarely useful and have better replacements on PyPI, e.g. Pillow for image processing or NumPy-based projects to deal with audio processing. .. csv-table:: Table 1: Proposed modules deprecations :header: "Module", "Deprecated in", "To be removed", "Replacement" :widths: 1, 1, 1, 2 aifc,3.8,3.10,\- asynchat,3.8,3.10,asyncio asyncore,3.8,3.10,asyncio audioop,3.8,3.10,\- binhex,3.8,3.10,\- cgi,3.8,3.10,\- cgitb,3.8,3.10,\- chunk,3.8,3.10,\- colorsys,3.8,3.10,"colormath, colour, colorspacious, Pillow" crypt,3.8,3.10,"bcrypt, argon2cffi, hashlib, passlib" fileinput,\-,**keep**,argparse formatter,3.4,3.10,\- fpectl,**3.7**,**3.7**,\- getopt,**3.2**,**keep**,"argparse, optparse" imghdr,3.8,3.10,"filetype, puremagic, python-magic" imp,**3.4**,3.10,importlib lib2to3,\-,**keep**, macpath,**3.7**,**3.8**,\- msilib,3.8,3.10,\- nntplib,3.8,3.10,\- nis,3.8,3.10,\- optparse,\-,**keep**,argparse ossaudiodev,3.8,3.10,\- parser,**2.5**,**3.9**,"ast, lib2to3.pgen2" pipes,3.8,3.10,subprocess smtpd,"**3.4.7**, **3.5.4**",3.10,aiosmtpd sndhdr,3.8,3.10,"filetype, puremagic, python-magic" spwd,3.8,3.10,"python-pam, simplepam" sunau,3.8,3.10,\- uu,3.8,3.10,\- wave,\-,**keep**, xdrlib,3.8,3.10,\- Data encoding modules --------------------- binhex ~~~~~~ The `binhex `_ module encodes and decodes Apple Macintosh binhex4 data. It was originally developed for TSR-80. In the 1980s and early 1990s it was used on classic Mac OS 9 to encode binary email attachments. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute **none** uu ~~ The `uu `_ module provides uuencode format, an old binary encoding format for email from 1980. The uu format has been replaced by MIME. The uu codec is provided by the binascii module. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute **none** xdrlib ~~~~~~ The `xdrlib `_ module supports the Sun External Data Representation Standard. XDR is an old binary serialization format from 1987. These days it's rarely used outside specialized domains like NFS. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute **none** Multimedia modules ------------------ aifc ~~~~ The `aifc `_ module provides support for reading and writing AIFF and AIFF-C files. The Audio Interchange File Format is an old audio format from 1988 based on Amiga IFF. It was most commonly used on the Apple Macintosh. These days only few specialized application use AIFF. Module type pure Python (depends on `audioop`_ C extension) Deprecated in 3.8 To be removed in 3.10 Substitute **none** audioop ~~~~~~~ The `audioop `_ module contains helper functions to manipulate raw audio data and adaptive differential pulse-code modulated audio data. The module is implemented in C without any additional dependencies. The `aifc`_, `sunau`_, and `wave`_ module depend on `audioop`_ for some operations. The byteswap operation in the `wave`_ module can be substituted with little work. Module type C extension Deprecated in 3.8 To be removed in 3.10 Substitute **none** colorsys ~~~~~~~~ The `colorsys `_ module defines color conversion functions between RGB, YIQ, HSL, and HSV coordinate systems. The PyPI packages *colormath*, *colour*, and *colorspacious* provide more and advanced features. The Pillow library is better suited to transform images between color systems. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute `colormath `_, `colour `_ `colorspacious `_, `Pillow `_ chunk ~~~~~ The `chunk `_ module provides support for reading and writing Electronic Arts' Interchange File Format. IFF is an old audio file format originally introduced for Commodore and Amiga. The format is no longer relevant. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute **none** imghdr ~~~~~~ The `imghdr `_ module is a simple tool to guess the image file format from the first 32 bytes of a file or buffer. It supports only a limited amount of formats and neither returns resolution nor color depth. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute `puremagic `_, `filetype `_, `python-magic `_ ossaudiodev ~~~~~~~~~~~ The `ossaudiodev `_ module provides support for Open Sound System, an interface to sound playback and capture devices. OSS was initially free software, but later support for newer sound devices and improvements were proprietary. Linux community abandoned OSS in favor of ALSA [1]_. Some operation systems like OpenBSD and NetBSD provide an incomplete [2]_ emulation of OSS. Module type C extension Deprecated in 3.8 To be removed in 3.10 Substitute **none** sndhdr ~~~~~~ The `sndhdr `_ module is similar to the `imghdr`_ module but for audio formats. It guesses file format, channels, frame rate, and sample widths from the first 512 bytes of a file or buffer. The module only supports AU, AIFF, HCOM, VOC, WAV, and other ancient formats. Module type pure Python (depends on `audioop`_ C extension for some operations) Deprecated in 3.8 To be removed in 3.10 Substitute `puremagic `_, `filetype `_, `python-magic `_ sunau ~~~~~ The `sunau `_ module provides support for Sun AU sound format. It's yet another old, obsolete file format. Module type pure Python (depends on `audioop`_ C extension for some operations) Deprecated in 3.8 To be removed in 3.10 Substitute **none** Networking modules ------------------ asynchat ~~~~~~~~ The `asynchat `_ module is build on top of `asyncore`_ and has been deprecated since Python 3.6. Module type pure Python Deprecated in 3.6 Removed in 3.10 Substitute asyncio asyncore ~~~~~~~~ The `asyncore `_ module was the first module for asynchronous socket service clients and servers. It has been replaced by asyncio and is deprecated since Python 3.6. The ``asyncore`` module is also used in stdlib tests. The tests for ``ftplib``, ``logging``, ``smptd``, ``smtplib``, and ``ssl`` are partly based on ``asyncore``. These tests must be updated to use asyncio or threading. Module type pure Python Deprecated in 3.6 Removed in 3.10 Substitute asyncio cgi ~~~ The `cgi `_ module is a support module for Common Gateway Interface (CGI) scripts. CGI is deemed as inefficient because every incoming request is handled in a new process. PEP 206 considers the module as *designed poorly and are now near-impossible to fix*. Several people proposed to either keep the cgi module for features like `cgi.parse_qs()` or move `cgi.escape()` to a different module. The functions `cgi.parse_qs` and `cgi.parse_qsl` have been deprecated for a while and are actually aliases for `urllib.parse.parse_qs` and `urllib.parse.parse_qsl`. The function `cgi.quote` has been deprecated in favor of `html.quote` with secure default values. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute **none** cgitb ~~~~~ The `cgitb `_ module is a helper for the cgi module for configurable tracebacks. The ``cgitb`` module is not used by any major Python web framework (Django, Pyramid, Plone, Flask, CherryPy, or Bottle). Only Paste uses it in an optional debugging middleware. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute **none** smtpd ~~~~~ The `smtpd `_ module provides a simple implementation of a SMTP mail server. The module documentation marks the module as deprecated and recommends ``aiosmtpd`` instead. The deprecation message was added in releases 3.4.7, 3.5.4, and 3.6.1. Module type pure Python Deprecated in **3.7** To be removed in 3.10 Substitute aiosmtpd nntplib ~~~~~~~ The `nntplib `_ module implements the client side of the Network News Transfer Protocol (nntp). News groups used to be a dominant platform for online discussions. Over the last two decades, news has been slowly but steadily replaced with mailing lists and web-based discussion platforms. Twisted is also `planning `_ to deprecate NNTP support. The ``nntplib`` tests have been the cause of additional work in the recent past. Python only contains client side of NNTP. The tests connect to external news server. The servers are sometimes unavailble, too slow, or do not work correctly over IPv6. The situation causes flaky test runs on buildbots. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute **none** Operating system interface -------------------------- crypt ~~~~~ The `crypt `_ module implements password hashing based on ``crypt(3)`` function from ``libcrypt`` or ``libxcrypt`` on Unix-like platform. The algorithms are mostly old, of poor quality and insecure. Users are discouraged to use them. * The module is not available on Windows. Cross-platform application need an alternative implementation any way. * Only DES encryption is guarenteed to be available. DES has an extremely limited key space of 2**56. * MD5, salted SHA256, salted SHA512, and Blowfish are optional extension. SSHA256 and SSHA512 are glibc extensions. Blowfish (bcrypt) is the only algorithm that is still secure. However it's in glibc and therefore not commonly available on Linux. * Depending on the platform, the ``crypt`` module is not thread safe. Only implementations with ``crypt_r(3)`` are thread safe. * The module was never useful to interact with system user and password databases. On BSD, macOS, and Linux, all user authentication and password modification operations must go through PAM (pluggable authentication module), see `spwd`_ deprecation. Module type C extension + Python module Deprecated in 3.8 To be removed in 3.10 Substitute `bcrypt `_, `passlib `_, `argon2cffi `_, hashlib module (PBKDF2, scrypt) macpath ~~~~~~~ The `macpath `_ module provides Mac OS 9 implementation of os.path routines. Mac OS 9 is no longer supported Module type pure Python Deprecated in 3.7 Removed in 3.8 Substitute **none** nis ~~~ The `nis `_ module provides NIS/YP support. Network Information Service / Yellow Pages is an old and deprecated directory service protocol developed by Sun Microsystems. It's designed successor NIS+ from 1992 never took off. For a long time, libc's Name Service Switch, LDAP, and Kerberos/GSSAPI are considered a more powerful and more secure replacement of NIS. Module type C extension Deprecated in 3.8 To be removed in 3.10 Substitute **none** spwd ~~~~ The `spwd `_ module provides direct access to Unix shadow password database using non-standard APIs. In general it's a bad idea to use the spwd. The spwd circumvents system security policies, it does not use the PAM stack, and is only compatible with local user accounts, because it ignores NSS. The use of the ``spwd`` module for access control must be consider a *security bug*, as it bypasses PAM's access control. Further more the ``spwd`` module uses the `shadow(3) `_ APIs. Functions like ``getspnam(3)`` access the ``/etc/shadow`` file directly. This is dangerous and even forbidden for confined services on systems with a security engine like SELinux or AppArmor. Module type C extension Deprecated in 3.8 To be removed in 3.10 Substitute `python-pam `_, `simpleplam `_ Misc modules ------------ formatter ~~~~~~~~~ The `formatter `_ module is an old text formatting module which has been deprecated since Python 3.4. Module type pure Python Deprecated in 3.4 To be removed in 3.10 Substitute *n/a* imp ~~~ The `imp `_ module is the predecessor of the `importlib `_ module. Most functions have been deprecated since Python 3.3 and the module since Python 3.4. Module type C extension Deprecated in 3.4 To be removed in 3.10 Substitute importlib msilib ~~~~~~ The `msilib `_ package is a Windows-only package. It supports the creation of Microsoft Installers (MSI). The package also exposes additional APIs to create cabinet files (CAB). The module is used to facilitate distutils to create MSI installers with ``bdist_msi`` command. In the past it was used to create CPython's official Windows installer, too. Microsoft is slowly moving away from MSI in favor of Windows 10 Apps (AppX) as new deployment model [3]_. Module type C extension + Python code Deprecated in 3.8 To be removed in 3.10 Substitute **none** parser ~~~~~~ The `parser `_ module provides an interface to Python’s internal parser and byte-code compiler. The stdlib has superior ways to interact with the parse tree. From Python 2.5 onward, it's much more convenient to cut in at the Abstract Syntax Tree (AST) generation and compilation stage. The ``parser`` module causes additional work. It's C code that must be kept in sync with any change to Python's grammar and internal parser. Pablo wants to remove the parser module and promote lib2to3's pgen2 instead [6]_. Most importantly the presence of the ``parser`` module makes it harder to switch to something more powerful than a LL(1) parser [7]_. Since the ``parser`` module is documented as deprecated since Python 2.5 and a new parsing technology is planned for 3.9, the ``parser`` module is scheduled for removal in 3.9. Module type C extension Deprecated in 3.8, documented as deprecated since **2.5** To be removed in **3.9** Substitute ast, lib2to3.pgen2 pipes ~~~~~ The `pipes `_ module provides helpers to pipe the input of one command into the output of another command. The module is built on top of ``os.popen``. Users are encouraged to use the subprocess module instead. Module type pure Python Deprecated in 3.8 To be removed in 3.10 Substitute subprocess module Removed modules =============== fpectl ------ The `fpectl `_ module was never built by default, its usage was discouraged and considered dangerous. It also required a configure flag that caused an ABI incompatibility. The module was removed in 3.7 by Nathaniel J. Smith in `bpo-29137 `_. Module type C extension + CAPI Deprecated in 3.7 Removed in 3.7 Substitute **none** Modules to keep =============== Some modules were originally proposed for deprecation. fileinput --------- The `fileinput `_ module implements a helpers to iterate over a list of files from ``sys.argv``. The module predates the optparser and argparser module. The same functionality can be implemented with the argparser module. Several core developers expressed their interest to keep the module in the standard library, as it is handy for quick scripts. Module type pure Python lib2to3 ------- The `lib2to3 `_ package provides the ``2to3`` command to transpile Python 2 code to Python 3 code. The package is useful for other tasks besides porting code from Python 2 to 3. For example `black`_ uses it for code reformatting. Module type pure Python getopt ------ The `getopt `_ module mimics C's getopt() option parser. Although users are encouraged to use argparse instead, the getopt module is still widely used. Module type pure Python Substitute argparse optparse -------- The `optparse `_ module is the predecessor of the argparse module. Although it has been deprecated for many years, it's still widely used. Module type pure Python Deprecated in 3.2 Substitute argparse wave ~~~~ The `wave `_ module provides support for the WAV sound format. The module uses one simple function from the `audioop`_ module to perform byte swapping between little and big endian formats. Before 24 bit WAV support was added, byte swap used to be implemented with the ``array`` module. To remove ``wave``'s dependency on the ``audioop``, the byte swap function could be either be moved to another module (e.g. ``operator``) or the ``array`` module could gain support for 24 bit (3 byte) arrays. Module type pure Python (depends on *byteswap* from `audioop`_ C extension) Deprecated in 3.8 To be removed in 3.10 Substitute *n/a* Future maintenance of removed modules ===================================== The main goal of the PEP is to reduce the burden and workload on the Python core developer team. Therefore removed modules will not be maintained by the core team as separate PyPI packages. However the removed code, tests and documentation may be moved into a new git repository, so community members have a place from which they can pick up and fork code. A first draft of a `legacylib `_ repository is available on my private Github account. It's my hope that some of the deprecated modules will be picked up and adopted by users that actually care about them. For example ``colorsys`` and ``imghdr`` are useful modules, but have limited feature set. A fork of ``imghdr`` can add new features and support for more image formats, without being constrained by Python's release cycle. Most of the modules are in pure Python and can be easily packaged. Some depend on a simple C module, e.g. `audioop`_ and `crypt`_. Since `audioop`_ does not depend on any external libraries, it can be shipped in as binary wheels with some effort. Other C modules can be replaced with ctypes or cffi. For example I created `legacycrypt `_ with ``_crypt`` extension reimplemented with a few lines of ctypes code. Discussions =========== * Elana Hashman and Nick Coghlan suggested to keep the *getopt* module. * Berker Peksag proposed to deprecate and removed *msilib*. * Brett Cannon recommended to delay active deprecation warnings and removal of modules like *imp* until Python 3.10. Version 3.8 will be released shortly before Python 2 reaches end of lifetime. A delay reduced churn for users that migrate from Python 2 to 3.8. * Brett also came up with the idea to keep lib2to3. The package is useful for other purposes, e.g. `black `_ uses it to reformat Python code. * At one point, distutils was mentioned in the same sentence as this PEP. To avoid lengthy discussion and delay of the PEP, I decided against dealing with distutils. Deprecation of the distutils package will be handled by another PEP. * Multiple people (Gregory P. Smith, David Beazley, Nick Coghlan, ...) convinced me to keep the `wave`_ module. [4]_ * Gregory P. Smith proposed to deprecate `nntplib`_. [4]_ * Andrew Svetlov mentioned the ``socketserver`` module is questionable. However it's used to implement ``http.server`` and ``xmlrpc.server``. The stdlib doesn't have a replacement for the servers, yet. Update history ============== Update 1 -------- * Deprecate `parser`_ module * Keep `fileinput`_ module * Elaborate why `crypt`_ and `spwd`_ are dangerous and bad * Improve sections for `cgitb`_, `colorsys`_, `nntplib`_, and `smtpd`_ modules * The `colorsys`_, `crypt`_, `imghdr`_, `sndhdr`_, and `spwd`_ sections now list suitable substitutions. * Mention that ``socketserver`` is going to stay for ``http.server`` and ``xmlrpc.server`` References ========== .. [1] https://en.wikipedia.org/wiki/Open_Sound_System#Free,_proprietary,_free .. [2] https://man.openbsd.org/ossaudio .. [3] https://blogs.msmvps.com/installsite/blog/2015/05/03/the-future-of-windows-installer-msi-in-the-light-of-windows-10-and-the-universal-windows-platform/ .. [4] https://twitter.com/ChristianHeimes/status/1130257799475335169 .. [5] https://twitter.com/dabeaz/status/1130278844479545351 .. [6] https://mail.python.org/pipermail/python-dev/2019-May/157464.html .. [7] https://discuss.python.org/t/switch-pythons-parsing-tech-to-something-more-powerful-than-ll-1/379 Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: