PEP: 270
Title: uniq method for list objects
Author: Jason Petrone <jp@demonseed.net>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 21-Aug-2001
Python-Version: 2.2
Post-History:


Notice
======

This PEP is withdrawn by the author. He writes:
    Removing duplicate elements from a list is a common task, but
    there are only two reasons I can see for making it a built-in.
    The first is if it could be done much faster, which isn't the
    case. The second is if it makes it significantly easier to
    write code. The introduction of ``sets.py`` eliminates this
    situation since creating a sequence without duplicates is just
    a matter of choosing a different data structure: a set instead
    of a list.

As described in :pep:`218`, sets are being added to the standard
library for Python 2.3.
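
In today's Python the built-in ``set`` type (which subsumed
``sets.py``) makes this a one-liner; a minimal sketch, with the
``dict.fromkeys`` variant noted only because it keeps first-seen
order on Python 3.7 and later::

    items = [3, 1, 3, 2, 1]

    list(set(items))            # duplicates removed, order not guaranteed
    list(dict.fromkeys(items))  # [3, 1, 2] -- keeps first-seen order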

Abstract
========

This PEP proposes adding a method for removing duplicate elements to
the list object.

Rationale
=========

Removing duplicates from a list is a common task. I think it is
useful and general enough to belong as a method in list objects.
It also has potential for faster execution when implemented in C,
especially if optimizations using hashing or sorting cannot be used.

On comp.lang.python there are many, many posts [1]_ asking about
the best way to do this task. It's a little tricky to implement
optimally, and it would be nice to save people the trouble of
figuring it out themselves.

Considerations
==============

Tim Peters suggests trying to use a hash table, then trying to
sort, and finally falling back on brute force [2]_. Should uniq
maintain list order at the expense of speed?

Is it spelled 'uniq' or 'unique'?
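
A sketch of that tiered strategy, loosely following the cookbook
recipe cited below (the function name and the order-preserving
behaviour of the non-sorting paths are illustrative, not part of
this proposal)::

    def unique(s):
        """Return a list of the elements of s without duplicates."""
        # 1. Hash table: O(n), keeps first-seen order, but requires
        #    hashable elements.
        try:
            seen = set()
            result = []
            for x in s:
                if x not in seen:      # raises TypeError if unhashable
                    seen.add(x)
                    result.append(x)
            return result
        except TypeError:
            pass

        # 2. Sorting: O(n log n); equal elements become adjacent, but
        #    the original order is lost.
        try:
            t = sorted(s)
        except TypeError:
            pass                       # elements cannot be ordered
        else:
            result = []
            for x in t:
                if not result or x != result[-1]:
                    result.append(x)
            return result

        # 3. Brute force: O(n**2), needs only equality comparison.
        result = []
        for x in s:
            if x not in result:
                result.append(x)
        return result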

Reference Implementation
========================

I've written the brute force version. It's about 20 lines of code
in ``listobject.c``. Adding support for hash table and sort-based
duplicate removal would only take another hour or so.
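
In Python terms, the brute-force version behaves like the following
sketch (hypothetical: the actual patch is C code in ``listobject.c``,
and whether ``uniq`` should work in place like ``list.sort()`` is an
open question)::

    def uniq(lst):
        # Quadratic scan: keep each element whose equal has not
        # already been kept, then update the list in place.
        result = []
        for x in lst:
            if x not in result:
                result.append(x)
        lst[:] = result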

References
==========

.. [1] https://groups.google.com/forum/#!searchin/comp.lang.python/duplicates

.. [2] Tim Peters' unique() entry in the Python Cookbook:
       http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52560/index_txt

Copyright
=========

This document has been placed in the public domain.