PEP: 270
Title: uniq method for list objects
Version: $Revision$
Last-Modified: $Date$
Author: jp@demonseed.net (Jason Petrone)
Status: Rejected
Type: Standards Track
Created: 21-Aug-2001
Python-Version: 2.2
Post-History:


Notice

This PEP is withdrawn by the author.  He writes:

    Removing duplicate elements from a list is a common task, but
    there are only two reasons I can see for making it a built-in.
    The first is if it could be done much faster, which isn't the
    case.  The second is if it makes it significantly easier to
    write code.  The introduction of sets.py eliminates this
    situation since creating a sequence without duplicates is just
    a matter of choosing a different data structure: a set instead
    of a list.

As described in PEP 218, sets are being added to the standard
library for Python 2.3.


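To illustrate the point made in the notice, here is a minimal sketch.
It assumes a modern Python 3 interpreter, where set is a built-in
type; the module described in PEP 218 for Python 2.3 exposed a
sets.Set class instead.  The sample data is invented for
illustration.

```python
# Sample data invented for illustration.
words = ["spam", "eggs", "spam", "ham", "eggs"]

# A set holds each distinct (hashable) value once, so building one
# from the list discards the duplicates.  The original order is lost.
unique_words = set(words)

# Convert back to a list only if list behaviour is needed; sorting
# gives a predictable order for display.
unique_list = sorted(unique_words)
print(unique_list)  # ['eggs', 'ham', 'spam']
```

This only works when every element is hashable, which is the same
restriction the hash-table approach discussed under Considerations
runs into.

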
Abstract

This PEP proposes adding a method for removing duplicate elements to
the list object.


Rationale

Removing duplicates from a list is a common task.  I think it is
useful and general enough to be a method of list objects.  It
also has potential for faster execution when implemented in C,
especially when optimizations based on hashing or sorting cannot
be used.

On comp.lang.python there are many posts[1] asking about the
best way to do this task.  It is a little tricky to implement
optimally, and it would be nice to save people the trouble of
figuring it out themselves.


Considerations

Tim Peters suggests first trying a hash table, then trying to
sort, and finally falling back on brute force[2].  Should uniq
maintain list order at the expense of speed?

Is it spelled 'uniq' or 'unique'?


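The fallback strategy attributed to Tim Peters above can be sketched
roughly as follows.  This is an illustration written for this text
under modern Python 3 assumptions, not the cookbook recipe itself;
the function name uniq is borrowed from the PEP title, and the
sorting tier assumes the elements' ordering is consistent with
equality.

```python
def uniq(items):
    """Return a list of the distinct elements of items.

    The order of the result is unspecified for the first two tiers.
    """
    items = list(items)

    # Tier 1: hashing.  Works when every element is hashable.
    try:
        return list(set(items))
    except TypeError:
        pass

    # Tier 2: sorting.  Works when the elements can be ordered; equal
    # elements end up adjacent, so one pass drops the repeats.
    try:
        ordered = sorted(items)
    except TypeError:
        pass
    else:
        result = ordered[:1]
        for item in ordered[1:]:
            if item != result[-1]:
                result.append(item)
        return result

    # Tier 3: brute force.  Needs only equality comparison, at
    # quadratic cost, and happens to keep first-occurrence order.
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result
```

For example, a list of integers is handled by the hashing tier, a
list of lists by the sorting tier, and a list of dicts by brute
force.  Only the last tier keeps the original order, which is the
trade-off behind the question about list order above.

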
Reference Implementation

I've written the brute force version.  It's about 20 lines of code
in listobject.c.  Adding support for hash table and sort-based
duplicate removal would only take another hour or so.


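The C code itself is not included in this PEP.  As a rough Python
analogue only, a brute-force removal that works in place might look
like the sketch below.  The helper name uniq_in_place is
hypothetical, and whether the author's version worked in place or
preserved order is not stated; both are assumptions made here.

```python
def uniq_in_place(lst):
    """Remove later duplicates from lst in place, keeping first occurrences."""
    write = 0  # lst[:write] holds the distinct elements found so far
    for item in lst:
        # Linear scan of the kept prefix: only == is required, so
        # unhashable and unorderable elements are fine.
        if item not in lst[:write]:
            lst[write] = item
            write += 1
    del lst[write:]  # drop the leftover tail


data = [3, 1, 3, 2, 1, [4], [4]]
uniq_in_place(data)
print(data)  # [3, 1, 2, [4]]
```

Writing to lst[write] while iterating is safe here because write
never runs ahead of the position being read.

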
References

[1] http://groups.google.com/groups?as_q=duplicates&as_ugroup=comp.lang.python

[2] Tim Peters' unique() entry in the Python cookbook:
    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52560/index_txt


Copyright

This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: