82 lines
2.2 KiB
ReStructuredText
82 lines
2.2 KiB
ReStructuredText
PEP: 270
|
|
Title: uniq method for list objects
|
|
Author: Jason Petrone <jp@demonseed.net>
|
|
Status: Rejected
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 21-Aug-2001
|
|
Python-Version: 2.2
|
|
Post-History:
|
|
|
|
|
|
Notice
|
|
======
|
|
|
|
This PEP is withdrawn by the author. He writes:
|
|
|
|
Removing duplicate elements from a list is a common task, but
|
|
there are only two reasons I can see for making it a built-in.
|
|
The first is if it could be done much faster, which isn't the
|
|
case. The second is if it makes it significantly easier to
|
|
write code. The introduction of ``sets.py`` eliminates this
|
|
situation since creating a sequence without duplicates is just
|
|
a matter of choosing a different data structure: a set instead
|
|
of a list.
|
|
|
|
As described in :pep:`218`, sets are being added to the standard
|
|
library for Python 2.3.
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
This PEP proposes adding a method for removing duplicate elements to
|
|
the list object.
|
|
|
|
|
|
Rationale
|
|
=========
|
|
|
|
Removing duplicates from a list is a common task. I think it is
|
|
useful and general enough to belong as a method in list objects.
|
|
It also has potential for faster execution when implemented in C,
|
|
especially if optimization using hashing or sorted cannot be used.
|
|
|
|
On comp.lang.python there are many, many, posts [1]_ asking about
|
|
the best way to do this task. It's a little tricky to implement
|
|
optimally and it would be nice to save people the trouble of
|
|
figuring it out themselves.
|
|
|
|
|
|
Considerations
|
|
==============
|
|
|
|
Tim Peters suggests trying to use a hash table, then trying to
|
|
sort, and finally falling back on brute force [2]_. Should uniq
|
|
maintain list order at the expense of speed?
|
|
|
|
Is it spelled 'uniq' or 'unique'?
|
|
|
|
|
|
Reference Implementation
|
|
========================
|
|
|
|
I've written the brute force version. It's about 20 lines of code
|
|
in ``listobject.c``. Adding support for hash table and sorted
|
|
duplicate removal would only take another hour or so.
|
|
|
|
|
|
References
|
|
==========
|
|
|
|
.. [1] https://groups.google.com/forum/#!searchin/comp.lang.python/duplicates
|
|
|
|
.. [2] Tim Peters unique() entry in the Python cookbook:
|
|
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52560/index_txt
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document has been placed in the public domain.
|