From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Newsgroups: comp.compilers
Date: 15 Jun 2006 15:01:45 -0400
Organization: Compilers Central
References: 06-05-083 06-06-004 06-06-035 06-06-037
Keywords: design
Posted-Date: 15 Jun 2006 15:01:44 EDT
Jonathan Thornburg wrote:
(snip)
> Consider the following basic-computer-science-algorithms exercise:
> Given two lists (1-D arrays) of numbers, suggest a
> "reasonably efficient" algorithm to find all numbers
> which are common to both lists.
(snip of hash table and sort solutions)
The last time I had to do this, I needed to find the non-duplicates
between two lists, each of which contained a fair number of internal
duplicates.
Running the Unix sort -u command on each list sorts it while removing
duplicates. (As an external sort, it could remove duplicates within
each sort group before writing it to disk; I don't know whether it
actually does that, though.)
With both lists sorted and their internal duplicates removed, sort -m
(merge) followed by uniq (with the appropriate option, -d or -u) can
report either the duplicated or the non-duplicated values.
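A minimal sketch of that pipeline (the file names and sample data are
assumptions for illustration):

```shell
# Two example lists, each with internal duplicates (made-up data).
printf '3\n1\n2\n3\n' > a.txt
printf '2\n4\n2\n5\n' > b.txt

# Sort each list, removing its internal duplicates.
sort -u a.txt > a.sorted
sort -u b.txt > b.sorted

# Merge the two sorted lists. After sort -u, a value can appear at
# most once per file, so uniq -d (print duplicated lines) yields the
# values common to both lists:
sort -m a.sorted b.sorted | uniq -d
# prints: 2

# uniq -u (print non-repeated lines) instead yields the values that
# appear in only one of the two lists.
sort -m a.sorted b.sorted | uniq -u
# prints: 1 3 4 5 (one per line)
```

Note that sort's default ordering is lexicographic; numeric data with
varying widths would need sort -n on each step.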
For very large lists, the sort can be done as an external sort,
whereas hash tables usually cannot spill to disk.
-- glen