One-pass algorithm

From HandWiki

In computing, a one-pass algorithm is a streaming algorithm which reads its input exactly once, in order, without unbounded buffering. A one-pass algorithm generally requires O(n) (see 'big O' notation) time and less than O(n) storage (typically O(1)), where n is the size of the input.

Basically one-pass algorithm operates as follows:

  1. The object descriptions are processed serially
  2. The first object becomes the cluster representative of the first cluster
  3. Each subsequent object is matched against all cluster representatives existing at its processing time
  4. A given object is assigned to one cluster (or more if overlap is allowed) according to some condition on the matching function
  5. When an object is assigned to a cluster the representative for that cluster is recomputed
  6. If an object fails a certain test it becomes the cluster representative of a new cluster nothing happened

Example problems solvable by one-pass algorithms

Given any list as an input:

  • Count the number of elements.

Given a list of numbers:

Given a list of symbols from an alphabet of k symbols, given in advance.

  • Count the number of times each symbol appears in the input.
  • Find the most or least frequent elements.
  • Sort the list according to some order on the symbols (possible since the number of symbols is limited).
  • Find the maximum gap between two appearances of a given symbol.

Example problems not solvable by one-pass algorithms

Given any list as an input:

  • Find the nth element from the end (or report that the list has fewer than n elements).
  • Find the middle element of the list.

Given a list of numbers:

  • Find the median.
  • Find the modes (This is not the same as finding the most frequent symbol from a limited alphabet).
  • Sort the list.