User:Patrick O'Jackson/sandbox

In computer science, a ternary search tree is a special trie data structure where the child nodes of a standard trie are ordered as a binary search tree. Ternary search trees are effective and for many applications involving mapping strings to other values as well as completing near-neighbor lookups and other string-related queries. In addition, Ternary search trees are a more space efficient (at the cost of speed) equivalent of typical tries

Description edit

A ternary search tree is similar to other tries [1] . Each node of a ternary search tree stores a single character, an object (or a pointer to an object depending on implementation), and pointers to its three children conventially named "equal kid" "lo kid" and "hi kid" [1][2]. A node may also have a pointer to its parent node as well as an indicator as to whether or not the node marks the end of a word[1]. The lo kid pointer must point to a node whose character value is less than the current node. Conversely, the hi kid pointer must point to a node whose character is greater than the current node.[2] The figure below shows a ternary search tree with the strings "as", "at", "cup", "cute", "he", "i" and "us":

          c
        / | \
       a  u  h
       |  |  | \
       t  t  e  u
     /  / |   / |
    s  p  e  i  s

As with other trie data structures, each node in a ternary search tree represents a prefix of the stored strings. All strings in the middle subtree of a node start with that prefix.

Comparison to Other Data Structures edit

Tries edit

When compared to tries, ternary search trees run slower. However, tries are often unacceptable solutions for large sets of data because of their large memory footprint [2]. For this reason, ternary search trees can be used as a fast, but more space-efficient alternative. [2]

Hash Maps edit

Hashtables can also be used in place of ternary search trees for mapping strings to values. However, hash maps also frequently use more memory that ternary search trees (but not as much as tries). Additionally, hash maps are typically slower at reporting a string that is not in the sata structure because it must compare the entire string rather than just the first few characters. here is some evidence that shows ternary search trees running faster than hash maps. [2] Additionally, hash maps do not allow for many of the uses of ternary search trees such as near-neighbor lookups.

Ternary Search Tree operations edit

Look up edit

To look up a particular node or the data associated with a node, a string key is needed. A lookup procedure begins by checking the root node of the tree. If the first character of the string is less than the character in the root node, a recursive lookup can be called on the tree whose root is the lo kid of the current root. Similarly, if the first character is greater than the current node in the tree, then a recursive call can be made to the tree whose root is the hi kid of the current node. [2] As a final case, if the first character of the string is equal to the character of the current node then the function returns the node if there are more characters in the key. If there are more characters in the key then the first character of the key must be removed and a recursive call is made given the equal kid node and the modified key. [2] This can also be written in a non-recursive way by making a pointer to the current node and a pointer to the current character of the key. [2]

Insertion edit

Inserting a value into a ternary search can be defined recursively much as lookups are defined. This recursive method is continually called on nodes of the tree given a key get gets progressively shorter by pruning characters off the front of the string. If this method reaches a node that has not been created, it creates the node and assigns it the character value of the first character in the key. Whether a new node is created or not, the method checks to see if the first character in the string is greater than or less than the character value in the node and makes a recursive call on the appropriate node as in the lookup operation. If, however, the key's first character is equal to the node's value then the insertion procedure is called on the equal kid and the key's first character s pruned away. [2] Like binary searh trees and other data structures, ternary search trees can become degenerate depending on the order of the keys [3]. One order of key insertions is to insert them in order. This is one way to produce the worst-case ternary search tree [2]. Inserting the keys in random order often produces a well-balanced tree[2].

Uses edit

Ternary search trees can be used to solve many problems in which a large number of strings must be stored and retrieved in an arbitrary order. Some of the most common or most useful of these are below:

See also edit

Hashtable

References edit

  1. ^ a b c d Ostrovsky, Igor. "Efficient auto-complete with a ternary search tree".
  2. ^ a b c d e f g h i j k l m Dobbs. "Ternary Search Trees".
  3. ^ a b Wrobel, Lukasz. "Ternary Search Tree".
  4. ^ a b c Flint, Wally. "Plant your data in a ternary search tree".

External links edit

Category:Trees (data structures) Category:Search algorithms