Notes on Tries

Author

Trent Hauck

Published

June 7, 2024

A trie is a tree-like structure for storing a set of strings. It’s also called a prefix tree. For example, given the strings apple, app, apricot, and banana, the trie would look like this:

Trie for apple, app, apricot, and banana root a a root->a b b root->b e1 e apricot_end banana_end apple_end p1 p p2 p p1->p2 r r p1->r p2->e1 l1 l p2->l1 a1 a n1 n a1->n1 a2 a n1->a2 n2 n a2->n2 a3 a n2->a3 a3->banana_end e2 e l1->e2 e2->apple_end a->p1 b->a1 i i r->i c c i->c o o c->o t t o->t t->apricot_end

A basic implementation consists of the TrieNode class and the Trie class. For former is a node with a dictionary of children nodes and an indicator of whether the node is the end of a word. The latter is a class with a single root node and methods for inserting and searching for words.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end_of_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def add_word(self, w: str):
        # Start at the root, will be updated as we go
        node = self.root

        # For each character in the word
        for c in w:
            if c not in node.children:
                # we haven't seen this character before, so add it
                node.children[c] = TrieNode()

            # move to the next node
            node = node.children[c]

        # mark the end of the word after the last character
        node.is_end_of_word = True

    def search(self, w: str) -> bool:
        # Start at the root, will be updated as we go
        node = self.root
        for c in w:
            # as soon as we can't find a character, the word isn't in the trie
            if c not in node.children:
                return False
            node = node.children[c]

        # if we've reached the end of the word, it's in the trie if
        # the node is marked as the end of a word
        return node.is_end_of_word

Complexity

Inserting, searching, and deleting words in a trie is \(O(m)\) where \(m\) is the length of the word. This is because we only need to traverse the length of the word to insert, search, or delete it, which is much faster than if we had to search through all the words individually.

The space complexity of the trie is \(O(n * \overline{w})\) where \(n\) is the number of words and \(\overline{w}\) is the average length of the words. This is because each node in the trie has a dictionary of children nodes, which can be up to the size of the alphabet. If the average length of the words is \(\overline{w}\), then the number of nodes in the trie is \(n * \overline{w}\).

Citation

BibTeX citation:
@online{hauck2024,
  author = {Hauck, Trent},
  title = {Notes on {Tries}},
  date = {2024-06-07},
  url = {https://www.trenthauck.com/tries.html},
  langid = {en}
}
For attribution, please cite this work as:
Hauck, Trent. 2024. “Notes on Tries.” June 7, 2024. https://www.trenthauck.com/tries.html.