前缀树Trie

力扣——208. 实现 Trie (前缀树)

题目

Trie（发音类似 "try"）或者说前缀树是一种树形数据结构，用于高效地存储和检索字符串数据集中的键。这一数据结构有相当多的应用情景，例如自动补完和拼写检查。

样例如下：

输入
["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
输出
[null, null, true, false, true, null, true]

解释
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple");   // 返回 True
trie.search("app");     // 返回 False
trie.startsWith("app"); // 返回 True
trie.insert("app");
trie.search("app");     // 返回 True

算法思路

简单来说，前缀树就是一个26-叉树。

每个节点是一个结构体：

class Trie{
    vector<Trie*> children;  //存储所有可能的下一个节点 大小为26的数组
    bool isEnd;	//用于判断是否为单词的结尾
}

忽略了空节点之后可以简化为如下形式：

代码实现

class Trie {
private:
    vector<Trie*> child;
    bool isEnd = false;
public:

    /** Initialize your data structure here. */
    Trie():child(26),isEnd(0) {}
    
    /** Inserts a word into the trie. */
    void insert(string word) {
        Trie* node = this;
        for(char c : word){
            int pos = c - 'a';
            if(node->child[pos] == nullptr){
                node->child[pos] = new Trie();
            }
            node = node->child[pos];
        }
        node->isEnd = 1;
    }
    
    /** Returns if the word is in the trie. */
    bool search(string word) {
        Trie* node = this;
        for(char c : word){
            int pos = c - 'a';
            if(node->child[pos] == nullptr){
                node = node->child[pos];
                break;
            }
            node = node->child[pos];
        }
        if(node != nullptr && node->isEnd == 1)return 1;
        return 0;
    }
    
    /** Returns if there is any word in the trie that starts with the given prefix. */
    bool startsWith(string prefix) {
        Trie* node = this;
        for(char c : prefix){
            int pos = c - 'a';
            if(node->child[pos] == nullptr){
                node = node->child[pos];
                break;
            }
            node = node->child[pos];
        }
        if(node != nullptr)return 1;
        return 0;
    }
};

但是，我们发现函数的重复率非常高，可以写一个搜索前缀的函数，修改为如下形式：

class Trie {
private:
    vector<Trie*> child;
    bool isEnd = false;
    
    Trie* search_prefix(string word){
        Trie* node = this;
        for(char c : word){
            int pos = c - 'a';
            if(node->child[pos] == nullptr){
                return node->child[pos];
            }
            node = node->child[pos];
        }
        return node;
    }
public:

    /** Initialize your data structure here. */
    Trie():child(26),isEnd(0) {}
    
    /** Inserts a word into the trie. */
    void insert(string word) {
        Trie* node = this;
        for(char c : word){
            int pos = c - 'a';
            if(node->child[pos] == nullptr){
                node->child[pos] = new Trie();
            }
            node = node->child[pos];
        }
        node->isEnd = 1;
    }
    
    /** Returns if the word is in the trie. */
    bool search(string word) {
        Trie* node = search_prefix(word);
        if(node != nullptr && node->isEnd == 1)return 1;
        return 0;
    }
    
    /** Returns if there is any word in the trie that starts with the given prefix. */
    bool startsWith(string prefix) {
        Trie* node = search_prefix(prefix);
        if(node != nullptr)return 1;
        return 0;
    }
};

拓展一下

压缩前缀树

由于一个字母占用一个节点（含有大小为26的数组），非常浪费空间。压缩前缀树是前缀树的一种优化方案。

应用场景

当然在本题中前缀树起到的作用仅仅是判断是否收录了某个单词。在实际应用中可以通过修改isEnd为一个值，形成健与值的映射，增加应用范围。

这种时候，我们很自然地就会将前缀树与哈希表进行对比。

哈希表

当哈希表hash函数选取的好，计算量少且冲突少时，效率可达\(O(1)\)。

哈希表不能支持动态查询功能（即写前缀，显示后缀）
前缀树

前缀树的时间复杂度为\(O(L)\)，其中\(L\)为字符串长度。处理大量字符串的时候，效率比哈希表高。但是额外消耗的空间较大。经典：以空间换时间。

字符串的快速检索
字符串排序
最长公共前缀
自动匹配前缀显示后缀

如我们在搜索引擎和字典中输入前面的东西，会自动显示后面的内容。（只需把后缀遍历显示出来）

参考资料

https://zhuanlan.zhihu.com/p/46670859

https://cloud.tencent.com/developer/article/1678886

https://leetcode-cn.com/problems/implement-trie-prefix-tree/

#数据结构与算法

前缀树Trie

https://wuhlan3.gitee.io/2021/04/14/前缀树Trie/

Author

Wuhlan3

Posted on

April 14, 2021

Licensed under

实验4：具有中断处理的内核 Previous

《引力和宇宙学：广义相对论的原理和应用》读书笔记一 Next