算法复习 (二) 查找-二叉查找树

发表于 2019-04-29 更新于 2024-01-02 分类于技术文档

本文字数： 12k 阅读时长 ≈ 10 分钟

算法复习 (二) 查找-二叉查找树

二叉查找树-BST

一.什么是二叉查找树

二叉查找树，结合了链表的插入灵活性和有序数组查找的高效性。

结点： 有一个父结点，两个左右子结点，一个键值对，键之间有顺序之分方便高效查找（每个结点的键都大于左子树的键，小于右子树的键）。

详解二叉树

private Node root; //二叉查找树的根节点
private class Node{
    private Key key;
    private Value value;
    private Node left,right;
    private int N; //以该结点为根的子树的结点总数

    public Node(Key key,Value value,int N){
        this.key = key;
        this.value = value;
        this.N = N;
    }
}

特点：

键的有序性：左子树小于根节点，右子树大于根节点。
高效的查找、插入和删除：查找时根据节点值和目标值大小关系缩小搜索范围；插入和删除时可以快速找到位置进行操作。
中序遍历有序：按照左根右的顺序得到从小到大排列的节点序列。

实现代码： 二叉查找树-BST

1.1 结点计数器

变量N给出了以该结点为根的子树的结点总数：

1	size(x) = size(x.left) + size(x.right) + 1

1.2 不同的二叉树表示

一棵二叉查找树代表了一组键（及其相应的值）的集合，而同一个集合可以用多棵不同的二叉查找树表示（如图 3.2.3 所示）。如果我们将一棵二叉查找树的所有键投影到一条直线上，保证一个结点的左子树中的键出现在它的左边，右子树中的键出现在它的右边，那么我们一定可以得到一条有序的键列。我们会利用二叉查找树的这种天生的灵活性，用多棵二叉查找树表示同一组有序的键来实现构建和使用二叉查找树的高效算法。

两棵能够表示同一组键的二叉查找树

二.查找递归算法

递归的在适当的子树中查找，若被查找的键较小就查左子树，若较大则查右子树，若遍历结束仍未查到则未命中。

二叉查找树中的查找命中（左）和未命中（右）

2.1 查找和排序实现

public int size(){ return size(root); }

private int size(Node node){
    if(node == null) return 0;
    else return node.N;
}

public Value get(Key key){
    return get(root,key);
}

private Value get(Node x, Key key){
    // 在以x为根节点的子树中查找并返回Key所对应的值
    // 若找不到则返回null
    if(x == null) return null;
    int cmp = key.compareTo(x.key);
    if(cmp < 0) return get(x.left, key);
    else if(cmp > 0) return get(x.right, key);
    else return x.value;
}

public void put(Key key, Value value){
    // 查找Key，找到则更新值，否则为它创建一个新的结点
    root = put(root, key, value);
}

private Node put(Node x, Key key, Value value){
    // 如果Key存在于以x为根节点的子树中则更新它的值
    // 否则将以key和val为键值对的新结点插入到该子树中
    if(x == null) return new Node(key, value, 1);
    int cmp = key.compareTo(x.key);
    if(cmp < 0) x.left = put(x.left, key, value);
    else if(cmp > 0) x.right = put(x.right, key, value);
    else x.value = value;
    x.N = size(x.left) + size(x.right) + 1;
    return x;
}

2.2 插入

上述算法中的查找代码几乎和二分查找的一样简单，这种简洁性是二叉查找树的重要特性之一。而二叉查找树的另一个更重要的特性就是插入的实现难度和查找差不多。当查找一个不存在于树中的结点并结束于一条空链接时，我们需要做的就是将链接指向一个含有被查找的键的新结点（详见图 3.2.5）。

上述算法中递归的 put() 方法的实现逻辑和递归查找很相似：如果树是空的，就返回一个含有该键值对的新结点；如果被查找的键小于根结点的键，我们会继续在左子树中插入该键，否则在右子树中插入该键。

二叉查找树的插入操作

2.3 递归

可以将递归调用前的代码想象成沿着树向下走：它会将给定的键和每个结点的键相比较并根据结果向左或者向右移动到下一个结点。然后可以将递归调用后的代码想象成沿着树向上爬。

在一棵简单的二叉查找树中，唯一的新链接就是在最底层指向新结点的链接，重置更上层的链接可以通过比较语句来避免。同样，我们只需要将路径上每个结点中的计数器的值加 1，但我们使用了更加通用的代码，使之等于结点的所有子结点的计数器之和加1。基本的二叉查找树的实现常常是非递归的——我们在实现中使用了递归，一来是为了便于读者理解代码的工作方式，二来也是为学习更加复杂的算法做准备。

图 3.2.6 是对我们的标准索引用例轨迹的一份详细的研究，它向你展示了二叉树是如何生长的。新结点会连接到树底层的空链接上，树的其他部分则不会改变。例如，第一个被插入的键就是根结点，第二个被插入的键是根结点的两个子结点之一，以此类推。因为每个结点都含有两个链接，树会逐渐长大而不是萎缩。不仅如此，因为只有查找或者插入路径上的结点才会被访问，所以随着树的增长，被访问的结点数量占树的总结点数的比例也会不断的降低。

使用二叉查找树的标准索引用例的轨迹

2.4 分析

使用二叉查找树算法的运行时间取决于树的形状，树的形状又取决于键被插入的先后顺序。

最好情况，含有N个结点的二叉树是完全平衡的，每条空链接和根结点的距离都是lgN。
最坏情况，搜索路径上有N个结点，一直在树的一边。

二叉查找树和快速排序很像，树的根结点就是快排的第一个切分元素，左边都比它小，右边都比它大，对于所有子树都适用。

在由N个随机数构成的二叉查找树中，查找命中平均所需的比较次数为2lnN（约1.39lgN），插入操作和查找未命中平均所需的比较次数为2lnN（约1.39lgN）。

二叉查找树的可能形状

三.有序性相关的方法和删除操作

3.1 最大键和最小键

若根结点左链接为空，则根结点是最小键，否则就是左子树的最小键，即递归实现。最大键就是右子树。

public Key min(){
    return min(root).key;
}

private Node min(Node x){
    if(x.left == null) return x;
    return min(x.left);
}

public Key max(){
    return max(root).key;
}

private Node max(Node x){
    if(x.right == null) return x;
    return max(x.right);
}

3.2 向上取整和向下取整

若给定键小于根结点的键，则floor一定在左子树中；若给定键大于根结点的键，只有当根结点右子树中存在小于等于Key的结点时，小于等于Key的最大键才出现于右子树，否则就是根结点。

public Key floor(Key key){
    //向下取整
    Node x = floor(root,key);
    if(x == null) return null;
    return x.key;
}

private Node floor(Node x,Key key){
    //向下取整
    if(x == null) return null;
    int cmp = key.compareTo(x.key);
    if(cmp == 0) return x;
    else if (cmp < 0) return floor(x.left,key);
    Node t = floor(x.right,key);
    if(t != null) return t;
    else return null;
}

public Key ceiling(Key key){
    //向上取整
    Node x = ceiling(root,key);
    if(x == null) return null;
    return x.key;
}

private Node ceiling(Node x,Key key){
    //向上取整
    if(x == null) return null;
    int cmp = key.compareTo(x.key);
    if(cmp == 0) return x;
    else if (cmp > 0) return ceiling(x.right,key);
    Node t = ceiling(x.left,key);
    if(t != null) return t;
    else return null;
}

3.3 选择操作

假设需要寻找排名为k的键，若左子树中的结点数t大于k，继续在左子树中查找，若等于则返回根结点的键，若大于则在右子树中查找排名为k-t-1的键。

public Key select(int k){
    return select(root,k).key;
}

private Node select(Node x,int k){
    //返回排名为k的结点
    if(x == null) return null;
    int t = size(x.left);
    if(t > k) return select(x.left,k);
    else if(t > k) return select(x.right,k-t-1);
    else return x;
}

3.4 排名

rank() 是 select() 的逆方法，它会返回给定键的排名。实现方法和 select() 类似，给定键和根结点键相等返回左子树结点总数t，若给定键小于根结点则返回其在左子树中的排名，大于根结点则返回t+1+右子树中的排名

public int rank(Key key){
    return rank(key,root);
}

private int rank(Key key,Node x){
    //返回以x为根结点的子树中小于x.key的键的数量
    if(x == null) return 0;
    int cmp = key.compareTo(x.key);
    if(cmp < 0) return rank(key,x.left);
    else if(cmp > 0) return 1 + size(x.left) + rank(key,x.right);
    else return size(x.left);
}

3.5 删除最大键和最小键

删除操作是二叉树最难实现的部分，删除最小键需要不断深入根结点的左子树直到遇到一个空链接，然后将指向当前结点的链接指向该结点的右子树（只需要在递归调用中返回它的右链接即可）。

public void deleteMin(){
    root = deleteMin(root);
}

private Node deleteMin(Node x){
    if(x.left == null) return x.right;
    x.left = deleteMin(x.left);
    x.N = size(x.left) + size(x.right) + 1;//?
    return x;
}

public void deleteMax(){
    root = deleteMax(root);
}

private Node deleteMax(Node x){
    if(x.right == null) return x.left;
    x.right = deleteMax(x.right);
    x.N = size(x.left) + size(x.right) + 1;//?
    return x;
}

3.6 删除操作

删除一个没有或只有一个子结点的结点是上个所做操作，但删除操作通常是对有两个子结点的结点进行删除，如何处理两颗子树是一个问题。第一个方法：删除某结点x，将其后继结点填补它的位置，因为x有右子结点，所以后继结点就是右子树的最小结点，如此替换能保证树的有序性，因为x结点和其后继结点没有键关联，所以需要通过以下步骤完成替换流程：

将指向将被删除结点的链接保存为t。

将x指向它的后继结点min(t.right)。

将x的右链接（原本指向一棵所有结点都大于x.key的二叉查找树）指向deleteMin(t.right)，即删除后所有结点仍大于x.key。

将x的左链接设为t.left。

更新链接和结点计数器。

public void delete(Key key){
    root = delete(root, key);
}

private Node delete(Node x,Key key){
    if(x == null) return null;
    int cmp = key.compareTo(x.key);
    if(cmp < 0) x.left = delete(x.left,key);
    else if(cmp > 0) x.right = delete(x.right,key);
    else{
        if(x.right == null) return x.left;
        if(x.left == null) return x.right;
        Node t = x;
        x = min(t.right);//?
        x.right = deleteMin(t.right);
        x.left = t.left;
    }
    x.N = size(x.left) + size(x.right) + 1;//?
    return x;
}

3.7 范围查找

要返回给定范围内键的方法，需要参考中序遍历的方法，将所有落在给定范围内的键加入队列Queue中，跳过不可能含有所查找键的子树。

1
2
3

先序遍历：根->左->右
中序遍历：左->根->右
后序遍历：左->右->根

public Iterable<Key> keys(){
    //二叉查找树范围查找
    return keys(min(),max());
}

public Iterable<Key> keys(Key lo,Key hi){
    MyQueue<Key> queue = new MyQueue<Key>();
    keys(root,queue,lo,hi);
    return queue;
}

private void keys(Node x,MyQueue<Key> queue,Key lo,Key hi){
    if(x == null) return;
    int cmplo = lo.compareTo(x.key);
    int cmphi = hi.compareTo(x.key);
    if(cmplo < 0) keys(x.left,queue,lo,hi);
    if(cmplo <= 0 && cmphi >= 0) queue.enqueue(x.key);
    if(cmphi > 0) keys(x.right,queue,lo,hi);
}

3.8 性能分析

算法(数据结构)	最坏情况的成本 (N次插入后)		平均情况的成本 (N次随机插入后)		是否高效的支持有序性的相关操作
算法(数据结构)	查找	插入	查找	插入	是否高效的支持有序性的相关操作
顺序查找(无序链表)	N	N	N/2	N	否
二分查找(有序数组)	lgN	2N	lgN	N	是
二叉树查找(二叉查找树)	N	N	1.39lgN	1.39lgN	是

四.问答

二叉查找树的递归实现和非递归实现有什么区别？

递归实现更容易验证正确性，非递归实现效率更高

4.1 二叉查找树的非递归实现

public class BinarySearchTree<Key extends Comparable<Key>, Value> {
    private Node root;

    private class Node {
        private Key key;
        private Value value;
        private Node left, right;
        private int N; // 以该结点为根的子树的结点总数

        public Node(Key key, Value value, int N) {
            this.key = key;
            this.value = value;
            this.N = N;
        }
    }

    // 获取树的大小
    public int size() {
        return size(root);
    }

    private int size(Node x) {
        if (x == null) return 0;
        return x.N;
    }

    // 获取给定键的值
    public Value get(Key key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else return x.value;
        }
        return null;
    }

    // 向树中插入键值对
    public void put(Key key, Value value) {
        if (root == null) {
            root = new Node(key, value, 1);
            return;
        }

        Node x = root;
        Node parent = null;
        while (x != null) {
            parent = x;
            int cmp = key.compareTo(x.key);
            if (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else {
                x.value = value; // 如果键已存在，则更新值
                return;
            }
        }

        int cmp = key.compareTo(parent.key);
        if (cmp < 0) parent.left = new Node(key, value, 1);
        else parent.right = new Node(key, value, 1);
    }

    // 获取最小键的值
    public Key min() {
        if (root == null) return null;

        Node x = root;
        while (x.left != null) {
            x = x.left;
        }
        return x.key;
    }

    // 获取最大键的值
    public Key max() {
        if (root == null) return null;

        Node x = root;
        while (x.right != null) {
            x = x.right;
        }
        return x.key;
    }

    // 获取小于等于给定键的最大键
    public Key floor(Key key) {
        Node x = floor(root, key);
        if (x == null) return null;
        return x.key;
    }

    private Node floor(Node x, Key key) {
        Node floor = null;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if (cmp == 0) return x;
            if (cmp < 0) {
                x = x.left;
            } else {
                floor = x;
                x = x.right;
            }
        }
        return floor;
    }

    // 获取大于等于给定键的最小键
    public Key ceiling(Key key) {
        Node x = ceiling(root, key);
        if (x == null) return null;
        return x.key;
    }

    private Node ceiling(Node x, Key key) {
        Node ceiling = null;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if (cmp == 0) return x;
            if (cmp > 0) {
                x = x.right;
            } else {
                ceiling = x;
                x = x.left;
            }
        }
        return ceiling;
    }

    // 返回树中小于给定键的键的数量
    public int rank(Key key) {
        return rank(key, root);
    }

    private int rank(Key key, Node x) {
        if (x == null) return 0;

        int cmp = key.compareTo(x.key);
        if (cmp < 0) return rank(key, x.left);
        else if (cmp > 0) return 1 + size(x.left) + rank(key, x.right);
        else return size(x.left);
    }

    // 返回排名为 k 的键
    public Key select(int k) {
        Node x = select(root, k);
        if (x == null) return null;
        return x.key;
    }

    private Node select(Node x, int k) {
        if (x == null) return null;

        int t = size(x.left);
        if (t > k) return select(x.left, k);
        else if (t < k) return select(x.right, k - t - 1);
        else return x;
    }
}

4.2 几种遍历

假设有均衡二叉树1，2 3，4 5 6 7

前序遍历打印顺序：1 2 4 5 3 6 7
中序遍历打印顺序：4 2 5 1 6 3 7
后序遍历打印顺序：4 5 2 6 7 3 1

1. 前序遍历（Preorder Traversal）

根节点 -> 左子树 -> 右子树

// 递归实现
public void preorderTraversal(TreeNode root) {
    if (root == null) return;
    
    System.out.print(root.val + " "); // 处理根节点
    preorderTraversal(root.left); // 递归处理左子树
    preorderTraversal(root.right); // 递归处理右子树
}

// 迭代（栈）实现
public void preorderTraversal(TreeNode root) {
    if (root == null) return;

    Stack<TreeNode> stack = new Stack<>();
    stack.push(root);

    while (!stack.isEmpty()) {
        TreeNode node = stack.pop();
        System.out.print(node.val + " "); // 处理当前节点
        if (node.right != null) stack.push(node.right); // 先压入右子树
        if (node.left != null) stack.push(node.left); // 再压入左子树
    }
}

2.中序遍历（Inorder Traversal）

左子树 -> 根节点 -> 右子树

// 递归实现
public void inorderTraversal(TreeNode root) {
    if (root == null) return;

    inorderTraversal(root.left); // 递归处理左子树
    System.out.print(root.val + " "); // 处理根节点
    inorderTraversal(root.right); // 递归处理右子树
}

// 迭代（栈）实现
public void inorderTraversal(TreeNode root) {
    if (root == null) return;

    Stack<TreeNode> stack = new Stack<>();
    TreeNode curr = root;

    while (curr != null || !stack.isEmpty()) {
        while (curr != null) {
            stack.push(curr);
            curr = curr.left;
        }

        curr = stack.pop();
        System.out.print(curr.val + " "); // 处理当前节点
        curr = curr.right; // 处理右子树
    }
}

3.后序遍历（Postorder Traversal）

左子树 -> 右子树 -> 根节点

// 递归实现
public void postorderTraversal(TreeNode root) {
    if (root == null) return;

    postorderTraversal(root.left); // 递归处理左子树
    postorderTraversal(root.right); // 递归处理右子树
    System.out.print(root.val + " "); // 处理根节点
}


// 迭代（栈）实现
public void postorderTraversal(TreeNode root) {
    if (root == null) return;

    Stack<TreeNode> stack1 = new Stack<>();
    Stack<TreeNode> stack2 = new Stack<>();
    stack1.push(root);

    while (!stack1.isEmpty()) {
        TreeNode node = stack1.pop();
        stack2.push(node);

        if (node.left != null) stack1.push(node.left);
        if (node.right != null) stack1.push(node.right);
    }

    while (!stack2.isEmpty()) {
        System.out.print(stack2.pop().val + " ");
    }
}

参考：

🔗《算法第4版》