CopyOnWriteArrayList的实现原理

发表于 2020-06-05 更新于 2020-10-02 分类于技术文档

本文字数： 8.2k 阅读时长 ≈ 7 分钟

简单整理了一下CopyOnWriteArrayList的相关内容，主要内容包括：概述（为什么需要CopyOnWriteArrayList、什么是CopyOnWrite、写时复制和读写锁的区别），实现原理（基本结构、get、add、set、remove）等。

CopyOnWriteArrayList

第一节概述

1.1 为什么需要CopyOnWriteArrayList？

多读场景需要高效的并发数组。

常用的 ArrayList 不是一个线程安全的集合，基于快速失败机制，多线程操作会抛出 ConcurrentModificationException ，而线程安全的 Vector 又因为低效的设计而被弃用，又或者 Collections.synchronizedList() 来获取一个包装的线程安全类，与前者一样使用 synchronized 来保证的线程安全。

synchronized 这种独占锁在同一时刻只能有一个线程拥有对象监视器，对于大部分读多写少的业务场景并不适用。针对这种场景而设计的读写锁 ReentrantReadWriteLock 是一个合适的选择。

如果只是在 List 上封装一层读写锁，读数据仍会被其他操作阻塞，所以需要并发包提供的 CopyOnWriteArrayList 。

1.2 什么是CopyOnWrite？

CopyOnWrite（COW）即写时复制，通过延迟更新的策略来实现数据一致性，并且能保证线程间不阻塞。

通俗的讲就是在向容器添加元素时，并非直接向容器内添加，而是先将当前容器进行Copy，复制出一个新容器，然后向新容器添加元素，完成后再将指向旧容器的引用指向新容器。

这一过程的优势就是在进行并发的读操作时不用加锁，因为旧容器并不改变，所以写时复制也是一种读写分离的思想，通过放弃数据的实时性来达到数据一致性，获得的是提高并发性。

1.3 写时复制和读写锁的区别？

相同点：两者都是通过读写分离思想实现，都实现了读线程间互不阻塞。
不同点：使用读写锁依然会存在线程阻塞等待的情况，而COW则牺牲了数据实时性，保证读线程不会存在等待。

读写锁中，读线程为了实现数据实时性，在写锁被获取时，读线程会等待。而写时复制中虽然写线程最终的修改结果可以被读线程感知，但有一定延迟。

第二节实现原理

2.1 基本结构

根据其接口实现 RandomAccess 就可得知，其底层为数组。

public class CopyOnWriteArrayList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable {
    private static final long serialVersionUID = 8673264195747942595L;

    /** The lock protecting all mutators */
    final transient ReentrantLock lock = new ReentrantLock();

    /** The array, accessed only via getArray/setArray. */
    private transient volatile Object[] array;
 
    ...
}

通过可重入锁 ReentrantLock 来保证线程安全，通过 volatile 修饰底层数组来保证可见性。

2.2 get

get() 方法源码如下，读操作不需要考虑并发安全，直接通过数组索引获取元素。

/**
 * {@inheritDoc}
 *
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E get(int index) {
    return get(getArray(), index);
}

// Positional Access Operations
@SuppressWarnings("unchecked")
private E get(Object[] a, int index) {
    return (E) a[index];
}

/**
 * Gets the array.  Non-private so as to also be accessible
 * from CopyOnWriteArraySet class.
 */
final Object[] getArray() {
    return array;
}

2.3 add

add() 方法源码如下，写线程需要阻塞。

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    // 1.首先获取Lock，保证写操作线程安全
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        // 2.获取旧数组
        Object[] elements = getArray();
        int len = elements.length;
        // 3.通过Arrays.copyOf进行数组浅复制扩容，底层是通过调用System.arraycopy()进行数组的浅拷贝
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        // 4.将新元素添加到新数组的末尾
        newElements[len] = e;
        // 5.指向新数组
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}

/**
 * Sets the array.
 */
final void setArray(Object[] a) {
    array = a;
}

实现核心主要是：

通过 ReentrantLock 保证同一时刻只有一个写线程在进行数组的复制，否则会导致内存中存在多份复制数组。
volatile 修饰的底层数组，根据 happens-before 规则，写线程对数组引用的修改对读线程是可见的。

add() 还有重载方法，指定位置添加元素：

/**
 * Inserts the specified element at the specified position in this
 * list. Shifts the element currently at that position (if any) and
 * any subsequent elements to the right (adds one to their indices).
 *
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        if (index > len || index < 0)
            throw new IndexOutOfBoundsException("Index: "+index+
                                                ", Size: "+len);
        Object[] newElements;
        int numMoved = len - index;
        if (numMoved == 0)
            // 末尾添加，直接拷贝生成新数组
            newElements = Arrays.copyOf(elements, len + 1);
        else {
            // 非末尾，先构建新数组，再通过System.arraycopy错位复制
            newElements = new Object[len + 1];
            // 第一次复制源数组下标0-index的元素
            System.arraycopy(elements, 0, newElements, 0, index);
            // 第二次复制源数组index-end的元素，错位复制到新数组，此步完成后index元素重复一次
            System.arraycopy(elements, index, newElements, index + 1,
                             numMoved);
        }
        // 更新元素
        newElements[index] = element;
        // 更新引用
        setArray(newElements);
    } finally {
        lock.unlock();
    }
}

核心就是通过 System.arraycopy() 进行错位复制。

2.4 set

set() 方法源码如下，写线程需要阻塞。

/**
 * Replaces the element at the specified position in this list with the
 * specified element.
 *
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E set(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        // 先获取旧值
        E oldValue = get(elements, index);

        if (oldValue != element) {
            // 仍通过浅复制进行替换
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len);
            newElements[index] = element;
            setArray(newElements);
        } else {
            // 旧值等价于新值，不需要替换，此处只更新一下引用
            // Not quite a no-op; ensures volatile write semantics
            setArray(elements);
        }
        return oldValue;
    } finally {
        lock.unlock();
    }
}

2.5 remove

remove() 方法源码如下，写线程需要阻塞。

/**
 * Removes the element at the specified position in this list.
 * Shifts any subsequent elements to the left (subtracts one from their
 * indices).  Returns the element that was removed from the list.
 *
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E remove(int index) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        E oldValue = get(elements, index);
        int numMoved = len - index - 1;
        if (numMoved == 0)
            setArray(Arrays.copyOf(elements, len - 1));
        else {
            Object[] newElements = new Object[len - 1];
            System.arraycopy(elements, 0, newElements, 0, index);
            System.arraycopy(elements, index + 1, newElements, index,
                             numMoved);
            setArray(newElements);
        }
        return oldValue;
    } finally {
        lock.unlock();
    }
}

/**
 * Removes the first occurrence of the specified element from this list,
 * if it is present.  If this list does not contain the element, it is
 * unchanged.  More formally, removes the element with the lowest index
 * {@code i} such that
 * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>
 * (if such an element exists).  Returns {@code true} if this list
 * contained the specified element (or equivalently, if this list
 * changed as a result of the call).
 *
 * @param o element to be removed from this list, if present
 * @return {@code true} if this list contained the specified element
 */
public boolean remove(Object o) {
    Object[] snapshot = getArray();
    int index = indexOf(o, snapshot, 0, snapshot.length);
    return (index < 0) ? false : remove(o, snapshot, index);
}

/**
 * A version of remove(Object) using the strong hint that given
 * recent snapshot contains o at the given index.
 */
private boolean remove(Object o, Object[] snapshot, int index) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] current = getArray();
        int len = current.length;
        if (snapshot != current) findIndex: {
            int prefix = Math.min(index, len);
            for (int i = 0; i < prefix; i++) {
                if (current[i] != snapshot[i] && eq(o, current[i])) {
                    index = i;
                    break findIndex;
                }
            }
            if (index >= len)
                return false;
            if (current[index] == o)
                break findIndex;
            index = indexOf(o, current, index, len);
            if (index < 0)
                return false;
        }
        Object[] newElements = new Object[len - 1];
        System.arraycopy(current, 0, newElements, 0, index);
        System.arraycopy(current, index + 1,
                         newElements, index,
                         len - index - 1);
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}

/**
 * Removes from this list all of the elements whose index is between
 * {@code fromIndex}, inclusive, and {@code toIndex}, exclusive.
 * Shifts any succeeding elements to the left (reduces their index).
 * This call shortens the list by {@code (toIndex - fromIndex)} elements.
 * (If {@code toIndex==fromIndex}, this operation has no effect.)
 *
 * @param fromIndex index of first element to be removed
 * @param toIndex index after last element to be removed
 * @throws IndexOutOfBoundsException if fromIndex or toIndex out of range
 *         ({@code fromIndex < 0 || toIndex > size() || toIndex < fromIndex})
 */
void removeRange(int fromIndex, int toIndex) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;

        if (fromIndex < 0 || toIndex > len || toIndex < fromIndex)
            throw new IndexOutOfBoundsException();
        int newlen = len - (toIndex - fromIndex);
        int numMoved = len - toIndex;
        if (numMoved == 0)
            setArray(Arrays.copyOf(elements, newlen));
        else {
            Object[] newElements = new Object[newlen];
            System.arraycopy(elements, 0, newElements, 0, fromIndex);
            System.arraycopy(elements, toIndex, newElements,
                             fromIndex, numMoved);
            setArray(newElements);
        }
    } finally {
        lock.unlock();
    }
}

和 set() 、add() 等方法一样，remove() 仍通过数组浅复制进行删除操作，主要是错位复制。

参考：

🔗 《Java并发编程的艺术》

🔗 并发容器之CopyOnWriteArrayList