Java源码解析 —— ArrayList

YXiao's Blog

2018-03-15

Java › Source Code

前言

ArrayList继承了AbstractList，实现了List。ArrayList在工作中经常用到，所以要弄懂这个类是极其重要的。

构造图如下:
蓝色线条：继承
绿色线条：接口实现

ArrayList简介

定义

1 2	public class ArrayList<E> extends AbstractList<E> implements List<E>, RandomAccess, Cloneable, java.io.Serializable

ArrayList 是一个数组队列，相当于动态数组。与Java中的数组相比，它的容量能动态增长。它继承于AbstractList，实现了List, RandomAccess, Cloneable, java.io.Serializable这些接口。

ArrayList 继承了AbstractList，实现了List。它是一个数组队列，提供了相关的添加、删除、修改、遍历等功能。

ArrayList 实现了RandomAccess接口，即提供了随机访问功能。RandomAccess是java中用来被List实现，为List提供快速访问功能的。在ArrayList中，我们可以通过元素的序号快速获取元素对象；这就是快速随机访问。稍后，我们会比较List的“快速随机访问”和“通过Iterator迭代器访问”的效率。

ArrayList 实现了Cloneable接口，即覆盖了函数clone()，能被克隆。

ArrayList 实现java.io.Serializable接口，这意味着ArrayList支持序列化，能通过序列化去传输。

和Vector不同，ArrayList中的操作不是线程安全的！所以，建议在单线程中才使用ArrayList，而在多线程中可以选择Vector或者CopyOnWriteArrayList。

ArrayList属性

顾名思义，ArrayList就是用数组实现的List容器，既然是用数组实现，当然底层用数组来保存数据。

// 保存ArrayList中数据的数组
transient Object[] elementData;
// ArrayList中实际数据的数量
private int size;

ArrayList包含了两个重要的对象：elementData 和 size。

elementData 是”Object[]类型的数组”，它保存了添加到ArrayList中的元素。实际上，elementData是个动态数组，我们能通过构造函数ArrayList(int initialCapacity)来执行它的初始容量为initialCapacity；如果通过不含参数的构造函数ArrayList()来创建ArrayList，则elementData的容量默认是10。elementData数组的大小会根据ArrayList容量的增长而动态的增长，具体的增长方式，请参考源码分析中的ensureCapacity()函数；
size 则是动态数组的实际大小。

ArrayList构造函数

// ArrayList带容量大小的构造函数。
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        // 新建一个数组
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        // 指向空数组
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity);
    }
}

// ArrayList构造函数。默认容量是10。
public ArrayList() {
    // 指向空数组
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

// 构造一个包含指定元素的list，这些元素的是按照Collection的迭代器返回的顺序排列的
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        this.elementData = EMPTY_ELEMENTDATA;
    }
}

第一个构造方法使用提供的initialCapacity来初始化elementData数组的大小。
第二个构造方法调用第一个构造方法并传入参数10，即默认elementData数组的大小为10。
第三个构造方法则将提供的集合转成数组返回给elementData（返回若不是Object[]将调用Arrays.copyOf方法将其转为Object[]）。

源码解析

增加

/**
 * 添加一个元素
 */
public boolean add(E e) {
  // 进行扩容检查
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // 将e增加至list的数据尾部，容量+1
    elementData[size++] = e;
    return true;
}

/**
* 在指定位置添加一个元素
*/
public void add(int index, E element) {
  // 判断索引是否越界
    rangeCheckForAdd(index);

  // 进行扩容检查
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // 对数组进行复制处理，目的就是空出index的位置插入element，并将index后的元素位移一个位置
    System.arraycopy(elementData, index, elementData, index + 1, size - index);
    // 将指定的index位置赋值为element
    elementData[index] = element;
    // list容量+1
    size++;
}

/**
 * 增加一个集合元素
 */
public boolean addAll(Collection<? extends E> c) {
  //将c转换为数组
    Object[] a = c.toArray();
    int numNew = a.length;
    //扩容检查
    ensureCapacityInternal(size + numNew);  // Increments modCount
    //将c添加至list的数据尾部
    System.arraycopy(a, 0, elementData, size, numNew);
    //更新当前容器大小
    size += numNew;
    return numNew != 0;
}

/**
 * 在指定位置，增加一个集合元素
 */
public boolean addAll(int index, Collection<? extends E> c) {
    rangeCheckForAdd(index);

    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount

  // 计算需要移动的长度（index之后的元素个数）
    int numMoved = size - index;
    // 数组复制，空出第index到index+numNum的位置，即将数组index后的元素向右移动numNum个位置
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew, numMoved);

  // 将要插入的集合元素复制到数组空出的位置中
    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}

/**
 * 数组容量检查，不够时则进行扩容
 */
private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

private void grow(int minCapacity) {
    // 当前数组的长度
    int oldCapacity = elementData.length;
    // 分配原capacity的1.5倍
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    // 如果新扩容的数组长度还是比最小需要的容量小，则以最小需要的容量为长度进行扩容
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);

    // 进行数据拷贝，Arrays.copyOf底层实现是System.arrayCopy()
    elementData = Arrays.copyOf(elementData, newCapacity);
}

删除

/**
 * 根据索引位置删除元素
 */
  public E remove(int index) {
      // 数组越界检查
      rangeCheck(index);

      modCount++;
      // 取出要删除位置的元素，供返回使用
      E oldValue = elementData(index);

      // 计算数组要复制的数量
      int numMoved = size - index - 1;
      // 数组复制，就是将index之后的元素往前移动一个位置
      if (numMoved > 0)
          System.arraycopy(elementData, index+1, elementData, index, numMoved);
          // 将数组最后一个元素置空（因为删除了一个元素，然后index后面的元素都向前移动了，所以最后一个就没用了），好让gc尽快回收
          elementData[--size] = null;

          return oldValue;
  }
  
  /**
   * 根据元素内容删除，只删除匹配的第一个
   */
  public boolean remove(Object o) {
      // 对数据元素进行遍历查找，知道找到第一个要删除的元素，删除后进行返回，如果要删除的元素正好是最后一个那就惨了，时间复杂度可达O(n)
      if (o == null) {
          for (int index = 0; index < size; index++)
            // null值要用==比较
              if (elementData[index] == null) {
                  fastRemove(index);
                  return true;
              }
      } else {
          for (int index = 0; index < size; index++)
              if (o.equals(elementData[index])) {
                  fastRemove(index);
                  return true;
              }
      }
      return false;
  }
  
  public boolean removeAll(Collection<?> c) {
      Objects.requireNonNull(c);
      return batchRemove(c, false);
  }
  
  private boolean batchRemove(Collection<?> c, boolean complement) {
      final Object[] elementData = this.elementData;
      int r = 0, w = 0;
      boolean modified = false;
      try {
          for (; r < size; r++)
              if (c.contains(elementData[r]) == complement)
                  elementData[w++] = elementData[r];
      } finally {
          if (r != size) {
              System.arraycopy(elementData, r,
                               elementData, w,
                               size - r);
              w += size - r;
          }
          if (w != size) {
              // clear to let GC do its work
              for (int i = w; i < size; i++)
                  elementData[i] = null;
              modCount += size - w;
              size = w;
              modified = true;
          }
      }
      return modified;
  }
  
  private void fastRemove(int index) {
      modCount++;
      // 原理和之前的add一样，还是进行数组复制，将index后的元素向前移动一个位置
      int numMoved = size - index - 1;
      if (numMoved > 0)
          System.arraycopy(elementData, index+1, elementData, index, numMoved);
      elementData[--size] = null;
}

代码是很简单，主要需要特别关心的就两个地方：

数组扩容
数组复制

这两个操作都是极费效率的，最惨的情况下(添加到list第一个位置，删除list最后一个元素或删除list第一个索引位置的元素)时间复杂度可达O(n)。

上面讲增加元素可能会进行扩容，而删除元素却不会进行缩容，如果在已删除为主的场景下使用list，一直不停的删除而很少进行增加，那么会出现什么情况？再或者数组进行一次大扩容后，我们后续只使用了几个空间，如何处理这种情况呢？

/**
 * 将底层数组的容量调整为当前实际元素的大小，来释放空间。
 */
public void trimToSize() {
    modCount++;
    // 如果当前实际元素大小小于当前数组的容量，则进行缩容
    if (size < elementData.length) {
        elementData = (size == 0)
          ? EMPTY_ELEMENTDATA
          : Arrays.copyOf(elementData, size);
    }
}

size和length是不同的：size是私有变量，只能通过size()来访问，返回当前数组的实际元素大小；length是数组的属性，返回当前数组的容量。

关于ArrayList删除元素的坑

关于ArrayList的remove方法，在使用中经常会遇到坑，这也是面试中经常会问的点 —— 输出是什么？为什么会这样？我觉得有必要把它单独拿出来说明一下。

我们经常遇到的需求是：当集合中的某些元素符合一定的条件时，需要删除这个元素，例如：

public class ListTest {  
    public static void main(String[] args) {  
       List<Integer> intList = new ArrayList<Integer>();  
       Collections.addAll(intList, 1, 2, 3, 5, 6);  
       // for循环优化写法，只获取一次长度  
       for(int i = 0, size = intList.size(); i < size; i++) {  
           Integer value = intList.get(i);  
           // 符合条件，删除元素  
           if(value == 3 || value == 5) {  
              intList.remove(i);  
           }  
       }  
       System.out.println(intList);  
    }  
}

在执行之后，会抛出IndexOutOfBoundsException，因为ArrayList删除掉符合条件的元素后，长度动态发生了改变，由于长度只获取了一次，导致越界。

去掉for循环优化的代码：

public class ListTest {  
    public static void main(String[] args) {  
       List<Integer> intList = new ArrayList<Integer>();  
       Collections.addAll(intList, 1, 2, 3, 5, 6);  
       for(int i = 0; i < intList.size(); i++) {  
           Integer value = intList.get(i);  
           // 符合条件，删除元素  
           if(value == 3 || value == 5) {  
              intList.remove(i);  
           }  
       }  
       System.out.println(intList);  
    }  
}

输出：[1, 2, 5, 6]。

因为在i == 2时，值为3，删除后，后面的元素往前补一位，所以跳过了5。

使用foreach：

public class ListTest {  
    public static void main(String[] args) {  
       List<Integer> intList = new ArrayList<Integer>();  
       Collections.addAll(intList, 1, 2, 3, 5, 6);  
       for(Integer value : intList) {  
           // 符合条件，删除元素  
           if(value == 3 || value == 5) {  
              intList.remove(value);  
           }  
       }  
       System.out.println(intList);  
    }  
}

执行后，会抛出ConcurrentModificationException，意思是并发修改异常，异常追踪信息如下：

Exception inthread "main" java.util.ConcurrentModificationException
     atjava.util.AbstractList$Itr.checkForComodification(AbstractList.java:449)
     at java.util.AbstractList$Itr.next(AbstractList.java:420)
     at ListTest.main(ListTest.java:13)

大概看出是执行到AbstractList中内部类的checkForComodification方法抛出的异常。集合遍历是使用Iterator，Iterator是工作在一个独立的线程中，并且拥有一个互斥锁。Iterator被创建之后会建立一个指向原来对象的单链索引表，当原来的对象数量发生变化时，这个索引表的内容不会同步改变，所以当索引指针往后移动的时候就找不到要迭代的对象。

AbstractList$Itr.checkForComodification：

final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}

所以记住一点：Iterator在工作的时候是不允许被迭代的对象被改变的。

那么正确删除ArrayList元素的姿势是什么呢？这里有三种方法：

使用Iterator的remove方法

该方法会删除当前迭代对象的同时，维护索引的一致性。

Iterator的remove方法源码：

public void remove() {
    if (lastRet < 0)
        throw new IllegalStateException();
    checkForComodification();

    try {
        AbstractList.this.remove(lastRet);
        if (lastRet < cursor)
            cursor--;
        lastRet = -1;
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException e) {
        throw new ConcurrentModificationException();
    }
}

使用Iterator删除元素：

public class ListTest {  
    public static void main(String[] args) {  
       List<Integer> intList = new ArrayList<Integer>();  
       Collections.addAll(intList, 1, 2, 3, 5, 6);  
       Iterator<Integer> it = intList.iterator();  
       while(it.hasNext()) {  
           Integer value = it.next();  
           if(value == 3 || value == 5) {  
              it.remove();  
           }  
       }  
       System.out.println(intList);  
    }  
}

输出：[1, 2, 6]。

自己维护索引

即，删除元素后，索引-1。

public class ListTest {  
    public static void main(String[] args) {  
       List<Integer> intList = new ArrayList<Integer>();  
       Collections.addAll(intList, 1, 2, 3, 5, 6);  
       for(int i = 0; i < intList.size(); i++) {  
           Integer value = intList.get(i);  
           if(value == 3 || value == 5) {  
              intList.remove(i);  
              i--;  
           }  
       }  
        System.out.println(intList);  
    }  
}

输出：[1, 2, 6]。

从后向前遍历

public class ListTest {  
    public static void main(String[] args) {  
       List<Integer> intList = new ArrayList<Integer>();  
       Collections.addAll(intList, 1, 2, 3, 5, 6);  
       for(int i = intList.size() - 1; i >= 0; i--) {  
           Integer value = intList.get(i);  
           if(value == 3 || value == 5) {  
              intList.remove(i);  
           }  
       }  
        System.out.println(intList);  
    }  
}

输出：[1, 2, 6]。

更新

/**
 * 将指定位置的元素更新为新元素
 */
public E set(int index, E element) {
    // 数组越界检查
    rangeCheck(index);

    // 取出要更新位置的元素，供返回使用
    E oldValue = elementData(index);
    // 将该位置赋值为行的元素
    elementData[index] = element;
    // 返回旧元素
    return oldValue;
}

查找

public E get(int index) {
    rangeCheck(index);

    return (E) elementData[index];
}

总结

ArrayList 实际上是通过一个数组去保存数据的。当我们构造ArrayList时；若使用默认构造函数，则ArrayList的默认容量大小是10。
当ArrayList容量不足以容纳全部元素时，ArrayList会重新设置容量：新的容量=“原始容量 * 1.5”。
ArrayList的克隆函数，即是将全部元素克隆到一个数组中。
ArrayList实现java.io.Serializable的方式。当写入到输出流时，先写入“容量”，再依次写入“每一个元素”；当读出输入流时，先读取“容量”，再依次读取“每一个元素”。

ArrayList遍历方式

ArrayList支持3种遍历方式：

迭代器遍历

Integer value = null;
Iterator iter = list.iterator();
while (iter.hasNext()) {
    value = (Integer)iter.next();
}

随机访问

即通过索引值去遍历，由于ArrayList实现了RandomAccess接口，它支持通过索引值去随机访问元素。

Integer value = null;
int size = list.size();
for (int i = 0; i < size; i++) {
    value = (Integer)list.get(i);        
}

for-each遍历

Integer value = null;
for (Integer i : list) {
    value = i;
}

遍历ArrayList时，使用随机访问(即，通过索引序号访问)效率最高，而使用迭代器的效率最低。

总结

ArrayList和LinkedList的区别

ArrayList是实现了基于动态数组的数据结构，LinkedList基于链表的数据结构。
对于随机访问get和set，ArrayList觉得优于LinkedList，因为LinkedList要移动指针。
对于新增和删除操作add和remove，LinkedList比较占优势，因为ArrayList要移动数据。

ArrayList和Vector的区别

Vector和ArrayList几乎是完全相同的,唯一的区别在于Vector是同步类(synchronized)，属于强同步类。因此开销就比ArrayList要大，访问要慢。正常情况下,大多数的Java程序员使用ArrayList而不是Vector,因为同步完全可以由程序员自己来控制。
Vector每次扩容请求其大小的2倍空间，而ArrayList是1.5倍。
Vector还有一个子类Stack.

附：集合框架思维导图

collection