JDK Source: Map and Hashtable

October 30, 2019
Notes

Map

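Map is an interface: a container that maps keys to values. By definition, a map cannot contain duplicate keys, and each key maps to at most one value. Whether a map is ordered depends on the implementing class: TreeMap iterates in sorted key order, while HashMap makes no ordering guarantee.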
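To make the ordering difference concrete, here is a minimal sketch. The class name MapOrderingDemo and its helper methods are made up for illustration; only the java.util classes are real.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrderingDemo {
    // Returns the keys in the order the given map iterates them.
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    // Fills the given (empty) map with three sample entries and returns it.
    static Map<String, Integer> fill(Map<String, Integer> map) {
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("mango", 2);
        return map;
    }

    public static void main(String[] args) {
        // TreeMap iterates in ascending key order: [apple, mango, pear]
        System.out.println(keysInIterationOrder(fill(new TreeMap<String, Integer>())));
        // HashMap makes no ordering guarantee, so the printed order may differ.
        System.out.println(keysInIterationOrder(fill(new HashMap<String, Integer>())));
    }
}
```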

An operation that a particular map does not support throws UnsupportedOperationException.
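One easy way to see this is a read-only view from Collections.unmodifiableMap, which rejects put. The sketch below uses a hypothetical helper, rejectsPut, to report whether a map refuses the operation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class UnsupportedOperationDemo {
    // Returns true if put() on the given map is rejected with UnsupportedOperationException.
    static boolean rejectsPut(Map<String, Integer> map) {
        try {
            map.put("k", 1);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> readOnly =
                Collections.unmodifiableMap(new HashMap<String, Integer>());
        System.out.println(rejectsPut(readOnly));                       // true
        System.out.println(rejectsPut(new HashMap<String, Integer>())); // false
    }
}
```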

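The Map interface itself does not dictate whether keys and values may be null, nor does it force every implementation to decide key equality with equals or hashCode; both depend on the concrete implementing class.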
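A short sketch of how implementations differ on both points. The class and method names are illustrative; the behaviour shown is that of the standard HashMap, Hashtable, and TreeMap. The TreeMap built with String.CASE_INSENSITIVE_ORDER decides key equality through its comparator rather than through equals.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.TreeMap;

class NullAndEqualityDemo {
    // Returns true if the map accepts a null key, false if put throws NullPointerException.
    static boolean acceptsNullKey(Map<String, Integer> map) {
        try {
            map.put(null, 1);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // Inserts "A" and "a" and reports how many distinct keys the map ends up with.
    static int distinctKeys(Map<String, Integer> map) {
        map.put("A", 1);
        map.put("a", 2);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<String, Integer>()));   // true
        System.out.println(acceptsNullKey(new Hashtable<String, Integer>())); // false
        System.out.println(acceptsNullKey(new TreeMap<String, Integer>()));   // false (natural ordering)
        System.out.println(distinctKeys(new HashMap<String, Integer>()));     // 2
        // The comparator treats "A" and "a" as the same key.
        System.out.println(distinctKeys(
                new TreeMap<String, Integer>(String.CASE_INSENSITIVE_ORDER))); // 1
    }
}
```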

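Map declares methods such as get(Object key), put(key, value), replace(key, value), remove(key), entrySet(), and keySet(). Apart from the view methods entrySet() and keySet(), they all look up or replace a value by its key.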
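As a quick illustration of what these methods return, the sketch below runs a few calls against a HashMap and collects the results. The class MapBasicsDemo and its demo() method are invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapBasicsDemo {
    // Runs a few Map calls in sequence and records what each one returns.
    static List<Object> demo() {
        Map<String, Integer> stock = new HashMap<>();
        List<Object> results = new ArrayList<>();
        results.add(stock.put("apple", 3));      // null: no previous mapping
        results.add(stock.put("apple", 5));      // 3: the value that was replaced
        results.add(stock.get("apple"));         // 5
        results.add(stock.replace("pear", 1));   // null: key absent, nothing is inserted
        results.add(stock.containsKey("pear"));  // false
        results.add(stock.remove("apple"));      // 5: the removed value
        results.add(stock.isEmpty());            // true
        return results;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [null, 3, 5, null, false, 5, true]
    }
}
```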

Hashtable

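Hashtable extends Dictionary and implements the Map interface; it is a Map implementation that is no longer recommended for new code. Dictionary is an abstract class for storing key-value pairs that dates back to JDK 1.0 (the Map interface arrived in 1.2, and Hashtable was retrofitted to implement it). Its Javadoc describes Dictionary as obsolete.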

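Hashtable stores its entries in an array of linked lists, and it is thread-safe. The rest of these notes covers that data structure and how get(key), put(key, value), remove(key), and keySet() are implemented.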

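Data structure: Hashtable is an array of buckets, where each array slot holds the head of a singly linked list of entries. The reason for this design is that a key's slot is computed as (key.hashCode() & 0x7FFFFFFF) % tab.length, and that index is not unique. The mask clears the sign bit so the index is never negative, but two different keys can share a hashCode, two different hash codes can become equal after masking with 0x7FFFFFFF, and different masked values can leave the same remainder. Keys that land on the same index are chained together in that slot's linked list.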

    private transient Entry<?,?>[] table;

    private static class Entry<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Entry<K,V> next;
        // constructor and Map.Entry methods omitted
    }
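The index formula and the ways two keys can collide can be tried out directly. This is a small sketch; indexFor is a hypothetical helper that copies the formula quoted above, not a method of Hashtable.

```java
class BucketIndexDemo {
    // Same formula Hashtable uses to pick a bucket (helper name is made up).
    static int indexFor(Object key, int tableLength) {
        return (key.hashCode() & 0x7FFFFFFF) % tableLength;
    }

    public static void main(String[] args) {
        // Different keys, identical hashCode: "Aa" and "BB" both hash to 2112.
        System.out.println("Aa".hashCode() + " " + "BB".hashCode());
        System.out.println(indexFor("Aa", 11) + " " + indexFor("BB", 11));

        // Different hash codes, same remainder: 1 % 11 == 12 % 11 == 1.
        System.out.println(indexFor(1, 11) + " " + indexFor(12, 11));

        // Without the mask a negative hash code would give a negative index.
        System.out.println(-7 % 11);           // negative remainder
        System.out.println(indexFor(-7, 11));  // always within 0..10
    }
}
```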

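Because the entries live in an array, the initial length and the growth policy matter. The default capacity is 11 and the default load factor is 0.75. The resize threshold is capacity * loadFactor: when put is about to add a new entry and the entry count has reached that threshold, Hashtable allocates a new array of length oldLength * 2 + 1 and re-indexes every existing entry into it. Resizing only happens inside put(key, value), so an individual put can be slow when it triggers a rehash.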

    // The no-arg constructor creates a table of length 11 with load factor 0.75
    public Hashtable() {
        this(11, 0.75f);
    }

    public Hashtable(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: " +
                                               initialCapacity);
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal Load: " + loadFactor);

        if (initialCapacity == 0)
            initialCapacity = 1;
        this.loadFactor = loadFactor;
        table = new Entry<?,?>[initialCapacity];
        threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    }
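To get a feel for the numbers, the following sketch replays the growth arithmetic for the default settings. GrowthDemo and its two helpers are hypothetical names, and the sketch deliberately ignores the MAX_ARRAY_SIZE cap that the real code applies.

```java
class GrowthDemo {
    // New array length after a rehash: old * 2 + 1 (MAX_ARRAY_SIZE cap ignored here).
    static int nextCapacity(int oldCapacity) {
        return (oldCapacity << 1) + 1;
    }

    // Entry count at which the next insertion triggers a rehash.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 11;
        for (int step = 0; step < 4; step++) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity, 0.75f));
            capacity = nextCapacity(capacity);
        }
        // capacity=11 threshold=8
        // capacity=23 threshold=17
        // capacity=47 threshold=35
        // capacity=95 threshold=71
    }
}
```

With the defaults, then, the first rehash happens when a ninth distinct key is inserted.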

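put: the bucket index is computed with (key.hashCode() & 0x7FFFFFFF) % tab.length, and the linked list in that bucket is walked. If an entry with the same hash and an equal key (per key.equals) is found, its value is overwritten in place and the old value is returned. Otherwise put falls through to addEntry, which checks whether the entry count has reached the threshold (rehashing and recomputing the index if so), builds a new Entry, and inserts it at the head of the bucket's list.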

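Inserting at the head of the list is the simplest option. In rehash it is the pair of statements e.next = (Entry<K,V>)newMap[index]; newMap[index] = e; and addEntry achieves the same thing by passing the current head as the new entry's next.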

    public synchronized V put(K key, V value) {
        // Make sure the value is not null
        if (value == null) {
            throw new NullPointerException();
        }

        // Makes sure the key is not already in the hashtable.
        Entry<?,?> tab[] = table;
        // A null key is not allowed either: key.hashCode() throws NullPointerException
        int hash = key.hashCode();
        // Clear the sign bit (0x7FFFFFFF is Integer.MAX_VALUE) so the index is never negative
        int index = (hash & 0x7FFFFFFF) % tab.length;
        @SuppressWarnings("unchecked")
        Entry<K,V> entry = (Entry<K,V>)tab[index];
        for(; entry != null ; entry = entry.next) {
            if ((entry.hash == hash) && entry.key.equals(key)) {
                // Key already present: overwrite the value in place and return the old one
                V old = entry.value;
                entry.value = value;
                return old;
            }
        }

        addEntry(hash, key, value, index);
        return null;
    }

    private void addEntry(int hash, K key, V value, int index) {
        modCount++;

        Entry<?,?> tab[] = table;
        if (count >= threshold) {
            // Grow the table, then recompute the index against the new length
            rehash();

            tab = table;
            hash = key.hashCode();
            index = (hash & 0x7FFFFFFF) % tab.length;
        }

        // Insert the new entry at the head of the bucket's list
        @SuppressWarnings("unchecked")
        Entry<K,V> e = (Entry<K,V>) tab[index];
        tab[index] = new Entry<>(hash, key, value, e);
        count++;
    }

    protected void rehash() {
        int oldCapacity = table.length;
        Entry<?,?>[] oldMap = table;

        // Grow to oldCapacity * 2 + 1, capped at MAX_ARRAY_SIZE
        int newCapacity = (oldCapacity << 1) + 1;
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (oldCapacity == MAX_ARRAY_SIZE)
                // Keep running with MAX_ARRAY_SIZE buckets
                return;
            newCapacity = MAX_ARRAY_SIZE;
        }
        Entry<?,?>[] newMap = new Entry<?,?>[newCapacity];

        modCount++;
        threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
        table = newMap;

        // Move every entry to the head of its bucket in the new array
        for (int i = oldCapacity ; i-- > 0 ;) {
            for (Entry<K,V> old = (Entry<K,V>)oldMap[i] ; old != null ; ) {
                Entry<K,V> e = old;
                old = old.next;

                int index = (e.hash & 0x7FFFFFFF) % newCapacity;
                e.next = (Entry<K,V>)newMap[index];
                newMap[index] = e;
            }
        }
    }
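Head insertion, which both addEntry and rehash rely on, can be isolated in a few lines. This is a simplified sketch, not the JDK code: Node stands in for Entry and carries only a key, and pushFront and render are hypothetical helpers.

```java
class HeadInsertDemo {
    // Simplified stand-in for Hashtable.Entry: just a key and a next pointer.
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Inserts at the head: the new node points at the old head and becomes the new head.
    static Node pushFront(Node head, String key) {
        return new Node(key, head);
    }

    // Renders a chain as "k1 -> k2 -> k3".
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node bucket = null;
        bucket = pushFront(bucket, "a");
        bucket = pushFront(bucket, "b");
        bucket = pushFront(bucket, "c");
        System.out.println(render(bucket)); // c -> b -> a
    }
}
```

No traversal is needed, so the insertion itself is constant time; the trade-off is that the most recently added key sits first in the chain.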

get(key) uses the hash to find the array index, then walks the linked list in that bucket and uses equals to find the matching key.

    public synchronized V get(Object key) {
        Entry<?,?> tab[] = table;
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                return (V)e.value;
            }
        }
        return null;
    }

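remove(key) is just as simple: use the hash to find the array index, walk that bucket's list until an entry with an equal key is found, then unlink it by pointing the previous node's next at the removed node's next (or, if it was the head, by storing its next in the array slot). The removed entry's value is then set to null.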

    public synchronized V remove(Object key) {
        Entry<?,?> tab[] = table;
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        @SuppressWarnings("unchecked")
        Entry<K,V> e = (Entry<K,V>)tab[index];
        for(Entry<K,V> prev = null ; e != null ; prev = e, e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                modCount++;
                if (prev != null) {
                    prev.next = e.next;
                } else {
                    tab[index] = e.next;
                }
                count--;
                V oldValue = e.value;
                e.value = null;
                return oldValue;
            }
        }
        return null;
    }
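The unlinking step is the only subtle part, so here it is on its own as a simplified sketch. Node again stands in for Entry, and chain, remove, and render are hypothetical helpers; the prev/e loop mirrors the one in the excerpt above.

```java
class ChainRemoveDemo {
    // Simplified stand-in for Hashtable.Entry.
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Builds a chain whose nodes appear in the same order as the arguments.
    static Node chain(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    // Removes the first node with the given key and returns the (possibly new) head.
    static Node remove(Node head, String key) {
        Node prev = null;
        for (Node e = head; e != null; prev = e, e = e.next) {
            if (e.key.equals(key)) {
                if (prev != null) {
                    prev.next = e.next; // bypass a middle or tail node
                    return head;
                }
                return e.next;          // the head itself was removed
            }
        }
        return head;                    // key not present: chain unchanged
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(remove(chain("a", "b", "c"), "b"))); // a -> c
        System.out.println(render(remove(chain("a", "b", "c"), "a"))); // b -> c
        System.out.println(render(remove(chain("a", "b", "c"), "x"))); // a -> b -> c
    }
}
```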

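keySet() returns all the keys and is a common way to iterate over a map. Here is how Hashtable handles it: Hashtable implements its own KeySet class, but that class is only a thin shell. The real work happens in its iterator() method, which returns an instance of Enumerator, an inner class of Hashtable. Its implementation and runtime logic are as follows: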

    private class Enumerator<T> implements Enumeration<T>, Iterator<T> {
        // The hashtable's bucket array
        Entry<?,?>[] table = Hashtable.this.table;
        // Current array position; iteration runs from the last slot down to 0
        int index = table.length;
        // The entry that will be returned next
        Entry<?,?> entry;
        // The entry returned by the previous call
        Entry<?,?> lastReturned;
        // Whether to iterate keys, values, or entries
        int type;

        // true when serving as an Iterator, false when serving as an Enumeration
        boolean iterator;

        protected int expectedModCount = modCount;

        Enumerator(int type, boolean iterator) {
            this.type = type;
            this.iterator = iterator;
        }

        // 1. If the current entry is non-null, there is a next element.
        // 2. Otherwise scan the array from index down to 0; if every remaining
        //    slot is null, there are no more elements.
        public boolean hasMoreElements() {
            Entry<?,?> e = entry;
            int i = index;
            Entry<?,?>[] t = table;
            /* Use locals for faster loop iteration */
            while (e == null && i > 0) {
                e = t[--i];
            }
            entry = e;
            index = i;
            return e != null;
        }

        // If entry is non-null, return it and advance entry to entry.next.
        // Otherwise scan the array in reverse order to the first non-null slot,
        // take that node as entry, record the array index, return it, and
        // advance entry to its next.
        @SuppressWarnings("unchecked")
        public T nextElement() {
            Entry<?,?> et = entry;
            int i = index;
            Entry<?,?>[] t = table;
            /* Use locals for faster loop iteration */
            while (et == null && i > 0) {
                et = t[--i];
            }
            entry = et;
            index = i;
            if (et != null) {
                Entry<?,?> e = lastReturned = entry;
                entry = e.next;
                return type == KEYS ? (T)e.key : (type == VALUES ? (T)e.value : (T)e);
            }
            throw new NoSuchElementException("Hashtable Enumerator");
        }

        public boolean hasNext() {
            return hasMoreElements();
        }

        // Only the Iterator path checks modCount, so only it is fail-fast
        public T next() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            return nextElement();
        }
    }
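One visible consequence of the excerpt above is that next() checks modCount while nextElement() does not. The sketch below tries both paths on a real Hashtable; FailFastDemo and its method names are made up for illustration.

```java
import java.util.ConcurrentModificationException;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Iterator;

class FailFastDemo {
    static Hashtable<String, Integer> sample() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);
        table.put("c", 3);
        return table;
    }

    // Iterator path: keySet().iterator() goes through next(), which checks modCount.
    static boolean iteratorFailsAfterPut() {
        Hashtable<String, Integer> table = sample();
        Iterator<String> it = table.keySet().iterator();
        it.next();
        table.put("d", 4); // structural modification: a new key is added
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Enumeration path: keys() goes through nextElement(), which has no modCount check.
    static boolean enumerationFailsAfterPut() {
        Hashtable<String, Integer> table = sample();
        Enumeration<String> en = table.keys();
        en.nextElement();
        table.put("d", 4);
        try {
            en.nextElement();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iteratorFailsAfterPut());    // true
        System.out.println(enumerationFailsAfterPut()); // false
    }
}
```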

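Hashtable is thread-safe because its public methods such as get, put, and remove are declared synchronized, so at most one thread can be operating on a given Hashtable at any moment.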
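As a rough check of that guarantee, the following sketch has several threads insert disjoint key ranges into one Hashtable and then reads the size. SynchronizedPutDemo and concurrentPuts are hypothetical names, and the thread and key counts are arbitrary.

```java
import java.util.Hashtable;
import java.util.Map;

class SynchronizedPutDemo {
    // Starts `threads` workers that each put `perThread` distinct keys, then returns map.size().
    static int concurrentPuts(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // each worker gets its own key range
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        // Every put is synchronized on the table, so no insertion is lost.
        System.out.println(concurrentPuts(new Hashtable<Integer, Integer>(), 4, 10_000)); // 40000
    }
}
```

Note that the lock covers one method call at a time. A compound sequence such as "check with get, then put" is not atomic unless the caller synchronizes on the table around the whole sequence.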