概述
HashMap是常用的数据结构,在java8中,其底层是用数组、链表、红黑树结合实现的。在新建一个HashMap时会初始化一个指定大小的数组,在插入元素的过程中,会通过key定位到数组的某个位置上,然后根据链表或红黑树的数据结构进行插入。当数组中已使用的位置达到某个阈值时,会对数组进行扩容。HashMap的数据结构大致如下图所示
本文主要分析put、get和remove方法。
一些成员变量
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
| static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
static final int MAXIMUM_CAPACITY = 1 << 30;
static final float DEFAULT_LOAD_FACTOR = 0.75f;
static final int TREEIFY_THRESHOLD = 8;
static final int UNTREEIFY_THRESHOLD = 6;
transient Node<K,V>[] table;
int threshold;
final float loadFactor;
transient int size;
|
put方法
大致流程
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
| start=>start: Start hash=>operation: 对key进行hash求值 isTableEmpty=>condition: 数组未初始化 initTable=>operation: 初始化数组 isFindNodeInArray=>condition: 在数组位置上 找到相同key 的节点 isRedBlackTree=>condition: 数组位置关联 的是红黑树 handleInRedBlackTree=>operation: 在红黑树中执行 节点查找插入操作 isFindNode=>condition: 找到相同 key的节点 handleReplacement=>operation: 根据需要替换对应的value isToReSize=>condition: map元素大小 达到阈值 resize=>operation: 扩容 isFindInList=>condition: 在链表中找到 相同key节点 insertNodeInList=>operation: 在链表尾部插入节点 isChangeToTree=>condition: 链表节点数是 否超过阈值 changeListToTree=>operation: 将链表转为红黑树 end=>end
start->hash->isTableEmpty(yes)->initTable->isFindNodeInArray(yes,right)->isFindNode(yes)->handleReplacement->isToReSize(yes)->resize->end isTableEmpty(no)->isFindNodeInArray(no)->isRedBlackTree(yes)->handleInRedBlackTree->isFindNode isRedBlackTree(no)->isFindInList(no)->insertNodeInList->isChangeToTree(yes)->changeListToTree(right)->isFindNode isFindInList(yes)->isFindNode isFindNode(no)->isToReSize isToReSize(no)->end
|
代码
putVal函数
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
| public V put(K key, V value) { return putVal(hash(key), key, value, false, true); }
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) { Node<K,V>[] tab; Node<K,V> p; int n, i; if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length; if ((p = tab[i = (n - 1) & hash]) == null) tab[i] = newNode(hash, key, value, null); else { Node<K,V> e; K k; if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))) e = p; else if (p instanceof TreeNode) e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value); else { for (int binCount = 0; ; ++binCount) { if ((e = p.next) == null) { p.next = newNode(hash, key, value, null); if (binCount >= TREEIFY_THRESHOLD - 1) treeifyBin(tab, hash); break; } if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) break; p = e; } } if (e != null) { V oldValue = e.value; if (!onlyIfAbsent || oldValue == null) e.value = value; afterNodeAccess(e); return oldValue; } } ++modCount; if (++size > threshold) resize(); afterNodeInsertion(evict); return null; }
|
resize函数
resize函数是HashMap扩容的函数,扩容的主要流程为:
- 将capacity和threshold进行扩容,一般情况下是原先的两倍。
- 根据新的capacity新建数组(一般情况下为原先长度的两倍),将table指向新建的数组
- 对原先数组及其关联的链表、红黑树的节点进行重分布。将一些节点留在 i (i 为节点所在的原先数组位置)位置上,另一些节点留在 i + oldCapacity 上(i 为节点所在的原先数组位置,oldCapacity为原先数组长度)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
| final Node<K,V>[] resize() { Node<K,V>[] oldTab = table; int oldCap = (oldTab == null) ? 0 : oldTab.length; int oldThr = threshold; int newCap, newThr = 0; if (oldCap > 0) { if (oldCap >= MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return oldTab; } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1; } else if (oldThr > 0) newCap = oldThr; else { newCap = DEFAULT_INITIAL_CAPACITY; newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); } if (newThr == 0) { float ft = (float)newCap * loadFactor; newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ? (int)ft : Integer.MAX_VALUE); } threshold = newThr; @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; table = newTab; if (oldTab != null) { for (int j = 0; j < oldCap; ++j) { Node<K,V> e; if ((e = oldTab[j]) != null) { oldTab[j] = null; if (e.next == null) newTab[e.hash & (newCap - 1)] = e; else if (e instanceof TreeNode) ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); else { Node<K,V> loHead = null, loTail = null; Node<K,V> hiHead = null, hiTail = null; Node<K,V> next; do { next = e.next; if ((e.hash & oldCap) == 0) { if (loTail == null) loHead = e; else loTail.next = e; loTail = e; } else { if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) != null); if (loTail != null) { loTail.next = null; newTab[j] = loHead; } if (hiTail != null) { hiTail.next = null; newTab[j + oldCap] = hiHead; } } } } } return newTab; }
|
treeifyBin函数
这是将链表转换为红黑树的函数。在转换之前,会将节点的数据结构由Node转为TreeNode,此转换过程中并不会改变原先链表的顺序,这也是方便重分布红黑树和后续将红黑树转为链表时仍能维持原先的顺序。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
| final void treeifyBin(Node<K,V>[] tab, int hash) { int n, index; Node<K,V> e; if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY) resize(); else if ((e = tab[index = (n - 1) & hash]) != null) { TreeNode<K,V> hd = null, tl = null; do { TreeNode<K,V> p = replacementTreeNode(e, null); if (tl == null) hd = p; else { p.prev = tl; tl.next = p; } tl = p; } while ((e = e.next) != null); if ((tab[index] = hd) != null) hd.treeify(tab); } }
|
split函数
这个函数主要是对红黑树进行重分布。由于将链表转为红黑树的过程中维持者链表节点的顺序,因此此处重分布红黑树也是先执行链表的重分布步骤将节点以链表的形式分散在数组两个位置上,然后对这两个位置的链表以插入的方式构造红黑树。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
| final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) { TreeNode<K,V> b = this; TreeNode<K,V> loHead = null, loTail = null; TreeNode<K,V> hiHead = null, hiTail = null; int lc = 0, hc = 0; for (TreeNode<K,V> e = b, next; e != null; e = next) { next = (TreeNode<K,V>)e.next; e.next = null; if ((e.hash & bit) == 0) { if ((e.prev = loTail) == null) loHead = e; else loTail.next = e; loTail = e; ++lc; } else { if ((e.prev = hiTail) == null) hiHead = e; else hiTail.next = e; hiTail = e; ++hc; } }
if (loHead != null) { if (lc <= UNTREEIFY_THRESHOLD) tab[index] = loHead.untreeify(map); else { tab[index] = loHead; if (hiHead != null) loHead.treeify(tab); } } if (hiHead != null) { if (hc <= UNTREEIFY_THRESHOLD) tab[index + bit] = hiHead.untreeify(map); else { tab[index + bit] = hiHead; if (loHead != null) hiHead.treeify(tab); } } }
|
untreeify函数
此函数是将红黑树转为链表的函数,由于将链表转为红黑树的过程中,替换了节点的数据结构,但是链表的节点顺序并为改变,因此,在将红黑树转为链表的过程中,只需将节点的数据结构由TreeNode转为Node即可。
1 2 3 4 5 6 7 8 9 10 11 12 13 14
| final Node<K,V> untreeify(HashMap<K,V> map) { Node<K,V> hd = null, tl = null; for (Node<K,V> q = this; q != null; q = q.next) { Node<K,V> p = map.replacementNode(q, null); if (tl == null) hd = p; else tl.next = p; tl = p; } return hd; }
|
get方法
get方法相对而言比较简单,主要是先计算hash(key)的值,然后进行查找操作。
大致流程
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
| start=>start: Start hash=>operation: 对key进行hash求值 isFindInArray=>condition: 在数组位置上找 到相同key的节点 returnValue=>operation: 返回节点值 returnNull=>operation: 返回空 isRedBlackTree=>condition: 数组位置 关联红黑树 findInRedBlackTree=>operation: 在红黑树中查找节点 isFindNode=>condition: 找到相同key节点 findInList=>operation: 在链表中查找节点 end=>end: End
start->hash->isFindInArray(no, left)->isRedBlackTree(yes)->findInRedBlackTree->isFindNode(yes)->returnValue->end isFindNode(no)->returnNull->end isFindInArray(yes)->isFindNode isRedBlackTree(no)->findInList->isFindNode
|
代码
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
| public V get(Object key) { Node<K,V> e; return (e = getNode(hash(key), key)) == null ? null : e.value; }
final Node<K,V> getNode(int hash, Object key) { Node<K,V>[] tab; Node<K,V> first, e; int n; K k; if ((tab = table) != null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) != null) { if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k)))) return first; if ((e = first.next) != null) { if (first instanceof TreeNode) return ((TreeNode<K,V>)first).getTreeNode(hash, key); do { if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) return e; } while ((e = e.next) != null); } } return null; }
|
remove方法
大致流程
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
| start=>start: Start hash=>operation: 对key进行hash求值 isFindInArray=>condition: 在数组位置上找 到相同key的节点 isRedBlackTree=>condition: 数组位置 关联红黑树 findInRedBlackTree=>operation: 在红黑树中查找节点 isFindNode=>condition: 找到相同key节点 findInList=>operation: 在链表中查找节点 isRBT=>condition: 数组位置 关联红黑树 deleteNodeInRBT=>operation: 执行红黑树删除节点操作 deleteNodeInList=>operation: 执行链表删除操作 end=>end: End
start->hash->isFindInArray(no, left)->isRedBlackTree(yes)->findInRedBlackTree->isFindNode(yes)->isRBT(yes)->deleteNodeInRBT->end isFindNode(no)->end isFindInArray(yes)->isFindNode isRedBlackTree(no)->findInList->isFindNode isRBT(no)->deleteNodeInList->end
|
代码
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
| final Node<K,V> removeNode(int hash, Object key, Object value, boolean matchValue, boolean movable) { Node<K,V>[] tab; Node<K,V> p; int n, index; if ((tab = table) != null && (n = tab.length) > 0 && (p = tab[index = (n - 1) & hash]) != null) { Node<K,V> node = null, e; K k; V v; if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))) node = p; else if ((e = p.next) != null) { if (p instanceof TreeNode) node = ((TreeNode<K,V>)p).getTreeNode(hash, key); else { do { if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) { node = e; break; } p = e; } while ((e = e.next) != null); } } if (node != null && (!matchValue || (v = node.value) == value || (value != null && value.equals(v)))) { if (node instanceof TreeNode) ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable); else if (node == p) tab[index] = node.next; else p.next = node.next; ++modCount; --size; afterNodeRemoval(node); return node; } } return null; }
|
其他
HashMap是非线程安全的
在 put 方法中,查找链表到尾部时,如果仍然找不到 key 值相同的元素,则在尾部插入新节点。假设线程A要插入(a, a)的键值对,线程B要插入(b, b)的键值对,再假设 a、b 这两个 key 会映射到同一个 table[i] 上,并且原先都没有插入过 key 为 a 或 b 的键值对;当线程A遍历到链表尾部时,在准备(下一步)插入新节点的时候,线程切换到B执行,B也遍历到链表尾部,并插入新节点(b, b),然后线程切换到A,A插入新节点(a, a),此时会造成插入的(b, b)节点被覆盖丢失。
疑问
在查找节点和删除节点这两个操作中,都涉及了查找节点的过程,但是在remove的查找节点的过程中并不是调用get方法中调用的getNode
函数,而是重新写了类似的查找过程,这让我很疑惑。不管怎样,不可否认,java8中HashMap的源码是优秀的。关于疑问就先留着以后解决吧。