.NET深入了解哈希表和Dictionary

引子

問題:給定一串數字{1,2,5,7,15,24,33,52},如何在時間複雜度為O(1)下,對數據進行CURD?

數組:我創建一個Length為53的數組,將元素插入相同下標處,是不是就可以實現查找複雜度O(1)了?但是添加修改元素時間複雜度為O(n)了。

鏈表:添加刪除複雜度為O(1),但是查找時間複雜度為O(n)了。

身為.NETer肯定熟練使用Dictionary和HashSet,這兩個容器的底層就是HashTable,所以帶著對技術濃重的興趣(面試),所以就從頭到尾梳理一下!

理論

鏈地址法(拉鏈法)

回到問題本身,我們用數組可以實現查找複雜度為O(1),鏈表實現添加刪除複雜度為O(1),如果我們將兩個合起來,不就可以實現增刪查都為O(1)了么?如何結合呢?

我們先定義一個數組,長度為7(敲黑板,思考下為什麼選7?),將所有元素對7取余,這樣所有元素都可以放在數組上了,如下圖所示:

如上圖,如果我們將數組中每個下標位置都放成一個鏈條,這樣,複雜度不久降下去了么?

有問題么?沒問題。真沒問題么?有問題……

注意

  1. 插入元素是{0,7,14,21,28}怎麼辦?這樣都落在下標為0的鏈條里,時間複雜度不又上去了?針對這種情況,隔壁Java將鏈表優化成了紅黑書,我們.NET呢?往下看。

  2. 如果我的數組長度不是7,是2怎麼辦?所有數對2取余,不是1就是0,時間複雜度不又上去了?所以我們對數組長度應該取素數。

  3. 如果元素超級多或者特別少,我們的數組長度要固定么?就要動態長度

上邊這種方法學名就叫拉鏈法!

開放地址法

上邊我們聊過拉鏈法(為什麼老想著褲子拉鏈……),拉鏈法是向下開闢新的空間,如果我們橫向開闢空間呢?還是剛才的例子,我們這樣搞一下試試。

線性探測法

我們插完7以後,在插24時,發現下標為2的地方有元素了,於是向後移動一位,發現有空位,於是就插進去了。

上邊這種方法就是線性探測法!

二次聚集(堆積)

聰明的老鳥們,肯定疑惑啦,如果我們繼續添加元素{x%11=4},{y%11=5},此時x,y元素都要往下標6插數據。這樣就導致了原始哈希地址不同的元素要插入同一個地址。即添加同義詞的衝突過程中又添加了非同義詞的衝突。這就是二次聚集

二次探測法

如果在線性探測法中,我們不依次尋找下一個呢?我們針對”下一個”採取{1 ^ 2,-1 ^ 2,2 ^ 2,-2 ^ 2….}(垃圾編輯器,次方樣式亂了)這樣的步長呢?真聰明!你已經知道二次探測法了!

這……這還能用么?不都亂了么?下標和元素對不上了呀!怎麼去查找元素呢?

別急呀,家人們吶,我們按照這個思路查詢就好了:

查找演算法步驟

1. 給定待查找的關鍵字key,獲取原始應該插入的下標index
2. 如果原始下標index處,元素為空,則所查找的元素不存在
3. 如果index處的元素等於key,則查找成功
4. 否則重複下述解決衝突的過程
  * 按照處理衝突的方法,計算下一個地址nextIndex
  * 若nextIndex為空,則查找元素不存在
  * 若nextIndex等於關鍵詞key,則查找成功

還有要注意的點么?必須有!

注意(敲重點啦)

  1. 數組長度必須大於給定元素的長度!
  2. 當數組元素快裝滿時,時間複雜度也是O(n)!
  3. 如果都裝滿了,就會一直循環找空位,我們應該進行擴容!

理論小結

介面設計

幹活啦,幹活啦,領導嫌查詢效率太低,讓設計一種CURD時間複雜度都為O(n)的數據結構。給了介面。介面如下:

internal interface IDictionary<TK, TV> : IEnumerable<KeyValuePair<TK, TV>>
{
    TV this[TK key] { get; set; }

    int Count { get; }
    /// <summary>
    /// 根據key判斷元素是否存在
    /// </summary>
    /// <param name="key"></param>
    /// <returns></returns>
    bool ContainsKey(TK key);
    /// <summary>
    /// 添加元素
    /// </summary>
    /// <param name="key"></param>
    /// <param name="value"></param>
    void Add(TK key, TV value);
    /// <summary>
    /// 根據key移除元素
    /// </summary>
    /// <param name="key"></param>
    void Remove(TK key);
    /// <summary>
    /// 清除
    /// </summary>
    void Clear();
}

.NET實現線性探測法

實現過程

1. 先來個對象,存儲key和value

對象:KeyValuePair
internal class DictionaryKeyValuePair<TK, TV>
{
    internal TK Key;
    internal TV Value;

    internal DictionaryKeyValuePair(TK key, TV value)
    {
        Key = key;
        Value = value;
    }
}

2. 來個類OpenAddressDictionary,繼承IDictionary介面,就是我們的實現類

實現類:OpenAddressDictionary
/// <summary>
/// 使用線性探測法實現哈希表
/// </summary>
/// <typeparam name="TK"></typeparam>
/// <typeparam name="TV"></typeparam>
public class OpenAddressDictionary<TK, TV> : IDictionary<TK, TV>
{
    //創建一個數組,用來存儲元素
    private DictionaryKeyValuePair<TK, TV>[] hashArray;

    //記錄已插入元素的數量
    public int Count { get; private set; }

    public OpenAddressDictionary(int capacity)
    {
        if (capacity < 0)
            throw new ArgumentOutOfRangeException("初始值容量不能小於0");
        hashArray = new DictionaryKeyValuePair<TK, TV>[capacity];
    }
    public TV this[TK key] {
        get => throw new System.NotImplementedException();
        set => throw new System.NotImplementedException();
    }

    public void Add(TK key, TV value)
    {
        throw new System.NotImplementedException();
    }

    public void Clear()
    {
        throw new System.NotImplementedException();
    }

    public System.Boolean ContainsKey(TK key)
    {
        throw new System.NotImplementedException();
    }

    public IEnumerator<KeyValuePair<TK, TV>> GetEnumerator()
    {
        throw new System.NotImplementedException();
    }

    public void Remove(TK key)
    {
        throw new System.NotImplementedException();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        throw new System.NotImplementedException();
    }
}

3.如何實現查找?跟著上文查找步驟就行

線性探測:查找
/// <summary>
/// 查找,按照上文線性探測查找步驟
/// </summary>
/// <param name="key"></param>
/// <returns></returns>
public bool ContainsKey(TK key)
{
    //1.給定待查找的關鍵字key,獲取原始應該插入的下標index
    var hashCode = GetHash(key);
    var index = hashCode % hashArray.Length;

    //2.如果原始下標index處,元素為空,則所查找的元素不存在
    if (hashArray[index] == null) return false;

    var current = hashArray[index];//當前元素

    /*這個點用來判斷是否走了一整圈*/
    var hitKey = current.Key;

    //4.否則重複下述解決衝突的過程           
    while (current != null)
    {
        //3.如果index處的元素等於key,則查找成功
        if (current.Key.Equals(key)) return true;

        /*這個地方來修改獲取下一個元素位置*/
        index++;

        /*到尾了,但是沒有走完一圈*/
        if (index == hashArray.Length)
            index = 0;

        current = hashArray[index];

        /*走完一圈了,沒找到*/
        if (current != null && current.Key.Equals(hitKey)) break;
    }

    return false;
}

4. 添加

線性探測:添加
/// <summary>
/// 添加元素
/// </summary>
/// <param name="key"></param>
/// <param name="value"></param>
/// <exception cref="Exception"></exception>
public void Add(TK key, TV value)
{
    Grow();
    //1.獲取原始插入位置
    var hashCode = GetHash(key);
    var index = hashCode % hashArray.Length;

    //2.此位置為空,直接插入
    if (hashArray[index] == null)
    {
        hashArray[index] = new DictionaryKeyValuePair<TK, TV>(key, value);
    }
    //3.坑被佔了,去看看下一個
    else
    {
        var current = hashArray[index];
        /*這個點用來判斷是否走了一整圈*/
        var hitKey = current.Key;
        while (current != null)
        {
            if (current.Key.Equals(key)) throw new Exception("重複key");

            /*這個地方來修改獲取下一個元素位置*/
            index++;

            /*到尾了,但是沒有走完一圈*/
            if (index == hashArray.Length)
                index = 0;

            current = hashArray[index];

            /*走完一圈了,沒找到空位*/
            if (current != null && current.Key.Equals(hitKey)) throw new Exception("容器滿了");
        }
        hashArray[index] = new DictionaryKeyValuePair<TK, TV>(key, value);
    }
    Count++;
}
/// <summary>
/// 擴容
/// </summary>
private void Grow()
{
    /*這個地方判斷使用多少擴容*/
    if (hashArray.Length * 0.7 <= Count)
    {
        var orghashArray = hashArray.Length;
        var currentArray = hashArray;

        /*這個地方改變擴容大小的規則*/
        hashArray = new DictionaryKeyValuePair<TK, TV>[hashArray.Length * 2];
        for (var i = 0; i < orghashArray; i++)
        {
            var current = currentArray[i];

            /*舊數組中存在元素,添加到新數組中,Add方法會對Count++,所以加入後要Count--*/
            if (current != null)
            {
                Add(current.Key, current.Value);
                Count--;
            }
        }
        currentArray = null;
    }
}

5. 刪除

線性探測:刪除
/// <summary>
/// 刪除元素key
/// </summary>
/// <param name="key"></param>
/// <exception cref="Exception"></exception>
public void Remove(TK key)
{
    //1.獲取原始插入位置
    var hashCode = GetHash(key);
    var curIndex = hashCode % hashArray.Length;

    //2.此位置為空,無法刪除
    if (hashArray[curIndex] == null) throw new Exception("未找到元素key");

    var current = hashArray[curIndex];

    /*這個點用來判斷是否走了一整圈*/
    var hitKey = current.Key;

    #region 找到待刪除元素
    DictionaryKeyValuePair<TK, TV> target = null;

    while (current != null)
    {
        if (current.Key.Equals(key))
        {
            target = current;
            break;
        }

        /*這個地方來修改獲取下一個元素位置*/
        curIndex++;

        /*到尾了,但是沒有走完一圈*/
        if (curIndex == hashArray.Length)
            curIndex = 0;

        current = hashArray[curIndex];

        /*走完一圈了,沒找到空位*/
        if (current != null && current.Key.Equals(hitKey)) throw new Exception("No such item for given key");
    }

    if (target == null)
    {
        throw new Exception("未找到元素key");
    }
    #endregion  


    //刪除,將當前位置置空
    hashArray[curIndex] = null;

    #region 之前講過刪除,造成元素丟失,所以在此處處理

    curIndex++;

    /*到尾了,但是沒有走完一圈*/
    if (curIndex == hashArray.Length)
        curIndex = 0;

    current = hashArray[curIndex];

    //直到下一個為空的點,到空說明後邊的還沒有被線性探測插入污染
    while (current != null)
    {
        //先刪除
        hashArray[curIndex] = null;

        //重新插入
        Add(current.Key, current.Value);
        Count--;

        curIndex++;

        /*到尾了,但是沒有走完一圈*/
        if (curIndex == hashArray.Length)
            curIndex = 0;

        current = hashArray[curIndex];
    }
    #endregion


    Count--;

    Shrink();
}

/// <summary>
/// 減容
/// </summary>
private void Shrink()
{
    /*這個地方判斷元素在什麼程度算少*/
    if (Count <= hashArray.Length * 0.3 && hashArray.Length / 2 > 0)
    {
        var orghashArray = hashArray.Length;
        var currentArray = hashArray;

        /*這個地方改變擴容大小的規則*/
        hashArray = new DictionaryKeyValuePair<TK, TV>[hashArray.Length / 2];

        for (var i = 0; i < orghashArray; i++)
        {
            var current = currentArray[i];

            /*舊數組中存在元素,添加到新數組中,Add方法會對Count++,所以加入後要Count--*/
            if (current != null)
            {
                Add(current.Key, current.Value);
                Count--;
            }
        }

        currentArray = null;
    }
}

最終程式碼

線性探測:最終程式碼
/// <summary>
/// 使用線性探測法實現哈希表
/// </summary>
/// <typeparam name="TK"></typeparam>
/// <typeparam name="TV"></typeparam>
public class OpenAddressDictionary<TK, TV> : IDictionary<TK, TV>
{
    //創建一個數組,用來存儲元素
    private DictionaryKeyValuePair<TK, TV>[] hashArray;
    //記錄已插入元素的數量
    public int Count { get; private set; }

    public TV this[TK key]
    {
        get => GetValue(key);
        set => SetValue(key, value);
    }

    public OpenAddressDictionary(int capacity)
    {
        if (capacity < 0)
            throw new ArgumentOutOfRangeException("初始值容量不能小於0");
        hashArray = new DictionaryKeyValuePair<TK, TV>[capacity];
    }
    /// <summary>
    /// 清除最簡單
    /// </summary>
    public void Clear()
    {
        if (Count > 0)
            Array.Clear(hashArray, 0, hashArray.Length);
    }

    /// <summary>
    /// 查找,按照上文線性探測查找步驟
    /// </summary>
    /// <param name="key"></param>
    /// <returns></returns>
    public bool ContainsKey(TK key)
    {
        //1.給定待查找的關鍵字key,獲取原始應該插入的下標index
        var hashCode = GetHash(key);
        var index = hashCode % hashArray.Length;

        //2.如果原始下標index處,元素為空,則所查找的元素不存在
        if (hashArray[index] == null) return false;

        var current = hashArray[index];//當前元素

        /*這個點用來判斷是否走了一整圈*/
        var hitKey = current.Key;

        //4.否則重複下述解決衝突的過程           
        while (current != null)
        {
            //3.如果index處的元素等於key,則查找成功
            if (current.Key.Equals(key)) return true;

            /*這個地方來修改獲取下一個元素位置*/
            index++;

            /*到尾了,但是沒有走完一圈*/
            if (index == hashArray.Length)
                index = 0;

            current = hashArray[index];

            /*走完一圈了,沒找到*/
            if (current != null && current.Key.Equals(hitKey)) break;
        }

        return false;
    }

    /// <summary>
    /// 添加元素
    /// </summary>
    /// <param name="key"></param>
    /// <param name="value"></param>
    /// <exception cref="Exception"></exception>
    public void Add(TK key, TV value)
    {
        Grow();
        //1.獲取原始插入位置
        var hashCode = GetHash(key);
        var index = hashCode % hashArray.Length;

        //2.此位置為空,直接插入
        if (hashArray[index] == null)
        {
            hashArray[index] = new DictionaryKeyValuePair<TK, TV>(key, value);
        }
        //3.坑被佔了,去看看下一個
        else
        {
            var current = hashArray[index];
            /*這個點用來判斷是否走了一整圈*/
            var hitKey = current.Key;
            while (current != null)
            {
                if (current.Key.Equals(key)) throw new Exception("重複key");

                /*這個地方來修改獲取下一個元素位置*/
                index++;

                /*到尾了,但是沒有走完一圈*/
                if (index == hashArray.Length)
                    index = 0;

                current = hashArray[index];

                /*走完一圈了,沒找到空位*/
                if (current != null && current.Key.Equals(hitKey)) throw new Exception("容器滿了");
            }
            hashArray[index] = new DictionaryKeyValuePair<TK, TV>(key, value);
        }
        Count++;
    }
    /// <summary>
    /// 刪除元素key
    /// </summary>
    /// <param name="key"></param>
    /// <exception cref="Exception"></exception>
    public void Remove(TK key)
    {
        //1.獲取原始插入位置
        var hashCode = GetHash(key);
        var curIndex = hashCode % hashArray.Length;

        //2.此位置為空,無法刪除
        if (hashArray[curIndex] == null) throw new Exception("未找到元素key");

        var current = hashArray[curIndex];

        /*這個點用來判斷是否走了一整圈*/
        var hitKey = current.Key;

        #region 找到待刪除元素
        DictionaryKeyValuePair<TK, TV> target = null;

        while (current != null)
        {
            if (current.Key.Equals(key))
            {
                target = current;
                break;
            }

            /*這個地方來修改獲取下一個元素位置*/
            curIndex++;

            /*到尾了,但是沒有走完一圈*/
            if (curIndex == hashArray.Length)
                curIndex = 0;

            current = hashArray[curIndex];

            /*走完一圈了,沒找到空位*/
            if (current != null && current.Key.Equals(hitKey)) throw new Exception("No such item for given key");
        }

        if (target == null)
        {
            throw new Exception("未找到元素key");
        }
        #endregion  


        //刪除,將當前位置置空
        hashArray[curIndex] = null;

        #region 之前講過刪除,造成元素丟失,所以在此處處理

        curIndex++;

        /*到尾了,但是沒有走完一圈*/
        if (curIndex == hashArray.Length)
            curIndex = 0;

        current = hashArray[curIndex];

        //直到下一個為空的點,到空說明後邊的還沒有被線性探測插入污染
        while (current != null)
        {
            //先刪除
            hashArray[curIndex] = null;

            //重新插入
            Add(current.Key, current.Value);
            Count--;

            curIndex++;

            /*到尾了,但是沒有走完一圈*/
            if (curIndex == hashArray.Length)
                curIndex = 0;

            current = hashArray[curIndex];
        }
        #endregion


        Count--;

        Shrink();
    }

    /// <summary>
    /// 擴容
    /// </summary>
    private void Grow()
    {
        /*這個地方判斷使用多少擴容*/
        if (hashArray.Length * 0.7 <= Count)
        {
            var orghashArray = hashArray.Length;
            var currentArray = hashArray;

            /*這個地方改變擴容大小的規則*/
            hashArray = new DictionaryKeyValuePair<TK, TV>[hashArray.Length * 2];
            for (var i = 0; i < orghashArray; i++)
            {
                var current = currentArray[i];

                /*舊數組中存在元素,添加到新數組中,Add方法會對Count++,所以加入後要Count--*/
                if (current != null)
                {
                    Add(current.Key, current.Value);
                    Count--;
                }
            }
            currentArray = null;
        }
    }
    /// <summary>
    /// 減容
    /// </summary>
    private void Shrink()
    {
        /*這個地方判斷元素在什麼程度算少*/
        if (Count <= hashArray.Length * 0.3 && hashArray.Length / 2 > 0)
        {
            var orghashArray = hashArray.Length;
            var currentArray = hashArray;

            /*這個地方改變擴容大小的規則*/
            hashArray = new DictionaryKeyValuePair<TK, TV>[hashArray.Length / 2];

            for (var i = 0; i < orghashArray; i++)
            {
                var current = currentArray[i];

                /*舊數組中存在元素,添加到新數組中,Add方法會對Count++,所以加入後要Count--*/
                if (current != null)
                {
                    Add(current.Key, current.Value);
                    Count--;
                }
            }

            currentArray = null;
        }
    }

    private void SetValue(TK key, TV value)
    {
        var index = GetHash(key) % hashArray.Length;

        if (hashArray[index] == null)
        {
            Add(key, value);
        }
        else
        {
            var current = hashArray[index];
            var hitKey = current.Key;

            while (current != null)
            {
                if (current.Key.Equals(key))
                {
                    Remove(key);
                    Add(key, value);
                    return;
                }

                index++;

                //wrap around
                if (index == hashArray.Length)
                    index = 0;

                current = hashArray[index];

                //reached original hit again
                if (current != null && current.Key.Equals(hitKey)) throw new Exception("Item not found");
            }
        }

        throw new Exception("Item not found");
    }

    private TV GetValue(TK key)
    {
        var index = GetHash(key) % hashArray.Length;

        if (hashArray[index] == null) throw new Exception("Item not found");

        var current = hashArray[index];
        var hitKey = current.Key;

        while (current != null)
        {
            if (current.Key.Equals(key)) return current.Value;

            index++;

            //wrap around
            if (index == hashArray.Length)
                index = 0;

            current = hashArray[index];

            //reached original hit again
            if (current != null && current.Key.Equals(hitKey)) throw new Exception("Item not found");
        }

        throw new Exception("Item not found");
    }

    private int GetHash(TK key)
    {
        return Math.Abs(key.GetHashCode());
    }
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
    //迭代器就不寫了,想了解看我部落格容器欄目
    public IEnumerator<KeyValuePair<TK, TV>> GetEnumerator()
    {
        throw new System.NotImplementedException();
    }
}
internal class DictionaryKeyValuePair<TK, TV>
{
    internal TK Key;
    internal TV Value;

    internal DictionaryKeyValuePair(TK key, TV value)
    {
        Key = key;
        Value = value;
    }
}

.NET實現拉鏈法

實現過程

回想一下,上邊的拉鏈法,每個下標位置放置的是一個鏈條,所以我們先實現一個雙向鏈表

1. 實現一個雙向鏈表

拉鏈法:構建雙向鏈表
internal class DLinkedNode<T>
{
    public T Data;
    public DLinkedNode<T> Next;
    public DLinkedNode<T> Previous;

    public DLinkedNode(T data)
    {
        Data = data;
    }
}

2. 創建一個拉鏈法實體類

拉鏈法:實現類
/// <summary>
/// 拉鏈法:實現類
/// </summary>
/// <typeparam name="TK"></typeparam>
/// <typeparam name="TV"></typeparam>
internal class SeparateChainingDictionary<TK, TV>:IDictionary<TK, TV>
{
    //構建一個數組,數組每個節點都是鏈表
    private DLinkedNode<KeyValuePair<TK, TV>>[] hashArray;
    //已使用數組下標個數
    private int filledBuckets;

    public SeparateChainingDictionary(int capacity) {
        if (capacity < 0)
            throw new ArgumentOutOfRangeException("初始值容量不能小於0");
        hashArray = new DLinkedNode<KeyValuePair<TK, TV>>[capacity];
    }
    public TV this[TK key] { 
        get => throw new NotImplementedException(); 
        set => throw new NotImplementedException(); 
    }

    public int Count => throw new NotImplementedException();

    public void Add(TK key, TV value)
    {
        throw new NotImplementedException();
    }

    public void Clear()
    {
        throw new NotImplementedException();
    }

    public bool ContainsKey(TK key)
    {
        throw new NotImplementedException();
    }

    public void Remove(TK key)
    {
        throw new NotImplementedException();
    }

    public IEnumerator<KeyValuePair<TK, TV>> GetEnumerator()
    {
        throw new NotImplementedException();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        throw new NotImplementedException();
    }
}

3. 拉鏈法:查找

拉鏈法:查找
/// <summary>
/// 查找
/// </summary>
/// <param name="key"></param>
/// <returns></returns>
public bool ContainsKey(TK key)
{
    /*1.獲取原始下標*/
    var index = Math.Abs(key.GetHashCode()) % hashArray.Length;

    /*2.為空即無*/
    if (hashArray[index] == null) return false;

    var current = hashArray[index];

    /*3.遍歷鏈表*/
    while (current != null)
    {               
        if (current.Data.Key.Equals(key)) return true;

        current = current.Next;
    }

    return false;
}

4. 拉鏈法:添加

拉鏈法:添加
/// <summary>
/// 添加
/// </summary>
/// <param name="key"></param>
/// <param name="value"></param>
/// <exception cref="Exception"></exception>
public void Add(TK key, TV value)
{
    Grow();

    var index = Math.Abs(key.GetHashCode()) % hashArray.Length;

    if (hashArray[index] == null)
    {
        hashArray[index] = new DLinkedNode<KeyValuePair<TK, TV>>(new KeyValuePair<TK, TV>(key, value));
        filledBuckets++;
    }
    else
    {
        var current = hashArray[index];

        while (current != null && current.Next != null)
        {
            /*此處可以判斷是重複修改,還是拋異常*/
            if (current.Data.Key.Equals(key)) throw new Exception("重複key");

            current = current.Next;
        }
        if (current.Data.Key.Equals(key)) throw new Exception("重複key");
        current.Next = new DLinkedNode<KeyValuePair<TK, TV>>(new KeyValuePair<TK, TV>(key, value));
    }

    Count++;
}

/// <summary>
/// 擴容
/// </summary>
private void Grow()
{
    if (filledBuckets >= hashArray.Length * 0.7)
    {
        filledBuckets = 0;

        var newBucketSize = hashArray.Length * 2;

        var biggerArray = new DLinkedNode<KeyValuePair<TK, TV>>[newBucketSize];

        for (var i = 0; i < hashArray.Length; i++)
        {
            var item = hashArray[i];

            if (item != null)
            {
                var current = item;

                while (current != null)
                {
                    var next = current.Next;

                    var newIndex = Math.Abs(current.Data.Key.GetHashCode()) % newBucketSize;

                    if (biggerArray[newIndex] == null)
                    {
                        filledBuckets++;
                        biggerArray[newIndex] = current;
                    }

                    var bItem = biggerArray[newIndex];
                    while(bItem.Next != null)
                        bItem = bItem.Next;
                    bItem.Next = current;

                    current = next;
                }
            }
        }

        hashArray = biggerArray;
    }
}

5. 拉鏈法:刪除

拉鏈法:刪除
public void Remove(TK key)
{
    var index = Math.Abs(key.GetHashCode()) % hashArray.Length;

    if (hashArray[index] == null) throw new Exception("未找到key");

    var current = hashArray[index];

    /*查找待刪除元素*/
    DLinkedNode<KeyValuePair<TK, TV>> item = null;
    while (current != null)
    {
        if (current.Data.Key.Equals(key))
        {
            item = current;
            break;
        }
        current = current.Next;
    }

    if (item == null)
    {
        throw new Exception("未找到key");
    }

    /*刪除*/
    if (item.Next == null)
        item = null;
    else
    {
        item.Previous = item.Next; 
        item.Next.Previous =item.Previous ;
        item = null;
    }


    if (hashArray[index] == null)
    {
        filledBuckets--;
    }

    Count--;

    Shrink();
}
private void Shrink()
{
    /*是否減容*/
    if (Math.Abs(filledBuckets - hashArray.Length * 0.3) < 0.1 && hashArray.Length / 2 > 0)
    {
        filledBuckets = 0;
        var newBucketSize = hashArray.Length / 2;

        var smallerArray = new DLinkedNode<KeyValuePair<TK, TV>>[newBucketSize];

        for (var i = 0; i < hashArray.Length; i++)
        {
            var item = hashArray[i];

            if (item != null)
            {
                var current = item;

                /*找到新的存儲點*/
                while (current != null)
                {
                    var next = current.Next;

                    var newIndex = Math.Abs(current.Data.Key.GetHashCode()) % newBucketSize;

                    if (smallerArray[newIndex] == null)
                    {
                        filledBuckets++;
                        smallerArray[newIndex] = current;
                    }

                    var newItem = smallerArray[newIndex];
                    while(newItem.Next != null)
                        newItem = newItem.Next;
                    newItem.Next = current;

                    current = next;
                }
            }
        }

        hashArray = smallerArray;
    }
}

最終程式碼

拉鏈法:最終程式碼

internal class DLinkedNode<T>
{
  public T Data;
  public DLinkedNode<T> Next;
  public DLinkedNode<T> Previous;

  public DLinkedNode(T data)
  {
      Data = data;
  }
}
internal class SeparateChainingDictionary<TK, TV> : IDictionary<TK, TV>
{
//構建一個數組,數組每個節點都是鏈表
private DLinkedNode<KeyValuePair<TK, TV>>[] hashArray;
//已使用數組下標個數
private int filledBuckets;

public SeparateChainingDictionary(int capacity)
{
    if (capacity < 0)
        throw new ArgumentOutOfRangeException("初始值容量不能小於0");
    hashArray = new DLinkedNode<KeyValuePair<TK, TV>>[capacity];
}
public TV this[TK key]
{
    get => throw new NotImplementedException();
    set => throw new NotImplementedException();
}

public int Count { get; private set; }

/// <summary>
/// 添加
/// </summary>
/// <param name="key"></param>
/// <param name="value"></param>
/// <exception cref="Exception"></exception>
public void Add(TK key, TV value)
{
    Grow();

    var index = Math.Abs(key.GetHashCode()) % hashArray.Length;

    if (hashArray[index] == null)
    {
        hashArray[index] = new DLinkedNode<KeyValuePair<TK, TV>>(new KeyValuePair<TK, TV>(key, value));
        filledBuckets++;
    }
    else
    {
        var current = hashArray[index];

        while (current != null && current.Next != null)
        {
            /*此處可以判斷是重複修改,還是拋異常*/
            if (current.Data.Key.Equals(key)) throw new Exception("重複key");

            current = current.Next;
        }
        if (current.Data.Key.Equals(key)) throw new Exception("重複key");
        current.Next = new DLinkedNode<KeyValuePair<TK, TV>>(new KeyValuePair<TK, TV>(key, value));
    }

    Count++;
}

/// <summary>
/// 擴容
/// </summary>
private void Grow()
{
    if (filledBuckets >= hashArray.Length * 0.7)
    {
        filledBuckets = 0;

        var newBucketSize = hashArray.Length * 2;

        var biggerArray = new DLinkedNode<KeyValuePair<TK, TV>>[newBucketSize];

        for (var i = 0; i < hashArray.Length; i++)
        {
            var item = hashArray[i];

            if (item != null)
            {
                var current = item;

                while (current != null)
                {
                    var next = current.Next;

                    var newIndex = Math.Abs(current.Data.Key.GetHashCode()) % newBucketSize;

                    if (biggerArray[newIndex] == null)
                    {
                        filledBuckets++;
                        biggerArray[newIndex] = current;
                    }

                    var bItem = biggerArray[newIndex];
                    while(bItem.Next != null)
                        bItem = bItem.Next;
                    bItem.Next = current;

                    current = next;
                }
            }
        }

        hashArray = biggerArray;
    }
}


public void Clear()
{
    throw new NotImplementedException();
}

/// <summary>
/// 查找
/// </summary>
/// <param name="key"></param>
/// <returns></returns>
public bool ContainsKey(TK key)
{
    /*1.獲取原始下標*/
    var index = Math.Abs(key.GetHashCode()) % hashArray.Length;

    /*2.為空即無*/
    if (hashArray[index] == null) return false;

    var current = hashArray[index];

    /*3.遍歷鏈表*/
    while (current != null)
    {
        if (current.Data.Key.Equals(key)) return true;

        current = current.Next;
    }

    return false;
}

public void Remove(TK key)
{
    var index = Math.Abs(key.GetHashCode()) % hashArray.Length;

    if (hashArray[index] == null) throw new Exception("未找到key");

    var current = hashArray[index];

    /*查找待刪除元素*/
    DLinkedNode<KeyValuePair<TK, TV>> item = null;
    while (current != null)
    {
        if (current.Data.Key.Equals(key))
        {
            item = current;
            break;
        }
        current = current.Next;
    }

    if (item == null)
    {
        throw new Exception("未找到key");
    }

    /*刪除*/
    if (item.Next == null)
        item = null;
    else
    {
        item.Previous = item.Next; 
        item.Next.Previous =item.Previous ;
        item = null;
    }


    if (hashArray[index] == null)
    {
        filledBuckets--;
    }

    Count--;

    Shrink();
}
private void Shrink()
{
    /*是否減容*/
    if (Math.Abs(filledBuckets - hashArray.Length * 0.3) < 0.1 && hashArray.Length / 2 > 0)
    {
        filledBuckets = 0;
        var newBucketSize = hashArray.Length / 2;

        var smallerArray = new DLinkedNode<KeyValuePair<TK, TV>>[newBucketSize];

        for (var i = 0; i < hashArray.Length; i++)
        {
            var item = hashArray[i];

            if (item != null)
            {
                var current = item;

                /*找到新的存儲點*/
                while (current != null)
                {
                    var next = current.Next;

                    var newIndex = Math.Abs(current.Data.Key.GetHashCode()) % newBucketSize;

                    if (smallerArray[newIndex] == null)
                    {
                        filledBuckets++;
                        smallerArray[newIndex] = current;
                    }

                    var newItem = smallerArray[newIndex];
                    while(newItem.Next != null)
                        newItem = newItem.Next;
                    newItem.Next = current;

                    current = next;
                }
            }
        }

        hashArray = smallerArray;
    }
}
public IEnumerator<KeyValuePair<TK, TV>> GetEnumerator()
{
    throw new NotImplementedException();
}

IEnumerator IEnumerable.GetEnumerator()
{
    throw new NotImplementedException();
}
}

Dictionary源碼分析

模擬實現:一個Dictionary,存儲數據{1,’a’},{‘4′,’b’},{5,’c’}

1. 創建一個單鏈表,用來存儲K-V

private struct Entry
{
    public uint hashCode;
    //值為-1,表示是該鏈條最後一個節點
    //值小於-1,表示已經被刪除的自由節點
    public int next;
    public TKey key;     // Key of entry
    public TValue value; // Value of entry
}

2. 創建一個數組當桶,還有一個鏈表數組(核心就這兩個數組)

private int[]? _buckets;
private Entry[]? _entries;

3. 模擬實現插入{1,’a’},{‘4′,’b’},{5,’c’}

初始化

第一次插入{1,’a’}

第二次插入{‘4′,’b’}

第三次插入{5,’c’}

仔細看一下這三個數據的插入,及數據的變化,應該可以理解_buckets和_entries的關係

4.刪除

上邊再講哈希表,包括我們自己實現的程式碼中,刪除一個節點後,都要重新計算後邊的位置。如何解決這個問題呢?我們可以使用Entry的next,來表示是否已經刪除,小於0就表示是自由節點。

關於刪除就這樣幾個變數:

private int _freeList;//最後一個刪除的Entry下標
private int _freeCount;//當前已刪除,但是還未重新使用的節點數量
private const int StartOfFreeList = -3;//幫助尋找自由節點的一個常量

看一下StartOfFreeList和_freeList和Entry.next如何尋找自由節點

  • 刪除時:Entry[i].next=上一層中的StartOfFreeList-_freeList
  • 添加&&_freeCount>0:_freeList=StartOfFreeList – entries[_freeList].next

請看圖理解:

源碼:簡化版(debug理解)

源碼:簡化版可直接運行
public static void Main(string[] args)
{
    Dictionary<int, char> dic = new Dictionary<int, char>();
    dic.TryInsert(1, 'a');
    dic.TryInsert(4, 'b');
    dic.TryInsert(5, 'c');
    dic.Remove(4);
    dic.Remove(5);
    dic.TryInsert(0, 'd');
    dic.TryInsert(1, 'e');
}
public class Dictionary<TKey, TValue>
{
private int[]? _buckets;
private Entry[]? _entries;
private int _count;
private int _freeList;
private int _freeCount;
private int _version;
private const int StartOfFreeList = -3;
public Dictionary()
{
    /*初始值為素數,這裡就不動態了,獲取素數可以使用埃及篩選法*/
    Initialize(7);
}
private int Initialize(int capacity)
{
    int size = capacity;
    int[] buckets = new int[size];
    Entry[] entries = new Entry[size];
    _freeList = -1;
    _buckets = buckets;
    _entries = entries;
    return size;
}

public bool TryInsert(TKey key, TValue value)
{
    Entry[]? entries = _entries;
    uint hashCode = (uint)key.GetHashCode();

    uint collisionCount = 0;
    ref int bucket = ref GetBucket(hashCode);
    int i = bucket - 1; // Value in _buckets is 1-based
    if (typeof(TKey).IsValueType)
    {
        while (true)
        {
            if ((uint)i >= (uint)entries.Length)
            {
                break;
            }

            if (entries[i].hashCode == hashCode && EqualityComparer<TKey>.Default.Equals(entries[i].key, key))
            {
                entries[i].value = value;
                return true;
            }

            i = entries[i].next;

            collisionCount++;
            if (collisionCount > (uint)entries.Length)
            {
                throw new Exception("");
            }
        }
    }
    int index;
    if (_freeCount > 0)
    {
        index = _freeList;
        // Debug.Assert((StartOfFreeList - entries[_freeList].next) >= -1, "shouldn't overflow because `next` cannot underflow");
        _freeList = StartOfFreeList - entries[_freeList].next;
        _freeCount--;
    }
    else
    {
        int count = _count;
        if (count == entries.Length)
        {
            //Resize();
            bucket = ref GetBucket(hashCode);
        }
        index = count;
        _count = count + 1;
        entries = _entries;
    }

    ref Entry entry = ref entries![index];
    entry.hashCode = hashCode;
    entry.next = bucket - 1; // Value in _buckets is 1-based
    entry.key = key;
    entry.value = value; // Value in _buckets is 1-based
    bucket = index + 1;
    _version++;
    return true;
}
public bool Remove(TKey key)
{
    if (key == null) return false;

    if (_buckets != null)
    {
        uint collisionCount = 0;
        uint hashCode = (uint)key.GetHashCode();
        ref int bucket = ref GetBucket(hashCode);
        Entry[]? entries = _entries;
        int last = -1;
        int i = bucket - 1; // Value in buckets is 1-based
        while (i >= 0)
        {
            ref Entry entry = ref entries[i];

            if (entry.hashCode == hashCode && EqualityComparer<TKey>.Default.Equals(entry.key, key))
            {
                if (last < 0)
                {
                    bucket = entry.next + 1; 
                }
                else
                {
                    entries[last].next = entry.next;
                }

                entry.next = StartOfFreeList - _freeList;

                entry.key = default!;

                entry.value = default!;

                _freeList = i;
                _freeCount++;
                return true;
            }
            last = i;
            i = entry.next;
            collisionCount++;
            if (collisionCount > (uint)entries.Length)
            {

            }
        }
    }
    return false;
}
private ref int GetBucket(uint hashCode)
{
    int[] buckets = _buckets!;
    return ref buckets[hashCode % (uint)buckets.Length];
}
private struct Entry
{
    public uint hashCode;
    //值為-1,表示是該鏈條最後一個節點
    public int next;
    public TKey key;     // Key of entry
    public TValue value; // Value of entry
}