SparseArrayとHashMap

8976 ワード

以前にHashMapのソースコードを解析し,その拡張機構を理解した.知らない友达は私のもう一つの文章を見てもいいです:HashMap、HashTableとConcurrentHashMapソースコードの解析.HashMapは全体的に使いやすいですが、数が拡張条件に達すると拡張が開始され、わずかなメモリを使用するだけでメモリが大幅に浪費され、容量が大きいほど浪費が顕著になることがわかります.Androidのようなメモリに特に敏感なプラットフォームでは、このようなことをできるだけ避けるべきです.そこでGoogleは、自分に合ったapi、SparseArray、ArrayMapを発売しました.今日はSparseArrayのソースコードを簡単に分析します.
まず、公式の紹介を見てみましょう.

/**
 * SparseArrays map integers to Objects.  Unlike a normal array of Objects,
 * there can be gaps in the indices.  It is intended to be more memory efficient
 * than using a HashMap to map Integers to Objects, both because it avoids
 * auto-boxing keys and its data structure doesn't rely on an extra entry object
 * for each mapping.
 *
 * Note that this container keeps its mappings in an array data structure,
 * using a binary search to find keys.  The implementation is not intended to be appropriate for
 * data structures
 * that may contain large numbers of items.  It is generally slower than a traditional
 * HashMap, since lookups require a binary search and adds and removes require inserting
 * and deleting entries in the array.  For containers holding up to hundreds of items,
 * the performance difference is not significant, less than 50%.
 *
 * To help with performance, the container includes an optimization when removing
 * keys: instead of compacting its array immediately, it leaves the removed entry marked
 * as deleted.  The entry can then be re-used for the same key, or compacted later in
 * a single garbage collection step of all removed entries.  This garbage collection will
 * need to be performed at any time the array needs to be grown or the the map size or
 * entry values are retrieved.
 *
 * It is possible to iterate over the items in this container using
 * {@link #keyAt(int)} and {@link #valueAt(int)}. Iterating over the keys using
 * keyAt(int) with ascending values of the index will return the
 * keys in ascending order, or the values corresponding to the keys in ascending
 * order in the case of valueAt(int).

SparseArrayはHashmapメモリよりも使用効率が高く、主にkeyの自動箱詰めを許可する一方で、追加のEntryに依存する必要はありません.このコンテナは、キー値ペアが配列のデータ構造を維持し、keyを二分法で検索するので、大量のデータを格納するのに適していないことに注意してください.SparseArrayは、処理するたびに二分法の検索が必要になるため、追加削除の効率が従来のHashmapよりも遅い.データ量が100未満の場合、このパフォーマンスの差は明らかではなく、50%未満になります.パフォーマンスを向上させるために、SparseArrayはkeyを削除するときに最適化しました.彼はすぐにkeyを配列から削除するのではなく、削除状態としてマークします.その後、それを多重化するか、後で専門的に収集された容器に移動して、配列を拡張する必要があるときに使用することができます.
くだらないことは言わないで、まずデータ構造を見てみましょう.

    private int[] mKeys;
    private Object[] mValues;

注記で述べたように、SparseArrayは主に2つの配列でkeyとvalueを格納し、keyはintのみである.
データ構造を見終わったので、追加削除を見てみましょう.まず増加を見ると、2つの方法があります.1つはappendで、もう1つはputです.まずputの方法を見てみましょう.

 public void put(int key, E value) {
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
     
        if (i >= 0) {
            mValues[i] = value;
        } else {
            i = ~i;

            if (i < mSize && mValues[i] == DELETED) {
                mKeys[i] = key;
                mValues[i] = value;
                return;
            }

            if (mGarbage && mSize >= mKeys.length) {
                gc();

                // Search again because indices may have changed.
                i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
            }

            mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
            mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
            mSize++;
        }
    }

まず二分検索を行い,key配列に対応するkeyが存在するか否かを判断し,ヒットif条件が存在する場合value値を直接対応インデックス位置にコピーすることが分かる.存在しない場合は、返されるインデックスの位置を二分検索して、対応する位置のvalueがdeletedとマークされているかどうかを判断し、そうであればkeyとvalueをそれぞれ対応するインデックス位置に付与する.返されるインデックスの位置が現在のsizeよりも大きい場合は、ゴミ回収があるかどうかを判断し、ある場合はgcを行い、二分法で対応するインデックスを再検索し、最後に対応するkeyとvalueを対応するインデックスの位置に挿入します.
再看append方法:

public void append(int key, E value) {
        if (mSize != 0 && key <= mKeys[mSize - 1]) {
            put(key, value);
            return;
        }

        if (mGarbage && mSize >= mKeys.length) {
            gc();
        }

        mKeys = GrowingArrayUtils.append(mKeys, mSize, key);
        mValues = GrowingArrayUtils.append(mValues, mSize, value);
        mSize++;
    }

size>0でkeyが最後の要素のkeyより小さい場合、putメソッドが呼び出されることがわかる.そうでなければgarbageが以前に存在しsizeが配列の長さ以上である場合はgcを行い,対応するkey,valueを配列の後ろに追加する.ではgc法は何をしましたか?

private void gc() {
        // Log.e("SparseArray", "gc start with " + mSize);

        int n = mSize;
        int o = 0;
        int[] keys = mKeys;
        Object[] values = mValues;

        for (int i = 0; i < n; i++) {
            Object val = values[i];

            if (val != DELETED) {
                if (i != o) {
                    keys[o] = keys[i];
                    values[o] = val;
                    values[i] = null;
                }

                o++;
            }
        }

        mGarbage = false;
        mSize = o;

        // Log.e("SparseArray", "gc end with " + mSize);
    }

すでにDELETEと表記されている対応するインデックス位置のkey,valueを再割り当て,すなわち配列を圧縮していることがわかる.
追加を見終わったら、削除方法を見続けます.削除シリーズの方法は全部で6つあります.

 public void remove(int key) {
        delete(key);
    }

public void removeAt(int index) {
        if (mValues[index] != DELETED) {
            mValues[index] = DELETED;
            mGarbage = true;
        }
    }

  public void removeAtRange(int index, int size) {
        final int end = Math.min(mSize, index + size);
        for (int i = index; i < end; i++) {
            removeAt(i);
        }
    }

public E removeReturnOld(int key) {
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i >= 0) {
            if (mValues[i] != DELETED) {
                final E old = (E) mValues[i];
                mValues[i] = DELETED;
                mGarbage = true;
                return old;
            }
        }
        return null;
    }

public void delete(int key) {
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i >= 0) {
            if (mValues[i] != DELETED) {
                mValues[i] = DELETED;
                mGarbage = true;
            }
        }
    }

 public void clear() {
        int n = mSize;
        Object[] values = mValues;

        for (int i = 0; i < n; i++) {
            values[i] = null;
        }

        mSize = 0;
        mGarbage = false;
    }

は,実際にはほとんどが同じであり,主に対応インデックスまたはkey対応valueの値をDeleteとする.clearメソッドは、すべてのvalueの値を空にし、sizeを0に設定することです.
再参照修正:setValue Atメソッド

public void setValueAt(int index, E value) {
        if (mGarbage) {
            gc();
        }

        mValues[index] = value;
    }

は簡単で、gcが必要かどうかを判断し、valuesに対応するインデックスに値を割り当てることです.
最後にクエリー方法です. クエリー方法が多いです.

public E get(int key) {
       return get(key, null);
   }


   @SuppressWarnings("unchecked")
   public E get(int key, E valueIfKeyNotFound) {
       int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

       if (i < 0 || mValues[i] == DELETED) {
           return valueIfKeyNotFound;
       } else {
           return (E) mValues[i];
       }
   }

ジルコニウムの2つの方法は主にkeyによってvalueを取得する.

 */
    public int indexOfKey(int key) {
        if (mGarbage) {
            gc();
        }

        return ContainerHelpers.binarySearch(mKeys, mSize, key);
    }

    public int indexOfValue(E value) {
        if (mGarbage) {
            gc();
        }

        for (int i = 0; i < mSize; i++) {
            if (mValues[i] == value) {
                return i;
            }
        }

        return -1;
    }

 
    public int indexOfValueByValue(E value) {
        if (mGarbage) {
            gc();
        }

        for (int i = 0; i < mSize; i++) {
            if (value == null) {
                if (mValues[i] == null) {
                    return i;
                }
            } else {
                if (value.equals(mValues[i])) {
                    return i;
                }
            }
        }
        return -1;
    }

第1の方法はkey対応インデックスを取得することであり,後の2つの方法は対応valueのインデックスを取得することであり,両者の違いは1つはequalsで判断し,もう1つは==で判断することである.
には最後の2つの方法があります:keyAtとvalueAt

public int keyAt(int index) {
        if (mGarbage) {
            gc();
        }

        return mKeys[index];
    }
 public E valueAt(int index) {
        if (mGarbage) {
            gc();
        }

        return (E) mValues[index];
    }

は、インデックスに対応するkeyとvalueをそれぞれ返します.
全体的に見ると、SparseArrayはHashMapのデータ構造よりずっと簡単です.hashmapのnodeノードがなく、反復器も必要ありません.二分法検索は検索に有利です.欠点はkeyがintタイプのみをサポートしていることです.明らかに私たちの実際の開発では足りないので、ArrayMapもあります.ArrayMapのソースコードは分析しません.実際には、その使い方やデータの格納はSparseArrayと同様に、2つの配列で行われ、二分検索にも基づいています.異なる点は、ArrayMapのkeyは任意のタイプであってもよいことである.

カスタムアコーディオンの要素を使用してページの検索に関する考え.

もう一つの通知ライブラリ