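Go Maps Explained: How Key-Value Pairs Are Actually Stored

This is an excerpt of the post; the full post is available here: https://victoriametrics.com/blog/go-map/

If you're new to Go, it can be a bit confusing to figure out how to use maps in Go. And even when you're more experienced, understanding how maps really work can be pretty tough.

Take this example: have you ever set a "hint" for a map and wondered why it's called a "hint" and not something simple like length, as we do with slices?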

// hint = 10
m := make(map[string]int, 10)
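
To make the "hint" idea concrete, here is a minimal sketch. The helper name fillMap is made up for illustration. It only shows the observable behavior: the hint creates no entries and puts no cap on how many you can add later.

```go
package main

import "fmt"

// fillMap (hypothetical helper) creates a map with the given size hint and
// then inserts n entries. The hint only suggests how much space to prepare;
// it neither creates entries nor limits how many can be added.
func fillMap(hint, n int) map[int]int {
	m := make(map[int]int, hint)
	for i := 0; i < n; i++ {
		m[i] = i * i
	}
	return m
}

func main() {
	empty := fillMap(10, 0)
	fmt.Println(len(empty)) // 0: the hint does not create any entries

	big := fillMap(10, 25)
	fmt.Println(len(big)) // 25: the map grows past the hint without trouble
}
```

Compare this with make([]int, 10), which gives you a slice whose length really is 10.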

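Or maybe you've noticed that when you use a for-range loop on a map, the order doesn't match the insertion order, and it even changes if you loop over the same map at different times. But weirdly enough, if you loop over it at the exact same time, the order usually stays the same.

This is a long story, so fasten your seat belt and dive in.

Before we move on, just a heads up: the info here is based on Go 1.23. If things have changed and this isn't up to date, feel free to ping me on X (@func25).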

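Map in Go: Quick Start

So, let's talk about maps in Go. A map is a built-in type that acts as a key-value store. Unlike arrays, where you're stuck with increasing indices like 0, 1, 2, and so on as keys, with maps the key can be any comparable type.

That gives you a lot more flexibility.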

m := make(map[string]int)
m["a"] = 1
m["b"] = 2

fmt.Println(m) // map[a:1 b:2]

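In that example, we created an empty map using make(), where the keys are strings and the values are ints.

[Figure: Map["a": 1, "b": 2]]

Now, instead of manually assigning each key, you can save yourself some time by using a map literal. This lets you set up your key-value pairs all at once, right when you create the map: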

m := map[string]int{
    "a": 1,
    "b": 2,
}

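All you do is list the keys and their values inside curly braces when you first create the map. Simple as that.

And if you realize later that you don't need a certain key-value pair anymore, Go's got you covered. There's a handy delete function that removes the key you don't want: delete(m, "a").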
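
Here is a small sketch of delete in action. The helper name removeKeys is made up for illustration. One detail worth seeing once: deleting a key that isn't in the map is simply a no-op.

```go
package main

import "fmt"

// removeKeys (hypothetical helper) deletes each listed key from m and
// returns the resulting length.
func removeKeys(m map[string]int, keys ...string) int {
	for _, k := range keys {
		delete(m, k) // does nothing if k is not present
	}
	return len(m)
}

func main() {
	m := map[string]int{"a": 1, "b": 2}

	fmt.Println(removeKeys(m, "a"))       // 1
	fmt.Println(removeKeys(m, "missing")) // 1
	fmt.Println(m)                        // map[b:2]
}
```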

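The zero value of a map is nil, and a nil map is kind of like an empty map in some ways. You can try to look up keys in it, and Go won't freak out and crash your program.

If you search for a key that isn't there, Go just quietly gives you the "zero value" for that map's value type: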

var m map[string]int

println(m["a"]) // 0
m["a"] = 1      // panic: assignment to entry in nil map

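But here's the thing: you can't add new key-value pairs to a nil map.

In fact, Go handles maps in a way that's pretty similar to how it deals with slices. Both maps and slices start off as nil, and Go doesn't panic if you do something "harmless" with them while they're nil. For example, you can loop over a nil slice without any "drama".

So, what happens if you try to loop over a nil map?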

var m map[string]int

for k, v := range m {
    println(k, v)
}

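Nothing happens: no errors, no surprises. It just quietly does nothing.

Go's approach is to treat the default value of any type as something useful, not something that causes your program to blow up. The only time Go throws a fit is when you do something that's truly illegal, like trying to add a new key-value pair to a nil map or accessing an out-of-bounds index in a slice.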
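
The "harmless vs. illegal" split can be sketched as below. The function names safeOnNil and writeToNilMap are made up for illustration, and recover() is used only so the demo can report the panic instead of crashing.

```go
package main

import "fmt"

// safeOnNil (hypothetical helper) runs operations that are fine on a nil map
// and a nil slice, and reports what they produce.
func safeOnNil() (mapLen, value int, found bool, sliceLen int) {
	var m map[string]int
	var s []int

	value, found = m["a"] // reading: zero value and false
	delete(m, "a")        // deleting: no-op
	s = append(s, 1)      // appending to a nil slice allocates a new one

	return len(m), value, found, len(s)
}

// writeToNilMap (hypothetical helper) performs the one map operation that is
// not allowed on nil and reports whether it panicked.
func writeToNilMap() (panicked bool) {
	defer func() { panicked = recover() != nil }()

	var m map[string]int
	m["a"] = 1 // panic: assignment to entry in nil map
	return false
}

func main() {
	mapLen, value, found, sliceLen := safeOnNil()
	fmt.Println(mapLen, value, found, sliceLen) // 0 0 false 1
	fmt.Println(writeToNilMap())                // true
}
```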

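There are a few more things you should know about maps in Go:

A for-range loop over a map won't return the keys in any specific order.
Maps aren't thread-safe. The Go runtime raises a fatal error if you try to read (or iterate with a for-range) and write to the same map simultaneously.
You can check whether a key is in a map with a simple ok check: _, ok := m[key].
The key type of a map must be comparable.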
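
The ok check from the list above matters most when the zero value is also a legitimate stored value. A minimal sketch, with a made-up helper name lookup:

```go
package main

import "fmt"

// lookup (hypothetical helper) returns the stored value and whether the key
// was actually present in the map.
func lookup(m map[string]int, key string) (int, bool) {
	v, ok := m[key]
	return v, ok
}

func main() {
	m := map[string]int{"zero": 0}

	v, ok := lookup(m, "zero")
	fmt.Println(v, ok) // 0 true: the key exists and its value happens to be 0

	v, ok = lookup(m, "missing")
	fmt.Println(v, ok) // 0 false: the key does not exist
}
```

Without ok, both lookups would look identical.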

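Let's dive into that last point about map keys. I mentioned earlier that "the key can be any comparable type," but there's a bit more to it than that.

"So, what exactly is a comparable type? And what isn't?"

It's pretty simple: if you can use == to compare two values of the same type, then that type is considered comparable.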

func main() {
    var s map[int]string

    if s == s {
        println("comparable")
    }
}

// compile error: invalid operation: s == s (map can only be compared to nil)

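But as you can see, the code above doesn't even compile. The compiler complains: "invalid operation: s == s (map can only be compared to nil)."

The same rule applies to other non-comparable types such as slices, functions, and structs that contain slices or maps. So, if you're trying to use any of those types as keys in a map, you're out of luck.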

func main() {
  var s map[[]int]string
}

// compile error: invalid map key type []int
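
On the flip side, structs and arrays made only of comparable parts work fine as keys. A short sketch, with made-up names (point, visits, grid):

```go
package main

import "fmt"

// point has only comparable fields, so == works on it and it can be a map key.
type point struct {
	x, y int
}

// visits (hypothetical helper) counts how often each point is seen.
func visits() map[point]int {
	m := map[point]int{}
	m[point{1, 2}]++
	m[point{1, 2}]++ // same field values, so this hits the same key
	m[point{3, 4}]++
	return m
}

// grid (hypothetical helper) uses a fixed-size array as the key type;
// arrays of comparable elements are comparable too.
func grid() map[[2]int]string {
	return map[[2]int]string{
		{0, 0}: "origin",
		{0, 1}: "up",
	}
}

func main() {
	fmt.Println(visits()[point{1, 2}]) // 2
	fmt.Println(grid()[[2]int{0, 0}])  // origin
}
```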

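But here's a little secret: interfaces can be both comparable and non-comparable.

What does that mean? You can absolutely define a map with an empty interface as the key without any compile errors. But watch out, there's a good chance you'll run into a runtime error.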

func main() {
    m := map[interface{}]int{
        1: 1,
        "a": 2,
    }

    m[[]int{1, 2, 3}] = 3
    m[func() {}] = 4
}

// panic: runtime error: hash of unhashable type []int
// panic: runtime error: hash of unhashable type func()

Everything looks fine until you try to assign an uncomparable type as a map key.

That's when you'll hit a runtime error, which is trickier to deal with than a compile-time error. Because of this, it's usually a good idea to avoid using interface{} as a map key unless you have a solid reason and constraints that prevent misuse.
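
If you do end up with interface{} keys, one possible guard is to turn the panic into an error. The sketch below is only an illustration of that idea, not a recommended pattern from the original post; the helper name trySet is made up.

```go
package main

import "fmt"

// trySet (hypothetical helper) stores k/v in m and converts the
// "hash of unhashable type" panic into an ordinary error.
func trySet(m map[interface{}]int, k interface{}, v int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("cannot use key of type %T: %v", k, r)
		}
	}()
	m[k] = v
	return nil
}

func main() {
	m := map[interface{}]int{}

	fmt.Println(trySet(m, "a", 1))              // <nil>
	fmt.Println(trySet(m, []int{1, 2, 3}, 2) != nil) // true: slices cannot be hashed
	fmt.Println(len(m))                         // 1
}
```

A stricter key type, such as string or a small struct, avoids the problem at compile time and is usually the better choice.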

But that error message: "hash of unhashable type []int" might seem a bit cryptic. What's this about a hash? Well, that's our cue to dig into how Go handles things under the hood.

Map Anatomy

When explaining the internals of something like a map, it's easy to get bogged down in the nitty-gritty details of the Go source code. But we're going to keep it light and simple so even those new to Go can follow along.

What you see as a single map in your Go code is actually an abstraction that hides the complex details of how the data is organized. In reality, a Go map is composed of many smaller units called "buckets."

type hmap struct {
  ...
  buckets unsafe.Pointer
  ...
}

As the Go source code above shows, a map contains a pointer to the bucket array.

This is why when you assign a map to a variable or pass it to a function, both the variable and the function's argument are sharing the same map pointer.

func changeMap(m2 map[string]int) {
  m2["hello"] = 2
}

func main() {
  m1 := map[string]int{"hello": 1}
  changeMap(m1)
  println(m1["hello"]) // 2
}

But don't get it twisted: maps are pointers to the hmap under the hood, but they aren't reference types, nor are they passed by reference like a ref argument in C#. If you replace the whole map m2, the change won't be reflected in the original map m1 in the caller.

func changeMap(m2 map[string]int) {
  m2 = map[string]int{"hello": 2}
}

func main() {
  m1 := map[string]int{"hello": 1}
  changeMap(m1)
  println(m1["hello"]) // 1
}

In Go, everything is passed by value. What's really happening is a bit different: when you pass the map m1 to the changeMap function, Go makes a copy of the *hmap pointer, not of the hmap structure itself. So m1 in main() and m2 in changeMap() are technically different pointers that point to the same hmap.
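
If a function genuinely needs to swap out the caller's map, it has to get at the caller's variable somehow. Two common options are sketched below with made-up names (replaceThroughPointer, replaced): pass a pointer to the map variable, or return the new map and let the caller assign it.

```go
package main

import "fmt"

// replaceThroughPointer (hypothetical helper) receives the address of the
// caller's map variable, so assigning through it changes that variable.
func replaceThroughPointer(m *map[string]int) {
	*m = map[string]int{"hello": 2}
}

// replaced (hypothetical helper) returns a fresh map; the caller decides
// whether to assign it.
func replaced() map[string]int {
	return map[string]int{"hello": 3}
}

func main() {
	m1 := map[string]int{"hello": 1}

	replaceThroughPointer(&m1)
	fmt.Println(m1["hello"]) // 2

	m1 = replaced()
	fmt.Println(m1["hello"]) // 3
}
```

Returning the new map is usually the simpler of the two, since pointers to maps are rarely needed in everyday Go code.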

For more on this topic, there's a great post by Dave Cheney titled There is no pass-by-reference in Go.

Each of these buckets can only hold up to 8 key-value pairs, as you can see in the image below.

The map above has 2 buckets, and len(map) is 6.

So, when you add a key-value pair to a map, Go doesn't just drop it in there randomly or sequentially. Instead, it places the pair into one of these buckets based on the key's hash value, which is determined by hash(key, seed).

Let's see the simplest assignment scenario in the image below, when we have an empty map, and assign a key-value pair "hello": 1 to it.

It starts by hashing "hello" to a number, then it takes that number and mods it by the number of buckets.

Since we only have one bucket here, any number mod 1 is 0, so the pair goes straight into bucket 0. The same process happens when you add another key-value pair: Go tries to place it in bucket 0, and if the first slot is already taken by a different key, it moves on to the next slot in that bucket.
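
The placement steps described above can be sketched as a toy model. To be clear, this is not the Go runtime's implementation: the types slot, bucket, and toyMap are made up for illustration, hash/maphash stands in for the runtime's internal hash function, and the toy has no overflow handling or growth.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

const slotsPerBucket = 8 // a bucket holds up to 8 key-value pairs

// slot, bucket, and toyMap are hypothetical toy types, not the runtime layout.
type slot struct {
	key   string
	value int
	used  bool
}

type bucket [slotsPerBucket]slot

type toyMap struct {
	seed    maphash.Seed
	buckets []bucket
}

func newToyMap(nbuckets int) *toyMap {
	return &toyMap{seed: maphash.MakeSeed(), buckets: make([]bucket, nbuckets)}
}

// bucketIndex hashes the key with the map's seed, then takes the result
// modulo the number of buckets.
func (t *toyMap) bucketIndex(key string) int {
	return int(maphash.String(t.seed, key) % uint64(len(t.buckets)))
}

// set walks the slots of the chosen bucket in order: it fills the first free
// slot or updates the slot that already holds the key. It returns false when
// the bucket is full, because this toy has no overflow buckets or growth.
func (t *toyMap) set(key string, value int) bool {
	b := &t.buckets[t.bucketIndex(key)]
	for i := range b {
		if !b[i].used {
			b[i] = slot{key: key, value: value, used: true}
			return true
		}
		if b[i].key == key {
			b[i].value = value
			return true
		}
	}
	return false
}

func main() {
	t := newToyMap(1)
	t.set("hello", 1)
	t.set("world", 2)

	fmt.Println(t.bucketIndex("hello")) // 0: with one bucket, anything mod 1 is 0
	fmt.Println(t.buckets[0][0].key)    // hello
	fmt.Println(t.buckets[0][1].key)    // world
}
```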

Take a look at hash(key, seed). When you use a for-range loop over two maps with the same keys, you might notice that the keys come out in a different order:

func main() {
    a := map[string]int{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6}
    b := map[string]int{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6}

    for i := range a {
        print(i, " ")
    }
    println()

    for i := range b {
        print(i, " ")
    }
}

// Output:
// a b c d e f 
// c d e f a b

How's that possible? Isn't the key "a" in map a and the key "a" in map b hashed the same way?

But here's the deal, while the hash function used for maps in Go is consistent across all maps with the same key type, the seed used by that hash function is different for each map instance. So, when you create a new map, Go generates a random seed just for that map.

In the example above, both a and b use the same hash function because their keys are string types, but each map has its own unique seed.

"Wait, a bucket has only 8 slots? What happens if the bucket gets full? Does it grow like a slice?"

Well, sort of. When the buckets start getting full, or even almost full, depending on the algorithm's definition of "full", the map will trigger a growth, which might double the number of main buckets.

But here's where it gets a bit more interesting.

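When I say "main buckets," I'm setting up for another concept: "overflow buckets." These come into play when you've got a situation with high collisions. Imagine you have 4 buckets, but one of them is completely filled with 8 key-value pairs due to high collisions, while the other 3 buckets are sitting empty.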