Go Maps Explained: How Key-Value Pairs Are Actually Stored

WBOY
Release: 2024-09-10 11:01:50
Original
1004 people have browsed it

This is an excerpt of the post; the full post is available here: https://victoriametrics.com/blog/go-map/.

If you're new to Go, it can be a bit confusing to figure out how to use maps in Go. And even when you're more experienced, understanding how maps really work can be pretty tough.

Take this example: Have you ever set a 'hint' for a map and wondered why it's called a 'hint' and not something simple like length, like we do with slices?

// hint = 10
m := make(map[string]int, 10)
Copy after login

Or maybe you've noticed that when you use a for-range loop on a map, the order doesn't match the insertion order, and it even changes if you loop over the same map at different times. But weirdly enough, if you loop over it at the exact same time, the order usually stays the same.

This is a long story, so fasten your seat belt and dive in.

Before we move on, just a heads up—the info here is based on Go 1.23. If things have changed and this isn't up to date, feel free to ping me on X(@func25).

Map in Go: Quick Start

So, let's talk about maps in Go, it's a built-in type that acts as a key-value storage. Unlike arrays where you're stuck with keys as increasing indices like 0, 1, 2, and so on, with maps, the key can be any comparable type.

That gives you a lot more flexibility.

m := make(map[string]int)
m["a"] = 1
m["b"] = 2

m // map[a:1 b:2]
Copy after login

In that example, we created an empty map using make(), where the keys are strings and the values are ints.

Go Maps Explained: How Key-Value Pairs Are Actually Stored

Map["a": 1, "b": 2]

Now, instead of manually assigning each key, you can save yourself some time by using a map literal. This lets you set up your key-value pairs all at once, right when you create the map:

m := map[string]int{
    "a": 1,
    "b": 2,
}
Copy after login

All you do is list out the keys and their values inside curly braces when you first create the map, simple as that.

And if you realize later that you don't need a certain key-value pair anymore, Go's got you covered. There's a handy delete function that, well, deletes the key you don't want: delete(m, "a").

The zero value of a map is nil, and a nil map is kind of like an empty map in some ways. You can try to look up keys in it, and Go won't freak out and crash your program.

If you search for a key that isn't there, Go just quietly gives you the "zero value" for that map's value type:

var m map[string]int

println(m["a"]) // 0
m["a"] = 1      // panic: assignment to entry in nil map
Copy after login

But here's the thing: you can't add new key-value pairs to a nil map.

In fact, Go handles maps in a way that's pretty similar to how it deals with slices. Both maps and slices start off as nil, and Go doesn't panic if you do something “harmless” with them while they're nil. For example, you can loop over a nil slice without any "drama".

So, what happens if you try to loop over a nil map?

var m map[string]int

for k, v := range m {
    println(k, v)
} 
Copy after login

Nothing happens, no errors, no surprises. It just quietly does nothing.

Go's approach is to treat the default value of any type as something useful, not something that causes your program to blow up. The only time Go throws a fit is when you do something that's truly illegal, like trying to add a new key-value pair to a nil map or accessing an out-of-bounds index in a slice.

There are a couple more things you should know about maps in Go:

  • A for-range loop over a map won't return the keys in any specific order.
  • Maps aren't thread-safe, the Go runtime will cause a fatal error if you try to read (or iterate with a for-range) and write to the same map simultaneously.
  • You can check if a key is in a map by doing a simple ok check: _, ok := m[key].
  • The key type for a map must be comparable.

Let's dive into that last point about map keys. I mentioned earlier that "the key could be any comparable type," but there's a bit more to it than just that.

"So, what exactly is a comparable type? And what isn't?"

It's pretty simple: if you can use == to compare two values of the same type, then that type is considered comparable.

func main() {
    var s map[int]string

    if s == s {
        println("comparable")
    }
}

// compile error: invalid operation: s == s (map can only be compared to nil)
Copy after login

But as you can see, the code above doesn't even compile. The compiler complains: "invalid operation: s == s (map can only be compared to nil)."

This same rule applies to other non-comparable types like slices, functions, or structs that contain slices or maps, etc. So, if you're trying to use any of those types as keys in a map, you're out of luck.

func main() {
  var s map[[]int]string
}

// compile error: invalid map key type []intcompilerIncomparableMapKey
Copy after login

But here's a little secret, interfaces can be both comparable and non-comparable.

What does that mean? You can absolutely define a map with an empty interface as the key without any compile errors. But watch out, there's a good chance you'll run into a runtime error.

func main() {
    m := map[interface{}]int{
        1: 1,
        "a": 2,
    }

    m[[]int{1, 2, 3}] = 3
    m[func() {}] = 4
}

// panic: runtime error: hash of unhashable type []int
// panic: runtime error: hash of unhashable type func()
Copy after login

Everything looks fine until you try to assign an uncomparable type as a map key.

That's when you'll hit a runtime error, which is trickier to deal with than a compile-time error. Because of this, it's usually a good idea to avoid using interface{} as a map key unless you have a solid reason and constraints that prevent misuse.

But that error message: "hash of unhashable type []int" might seem a bit cryptic. What's this about a hash? Well, that's our cue to dig into how Go handles things under the hood.

Map Anatomy

When explaining the internals of something like a map, it's easy to get bogged down in the nitty-gritty details of the Go source code. But we're going to keep it light and simple so even those new to Go can follow along.

What you see as a single map in your Go code is actually an abstraction that hides the complex details of how the data is organized. In reality, a Go map is composed of many smaller units called "buckets."

type hmap struct {
  ...
  buckets unsafe.Pointer
  ...
}
Copy after login

Look at Go source code above, a map contains a pointer that points to the bucket array.

This is why when you assign a map to a variable or pass it to a function, both the variable and the function's argument are sharing the same map pointer.

func changeMap(m2 map[string]int) {
  m2["hello"] = 2
}

func main() {
  m1 := map[string]int{"hello": 1}
  changeMap(m1)
  println(m1["hello"]) // 2
}
Copy after login

But don't get it twisted, maps are pointers to the hmap under the hood, but they aren't reference types, nor are they passed by reference like a ref argument in C#, if you change the whole map m2, it won't reflect on the original map m1 in the caller.

func changeMap(m2 map[string]int) {
  m2 = map[string]int{"hello": 2}
}

func main() {
  m1 := map[string]int{"hello": 1}
  changeMap(m1)
  println(m1["hello"]) // 1
}
Copy after login

In Go, everything is passed by value. What's really happening is a bit different: when you pass the map m1 to the changeMap function, Go makes a copy of the *hmap structure. So, m1 in the main() and m2 in the changeMap() function are technically different pointers point to the same hmap.

Go Maps Explained: How Key-Value Pairs Are Actually Stored

Map is passed by value

For more on this topic, there's a great post by Dave Cheney titled There is no pass-by-reference in Go.

Each of these buckets can only hold up to 8 key-value pairs, as you can see in the image below.

Go Maps Explained: How Key-Value Pairs Are Actually Stored

Buckets of a map

The map above has 2 buckets, and len(map) is 6.

So, when you add a key-value pair to a map, Go doesn't just drop it in there randomly or sequentially. Instead, it places the pair into one of these buckets based on the key's hash value, which is determined by hash(key, seed).

Let's see the simplest assignment scenario in the image below, when we have an empty map, and assign a key-value pair "hello": 1 to it.

Go Maps Explained: How Key-Value Pairs Are Actually Stored

Assign a key-value pair to an empty map

It starts by hashing "hello" to a number, then it takes that number and mods it by the number of buckets.

Since we only have one bucket here, any number mod 1 is 0, so it's going straight into bucket 0 and the same process happens when you add another key-value pair. It'll try to place it in bucket 0, and if the first slot's taken or has a different key, it'll move to the next slot in that bucket.

Take a look at the hash(key, seed), when you use a for-range loop over two maps with the same keys, you might notice that the keys come out in a different order:

func main() {
    a := map[string]int{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6}
    b := map[string]int{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6}

    for i := range a {
        print(i, " ")
    }
    println()

    for i := range b {
        print(i, " ")
    }
}

// Output:
// a b c d e f 
// c d e f a b     
Copy after login

How's that possible? Isn't the key "a" in map a and the key "a" in map b hashed the same way?

But here's the deal, while the hash function used for maps in Go is consistent across all maps with the same key type, the seed used by that hash function is different for each map instance. So, when you create a new map, Go generates a random seed just for that map.

In the example above, both a and b use the same hash function because their keys are string types, but each map has its own unique seed.

"Wait, a bucket has only 8 slots? What happens if the bucket gets full? Does it grow like a slice?"

Well, sort of. When the buckets start getting full, or even almost full, depending on the algorithm's definition of "full", the map will trigger a growth, which might double the number of main buckets.

But here's where it gets a bit more interesting.

私が「メイン バケット」と言うとき、私は別の概念、「オーバーフロー バケット」を準備しています。これらは、衝突が多い状況で有効になります。 4 つのバケットがあるが、そのうちの 1 つは衝突が多いため 8 つのキーと値のペアで完全に埋められ、他の 3 つのバケットは空のままであると想像してください。

投稿全文はこちらからご覧いただけます: https://victoriametrics.com/blog/go-map/.

The above is the detailed content of Go Maps Explained: How Key-Value Pairs Are Actually Stored. For more information, please follow other related articles on the PHP Chinese website!

source:dev.to
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template