Go Slices: internals and usage

发表于 2024-01-20 分类于 golang Disqus：本文字数： 3.8k 阅读时长 ≈ 14 分钟

本文基于官方文档，并借鉴了大量讨论 slice 的文章来综述 slice 的底层结构和使用方法。你可以在文末找到这些文章的链接，欢迎阅读。

介绍

Go 的切片类型提供了一种处理类型化数据序列的便捷高效的方法。切片类似于其他语言中的数组，但具有一些不寻常的属性。

Arrays

切片类型是建立在 Go 数组类型之上的抽象，但它在日常编程中的使用并不多，此处仅作简单介绍。

数组类型定义指定长度和元素类型。例如，类型 [4]int 表示一个由四个整数组成的数组。
数组的大小是固定的(数组的 len 和 cap 相等，就等于数据长度)，它的长度是其类型的一部分（[4]int 和 [5]int 是不同的、不兼容的类型）。
数组可以以通常的方式索引，因此表达式 s[n] 访问第 n 个元素，从零开始。
数组不需要显式初始化；数组的零值是一个随时可用的数组，其元素本身已为零

var sliceA [2]int
sliceB := [2]int{}
sliceC := [...]int{0, 0}

fmt.Println(sliceA, len(sliceA), cap(sliceA)) // prints [] 0 0
fmt.Println(sliceB, len(sliceB), cap(sliceB)) // prints [] 0 0
fmt.Println(sliceC, len(sliceC), cap(sliceC)) // prints [] 0 0

(Try it yourself)

Slices

切片的类型规范为 []T，其中 T 是切片元素的类型。与数组类型不同，切片类型没有指定的长度。切片文字的声明方式与数组文字类似，只是省略了元素计数：

var sliceA []int
sliceB := []int{}
sliceC := []int{1, 2}
sliceD := sliceC[:1]
sliceE := sliceC[0:1:cap(sliceC)]
sliceF := sliceC[:0]
sliceG := make([]int, 0)
sliceH := make([]int, 1)
sliceI := make([]int, 0, 1)
sliceJ := make([]int, 1, 2)

fmt.Println(sliceA, len(sliceA), cap(sliceA)) // prints [] 0 0
fmt.Println(sliceB, len(sliceB), cap(sliceB)) // prints [] 0 0
fmt.Println(sliceC, len(sliceC), cap(sliceC)) // prints [1 2] 2 2
fmt.Println(sliceD, len(sliceD), cap(sliceD)) // prints [1] 1 2
fmt.Println(sliceE, len(sliceE), cap(sliceE)) // prints [1] 1 2
fmt.Println(sliceF, len(sliceF), cap(sliceF)) // prints [] 0 2
fmt.Println(sliceG, len(sliceG), cap(sliceG)) // prints [] 0 0
fmt.Println(sliceH, len(sliceH), cap(sliceH)) // prints [0] 1 1
fmt.Println(sliceI, len(sliceI), cap(sliceI)) // prints [] 0 1
fmt.Println(sliceJ, len(sliceJ), cap(sliceJ)) // prints [0] 1 2

(Try it yourself)

// 注意：slice 只可以和 nil 做比较, slice 之间做比较编译期间就不会通过。

1 2	> fmt.Println(sliceA == sliceB) invalid operation: sliceA == sliceB (slice can only be compared to nil)

数据结构

https://github.com/golang/go/blob/master/src/runtime/slice.go#L15

type slice struct {
	array unsafe.Pointer
	len   int
	cap   int
}

array 是指向数组的指针。
len 是切片中的元素数量（与调用 len(s) 获得的值相同）。
cap 是切片的总容量（切片开头和分配内存末尾之间可以容纳的元素数量）。

从 slice 的数据结构来看，它所存储的是数据数组的指针地址(array)。所以如果传递切片，就会创建一个指向原始数组 array 的一个新的 slice 切片。这样，对切片的操作就与对数组索引的操作有着相同的高效性，但这也造成了新切片元素的改动会同时影响到原始切片元素（这是因为 array 是相同的）。

append 元素(扩容机制)

sliceA := []int{1, 2, 3}
sliceB := append(sliceA, 4)
sliceC := append(sliceB, 5)
sliceC[0] = 0

fmt.Println(*(*slice)(unsafe.Pointer(&sliceA)), sliceA) // prints {0xc00009a000 3 3} [1 2 3]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceB)), sliceB) // prints {0xc00009c000 4 6} [0 2 3 4]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceC)), sliceC) // prints {0xc00009c000 5 6} [0 2 3 4 5]

sliceD := append([]int{1, 2}, 3)
sliceE := append(sliceD, 4)
sliceF := append(sliceD, 5)
sliceF[0] = 0

fmt.Println(*(*slice)(unsafe.Pointer(&sliceD)), sliceD) // prints {0xc0000aa020 3 4} [0 2 3]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceE)), sliceE) // prints {0xc0000aa020 4 4} [0 2 3 5]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceF)), sliceF) // prints {0xc0000aa020 4 4} [0 2 3 5]

(Try it yourself)

append 的两种场景

screenshot-20240121-190658

当 append 之后的长度小于等于 cap，将会直接利用原底层数组剩余的空间。
当 append 后的长度大于 cap 时，则会分配一块更大的区域来容纳新的底层数组。

因此，为了避免内存发生拷贝，如果能够知道最终的切片的大小，预先设置 cap 的值能够获得更好的性能。

growslice 切片扩容

// growslice allocates new backing store for a slice.
//
// arguments:
//
//	oldPtr = pointer to the slice's backing array
//	newLen = new length (= oldLen + num)
//	oldCap = original slice's capacity.
//	   num = number of elements being added
//	    et = element type
//
// return values:
//
//	newPtr = pointer to the new backing store
//	newLen = same value as the argument
//	newCap = capacity of the new backing store
//
// Requires that uint(newLen) > uint(oldCap).
// Assumes the original slice length is newLen - num
//
// A new backing store is allocated with space for at least newLen elements.
// Existing entries [0, oldLen) are copied over to the new backing store.
// Added entries [oldLen, newLen) are not initialized by growslice
// (although for pointer-containing element types, they are zeroed). They
// must be initialized by the caller.
// Trailing entries [newLen, newCap) are zeroed.
//
// growslice's odd calling convention makes the generated code that calls
// this function simpler. In particular, it accepts and returns the
// new length so that the old length is not live (does not need to be
// spilled/restored) and the new length is returned (also does not need
// to be spilled/restored).
func growslice(oldPtr unsafe.Pointer, newLen, oldCap, num int, et *_type) slice {
	oldLen := newLen - num
	if raceenabled {
		callerpc := getcallerpc()
		racereadrangepc(oldPtr, uintptr(oldLen*int(et.Size_)), callerpc, abi.FuncPCABIInternal(growslice))
	}
	if msanenabled {
		msanread(oldPtr, uintptr(oldLen*int(et.Size_)))
	}
	if asanenabled {
		asanread(oldPtr, uintptr(oldLen*int(et.Size_)))
	}

	if newLen < 0 {
		panic(errorString("growslice: len out of range"))
	}

	if et.Size_ == 0 {
		// append should not create a slice with nil pointer but non-zero len.
		// We assume that append doesn't need to preserve oldPtr in this case.
		return slice{unsafe.Pointer(&zerobase), newLen, newLen}
	}

	newcap := nextslicecap(newLen, oldCap)

	var overflow bool
	var lenmem, newlenmem, capmem uintptr
	// Specialize for common values of et.Size.
	// For 1 we don't need any division/multiplication.
	// For goarch.PtrSize, compiler will optimize division/multiplication into a shift by a constant.
	// For powers of 2, use a variable shift.
	noscan := et.PtrBytes == 0
	switch {
	case et.Size_ == 1:
		lenmem = uintptr(oldLen)
		newlenmem = uintptr(newLen)
		capmem = roundupsize(uintptr(newcap), noscan)
		overflow = uintptr(newcap) > maxAlloc
		newcap = int(capmem)
	case et.Size_ == goarch.PtrSize:
		lenmem = uintptr(oldLen) * goarch.PtrSize
		newlenmem = uintptr(newLen) * goarch.PtrSize
		capmem = roundupsize(uintptr(newcap)*goarch.PtrSize, noscan)
		overflow = uintptr(newcap) > maxAlloc/goarch.PtrSize
		newcap = int(capmem / goarch.PtrSize)
	case isPowerOfTwo(et.Size_):
		var shift uintptr
		if goarch.PtrSize == 8 {
			// Mask shift for better code generation.
			shift = uintptr(sys.TrailingZeros64(uint64(et.Size_))) & 63
		} else {
			shift = uintptr(sys.TrailingZeros32(uint32(et.Size_))) & 31
		}
		lenmem = uintptr(oldLen) << shift
		newlenmem = uintptr(newLen) << shift
		capmem = roundupsize(uintptr(newcap)<<shift, noscan)
		overflow = uintptr(newcap) > (maxAlloc >> shift)
		newcap = int(capmem >> shift)
		capmem = uintptr(newcap) << shift
	default:
		lenmem = uintptr(oldLen) * et.Size_
		newlenmem = uintptr(newLen) * et.Size_
		capmem, overflow = math.MulUintptr(et.Size_, uintptr(newcap))
		capmem = roundupsize(capmem, noscan)
		newcap = int(capmem / et.Size_)
		capmem = uintptr(newcap) * et.Size_
	}

	// The check of overflow in addition to capmem > maxAlloc is needed
	// to prevent an overflow which can be used to trigger a segfault
	// on 32bit architectures with this example program:
	//
	// type T [1<<27 + 1]int64
	//
	// var d T
	// var s []T
	//
	// func main() {
	//   s = append(s, d, d, d, d)
	//   print(len(s), "\n")
	// }
	if overflow || capmem > maxAlloc {
		panic(errorString("growslice: len out of range"))
	}

	var p unsafe.Pointer
	if et.PtrBytes == 0 {
		p = mallocgc(capmem, nil, false)
		// The append() that calls growslice is going to overwrite from oldLen to newLen.
		// Only clear the part that will not be overwritten.
		// The reflect_growslice() that calls growslice will manually clear
		// the region not cleared here.
		memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
	} else {
		// Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory.
		p = mallocgc(capmem, et, true)
		if lenmem > 0 && writeBarrier.enabled {
			// Only shade the pointers in oldPtr since we know the destination slice p
			// only contains nil pointers because it has been cleared during alloc.
			//
			// It's safe to pass a type to this function as an optimization because
			// from and to only ever refer to memory representing whole values of
			// type et. See the comment on bulkBarrierPreWrite.
			bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(oldPtr), lenmem-et.Size_+et.PtrBytes, et)
		}
	}
	memmove(p, oldPtr, lenmem)

	return slice{p, newLen, newcap}
}

// nextslicecap computes the next appropriate slice length.
func nextslicecap(newLen, oldCap int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if newLen > doublecap {
		return newLen
	}

	const threshold = 256
	if oldCap < threshold {
		return doublecap
	}
	for {
		// Transition from growing 2x for small slices
		// to growing 1.25x for large slices. This formula
		// gives a smooth-ish transition between the two.
		newcap += (newcap + 3*threshold) >> 2

		// We need to check `newcap >= newLen` and whether `newcap` overflowed.
		// newLen is guaranteed to be larger than zero, hence
		// when newcap overflows then `uint(newcap) > uint(newLen)`.
		// This allows to check for both with the same comparison.
		if uint(newcap) >= uint(newLen) {
			break
		}
	}

	// Set newcap to the requested cap when
	// the newcap calculation overflowed.
	if newcap <= 0 {
		return newLen
	}
	return newcap
}

20240121220913

slice 扩容过程的简化流程:

检查新长度 newLen 是否超过旧 slice 的容量 oldCap，如果新长度 newLen 小于 0，则抛出错误。
如果所处理的元素类型的大小为零，返回长度和容量都和新长度 newLen 相同的 slice。
使用 nextslicecap(newLen, oldCap) 函数计算新的容量 newCap：
- 首先，新的容量 newCap 初始化为旧的容量 oldCap 的二倍。
- 如果新长度 newLen 大于这个 doublecap，那么新长度 newLen 就会成为新的容量 newCap。
- 如果旧的 slice 容量 oldCap 小于阈值 256，则新的容量 newCap 就是 doublecap，即旧容量的二倍。
- 如果旧的 slice 容量 oldCap 大于阈值 256，那么新的容量 newCap 就会在旧的容量 oldCap 基础上增长约 25%，在增长过程中，只要新的容量 newCap 足以容纳新长度 newLen，增长过程就会停止。
- 在整个过程中，一旦 newCap 出现溢出，将直接将新长度 newLen 设为新容量 newCap。
根据元素类型和新的容量 newCap，计算需要的内存大小，并在特殊情况下（例如元素类型大小为 1、指针大小以及 2 的幂次）做一些优化操作。在这个过程中，可能需要特别处理来避免计算溢出。
检查计算结果是否溢出或者超出最大分配大小，如果是则抛出错误。
为新的容量 newCap 分配新的内存。
清空新内存的未使用部分。
将旧 slice 的数据复制到新的内存中。
返回新的 slice，包括新内存的指针，新的长度 newLen 和新的容量 newCap。

截取(拷贝)

切片表达式的签名：s[start:end:cap]

基于切片 s 创建一个新的切片（可以包括原切片的所有元素），包含的元素从索引 start 处开始，到但不包括 end 索引处的元素，cap 是新创建子切片的容量，是可选的。如果 cap 省略，子切片的容量等于其长度。子切片的长度计算公式：end - start。

sliceA := []int{1, 2, 3}
sliceB := sliceA[:]
sliceC := sliceA[:1]
sliceD := sliceA[1:]
sliceE := sliceA[:1:2]

fmt.Println(*(*slice)(unsafe.Pointer(&sliceA)), sliceA) // prints {0xc00001a018 3 3} [1 2 3]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceB)), sliceB) // prints {0xc00001a018 3 3} [1 2 3]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceC)), sliceC) // prints {0xc00001a018 1 3} [1]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceD)), sliceD) // prints {0xc00001a020 2 2} [2 3]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceE)), sliceE) // prints{0xc00001a018 1 2} [1]

sliceF := make([]int, len(sliceA))
copy(sliceF, sliceA)                                    // a deep copy of sliceA
fmt.Println(*(*slice)(unsafe.Pointer(&sliceA)), sliceA) // prints {0xc00001a018 3 3} [1 2 3]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceF)), sliceF) // prints {0xc00001a030 3 3} [1 2 3]

sliceG := make([]int, 1, 2)
sliceH := modify(sliceG)                                // sliceG[0] = 1
fmt.Println(*(*slice)(unsafe.Pointer(&sliceG)), sliceG) // prints {0xc0001200a0 1 2} [1]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceH)), sliceH) // prints {0xc0001200a0 1 2} [1]

sliceI := append1(sliceG)                               // append(sliceG, 4)
fmt.Println(*(*slice)(unsafe.Pointer(&sliceG)), sliceG) // prints {0xc0001200a0 1 2} [1]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceI)), sliceI) // prints {0xc0001200a0 2 2} [1 4]

sliceJ := append2(sliceG)                               // append(sliceG, 5, 6)
fmt.Println(*(*slice)(unsafe.Pointer(&sliceG)), sliceG) // prints {0xc0001200a0 1 2} [1]
fmt.Println(*(*slice)(unsafe.Pointer(&sliceJ)), sliceJ) // prints {0xc000122020 3 4} [1 5 6]

(Try it yourself)

切片的截取是创建一个指向原始数组 array 的一个新的 slice 切片。新切片的 cap 最大不得超过原切片 cap（超过则会报错 panic: runtime error: slice bounds out of range [::5] with capacity 3）。[:n] 时，新切片 cap 等于原切片 cap, [n:] 新切片 cap 等于原切片 cap 减 n。
切片在传递的过程中，会创建一个指向原始数组 array 的一个新的 slice 切片，函数中对切片的修改可能会导致原切片的底层数组发生变化。而一旦切片发生扩容，则会对切片进行深度拷贝，深度拷贝会创建一个新的切片，并拷贝原切片的元素，新切片将拥有自己独立的元素副本。
可以使用 func copy(dst, src []T) int 函数对切片进行深度拷贝, dst 是目标切片，src 是源切片。两个切片必须有相同的元素类型 T。该函数返回复制的元素数量，取 dst 和 src 长度的最小值。注意，如果目标切片 dst 的长度小于源切片 src 的长度，则只会复制 src 的前 len(dst) 个元素。要深度复制整个切片，必须确保 dst 具有足够的容量以容纳 src 的所有元素。

range 遍历

sliceA := []int{1, 2, 3}
for i := range sliceA {
  fmt.Printf("sliceA[%d]: %d\n", i, sliceA[i])
}

sliceB := []int{4, 5}
for i, v := range sliceB {
  fmt.Printf("sliceB[%d]: %d\n", i, v)
}

sliceC := []int{7}
for i := 0; i < len(sliceC); i++ {
  fmt.Printf("sliceC[%d]: %d\n", i, sliceC[i])
}

(Try it yourself)

技巧与陷阱

`slice` 不是并发安全的数据结构

将切片 `slice` 作为输入传递给 `append` 或函数后避免再次使用它

They’re called Slices because they have Sharp Edges: Even More Go Pitfalls

对于初学者或不理解底层原理的用户而言，append 或传递后的 slice 变得不可预测（详见上面的 append 元素(扩容机制)）。在日常场景中，避免再次使用可以减少很多奇奇怪怪的问题。

预分配 `slice` 内存，避免 `growslice` 发生内存拷贝

Golang 中预分配 slice 内存对性能的影响

Golang 中预分配 slice 内存对性能的影响（续）

sliceA := []int{1, 2, 3}
sliceB := make([]int, 0, len(sliceA))
for _, v := range sliceA {
	sliceB = append(sliceB, v)
}

切片复用，避免重复创建

Go Wiki: SliceTricks #Filtering without allocating

切片内存技巧

func TrimSpace(s []byte) []byte {
    b := s[:0]
    for _, x := range s {
        if x != ' ' {
            b = append(b, x)
        }
    }
    return b
}

对于性能敏感的场景, 复用对象可以避免反复的对象创建的内存开销。另外还有使用 sync.Pool 以及使用俄罗斯大佬 (fasthttp 作者) 的 bytebufferpool 将性能推到极限的做法（详见: Golang 中预分配 slice 内存对性能的影响（续））。

避免切片内存泄漏

避免切片内存泄漏

深入探究 Go 中的 array 与 slice

切片(slice)性能及陷阱

slices: have Delete and others clear the tail #63393

var gopherRegexp = regexp.MustCompile("gopher")

// 泄漏场景
func FindGopher(filename string) []byte {
    // 读取大文件  1,000,000,000 bytes (1GB)
    b, _ := ioutil.ReadFile(filename)
    // 只取一个 6 字节的子切片
    gopherSlice := make([]byte, len("gopher"))

    // 深度拷贝
    copy(gopherSlice, gopherRegexp.Find(b...)
    return gopherSlice
}

// 正确场景
func FindGopher(filename string) []byte {
    // 读取大文件  1,000,000,000 bytes (1GB)
    b, _ := ioutil.ReadFile(filename)
    // 只取一个 6 字节的子切片
    gopherSlice := make([]byte, len("gopher"))

    // 深度拷贝
    copy(gopherSlice, gopherRegexp.Find(b...)
    return gopherSlice
}

`for i := 0; i < len(slice); i++ { ... }` 迭代，避免 `range` 值拷贝开销

Does Go’s range Copy a Slice Before Iterating Over It?

for 和 range 的性能比较

sliceA := []int{1, 2, 3}
for k, v := range sliceA {
	fmt.Println(k, v)
}

for i := 0; i < len(sliceA); i++ {
	fmt.Println(i, sliceA[i])
}

对于性能敏感的场景，避免使用 range 可以减少值拷贝开销。range 在迭代过程中返回的是迭代值的拷贝，如果每次迭代的元素的内存占用很低，那么 for 和 range 的性能几乎是一样，例如 []int。但是如果迭代的元素内存占用较高，例如一个包含很多属性的 struct 结构体，那么 for 的性能将显著地高于 range，有时候甚至会有上千倍的性能差异。对于这种场景，建议使用 for，如果使用 range，建议只迭代下标，通过下标访问迭代值，这种使用方式和 for 就没有区别了。如果想使用 range 同时迭代下标和值，则需要将切片/数组的元素改为指针，才能不影响性能。

介绍

Arrays

Slices

数据结构

append 元素(扩容机制)

append 的两种场景

growslice 切片扩容

截取(拷贝)

range 遍历

技巧与陷阱

slice 不是并发安全的数据结构

将切片 slice 作为输入传递给 append 或函数后避免再次使用它

预分配 slice 内存，避免 growslice 发生内存拷贝

切片复用，避免重复创建

避免切片内存泄漏

for i := 0; i < len(slice); i++ { ... } 迭代，避免 range 值拷贝开销

推荐阅读

`slice` 不是并发安全的数据结构

将切片 `slice` 作为输入传递给 `append` 或函数后避免再次使用它

预分配 `slice` 内存，避免 `growslice` 发生内存拷贝

`for i := 0; i < len(slice); i++ { ... }` 迭代，避免 `range` 值拷贝开销