Golang SSA in Practice
References: go doc: index; go doc: ssa; go doc: cmd/compile; go doc: ssa (internal code); go doc: ssa (internal doc); ssa rule
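"Compiling the Go compiler" is a bit of a tongue twister. It means downloading the Go source code and compiling it to produce the go tool. This post uses the go1.15 source as the example; see https://golang.org/doc/install/source.
The Go toolchain is itself written in Go, so an existing Go toolchain is needed to build it. There are two ways to provide one:
1. Make sure go can be found in PATH.
2. Set GOROOT_BOOTSTRAP to the Go installation directory you want to use; $GOROOT_BOOTSTRAP/bin/go will then be used as the bootstrap compiler.
Version requirement for the bootstrap go: go1.4 or later.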
make.bash: build

    (cd src && ./make.bash)

all.bash: build and run the tests

    (cd src && ./all.bash)

Rebuilding: run clean.bash (optional), then run make.bash or all.bash again.
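Testing the new compiler: testdata/hello/hello.go

    package main

    import "fmt"

    func main() {
        fmt.Printf("hello, world\n")
    }

When running go below, the GOROOT environment variable must point to the current directory (the root of the source tree), because the go command looks for its tools in $GOROOT/pkg/tool/linux_amd64 (on linux/amd64). If GOROOT is not set, the old GOROOT is used, the toolchain becomes inconsistent, and compilation fails.

    GOROOT=$PWD bin/go run testdata/hello/*.go

Output:

    hello, world

In the root directory of the Go source tree, create a scripts folder and add the following scripts.

go-init: add the scripts directory to PATH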
    # usage: source go-init
    file=${BASH_SOURCE[0]}
    if [[ -z $file ]];then
        echo "usage: source go-init" >&2
        # this file is meant to be sourced, so prefer return over exit:
        # a bare exit would close the calling shell
        return 1 2>/dev/null || exit 1
    fi
    SCRIPTS_DIR=$(dirname "$(realpath "$file")")
    MYGO=$(dirname "$SCRIPTS_DIR")
    export PATH=$MYGO/scripts:$PATH

gengo: build go

    #!/usr/bin/env bash
    SCRIPTS_DIR=$(dirname "$(realpath "$0")")
    MYGO=$(dirname "$SCRIPTS_DIR")

    # pick the first go in PATH that is not our own wrapper script
    gos=$(which -a go 2>/dev/null)
    bootstrapGo=
    for go in $gos;do
        if [[ $go != "$MYGO/scripts/go" ]];then
            bootstrapGo=$go
            break
        fi
    done
    if [[ -z $bootstrapGo ]];then
        echo "cannot find bootstrap go" >&2
        exit 1
    fi
    # assumes the bootstrap go lives at <GOROOT>/bin/go
    export GOROOT_BOOTSTRAP=$(dirname "$(dirname "$bootstrapGo")")
    cd "$MYGO/src"

    # keep the build quiet; only show the log when make.bash fails
    log=$(mktemp)
    ./make.bash 1>"$log" 2>&1
    status=$?
    if [[ $status != 0 ]];then
        cat "$log" >&2
    fi
    rm "$log"
    exit $status

cleango: remove the build results

    #!/usr/bin/env bash
    SCRIPTS_DIR=$(dirname "$(realpath "$0")")
    MYGO=$(dirname "$SCRIPTS_DIR")
    cd "$MYGO/src"
    ./clean.bash

go: invoke the freshly built go tool

    #!/usr/bin/env bash
    SCRIPTS_DIR=$(dirname "$(realpath "$0")")
    MYGO=$(dirname "$SCRIPTS_DIR")
    GOROOT=$MYGO "$MYGO/bin/go" "$@"

To get started, run source scripts/go-init once to set up the development environment. After that you can run gengo at any time to rebuild the go tool, and run go to invoke it.
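Since the whole point of the wrapper is to pin GOROOT to the source tree, it can help to check which tree a program was actually built from. The following is a minimal sketch, not part of the original scripts: the helper name toolDir is hypothetical, and it only reproduces the $GOROOT/pkg/tool/$GOOS_$GOARCH layout mentioned earlier.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// toolDir is a hypothetical helper that joins a GOROOT with the
// pkg/tool/<GOOS>_<GOARCH> layout used for the compiler and linker.
func toolDir(goroot, goos, goarch string) string {
	return filepath.Join(goroot, "pkg", "tool", goos+"_"+goarch)
}

func main() {
	goroot := runtime.GOROOT()
	fmt.Println("GOROOT:  ", goroot)
	fmt.Println("tool dir:", toolDir(goroot, runtime.GOOS, runtime.GOARCH))
	fmt.Println("version: ", runtime.Version())
}
```

Running this through the go wrapper script should report the source tree as GOROOT. If it reports the system installation instead, the wrapper is probably not first in PATH.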
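If you develop with coc.nvim, it is important that the code is resolved correctly. If jumping around the Go source does not work in coc.nvim, run :CocInfo to see the error output from gopls.
When configuring gopls we specified how the project root is discovered: by looking for go.mod, .vim, or .git.

    "languageserver": {
        "go": {
            "command": "gopls",
            "rootPatterns": ["go.mod", ".vim", ".git"],
            "trace.server": "verbose",
            "disableWorkspaceFolders": true,
            "filetypes": ["go"]
        },
        ...
    }

The go.mod of the standard library lives in src, so we have to start nvim from the src directory, not from the root of the source tree.
The complete command to open nvim is therefore:

    (GOROOT=$PWD && cd src && GOROOT=$GOROOT GO111MODULE=on nvim)

Only then can gopls resolve the Go source correctly. Note GO111MODULE=on: the value on must be lowercase.
Once this is set up, gd (go to definition) is blazingly fast, and you no longer have to worry about nvim failing to jump.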
https://github.com/golang/go/blob/release-branch.go1.15/src/cmd/compile/internal/ssa/gen/generic.rules#L590 contains the following rules:

    // basic phi simplifications
    (Phi (Const8  [c]) (Const8  [c])) => (Const8  [c])
    (Phi (Const16 [c]) (Const16 [c])) => (Const16 [c])
    (Phi (Const32 [c]) (Const32 [c])) => (Const32 [c])
    (Phi (Const64 [c]) (Const64 [c])) => (Const64 [c])

Meaning of these rules: if a Phi node has exactly two arguments and both are the same constant, the Phi node can be replaced by that constant.
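To make the rule concrete, here is a small program whose join point conceptually has the shape Phi(Const8 [7], Const8 [7]). This is only an illustration: the function name pick is made up, and whether the compiler really builds this exact Phi before the rule fires depends on the earlier passes. Building with the environment variable GOSSAFUNC=pick writes an ssa.html file that lets you check.

```go
package main

import "fmt"

// pick assigns the same constant on both branches, so at the join point
// the value of x is conceptually Phi(Const8 [7], Const8 [7]).
//
//go:noinline
func pick(b bool) int8 {
	var x int8
	if b {
		x = 7
	} else {
		x = 7
	}
	return x
}

func main() {
	fmt.Println(pick(true), pick(false)) // prints "7 7"
}
```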
Generated code: https://github.com/golang/go/blob/release-branch.go1.15/src/cmd/compile/internal/ssa/rewritegeneric.go#L18559

    func rewriteValuegeneric_OpPhi(v *Value) bool {
        // match: (Phi (Const8 [c]) (Const8 [c]))
        // result: (Const8 [c])
        for {
            _ = v.Args[1]
            v_0 := v.Args[0]
            if v_0.Op != OpConst8 {
                break
            }
            c := auxIntToInt8(v_0.AuxInt)
            v_1 := v.Args[1]
            if v_1.Op != OpConst8 || auxIntToInt8(v_1.AuxInt) != c || len(v.Args) != 2 {
                break
            }
            v.reset(OpConst8)
            v.AuxInt = int8ToAuxInt(c)
            return true
        }
        // ... the other 3 rules
    }

We can debug SSA by modifying the code around here.
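Viewing logs during compilation
Edit the file cmd/compile/internal/ssa/rewritegeneric.go (fmt and os have to be added to the file's imports if they are not there yet):

    func rewriteValuegeneric(v *Value) bool {
        if os.Getenv("LET_ME_PANIC") == "true" {
            panic(fmt.Errorf("LET_ME_PANIC = true"))
        }
        switch v.Op {
        case OpAdd16:
            // ...
        }
        // ...
    }

Rebuild the go tool, then compile some project with it:

    MYGO=$PWD
    cd SomeModule
    GOROOT=$MYGO LET_ME_PANIC=true $MYGO/bin/go build

Inspect the call stack: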
    ../../../../pkg/mod/gopkg.in/yaml.v3@v3.0.0-20200615113413-eeeca48fe776/decode.go:326:33: internal compiler error: 'init': panic during opt while compiling init:

    LET_ME_PANIC = true

    goroutine 25 [running]:
    cmd/compile/internal/ssa.Compile.func1(0xc00171ce98, 0xc000de7a20)
        X/golang/go/src/cmd/compile/internal/ssa/compile.go:48 +0xa5
    panic(0xbf95a0, 0xc000664cb0)
        X/golang/go/src/runtime/panic.go:965 +0x1b9
    cmd/compile/internal/ssa.rewriteValuegeneric(0xc0010e8888, 0xc001719f00)
        X/golang/go/src/cmd/compile/internal/ssa/rewritegeneric.go:15 +0x22f7
    cmd/compile/internal/ssa.applyRewrite(0xc000de7a20, 0xc8cae0, 0xc8cb60, 0x3297b7dc1f4e01)
        X/golang/go/src/cmd/compile/internal/ssa/rewrite.go:129 +0x50d
    cmd/compile/internal/ssa.opt(0xc000de7a20)
        X/golang/go/src/cmd/compile/internal/ssa/opt.go:9 +0x48
    cmd/compile/internal/ssa.Compile(0xc000de7a20)
        X/golang/go/src/cmd/compile/internal/ssa/compile.go:96 +0x98d
    cmd/compile/internal/gc.buildssa(0xc000a8ba20, 0x3, 0x0)
        X/golang/go/src/cmd/compile/internal/gc/ssa.go:463 +0xe1a
    cmd/compile/internal/gc.compileSSA(0xc000a8ba20, 0x3)
        X/golang/go/src/cmd/compile/internal/gc/pgen.go:319 +0x5d
    cmd/compile/internal/gc.compileFunctions.func2(0xc0012e5380, 0xc000459880, 0x3)
        X/golang/go/src/cmd/compile/internal/gc/pgen.go:384 +0x4d
    created by cmd/compile/internal/gc.compileFunctions
        X/golang/go/src/cmd/compile/internal/gc/pgen.go:382 +0x129

    goroutine 25 [running]:
    runtime/debug.Stack(0xd7bea0, 0xc00000e018, 0x0)
    ....

The trace shows that compilation is concurrent: the panicking goroutine was created by compileFunctions. The call stack leading to rewriteValuegeneric is:

    cmd/compile/internal/gc.compileFunctions
    -> cmd/compile/internal/gc.compileSSA
    -> cmd/compile/internal/gc.buildssa
    -> cmd/compile/internal/ssa.Compile
    -> cmd/compile/internal/ssa.opt
    -> cmd/compile/internal/ssa.applyRewrite
    -> cmd/compile/internal/ssa.rewriteValuegeneric

Now let's look at a piece of code that eliminates phi nodes:
    // phielimValue tries to convert the phi v to a copy.
    func phielimValue(v *Value) bool {
        if v.Op != OpPhi {
            return false
        }

        // If there are two distinct args of v which
        // are not v itself, then the phi must remain.
        // Otherwise, we can replace it with a copy.
        var w *Value
        for _, x := range v.Args {
            if x == v {
                continue
            }
            if x == w {
                continue
            }
            if w != nil {
                return false
            }
            w = x
        }

        if w == nil {
            // v references only itself. It must be in
            // a dead code loop. Don't bother modifying it.
            return false
        }
        v.Op = OpCopy
        v.SetArgs1(w)
        f := v.Block.Func
        if f.pass.debug > 0 {
            f.Warnl(v.Pos, "eliminated phi")
        }
        return true
    }

The main steps are:
1. Check that all arguments, ignoring v itself, are the same value.
2. Rewrite Op to OpCopy and set Args to that single value.
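The two steps can be tried outside the compiler with a toy model. The sketch below is not the compiler's code: Op, Value and their fields are simplified, hypothetical stand-ins for the ssa package types, and the debug logging is left out.

```go
package main

import "fmt"

// Op and Value are toy stand-ins for the compiler's ssa types.
type Op int

const (
	OpConst Op = iota
	OpPhi
	OpCopy
)

type Value struct {
	Name string
	Op   Op
	Args []*Value
}

// phielimValue turns v into a copy if all of its arguments,
// ignoring v itself, are the same value.
func phielimValue(v *Value) bool {
	if v.Op != OpPhi {
		return false
	}
	var w *Value
	for _, x := range v.Args {
		if x == v || x == w {
			continue
		}
		if w != nil {
			return false // two distinct arguments: the phi must stay
		}
		w = x
	}
	if w == nil {
		return false // v only references itself
	}
	v.Op = OpCopy
	v.Args = []*Value{w}
	return true
}

func main() {
	x := &Value{Name: "x", Op: OpConst}
	y := &Value{Name: "y", Op: OpConst}

	v := &Value{Name: "v", Op: OpPhi}
	v.Args = []*Value{x, v, x, v} // v = phi(x, v, x, v)
	fmt.Println(phielimValue(v), v.Op == OpCopy, v.Args[0].Name) // true true x

	u := &Value{Name: "u", Op: OpPhi, Args: []*Value{x, y}} // u = phi(x, y)
	fmt.Println(phielimValue(u), u.Op == OpPhi) // false true
}
```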
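cmd/compile/internal/ssa/copyelim.go
Along a chain of consecutive OpCopy values, eliminate every OpCopy so that V refers directly to the final value. The algorithm:

    V = (OpCopy (OpCopy (OpCopy ....(OpCopy X)...)))  ->  V = X

Besides the final V = X rewrite, Args[0] of every intermediate OpCopy node is also shortened to X, so that the chain never has to be walked again.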
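The chain shortening can be sketched on a toy value type. This is an illustration under assumptions, not the code from copyelim.go: the types are hypothetical, and the sketch assumes the copy chain contains no cycle.

```go
package main

import "fmt"

// Op and Value are toy stand-ins for the compiler's ssa types.
type Op int

const (
	OpConst Op = iota
	OpCopy
)

type Value struct {
	Name string
	Op   Op
	Args []*Value
}

// copySource returns the first non-copy value reachable from v and
// rewrites every copy on the way to point at it directly.
// Assumption: the chain of copies is acyclic.
func copySource(v *Value) *Value {
	// First walk: find the end of the chain.
	w := v
	for w.Op == OpCopy {
		w = w.Args[0]
	}
	// Second walk: shortcut every link to the final value.
	for v.Op == OpCopy {
		next := v.Args[0]
		v.Args[0] = w
		v = next
	}
	return w
}

func main() {
	x := &Value{Name: "x", Op: OpConst}
	c1 := &Value{Name: "c1", Op: OpCopy, Args: []*Value{x}}
	c2 := &Value{Name: "c2", Op: OpCopy, Args: []*Value{c1}}
	c3 := &Value{Name: "c3", Op: OpCopy, Args: []*Value{c2}}

	fmt.Println(copySource(c3).Name)                              // x
	fmt.Println(c3.Args[0].Name, c2.Args[0].Name, c1.Args[0].Name) // x x x
}
```

After one call, a later lookup starting from c2 or c1 reaches x in a single step, which is the point of shortening the intermediate nodes.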
For every value in every block of the function body, simplify Phi nodes into Copy nodes and keep simplifying the Copy nodes. Repeat the process until nothing can be simplified any further.

    // phielim eliminates redundant phi values from f.
    // A phi is redundant if its arguments are all equal. For
    // purposes of counting, ignore the phi itself. Both of
    // these phis are redundant:
    //   v = phi(x,x,x)
    //   v = phi(x,v,x,v)
    // We repeat this process to also catch situations like:
    //   v = phi(x, phi(x, x), phi(x, v))
    // TODO: Can we also simplify cases like:
    //   v = phi(v, w, x)
    //   w = phi(v, w, x)
    // and would that be useful?
    func phielim(f *Func) {
        for {
            change := false
            for _, b := range f.Blocks {
                for _, v := range b.Values {
                    copyelimValue(v)
                    change = phielimValue(v) || change
                }
            }
            if !change {
                break
            }
        }
    }

ssa.Compile is the entry point for function optimization.
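Analyzing a call chain is mostly about determining the relations between entities. A relation between two entities looks like this:

     _______                _______
    |       |   relation   |       |
    | entity| -----------> | entity|
    |       |              |       |
     -------                -------

A relation can be any predicate. An entity consists of a type plus an identifier for the concrete instance.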
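One possible way to write this model down in code is shown below. The type and field names (Entity, Relation, Predicate) are hypothetical and chosen only for this illustration; the example instance uses the Compile and opt functions from the call stack shown earlier.

```go
package main

import "fmt"

// Entity is a type plus an identifier for the concrete instance.
type Entity struct {
	Type string
	ID   string
}

// Relation connects two entities through an arbitrary predicate.
type Relation struct {
	From      Entity
	Predicate string
	To        Entity
}

// String renders the relation in the same shape as the diagram.
func (r Relation) String() string {
	return fmt.Sprintf("%s:%s --%s--> %s:%s",
		r.From.Type, r.From.ID, r.Predicate, r.To.Type, r.To.ID)
}

func main() {
	compile := Entity{Type: "func", ID: "cmd/compile/internal/ssa.Compile"}
	opt := Entity{Type: "func", ID: "cmd/compile/internal/ssa.opt"}
	fmt.Println(Relation{From: compile, Predicate: "calls", To: opt})
}
```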
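Treat each per-package step (build, install, link, and so on) as an action. The Deps field of an action lists the other actions it depends on. From this we can tell that the link action depends on the build action, and that the build of the main package depends on the build actions of all the packages it uses.
Moreover, there must be some packages that depend on no other package (unsafe, builtin, etc.). So go constructs a directed acyclic graph and makes sure that building starts from the nodes with zero dependencies and proceeds until the main package has been built.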
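The idea of starting from the nodes with zero dependencies can be sketched with a standard topological sort (Kahn's algorithm). This is a simplified, sequential illustration and not cmd/go's implementation: the Action type here is hypothetical, and the package graph in main is invented for the example.

```go
package main

import "fmt"

// Action is a toy build step; Deps must finish before it can run.
type Action struct {
	Name string
	Deps []*Action
}

// buildOrder returns all actions reachable from root, ordered so that
// every action comes after all of its dependencies.
func buildOrder(root *Action) ([]*Action, error) {
	pending := map[*Action]int{}     // number of unfinished dependencies
	users := map[*Action][]*Action{} // reverse edges: who waits for this action
	var all []*Action

	var visit func(a *Action)
	visit = func(a *Action) {
		if _, seen := pending[a]; seen {
			return
		}
		pending[a] = len(a.Deps)
		all = append(all, a)
		for _, d := range a.Deps {
			users[d] = append(users[d], a)
			visit(d)
		}
	}
	visit(root)

	// Start with the actions that have no dependencies at all.
	var ready, order []*Action
	for _, a := range all {
		if pending[a] == 0 {
			ready = append(ready, a)
		}
	}
	for len(ready) > 0 {
		a := ready[0]
		ready = ready[1:]
		order = append(order, a)
		for _, u := range users[a] {
			pending[u]--
			if pending[u] == 0 {
				ready = append(ready, u)
			}
		}
	}
	if len(order) != len(all) {
		return nil, fmt.Errorf("dependency cycle detected")
	}
	return order, nil
}

func main() {
	// An invented graph: main uses lib and unsafe, lib uses unsafe.
	unsafe := &Action{Name: "build unsafe"}
	lib := &Action{Name: "build lib", Deps: []*Action{unsafe}}
	mainPkg := &Action{Name: "build main", Deps: []*Action{lib, unsafe}}
	link := &Action{Name: "link main", Deps: []*Action{mainPkg}}

	order, err := buildOrder(link)
	if err != nil {
		panic(err)
	}
	for _, a := range order {
		fmt.Println(a.Name)
	}
}
```

For this graph the sketch prints build unsafe, build lib, build main, link main, one per line. A real scheduler can additionally run independent ready actions in parallel, which this sketch does not attempt.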
Code worth reading: the Do function, https://github.com/golang/go/blob/release-branch.go1.15/src/cmd/go/internal/work/exec.go#L56
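When testing go build, some branches of the build may never be reached because results are served from the cache. The command to invalidate the caches:

    # -x         print the remove commands as they are executed
    # -r         apply recursively to all dependencies
    # -modcache  remove the module download cache
    # -cache     remove the build cache
    go clean -x -r -modcache -cache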