Go library for compile-time code transformation
MIT License
Superpose is a library for creating Go compiler wrappers/plugins that support transforming packages in other "dimensions" and making them callable from the original package.
Quick start, following the example/mocktime README. First, build the tool:
go build ./example/mocktime/superpose-mocktime
Then run the example with the just-built executable as the toolexec:
go run -toolexec /path/to/superpose-mocktime ./example/mocktime
Note how there are log statements outside the dimension with normal timestamps and inside the dimension with mocked timestamps. This is because time.Now() is altered in the separate dimension, and therefore everything that references time in that dimension is altered too, e.g. the log package.
WARNING: This library is an intentionally-untagged proof of concept with no guarantees on future maintenance. Many advanced uses may not be supported.
This library leverages the -toolexec option of go build/run/test to intercept compilation and allow transforming certain packages in a separate dimension that are compiled alongside the untransformed code. Then a "bridge" method can call into the other dimension. Developers simply have to write the transformer and most details concerning caching, building, and other nuances are taken care of.
The uses of this are the same as any other compile-time transformer. Potential uses:
-race, code coverage, go:embed, etc can work)
Granted, as with all tools like this and especially in the Go ecosystem, compile-time code transformation should be the last resort. It should only be used when it's really needed. It can also be a bit unwieldy for the compiling developer as they have to opt in with a special argument.
time.Now() for a mock clock
See the README in each example for how to run it.
For a basically-unusable simple example, let's say we want to change every function named "ReturnString" in our package to return "foo".
Terms in common use:
var in the same file is present with a type of func that is the exact signature of the exported function and a //my-dimension:MyFuncName, it is set by the Superpose compiler to that same function in the
bool var with a comment in the form of //my-dimension:<in> that Superpose sets to true when
We must first create a transformer. This is the main executable that is used as Go's toolexec, meaning it is invoked for every Go build/compile/link/etc command. The transformer applies to a certain dimension and set of packages. Assuming we want to use dimension name my-dimension, here's how it might look:
package main

import (
	"context"
	"fmt"
	"go/ast"
	"strings"

	"github.com/cretz/superpose"
)

func main() {
	superpose.RunMain(
		context.Background(),
		superpose.Config{
			// We use the current content ID of the executable of our version which
			// adds a slight performance penalty
			Version:      superpose.MustLoadCurrentExeContentID(),
			Transformers: map[string]superpose.Transformer{"my-dimension": transformer{}},
			// This is very noisy if verbose by default. Consider only setting this as
			// true during development.
			Verbose: true,
		},
		superpose.RunMainConfig{},
	)
}

type transformer struct{}

func (transformer) AppliesToPackage(ctx *superpose.TransformContext, pkgPath string) (bool, error) {
	return strings.HasPrefix(pkgPath, "example.com/mymodule"), nil
}

func (transformer) Transform(
	ctx *superpose.TransformContext,
	pkg *superpose.TransformPackage,
) (*superpose.TransformResult, error) {
	// Change any ReturnString function to return "foo"
	res := &superpose.TransformResult{
		// We set this to true so we can make sure our patched file appears like it
		// was named the original file name
		AddLineDirectives: true,
		// If verbose is on, this will log the entirety of every patched file, which
		// we want during development
		LogPatchedFiles: true,
	}
	// Go over each file in the package
	for _, file := range pkg.Syntax {
		for _, decl := range file.Decls {
			// Add patch if it's the func we want (skipping declarations without a
			// body, which have no braces to patch between)
			decl, _ := decl.(*ast.FuncDecl)
			if decl == nil || decl.Body == nil || decl.Name.Name != "ReturnString" {
				continue
			}
			res.Patches = append(res.Patches, &superpose.Patch{
				// We're replacing from just after opening brace to just before closing
				// brace
				Range: superpose.Range{Pos: decl.Body.Lbrace + 1, End: decl.Body.Rbrace},
				// In addition to our return statement, we also want to set a line
				// directive before the closing brace to what it was before so all other
				// line numbers of the file still read the same
				Str: fmt.Sprintf(
					` return "foo" /*line :%v*/`,
					pkg.Fset.Position(decl.Body.Rbrace).Line,
				),
			})
		}
	}
	return res, nil
}
In any package underneath example.com/mymodule that has a ReturnString top-level function, we will change it to just return "foo". A more advanced example would have done some type checking to confirm the function looked right, but this is a simplified example.
Note how we built a patch, set AddLineDirectives: true, and added /*line :<line>*/ to our patch. Superpose works on patches instead of AST alterations. This is important to retain line information. When a patch may alter line counts but we want the code to appear in stack traces and debuggers at its original line, we need AddLineDirectives: true to fix the filename, and then we need to set line directives for the compiler.
Once that transformer is built as an executable, we can now use it in -toolexec. The -toolexec build flag is accepted in all go calls that may build, e.g. go build, go run, go test, etc. So if we had a user_code.go file, we could:
go run -toolexec /path/to/my-transformer user_code.go
There is a caveat however for build tags. Go does not provide toolexec executables a way to know what build tags are in use by itself and dependencies. Therefore, if we set -tags on the go command, we have to set -buildtags for the toolexec. For example:
go run -tags mytag -toolexec "/path/to/my-transformer -buildtags mytag" user_code.go
This ensures build tags are respected when building the other dimensions.
Now that we have a transformer for a dimension and know how to build with it, we need to be able to call into the dimension. Say we have this file at example.com/mymodule/otherpkg/return_string.go:
package otherpkg
func ReturnString() string { return "original string" }
Now say we want to call otherpkg in the other dimension. If we just call otherpkg.ReturnString() we'll get "original string". To call the other dimension we have to make a "bridge function".
A bridge function is an exported function in a file accompanied by a var of that exact function signature, including parameter/return var names, in that same file. The var has a special comment in the form //my-dimension:MyFunc that tells the Superpose compiler that it should be set with the same function from that dimension. The package for the file containing this bridge function must also return true for Transformer.AppliesToPackage for that dimension.
Here's an example, say at file example.com/mymodule/cmd/main.go, that has a bridge function to the my-dimension dimension:
package main

import (
	"fmt"

	"example.com/mymodule/otherpkg"
)

func CallReturnString() string { return otherpkg.ReturnString() }

var CallReturnStringInMyDimension func() string //my-dimension:CallReturnString

func main() {
	fmt.Printf("Normal code: %v\n", CallReturnString())
	fmt.Printf("Other dimension code: %v\n", CallReturnStringInMyDimension())
}
Running:
go run -toolexec /path/to/my-transformer ./cmd
Will output:
Normal code: original string
Other dimension code: foo
Bridge functions do not have to be in the main package. Any number of bridge functions can be defined. Since package-level vars are different in different dimensions, it may make sense to have a bridge function reference/mutate them. Note, types from a transformed package can't be used as parameter/return to the bridge function because it will appear as another type in the bridge and a compile error will occur.
Sometimes in transformed code we need to know whether we're running in a dimension or not. This can be done with an "in-var", which is a special bool var with a comment in the form //my-dimension:<in> where <in> is literally the term. For example, if we had:
package main

import "fmt"

var inMyDimension bool //my-dimension:<in>

func PrintSomething() {
	if inMyDimension {
		fmt.Println("In my dimension")
	} else {
		fmt.Println("In normal code")
	}
}

// The bridge var must match the signature of PrintSomething exactly, which
// takes no parameters and returns nothing.
var PrintSomethingInMyDimension func() //my-dimension:PrintSomething

func main() {
	PrintSomething()
	PrintSomethingInMyDimension()
}
Then running with the toolexec, the output will be:
In normal code
In my dimension
These in-vars can be in any transformed package and any number of them may be created. They do not have to be exported.
An earlier incarnation of this library had an entire test framework, but it became very apparent it was much clearer to just pass toolexec to go test too and run code and build bridge functions to test across dimensions there. Therefore to test a transformer, just write tests with bridge functions as needed to assert the transformer did the right thing, and run go test with -toolexec of the transformer. This means there is a transformer build step that runs before go test, which can be automated as needed.
TransformResult contains a set of patches that reference positions on the file set of the incoming package. Each Patch contains a required Range it replaces that contains a required inclusive start Pos and an optional exclusive End. If End is 0/unset, the patch will be an insertion instead of a replacement. The required Str of the patch contains the string contents to patch.
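The insertion-versus-replacement rule can be sketched with plain byte offsets. The patch type and applyPatches function below are hypothetical stand-ins written for this illustration, not Superpose's own Patch or its internals; they only mirror the described semantics of an inclusive start, an exclusive end, and an unset end meaning insertion.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// patch is a hypothetical, simplified stand-in for a range-based patch.
// start is an inclusive byte offset and end is an exclusive byte offset.
// An end of 0 means "unset", which makes the patch a pure insertion at start.
type patch struct {
	start, end int
	str        string
}

// applyPatches applies non-overlapping patches to src in a single pass and
// returns the patched text. It rejects out-of-range or overlapping patches.
func applyPatches(src string, patches []patch) (string, error) {
	sorted := append([]patch(nil), patches...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].start < sorted[j].start })

	var b strings.Builder
	cursor := 0 // offset in src up to which text has already been emitted
	for _, p := range sorted {
		end := p.end
		if end == 0 {
			end = p.start // insertion: replace an empty range
		}
		if p.start < cursor || end < p.start || end > len(src) {
			return "", fmt.Errorf("invalid or overlapping patch [%d, %d)", p.start, end)
		}
		b.WriteString(src[cursor:p.start]) // untouched text before the patch
		b.WriteString(p.str)               // patch contents
		cursor = end                       // skip the replaced range
	}
	b.WriteString(src[cursor:])
	return b.String(), nil
}

func main() {
	out, err := applyPatches("hello world", []patch{
		{start: 6, end: 11, str: "gopher"}, // replacement of "world"
		{start: 5, str: ","},               // insertion after "hello"
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because all offsets refer to the original text, patches can be declared in any order; sorting once and copying the gaps between them avoids having to adjust offsets after each edit.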
Some notes about patches:
If Str contains {{, it is assumed to be a Go template
Captures which is a named map of ranges that are made available via the Captures object in
Some guidance on patching:
go fmt is not applied on patched code
1 will reference the character right after or before
/*line :<line>*/-style
WrapWithPatch, if patching things that are not full replacements, two patches should be used
When transforming, sometimes it is necessary to depend on a package that may not have been depended on by the transformed package before. The transformer is expected to patch the imports necessary in source to do this. However, the linker needs to know about any new packages to include at compile time. This can be done by setting the dependency package name as a key on the TransformResult.IncludeDependencyPackages map. If the package is already a dependency of this package, it will have no effect.
When Go compiles a package, it first collects and compiles its dependencies. Go expects all dependencies to be compiled before the current package is compiled. Therefore, any dependencies added to this map must have already been compiled, and they must also be resolvable. go list -f "{{.Export}}" -export qualified/pkg/path is used to obtain the package file.
Users are encouraged to have their transformer code and their runtime code explicitly reference the package that may be needed somewhere in code so that it is included as a go.mod dependency at compile time and runtime. In cases where the transformer is compiled somewhere other than where the code that uses it is compiled, this can still result in cases where the dependency is not yet compiled. In these cases, it is encouraged to build the transformer where the code is built, or if that can't be done, technically go build can be done on the package as needed.
Go uses a concept of a "build ID" for caching output and determining whether to re-run. This is built on a set of slash-delimited hashes: a leading hash representing input called an "action ID", a trailing hash representing output called a "content ID" (which may be unset if not yet compiled), and any content in between. See comments at the top of buildid.go in the Go source if curious about details.
The build ID can be affected by content changes, Go version changes, build tag changes, different build flags, etc.
Superpose leverages this behavior by just altering the existing action IDs with reproducible dimension-specific hashes for the other dimensions and caches the results in its own cache. Since this hash is built by dimension name and not patched content, it can be stale if the transformer changes. So a required Version must be set in the Superpose config.
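As an illustration of the idea only, the sketch below splits a build ID into its leading action ID and trailing content ID, then derives a reproducible dimension-specific action ID. Both function names, the choice of SHA-256, the hex encoding, and the 20-character truncation are assumptions made for this example; they are not taken from Superpose's or Go's actual implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// splitBuildID returns the leading action ID and the trailing content ID of a
// slash-delimited build ID. The content ID is empty if it is not yet set.
func splitBuildID(buildID string) (actionID, contentID string) {
	first := strings.Index(buildID, "/")
	if first < 0 {
		return buildID, ""
	}
	last := strings.LastIndex(buildID, "/")
	return buildID[:first], buildID[last+1:]
}

// dimensionActionID is a hypothetical derivation: it hashes the original
// action ID together with the dimension name and transformer version, so the
// same inputs always produce the same ID and a version change produces a new
// one (which invalidates previously cached results).
func dimensionActionID(actionID, dimension, version string) string {
	h := sha256.New()
	for _, part := range []string{actionID, dimension, version} {
		h.Write([]byte(part))
		h.Write([]byte{0}) // separator so adjacent parts cannot run together
	}
	return hex.EncodeToString(h.Sum(nil))[:20]
}

func main() {
	// A made-up build ID for demonstration purposes.
	action, content := splitBuildID("actionhash/mid1/mid2/contenthash")
	fmt.Println("action ID:", action)
	fmt.Println("content ID:", content)
	fmt.Println("dimension action ID:", dimensionActionID(action, "my-dimension", "v1"))
}
```

Note that the patched source never enters the hash in this sketch, which is exactly why a stale cache is possible unless the version value changes whenever the transformer's output would.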
Version should be unique for each change of a transformer that would alter code. Otherwise old cached builds from a previous version of the same transformer may be used. Many developers may choose to use superpose.MustLoadCurrentExeContentID(), which is the content ID of the current executable (so it changes when the exe changes). This is a reasonable default choice but it has two downsides:
go tool buildid <current exe> on every single Go compile/link command. So now every package that has to be
If either of these is a concern, the Version field can be manually maintained.
Executables for toolexec built with Superpose already accept flags like -verbose and -buildtags. Transformer authors can add their own flags, to be set by the user, using superpose.Config.AdditionalFlags. Don't forget to properly quote the flags when compiling, e.g.:
go build -toolexec "/path/to/my-transformer -myflag flag value" some_code.go
Effort has not currently been made to support step-based debuggers in toolexec. Therefore, the only approach to having development/debugging details is to use logging.
During development, superpose.Config.Verbose can be true to show a lot of output during compilation. It can also be set to true via the -verbose flag on the toolexec executable. Verbose will also include any logs to ctx.Superpose.Debugf on the context inside the transformer. Also, TransformResult.LogPatchedFiles can be set to true on the transformer result to have full patched files dumped via that same logging mechanism (so still only visible if Verbose is set).
When go build is run, here's (mostly) what happens:
compile -V=full is called to get the tool build ID to affect build IDs of the compiler's inputs/outputs
compile is run for each package, with dependencies run before dependents
go env GOCACHE to see where by default these are cached
importcfg is given which is a file containing a list of dependency packages already compiled that the
link is called to build the executable
importcfg is given which contains all built dependency packages for the entire program
When -toolexec is added to go build calls, instead of the above steps executing directly, that tool is called for each of the above steps where the compile/link/etc executables with their args just become the tool's args. Therefore Superpose just intercepts -toolexec calls.
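The argument layout a toolexec program receives can be sketched with the standard flag package: the tool's own flags come first, followed by the path of the real tool and that tool's arguments. The invocation type and parseInvocation function are hypothetical names for this example and do not reflect how Superpose parses its arguments internally; only the -buildtags and -verbose flag names come from the text above.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// invocation is a hypothetical description of one toolexec call.
type invocation struct {
	buildTags string   // value of the wrapper's own -buildtags flag
	verbose   bool     // value of the wrapper's own -verbose flag
	tool      string   // path of the real tool, e.g. the compile or link binary
	toolArgs  []string // arguments intended for the real tool
}

// parseInvocation separates the wrapper's own flags from the wrapped tool
// command. Flag parsing stops at the first non-flag argument, which is the
// tool path, so the tool's own flags are left untouched.
func parseInvocation(args []string) (*invocation, error) {
	inv := &invocation{}
	fs := flag.NewFlagSet("my-transformer", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of the build output
	fs.StringVar(&inv.buildTags, "buildtags", "", "build tags given to go via -tags")
	fs.BoolVar(&inv.verbose, "verbose", false, "enable verbose logging")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	rest := fs.Args()
	if len(rest) == 0 {
		return nil, fmt.Errorf("missing tool to execute")
	}
	inv.tool, inv.toolArgs = rest[0], rest[1:]
	return inv, nil
}

func main() {
	// Example arguments shaped like a compile-step call; in a real toolexec
	// program these would come from os.Args[1:].
	inv, err := parseInvocation([]string{"-buildtags", "mytag", "/path/to/compile", "-V=full"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("tags=%q tool=%q args=%q\n", inv.buildTags, inv.tool, inv.toolArgs)
}
```

A wrapper that does nothing else would then run the tool with its arguments unchanged, for example via os/exec, and exit with the tool's exit code.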
compile
When toolexec is executed for the compile step, Superpose does two steps defined below - "compile dimensions" and "build bridge". Then it continues the compilation, possibly using updated arguments from the last step.
If any transformers apply to the given package and if that package has not already been compiled in that dimension before for its given action ID, we run the package through the transformers as described below.
true
AddLineDirectives: true, for every file that has a patch on it, add a line directive at the top of the file
importcfg argument to a temp importcfg file containing updated dependencies that are applicable to
If there are any bridge function vars in the package:
importcfg compile arg with a new file that contains the contents of the existing file and adds new package
link
Before the downstream link call is performed, the following argument alterations are made:
importcfg file that has the contents of the old one
importcfg file that applies to a dimension, add the dimension-specific package too
importcfg file if not already
TODO(cretz):
At Temporal, workflows in Go are written using our SDK. Workflow code is required to be deterministic and isolated. Currently, Temporal just asks users not to use the non-deterministic constructs in Go (i.e. async constructs, external stuff, map ranging, global state mutation). This is part of a research project to see if we can make an insecure sandbox that does make those constructs deterministic so the code doesn't have to concern itself with safety. So we can make map ranging deterministic, do goroutine-local globals, use deterministic emulations of Go async constructs, and somewhat restrict external system access in an acceptably-not-foolproof way.
go command and injecting toolexec on build, e.g. my-go build ... would become
go build -toolexec "/path/to/my-go toolexec"
go:generate or manual code generation that writes entire patched set of source somewhere for easy compilation
internal package transformed
-modfile and really anything