Go-plugin now reports stack traces via an Error log level, so we need
to update the logger to watch that level instead. We also must
explicitly enable logging for the plugin, because go-plugin will
otherwise not send any logs at all.
Graph transformations are really only interesting for developers
directly working on graph transformers. Otherwise the complete graph can
be printed once in the trace log for general debugging purposes.
Removing the intermediate graph steps can reduce log output for large
configs by many megabytes per run. On top of the volume of output, the
graph string generation can cause a noticeable impact on performance with
large graphs.
We were importing this to resolve an init-time conflict with this library
when it was indirectly loaded by the etcd libraries.
We removed the etcd backends a while back and no longer use any of
the etcd modules in Terraform, so this tricky import was our only
remaining reference to github.com/coreos/pkg/capnslog.
Dropping this eliminates two unnecessary dependencies.
Programs that monitor Terraform's output to report panics might
make the reasonable assumption that the string "panic:" is always
included and is therefore safe to monitor for. Our custom panic
output unfortunately breaks these assumptions at the moment.
Instead of asking consumers to add their own handling to deal with
this problem, let's add that greppable string to our custom panic
output.
Our goal with this panic-interception was to largely mimic how the Go
runtime would normally report panics except for two intentional
exceptions: an extra prompt explaining to the user that Terraform crashed,
and exiting with status code 11 instead of 2.
Unfortunately we accidentally deviated in a different way: we're reporting
to whatever os.Stderr happens to refer to, instead of to the real process
stderr. It seems like that shouldn't really matter, but unfortunately
go-plugin intentionally changes os.Stderr to refer to a totally separate
stream that it manages, causing the captured panic messages to be routed
over a grpc-based channel to the plugin client.
This deviation makes the panic messages invisible to the usual strategies
for heuristically detecting that a Go program has panicked. Without
a special interception like Terraform is doing here, the Go runtime
writes directly to the stderr file descriptor without going through the
os.Stderr abstraction, and so to achieve consistent behavior we need to
do a little hoop-jumping to approximate that result.
In particular, this makes the behavior now consistent with what happens
when a provider plugin running as a child of Terraform Core panics, and
so a system which tries to sniff stderr for content that seems like a
panic message will be able to handle both situations equally and avoid
making a special case for Terraform Core/CLI's own panics.
etcd rewrote its import path from coreos/etcd to go.etcd.io/etcd.
Changed the import paths in this commit, which also updates the code
version.
This lets us remove the github.com/ugorji/go/codec dependency, which
was pinned to a fairly old version. The net change is a loss of 30,000
lines of code in the vendor directory. (I first noticed this problem
because the outdated go/codec dependency was causing a dependency
failure when I tried to put Terraform and another project in the same
vendor directory.)
Note the version shows up funkily in go.mod, but I verified
visually it's the same commit as the "release-3.4" tag in
github.com/coreos/etcd. The etcd team plans to fix the release version
tagging in v3.5, which should be released soon.
Once a plugin process is started, go-plugin will redirect the stdout and
stderr stream through a grpc service and provide those streams to the
client. This is rarely used, as it is prone to failing with races
because those same file descriptors are needed for the initial handshake
and logging setup, but data may be accidentally sent to these streams
nonetheless.
The usual culprits are stray `fmt.Print` usage where logging was
intended, or the configuration of a logger after the os.Stderr file
descriptor was replaced by go-plugin. These situations are very hard for
provider developers to debug since the data is discarded entirely.
While there may be improvements to be made in the go-plugin package to
configure this behavior, in the meantime we can add a simple monitoring
io.Writer to the streams which will surface the data as warnings in the
logs instead of writing it to `io.Discard`.
This is not currently a supported interface, but we plan to later
release tool(s) that consume the more dependable parts of it,
separately from Terraform CLI itself.
When logging is turned on, panicwrap will still see provider crashes and
falsely report them as core crashes, hiding the formatted provider
error. We can trick panicwrap by slightly obfuscating the error line.
Create a logger that will record any apparent crash output for later
processing.
If the cli command returns with a non-zero exit status, check for any
recorded crashes and add those to the output.
Now that hclog can independently set levels on related loggers, we can
separate the log levels for different subsystems in Terraform.
This adds the new environment variables `TF_LOG_CORE` and
`TF_LOG_PROVIDER`, which each take the same set of log level arguments
and apply only to logs from that subsystem. This means that setting
`TF_LOG_CORE=level` will not show logs from providers, and
`TF_LOG_PROVIDER=level` will not show logs from core. The behavior of
`TF_LOG` alone does not change.
While it is not necessarily needed since the default is to disable logs,
there is also a new level argument of `off`, which reflects the
associated level in hclog.
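
To illustrate the lookup, here is a sketch of how a level might be
resolved per subsystem. The `levelFor` helper is hypothetical, and two
details are assumptions not stated above: the subsystem variable is
consulted before `TF_LOG`, and unrecognized values are treated as off.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// validLevels lists the level names accepted by this sketch.
var validLevels = map[string]bool{
	"TRACE": true, "DEBUG": true, "INFO": true,
	"WARN": true, "ERROR": true, "OFF": true,
}

// levelFor is a hypothetical helper that resolves the log level for
// "core" or "provider" using the supplied environment lookup.
func levelFor(subsystem string, getenv func(string) string) string {
	level := ""
	switch subsystem {
	case "core":
		level = getenv("TF_LOG_CORE")
	case "provider":
		level = getenv("TF_LOG_PROVIDER")
	}
	if level == "" {
		// Assumption: fall back to TF_LOG, which covers both subsystems.
		level = getenv("TF_LOG")
	}
	level = strings.ToUpper(level)
	if !validLevels[level] {
		// Assumption: unset or unrecognized values leave logging off.
		return "OFF"
	}
	return level
}

func main() {
	fmt.Println("core:", levelFor("core", os.Getenv))
	fmt.Println("provider:", levelFor("provider", os.Getenv))
}
```

Passing the lookup function in, rather than calling `os.Getenv`
directly, makes the resolution easy to exercise in tests.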
Use a separate log sink to always capture trace logs for the panicwrap
handler to write out in a crash log.
This requires creating a log file in the outer process and passing that
path to the child process to log to.
Use a single log writer instance for all std library logging.
Set up the std log writer in the logging package, and remove boilerplate
from test packages.