Since the DAG package was lifted from Terraform, its contents are more
than what we need for now, so this commit cleans up the package to keep
only the currently needed parts of the code.
If we need to support more in the future, we can revert this commit, or
pick up the changes again from Terraform.
Since this series of commits introduces the DAG only for locals and
data sources, we need to make sure that the behaviour is what we
expect.
Therefore, this commit adds a basic test with `packer build` and
`packer validate`, to evaluate a template with locals and data sources
depending on one another.
Such a template is rejected by the sequential evaluation methods, as we
process the different types one-by-one, whereas the DAG allows us to mix
the evaluation order between the two, while still rejecting circular
dependencies (before they even get evaluated) and self-references.
Evaluated local variables used to be written directly to the
PackerConfig as each variable was created.
This was somewhat of an issue for testing, as we have a bunch of tests
that relied on `PackerConfig.Variables` being set only when we actually
write something.
This is not really a concern for normal use, just for testing, but to
limit the number of changes to the tests in hcl2template, I opted to
change how variables' values are retained: evaluating a single variable
now returns a Variable in addition to hcl.Diagnostics, so we only
create the map of variables if something was actually evaluated.
When preparing the datasources to add to the DAG for evaluating the
build prerequisites, we ended up in a weird situation in which every
vertex pointed to the same datasource.
This is because of Go's loop semantics, where the loop variable is
reused across iterations, so in the end every datasource vertex pointed
to the same instance of a datasource block.
To avoid this, we instead grab them through their reference, making the
reference to the datasource purely local, and pointing to the actual
datasource block, not the copy scoped to the loop.
Walk visits the graph in reverse topological order, visiting vertices
concurrently when possible.
This is nice, as it can speed up execution of datasources and locals;
however, since the `Variables` map stored in the config, and the
production of the evaluation context from it, are not meant to be used
concurrently, we end up with cases where Packer crashes because of
concurrent accesses to that map.
So until we can change this behaviour, we fall back to the sequential
visit algorithm for those vertices, limiting the risk of those
conflicts.
Local variables had an attribute called Name, containing the name of
the local variable.
However, when an error was encountered while walking the DAG of
locals/datasources during validation, the raw structure of the vertex
was printed out, making the error message hard to understand.
Therefore, to clean it up, we rename the `Name` attribute for local
variables to `LocalName`, and introduce a `Name()` function for that
block so that the complete name of the variable is clearly reported.
The implementation of the DAG as extracted from Terraform relied on a
Root vertex being injected into the graph as the last node to visit.
This is used as a sanity check for Terraform, but doesn't apply to our
use-case for now, as we are always executing everything and have no need
for this root node.
Instead, we change how Validate operates so that it does not error when
there is no valid root node for the graph, but still lets us call it to
check for self-referencing edges and circular dependencies.
For all the commands that call Initialise, we introduce a new flag:
UseSequential.
This flag disables DAG scheduling for evaluating datasources and
locals, serving as a fallback from the newly introduced DAG scheduling
approach.
`hcl2_upgrade` is a special case here: since the template is always
JSON, there cannot be any datasource, so the DAG becomes meaningless
and is not integrated into this code path.
Following up on the DAG work, this commit adds a new option for
initialisation that disables the DAG on request.
By default we are going to use the DAG approach, with an option to fall
back to the older evaluation algorithm in case users end up in an edge
case that prevents them from building a template.
As we have finished setting up the codebase for it, this commit adds
the logic that uses the internal DAG package, and is able to
orchestrate evaluation of datasources and locals in a non-phased way.
Instead, this code first detects the dependencies of those components,
builds a graph from them, with edges representing the dependency links
between them, and finally walks the graph to evaluate those components
in dependency order.
This can act as a drop-in replacement for the current phased logic, but
both should be supported until we are confident that the approach
works, and that there are few to no bugs left to squash.
The dag package is a port over from Terraform to Packer, changing what
little there was to fit our current dependency ecosystem.
Most of the changes are on the type of diagnostics returned, as
Terraform has its own type for them, while we rely on hcl's Diagnostics.
Other than that, the functionality is essentially equivalent, and the
code was barely touched.
When registering dependencies for datasources and locals, we now use
refString.
This allows the dependency-detection functions to register not only
dependencies of their own type, but generalises to any type that
refString supports, so data, local and var.
This can then be leveraged for orchestrating evaluation of those
components in a non-phased way (i.e. with a DAG for dependency
management).
As with variables, this commit introduces a new function whose purpose
is evaluating the datasource directly, without looking at its
dependencies, or recursively trying to execute them.
Instead, we rely on the DAG to determine when it is safe to execute it,
or if using the phased approach, the current logic will apply.
Since we are in the process of integrating a new way to orchestrate
dependency management and evaluation for datasources and local
variables, we need to split the current function that manages the
evaluation of said local variables, so that recursion and the actual
evaluation of a local variable are two separate functions.
This will allow for evaluating a single variable once the DAG is ready
to be introduced.
The `Name` function is used by the DAG library to determine which
vertex is associated with a component, so the `Name` attribute/function
needs to replicate the combination of type and name, so the vertex can
be accurately fetched from the graph.
The hcl2template package contains references already, but these are
linked to a particular type.
This becomes problematic if we want to support cross-type references, so
this commit adds a new abstraction: refString.
A refString contains the component type, its type (if applicable), and
its name, so that the combination of those points to a cty object that
can be linked to a block in the configuration.
Right now, only `var`, `local` and `data` are supported, but the type
is extensible enough that anything else fitting this model, such as
sources, could potentially be supported in the future.
In HCP's metadata package, especially the VCS/git parts, we keep the
current HEAD for a repository, along with the state it is in, in order
to report it to HCP Packer when the build completes.
However, when a build is run on a template from an empty Git repository,
and HCP Packer is enabled, the code would crash when trying to get the
information on the current HEAD, as it doesn't exist.
The git library we use returns an error in such a case, but it was
ignored, leading to a crash when attempting to get the hash of this
reference later on.
This commit fixes the problem by not ignoring the error when getting
the HEAD, and immediately stopping processing of the git data, as it
doesn't exist yet.
To make the creation of a temporary directory for installing plugins
simpler to understand and use, we change how the directory is created,
cleaned up, and used to install plugins.
Now, instead of a tuple of a path string and a cleanup function, we
return a structure comprising the test suite and the temporary
directory, along with methods to handle those steps independently.
When using a PackerCommand, the Run function was made public as a way
to access the contents of an execution.
This was clumsy, as it had too many responsibilities and was not
strictly needed, since Assert was performing the executions, as many
times as required.
This could lead to cases in which one run was spent by the caller, and
the remainder were executed through Assert.
Therefore, we change this convention.
Now, run is private to the type, and only through Assert can a command
be executed.
If a test needs access to a command's output, stderr, or error, it can
do so through the Output function, which requires Assert to be called
first.
When a command asserts its output with checkers, by default it
registers errors through t.Errorf.
While this works, in some cases we want to stop execution immediately
if a command's Assert fails, as the rest of the test may depend on the
assertion being valid.
In the current state, this means either getting the result of the run
to check whether an error was returned (not fully reliable: if the
command was run multiple times and the last run succeeded, we won't get
an error), or relying on t.IsFailed() (completely reliable).
Instead, we introduce a new function on packerCommand that lets users
change how Assert behaves: if an error is reported, instead of logging
it and flagging the test as failed, we can use t.Fatalf, so that the
test immediately fails and stops execution.
The lib name for the common components for writing packer_test suites
was not clear, and did not follow the convention established in Packer
core and plugins.
Therefore this commit does two things: first, the lib is renamed to
common, so as to follow this convention and clearly document which
components are common to all tests.
Second, checkers are placed in a subpackage of common, common/check, so
that it is clearer what is meant to be used as checks for a command's
execution status after it has run, as part of Assert.
The function's name was the same as the plugins test suite runner/init
function, making it impossible to differentiate between the two when
looking at the logs, or when attempting to filter which tests to run
using the suite function's name.
The interface for building a plugin through the test suite was
confusing, as it would build the plugin, cache the path for the version
built, and return the path regardless of whether anything was actually
built.
While in the current state this is harmless, as builds are idempotent
since the state of the plugin package/module does not change, this will
change in the future as we introduce customisation techniques for the
plugin's directory and files, making this double use potentially
dangerous.
Furthermore, the current behaviour is unclear, as the function hides
that caching mechanism, which could come as a surprise for users
attempting to build a plugin for the duration of a test, while the built
plugin is linked to the test suite being run, and not the unit test
being evaluated.
Therefore this commit changes the sequence in which plugins are built
and used. Now the `CompilePlugin` function builds a plugin and no
longer returns its path, instead terminating the tests immediately if
the build fails.
In normal test usage, a new `GetPluginPath` function is introduced,
which looks up the path in the suite's cache, failing immediately if
invoked before the plugin is built.
With this change, it is strongly advised to build plugins when
initialising the suite; then, in the tests, the GetPluginPath function
should be used to get a plugin's path for interacting with Packer
commands.
The compiledPlugins map guarded by a mutex was essentially a glorified
sync.Map, and therefore did not need to be defined as such.
Instead, this commit replaces it with a plain sync.Map, kept in the
base test suite.
When testing a packer build or validate, one common use case is to check
which plugins are loaded and used by Packer to run a command.
This is generally done through Grep, but repeating the same pattern can
be redundant, and if the output changes, all those need to be updated.
Therefore, this commit introduces a PluginsUsed gadget, which can be
orchestrated to ensure a plugin is used, or not used, and allows
checking for multiple plugins at once.
As a small reorganisation attempt, this commit moves some
filesystem-oriented functions from plugin.go into their own file so
they are more logically grouped.
In terms of organisation, we keep functions that interact with plugins
in their own file, so the function that builds plugin versions should
really be in plugin.go, not in suite.go.
The compiledPlugins map used to be a global variable, which can be
problematic as we move to independent test suites: since those suites
run in the same process space, the global variable could point to
plugin versions/paths already deleted by a separate suite, which would
make tests fail with random errors.
To avoid this, the map is now scoped to the test suite, and a new copy
is created lazily when the suite first uses it.
Having only one test suite for the whole of Packer makes it harder to
segregate between test types, and makes for a longer runtime as no tests
run in parallel by default.
This commit splits the packer_test suite into several components in
order to make extension easier.
First we have `lib`: this package embeds the core for running Packer
test suites. It ships facilities for building your own test suite for
Packer core, and exposes convenience methods and structures for
building plugins and Packer core, and for using them to run a test
suite in a temporary directory.
Then we have two separate test suites: one for plugins, and one for core
itself, the latter of which does not depend on plugins being compiled at
all.
This sets the stage for more specialised test suites in the future, each
of which can run in parallel on different parts of the code.