This change should hopefully address E2E failures that are happening on main. It's hard to test that as the failures are platform specific and neither my laptop nor the runner for tests on PRs matches the failing platforms.
* refactor: Check that the state storage provider is present when beginning to initialise a state store for use in a non-init command. Ensure reattached providers can be used.
Previously we passed all required providers into backend options to be used within `stateStoreConfig`, which is invoked via (Meta).Backend. The new approach enforces that the provider is present while assembling the backend options passed to (Meta).Backend from (Meta).backend, which is non-init specific. As this code is defending against users running non-init commands before an init, this place feels appropriate and isn't able to impact the init command.
* fix: Reattached PSS providers should return early when checking locks, and an empty locks file is only bad if there isn't a reattached PSS provider
* test: Assert that running init with reattached PSS provider is ok, via an E2E test that uses the reattach feature.
* fix: Allow builtin or reattached providers to be used for state stores when generating a plan file
* test: Expand E2E test to show using a reattached provider can be used for a workflow of init, plan with -out, and apply.
* chore: Replace 'io/ioutil' and format code in unmanaged e2e tests
* feat: Allow reading state store configuration from a planfile and using it to prepare a Local backend that uses the state store
* test: Assert that we can get and use state store configuration from a plan file
* test: Add integration test showing that an apply command can use a plan file to configure and use a state store
* test: Add E2E test showing pluggable state storage being used with the full init-plan-apply workflow
* feat: A plan file will report the state storage provider among its required providers, if PSS is in use.
See the code comment added in this commit. This addition does not impact an apply command as the missing provider will be detected before this code is executed. However I'm making this change so that the method is still accurate is being able to return a complete list of providers needed by the plan.
* fix: Include error messages when there is a problem parsing provider or state store config when getting a backend from a planfile
* feat: Add trace logs to BackendForLocalPlan indicating when the provider is launched and the state store is configured
* chore: Small grammar change in error diagnostic
* refactor: Remove suggestions when the plan's state store doesn't match the implementations in the provider
* test: Add test coverage of what happens when the contents of a plan file using PSS doesn't match the resources available in the project
* test: Add E2E test for using pluggable state storage with the `providers` command
Note: I've excluded the `terraform providers locks` and `terraform providers mirror` commands as they don't interact with backends.
* test: Add integration test for using pluggable state storage with the `providers` command
* refactor: Change ioutil.ReadDir to os.ReadDir
* test: Add integration test for using pluggable state storage with the `providers schema` command
* feat: Allow state store schema's to be included when schemas are marshalled into JSON output
* test: Assert that state stores are present in provider schemas returned from `providers schema`.
* test: Update existing tests to accommodate state stores being in provider schema output
* test: Update E2E test for `providers` commands to be better scoped to testing use of a state store to access and use state when generating output.
This complements TestProvidersSchema that tests that state stores in a provider are reflected in the JSON representations of the schemas that the command returns.
* chore: Replace `io/ioutil` with `io` in `providers schema` tests
* test: Add E2E test demonstrating `output` command used with PSS
* test: Add E2E test demonstrating `show` command used with PSS
* docs: Fix code comment
* test: Add integration test for using pluggable state storage with the `output` command
* test: Add integration test for using pluggable state storage with the `show` command
* test: Add E2E tests for `state list` and `state show` commands
* test: Update `mockPluggableStateStorageProvider` to log a warning during tests where the values in `MockStates` aren't compatible with the `ReadStateBytesFn` default function. Make existing test set an appropriate value in `MockStates`.
* test: Update `mockPluggableStateStorageProvider` helper to include a resource schema
* test: Add command-level test for `state list` showing integration with pluggable state storage code.
* test: Add command-level test for `state show` showing integration with pluggable state storage code.
* test: Add command-level test for `state pull` showing integration with pluggable state storage code.
* test: Add command-level test for `state identities` showing integration with pluggable state storage code.
* test: Add command-level test for `state rm` showing integration with pluggable state storage code.
* test: Add command-level test for `state mv` showing integration with pluggable state storage code.
* test: Add command-level test for `state push` showing integration with pluggable state storage code.
* test: Add command-level test for `state replace-provider` showing integration with pluggable state storage code.
* test: Change shared test fixture to not be named after a specific command under test.
This test fixure is reused across tests that need the config to define a state store but otherwise rely on the mock provider to set up the test scenario.
* test: Update test to use shared test fixture
* test: Remove redundant test fixture
The internal/command/testdata/state-commands-state-store and internal/command/testdata/state-store-unchanged test fixtures are the same.
* fix: Re-add logic for setting chunk size in the context of E2E tests using grpcwrap package
This was removed, incorrectly, in https://github.com/hashicorp/terraform/pull/37899
* refactor: Let panic happen if there's incompatibility between mock returned from `mockPluggableStateStorageProvider` and the `MockStates` that's been set in the mock
* test: Refactor to contain paths in reused variable, remove unnecessary .gitkeep
* test: Remove unneeded test code
* feat: Update the `workspace new` subcommand to work with PSS, add E2E testing
* refactor: Replace instances of `ioutil` with `os` while looking at the workspace command
* docs: Update code comments in `workspace new` command
* test: Update E2E test using PSS with workspace commands to assert state files are created by given commands
* test: Include `workspace show` in happy path E2E test using PSS
* fix: Allow DeleteState RPC to include the id of the state to delete
* test: Include `workspace delete` in happy path E2E test using PSS
* fix: Avoid assignment to nil map in mock provider during WriteStateBytes
* test: Add integration test for workspace commands when using PSS
We still need an E2E test for this, to ensure that the GRPC-related packages pass all the expected data between core and the provider.
* test: Update test to reflect changes in the test fixture configuration
* docs: Fix code comment
* test: Change test to build its own Terraform binary with experiments enabled
* refactor: Replace use of `newMeta` with reuse of Meta that we re-set the UI value on
* feat: Implement `inmem` state store in provider-simple-v6
* feat: Add filesystem state store `fs` in provider-simple-v6, no locking implemented
* refactor: Move PSS chunking-related constants into the `pluggable` package, so they can be reused.
* feat: Implement PSS-related methods in grpcwrap package
* test: Add E2E test checking an init and apply (no plan) workflow is usable with both PSS implementations
* fix: Ensure state stores are configured with a suggested chunk size from Core
---------
Co-authored-by: Radek Simko <radeksimko@users.noreply.github.com>
* Add method to allow accessing factories from locks that are in memory
* Create new getStateStoreProviderFactory method for accessing a factory from config
* Update getStateStoreProviderFactory to use in memory locks instead of reading them from the deps lock file. Update calling code to accommodate this.
* Add tests for getStateStoreProviderFactory, improve errors returned from method
* Update test following schema change in simple providers
* Move test case into e2etest package, so we protect against environments where building binaries isn't possible
* Fix issues with running test in e2etest package
* Update code comments
Co-authored-by: Radek Simko <radeksimko@users.noreply.github.com>
---------
Co-authored-by: Radek Simko <radeksimko@users.noreply.github.com>
At one point, these tests were explicitly passing an empty cli config filepath so that terraform would ignore any existing cli config file (only relevant if you are running these tests locally), but the behavior changed over time such that it was no longer working for these tests (now if the env var file path is empty, we fall back to the default). The actual behavior is reasonably correct (I've depended on passing in blank config file paths before, so I don't love it, but I don't see a need to break this either) so I've added a small method that drops a blank file in place to fix the behavior and avoid file not found errors.
The providers mirror command was updated to inspect the lock file,
however that was not part of the original intent for the command, and
it's possible that the command needs to be run without a lock file.
Since we have been honoring the lock file for the past few releases,
let's keep that consistent and allow disabling the file with
`-lock-file=false`.
When we added these functions we hadn't yet settled on a single convention
for how provider-contributed functions ought to be named, and so these
followed the convention previously used for Terraform's own built-in
functions.
We've now decided to use the new provider namespace as an opportunity to
partially fix the historical design error of naming functions as all
lowercase without any separation between words. This makes the provider
functions inconsistent with the built-in functions, but should at least
hopefully encourage the provider functions to be consistent with one
another, so that the rule for which to use is relatively easy to remember.
This also switches the word order because the provider functions convention
prefers the verb to come first: encode_tfvars instead of tfvarsencode.
Using the new possibility of provider-contributed functions, this
introduces three new functions which live in the
terraform.io/builtin/terraform provider, rather than being language
builtins, due to their Terraform-domain-specific nature.
The three new functions are:
- tfvarsencode: takes a mapping value and tries to transform it into
Terraform CLI's "tfvars" syntax, which is a small subset of HCL that
only supports key/value pairs with constant values.
- tfvarsdecode: takes a string containing content that could potentially
appear in a "tfvars" file and returns an object representing the
raw variable values defined inside.
- exprencode: takes an arbitrary Terraform value and produces a string
that would yield a similar value if parsed as a Terraform expression.
All three of these are very specialized, of use only in unusual situations
where someone is "gluing together" different Terraform configurations etc
when the usual strategies such as data sources are not suitable. There's
more information on the motivations for (and limitations of) each function
in the included documentation.
For a very long time we've had an annoying discrepancy between the
in-memory state model and our state snapshot format where the in-memory
format stores output values for all modules whereas the snapshot format
only tracks the root module output values because those are all we
actually need to preserve between runs.
That design wart was a result of us using the state both as an internal
and an external artifact, due to having nowhere else to store the
transient values of non-root module output values while Terraform Core
does its work.
We now have namedvals.State to internally track all of the throwaway
results from named values that don't need to persist between runs, so now
we'll use that for our internal work instead and reserve the states.State
model only for the data that we will preserve between runs in state
snapshots.
The namedvals internal model isn't really designed to support enumerating
all of the output values for a particular module call, but our expression
evaluator currently depends on being able to do that and so we have a
temporary inefficient implementation of that which just scans the entire
table of values as a stopgap just to avoid this commit growing even larger
than it already is. In a future commit we'll rework the evaluator to
support the PartialEval mode and at the same time move the responsiblity
for enumerating all of the output values into the evaluator itself, since
it should be able to determine what it's expecting by analyzing the
configuration rather than just by trusting that earlier evaluation has
completed correctly.
Because our legacy state string serialization previously included output
values for all modules, some of our context tests were accidentally
depending on the implementation detail of how those got stored internally.
Those tests are updated here to test only the data that is a real part
of Terraform Core's result, by ensuring that the relevant data appears
somewhere either in a root output value or in a resource attribute.
When we originally introduced the trust-on-first-use checksum locking
mechanism in v0.14, we had to make some tricky decisions about how it
should interact with the pre-existing optional read-through global cache
of provider packages:
The global cache essentially conflicts with the checksum locking because
if the needed provider is already in the cache then Terraform skips
installing the provider from upstream and therefore misses the opportunity
to capture the signed checksums published by the provider developer. We
can't use the signed checksums to verify a cache entry because the origin
registry protocol is still using the legacy ziphash scheme and that is
only usable for the original zipped provider packages and not for the
unpacked-layout cache directory. Therefore we decided to prioritize the
existing cache directory behavior at the expense of the lock file behavior,
making Terraform produce an incomplete lock file in that case.
Now that we've had some real-world experience with the lock file mechanism,
we can see that the chosen compromise was not ideal because it causes
"terraform init" to behave significantly differently in its lock file
update behavior depending on whether or not a particular provider is
already cached. By robbing Terraform of its opportunity to fetch the
official checksums, Terraform must generate a lock file that is inherently
non-portable, which is problematic for any team which works with the same
Terraform configuration on multiple different platforms.
This change addresses that problem by essentially flipping the decision so
that we'll prioritize the lock file behavior over the provider cache
behavior. Now a global cache entry is eligible for use if and only if the
lock file already contains a checksum that matches the cache entry. This
means that the first time a particular configuration sees a new provider
it will always be fetched from the configured installation source
(typically the origin registry) and record the checksums from that source.
On subsequent installs of the same provider version already locked,
Terraform will then consider the cache entry to be eligible and skip
re-downloading the same package.
This intentionally makes the global cache mechanism subordinate to the
lock file mechanism: the lock file must be populated in order for the
global cache to be effective. For those who have many separate
configurations which all refer to the same provider version, they will
need to re-download the provider once for each configuration in order to
gather the information needed to populate the lock file, whereas before
they would have only downloaded it for the _first_ configuration using
that provider.
This should therefore remove the most significant cause of folks ending
up with incomplete lock files that don't work for colleagues using other
platforms, and the expense of bypassing the cache for the first use of
each new package with each new configuration. This tradeoff seems
reasonable because otherwise such users would inevitably need to run
"terraform providers lock" separately anyway, and that command _always_
bypasses the cache. Although this change does decrease the hit rate of the
cache, if we subtract the never-cached downloads caused by
"terraform providers lock" then this is a net benefit overall, and does
the right thing by default without the need to run a separate command.
Go 1.19's "fmt" has some awareness of the new doc comment formatting
conventions and adjusts the presentation of the source comments to make
it clearer how godoc would interpret them. Therefore this commit includes
various updates made by "go fmt" to acheve that.
In line with our usual convention that we make stylistic/grammar/spelling
tweaks typically only when we're "in the area" changing something else
anyway, I also took this opportunity to review most of the comments that
this updated to see if there were any other opportunities to improve them.
This also sets an additional variable if it detects that this is an alpha
or development build, which currently does nothing but might eventually
turn on the ability to use experimental features, if we make that
something available only in prereleases.
This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.
Prior to this commit, temporary directory created using `ioutil.TempDir`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
defer func() {
if err := os.RemoveAll(dir); err != nil {
t.Fatal(err)
}
}
is also tedious, but `t.TempDir` handles this for us nicely.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
This uses the decoupled build and run strategy to run the e2etests so that
we can arrange to run the tests against the real release packages produced
elsewhere in this workflow, rather than ones generated just in time by
the test harness.
The modifications to make-archive.sh here make it more consistent with its
originally-intended purpose of producing a harness for testing "real"
release executables. Our earlier compromise of making it include its own
terraform executable came from a desire to use that script as part of
manual cross-platform testing when we weren't yet set up to support
automation of those tests as we're doing here. That does mean, however,
that the terraform-e2etest package content must be combined with content
from a terraform release package in order to produce a valid contest for
running the tests.
We use a single job to cross-compile the test harness for all of the
supported platforms, because that build is relatively fast and so not
worth the overhead of matrix build, but then use a matrix build to
actually run the tests so that we can run them in a worker matching the
target platform.
We currently have access only to amd64 (x64) runners in GitHub Actions
and so for the moment this process is limited only to the subset of our
supported platforms which use that architecture.
When an explicit backend is configured with a configuration which has
not yet been initialized, running `terraform init` performs a state
migration to fetch the remotely stored state in order to operate on it.
Like the previous bug introduced by the recent provider diagnostics
change, this code path was not correctly configured to enable init mode
for the backend, which resulted in a fatal error during init when the
cache dir is deleted.
Setting the `Init` backend option allows this code path to continue
without error when first initializing the backend for state migration.
The new e2e test fails without this change.
We test that a deleted provider cache results in an error when running
terraform plan, but previously did not test that running init (as
instructed) would resolve the issue. This (failing) e2e test adds that
step.
We have various mechanisms that aim to ensure that the installed provider
plugins are consistent with the lock file and that the lock file is
consistent with the provider requirements, and we do have existing unit
tests for them, but all of those cases mock our fake out at least part of
the process and in the past that's caused us to miss usability
regressions, where we still catch the error but do so at the wrong layer
and thus generate error message lacking useful additional context.
Here we'll add some new end-to-end tests to supplement the existing unit
tests, making sure things work as expected when we assemble the system
together as we would in a release. These tests cover a number of different
ways in which the plugin selections can grow inconsistent.
These new tests all run only when we're in a context where we're allowed
to access the network, because they exercise the real plugin installer
codepath. We could technically build this to use a local filesystem mirror
or other such override to avoid that, but the point here is to make sure
we see the expected behavior in the main case, and so it's worth the
small additional cost of downloading the null provider from the real
registry.