* Add store tests with fixtures
* Try connecting to local etcd first, if it is available
* Handle panics from etcd backend code
* Don't try to read WAL and restore v3 snapshots as they almost never exist
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Reconcile against local etcd would short-circuit and skip reading from the datastore if the cert dirs were missing.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fixes an issue where copying files out from under a currently-running etcd instance can cause startup reconcile to fail. Direct creation of a mvcc store without any of the raft stuff is faster, and gives us direct control over how the store handles snapshot recovery.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fixes issue where the apiserver on control-plane-only nodes does not
actually wait for a connection to etcd to be available before starting.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Currently only waits on etcd and kine, as other components
are stateless and do not need to shut down cleanly.
Terminal but non-fatal errors now request shutdown via context
cancellation, instead of just logging a fatal error.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
- Add testlet for new provider switch
- Handle migration between providers
- Add exception for criticalcontrolargs
Signed-off-by: Derek Nola <derek.nola@suse.com>
We are not making use of the stack traces that these functions capture, so we should avoid using them as unnecessary overhead.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fixes issue where CA rotation would fail on servers with join URL set due to using old data from disk on other server
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Also silences warnings about bootstrap fields that are not intended to be handled by CA rotation
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Write the extra metadata both locally and to S3. These files are placed such that they will not be used by older versions of K3s that do not make use of them.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Allow bootstrapping with kubeadm bootstrap token strings or existing
Kubelet certs. This allows agents to join the cluster using kubeadm
bootstrap tokens, as created with the `k3s token create` command.
When the token expires or is deleted, agents can successfully restart by
authenticating with their kubelet certificate via node authentication.
If the token is gone and the node is deleted from the cluster, node auth
will fail and they will be prevented from rejoining the cluster until
provided with a valid token.
Servers still must be bootstrapped with the static cluster token, as
they will need to know it to decrypt the bootstrap data.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Add EncryptSecrets to Critical Control Args
* use deep comparison to extract differences
Signed-off-by: Derek Nola <derek.nola@suse.com>
Signed-off-by: Derek Nola <derek.nola@suse.com>
Problem:
Using defer inside a loop can lead to resource leaks
Solution:
Judge newer file in the separate function
Signed-off-by: iyear <ljyngup@gmail.com>
Since #4438 removed 2-way sync and treats any changed+newer files on disk as an error, we no longer need to determine if files are newer on disk/db or if there is a conflicting mix of both. Any changed+newer file is an error, unless we're doing a cluster reset in which case everything is unconditionally replaced.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Properly skip restoring bootstrap data for files that don't have a path
set because the feature that would set it isn't enabled.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Reuse the existing etcd library code to start up the temporary etcd
server for bootstrap reconcile. This allows us to do proper
health-checking of the datastore on startup, including handling of
alarms.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Several types contained redundant references to ControlRuntime data. Switch to consistently accessing this via config.Runtime instead.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
If you don't explicitly close the etcd client when you're done with it,
the GRPC connection hangs around in the background. Normally this is
harmelss, but in the case of the temporary etcd we start up on 2399 to
reconcile bootstrap data, the client will start logging errors
afterwards when the server goes away.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Enable reconcilation on secondary servers
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Remove unused code
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Attempt to reconcile with datastore first
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Added warning on failure
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Update warning
Signed-off-by: Derek Nola <derek.nola@suse.com>
* golangci-lint fix
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Regular CLI framework for encrypt commands
* New secrets-encryption feature
* New integration test
* fixes for flaky integration test CI
* Fix to bootstrap on restart of existing nodes
* Consolidate event recorder
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Add etcd extra args support for K3s
Signed-off-by: Chris Kim <oats87g@gmail.com>
* Add etcd custom argument integration test
Signed-off-by: Chris Kim <oats87g@gmail.com>
* go generate
Signed-off-by: Chris Kim <oats87g@gmail.com>