Move flannel annotations into flannel setup, and use patch helpers to manage other node labels and annotations
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds a helper function for building JSON Patch operation lists,
which allows modifying a resource without having to manually
refresh the object and retry the change on conflict.
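
A minimal sketch of what such a builder could look like, using only the Go standard library. The names `PatchList`, `Add`, `Remove`, and `ToJSON` are hypothetical and are not taken from the actual helper; the sketch only shows how RFC 6902 operations with RFC 6901 escaped paths can be assembled into a single patch body.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Operation is a single RFC 6902 JSON Patch operation.
type Operation struct {
	Op    string `json:"op"`
	Path  string `json:"path"`
	Value any    `json:"value,omitempty"`
}

// PatchList is a hypothetical builder type for a list of patch operations.
type PatchList []Operation

// pointer joins path segments into a JSON Pointer, escaping "~" and "/"
// inside each segment as required by RFC 6901.
func pointer(segments ...string) string {
	r := strings.NewReplacer("~", "~0", "/", "~1")
	escaped := make([]string, len(segments))
	for i, s := range segments {
		escaped[i] = r.Replace(s)
	}
	return "/" + strings.Join(escaped, "/")
}

// Add appends an "add" operation, which also replaces an existing map key.
func (p PatchList) Add(value any, path ...string) PatchList {
	return append(p, Operation{Op: "add", Path: pointer(path...), Value: value})
}

// Remove appends a "remove" operation for the given path.
func (p PatchList) Remove(path ...string) PatchList {
	return append(p, Operation{Op: "remove", Path: pointer(path...)})
}

// ToJSON serializes the list into a JSON Patch request body.
func (p PatchList) ToJSON() ([]byte, error) {
	return json.Marshal(p)
}

func main() {
	patch := PatchList{}.
		Add("true", "metadata", "labels", "example.com/ready").
		Remove("metadata", "annotations", "example.com/old")
	body, err := patch.ToJSON()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Because the server applies the operations to whatever version of the object it currently holds, the caller does not need to fetch the object first or retry on a resource version conflict.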
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Flannel and VPN setup shouldn't be done in generic agent config as it is only
used with the embedded executor's flannel CNI.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Allows properly delegating CNI startup to the executor, so that it can be plugged in as a platform- and distro-specific implementation without relying on CLI flag hacks.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Allows importing pkg/metrics without pulling in pkg/etcd, which was causing an import loop in a follow-up commit.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* tunnel: handle pod IP reuse
A valid tunnel/session may be deleted when an IP is reused while a
Completed pod (for example a job) that was using that IP is being gc'ed.
This causes timeouts to webhooks after directDial is attempted because
the session was removed.
The solution is to track the owner of the IP and delete the entry only when
the owner pod is deleted.
Signed-off-by: Julian Vassev <jvassev@gmail.com>
* Pass GOOS into Dockerfile.local build args
Fixes issue with build-windows job not actually building for Windows
* Remove `go generate` from package-cli
We no longer use codegen in this repo
* Fix go:embed path separator on Windows
* Bump hcsshim for containerd 2.1 compat on Windows
* Include failing lister in error message
* Bump k3s-io/api and k3s-io/helm-controller for embedded CRD Windows path fix
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fixes issue where the apiserver on control-plane-only nodes does not
actually wait for a connection to etcd to be available before starting.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Apparently Kubernetes objects may not have TypeMeta (APIVersion and Kind) fields set if they come from a List response - so we can't count on the objects passed to the handler having these properly set.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Move the node password secret cleanup into its own dedicated controller
that also handles auth. We now use a filtered cache of only
node-password secrets, instead of using the wrangler secret cache,
which stores all secrets from all namespaces.
The coredns node-hosts controller also now uses a single-resource
watch cache on the coredns configmap, instead of reading it from
the apiserver every time a node changes.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
K3s stopped using node password files in v1.19 (92d04355f4), so we do not need to support migrating off these any longer.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Use the HTTPS port for helm-controller bootstrap charts instead of the apiserver internal port. In K3s the internal port does not listen on all address families, since it is only set to keep the apiserver from conflicting with the supervisor port.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
The `continue` was incorrectly changed to `return` when converting the
loop to an inline function in 4974fc7c24.
Also addresses unnecessary creation of a new kubernetes client every
time the promotion check runs.
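
The general pitfall can be illustrated as below. This is not the code from the affected commit; the function names and the node list are made up, and it assumes the loop ended up inside the inline function, so that `return` ends the whole pass instead of skipping one item.

```go
package main

import "fmt"

// visitWithReturn shows the buggy shape: the loop lives inside an inline
// function, so `return` stops processing at the first skipped node.
func visitWithReturn(nodes []string, skip map[string]bool) []string {
	var visited []string
	func() {
		for _, n := range nodes {
			if skip[n] {
				return // exits the inline function; later nodes are never visited
			}
			visited = append(visited, n)
		}
	}()
	return visited
}

// visitWithContinue shows the intended shape: only the current node is skipped.
func visitWithContinue(nodes []string, skip map[string]bool) []string {
	var visited []string
	func() {
		for _, n := range nodes {
			if skip[n] {
				continue // move on to the next node
			}
			visited = append(visited, n)
		}
	}()
	return visited
}

func main() {
	nodes := []string{"a", "b", "c"}
	skip := map[string]bool{"b": true}
	fmt.Println(visitWithReturn(nodes, skip))   // [a]
	fmt.Println(visitWithContinue(nodes, skip)) // [a c]
}
```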
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Currently only waits on etcd and kine, as other components
are stateless and do not need to shut down cleanly.
Terminal but non-fatal errors now request shutdown via context
cancellation, instead of just logging a fatal error.
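
A minimal sketch of the pattern, under the assumption that a stateful component watches a shared context and cleans up when it is cancelled. `runUntilFailure` and `failingComponent` are hypothetical names, and the boolean stands in for real work such as closing a datastore.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// failingComponent stands in for a component that hits a terminal error.
func failingComponent() error {
	return errors.New("component failed")
}

// runUntilFailure starts a stateful worker, runs component, and on error
// requests shutdown by cancelling the context instead of exiting the process.
// It reports whether the worker shut down cleanly, plus the component error.
func runUntilFailure(component func() error) (bool, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	var wg sync.WaitGroup
	clean := false

	wg.Add(1)
	go func() {
		defer wg.Done()
		<-ctx.Done() // wait for a shutdown request
		clean = true // stand-in for flushing and closing the datastore
	}()

	err := component()
	if err != nil {
		cancel() // request shutdown; the worker still gets to clean up
	}

	// In this sketch a component that never fails would block here until
	// something else cancels the context.
	wg.Wait()
	return clean, err
}

func main() {
	clean, err := runUntilFailure(failingComponent)
	fmt.Println(clean, err)
}
```

Calling something like `log.Fatal` at the point of failure would instead terminate the process immediately, without giving the worker goroutine a chance to run its cleanup.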
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
When running K3s as a subprocess for reaping or logging purposes, properly wire up signals to send it SIGINT instead of just exiting immediately.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fixes issue where member removal would be requeued until the node was deleted, or rejoined with a new name.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Wait for updated ready condition before starting netpol controller, to ensure that node IPs have been updated following a restart. The current checks only ensure that the taint is removed, which works for the initial join - but does not handle changing node IPs on restarts.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Raft is now an independent dependency, with a separate release version
* errors moved into their own subpackage
* set a default WarningUnaryRequestDuration
Signed-off-by: Derek Nola <derek.nola@suse.com>
Co-authored-by: Michael Fritch <mfritch@suse.com>