When a container restarts before kubelet restarts, containerMap has
multiple entries (old exited + new running). GetContainerID() may
return the exited container, causing the running check to fail. Fixed
by checking if ANY container for the pod/name is running.
Also filter terminal pods from podresources since they no longer
consume resources, and fix test error handling to avoid exiting
Eventually immediately on transient errors.
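A minimal sketch of the first fix, using simplified stand-in types rather than the kubelet's real containerMap structures:

```go
package main

import "fmt"

// containerInfo is a simplified stand-in for a containerMap entry; the real
// structure lives in the kubelet's containermap package and differs in detail.
type containerInfo struct {
	podUID        string
	containerName string
	running       bool
}

// anyContainerRunning illustrates the fix: instead of trusting a single
// GetContainerID() lookup (which may return the old, exited container after a
// container restart), succeed if ANY entry for the pod/container name is running.
func anyContainerRunning(entries map[string]containerInfo, podUID, name string) bool {
	for _, ci := range entries {
		if ci.podUID == podUID && ci.containerName == name && ci.running {
			return true
		}
	}
	return false
}

func main() {
	entries := map[string]containerInfo{
		"old-id": {podUID: "uid-1", containerName: "app", running: false}, // exited before the kubelet restart
		"new-id": {podUID: "uid-1", containerName: "app", running: true},  // the restarted container
	}
	fmt.Println(anyContainerRunning(entries, "uid-1", "app")) // true
}
```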
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
With kubernetes/kubernetes#132028 merged, pods in terminal states are no longer
reported by the podresources API. The previous test logic accounted for the old
behavior where even failed pods appeared in the API response (tracked under k/k
issue #119423). As a result, we used to expect the failed test pod to be present in the
response but with an empty device set.
This change updates the test to reflect the new, correct behavior:
1. The failed test pod should no longer appear in the podresources API response.
2. The test now asserts absence of the failed pod rather than checking for an empty device assignment.
This simplifies the test logic and aligns expectations with the current upstream
behavior of the podresources API.
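A sketch of the updated assertion, using the podresources v1 client types; the helper name is illustrative:

```go
package e2enode

import (
	"github.com/onsi/gomega"
	kubeletpodresourcesv1 "k8s.io/kubelet/pkg/apis/podresources/v1"
)

// expectPodAbsentFromPodResources asserts the new behavior: a terminal (failed)
// pod must not appear in the podresources List() response at all, rather than
// appearing with an empty device set as the old test logic expected.
func expectPodAbsentFromPodResources(resp *kubeletpodresourcesv1.ListPodResourcesResponse, ns, name string) {
	for _, pr := range resp.GetPodResources() {
		gomega.Expect(pr.GetNamespace() == ns && pr.GetName() == name).To(gomega.BeFalse(),
			"terminal pod %s/%s should not be reported by the podresources API", ns, name)
	}
}
```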
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
In the device plugin node reboot e2e test, the test previously waited a
short period for the resources exported by the sample device plugin to
appear on the local node. On slower test nodes, the plugin may take
longer to register, causing flakes where the expected devices are not
yet available.
This change increases the polling duration to 2 minutes, ensuring the test
waits long enough for the expected device capacity and allocatable resources
to appear, improving test stability.
This commit also makes the assertion message more explicit, making failures
clearer and the test easier to debug.
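A sketch of the extended wait, polling the node object gomega-style; the helper name, resource handling, and polling interval are illustrative:

```go
package e2enode

import (
	"context"
	"time"

	"github.com/onsi/gomega"
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// waitForSampleDeviceResources polls the node until the sample device plugin's
// resource shows up in both capacity and allocatable, giving slow plugin
// registration up to 2 minutes instead of the previous short wait.
func waitForSampleDeviceResources(ctx context.Context, cs kubernetes.Interface, nodeName string, resourceName v1.ResourceName, expected int64) {
	gomega.Eventually(ctx, func(ctx context.Context) int64 {
		node, err := cs.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
		if err != nil {
			return 0
		}
		capacity := node.Status.Capacity[resourceName]
		allocatable := node.Status.Allocatable[resourceName]
		if capacity.Value() != allocatable.Value() {
			return 0
		}
		return allocatable.Value()
	}).WithTimeout(2*time.Minute).WithPolling(5*time.Second).Should(gomega.Equal(expected),
		"expected node %q to expose %d %q devices in capacity and allocatable", nodeName, expected, resourceName)
}
```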
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
In the device plugin node reboot e2e test, the registration trigger
(control file deletion) was being executed immediately after pod creation.
This could create a race condition: the device plugin container might not
be fully running, causing the test to flake when devices were not reported
as available on the node.
This change explicitly waits for the sample device plugin pod to reach the
Running/Ready state before deleting the registration control file. This
ensures that the device plugin is ready to register its devices with the
kubelet, eliminating a possible source of test flakiness.
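A sketch of the added wait, using plain client-go polling; the helper name and timeouts are illustrative:

```go
package e2enode

import (
	"context"
	"time"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// waitForDevicePluginPodReady blocks until the sample device plugin pod is
// Running and Ready, so that deleting the registration control file afterwards
// cannot race with the plugin container still starting up.
func waitForDevicePluginPodReady(ctx context.Context, cs kubernetes.Interface, ns, podName string) error {
	return wait.PollUntilContextTimeout(ctx, 2*time.Second, 2*time.Minute, true, func(ctx context.Context) (bool, error) {
		pod, err := cs.CoreV1().Pods(ns).Get(ctx, podName, metav1.GetOptions{})
		if err != nil {
			return false, nil // tolerate transient API errors while polling
		}
		if pod.Status.Phase != v1.PodRunning {
			return false, nil
		}
		for _, cond := range pod.Status.Conditions {
			if cond.Type == v1.PodReady {
				return cond.Status == v1.ConditionTrue, nil
			}
		}
		return false, nil
	})
}
```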
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
With the device plugin node reboot test fixed, we can see in testgrid
[node-kubelet-containerd-flaky](https://testgrid.k8s.io/sig-node-containerd#node-kubelet-containerd-flaky)
that the test is passing consistently and we can remove the flaky label.
With the test not flaky anymore, we can validate new PRs against it
and ensure we don't cause regressions.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
We have an e2e test which tries to ensure device plugin assignments to pods
are kept across node reboots. This test has been permafailing for many weeks
at the time of writing (xref: #128443).
The problem is that closer inspection reveals the test was well intentioned,
but puzzling:
The test runs a pod, then restarts the kubelet, then _expects the pod to
end up in admission failure_ and yet _expects the device assignment to be
kept_! https://github.com/kubernetes/kubernetes/blob/v1.32.0-rc.0/test/e2e_node/device_plugin_test.go#L97
A reader can legitimately wonder whether this means the device will be kept busy forever.
Luckily, this is not the case. The test, however, embodied the kubelet
behavior of the time, in turn caused by #103979:
the device manager used to record the last admitted pod and forcibly add it
to the list of active pods. The retention logic had room for exactly one
pod, the last one that attempted admission.
This retention prevented the cleanup code
(see: https://github.com/kubernetes/kubernetes/blob/v1.32.0-rc.0/pkg/kubelet/cm/devicemanager/manager.go#L549
compare to: https://github.com/kubernetes/kubernetes/blob/v1.31.0-rc.0/pkg/kubelet/cm/devicemanager/manager.go#L549)
from clearing the registration, so the device was still (mis)reported
as allocated to the failed pod.
This fact was in turn leveraged by the test in question:
the test uses the podresources API to learn about the device assignment,
and because of the chain of events above the pod failed admission yet
was still reported as owning the device.
What happened, however, was that the next pod attempting admission would
replace the previous pod in the device manager data, so the previous pod
was no longer forced into the active list, and its assignment was correctly
cleared once the cleanup code ran.
And the cleanup code runs, among other occasions, every time the device
manager is asked to allocate devices and every time the podresources API
queries the device assignment.
Later, in PR https://github.com/kubernetes/kubernetes/pull/120661
the forced retention logic was removed from all the resource managers,
thus also from the device manager, and this is what caused the permafailure.
Because of all of the above, it should be evident that the e2e test was
actually enforcing a very specific behavior that never really worked as
intended, and which was also quite puzzling for users.
The best we can do is to fix the test to record and ensure that
pods which did fail admission _do not_ retain device assignment.
Unfortunately, we _cannot_ guarantee the desirable property that pods
which went running retain their device assignment across node reboots.
In the kubelet restart flow, all pods race to be admitted, and no order is
enforced between device plugin pods and application pods. An application
pod keeps its assignment only if it is lucky enough to _lose_ the race with
both the device plugin (which must go running before the app pod does) and
_also_ with the kubelet (which needs to mark the devices healthy before the
pod tries admission).
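A sketch of what the fixed test can assert, using the podresources v1 types; the helper name is hypothetical:

```go
package e2enode

import (
	kubeletpodresourcesv1 "k8s.io/kubelet/pkg/apis/podresources/v1"
)

// devicesForPod collects every device ID the podresources API reports for the
// given pod. The fixed test can use it to assert that a pod which failed
// admission holds no devices, instead of the old, accidental expectation that
// it still owned one.
func devicesForPod(resp *kubeletpodresourcesv1.ListPodResourcesResponse, ns, name string) []string {
	var devices []string
	for _, pr := range resp.GetPodResources() {
		if pr.GetNamespace() != ns || pr.GetName() != name {
			continue
		}
		for _, cnt := range pr.GetContainers() {
			for _, dev := range cnt.GetDevices() {
				devices = append(devices, dev.GetDeviceIds()...)
			}
		}
	}
	return devices
}

// Example assertion in the fixed test (sketch):
//   gomega.Expect(devicesForPod(resp, failedPod.Namespace, failedPod.Name)).To(gomega.BeEmpty())
```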
Signed-off-by: Francesco Romani <fromani@redhat.com>
Feature:DevicePluginProbe and NodeFeature:DevicePluginProbe
are not used by any of the test-infra jobs.
This commit renames NodeFeature:DevicePluginProbe to NodeFeature:DevicePlugin
and removes Feature:DevicePlugin and Feature:DeviceManager to avoid
having both Feature and NodeFeature tags for the same feature.
NOTE: Test-infra SIG-Node jobs should focus on
NodeFeature:DevicePlugin to run generic Device Plugin tests.
This changes the test registration so that, for tags for which the framework
has a dedicated API (features, feature gates, slow, serial, etc.), those APIs
are used.
Arbitrary, custom tags are still left in place for now.
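A rough sketch of the intended registration style; the helper and label names (SIGDescribe, nodefeature.DevicePlugin, framework.WithSerial) are assumptions about the current e2e framework API rather than verified signatures:

```go
package e2enode

import (
	"context"

	"github.com/onsi/ginkgo/v2"

	"k8s.io/kubernetes/test/e2e/framework"
	"k8s.io/kubernetes/test/e2e/nodefeature"
)

// Tags with a dedicated framework API (features, node features, Serial, Slow, ...)
// are expressed through that API instead of free-form text in the spec name.
var _ = SIGDescribe("Device Plugin", nodefeature.DevicePlugin, framework.WithSerial(), func() {
	f := framework.NewDefaultFramework("device-plugin-test")
	ginkgo.It("reports allocatable sample devices", func(ctx context.Context) {
		_ = f // test body elided in this sketch
	})
})
```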
This test depends on CDI support in a runtime and doesn't work
with out-of-the-box Containerd. Marking it as a NodeSpecialFeature
should fix Containerd CI job failures.
The kubelet restarts workload pods with an exponential back-off delay,
capped at a maximum of 5 minutes. Waiting only 1 minute may therefore fall
entirely within the back-off window.
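A sketch of a wait that outlasts the maximum back-off; the helper name, timeout, and polling interval are illustrative:

```go
package e2enode

import (
	"context"
	"time"

	"github.com/onsi/gomega"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// waitForContainerRestart waits for the pod's first container to restart with a
// budget larger than the 5 minute back-off cap, so the wait cannot land entirely
// inside the back-off window.
func waitForContainerRestart(ctx context.Context, cs kubernetes.Interface, ns, podName string, previousRestarts int32) {
	gomega.Eventually(ctx, func(ctx context.Context) int32 {
		pod, err := cs.CoreV1().Pods(ns).Get(ctx, podName, metav1.GetOptions{})
		if err != nil || len(pod.Status.ContainerStatuses) == 0 {
			return previousRestarts
		}
		return pod.Status.ContainerStatuses[0].RestartCount
	}).WithTimeout(6*time.Minute).WithPolling(10*time.Second).Should(gomega.BeNumerically(">", previousRestarts),
		"expected the container in pod %s/%s to restart despite back-off", ns, podName)
}
```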
Signed-off-by: Ruquan Zhao <ruquan.zhao@arm.com>
Make sure orphaned pods (pods deleted while the kubelet is down) are
handled correctly.
Outline:
1. create a pod (not static pod)
2. stop kubelet
3. while kubelet is down, force delete the pod on API server
4. restart kubelet
the pod becomes an orphaned pod and is expected to be killed by HandlePodCleanups.
There is a similar test already, but here we want to check device
assignment.
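A sketch of the outline above; stopKubelet/restartKubelet are passed in as stand-ins for the e2e_node helpers that manage the kubelet service:

```go
package e2enode

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/utils/ptr"
)

// runOrphanedPodScenario follows the outline above. The pod is assumed to be
// already created and running (step 1) and to not be a static pod.
func runOrphanedPodScenario(ctx context.Context, cs kubernetes.Interface, ns, podName string, stopKubelet, restartKubelet func()) error {
	// 2. stop the kubelet
	stopKubelet()

	// 3. while the kubelet is down, force delete the pod on the API server
	if err := cs.CoreV1().Pods(ns).Delete(ctx, podName, metav1.DeleteOptions{
		GracePeriodSeconds: ptr.To[int64](0),
	}); err != nil {
		return err
	}

	// 4. restart the kubelet; the now-orphaned pod is expected to be killed by
	//    HandlePodCleanups, releasing its device assignment.
	restartKubelet()
	return nil
}
```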
Signed-off-by: Francesco Romani <fromani@redhat.com>
The recently added e2e device plugin test covering node reboot
works fine when it runs each time on a fresh environment (e.g. CI), but
it doesn't correctly handle a partial setup when run repeatedly on
the same instance (developer setup).
To accommodate both flows, we extend the error management, checking
more error conditions in the flow.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Fix e2e device manager tests.
Most notably, the workload pods need to survive a kubelet
restart. Update the tests to reflect that.
Signed-off-by: Francesco Romani <fromani@redhat.com>
image_list.go is one of the files included in the non-test variant of the Go build list, but its getSampleDevicePluginPod function references the readDaemonSetV1OrDie function defined in device_plugin_test.go, which is included only in the test variant of the Go build list (the file name is *_test.go).
As a result, "go build" fails with the undefined reference error.
In practice, that may not be an issue since k8s project contributors aren't meant to run go build on this package. However, tools that depend on go build to operate - e.g., gopls or govulncheck ./... - will report this as an error.
Fix this error and make the test/e2e package pass go build by making this file test-only source code as well.
Additional test cases added:
Keeps device plugin assignments across pod and kubelet restarts (no device plugin re-registration)
Keeps device plugin assignments after the device plugin has re-registered (no kubelet or pod restart)
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Add a test suite to simulate node reboot (achieved by removing pods
using CRI API before kubelet is restarted).
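A sketch of the reboot simulation against the CRI v1 RuntimeService; how the client is obtained is left out, and the exact method signatures may differ slightly across releases:

```go
package e2enode

import (
	"context"

	internalapi "k8s.io/cri-api/pkg/apis"
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

// removeAllPodSandboxes simulates a node reboot: with the kubelet stopped,
// every pod sandbox is stopped and removed through the CRI API, then the
// kubelet is restarted by the caller.
func removeAllPodSandboxes(ctx context.Context, rs internalapi.RuntimeService) error {
	sandboxes, err := rs.ListPodSandbox(ctx, &runtimeapi.PodSandboxFilter{})
	if err != nil {
		return err
	}
	for _, sb := range sandboxes {
		// Stop before removing, mirroring what a reboot does to running pods.
		if err := rs.StopPodSandbox(ctx, sb.Id); err != nil {
			return err
		}
		if err := rs.RemovePodSandbox(ctx, sb.Id); err != nil {
			return err
		}
	}
	return nil
}
```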
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
This test captures the scenario where, after a kubelet restart, an
application pod comes up while the device plugin pod hasn't re-registered
itself, and the pod fails with an admission error. It is worth noting that
once the device plugin pod has registered itself, another
application pod requesting devices ends up running
successfully.
For the test case where the kubelet is restarted and the device plugin
has re-registered without involving a pod restart, since the pod ends up
with an admission error after the kubelet restart, we cannot be certain
which device the second pod (pod2) would get. As long as it gets a device,
we consider the test to pass.
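A sketch of the corresponding assertion, reusing the hypothetical devicesForPod helper sketched earlier: the test only requires that pod2 holds at least one device, without pinning which one.

```go
package e2enode

import (
	"github.com/onsi/gomega"
	kubeletpodresourcesv1 "k8s.io/kubelet/pkg/apis/podresources/v1"
)

// expectAnyDeviceAssigned only checks that the pod holds at least one device,
// since the exact device pod2 receives after the kubelet restart and device
// plugin re-registration cannot be predicted.
func expectAnyDeviceAssigned(resp *kubeletpodresourcesv1.ListPodResourcesResponse, ns, podName string) {
	gomega.Expect(devicesForPod(resp, ns, podName)).ToNot(gomega.BeEmpty(),
		"expected pod %s/%s to be assigned at least one device", ns, podName)
}
```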
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Capture explicitly a test case pertaining to kubelet restart
but with no pod restart and device plugin re-registration.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
The sleep interval depends on whether the test case requires a pod restart,
so we define constants to represent the two sleep intervals to be used in the
corresponding test cases.
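A sketch of such constants; the names and values here are illustrative, not the exact ones used by the tests:

```go
package e2enode

import "time"

// Illustrative names and values, not the exact ones used by the tests.
const (
	// long enough for the kubelet to restart the test pod's container
	sleepIntervalWithPodRestart = 1 * time.Minute
	// a short settle period suffices when the test does not restart the pod
	sleepIntervalNoPodRestart = 10 * time.Second
)
```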
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
Explicitly state that the test involves kubelet restart and device plugin
re-registration (no pod restart)
We remove the part of the code where we wait for the pod to restart, as this
test case should no longer involve a pod restart.
In addition to that, we use `waitForNodeReady` instead of `WaitForAllNodesSchedulable`
to ensure that the node is ready for pods to be scheduled on it.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
Rather than testing both pod restart and kubelet restart,
we change the tests to handle just the pod restart scenario.
Clarify the test purpose and add an extra check to tighten the test.
We will add additional tests to cover kubelet restart scenarios
in subsequent commits.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
With this change the error messages are more helpful and test
failures are easier to troubleshoot.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>