These tests are racy: they assume state is visible immediately after
a pod transitions to Running. The current code passes on fast
runtimes, but kubelet log streaming, log file flushing, and container
status updates are eventually consistent, not synchronous.
Switching to gomega.Eventually polling makes the tests deterministic.
The success path on fast runtimes is unchanged (polling succeeds on
first attempt), but the tests now correctly handle scenarios where
state takes a moment to propagate. This benefits any environment
where containers may take longer to start (VM-isolated runtimes such
as Kata, gVisor, and Windows Hyper-V; overloaded CI VMs; shared
multi-tenant clusters).
- ephemeral_containers.go (both 'should be added' and 'should update'
tests): the 'polo' log-content check is polled via gomega.Eventually
with f.Timeouts.PodStartShort. The container may report Running
before its first stdout has been flushed.
- lifecycle_hook.go ('ignore terminated container'): use
f.Timeouts.PodDelete instead of gracePeriod*time.Second for the
termination wait. The actual correctness check (the interval between
the container's own StartedAt and FinishedAt is less than
sleepSeconds) is unchanged and does not depend on how long we waited.
- pods.go ('retrieving logs from the container over websockets'):
poll the websocket open and read via gomega.Eventually. The container
can be reported Running before its first stdout line has been flushed,
so opening the websocket immediately may return an empty or partial
buffer.
Replace the manual 3-retry loop (with no delay) in VerifyCgroupValue
with framework.Gomega().Eventually() + HandleRetry, matching the
pattern used for oom_score_adj deflake in #137329. This gives proper
polling with backoff when exec fails during container restarts.
Graduates the ImageVolume feature gate to GA in v1.36, locked to enabled.
Changes:
- Add v1.36 GA entry with LockToDefault: true
- Remove +featureGate=ImageVolume annotations from API types
- Promote e2e test to conformance
- Add emulation versioning to disablement tests
- Update conformance test metadata
- Remove feature-gated test expectations for ImageVolume PullPolicy
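As a rough illustration of what "locked to enabled" means for a GA gate, the sketch below models a gate whose value can no longer be overridden. The featureSpec and gateSet types are hypothetical and written for this example only; they are not the Kubernetes feature gate implementation.

```go
package main

import "fmt"

// featureSpec is a hypothetical, simplified gate definition.
type featureSpec struct {
	Default       bool
	LockToDefault bool // when true, the gate cannot be set away from Default
}

// gateSet (hypothetical) holds gate definitions and explicit overrides.
type gateSet struct {
	specs     map[string]featureSpec
	overrides map[string]bool
}

func newGateSet(specs map[string]featureSpec) *gateSet {
	return &gateSet{specs: specs, overrides: map[string]bool{}}
}

// Set rejects unknown gates and any value that differs from a locked default.
func (g *gateSet) Set(name string, value bool) error {
	spec, ok := g.specs[name]
	if !ok {
		return fmt.Errorf("unknown feature gate %q", name)
	}
	if spec.LockToDefault && value != spec.Default {
		return fmt.Errorf("cannot set feature gate %q to %t: locked to %t",
			name, value, spec.Default)
	}
	g.overrides[name] = value
	return nil
}

// Enabled returns the override if one was set, otherwise the default.
func (g *gateSet) Enabled(name string) bool {
	if v, ok := g.overrides[name]; ok {
		return v
	}
	return g.specs[name].Default
}

func main() {
	gates := newGateSet(map[string]featureSpec{
		"ImageVolume": {Default: true, LockToDefault: true},
	})
	fmt.Println("disable attempt:", gates.Set("ImageVolume", false))
	fmt.Println("enabled:", gates.Enabled("ImageVolume"))
}
```

In this model, disabling a locked gate fails at configuration time, which is why tests that exercise the disabled path need some other mechanism, such as the emulation versioning mentioned above.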
Ref: https://github.com/kubernetes/enhancements/issues/4639
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
ImagePullTest was using wait.PollUntilContextCancel, which has no
timeout of its own, causing tests to hang for hours when the expected
container state is never reached (e.g., ErrImageNeverPull).
Switch to wait.PollUntilContextTimeout with ContainerStatusRetryTimeout
(5 minutes), matching the pattern used by other tests in the same file.
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Update the codebase to use the new glibc-dns-testing image, which
replaces the deprecated jessie-dnsutils image.
This PR depends on the glibc-dns-testing image being available in the
registry (registry.k8s.io/e2e-test-images/glibc-dns-testing:2.0.0).
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The spaces are unnecessary because Ginkgo adds spaces automatically.
Previously this was detected only for tests using the wrapper
functions; now it is also detected for Ginkgo methods.
The pod resize e2e tests use memory limits as low as 20Mi for Guaranteed
QoS pods. On OpenShift/CRI-O, the container runtime (runc) runs inside
the pod's cgroup and requires ~20-22MB of memory during container
creation and restart operations. This causes intermittent OOM kills
when the pod's memory limit is at or below runc's memory footprint.
This issue does not occur on containerd-based clusters because
containerd's shim runs outside the pod's cgroup by default (ShimCgroup=""),
so runc's memory is not charged against the pod's limit.
Increase memory limits to provide sufficient headroom for runc:
- originalMem: 20Mi -> 35Mi
- reducedMem: 15Mi -> 30Mi
- increasedMem: 25Mi -> 40Mi
The test validates resize behavior, not minimal memory limits, so
larger values do not reduce test coverage.
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
The old `// +build` tag was superseded by `//go:build ...` a long time ago.
Removal of the old build tag was automated with:
for i in $(git grep -l '^// +build' | grep -v -e '^vendor/'); do if ! grep -q '^// Code generated' "$i"; then sed -i -e '/^\/\/ +build/d' "$i"; fi; done
Remove the e2e test since we switched to beta (enabled by default)
instead of GA. The test will be re-added in 1.36.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>