433 commits

Author SHA1 Message Date
Dawei Wei
b763eaf594 test/e2e/common/node: poll for eventually-consistent state to reduce flakiness
These tests have race conditions where they assume immediate state
visibility after a pod transitions to Running. The current code works
on fast runtimes but is fundamentally racy: kubelet log streaming,
log file flushing, and container status updates are eventually
consistent, not synchronous.

Switching to gomega.Eventually polling makes the tests deterministic.
The success path on fast runtimes is unchanged (polling succeeds on
first attempt), but the tests now correctly handle scenarios where
state takes a moment to propagate. This benefits any environment
where containers may take longer to start (VM-isolated runtimes such
as Kata, gVisor, and Windows Hyper-V; overloaded CI VMs; shared
multi-tenant clusters).

- ephemeral_containers.go (both 'should be added' and 'should update'
  tests): the 'polo' log-content check is polled via gomega.Eventually
  with f.Timeouts.PodStartShort. The container may report Running
  before its first stdout has been flushed.

- lifecycle_hook.go ('ignore terminated container'): use
  f.Timeouts.PodDelete instead of gracePeriod*time.Second for the
  termination wait. The actual correctness check (container's intrinsic
  StartedAt/FinishedAt < sleepSeconds) is unchanged and unaffected by
  how long we waited.

- pods.go ('retrieving logs from the container over websockets'):
  poll the websocket open and read via gomega.Eventually. The container
  can be reported Running before its first stdout line has been flushed,
  so opening the websocket immediately may return an empty or partial
  buffer.
2026-05-12 11:38:18 -07:00
Kubernetes Prow Robot
f82d2d6ae2
Merge pull request #138508 from natasha41575/natasha_approver
OWNERS: nominate natasha41575 as pod-resize subpackage approver
2026-04-23 23:12:48 +05:30
Natasha Sarkar
b26fd418e6 nominate natasha41575 as subpackage approver 2026-04-23 16:48:43 +00:00
Kubernetes Prow Robot
b36864202b
Merge pull request #137755 from HirazawaUi/remove-SidecarContainers-feature-gate
Remove SidecarContainers feature gate
2026-04-23 08:16:45 +05:30
Kubernetes Prow Robot
eb07064af9
Merge pull request #137869 from yangjunmyfm192085/skiptest
Fix [Failing Test] Security Context SupplementalGroupsPolicy [LinuxOnly] [Feature:SupplementalGroupsPolicy]
2026-04-23 04:15:22 +05:30
ndixita
3e7c6e3c83
Simplify isPodLevelResourcesResizeInProgress to check for absence of actuated resources
Signed-off-by: ndixita <ndixita@google.com>
2026-03-27 23:03:20 +00:00
Kubernetes Prow Robot
b30567c744
Merge pull request #135828 from HirazawaUi/5607-alpha-2-stage
Kubelet: Add alpha-2 stage implementation for UserNamespacesHostNetworkSupport feature gate
2026-03-26 15:08:18 +05:30
ndixita
ec6d65e333
test fixes 2026-03-20 22:00:17 +00:00
ndixita
3b19886847
Test fixes - update expected restart count for containers with no resources that inherit changes due to pod-level modifications 2026-03-20 22:00:16 +00:00
ndixita
466e5d4720
sync on observedGeneration and add CPU tolerance in resize framework 2026-03-20 21:59:58 +00:00
HirazawaUi
0ffc845789 Add alpha 2 phase implementation for UserNamespacesHostNetworkSupport 2026-03-19 22:37:01 +08:00
HirazawaUi
964d79dd6e Remove SidecarContainers feature gate 2026-03-19 15:56:47 +08:00
杨军10092085
aa9e4887fa Fix [Failing Test] Security Context SupplementalGroupsPolicy [LinuxOnly] [Feature:SupplementalGroupsPolicy] 2026-03-19 14:34:44 +08:00
Natasha Sarkar
2d23d21802 e2e test for resize of non-sidecar init containers 2026-03-19 00:43:57 +00:00
Kubernetes Prow Robot
0ad0cce87e
Merge pull request #137078 from saschagrunert/label-unlabeled-e2e-node-tests
Label unlabeled e2e node tests
2026-03-14 04:31:36 +05:30
Kubernetes Prow Robot
4df03ea76e
Merge pull request #137550 from KhushAhuja/deflake-resize-cgroup-exec-retry
test/e2e: deflake pod resize cgroup value verification
2026-03-14 03:41:35 +05:30
KhushAhuja
efddaf6561 test/e2e: deflake pod resize cgroup value verification
Replace the manual 3-retry loop (with no delay) in VerifyCgroupValue
with framework.Gomega().Eventually() + HandleRetry, matching the
pattern used for oom_score_adj deflake in #137329. This gives proper
polling with backoff when exec fails during container restarts.
2026-03-13 21:10:37 +05:30
Kubernetes Prow Robot
f7f694e5e0
Merge pull request #136792 from rata/userns-goes-ga
feature: Migrate UserNamespacesSupport to GA
2026-03-12 21:57:36 +05:30
Rodrigo Campos
f25830be53 test/e2e*: Remove references to UserNamespacesSupport feature gate
It's GA now.

Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
2026-03-12 15:20:09 +01:00
Sascha Grunert
d3919c7cef
Label unlabeled e2e node tests
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2026-03-12 09:02:24 +01:00
Paco Xu
8c5548edd6 mark ImageVolume conformance test to LinuxOnly 2026-03-12 15:48:50 +08:00
Kubernetes Prow Robot
d729528df4
Merge pull request #136711 from saschagrunert/graduate-image-volume-ga
[KEP-4639]: Graduate ImageVolume to GA
2026-03-12 00:45:43 +05:30
Yuan Wang
f33a2767aa Refactor container restart policy tests to e2e/common/node
- Added validation for lastTerminationStatus
2026-03-09 23:05:05 +00:00
Kubernetes Prow Robot
147a9ee315
Merge pull request #137231 from danwinship/lifecycle-hook-e2e
Rework container lifecycle tests to not require "privileged" pod security
2026-03-09 23:05:15 +05:30
Mads Jensen
1f2b70a043 Lint: Use modernize/rangeint in test/{e2e,e2e_node,images,soak} 2026-03-07 10:17:31 +01:00
Natasha Sarkar
8504c0d4e3 deflake pod resize oom_score_adj check 2026-03-05 18:36:03 +00:00
Dan Winship
a7d8e3a372 Rework container lifecycle tests to not require "privileged" pod security
Rather than using the probe `Host` field to target a logging server,
use a redirecting sidecar within the test pod. Then we don't need to
bypass PSA.
2026-03-03 10:35:39 -05:00
Mads Jensen
7883039b31 Remove unneeded use of fmt.Sprintf in test/{integration,e2e} 2026-02-08 14:34:13 +01:00
Sascha Grunert
6ec313a045
Graduate ImageVolume to GA
Graduates the ImageVolume feature gate to GA in v1.36, locked to enabled.

Changes:
- Add v1.36 GA entry with LockToDefault: true
- Remove +featureGate=ImageVolume annotations from API types
- Promote e2e test to conformance
- Add emulation versioning to disablement tests
- Update conformance test metadata
- Remove feature-gated test expectations for ImageVolume PullPolicy

Ref: https://github.com/kubernetes/enhancements/issues/4639
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2026-02-03 10:37:49 +01:00
Davanum Srinivas
48f67b9656
Add timeout to ImagePullTest poll loop to prevent infinite hangs
ImagePullTest was using wait.PollUntilContextCancel which has no
timeout, causing tests to hang for hours when the expected container
state is never reached (e.g., ErrImageNeverPull).

Changed to wait.PollUntilContextTimeout with ContainerStatusRetryTimeout
(5 minutes), matching the pattern used by other tests in the same file.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2026-01-23 18:19:17 -05:00
Davanum Srinivas
0183b5547a
test: update code to use GlibcDnsTesting image
Updates the codebase to use the new glibc-dns-testing image which replaces
the deprecated jessie-dnsutils image.

This PR depends on the glibc-dns-testing image being available in the
registry (registry.k8s.io/e2e-test-images/glibc-dns-testing:2.0.0).

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2026-01-22 08:39:46 -05:00
Kubernetes Prow Robot
751ab64d57
Merge pull request #135837 from dgrisonnet/increase-resource-limits
test/e2e: fix pod resize test flakes on CRI-O/runc environments
2026-01-17 08:35:15 +05:30
David Lynch
df66e4728b Back out "Remove image volume e2e test because CI has containerd < 2.1"
Original commit changeset: 71ddb98ae4

user: David Lynch <davite3@protonmail.com>
2026-01-14 12:36:03 +00:00
Kubernetes Prow Robot
08ad958d0d
Merge pull request #135774 from pohly/e2e-framework-ginkgo-wrappers
E2E framework: make usage of Ginkgo wrappers optional
2026-01-07 19:01:38 +05:30
Patrick Ohly
47d02070ba E2E: remove unnecessary trailing spaces in test names
The spaces are unnecessary because Ginkgo adds spaces automatically.

This was detected before only for tests using the wrapper functions,
now it also gets detected for ginkgo methods.
2026-01-07 12:05:43 +01:00
Damien Grisonnet
c29d27bc44 test/e2e: increase memory limits in pod resize tests
The pod resize e2e tests use memory limits as low as 20Mi for Guaranteed
QoS pods. On OpenShift/CRI-O, the container runtime (runc) runs inside
the pod's cgroup and requires ~20-22MB of memory during container
creation and restart operations. This causes intermittent OOM kills
when the pod's memory limit is at or below runc's memory footprint.

This issue does not occur on containerd-based clusters because
containerd's shim runs outside the pod's cgroup by default (ShimCgroup=""),
so runc's memory is not charged against the pod's limit.

Increase memory limits to provide sufficient headroom for runc:
- originalMem: 20Mi -> 35Mi
- reducedMem: 15Mi -> 30Mi
- increasedMem: 25Mi -> 40Mi

The test validates resize behavior, not minimal memory limits, so
larger values do not reduce test coverage.

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2025-12-19 07:41:59 +01:00
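
The headroom argument above is simple arithmetic, worked through below. The 22 figure is an assumption: it takes the upper end of the "~20-22MB" range from the commit message and treats it as MiB for simplicity. `headroomMi` and `limitChange` are hypothetical names.

```go
package main

import "fmt"

// runcFootprintMi is an assumed worst-case runc footprint in MiB.
const runcFootprintMi = 22

// headroomMi returns how much of a memory limit remains once the assumed
// runc footprint is charged against the pod's cgroup. A negative result
// means the limit is below the footprint and an OOM kill is possible.
func headroomMi(limitMi int) int {
	return limitMi - runcFootprintMi
}

// limitChange pairs an old and a new limit from the commit message.
type limitChange struct {
	name  string
	oldMi int
	newMi int
}

func main() {
	changes := []limitChange{
		{"originalMem", 20, 35},
		{"reducedMem", 15, 30},
		{"increasedMem", 25, 40},
	}
	for _, c := range changes {
		fmt.Printf("%-12s old headroom %3dMi, new headroom %3dMi\n",
			c.name, headroomMi(c.oldMi), headroomMi(c.newMi))
	}
}
```

Under this assumption two of the three old limits leave negative headroom, while every new limit leaves at least 8Mi.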
Damien Grisonnet
315c38fb8a Revert "test/e2e: increase memory limits in pod resize tests"
This reverts commit a2cf7f770d.
2025-12-18 19:05:47 +01:00
Patrick Ohly
ad79e479c2 build: remove deprecated '// +build' tag
This has been replaced by `//go:build ...` for a long time now.

Removal of the old build tag was automated with:

    for i in $(git grep -l '^// +build' | grep -v -e '^vendor/'); do if ! grep -q '^// Code generated' "$i"; then sed -i -e '/^\/\/ +build/d' "$i"; fi; done
2025-12-18 12:16:21 +01:00
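
For readers who prefer Go to shell, the per-file part of that one-liner can be approximated as below. This is a sketch, not the tool used for the commit: `stripOldBuildTags` is a hypothetical helper that works on file contents as a string, and it leaves file discovery and the vendor/ exclusion to the caller.

```go
package main

import (
	"fmt"
	"strings"
)

// stripOldBuildTags removes lines starting with "// +build" from src.
// Files containing a line that starts with "// Code generated" are
// returned unchanged, mirroring the grep guard in the shell loop.
func stripOldBuildTags(src string) string {
	lines := strings.Split(src, "\n")
	for _, line := range lines {
		if strings.HasPrefix(line, "// Code generated") {
			return src
		}
	}
	kept := make([]string, 0, len(lines))
	for _, line := range lines {
		if !strings.HasPrefix(line, "// +build") {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n")
}

func main() {
	src := "//go:build linux\n// +build linux\n\npackage node\n"
	fmt.Print(stripOldBuildTags(src))
}
```

The `//go:build` line is kept because it does not start with `// +build`; only the legacy form is dropped.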
Damien Grisonnet
a2cf7f770d test/e2e: increase memory limits in pod resize tests
The pod resize e2e tests use memory limits as low as 20Mi for Guaranteed
QoS pods. On OpenShift/CRI-O, the container runtime (runc) runs inside
the pod's cgroup and requires ~20-22MB of memory during container
creation and restart operations. This causes intermittent OOM kills
when the pod's memory limit is at or below runc's memory footprint.

This issue does not occur on containerd-based clusters because
containerd's shim runs outside the pod's cgroup by default (ShimCgroup=""),
so runc's memory is not charged against the pod's limit.

Increase memory limits to provide sufficient headroom for runc:
- originalMem: 20Mi -> 35Mi
- reducedMem: 15Mi -> 30Mi
- increasedMem: 25Mi -> 40Mi

The test validates resize behavior, not minimal memory limits, so
larger values do not reduce test coverage.

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2025-12-09 12:15:25 +01:00
Kubernetes Prow Robot
c245b40b87
Merge pull request #135254 from saschagrunert/image-volume-containerd-skip
[KEP-4639] Remove image volume e2e test because CI has containerd < 2.1
2025-11-12 07:59:49 -08:00
Kubernetes Prow Robot
9673a7fbf1
Merge pull request #132919 from ndixita/pod-level-in-place-pod-resize
Pod level in place pod resize - alpha
2025-11-12 07:59:41 -08:00
Sascha Grunert
71ddb98ae4
Remove image volume e2e test because CI has containerd < 2.1
Remove the e2e test since we switched to beta (enabled by default)
instead of GA. We re-add the test in 1.36.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2025-11-12 09:31:34 +01:00
ndixita
10b73f8ef9
Test fixes
Signed-off-by: ndixita <ndixita@google.com>
2025-11-12 06:21:06 +00:00
ndixita
21920bb37e
Test fixes
Signed-off-by: ndixita <ndixita@google.com>
2025-11-12 01:18:53 +00:00
ndixita
1733d8fc8c
e2e tests
Signed-off-by: ndixita <ndixita@google.com>
2025-11-11 18:19:09 +00:00
Stanislav Láznička
805eb885e3
node e2e: add tests for Ensure Secret Image Pulls default policy
Signed-off-by: Stanislav Láznička <slznika@microsoft.com>

Co-authored-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2025-11-11 11:15:53 -05:00
carlory
094b1bf018
fix [sig-node] Container Runtime blackbox test when running a container with a new image [Serial] should be able to pull from private registry with secret [NodeConformance] 2025-11-11 10:31:12 +08:00
Heba
aceb89debc
KEP-5471: Extend tolerations operators (#134665)
* Add numeric operations to tolerations

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* add default feature gate

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add integration tests

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add toleration value validation

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add validate options for new operators

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove log

Signed-off-by: helayoty <heelayot@microsoft.com>

* Update feature gate check

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove IsValidNumericString func

Signed-off-by: helayoty <heelayot@microsoft.com>

* Implement IsDecimalInteger

Signed-off-by: helayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add logs to v1/toleration

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>

* Update integration tests and address code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add feature gate to the scheduler framework

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove extra test

Signed-off-by: helayoty <heelayot@microsoft.com>

* Fix integration test

Signed-off-by: helayoty <heelayot@microsoft.com>

* pass feature gate via TolerationsTolerateTaint

Signed-off-by: helayoty <heelayot@microsoft.com>

---------

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>
2025-11-10 12:42:54 -08:00
Kubernetes Prow Robot
d777de7741
Merge pull request #135195 from haircommander/image-volume
KEP 4639: Move ImageVolume to on by default beta
2025-11-09 18:34:53 -08:00
Sascha Grunert
c7b277a32e KEP 4639: Move ImageVolume to on by default beta
Co-authored-by: Sascha Grunert <sgrunert@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2025-11-06 16:26:27 -05:00