Commit graph

3823 commits

Author SHA1 Message Date
Patrick Ohly
4a3d822689 DRA e2e: make driver deployment possible in Go unit tests
This leverages ktesting as wrapper around Ginkgo and testing.T to make all
helper code that is needed to deploy a DRA driver available to Go unit
tests and thus integration tests.

How to proceed with unifying helper code for integration and E2E testing is
open. This is just a minimal first step in that direction. Ideally, such
code should be in separate packages where usage of Ginkgo, e2e/framework
and gomega.Expect/Eventually/Consistently are forbidden.

While at it, the builder gets extended to make cleanup optional.
This will be needed for upgrade/downgrade testing with sub-tests.

(cherry picked from commit 7c7b1e1018)
2026-01-16 07:53:00 +01:00
Patrick Ohly
db36339d03 e2e framework: avoid memory overhead of ginkgo.GinkgoT
It turned out that ginkgo.GinkgoT() wasn't as cheap as it should have been (fix
coming in Ginkgo 2.27.5). When instantiated once for each framework.Framework
instance during init by all workers at the same time, the resulting spike in
overall memory usage within the container caused OOM killing of workers in Prow
jobs like ci-kubernetes-e2e-gci-gce with very tight memory limits.

Even with the upcoming fix in Ginkgo it makes sense to set the TB field only
while it really is needed, i.e. while a test runs. This is conceptually similar
to setting and unsetting the test namespace. It may help to flush out incorrect
usage of TB outside of tests.
2026-01-16 07:53:00 +01:00
Patrick Ohly
0d64cbff49 e2e framework: support creating TContext
This makes it possible to call helper packages which expect a TContext from E2E
tests.

The implementation uses GinkgoT as TB and supports registering cleanup
callbacks which expect a context. These callbacks then run with a context that
comes from ginkgo.DeferCleanup, just as if they had called that directly.

(cherry picked from commit 47b613eded)
2026-01-16 07:53:00 +01:00
Patrick Ohly
e999d595b1 testing: partial revert of E2E + DRA upgrade/downgrade
Refactoring the DRA upgrade/downgrade testing such that it runs as Go test
depended on supporting ktesting in the E2E framework. That change worked during
presubmit testing, but broke some periodic jobs. Therefore the relevant commits
from https://github.com/kubernetes/kubernetes/pull/135664/commits get reverted:

c47ad64820 DRA e2e+integration: test ResourceSlice controller
047682908d ktesting: replace Begin/End with TContext.Step
de47714879 DRA upgrade/downgrade: rewrite as Go unit test
7c7b1e1018 DRA e2e: make driver deployment possible in Go unit tests
65ef31973c DRA upgrade/downgrade: split out individual test steps
47b613eded e2e framework: support creating TContext

The last one is what must have caused the problem, but the other commits depend
on it.
2026-01-11 09:55:17 +01:00
Patrick Ohly
80cc14831e E2E framework: fix nil pointer crash in TContext
Not all framework instances have a default namespace. TContext
crashed for those.
2026-01-09 16:23:11 +01:00
Kubernetes Prow Robot
26fd963327
Merge pull request #135664 from pohly/dra-upgrade-downgrade-refactor
DRA e2e: upgrade/downgrade refactor
2026-01-08 19:31:47 +05:30
Kubernetes Prow Robot
08ad958d0d
Merge pull request #135774 from pohly/e2e-framework-ginkgo-wrappers
E2E framework: make usage of Ginkgo wrappers optional
2026-01-07 19:01:38 +05:30
Patrick Ohly
7c7b1e1018 DRA e2e: make driver deployment possible in Go unit tests
This leverages ktesting as wrapper around Ginkgo and testing.T to make all
helper code that is needed to deploy a DRA driver available to Go unit
tests and thus integration tests.

How to proceed with unifying helper code for integration and E2E testing is
open. This is just a minimal first step in that direction. Ideally, such
code should be in separate packages where usage of Ginkgo, e2e/framework
and gomega.Expect/Eventually/Consistently are forbidden.

While at it, the builder gets extended to make cleanup optional.
This will be needed for upgrade/downgrade testing with sub-tests.
2026-01-07 14:11:33 +01:00
Patrick Ohly
e4ab523161 E2E framework: make usage of Ginkgo wrappers optional
Previously it was necessary to use the Ginkgo wrappers when
using any of the custom arguments like WithSlow(). Now the
hook within Ginkgo for modifying arguments is used such that
e.g. the original ginkgo.It also works.
2026-01-07 12:05:43 +01:00
Patrick Ohly
47b613eded e2e framework: support creating TContext
This makes it possible to call helper packages which expect a TContext from E2E
tests.

The implementation uses GinkgoT as TB and supports registering cleanup
callbacks which expect a context. These callbacks then run with a context that
comes from ginkgo.DeferCleanup, just as if they had called that directly.
2026-01-05 13:45:03 +01:00
Patrick Ohly
1a866b8795 e2e framework: fix inconsistency in log output
Example:

    I1208 16:01:05.852628 243 upgradedowngrade_test.go:239] get source code version: bring up v1.34: cluster is running, use KUBECONFIG=/var/run/kubernetes/admin.kubeconfig to access it
    I1208 16:01:05.869679     243 reflector.go:446] "Caches populated" type="*v1.ServiceAccount" reflector="k8s.io/client-go/tools/watch/informerwatcher.go:162"

The first line is printed via framework.Logf, which is meant to emulate the
format used by the klog text logger in the second line. The difference is that
klog formats the pid with 7 characters, padding on the left with spaces.

Consistency trumps brevity here, so let's format exactly as in klog.
2026-01-05 13:45:03 +01:00
Kubernetes Prow Robot
268bdbe214
Merge pull request #135836 from pohly/ginkgo-gomega-update
dependencies: ginkgo v2.27.3 + gomega v1.38.3
2025-12-19 08:36:39 -08:00
Patrick Ohly
db841afdbb dependencies: ginkgo v2.27.3 + gomega v1.38.3
This fixes some issues found in Kubernetes (data race in ginkgo CLI, gomega
formatting) and helps with diagnosing OOM killing in CI jobs (exit status of
processes).

The modified gomega formatting shows up in some of the output tests for the E2E
framework. They get updated accordingly.
2025-12-19 10:37:54 +01:00
hongkang
be9b3d5a46 Add e2e test for VolumeAttachment cleanup when CSIDriver AttachRequired changes
Signed-off-by: hongkang <mzhkcj50@gmail.com>
2025-12-19 15:56:33 +08:00
Kubernetes Prow Robot
243404b870
Merge pull request #134515 from carlory/e2e-autoscaling
e2e: improve test/e2e/framework/autoscaling/autoscaling_utils.go
2025-12-17 16:26:32 -08:00
Stanislav Láznička
805eb885e3
node e2e: add tests for Ensure Secret Image Pulls default policy
Signed-off-by: Stanislav Láznička <slznika@microsoft.com>

Co-authored-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2025-11-11 11:15:53 -05:00
Heba
aceb89debc
KEP-5471: Extend tolerations operators (#134665)
* Add numeric operations to tolerations

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* add default feature gate

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add integration tests

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add toleration value validation

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add validate options for new operators

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove log

Signed-off-by: helayoty <heelayot@microsoft.com>

* Update feature gate check

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove IsValidNumericString func

Signed-off-by: helayoty <heelayot@microsoft.com>

* Implement IsDecimalInteger

Signed-off-by: helayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add logs to v1/toleration

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>

* Update integration tests and address code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add feature gate to the scheduler framework

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove extra test

Signed-off-by: helayoty <heelayot@microsoft.com>

* Fix integration test

Signed-off-by: helayoty <heelayot@microsoft.com>

* pass feature gate via TolerationsTolerateTaint

Signed-off-by: helayoty <heelayot@microsoft.com>

---------

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>
2025-11-10 12:42:54 -08:00
Anish Ramasekar
d82fa1eb98
test: use localhost and HostNetwork for registry, mark test as disruptive
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2025-11-07 11:13:28 -08:00
Stanislav Láznička
8d0fb17a18
e2e test registry: force IPv4 localhost IP
Signed-off-by: Stanislav Láznička <slznika@microsoft.com>
2025-11-07 11:08:34 -08:00
Ryan Phillips
b07c8698b9 test: add retry to getMetricsFromNode
2025-11-06 13:56:17 -06:00
Kubernetes Prow Robot
dd6f46856d
Merge pull request #135117 from borg-land/bump-pvc-5g
provision 10G disks for testing pvc instead of 1 byte or 1GB
2025-11-04 20:56:11 -08:00
upodroid
90f0fd09f2 provision 10G disks for testing pvc instead of 1 byte or 1GB
2025-11-05 01:25:13 +03:00
Stanislav Láznička
a275785bd4
node conformance e2e: log fake registry creds on test failure
2025-11-04 17:26:40 +01:00
Patrick Ohly
f9ef004916 E2E framework: start slow tests first
This avoids the risk of having a slow test started towards the end of a run,
which then would cause the run to take longer. When started early they can run
in parallel to other tests. In serial runs it doesn't matter.

The implementation maps the Slow label to the new ginkgo.SpecPriority. The
default is 0. Tests with priority 1 run first.
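
The ordering idea can be sketched without Ginkgo. The spec type and the priority and orderSpecs functions below are hypothetical stand-ins, not the framework's code and not the ginkgo.SpecPriority API; the sketch only assumes what is stated above: specs labeled Slow get priority 1, everything else gets 0, and higher priority runs first.

```go
package main

import (
	"fmt"
	"sort"
)

// spec is a minimal stand-in for a test spec with labels.
type spec struct {
	name   string
	labels []string
}

// priority maps the Slow label to 1; all other specs keep the default 0.
func priority(s spec) int {
	for _, l := range s.labels {
		if l == "Slow" {
			return 1
		}
	}
	return 0
}

// orderSpecs returns a copy with higher-priority specs first. The stable
// sort keeps the original relative order within each priority level.
func orderSpecs(specs []spec) []spec {
	ordered := append([]spec(nil), specs...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return priority(ordered[i]) > priority(ordered[j])
	})
	return ordered
}

func main() {
	specs := []spec{
		{name: "fast-a"},
		{name: "slow-a", labels: []string{"Slow"}},
		{name: "fast-b"},
		{name: "slow-b", labels: []string{"Slow", "Serial"}},
	}
	for _, s := range orderSpecs(specs) {
		fmt.Println(s.name)
	}
}
```

In a parallel run, handing out specs in this order means the slow ones overlap with the remaining fast ones instead of extending the tail of the run.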
2025-11-03 18:17:59 +01:00
Patrick Ohly
34021d451d Revert "E2E framework: start slow tests first"
This reverts commit cff07e7551.

The commit caused several kubeadm jobs to fail while executing all conformance
tests (including slow ones) in parallel. Sometimes execution took longer and
ran into the overall timeout, sometimes there was:

    [FAILED] Expected
        <int>: 440
    to be ==
        <int>: 400
    In [It] at: k8s.io/kubernetes/test/e2e/apimachinery/chunking.go:202

It looks like the tests are flaky and/or reveal a real bug when slow tests run
all in parallel at the same time.

This should work, but doesn't right now, so let's revert until that problem is fixed.
2025-11-02 20:09:28 +01:00
Patrick Ohly
cff07e7551 E2E framework: start slow tests first
This avoids the risk of having a slow test started towards the end of a run,
which then would cause the run to take longer. When started early they can run
in parallel to other tests. In serial runs it doesn't matter.

The implementation maps the Slow label to the new ginkgo.SpecPriority. The
default is 0. Tests with priority 1 run first.
2025-11-01 09:52:09 +01:00
Kubernetes Prow Robot
34988e758d
Merge pull request #134453 from stlaz/node-conformance-e2e
Fix node conformance tests with fake registry
2025-10-30 07:48:06 -07:00
Stanislav Láznička
135b46974a
e2e registry: have SetupRegistry() return registry address
2025-10-29 13:18:06 +01:00
Stanislav Láznička
bb1b23a34e
e2e fake registry: add function docs
Signed-off-by: Stanislav Láznička <slznika@microsoft.com>
2025-10-29 13:17:04 +01:00
Stanislav Láznička
fc81e22735
fix Node Conformance Container Runtime test with fake registry
Signed-off-by: Stanislav Láznička <slznika@microsoft.com>
2025-10-29 13:17:03 +01:00
Stanislav Láznička
a0e64c21f2
Use fake registry in Node's container runtime image pulling tests
Signed-off-by: Stanislav Láznička <slznika@microsoft.com>
2025-10-29 13:06:34 +01:00
Kubernetes Prow Robot
3ef0262766
Merge pull request #134760 from Rishita-Golla/resize-test
Add step field in SizeRange struct for volume_expand tests
2025-10-28 14:20:00 -07:00
Rishita Golla
448584e1c8 feat: add step field and clarify comment for volume expansion
2025-10-21 21:24:45 +00:00
Patrick Ohly
d0a2a0d22e e2e: find and fix reuse of test names
This reports and fixes the following duplicates for test/e2e:

    ERROR: E2E suite initialization was faulty, these errors must be fixed:
    ERROR: apimachinery/mutatingadmissionpolicy.go:184: full test name is not unique: "[sig-api-machinery] MutatingAdmissionPolicy [Privileged:ClusterAdmin] [Feature:MutatingAdmissionPolicy] [FeatureGate:MutatingAdmissionPolicy] [Beta] [Feature:OffByDefault] should support MutatingAdmissionPolicy API operations" (/nvme/gopath/src/k8s.io/kubernetes/test/e2e/apimachinery/mutatingadmissionpolicy.go:184, /nvme/gopath/src/k8s.io/kubernetes/test/e2e/apimachinery/mutatingadmissionpolicy.go:606)
    ERROR: apimachinery/mutatingadmissionpolicy.go:412: full test name is not unique: "[sig-api-machinery] MutatingAdmissionPolicy [Privileged:ClusterAdmin] [Feature:MutatingAdmissionPolicy] [FeatureGate:MutatingAdmissionPolicy] [Beta] [Feature:OffByDefault] should support MutatingAdmissionPolicyBinding API operations" (/nvme/gopath/src/k8s.io/kubernetes/test/e2e/apimachinery/mutatingadmissionpolicy.go:412, /nvme/gopath/src/k8s.io/kubernetes/test/e2e/apimachinery/mutatingadmissionpolicy.go:834)
    ERROR: common/node/pod_level_resources.go:250: full test name is not unique: "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Guaranteed QoS pod with container resources" (/nvme/gopath/src/k8s.io/kubernetes/test/e2e/common/node/pod_level_resources.go:250 (2x))
    ERROR: dra/dra.go:1899: full test name is not unique: "[sig-node] [DRA] kubelet [Feature:DynamicResourceAllocation] [FeatureGate:DRAConsumableCapacity] [Alpha] [Feature:OffByDefault] [FeatureGate:DynamicResourceAllocation] must allow multiple allocations and consume capacity [KubeletMinVersion:1.34]" (/nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:1899 (2x))
    ERROR: storage/testsuites/volume_group_snapshottable.go:173: full test name is not unique: "[sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern:  (delete policy)] volumegroupsnapshottable [Feature:volumegroupsnapshot] VolumeGroupSnapshottable  should create snapshots for multiple volumes in a pod" (/nvme/gopath/src/k8s.io/kubernetes/test/e2e/storage/testsuites/volume_group_snapshottable.go:173 (2x))
    ERROR: storage/testsuites/volume_group_snapshottable.go:173: full test name is not unique: "[sig-storage] CSI Volumes [Driver: pd.csi.storage.gke.io] [Serial] [Testpattern:  (delete policy)] volumegroupsnapshottable [Feature:volumegroupsnapshot] VolumeGroupSnapshottable  should create snapshots for multiple volumes in a pod" (/nvme/gopath/src/k8s.io/kubernetes/test/e2e/storage/testsuites/volume_group_snapshottable.go:173 (2x))

And for test/e2e_node:

    ERROR: cpu_manager_test.go:1622: full test name is not unique: "[sig-node] CPU Manager [Serial] [Feature:CPUManager] when checking the CFS quota management should disable for guaranteed pod with exclusive CPUs assigned" (/nvme/gopath/src/k8s.io/kubernetes/test/e2e_node/cpu_manager_test.go:1622, /nvme/gopath/src/k8s.io/kubernetes/test/e2e_node/cpu_manager_test.go:1642)
    ERROR: eviction_test.go:800: full test name is not unique: "[sig-node] LocalStorageCapacityIsolationFSQuotaMonitoring [Slow] [Serial] [Disruptive] [Feature:LocalStorageCapacityIsolationQuota] [Feature:LSCIQuotaMonitoring] [Feature:UserNamespacesSupport] when we run containers that should cause use quotas for LSCI monitoring (quotas enabled: true)  should eventually evict all of the correct pods" (/nvme/gopath/src/k8s.io/kubernetes/test/e2e_node/eviction_test.go:800 (2x))
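
A check of this kind can be sketched as follows. The testSpec type and the findDuplicates function are hypothetical and only illustrate the idea of grouping source locations by full test name; they are not the suite initialization code that produced the errors above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// testSpec pairs a full test name with the source location that defines it.
type testSpec struct {
	fullName string
	location string
}

// findDuplicates returns, for every full test name that is used more than
// once, the list of locations where it was registered.
func findDuplicates(specs []testSpec) map[string][]string {
	locations := map[string][]string{}
	for _, s := range specs {
		locations[s.fullName] = append(locations[s.fullName], s.location)
	}
	duplicates := map[string][]string{}
	for name, locs := range locations {
		if len(locs) > 1 {
			duplicates[name] = locs
		}
	}
	return duplicates
}

func main() {
	specs := []testSpec{
		{"[sig-example] should work", "a.go:10"},
		{"[sig-example] should work", "a.go:42"},
		{"[sig-example] should also work", "b.go:7"},
	}
	duplicates := findDuplicates(specs)
	names := make([]string, 0, len(duplicates))
	for name := range duplicates {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic output order
	for _, name := range names {
		fmt.Printf("ERROR: full test name is not unique: %q (%s)\n",
			name, strings.Join(duplicates[name], ", "))
	}
}
```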
2025-10-17 20:19:52 +02:00
carlory
1dd82ce2a4
Add port name and custom metric name to autoscaling utils
Signed-off-by: carlory <baofa.fan@daocloud.io>
2025-10-10 17:13:34 +08:00
Kubernetes Prow Robot
628845b567
Merge pull request #134456 from gnufied/fix-e2e-modify-volume
With new changes we will also have a VolumeModifying condition
2025-10-09 11:31:02 -07:00
Kubernetes Prow Robot
51e35e61ce
Merge pull request #133870 from pohly/build-data-race-detection
build: also support KUBE_RACE for test binaries
2025-10-08 12:57:01 -07:00
Patrick Ohly
9702a2dca2 E2E framework: enable data race detection only if needed
When building the test binary without race detection, we don't
need the post-processing of the JUnit file because it cannot
contain data race reports. This can be done via build tags.
2025-10-08 08:45:21 +02:00
Hemant Kumar
01264d3970 With new changes we will also have a VolumeModifying condition
Fix e2e tests to take that into account
2025-10-07 10:31:29 -04:00
Patrick Ohly
f95d531b0a DRA: CRUD conformance tests
Promoting real tests turned out to be harder than expected (should be rewritten
to be self-contained, additional reviews, etc.).

They would not achieve 100% endpoint+operation coverage because real tests only
use some of the operations. Therefore each API type has to be covered with
CRUD-style tests which only exercise the apiserver, then maybe additional
functional tests can be added later (depending on time and motivation).

The machinery for testing different API types is meant to be reusable, so it
gets added in the new e2e/framework/conformance helper package.
2025-10-02 17:43:33 +02:00
Adrian Moisey
ea914d8077
Remove unused WaitForServiceEndpointsNum function
along with the now-unused countEndpointsSlicesNum function
2025-09-21 14:48:13 +02:00
Adrian Moisey
01f7de46f6
Replace deprecated WaitForServiceEndpointsNum call with WaitForEndpointCount
2025-09-21 14:47:03 +02:00
Heba
36e3adf318
Add e2e test for MaxUnavailable StatefulSet RollingUpdate (#133717)
* Add e2e test for MaxUnavailable rolling update

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Update test/e2e/apps/statefulset.go

Co-authored-by: Maciej Szulik <soltysh@gmail.com>

* Address feedback comments

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* address feedback

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* expose poll interval as a param

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Update test/e2e/apps/statefulset.go

Co-authored-by: Maciej Szulik <soltysh@gmail.com>

* Update test/e2e/framework/statefulset/wait.go

Co-authored-by: Maciej Szulik <soltysh@gmail.com>

* fix pollinterval

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* update time duration style

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

---------

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Co-authored-by: Maciej Szulik <soltysh@gmail.com>
2025-09-18 03:53:42 -07:00
Jordan Liggitt
4b0eeeb618
Make pod-security-admission honor emulation version
2025-09-17 15:32:32 -04:00
Patrick Ohly
7a62519b36 E2E: treat data races in e2e suite as failures
Ginkgo itself doesn't do this, in which case prune-junit-xml drops the output
and Spyglass wouldn't show the test as failed. If the data race warning is
captured, we now treat that as the failure of a test if it hasn't already
failed for other reasons.

While at it, the entire report cleanup gets moved to our junit package.
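
The post-processing rule can be sketched like this. The testCase type and the markDataRaces function are hypothetical simplifications and not the code in the junit package; the sketch assumes that a report is detected by the "WARNING: DATA RACE" marker which the Go race detector prints.

```go
package main

import (
	"fmt"
	"strings"
)

// testCase is a simplified stand-in for one JUnit test case.
type testCase struct {
	Name           string
	Failed         bool
	FailureMessage string
	Output         string
}

// raceMarker is the line that the Go race detector prints for each report.
const raceMarker = "WARNING: DATA RACE"

// markDataRaces turns a passing test case into a failed one if its captured
// output contains a data race report. Test cases that already failed for
// other reasons keep their original failure.
func markDataRaces(cases []testCase) {
	for i := range cases {
		tc := &cases[i]
		if tc.Failed || !strings.Contains(tc.Output, raceMarker) {
			continue
		}
		tc.Failed = true
		tc.FailureMessage = "data race detected"
	}
}

func main() {
	cases := []testCase{
		{Name: "clean", Output: "all good"},
		{Name: "racy", Output: "==================\nWARNING: DATA RACE\nRead at ..."},
		{Name: "broken", Failed: true, FailureMessage: "timeout", Output: "WARNING: DATA RACE"},
	}
	markDataRaces(cases)
	for _, tc := range cases {
		fmt.Printf("%s failed=%v message=%q\n", tc.Name, tc.Failed, tc.FailureMessage)
	}
}
```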
2025-09-16 19:34:36 +02:00
Kubernetes Prow Robot
b508767369
Merge pull request #132655 from ylink-lfs/ci/httpd_removal
ci: remove httpd usage while using agnhost instead
2025-09-05 20:23:24 -07:00
Kubernetes Prow Robot
e8b19be173
Merge pull request #133440 from carlory/deflake-service-tests
deflake e2e test: Services should implement NodePort and HealthCheckNodePort correctly when ExternalTrafficPolicy changes
2025-09-05 14:37:42 -07:00
Kubernetes Prow Robot
18c188467d
Merge pull request #133438 from saschagrunert/timeout-pod-should-get-evicted
Increase termination timeout for `evicted pods should be terminal` test
2025-09-03 03:53:14 -07:00
Kubernetes Prow Robot
15b9222fa7
Merge pull request #133477 from chenggu88/e2e
Allow IfNotPresent to be used in node e2e tests
2025-09-03 02:11:16 -07:00
Sascha Grunert
c8f8f66e6d
Increase termination timeout for evicted pods should be terminal test
This doubles the termination timeout for the eviction test from 5min to
10min. The eviction manager relies on pod stats metrics, which may be
inaccessible for some time while the kubelet API is unreachable. This
can be caused by hardware or network pressure when multiple tests run in
parallel.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2025-09-03 08:58:46 +02:00