kubernetes

mirror of https://github.com/kubernetes/kubernetes.git synced 2026-06-22 07:50:50 -04:00

Author	SHA1	Message	Date
kubernetes-prow[bot]	11ea3c2a46	Merge pull request #139896 from pohly/e2e-node-artifacts-default E2E node: consider ARTIFACTS env variable for results dir	2026-06-21 13:53:36 +00:00
Patrick Ohly	8eed364890	E2E node: bump to -v4 for remote testing The hard-coded verbosity in `make test-e2e-node` is 4 (`17e2eda611/hack/make-rules/test-e2e-node.sh (L248)`). Pre-pending -v4 emulates that behavior, with the difference that an explicit -v passed by the caller (typically kubetest2) could be used to override it.	2026-06-21 12:43:07 +02:00
Patrick Ohly	fcc5d50fde	E2E node: consider ARTIFACTS env variable for results dir `make test-e2e-node` sets the -results-dir based on the ARTIFACTS Prow job env variable. When e2e_node.test gets invoked directly, it should do the same, otherwise JUnit and log files are not captured for the job.	2026-06-21 11:57:09 +02:00
Kubernetes Prow Robot	17e2eda611	Merge pull request #139745 from ngopalak-redhat/ngopalak/fix_is_xfs Ensure is_xfs evaluates to true in quota tests	2026-06-20 14:19:38 +05:30
Francesco Romani	f0b952f2a1	e2e: node: consolidate more createPodSync calls fix the missing instance which escaped the fix in `f7bd739f22` Signed-off-by: Francesco Romani <fromani@redhat.com>	2026-06-18 17:20:33 +02:00
Neeraj Krishna Gopalakrishna	0a36086243	Ensure is_xfs evaluates to true in quota tests Signed-off-by: Neeraj Krishna Gopalakrishna <ngopalak@redhat.com>	2026-06-18 09:39:38 +05:30
Kubernetes Prow Robot	9d6e94a40d	Merge pull request #139741 from bart0sh/PR241-kubelet-podresources-utils-fix-contextual-todos kubelet/podresources, kubelet/util/manager: propagate logger/context	2026-06-15 22:25:35 +05:30
Ed Bartosh	80e8baa8b8	kubelet/podresources: pass context to GetV1*Client Replace context.TODO() with a context parameter passed by callers.	2026-06-15 13:13:51 +03:00
Kubernetes Prow Robot	57110af20d	Merge pull request #129079 from Tal-or/smtalignment_error staticpolicy:smtalign: count for pre-allocated cpus for container	2026-06-15 14:27:23 +05:30
Talor Itzhak	e8e3fb93ee	e2e:node: consider pre-allocated CPUs This test verifies that pods with pre-allocated CPUs (from the checkpoint file) are not rejected after kubelet restart when SMT alignment is enabled. Regression test for the fix where the container presence check was moved before the SMT alignment check. The key is to request enough CPUs so that if pre-allocated CPUs are not counted, the SMT alignment check would fail due to insufficient available physical CPUs. Calculate the maximum SMT-aligned CPUs we can request. We need to request most of the allocatable CPUs to trigger the bug. Signed-off-by: Talor Itzhak <titzhak@redhat.com>	2026-06-14 12:17:32 +03:00
Kubernetes Prow Robot	79751b17da	Merge pull request #137278 from humblec/update-npd-v1.35.2 Update node-problem-detector to v1.35.2 and remove addon manifests	2026-06-11 20:26:42 +05:30
Kubernetes Prow Robot	3841ba06c2	Merge pull request #139530 from QiWang19/cleanuppod-grace-period Set short termination grace period for test pods in MemoryQoS tests	2026-06-11 08:04:49 +05:30
Humble Devassy Chirammal	05033bc8ca	Update node-problem-detector to v1.35.2 and remove addon manifests Update node-problem-detector from v1.34.0 to v1.35.2 and remove all related addon manifests and install logic that is no longer needed: - Update version in build/dependencies.yaml, test/e2e_node/image_list.go and test/kubemark/resources/hollow-node_template.yaml. - Remove cluster/addons/node-problem-detector/ entirely. No e2e tests depend on these manifests: e2e_node tests create NPD pods inline and GCE standalone mode runs NPD as a systemd service. - Remove install-node-problem-detector function and DEFAULT_NPD_* vars from cluster/gce/gci/configure.sh along with the conditional that invoked it, since NPD is no longer installed as a standalone binary via this script. - Remove the setup-addon-manifests calls for node-problem-detector from cluster/gce/gci/configure-helper.sh since the source directory no longer exists. - Remove stale refPaths in build/dependencies.yaml that pointed to the deleted addon files. Signed-off-by: Humble Devassy Chirammal <humble.devassy@gmail.com>	2026-06-10 14:04:57 +05:30
Qi Wang	82e38acb67	Set short termination grace period for test pods in MemoryQoS tests	2026-06-09 13:38:34 -04:00
Sergey Kanzhelev	d74b5907d5	builder pattern in cri client	2026-06-09 09:24:06 -07:00
HirazawaUi	e79d1a4271	Fix flaking e2e_node tests	2026-06-07 15:11:57 +08:00
Kubernetes Prow Robot	a0afe51e25	Merge pull request #139129 from pohly/e2e-node-update-local E2E node: enable using release archives for periodic jobs, simplified	2026-06-03 22:09:47 +05:30
Patrick Ohly	de2d13b27e	e2e_node: support pre-built binaries This is not usable through "make test-e2e-node", which (while feasible) would be a bit pointless because the Kubernetes source code would still be needed for the make rules. Instead, "kubetest2 noop -test=node" gets extended to invoke `e2e_node.test remote` with flags that tell e2e_node.test where to find the binaries and flags that were provided by the caller of kubetest2.	2026-06-03 10:32:48 +02:00
Patrick Ohly	2d574790a6	e2e_node: fix log output fmt.Printf lacked the trailing newline and is inconsistent with other output, which uses klog.	2026-06-03 08:34:56 +02:00
Patrick Ohly	6ba4d21765	e2e_node: multiplex different commands in e2e_node.test The additional commands (mounter, gcp-credentials-provider) are needed for E2E node testing. This change makes e2e_node.test entirely self-contained. Copying the commands' code into separate packages is temporary and only done to avoid touching them while it is still unclear whether this approach will work out. Besides avoiding changes to the build rules, bundling the functionality also has a slight size advantage: the size of e2e_node.test increases by 10KB, whereas the other two separate commands would add 10MB.	2026-06-03 08:34:56 +02:00
Patrick Ohly	071c858417	e2e_node: invoke make once for all targets The caller does not need to enable or disable CGO explicitly, the build rules do that automatically: $ make WHAT="cmd/kubelet cluster/gce/gci/mounter" +++ [0515 17:02:56] Building go targets for linux/amd64 k8s.io/kubernetes/cluster/gce/gci/mounter (static) k8s.io/kubernetes/cmd/kubelet (non-static) BuildGo builds the same targets as before. BuildTargets gets changed to accept a list of targets from the caller, which is a more useful package API.	2026-06-03 08:34:56 +02:00
Kubernetes Prow Robot	3c47d576e5	Merge pull request #137620 from Karthik-K-N/remove-hardcoded-volpath test: Replace hardcoded kubelet volume paths with TestContext.KubeletRootDir in node e2e tests	2026-06-03 08:05:43 +05:30
Kubernetes Prow Robot	ec15ec6d09	Merge pull request #139377 from sohankunkerkar/fix-memoryqos-high-rollback kubelet: clear stale memory.high on containers when MemoryQoS is disabled	2026-06-01 21:34:50 +05:30
Francesco Romani	9d9fd50e15	node: e2e: remove tests referring disable CPU quota The DisableCPUQuotaWithExclusiveCPUs FG is now locked to true, so we can remove all the tests referring to it. Some of them were backward compatibility tests - no longer needed if the FG is locked; some other tests explicitly set the FG to true - no longer needed either as the default is true and can't be changed anymore. Signed-off-by: Francesco Romani <fromani@redhat.com>	2026-06-01 09:51:23 +02:00
Davanum Srinivas	bf37c18d74	e2e: node: cpumanager: don't set the locked DisableCPUQuotaWithExclusiveCPUs gate to false DisableCPUQuotaWithExclusiveCPUs is locked to its default (true) since v1.37, so any KubeletConfiguration that sets it to false is rejected and crash-loops the kubelet at startup. configureCPUManagerInKubelet wrote the gate unconditionally and the field defaults to false, so every CPU Manager test that reconfigured the kubelet hit it. Only set the gate when true, and skip the "CFS quota can be disabled" block that exercised the false path. Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2026-05-30 21:47:39 -04:00
Sohan Kunkerkar	0e5d54a29a	kubelet: clear stale memory.high on containers when MemoryQoS is disabled Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>	2026-05-29 23:46:50 -04:00
hoteye	9c85877613	kubelet: pass logger into container ID parsing Pass a logger into ParseContainerID instead of creating a klog.TODO inside the helper. This lets kubelet, prober, and node e2e call sites use their available contextual logger when container ID parsing fails.	2026-05-27 10:08:32 +08:00
Kubernetes Prow Robot	3fa9f8f97d	Merge pull request #139183 from hoteye/hoteye-util-boottime-context kubelet: thread logger through boot time lookup	2026-05-23 17:08:42 +05:30
Kubernetes Prow Robot	31646c4d02	Merge pull request #139121 from carlory/update-kubelet-removal-1.38 kubelet: defer CRI fallback removal to 1.38	2026-05-22 23:10:59 +05:30
Kubernetes Prow Robot	ec8eaa5789	Merge pull request #139178 from sohankunkerkar/add-memory-events-metrics-test Add memory.events metrics to container metrics test	2026-05-21 10:16:50 +05:30
Sohan Kunkerkar	f1cd17ea97	Add memory.events metrics to container metrics test Verify container_memory_events_high_total and container_memory_events_max_total are reported by cadvisor. These counters were added in cadvisor v0.57.0 to expose cgroup v2 memory.events for MemoryQoS observability. KEP: https://github.com/kubernetes/enhancements/issues/2570 Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>	2026-05-20 16:19:08 -04:00
Kubernetes Prow Robot	c9faa15c83	Merge pull request #139184 from pohly/e2e-node-flag-cleanup e2e_node: avoid polluting e2e_node command line with helper packages	2026-05-20 23:54:53 +05:30
Patrick Ohly	62cfe57459	e2e_node: avoid polluting e2e_node command line with helper packages e2e_node.test depends on test/e2e_node/builder and test/e2e_node/remote because test/e2e_node/services/ uses some small helper functions from those two packages. But e2e_node.test itself never builds any Go binaries, nor does it run remote testing - that functionality is provided by the separate test/e2e_node/runner commands. Therefore these two packages should not put their command line flags into flag.CommandLine because then they show up in the command line of e2e_node test unnecessarily. This change removes the following flags from the e2e_node.test command line: diff -r before/e2e_node after/e2e_node 7,8d6 < --build-only If true, build e2e_node_test.tar.gz and exit. < --cleanup If true remove files from remote hosts and delete temporary instances (default true) 20d17 < --delete-instances If true, delete any instances created (default true) 42d38 < --ginkgo-flags string Passed to ginkgo to specify additional flags such as --skip=. 95d90 < --gubernator If true, output Gubernator link to view logs 97d91 < --hosts string hosts to test 99,100d92 < --image-config-dir string (optional) path to image config files < --image-config-file string yaml file describing images to run 103d94 < --images string images to test 105,106d95 < --instance-name-prefix string prefix for instance names < --k8s-bin-dir string Directory containing k8s kubelet binaries. 120d108 < --mode string Mode to operate in. One of gce|ssh. Defaults to gce (default "gce") 133d120 < --results-dir string Directory to scp test results to. (default "/tmp/") 142,145d128 < --ssh-env string Use predefined ssh options for environment. Options: gce < --ssh-key string Path to ssh private key. < --ssh-options string Commandline options passed to ssh. < --ssh-user string Use predefined user for ssh. 160,161d142 < --target-build-arch string Target architecture for the test artifacts for dockerized build (default "linux/amd64") < --test-timeout duration How long (in golang duration format) to wait for ginkgo tests to complete. (default 45m0s) 196d176 < --test_args string Space-separated list of arguments to pass to Ginkgo test runner. 198d177 < --use-dockerized-build Use dockerized build for test artifacts	2026-05-20 12:00:01 +02:00
hoteye	4d24257a5e	kubelet: thread logger through boot time lookup Pass a logger into GetBootTime so the Linux fallback path no longer creates a local context.TODO() only to derive a logger. This keeps boot time lookup behavior unchanged and updates the node startup latency tracker constructor to accept a logger instead of a context, matching contextual logging migration guidelines.	2026-05-20 15:22:00 +08:00
carlory	f4d97c13f5	kubelet: defer CRI fallback removal to 1.38	2026-05-18 09:52:38 +08:00
Kubernetes Prow Robot	97d2d4a29f	Merge pull request #139073 from sohankunkerkar/fix/memoryqos-rollback-startup-cleanup Use updateKubeletConfig helper in rollback tests	2026-05-15 23:24:36 +05:30
Kubernetes Prow Robot	908fa4852b	Merge pull request #139033 from saschagrunert/fix/container-metrics-direct-io Use direct I/O for ContainerMetrics cadvisor test	2026-05-15 23:24:29 +05:30
Sohan Kunkerkar	7bef6a3ab1	Use updateKubeletConfig helper in rollback tests Address review feedback to use the standard updateKubeletConfig helper instead of manual WriteKubeletConfigFile + restartKubelet + waitForKubeletToStart.	2026-05-14 16:33:41 -04:00
Kubernetes Prow Robot	4f39ba34ff	Merge pull request #138903 from sohankunkerkar/fix/memoryqos-rollback-startup-cleanup Clear stale MemoryQoS cgroup values at kubelet startup	2026-05-15 00:30:28 +05:30
Sascha Grunert	5843e8ce1a	Use direct I/O for ContainerMetrics cadvisor test Overlayfs does not support cgroupv2 writeback accounting, so buffered writes (even with conv=fsync) get attributed to the root cgroup instead of the container's cgroup. This causes cadvisor to see an empty io.stat for the container, making container_blkio_device_usage_total, container_fs_reads_bytes_total, and container_fs_writes_bytes_total permanently absent. Switch to oflag=direct for writes and add iflag=direct reads to bypass the page cache entirely. Direct I/O is always attributed to the issuing process's cgroup regardless of filesystem type. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2026-05-13 14:36:29 +02:00
Kubernetes Prow Robot	874a7b40b0	Merge pull request #138617 from esotsal/kubeletHealthCheckRefactor Move kubeletHealthCheck from e2enode to node as HealthCheck	2026-05-12 02:26:10 +05:30
Kubernetes Prow Robot	5cf56a97d5	Merge pull request #138851 from saschagrunert/fix/container-metrics-flake Fix ContainerMetrics cadvisor test flake for block I/O metrics	2026-05-10 18:37:47 +05:30
Sotiris Salloumis	20c57876a4	Increase bound CPU limit to 2e+10 to fix admission api flaky test. After the command used to increase UsageNanoCores was replaced to fix a previous flaky test, UsageNanoCores exceeds the limit of 2e+09 in some test environments. This commit attempts to fix this by increasing the UsageNanoCores limit to 2e+10.	2026-05-09 09:46:23 +02:00
Sohan Kunkerkar	85d3992ac1	Clear stale MemoryQoS cgroup values at kubelet startup When MemoryQoS is disabled after being previously enabled, stale memory.min and memory.low values persist on QoS-class cgroups because systemd re-applies stored properties on every SetUnitProperties call. Fix this by including memory.min=0 and memory.low=0 in the existing startup dbus calls (enforceNodeAllocatableCgroups for the root cgroup, qosContainerManager.Start for the burstable cgroup). This overwrites systemd's stored stale values so subsequent realizations re-apply 0. Fixes https://github.com/kubernetes/kubernetes/issues/138436 KEP: https://github.com/kubernetes/enhancements/issues/2570 Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>	2026-05-08 13:14:59 -04:00
Kubernetes Prow Robot	4818833ecc	Merge pull request #138820 from esotsal/fix-sriov-cpumanager Fix podresources flaky test: wait for Pod Resources V1 serving in flaky test	2026-05-08 00:05:18 +05:30
Sascha Grunert	ee9f8c6bde	Fix ContainerMetrics cadvisor test flakes Replace the small echo write with a dd that uses conv=fsync to force data through the block layer. Without fsync, the 11-byte echo writes stay in page cache and never reach the block device within the 60-second test window. This leaves the cgroup io.stat empty, so cadvisor does not emit container_blkio_device_usage_total, container_fs_reads_bytes_total, or container_fs_writes_bytes_total for the container. The conv=fsync call guarantees block device I/O on every loop iteration. Once io.stat has an entry for a device, all fields (rbytes, wbytes, rios, wios) are present, even if zero, so all cadvisor metrics pass their boundedSample(0, ...) checks. Also increase the UsageCoreNanoSeconds upper bound from 1e11 to 1e12 for the container and pod-level CPU checks. The cumulative CPU time can exceed 100s on slower architectures like ppc64le where the dd CPU burner loop accumulates faster than expected. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2026-05-07 15:01:02 +02:00
Kubernetes Prow Robot	d92b8fe8f2	Merge pull request #138739 from zxqlxy/device-plugin-slow-register Add e2e test for device plugin slow register	2026-05-07 11:42:31 +05:30
Sotiris Salloumis	acabaa7d50	Fix podresources flaky test: wait for Pod Resources V1 serving in flaky test One podresources test was not waiting for Pod Resources V1 to be serving. This can lead to flaky tests in a next step. This change attempts to fix this flaky test by adding waitForPodResourcesV1Serving(ctx), as done in the remaining tests. In addition, ExpectNoError was added to all connection-closing attempts to improve troubleshooting.	2026-05-07 05:35:17 +02:00
Xinyun Liu	62e23b9857	Add E2E test for multiple device plugins where the second one struggles to register	2026-05-06 23:48:32 +00:00
Paco Xu	11d08fcb7f	Revert "remove flaky label in SRIOV related tests"	2026-05-06 17:11:33 +08:00

3572 commits