Commit graph

7068 commits

Author SHA1 Message Date
Kubernetes Prow Robot
43cfcac7cc
Merge pull request #135434 from yliaog/quota_abuse
Fixes the loophole that allows users to workaround resource quota set by system admin
2025-12-17 22:35:28 -08:00
Filip Křepinský
7aa186fa0a
schedule pod availability checks at the correct time in StatefulSets (#135428)
* wire now (time) to the availability checks in the StatefulSet controller

- this helps to make the controller reconcilliation consistent

* schedule pod availability checks at the correct time in StatefulSets

* replace "k8s.io/klog/v2/ktesting" with "k8s.io/kubernetes/test/utils/ktesting"

for advanced features (e.g. Eventually)

* add StatefulSetAvailabilityCheck test
2025-12-17 22:35:21 -08:00
Kubernetes Prow Robot
7795655410
Merge pull request #135402 from xigang/pv_controller
PV controller: Add rate-limiting queues and improve error handling
2025-12-17 21:43:02 -08:00
yliao
3e34de29c4 fixed the loophole that allows user to get around resource quota set by system admin 2025-12-18 00:56:20 +00:00
xigang
8f1ff1d8ce Refactor PV controller to use rate-limiting queues and improve error handling
Signed-off-by: xigang <wangxigang2014@gmail.com>
2025-12-01 19:11:52 +08:00
VijetaPriya47
4795edadee Fix goroutine leak in TestNodeSyncResync 2025-11-11 21:19:28 +05:30
Heba
aceb89debc
KEP-5471: Extend tolerations operators (#134665)
* Add numeric operations to tolerations

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* add default feature gate

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add integration tests

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add toleration value validation

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add validate options for new operators

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove log

Signed-off-by: helayoty <heelayot@microsoft.com>

* Update feature gate check

Signed-off-by: helayoty <heelayot@microsoft.com>

* emove IsValidNumericString func

Signed-off-by: helayoty <heelayot@microsoft.com>

* Implement IsDecimalInteger

Signed-off-by: helayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add logs to v1/toleration

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>

* Update integration tests and address code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add feature gate to the scheduler framework

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove extra test

Signed-off-by: helayoty <heelayot@microsoft.com>

* Fix integration test

Signed-off-by: helayoty <heelayot@microsoft.com>

* pass feature gate via TolerationsTolerateTaint

Signed-off-by: helayoty <heelayot@microsoft.com>

---------

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>
2025-11-10 12:42:54 -08:00
Kubernetes Prow Robot
59d65dad34
Merge pull request #134945 from tchap/kcm-controllers-check-threads
pkg/controller: Improve goroutine management (part 2)
2025-11-06 00:43:01 -08:00
Kubernetes Prow Robot
50b4bcbab5
Merge pull request #134210 from yliaog/admit_quota
DRA extended resource quota
2025-11-06 00:42:53 -08:00
Kubernetes Prow Robot
6723beac00
Merge pull request #135154 from kubernetes/revert-134840-ahmet/mini-cleanup
Revert "controller: duplicate utility method cleanup"
2025-11-05 22:49:04 -08:00
Kubernetes Prow Robot
ca03752ee7
Merge pull request #135104 from mimowo/mutable-job-directives
Allow mutable job scheduling directives on suspended Jobs
2025-11-05 21:57:11 -08:00
Kubernetes Prow Robot
f025bcace9
Merge pull request #135068 from pohly/dra-device-taints-1.35-full
DRA device taint eviction: several improvements
2025-11-05 18:52:58 -08:00
yliao
870062df4f adjusts DRA extended resource quota to include devices usages from regular resource claims 2025-11-05 23:24:24 +00:00
Maciej Szulik
499bff4ca4
Revert "controller: duplicate utility method cleanup" 2025-11-05 21:06:09 +01:00
Michał Woźniak
5a7c90fb76 Allow mutable scheduling directives for suspended Jobs 2025-11-05 19:37:33 +00:00
Patrick Ohly
60744fc8b9 DRA device taint eviction: track evicting rules
This avoids having to call the rule lister (which theoretically, but not in
practice) fail and having to iterate over rules which can be ignored (might be
a small performance boost).
2025-11-05 20:03:17 +01:00
Patrick Ohly
9527987293 DRA device taint eviction: use NOP queue during simulation
It's slightly more efficient and a bit cleaner.
2025-11-05 20:03:17 +01:00
Patrick Ohly
eaee6b6bce DRA device taints: add separate feature gate for rules
Support for DeviceTaintRules depends on a significant amount of
additional code:
- ResourceSlice tracker is a NOP without it.
- Additional informers and corresponding permissions in scheduler and controller.
- Controller code for handling status.

Not all users necessarily need DeviceTaintRules, so adding a second feature
gate for that code makes it possible to limit the blast radius of bugs in that
code without having to turn off device taints and tolerations entirely.
2025-11-05 20:03:17 +01:00
Kubernetes Prow Robot
9ef1a14d68
Merge pull request #134840 from ahmetb/ahmet/mini-cleanup
controller: duplicate utility method cleanup
2025-11-05 08:06:58 -08:00
Kubernetes Prow Robot
9a192aa1c3
Merge pull request #134432 from Karthik-K-N/fix-sv-test
Fix storage version test flake
2025-11-05 06:56:52 -08:00
Ayato Tokubi
320987ead3 Addressed comments 2025-11-05 10:44:50 +00:00
Ayato Tokubi
5102591a6b Refactor resource claim metrics to use structured labels and add "source" dimension.
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-11-05 09:52:47 +00:00
Kubernetes Prow Robot
c1a6a3ca71
Merge pull request #134152 from pohly/dra-device-taints-1.35
DRA: device taints: new ResourceSlice API, new features
2025-11-04 15:32:07 -08:00
Ondra Kupka
024382658b controller/volume/vacprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
e08d03b1b5 controller/volume/selinuxwarning: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
1e6ad423bf controller/volume/pvprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
0caae6f704 controller/volume/pvcprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
ed74779a0f controller/volume/persistentvolume: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
8eab454e38 controller/volume/expand: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
27774052ab controller/volume/ephemeral: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
12205df76d controller/volume/attachdetach: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
9d4ff6ecf2 controller/tainteviction: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
d2a443db75 controller/serviceaccount: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
c641df792b controller/resourcequota: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
d908a470a5 controller/garbagecollector: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Kubernetes Prow Robot
97cb47a913
Merge pull request #135080 from dejanzele/feat/promote-job-managedby-to-ga
KEP-4368: Job Managed By; Promote to GA
2025-11-04 13:42:12 -08:00
Patrick Ohly
bbf8bc766e DRA device taints: DeviceTaintRule status
To update the right statuses, the controller must collect more information
about why a pod is being evicted. Updating the DeviceTaintRule statuses then is
handled by the same work queue as evicting pods.

Both operations already share the same client instance and thus QPS+server-side
throttling, so they might as well share the same work queue. Deleting pods is
not necessarily more important than informing users or vice-versa, so there is
no strong argument for having different queues.

While at it, switching the unit tests to usage of the same mock work queue as
in staging/src/k8s.io/dynamic-resource-allocation/internal/workqueue. Because
there is no time to add it properly to a staging repo, the implementation gets
copied.
2025-11-04 21:57:24 +01:00
Patrick Ohly
0689b628c7 generated files 2025-11-04 21:57:24 +01:00
Patrick Ohly
f4a453389d DRA device taint eviction: configurable number of workers
It might never be necessary to change the default, but it is hard to be sure.
It's better to have the option, just in case.
2025-11-04 21:57:24 +01:00
Kubernetes Prow Robot
a058cf788a
Merge pull request #134624 from yt2985/podcertificates-beta
Promote Pod Certificates feature to beta
2025-11-04 11:42:12 -08:00
Dejan Zele Pejchev
3dabd4417d
KEP-4368: Job Managed By; Promote to GA
Signed-off-by: Dejan Zele Pejchev <pejcev.dejan@gmail.com>
2025-11-04 10:59:45 +01:00
Kubernetes Prow Robot
d6aa2db57e
Merge pull request #135027 from omerap12/remove-reactor-hpa
Remove unused delete reactor
2025-11-04 01:30:10 -08:00
Kubernetes Prow Robot
48c56e04e0
Merge pull request #135017 from liggitt/stateful-set-noop-rollout
Fix spurious statefulset rollout from 1.33 → 1.34
2025-11-03 19:58:11 -08:00
Kubernetes Prow Robot
41673c7198
Merge pull request #134910 from tchap/kcm-controllers-thread-mgmt
pkg/controller: Improve goroutine management
2025-11-03 17:58:03 -08:00
Jordan Liggitt
979c442774
Fix spurious workload rollout due to null creationTimestamp in controller revisions 2025-11-03 17:11:06 -05:00
Jordan Liggitt
7d186d870f
Remove unused and fragile revision hash comparisons
This was broken since 666a41c2ea when the label value became non-integer encoded
The chance of one controller revision hash label being int-parsable: 7/27 ^ 8 = 0.00002041 = ~0
The chance of both being int-parsable: 0.00002041^2 = ~0

Hash comparison locks in differences in content failing EqualRevision
even when the semantic content is normalized to be equal.
2025-11-03 16:33:40 -05:00
Jordan Liggitt
94e085e15c
Add unit test detecting spurious statefulset rollout 2025-11-03 16:33:39 -05:00
Lukasz Szaszkiewicz
c832203707 pkg/controller/garbagecollector/garbagecollector_test: wrap kubeClient with a client that doesn't support WatchList semantics. 2025-11-03 10:41:49 +01:00
tinatingyu
59e075e8d3 Promote PodCertificateRequests to v1beta1 2025-11-02 05:33:44 +00:00
Omer Aplatony
264eab46db Remove unused delete reactor
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-11-01 06:13:40 +00:00