7103 commits

Author SHA1 Message Date
Patrick Ohly
6bfa727bee client-go testing: fix List+Watch support
5644850607 added support for List+Watch to a fake client-go instance.
However, that support was not quite working yet as seen when analyzing a test
flake:

- List returned early when there were no objects, without adding the
  ResourceVersion. The ResourceVersion should have been "0" instead.
- When encountering "" as ResourceVersion, Watch didn't deliver
  any objects. That was meant to preserve compatibility with clients
  which don't expect objects from a Watch, but the correct semantics of
  "" is "Start at most recent", which includes delivering existing
  objects.

Tests which meddle with the List implementation via a reactor (like
clustertrustbundlepublisher) have to be aware that Watch now may
return objects when given an empty ResourceVersion.
2026-01-15 16:08:23 +01:00
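
The two fixes described in the commit above can be pictured with a toy model. This is a minimal sketch, not client-go code: `fakeStore`, `object`, `Add`, `List`, and `Watch` are hypothetical names, and the numeric resource-version comparison is a simplifying assumption. It shows only the two behaviors the message names: an empty List still reports ResourceVersion "0", and a Watch started with "" delivers the objects that already exist.

```go
package main

import (
	"fmt"
	"strconv"
)

// object is a hypothetical stored item tagged with the version at which it was added.
type object struct {
	name string
	rv   int
}

// fakeStore is a toy stand-in for a fake client's object tracker (assumption, not client-go).
type fakeStore struct {
	rv      int
	objects []object
}

// Add stores an object and bumps the store's resource version.
func (s *fakeStore) Add(name string) {
	s.rv++
	s.objects = append(s.objects, object{name: name, rv: s.rv})
}

// List always reports a resource version, including "0" for an empty store.
func (s *fakeStore) List() ([]string, string) {
	names := make([]string, 0, len(s.objects))
	for _, o := range s.objects {
		names = append(names, o.name)
	}
	return names, strconv.Itoa(s.rv)
}

// Watch returns the objects a watcher would initially receive.
// "" is treated as "start at most recent", so existing objects are delivered.
// Any other value delivers only objects added after that version.
func (s *fakeStore) Watch(resourceVersion string) ([]string, error) {
	since := 0
	if resourceVersion != "" {
		parsed, err := strconv.Atoi(resourceVersion)
		if err != nil {
			return nil, fmt.Errorf("invalid resource version %q: %w", resourceVersion, err)
		}
		since = parsed
	}
	var names []string
	for _, o := range s.objects {
		if o.rv > since {
			names = append(names, o.name)
		}
	}
	return names, nil
}

func main() {
	s := &fakeStore{}
	_, rv := s.List()
	fmt.Printf("empty list: resourceVersion=%q\n", rv)

	s.Add("pod-a")
	initial, _ := s.Watch("")
	fmt.Println("watch from \"\":", initial)

	_, rv = s.List()
	later, _ := s.Watch(rv)
	fmt.Println("watch from list result:", later)
}
```

A test that replaces List through a reactor and then starts a Watch with "" would, in this model, see `pod-a` arrive through the Watch, which is the behavior change the commit message warns about.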
Kubernetes Prow Robot
d8556481df
Merge pull request #135551 from Jefftree/deployment
Add pod indexer to deployment controller
2026-01-09 01:19:49 +05:30
Jefftree
5e753a131d Update benchmark for deployment 2026-01-08 15:10:32 +00:00
Anson Qian
a816a7b1d8
Make ConcurrentResourceClaimSyncs configurable (#134701)
* DRA resource claim controller: configurable number of workers

It might never be necessary to change the default, but it is hard to be sure.
It's better to have the option, just in case.

* generate files

* resourceclaimcontroller: normalize validation error message

* Update cmd/kube-controller-manager/app/options/resourceclaimcontroller.go

Co-authored-by: Jordan Liggitt <jordan@liggitt.net>

---------

Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
2026-01-08 19:31:39 +05:30
Jefftree
1250c7d56e Add pod indexer to deployment controller 2026-01-08 13:55:48 +00:00
Kubernetes Prow Robot
c9994c5f82
Merge pull request #135234 from atiratree/renameScaleReplicaSetAndRecordEvent
rename scaleReplicaSetAndRecordEvent to scaleReplicaSetWithLazyAnnotationUpdate
2026-01-08 18:39:40 +05:30
Kubernetes Prow Robot
5fbb132d69
Merge pull request #135625 from atiratree/quotamonitor-race
mark QuotaMonitor as not running and invalidate monitors list
2026-01-08 17:19:38 +05:30
Kubernetes Prow Robot
7e7267a6df
Merge pull request #136046 from Tanner-Gladson/issue-136027-error-verbosity
fix(controller/storageversionmigrator) Reduce log level
2026-01-08 04:03:50 +05:30
Kubernetes Prow Robot
df9a0bda18
Merge pull request #133797 from tico88612/cleanup/new-fake-with-options
Replace apimachinery/pkg/watch.NewFake with NewFakeWithOptions in pkg/controller
2026-01-08 03:01:38 +05:30
Kubernetes Prow Robot
18663b347e
Merge pull request #135983 from Goend/master
Fix the issue of slow creation of ResourceClaim in specific scenarios
2026-01-07 20:05:46 +05:30
Tanner
bb8b4b0d80 Correct the usage of vlog's .Error() or .V().Info() methods 2026-01-06 20:31:04 -05:00
Kubernetes Prow Robot
6af6361e3b
Merge pull request #134798 from aditigupta96/fix-runwithcontext-apimachinery
apimachinery: Use informer.RunWithContext in various components
2026-01-07 05:45:37 +05:30
Goend
13e46ffc45 Fix the issue of slow creation of ResourceClaim in specific scenarios 2026-01-06 19:18:58 +08:00
Karthik Bhat
cd7d35fa3d Fix flake TestDeviceTaintRule test by adjusting event handler status update logic
Co-authored-by: Pohly <patrick.ohly@intel.com>
2026-01-06 11:00:06 +05:30
Kubernetes Prow Robot
7d0b8f979c
Merge pull request #135629 from jsafrane/selinux-fix-completed-pods
selinux: Fix the controller to ignore finished pods
2025-12-19 11:52:33 -08:00
Jan Safranek
80d0b0f8cc Add unit test with CSIDriver.SELinuxMount=false
Add unit test with a volume plugin that does not support SELinux. That
simulates a CSI driver whose spec.SELinuxMount is empty or false.

This requires a little refactoring: each unit test now has a flag that says
whether it runs with a volume plugin that supports SELinux.
2025-12-19 15:01:01 +01:00
Kubernetes Prow Robot
31fb6f64ef
Merge pull request #135821 from pohly/dra-device-taints-owner
DRA device taints controller: add pohly to OWNERS
2025-12-18 19:24:38 -08:00
Kubernetes Prow Robot
b9d491f56e
Merge pull request #134556 from carlory/fix-133160
lock the feature-gate VolumeAttributesClass to default (true)
2025-12-18 15:13:17 -08:00
Kubernetes Prow Robot
c34c5a5426
Merge pull request #135608 from pohly/dra-device-taints-unit-tests-helpers
DRA device taints: fix and simplify unit tests
2025-12-18 03:21:23 -08:00
Patrick Ohly
9194bfe75b DRA device taints controller: add pohly to OWNERS
While the code is nominally owned by SIG Scheduling, in practice I am the one
who knows it best, so I should be a reviewer and should be able to merge simple
changes without additional approvals (will use cautiously!).
2025-12-18 12:07:52 +01:00
carlory
f8e8e55f1d
locked the feature-gate VolumeAttributesClass to default (true) and switched storage version from v1beta1 to v1
Signed-off-by: carlory <baofa.fan@daocloud.io>
2025-12-18 15:59:33 +08:00
Kubernetes Prow Robot
01697eb712
Merge pull request #135647 from grandeit/fix-gc-data-race-135621
Fix data race in garbage collector on node.owners field
2025-12-17 23:28:58 -08:00
Kubernetes Prow Robot
43cfcac7cc
Merge pull request #135434 from yliaog/quota_abuse
Fixes the loophole that allows users to work around resource quota set by system admin
2025-12-17 22:35:28 -08:00
Filip Křepinský
7aa186fa0a
schedule pod availability checks at the correct time in StatefulSets (#135428)
* wire now (time) to the availability checks in the StatefulSet controller

- this helps to make the controller reconciliation consistent

* schedule pod availability checks at the correct time in StatefulSets

* replace "k8s.io/klog/v2/ktesting" with "k8s.io/kubernetes/test/utils/ktesting"

for advanced features (e.g. Eventually)

* add StatefulSetAvailabilityCheck test
2025-12-17 22:35:21 -08:00
Kubernetes Prow Robot
7795655410
Merge pull request #135402 from xigang/pv_controller
PV controller: Add rate-limiting queues and improve error handling
2025-12-17 21:43:02 -08:00
yliao
3e34de29c4 fixed the loophole that allows user to get around resource quota set by system admin 2025-12-18 00:56:20 +00:00
Manuel Grandeit
b18339638a Fix data race in garbage collector on node.owners field
Add ownersLock to protect concurrent access to node.owners between
GraphBuilder.processGraphChanges() (writer) and GC worker goroutines
reading in blockingDependents() and unblockOwnerReferences() methods.

Also fix concurrent reads in the HTTP debug handler (/graph endpoint)
for owners, dependents, beingDeleted, deletingDependents, and virtual
fields by using their respective thread-safe accessor methods.
2025-12-16 22:38:59 +01:00
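
The locking pattern described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the garbage collector's code: `ownerRef`, `setOwners`, `getOwners`, and `blockingOwners` are hypothetical names, and only the `ownersLock` field name comes from the commit message. One writer replaces the owners while several readers go through an accessor that takes the read lock.

```go
package main

import (
	"fmt"
	"sync"
)

// ownerRef is a simplified stand-in for an owner reference (assumption).
type ownerRef struct {
	Name               string
	BlockOwnerDeletion bool
}

// node is a trimmed-down graph node; ownersLock guards owners.
type node struct {
	ownersLock sync.RWMutex
	owners     []ownerRef
}

// setOwners replaces the owners under the write lock (the writer side).
func (n *node) setOwners(owners []ownerRef) {
	n.ownersLock.Lock()
	defer n.ownersLock.Unlock()
	n.owners = owners
}

// getOwners returns a copy under the read lock, so readers never share
// the backing array with the writer.
func (n *node) getOwners() []ownerRef {
	n.ownersLock.RLock()
	defer n.ownersLock.RUnlock()
	return append([]ownerRef(nil), n.owners...)
}

// blockingOwners filters the owners that block deletion, using the accessor
// instead of touching the field directly.
func (n *node) blockingOwners() []ownerRef {
	var blocking []ownerRef
	for _, o := range n.getOwners() {
		if o.BlockOwnerDeletion {
			blocking = append(blocking, o)
		}
	}
	return blocking
}

func main() {
	n := &node{}
	var wg sync.WaitGroup

	// One writer goroutine keeps replacing the owners.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			n.setOwners([]ownerRef{{Name: fmt.Sprintf("owner-%d", i), BlockOwnerDeletion: true}})
		}
	}()

	// Several reader goroutines read concurrently through the accessor.
	for r := 0; r < 4; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				_ = n.blockingOwners()
			}
		}()
	}

	wg.Wait()
	fmt.Println("owners after run:", len(n.getOwners()))
}
```

Running a program like this with `go run -race` is one way to check that no unsynchronized access to `owners` remains. Returning a copy from the accessor is a design choice of this sketch, not something the commit message states.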
Jan Safranek
e701a37a1e Use only enqueuePod to add pods to the controller queue
enqueuePod already creates the right key for a pod, it's better to reuse it
than copy the code around.
2025-12-12 11:19:13 +01:00
Jan Safranek
cfa65ceed2 Fix policy of Pods with unknown SELinux label
Reset SELinuxChangePolicy of Pods that have no SELinux label set to
Recursive. Kubelet cannot mount with `-o context=<label>`, if the label is
not known.

This fixes the e2e test error revealed by the previous commit - it changed the
e2e test to check for events when no events are expected and it found a
warning about a Pod with no label, but MountOption policy.
2025-12-12 11:17:54 +01:00
Jan Safranek
cbcf845810 Add new unit tests 2025-12-12 11:17:54 +01:00
Jan Safranek
7609325a9a Rework unit tests to builder pattern 2025-12-12 11:17:54 +01:00
Jan Safranek
fa1847ac40 selinux: Do not report conflicts with finished pods
When a Pod reaches its final state (Succeeded or Failed), its volumes are
getting unmounted and therefore their SELinux mount option will not
conflict with any other pod.

Let the SELinux controller monitor "pod updated" events to see that the pod
is finished.
2025-12-12 11:17:51 +01:00
Jan Safranek
6666bd52b8 refactoring: use a common function to enqueue Pod
addPod and deletePod have the same implementation, merge them into
enqueuePod
2025-12-08 12:36:56 +01:00
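
The refactoring in the commit above amounts to routing every pod event through one function that builds the queue key. The following is a minimal sketch with hypothetical types (`pod`, `controller`, and a slice standing in for the work queue); the real controller's key function and queue type are not shown in this log and are assumed here to be a plain "namespace/name" string and a slice.

```go
package main

import "fmt"

// pod is a hypothetical, minimal pod identity.
type pod struct {
	Namespace string
	Name      string
}

// controller holds a slice standing in for a work queue (assumption).
type controller struct {
	queue []string
}

// enqueuePod is the single place that turns a pod into a queue key.
func (c *controller) enqueuePod(p pod) {
	c.queue = append(c.queue, p.Namespace+"/"+p.Name)
}

// The event handlers all delegate to enqueuePod instead of duplicating
// the key construction.
func (c *controller) onAdd(p pod)    { c.enqueuePod(p) }
func (c *controller) onUpdate(p pod) { c.enqueuePod(p) }
func (c *controller) onDelete(p pod) { c.enqueuePod(p) }

func main() {
	c := &controller{}
	p := pod{Namespace: "default", Name: "web-0"}
	c.onAdd(p)
	c.onUpdate(p)
	c.onDelete(p)
	fmt.Println(c.queue)
}
```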
Patrick Ohly
b2151b1f51 DRA device taints: fix and simplify unit tests
Using `t` instead of `tCtx` is subtly wrong: the failure is attributed to the
parent test, not the sub-test. Using a separate function with tCtx as
parameter ensures that t is not in scope of the code and thus this mistake
cannot happen. The number of lines is the same, it's just a bit more code.

For TestRetry another advantage is the reduced indentation.

It's worth calling out that the same cannot be done for benchmarks:
- They need methods (Loop) or fields (N) which are not exposed by TContext.
- The `for b.Loop()` pattern only works if the for loop is written exactly
  like that.
2025-12-05 19:13:55 +01:00
Filip Křepinský
feffdbbcf2 mark QuotaMonitor as not running and invalidate monitors list
to prevent close of closed channel panic
2025-12-05 15:50:32 +01:00
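
Closing a Go channel twice panics, which is the failure the commit above guards against. The sketch below shows one way such a guard can look; it is an assumption about the general technique, not the QuotaMonitor implementation. `quotaMonitor`, `monitor`, `newQuotaMonitor`, and `stop` are hypothetical names. After the first stop, the monitor is marked as not running and the monitors map is dropped, so a second stop finds nothing left to close.

```go
package main

import (
	"fmt"
	"sync"
)

// monitor is a hypothetical per-resource watcher with its own stop channel.
type monitor struct {
	stopCh chan struct{}
}

// quotaMonitor is a toy model; field names are assumptions.
type quotaMonitor struct {
	mu       sync.Mutex
	running  bool
	monitors map[string]*monitor
}

// newQuotaMonitor creates a running monitor set for the given resources.
func newQuotaMonitor(resources ...string) *quotaMonitor {
	qm := &quotaMonitor{running: true, monitors: map[string]*monitor{}}
	for _, r := range resources {
		qm.monitors[r] = &monitor{stopCh: make(chan struct{})}
	}
	return qm
}

// stop closes each monitor's stop channel, then marks the monitor set as not
// running and invalidates the list. It returns how many monitors were stopped.
func (qm *quotaMonitor) stop() int {
	qm.mu.Lock()
	defer qm.mu.Unlock()

	stopped := 0
	for _, m := range qm.monitors {
		close(m.stopCh)
		stopped++
	}
	qm.running = false
	qm.monitors = nil // ranging over a nil map is a no-op on the next call
	return stopped
}

func main() {
	qm := newQuotaMonitor("pods", "services")
	fmt.Println("first stop:", qm.stop())
	fmt.Println("second stop:", qm.stop())
}
```

Without the last two assignments in `stop`, the second call would range over the same monitors and call `close` on channels that are already closed.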
Aditi Gupta
915866f0e4 apimachinery: Use informer.RunWithContext in various components 2025-12-02 15:02:04 -08:00
xigang
8f1ff1d8ce Refactor PV controller to use rate-limiting queues and improve error handling
Signed-off-by: xigang <wangxigang2014@gmail.com>
2025-12-01 19:11:52 +08:00
VijetaPriya47
4795edadee Fix goroutine leak in TestNodeSyncResync 2025-11-11 21:19:28 +05:30
Heba
aceb89debc
KEP-5471: Extend tolerations operators (#134665)
* Add numeric operations to tolerations

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* add default feature gate

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add integration tests

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add toleration value validation

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add validate options for new operators

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove log

Signed-off-by: helayoty <heelayot@microsoft.com>

* Update feature gate check

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove IsValidNumericString func

Signed-off-by: helayoty <heelayot@microsoft.com>

* Implement IsDecimalInteger

Signed-off-by: helayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add logs to v1/toleration

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>

* Update integration tests and address code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add feature gate to the scheduler framework

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove extra test

Signed-off-by: helayoty <heelayot@microsoft.com>

* Fix integration test

Signed-off-by: helayoty <heelayot@microsoft.com>

* pass feature gate via TolerationsTolerateTaint

Signed-off-by: helayoty <heelayot@microsoft.com>

---------

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>
2025-11-10 12:42:54 -08:00
Filip Křepinský
e6641cd290 rename scaleReplicaSetAndRecordEvent to scaleReplicaSetWithLazyAnnotationUpdate 2025-11-08 10:18:50 -05:00
Kubernetes Prow Robot
59d65dad34
Merge pull request #134945 from tchap/kcm-controllers-check-threads
pkg/controller: Improve goroutine management (part 2)
2025-11-06 00:43:01 -08:00
Kubernetes Prow Robot
50b4bcbab5
Merge pull request #134210 from yliaog/admit_quota
DRA extended resource quota
2025-11-06 00:42:53 -08:00
Kubernetes Prow Robot
6723beac00
Merge pull request #135154 from kubernetes/revert-134840-ahmet/mini-cleanup
Revert "controller: duplicate utility method cleanup"
2025-11-05 22:49:04 -08:00
Kubernetes Prow Robot
ca03752ee7
Merge pull request #135104 from mimowo/mutable-job-directives
Allow mutable job scheduling directives on suspended Jobs
2025-11-05 21:57:11 -08:00
Kubernetes Prow Robot
f025bcace9
Merge pull request #135068 from pohly/dra-device-taints-1.35-full
DRA device taint eviction: several improvements
2025-11-05 18:52:58 -08:00
yliao
870062df4f adjusts DRA extended resource quota to include device usage from regular resource claims 2025-11-05 23:24:24 +00:00
Maciej Szulik
499bff4ca4
Revert "controller: duplicate utility method cleanup" 2025-11-05 21:06:09 +01:00
Michał Woźniak
5a7c90fb76 Allow mutable scheduling directives for suspended Jobs 2025-11-05 19:37:33 +00:00
Patrick Ohly
60744fc8b9 DRA device taint eviction: track evicting rules
This avoids having to call the rule lister (which theoretically, but not in
practice, can fail) and having to iterate over rules which can be ignored
(might be a small performance boost).
2025-11-05 20:03:17 +01:00
Patrick Ohly
9527987293 DRA device taint eviction: use NOP queue during simulation
It's slightly more efficient and a bit cleaner.
2025-11-05 20:03:17 +01:00