Commit graph

7084 commits

Author SHA1 Message Date
Kubernetes Prow Robot
7d0b8f979c
Merge pull request #135629 from jsafrane/selinux-fix-completed-pods
selinux: Fix the controller to ignore finished pods
2025-12-19 11:52:33 -08:00
Jan Safranek
80d0b0f8cc Add unit test with CSIDriver.SELinuxMount=false
Add unit test with a volume plugin that does not support SELinux. That
simulates a CSI driver whose spec.SELinuxMount is empty or false.

This requires a little refactoring: each unit test now has a flag that says
whether it runs with a volume plugin that supports SELinux.
2025-12-19 15:01:01 +01:00
Kubernetes Prow Robot
31fb6f64ef
Merge pull request #135821 from pohly/dra-device-taints-owner
DRA device taints controller: add pohly to OWNERS
2025-12-18 19:24:38 -08:00
Kubernetes Prow Robot
b9d491f56e
Merge pull request #134556 from carlory/fix-133160
lock the feature-gate VolumeAttributesClass to default (true)
2025-12-18 15:13:17 -08:00
Kubernetes Prow Robot
c34c5a5426
Merge pull request #135608 from pohly/dra-device-taints-unit-tests-helpers
DRA device taints: fix and simplify unit tests
2025-12-18 03:21:23 -08:00
Patrick Ohly
9194bfe75b DRA device taints controller: add pohly to OWNERS
While the code is nominally owned by SIG Scheduling, in practice I am the one
who knows it best, so I should be a reviewer and should be able to merge simple
changes without additional approvals (will use cautiously!).
2025-12-18 12:07:52 +01:00
carlory
f8e8e55f1d
lock the feature-gate VolumeAttributesClass to default (true) and switch storage version from v1beta1 to v1
Signed-off-by: carlory <baofa.fan@daocloud.io>
2025-12-18 15:59:33 +08:00
Kubernetes Prow Robot
01697eb712
Merge pull request #135647 from grandeit/fix-gc-data-race-135621
Fix data race in garbage collector on node.owners field
2025-12-17 23:28:58 -08:00
Kubernetes Prow Robot
43cfcac7cc
Merge pull request #135434 from yliaog/quota_abuse
Fixes the loophole that allows users to work around the resource quota set by the system admin
2025-12-17 22:35:28 -08:00
Filip Křepinský
7aa186fa0a
schedule pod availability checks at the correct time in StatefulSets (#135428)
* wire now (time) to the availability checks in the StatefulSet controller

- this helps to make the controller reconciliation consistent

* schedule pod availability checks at the correct time in StatefulSets

* replace "k8s.io/klog/v2/ktesting" with "k8s.io/kubernetes/test/utils/ktesting"

for advanced features (e.g. Eventually)

* add StatefulSetAvailabilityCheck test
2025-12-17 22:35:21 -08:00
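The idea of wiring a single `now` into the availability checks (7aa186fa0a above) can be sketched roughly as follows. This is a minimal illustration, not the StatefulSet controller's code: `podStatus`, `timeUntilAvailable`, `nextCheck`, and `demo` are hypothetical names, and the real controller logic is more involved. Passing `now` as a parameter means every pod in one reconciliation is judged against the same instant, and the smallest remaining wait tells the caller when to check again.

```go
package main

import (
	"fmt"
	"time"
)

// podStatus is a hypothetical, trimmed-down view of a pod.
type podStatus struct {
	Name       string
	ReadySince time.Time // zero value: the pod is not ready
}

// timeUntilAvailable reports how long the pod must stay ready before it
// counts as available. The second result is false if the pod is not ready.
func timeUntilAvailable(p podStatus, minReady time.Duration, now time.Time) (time.Duration, bool) {
	if p.ReadySince.IsZero() {
		return 0, false
	}
	remaining := p.ReadySince.Add(minReady).Sub(now)
	if remaining < 0 {
		remaining = 0
	}
	return remaining, true
}

// nextCheck returns the shortest positive wait across all ready pods,
// or 0 if no follow-up check is needed.
func nextCheck(pods []podStatus, minReady time.Duration, now time.Time) time.Duration {
	var next time.Duration
	for _, p := range pods {
		remaining, ready := timeUntilAvailable(p, minReady, now)
		if !ready || remaining == 0 {
			continue
		}
		if next == 0 || remaining < next {
			next = remaining
		}
	}
	return next
}

// demo evaluates three pods against one fixed "now".
func demo() time.Duration {
	now := time.Date(2025, 12, 17, 12, 0, 0, 0, time.UTC)
	pods := []podStatus{
		{Name: "web-0", ReadySince: now.Add(-10 * time.Second)}, // already available
		{Name: "web-1", ReadySince: now.Add(-6 * time.Second)},  // needs 4 more seconds
		{Name: "web-2"}, // not ready
	}
	return nextCheck(pods, 10*time.Second, now)
}

func main() {
	fmt.Println("requeue after:", demo())
}
```

Because `now` is an argument rather than a call to `time.Now()` inside the helper, a test can pin the clock and get deterministic results.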
Kubernetes Prow Robot
7795655410
Merge pull request #135402 from xigang/pv_controller
PV controller: Add rate-limiting queues and improve error handling
2025-12-17 21:43:02 -08:00
yliao
3e34de29c4 fixed the loophole that allows users to get around the resource quota set by the system admin 2025-12-18 00:56:20 +00:00
Manuel Grandeit
b18339638a Fix data race in garbage collector on node.owners field
Add ownersLock to protect concurrent access to node.owners between
GraphBuilder.processGraphChanges() (writer) and GC worker goroutines
reading in blockingDependents() and unblockOwnerReferences() methods.

Also fix concurrent reads in the HTTP debug handler (/graph endpoint)
for owners, dependents, beingDeleted, deletingDependents, and virtual
fields by using their respective thread-safe accessor methods.
2025-12-16 22:38:59 +01:00
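The locking pattern described in b18339638a above can be sketched as follows. This is a simplified illustration under stated assumptions, not the garbage collector's code: the `node` and `ownerRef` types are hypothetical stand-ins, and only the name `ownersLock` comes from the commit message. A single writer replaces `owners` under the write lock while readers take the read lock and work on a copy.

```go
package main

import (
	"fmt"
	"sync"
)

// ownerRef is a hypothetical stand-in for an owner reference.
type ownerRef struct {
	Name               string
	BlockOwnerDeletion bool
}

// node is a simplified graph node; only the locking pattern matters here.
type node struct {
	ownersLock sync.RWMutex
	owners     []ownerRef
}

// setOwners is the writer side: it replaces the slice under the write lock.
func (n *node) setOwners(owners []ownerRef) {
	n.ownersLock.Lock()
	defer n.ownersLock.Unlock()
	n.owners = owners
}

// getOwners is the thread-safe accessor: it returns a copy so callers
// never iterate over a slice that the writer may replace concurrently.
func (n *node) getOwners() []ownerRef {
	n.ownersLock.RLock()
	defer n.ownersLock.RUnlock()
	out := make([]ownerRef, len(n.owners))
	copy(out, n.owners)
	return out
}

// blockingOwners counts owners that block deletion, reading via the accessor.
func (n *node) blockingOwners() int {
	count := 0
	for _, o := range n.getOwners() {
		if o.BlockOwnerDeletion {
			count++
		}
	}
	return count
}

func main() {
	n := &node{}
	var wg sync.WaitGroup

	// One writer goroutine.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 100; i++ {
			n.setOwners([]ownerRef{{Name: fmt.Sprintf("owner-%d", i), BlockOwnerDeletion: true}})
		}
	}()

	// Several reader goroutines running concurrently with the writer.
	for r := 0; r < 4; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 100; i++ {
				_ = n.blockingOwners()
			}
		}()
	}

	wg.Wait()
	fmt.Println("blocking owners:", n.blockingOwners())
}
```

Running a sketch like this with `go run -race` is one way to confirm that the accessor removes the race; reading `n.owners` directly from the reader goroutines would be flagged.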
Jan Safranek
e701a37a1e Use only enqueuePod to add pods to the controller queue
enqueuePod already creates the right key for a pod, it's better to reuse it
than copy the code around.
2025-12-12 11:19:13 +01:00
Jan Safranek
cfa65ceed2 Fix policy of Pods with unknown SELinux label
Reset SELinuxChangePolicy of Pods that have no SELinux label set to
Recursive. Kubelet cannot mount with `-o context=<label>`, if the label is
not known.

This fixes the e2e test error revealed by the previous commit, which changed
the e2e test to check for events when no events are expected and found a
warning about a Pod with no label but with the MountOption policy.
2025-12-12 11:17:54 +01:00
Jan Safranek
cbcf845810 Add new unit tests 2025-12-12 11:17:54 +01:00
Jan Safranek
7609325a9a Rework unit tests to builder pattern 2025-12-12 11:17:54 +01:00
Jan Safranek
fa1847ac40 selinux: Do not report conflicts with finished pods
When a Pod reaches its final state (Succeeded or Failed), its volumes are
getting unmounted and therefore their SELinux mount option will not
conflict with any other pod.

Let the SELinux controller monitor "pod updated" events to see when the pod
is finished.
2025-12-12 11:17:51 +01:00
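The rule stated in fa1847ac40 above (a pod in phase Succeeded or Failed no longer conflicts with anything) can be illustrated with a small sketch. The `podInfo` type and the `isFinished` and `conflicts` helpers are hypothetical and much simpler than the real controller; the phase names are the standard Kubernetes ones.

```go
package main

import "fmt"

// podPhase mirrors the pod phase names used by Kubernetes.
type podPhase string

const (
	phasePending   podPhase = "Pending"
	phaseRunning   podPhase = "Running"
	phaseSucceeded podPhase = "Succeeded"
	phaseFailed    podPhase = "Failed"
)

// podInfo is a hypothetical record of what a controller might track per pod.
type podInfo struct {
	Name         string
	Phase        podPhase
	Volume       string
	SELinuxLabel string
}

// isFinished reports whether the pod reached a final state.
func isFinished(p podInfo) bool {
	return p.Phase == phaseSucceeded || p.Phase == phaseFailed
}

// conflicts reports whether two pods use the same volume with different
// SELinux labels. Finished pods are ignored because their volumes are
// being unmounted.
func conflicts(a, b podInfo) bool {
	if isFinished(a) || isFinished(b) {
		return false
	}
	return a.Volume == b.Volume && a.SELinuxLabel != b.SELinuxLabel
}

func main() {
	running := podInfo{Name: "a", Phase: phaseRunning, Volume: "vol-1", SELinuxLabel: "s0:c1,c2"}
	other := podInfo{Name: "b", Phase: phaseRunning, Volume: "vol-1", SELinuxLabel: "s0:c3,c4"}
	done := podInfo{Name: "c", Phase: phaseSucceeded, Volume: "vol-1", SELinuxLabel: "s0:c5,c6"}

	fmt.Println("running vs running:", conflicts(running, other))
	fmt.Println("running vs finished:", conflicts(running, done))
}
```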
Jan Safranek
6666bd52b8 refactoring: use a common function to enqueue Pod
addPod and deletePod have the same implementation, merge them into
enqueuePod
2025-12-08 12:36:56 +01:00
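The refactoring in 6666bd52b8 above, where the add and delete handlers share one `enqueuePod`, might look roughly like this. It is a sketch only: the `pod` and `controller` types, the slice used as a queue, and the `podKey` helper are hypothetical stand-ins for the real informer handlers and work queue.

```go
package main

import "fmt"

// pod is a hypothetical, minimal pod identity.
type pod struct {
	Namespace string
	Name      string
}

// controller uses a plain slice as a stand-in for a real work queue.
type controller struct {
	queue []string
}

// podKey builds the queue key in one place so that every caller agrees on it.
func podKey(p pod) string {
	return p.Namespace + "/" + p.Name
}

// enqueuePod is the single entry point for putting a pod on the queue.
func (c *controller) enqueuePod(p pod) {
	c.queue = append(c.queue, podKey(p))
}

// onAdd and onDelete previously duplicated the key logic; now both delegate.
func (c *controller) onAdd(p pod)    { c.enqueuePod(p) }
func (c *controller) onDelete(p pod) { c.enqueuePod(p) }

func main() {
	c := &controller{}
	c.onAdd(pod{Namespace: "default", Name: "web-0"})
	c.onDelete(pod{Namespace: "default", Name: "web-0"})
	fmt.Println(c.queue)
}
```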
Patrick Ohly
b2151b1f51 DRA device taints: fix and simplify unit tests
Using `t` instead of `tCtx` is subtly wrong: the failure is attributed to the
parent test, not the sub-test. Using a separate function with tCtx as
parameter ensures that t is not in scope of the code and thus this mistake
cannot happen. The number of lines is the same, it's just a bit more code.

For TestRetry another advantage is the reduced indentation.

It's worth calling out that the same cannot be done for benchmarks:
- They need methods (Loop) or fields (N) which are not exposed by TContext.
- The `for b.Loop()` pattern only works if the for loop is written exactly
  like that.
2025-12-05 19:13:55 +01:00
xigang
8f1ff1d8ce Refactor PV controller to use rate-limiting queues and improve error handling
Signed-off-by: xigang <wangxigang2014@gmail.com>
2025-12-01 19:11:52 +08:00
VijetaPriya47
4795edadee Fix goroutine leak in TestNodeSyncResync 2025-11-11 21:19:28 +05:30
Heba
aceb89debc
KEP-5471: Extend tolerations operators (#134665)
* Add numeric operations to tolerations

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* add default feature gate

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add integration tests

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add toleration value validation

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add validate options for new operators

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove log

Signed-off-by: helayoty <heelayot@microsoft.com>

* Update feature gate check

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove IsValidNumericString func

Signed-off-by: helayoty <heelayot@microsoft.com>

* Implement IsDecimalInteger

Signed-off-by: helayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add logs to v1/toleration

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>

* Update integration tests and address code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add feature gate to the scheduler framework

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove extra test

Signed-off-by: helayoty <heelayot@microsoft.com>

* Fix integration test

Signed-off-by: helayoty <heelayot@microsoft.com>

* pass feature gate via TolerationsTolerateTaint

Signed-off-by: helayoty <heelayot@microsoft.com>

---------

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>
2025-11-10 12:42:54 -08:00
Kubernetes Prow Robot
59d65dad34
Merge pull request #134945 from tchap/kcm-controllers-check-threads
pkg/controller: Improve goroutine management (part 2)
2025-11-06 00:43:01 -08:00
Kubernetes Prow Robot
50b4bcbab5
Merge pull request #134210 from yliaog/admit_quota
DRA extended resource quota
2025-11-06 00:42:53 -08:00
Kubernetes Prow Robot
6723beac00
Merge pull request #135154 from kubernetes/revert-134840-ahmet/mini-cleanup
Revert "controller: duplicate utility method cleanup"
2025-11-05 22:49:04 -08:00
Kubernetes Prow Robot
ca03752ee7
Merge pull request #135104 from mimowo/mutable-job-directives
Allow mutable job scheduling directives on suspended Jobs
2025-11-05 21:57:11 -08:00
Kubernetes Prow Robot
f025bcace9
Merge pull request #135068 from pohly/dra-device-taints-1.35-full
DRA device taint eviction: several improvements
2025-11-05 18:52:58 -08:00
yliao
870062df4f adjusts DRA extended resource quota to include device usage from regular resource claims 2025-11-05 23:24:24 +00:00
Maciej Szulik
499bff4ca4
Revert "controller: duplicate utility method cleanup" 2025-11-05 21:06:09 +01:00
Michał Woźniak
5a7c90fb76 Allow mutable scheduling directives for suspended Jobs 2025-11-05 19:37:33 +00:00
Patrick Ohly
60744fc8b9 DRA device taint eviction: track evicting rules
This avoids having to call the rule lister (which could fail theoretically,
but not in practice) and having to iterate over rules which can be ignored
(might be a small performance boost).
2025-11-05 20:03:17 +01:00
Patrick Ohly
9527987293 DRA device taint eviction: use NOP queue during simulation
It's slightly more efficient and a bit cleaner.
2025-11-05 20:03:17 +01:00
Patrick Ohly
eaee6b6bce DRA device taints: add separate feature gate for rules
Support for DeviceTaintRules depends on a significant amount of
additional code:
- ResourceSlice tracker is a NOP without it.
- Additional informers and corresponding permissions in scheduler and controller.
- Controller code for handling status.

Not all users necessarily need DeviceTaintRules, so adding a second feature
gate for that code makes it possible to limit the blast radius of bugs in that
code without having to turn off device taints and tolerations entirely.
2025-11-05 20:03:17 +01:00
Kubernetes Prow Robot
9ef1a14d68
Merge pull request #134840 from ahmetb/ahmet/mini-cleanup
controller: duplicate utility method cleanup
2025-11-05 08:06:58 -08:00
Kubernetes Prow Robot
9a192aa1c3
Merge pull request #134432 from Karthik-K-N/fix-sv-test
Fix storage version test flake
2025-11-05 06:56:52 -08:00
Ayato Tokubi
320987ead3 Addressed comments 2025-11-05 10:44:50 +00:00
Ayato Tokubi
5102591a6b Refactor resource claim metrics to use structured labels and add "source" dimension.
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-11-05 09:52:47 +00:00
Kubernetes Prow Robot
c1a6a3ca71
Merge pull request #134152 from pohly/dra-device-taints-1.35
DRA: device taints: new ResourceSlice API, new features
2025-11-04 15:32:07 -08:00
Ondra Kupka
024382658b controller/volume/vacprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
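The goal stated in this series of commits ("all threads are terminated when Run returns") can be sketched with a `sync.WaitGroup`. This is an illustration under stated assumptions rather than the code of any of these controllers: the `controller` type, its `running` counter, and `runBriefly` are hypothetical and exist only to make the behavior observable.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// controller is a hypothetical controller; running counts live workers.
type controller struct {
	running atomic.Int32
}

// worker loops until the context is cancelled.
func (c *controller) worker(ctx context.Context) {
	c.running.Add(1)
	defer c.running.Add(-1)

	ticker := time.NewTicker(time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// A real worker would process one queue item here.
		}
	}
}

// Run starts the workers and returns only after every one of them exited.
func (c *controller) Run(ctx context.Context, workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.worker(ctx)
		}()
	}

	<-ctx.Done() // wait for the shutdown signal
	wg.Wait()    // then wait for all goroutines to finish
}

// runBriefly runs the controller for a short time and reports how many
// workers are still alive after Run returned.
func runBriefly(workers int) int32 {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()

	c := &controller{}
	c.Run(ctx, workers)
	return c.running.Load()
}

func main() {
	fmt.Println("workers alive after Run:", runBriefly(3))
}
```

Without the `wg.Wait()` call, `Run` could return while workers are still executing, which is the kind of leak a goroutine-leak check in a test would report.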
Ondra Kupka
e08d03b1b5 controller/volume/selinuxwarning: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
1e6ad423bf controller/volume/pvprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
0caae6f704 controller/volume/pvcprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
ed74779a0f controller/volume/persistentvolume: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
8eab454e38 controller/volume/expand: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
27774052ab controller/volume/ephemeral: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
12205df76d controller/volume/attachdetach: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
9d4ff6ecf2 controller/tainteviction: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
d2a443db75 controller/serviceaccount: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
c641df792b controller/resourcequota: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00