Commit graph

186 commits

Author SHA1 Message Date
Antoni Zawodny
833b7205fc Run PreBind plugins in parallel if feasible 2026-01-11 14:19:18 +01:00
Patrick Ohly
dfa6aa22b2 DRA scheduler: fix unit test flakes
Test_isSchedulableAfterClaimChange was sensitive to system load because of the
arbitrary delay when waiting for the assume cache to catch up. Running inside
a synctest bubble avoids this. While at it, the unit tests get converted
to ktesting (nicer failure output, no extra indentation needed for
tCtx.SyncTest).

TestPlugin/prebind-fail-with-binding-timeout relied on setting up a claim with
certain time stamps and then getting that test case tested within a certain
real-world time window. It's surprising that this didn't flake more often
because test execution order is random. Now the time stamp gets set right
before the test case is about to be tested. Conversion to a synctest would
be nicer, but synctests cannot have sub-tests, which are used here to track
where log output and failures come from within the larger test case.

Inside the plugin itself some log output gets added to explain why a claim is
unavailable on a node in case of a binding timeout or error during Filter.
2025-12-30 11:45:02 +01:00
Patrick Ohly
5d536bfb8e DRA: log more information
For debugging double allocation of the same
device (https://github.com/kubernetes/kubernetes/issues/133602) it is necessary
to have information about pools, devices and in-flight claims. Log calls get
extended and the config for DRA CI jobs updated to enable higher verbosity for
relevant source files.

Log output in such a cluster at verbosity 6 looks like this:

I1215 10:28:54.166872       1 allocator_incubating.go:130] "Gathered pool information" logger="FilterWithNominatedPods.Filter.DynamicResources" pod="dra-8841/tester-3" node="kind-worker2" pools={"count":1,"devices":["dra-8841.k8s.io/kind-worker2/device-00"],"meta":[{"InvalidReason":"","id":"dra-8841.k8s.io/kind-worker2","isIncomplete":false,"isInvalid":false}]}
I1215 10:28:54.166941       1 allocator_incubating.go:254] "Gathered information about devices" logger="FilterWithNominatedPods.Filter.DynamicResources" pod="dra-8841/tester-3" node="kind-worker2" allocatedDevices={"count":2,"devices":["dra-8841.k8s.io/kind-worker/device-00","dra-8841.k8s.io/kind-worker3/device-00"]} minDevicesToBeAllocated=1
2025-12-16 09:58:05 +01:00
bwsalmon
854e67bb51
KEP 5598: Opportunistic Batching (#135231)
* First version of batching w/out signatures.

* First version of pod signatures.

* Integrate batching with signatures.

* Fix merge conflicts.

* Fixes from self-review.

* Test fixes.

* Fix a bug that limited batches to size 2
Also add some new high-level logging and
simplify the pod affinity signature.

* Re-enable batching on perf tests for now.

* fwk.NewStatus(fwk.Success)

* Review feedback.

* Review feedback.

* Comment fix.

* Two plugin-specific unit tests.

* Add cycle state to the sign call, apply to topo spread.
Also add unit tests for several plugin signature
calls.

* Review feedback.

* Switch to distinct stats for hint and store calls.

* Switch signature from string to []byte

* Revert cyclestate in signs. Update node affinity.
Node affinity now sorts all of the various
nested arrays in the structure. CycleState no
longer in signature; revert to signing fewer
cases for pod spread.

* hack/update-vendor.sh

* Disable signatures when extenders are configured.

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update staging/src/k8s.io/kube-scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Disable node resource signatures when extended DRA enabled.

* Review feedback.

* Update pkg/scheduler/framework/plugins/imagelocality/image_locality.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/plugins/nodedeclaredfeatures/nodedeclaredfeatures.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Fixes for review suggestions.

* Add integration tests.

* Linter fixes, test fix.

* Whitespace fix.

* Remove broken test.

* Unschedulable test.

* Remove go.mod changes.

---------

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>
2025-11-12 21:51:37 -08:00
Kubernetes Prow Robot
0cfbf89e70
Merge pull request #134189 from mortent/NewUpdatePartitionableDevices
Updates to DRA Partitionable Devices feature
2025-11-06 16:10:53 -08:00
Kubernetes Prow Robot
6232175b94
Merge pull request #134935 from alaypatel07/refactor-dra-extended-resources
refactor dra extended resources implementation in scheduler plugin
2025-11-06 15:18:59 -08:00
Morten Torkildsen
38b5750e33 DRA: Update allocator for Partitionable Devices 2025-11-06 21:30:01 +00:00
Alay Patel
f8ccc4c4d7 dra scheduler plugin: refactor extendeddynamicresources.go for readability
Signed-off-by: Alay Patel <alayp@nvidia.com>
2025-11-06 15:49:33 -05:00
Kubernetes Prow Robot
22962087ec
Merge pull request #135186 from pohly/dra-scheduler-unit-test-flake
DRA: fix for scheduler unit test flake + logging
2025-11-06 12:43:23 -08:00
Alay Patel
da9f1d8eed dra scheduler plugin: move extended resources functions into separate file
Signed-off-by: Alay Patel <alayp@nvidia.com>
2025-11-06 14:58:59 -05:00
Patrick Ohly
1c4cab9dda DRA scheduler unit test: fix race with ResourceSlice informer
The test started without waiting for the ResourceSlice informer to have
synced. As a result, the "CEL-runtime-error-for-one-of-three-nodes" test case
failed randomly with a very low flake rate (less than 1% in local runs) because
CEL expressions never got evaluated due to not having the slices (yet).

Other tests also were less reliable, but not known to fail.
2025-11-06 18:40:35 +01:00
Ed Bartosh
edbc32fa60 DRA: implement scoring for extended resources
Updated extended resource allocation scorer to calculate
allocatable and requested values for DRA-backed resources.
2025-11-06 10:40:52 +02:00
Kubernetes Prow Robot
7537d52c2e
Merge pull request #134882 from yliaog/initcon
Fix non-sidecar init container device requests
2025-11-05 21:57:04 -08:00
Kubernetes Prow Robot
f025bcace9
Merge pull request #135068 from pohly/dra-device-taints-1.35-full
DRA device taint eviction: several improvements
2025-11-05 18:52:58 -08:00
yliao
6676982316 fixed non-sidecar init container device requests and mappings 2025-11-05 22:48:50 +00:00
Kubernetes Prow Robot
cf37f0bf49
Merge pull request #135037 from yliaog/extendedresourcecache
pick one device class deterministically for extended resource
2025-11-05 14:16:58 -08:00
Patrick Ohly
eaee6b6bce DRA device taints: add separate feature gate for rules
Support for DeviceTaintRules depends on a significant amount of
additional code:
- ResourceSlice tracker is a NOP without it.
- Additional informers and corresponding permissions in scheduler and controller.
- Controller code for handling status.

Not all users necessarily need DeviceTaintRules, so adding a second feature
gate for that code makes it possible to limit the blast radius of bugs in that
code without having to turn off device taints and tolerations entirely.
2025-11-05 20:03:17 +01:00
Morten Torkildsen
fbfeb33231 DRA: Add scoring for Prioritized List feature 2025-11-05 17:18:38 +00:00
yliao
949be1d132 fixed comments due to switch from class name to class for GetDeviceClass 2025-11-05 15:08:38 +00:00
Ayato Tokubi
902c2e0c15 Fix lint errors in dynamicresources_test.go
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-11-05 10:44:50 +00:00
Ayato Tokubi
c5b1493925 Add test case for claim creation failure in DRAExtendedResources
Extend the `setup` function to support API reactors, allowing custom reactions in tests.

Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-11-05 09:55:28 +00:00
Ayato Tokubi
ea7561b243 Implement scheduler_resourceclaim_creates_total metrics for DRAExtendedResources 2025-11-05 09:53:33 +00:00
yliao
c67937dd35 switched from storing name to storing a pointer to the device class. 2025-11-04 17:51:12 +00:00
fj-naji
c438f8a983 scheduler: Add BindingTimeout args to DynamicResources plugin
Add a new `bindingTimeout` field to DynamicResources plugin args and wire it
into PreBind.

Changes:
- API: add `bindingTimeout` to DynamicResourcesArgs (staging + internal types).
- Defaults: default to 600 seconds when BOTH DRADeviceBindingConditions and
  DRAResourceClaimDeviceStatus are enabled.
- Validation: require >= 1s; forbid when either feature gate is disabled.
- Plugin: plumb args into `pl.bindingTimeout` and use it in
  `wait.PollUntilContextTimeout` for binding-condition wait logic.
- Plugin: remove legacy `BindingTimeoutDefaultSeconds`.

Tests:
- Add/adjust unit tests for validation and PreBind timeout path.
- Ensure <1s and negative values are rejected and that the field is forbidden when the gates are disabled.
2025-11-04 17:15:19 +00:00
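The defaulting and validation rules listed in c438f8a983 can be sketched as follows. This is a minimal illustration, not the scheduler's actual code: the `featureGates` struct, the `seconds` helper, and both function names are hypothetical, the real field lives in `DynamicResourcesArgs`, and treating an unset value as valid is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// featureGates is a hypothetical stand-in for the two gates named in the commit.
type featureGates struct {
	DRADeviceBindingConditions   bool
	DRAResourceClaimDeviceStatus bool
}

// bindingEnabled reports whether both gates are on.
func (g featureGates) bindingEnabled() bool {
	return g.DRADeviceBindingConditions && g.DRAResourceClaimDeviceStatus
}

// seconds is a small helper returning a pointer to a duration of n seconds.
func seconds(n float64) *time.Duration {
	d := time.Duration(n * float64(time.Second))
	return &d
}

// defaultBindingTimeout returns 600s when both gates are enabled, else nil.
func defaultBindingTimeout(g featureGates) *time.Duration {
	if !g.bindingEnabled() {
		return nil
	}
	return seconds(600)
}

// validateBindingTimeout forbids the field when either gate is off and
// requires at least one second otherwise.
func validateBindingTimeout(timeout *time.Duration, g featureGates) error {
	if timeout == nil {
		return nil // assumption: an unset field is always acceptable
	}
	if !g.bindingEnabled() {
		return errors.New("bindingTimeout is forbidden unless both feature gates are enabled")
	}
	if *timeout < time.Second {
		return fmt.Errorf("bindingTimeout must be at least 1s, got %s", *timeout)
	}
	return nil
}

func main() {
	on := featureGates{DRADeviceBindingConditions: true, DRAResourceClaimDeviceStatus: true}
	off := featureGates{DRADeviceBindingConditions: true}
	fmt.Println(validateBindingTimeout(defaultBindingTimeout(on), on)) // default passes validation
	fmt.Println(validateBindingTimeout(seconds(0.5), on))              // rejected: below 1s
	fmt.Println(validateBindingTimeout(seconds(30), off))              // rejected: a gate is disabled
}
```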
yliao
b83a6a83f0 pick the most recently created device class, or the one whose name sorts alphabetically earlier 2025-11-03 19:28:18 +00:00
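One way to read the selection rule in b83a6a83f0 is as a deterministic sort: newest creation time first, with the alphabetically earlier name as the tie-breaker. The sketch below works under that assumption; the `deviceClass` type, its `CreatedUnix` field, and `pickDeviceClass` are hypothetical simplifications and not the scheduler's types.

```go
package main

import (
	"fmt"
	"sort"
)

// deviceClass is a simplified, hypothetical stand-in for a DeviceClass object.
type deviceClass struct {
	Name        string
	CreatedUnix int64 // creation time as Unix seconds
}

// pickDeviceClass returns the most recently created class. When creation
// times are equal, the alphabetically earlier name wins (assumed tie-break).
func pickDeviceClass(classes []deviceClass) (deviceClass, bool) {
	if len(classes) == 0 {
		return deviceClass{}, false
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]deviceClass(nil), classes...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].CreatedUnix != sorted[j].CreatedUnix {
			return sorted[i].CreatedUnix > sorted[j].CreatedUnix
		}
		return sorted[i].Name < sorted[j].Name
	})
	return sorted[0], true
}

func main() {
	classes := []deviceClass{
		{Name: "gpu-b", CreatedUnix: 200},
		{Name: "gpu-a", CreatedUnix: 200},
		{Name: "gpu-c", CreatedUnix: 100},
	}
	picked, _ := pickDeviceClass(classes)
	fmt.Println(picked.Name)
}
```

Because the comparison is a total order over (creation time, name) and names are unique, the result does not depend on the order in which the classes are listed.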
yliao
3eab698884 fixed unit test and integration test failures
Fix minor nits

Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>
2025-11-03 20:07:01 +05:30
Patrick Ohly
12a0c8ce17 DRA extended resource: chain event handlers
The cache and scheduler event handlers cannot be registered separately in the
informer because that leads to a race (the scheduler might schedule based on an
event before the cache is updated). Chaining event handlers (cache first, then
scheduler) avoids this.

This also ensures that the cache is up-to-date before the scheduler
starts (HasSynced of the handler registration for the cache is checked).

Other changes:
- renamed package to avoid clash with other "cache" packages
- clarified nil handling
- feature gate check before instantiating the cache
- per-test logging
- utilruntime.HandleErrorWithLogger
- simpler cache.DeletedFinalStateUnknown

Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>
2025-11-03 12:31:17 +05:30
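The chaining idea from 12a0c8ce17 can be illustrated without any informer machinery. This is only a sketch of the ordering guarantee: the `event` and `handler` types and the `chain` and `run` functions are hypothetical and do not use the client-go API that the real change builds on.

```go
package main

import "fmt"

// event is a hypothetical notification mapping an extended resource to a device class.
type event struct {
	resource    string
	deviceClass string
}

// handler processes one event.
type handler func(e event)

// chain returns a single handler that calls the given handlers in order,
// so later handlers always observe the effects of earlier ones.
func chain(handlers ...handler) handler {
	return func(e event) {
		for _, h := range handlers {
			h(e)
		}
	}
}

// run delivers one event through a cache-then-scheduler chain and returns
// what the scheduler handler found in the cache.
func run() []string {
	cache := map[string]string{}
	var seen []string

	updateCache := func(e event) { cache[e.resource] = e.deviceClass }
	schedule := func(e event) { seen = append(seen, cache[e.resource]) }

	// Registering one chained handler instead of two independent ones
	// removes the window in which the scheduler sees a stale cache.
	deliver := chain(updateCache, schedule)
	deliver(event{resource: "example.com/gpu", deviceClass: "gpu-class"})
	return seen
}

func main() {
	fmt.Println(run())
}
```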
Sai Ramesh Vanka
d8c66ffb63 Add a global cache to support DRA's extended resource to the device
class mapping

- Add a new interface "DeviceClassResolver" in the scheduler framework
- Add a global cache of mapping between the extended resource and the
  device class
- The cache can be used by the k8s api-server and controller-manager as well as the scheduler
- This change helps delegate requests to the dynamicresource
  plugin based on the mapping during node update events, thus
  avoiding an extra scheduling cycle

Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>
2025-11-03 12:31:16 +05:30
Kubernetes Prow Robot
808d320de1
Merge pull request #134956 from yliaog/blockowner
removed BlockOwnerDeletion
2025-10-30 01:26:11 -07:00
yliao
4f647b3f3d removed BlockOwnerDeletion 2025-10-29 22:41:10 +00:00
Ed Bartosh
1cb45e2a27 DRA: fix scheduling of pods with extended resources
Previously, the scheduler assumed an extended resource was maintained
by a device plugin if its name was present in the node's Allocatable
map, even if its value was zero. This blocked scheduling when a device
plugin was disconnected or uninstalled, because Kubelet still reported
the resource with Allocatable=0.

This change adds a check for the actual allocatable value in addition
to a key presence check, allowing nodes with uninstalled device
plugins to be considered for scheduling.
2025-10-27 16:24:29 +02:00
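The difference between the old and new checks in 1cb45e2a27 can be shown with a simplified map. The `allocatable` type and both function names below are hypothetical; real nodes report quantities as `resource.Quantity` values rather than plain integers.

```go
package main

import "fmt"

// allocatable is a simplified stand-in for a node's Allocatable map.
type allocatable map[string]int64

// presentOnly mirrors the earlier behavior: key presence alone counts.
func presentOnly(a allocatable, name string) bool {
	_, ok := a[name]
	return ok
}

// backedByDevicePlugin mirrors the fixed behavior: the key must be present
// and its value must be greater than zero.
func backedByDevicePlugin(a allocatable, name string) bool {
	qty, ok := a[name]
	return ok && qty > 0
}

func main() {
	// The device plugin was uninstalled, but the resource is still reported with 0.
	node := allocatable{"example.com/gpu": 0}
	fmt.Println(presentOnly(node, "example.com/gpu"))          // old check still attributes it to a device plugin
	fmt.Println(backedByDevicePlugin(node, "example.com/gpu")) // new check does not
}
```

With the stricter check, a node that still lists the resource at zero is no longer treated as device-plugin managed, so it can be considered for scheduling.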
Kubernetes Prow Robot
4a1558c545
Merge pull request #133967 from pohly/dra-allocator-selection
DRA: allocator selection
2025-09-30 08:24:18 -07:00
Patrick Ohly
60eeaa6ebd DRA scheduler: add unit test for allocator selection
This prevents the mistake from 1.34 where the default-on
DRAResourceClaimDeviceStatus feature caused the use of the experimental
allocator implementation. The test fails without a fix for that.
2025-09-30 16:53:38 +02:00
Patrick Ohly
7f57730ba4 DRA scheduler: fix selection of "incubating" allocator implementation
In 1.34, the default feature gate selection picked the "experimental" allocator
implementation when it should have used the "incubating" allocator. No harm
came from that because the experimental allocator has all the necessary if
checks to disable the extra code and no bugs were introduced when implementing
it, but it means that our safety net wasn't there when we expected it to be.

The reason is that the "DRAResourceClaimDeviceStatus" feature gate is on by
default and was only listed as supported by the experimental implementation.
This could be fixed by listing it as supported also by the other
implementation, but that would be a bit odd because there is nothing to support
for it (the reason why this was missed in 1.34!). Instead, the allocator
features are now only indirectly related to feature gates, with a single
boolean controlling the implementation of binding conditions.
2025-09-30 16:53:38 +02:00
Patrick Ohly
b5bcac998d DRA scheduler: clean up feature gate handling
Copying from feature.Features to new fields in the plugin got a bit silly with
the long list of features that we have now. Embedding feature.Features is
simpler.

Two fields in feature.Features weren't named according to the feature gate, now
they are named consistently and the fields are sorted.
2025-09-30 16:53:38 +02:00
hojinchoi
7028ba09db fix: duplicated 'the' in comment 2025-09-18 18:11:44 +09:00
yliao
74cf1db218 sort the device requests in the extended resource claim spec.
removed the sortClaim in the unit test.
2025-09-11 16:55:58 +00:00
yliao
79f8d1b1c5 fixed bug so that the implicit extended resource name can always be used,
regardless of whether the explicit extendedResourceName field in the device class is set.
2025-09-10 14:10:40 +00:00
Ania Borowiec
fadb40199f
Move interfaces: Handle and Plugin and related types from kubernetes/kubernetes to staging repo kube-scheduler 2025-09-02 09:42:53 +00:00
yliao
bf13cd1b81 added resourceClaimModified to bindClaim to decide whether to update assume cache 2025-08-29 16:12:55 +00:00
Abu Kashem
747a295cac
fix flake in dra test 'TestPlugin'
TestPlugin/multi-claims-binding-conditions-all-success/PreEnqueue
flakes because the assume cache has not synced with the initial
store. The test now waits until the registered handler used by the
assume cache has synced before proceeding.
2025-08-18 15:54:03 -04:00
Abu Kashem
c8ab780edb
dra plugin: assume claim after api call in bindClaim 2025-08-13 16:35:35 -04:00
yliao
2a026f6d65 1/ added retries to AssumeClaimAfterAPICall for objects that are not present in the cache (dynamicresources.go)
2/ modified the assume cache verification to not error out as long as
the expected claim is in the cache, regardless of whether its latest and API
objects differ (dynamicresources_test.go).
3/ fixed nil panic as seen from https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/133321/pull-kubernetes-integration/1952472629470302208
2025-08-06 07:08:58 +00:00
yliao
0a12f00e9d
fix nil panic in hasBindingConditions: it cannot assume the claim has allocations 2025-07-30 14:44:41 +09:00
Sunyanan Choochotkaew
7f052afaef
KEP 5075: implement scheduler
Signed-off-by: Sunyanan Choochotkaew <sunyanan.choochotkaew1@ibm.com>
2025-07-30 09:52:49 +09:00
yliao
34a64db2c7 extended resource backed by DRA: implementation 2025-07-29 18:55:21 +00:00
Kobayashi,Daisuke
e8c3af1f5c KEP-5007 DRA Device Binding Conditions: Implement scheduler logic 2025-07-29 11:34:30 +00:00
Kubernetes Prow Robot
a11bc701e8
Merge pull request #132457 from ania-borowiec/depends_on_cluster_move_podinfo
Moving Scheduler interfaces to staging: Move PodInfo and NodeInfo interfaces (together with related types) to staging repo, leaving internal implementation in kubernetes/kubernetes/pkg/scheduler
2025-07-24 09:38:27 -07:00
Ania Borowiec
aecd37e6fb
Moving Scheduler interfaces to staging: Move PodInfo and NodeInfo interfaces (together with related types) to staging repo, leaving internal implementation in kubernetes/kubernetes/pkg/scheduler 2025-07-24 12:10:58 +00:00
Kubernetes Prow Robot
89a01ec72a
Merge pull request #133019 from pohly/dra-scheduler-plugin-owners
DRA scheduler plugin: add pohly as approver
2025-07-24 03:42:33 -07:00