Commit graph

207 commits

Patrick Ohly
85bca3b684 DRA device taints: fix beta-enabled, alpha-disabled configurations
DeviceTaintRule is off by default because the corresponding v1beta2 API group
is off. When enabled, the potentially still disabled v1alpha3 API version was
used instead of the new v1beta2, causing the scheduler to fail while setting up
informers and then not scheduling pods.
2026-03-13 09:20:57 +01:00
Tsubasa Watanabe
30b811a99b DRA Device Binding Conditions: add metrics for prebind flow
This commit introduces metrics and improves log outputs for
DRA Device Binding Conditions (KEP-5007):

- scheduler_dra_bindingconditions_allocations_total

  Counts the number of per-device scheduling attempts
  during PreBind where BindingConditions are in use

- scheduler_dra_bindingconditions_wait_duration_seconds

  Observes the time spent waiting for BindingConditions
  to be satisfied during PreBind.

Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
2026-03-12 17:19:13 +09:00
Kubernetes Prow Robot
69144c9081
Merge pull request #137371 from pohly/dra-bind-claim-panic
DRA scheduler: fix potential panic when DRABindingConditions are enabled
2026-03-11 03:03:25 +05:30
Patrick Ohly
f33176fc00 DRA scheduler: add unit tests for AllocationTimestamp
The code paths for adding AllocationTimestamp were not tested well. None of
the test cases verified that an AllocationTimestamp gets added at all because
go-cmp was instructed to ignore the unpredictable field.

We can do better than that and at least check for existence by normalizing all
non-nil time stamps to the empty time. This affects all tests where the binding
conditions and thus AllocationTimestamp support is enabled.

The retry loop for status updates was untested. The fake client has to return a
conflict status error to trigger it. This enables writing a test case where a
concurrent deallocation would have caused the nil panic without the previous
fix.

For binding conditions, one test case gets added which runs through the full
flow of allocating a claim and trying to bind it. All other test cases seem to
have started with the claim already allocated.

Altogether this increases coverage from 82.4% to 83.7%.
2026-03-10 16:25:53 +01:00
Troy Chiu
1d2165b29c Fix scheduler flaky test: wait for DeviceClass cache sync in dynamicresources tests
When DRAExtendedResource is enabled, the dynamicresources test setup
registers an event handler for DeviceClasses but was not waiting for it
to sync. This can lead to flaky tests where the cache is not fully
populated when the test starts.

This change captures the event handler registration and includes its
DoneChecker in a WaitFor call.
2026-03-09 19:24:13 +00:00
Rita Zhang
c4f88de33e
Move DRAAdminAccess feature to GA (#137373)
* Move DRAAdminAccess feature to GA

Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>

* address comments

Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>

---------

Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
2026-03-05 23:42:21 +05:30
Kubernetes Prow Robot
8bd1505fc0
Merge pull request #137108 from pohly/logtools-update
golangci-lint: bump to logtools v0.10.1
2026-03-05 10:14:16 +05:30
Kubernetes Prow Robot
8275484dcf
Merge pull request #137297 from atombrella/feature/pkg_forvar_modernize
Remove redundant variable re-assignment in for-loops under pkg
2026-03-05 00:28:20 +05:30
Patrick Ohly
b895ce734f golangci-lint: bump to logtools v0.10.1
This fixes a bug that caused log calls involving `klog.Logger` to not be
checked.

As a result we have to fix some code that is now considered faulty:

    ERROR: pkg/controller/serviceaccount/tokens_controller.go:382:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (e *TokensController) generateTokenIfNeeded(ctx context.Context, logger klog.Logger, serviceAccount *v1.ServiceAccount, cachedSecret *v1.Secret) ( /* retry */ bool, error) {
    ERROR: ^
    ERROR: pkg/controller/storageversionmigrator/storageversionmigrator.go:299:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (svmc *SVMController) runMigration(ctx context.Context, logger klog.Logger, gvr schema.GroupVersionResource, resourceMonitor *garbagecollector.Monitor, toBeProcessedSVM *svmv1beta1.StorageVersionMigration, listResourceVersion string) (err error, failed bool) {
    ERROR: ^
    ERROR: pkg/proxy/node.go:121:3: logging function "Error" should not use format specifier "%q" (logcheck)
    ERROR: 		klog.FromContext(ctx).Error(nil, "Timed out waiting for node %q to exist", nodeName)
    ERROR: 		^
    ERROR: pkg/proxy/node.go:123:3: logging function "Error" should not use format specifier "%q" (logcheck)
    ERROR: 		klog.FromContext(ctx).Error(nil, "Timed out waiting for node %q to be assigned IPs", nodeName)
    ERROR: 		^
    ERROR: pkg/scheduler/backend/queue/scheduling_queue.go:610:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (p *PriorityQueue) runPreEnqueuePlugin(ctx context.Context, logger klog.Logger, pl fwk.PreEnqueuePlugin, pInfo *framework.QueuedPodInfo, shouldRecordMetric bool) *fwk.Status {
    ERROR: ^
    ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:286:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (pl *DynamicResources) deleteClaim(ctx context.Context, claim *resourceapi.ResourceClaim, logger klog.Logger) error {
    ERROR: ^
    ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:499:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (pl *DynamicResources) waitForExtendedClaimInAssumeCache(
    ERROR: ^
    ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:528:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (pl *DynamicResources) createExtendedResourceClaimInAPI(
    ERROR: ^
    ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:592:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (pl *DynamicResources) unreserveExtendedResourceClaim(ctx context.Context, logger klog.Logger, pod *v1.Pod, state *stateData) {
    ERROR: ^
    ERROR: pkg/scheduler/framework/runtime/batch.go:171:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (b *OpportunisticBatch) batchStateCompatible(ctx context.Context, logger klog.Logger, pod *v1.Pod, signature fwk.PodSignature, cycleCount int64, state fwk.CycleState, nodeInfos fwk.NodeInfoLister) bool {
    ERROR: ^
    ERROR: staging/src/k8s.io/component-base/featuregate/feature_gate.go:890:4: Additional arguments to Info should always be Key Value pairs. Please check if there is any key or value missing. (logcheck)
    ERROR: 			logger.Info("Warning: SetEmulationVersionAndMinCompatibilityVersion will change already queried feature", "featureGate", feature, "oldValue", oldVal, newVal)
    ERROR: 			^
    ERROR: test/images/sample-device-plugin/sampledeviceplugin.go:108:2: logging function "Info" should not use format specifier "%s" (logcheck)
    ERROR: 	logger.Info("pluginSocksDir: %s", pluginSocksDir)
    ERROR: 	^
    ERROR: test/images/sample-device-plugin/sampledeviceplugin.go:123:2: logging function "Info" should not use format specifier "%s" (logcheck)
    ERROR: 	logger.Info("CDI_ENABLED: %s", cdiEnabled)
    ERROR: 	^

While waiting for this to merge, another call was added which also doesn't
follow conventions:

    ERROR: pkg/kubelet/kubelet.go:2454:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (kl *Kubelet) deletePod(ctx context.Context, logger klog.Logger, pod *v1.Pod) error {
    ERROR: ^

Contextual logging has been beta and enabled by default for several releases
now. It's mostly just a matter of wrapping up and declaring it GA. Therefore
the calls which directly call WithName or WithValues (which always have an
effect) are left as-is instead of converting them to the klog wrappers (which
support disabling the effect). To allow that, the linter gets reconfigured to
no longer complain about this anywhere.

The calls which would have to be fixed otherwise are:

    ERROR: pkg/kubelet/cm/dra/claiminfo.go:170:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger = logger.WithName("dra-claiminfo")
    ERROR: 	         ^
    ERROR: pkg/kubelet/cm/dra/healthinfo.go:45:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger = logger.WithName("dra-healthinfo")
    ERROR: 	         ^
    ERROR: pkg/kubelet/cm/dra/healthinfo.go:89:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger = logger.WithName("dra-healthinfo")
    ERROR: 	         ^
    ERROR: pkg/kubelet/cm/dra/healthinfo.go:157:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger = logger.WithName("dra-healthinfo")
    ERROR: 	         ^
    ERROR: pkg/kubelet/cm/dra/manager.go:175:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager.go:239:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager.go:593:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager.go:781:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(context.Background()).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager.go:898:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager_test.go:1638:15: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 				logger := klog.FromContext(streamCtx).WithName(st.Name())
    ERROR: 				          ^
    ERROR: pkg/kubelet/cm/dra/plugin/dra_plugin.go:77:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-plugin")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/plugin/dra_plugin.go:108:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-plugin")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/plugin/dra_plugin.go:161:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-plugin")
    ERROR: 	          ^
    ERROR: staging/src/k8s.io/dynamic-resource-allocation/resourceslice/tracker/tracker.go:695:14: function "WithValues" should be called through klogr.LoggerWithValues (logcheck)
    ERROR: 			logger := logger.WithValues("device", deviceID)
    ERROR: 			          ^
    ERROR: test/integration/apiserver/watchcache_test.go:42:54: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	etcd0URL, stopEtcd0, err := framework.RunCustomEtcd(klog.FromContext(ctx).WithName("etcd0"), "etcd_watchcache0", etcdArgs)
    ERROR: 	                                                    ^
    ERROR: test/integration/apiserver/watchcache_test.go:47:54: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	etcd1URL, stopEtcd1, err := framework.RunCustomEtcd(klog.FromContext(ctx).WithName("etcd1"), "etcd_watchcache1", etcdArgs)
    ERROR: 	                                                    ^
    ERROR: test/integration/scheduler_perf/scheduler_perf.go:1149:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 		logger = logger.WithName(tCtx.Name())
    ERROR: 		         ^
2026-03-04 12:08:18 +01:00
Patrick Ohly
dd6f4d3a16 DRA scheduler: avoid panic during PreBind
It can happen that a claim gets deallocated in parallel to adding a new pod to
ReservedFor. Without binding conditions, that was caught by the apiserver
validation. With binding conditions, the code which checks and sets
AllocationTimestamp panics with a nil pointer access.

This has been observed in the TestDRA/all/ShareResourceClaimSequentially
integration test, but couldn't be reproduced locally:

    E0303 07:43:20.158261   39037 panic.go:262] "Observed a panic" panic="runtime error: invalid memory address or nil pointer dereference" panicGoValue="\"invalid memory address or nil pointer dereference\"" stacktrace=<
    	goroutine 554266 [running]:
    	k8s.io/apimachinery/pkg/util/runtime.logPanic({0x69ce9f0, 0xc017bc00f0}, {0x59381a0, 0x91c6570})
    		/home/prow/go/src/k8s.io/kubernetes/staging/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go:132 +0xbc
    	k8s.io/apimachinery/pkg/util/runtime.handleCrash({0x69ce8d8, 0x93d87a0}, {0x59381a0, 0x91c6570}, {0xc020506f00, 0x0, 0x200?})
    		/home/prow/go/src/k8s.io/kubernetes/staging/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go:107 +0x116
    	k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc00685ddc0?})
    		/home/prow/go/src/k8s.io/kubernetes/staging/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go:64 +0x17b
    	panic({0x59381a0?, 0x91c6570?})
    		/usr/local/go/src/runtime/panic.go:783 +0x132
    	k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources.(*DynamicResources).bindClaim.func2()
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources/dynamicresources.go:1247 +0x730
    	k8s.io/client-go/util/retry.OnError.func1()
    		/home/prow/go/src/k8s.io/kubernetes/staging/src/k8s.io/client-go/util/retry/util.go:51 +0x30
    	k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection(0x9ebcca?)
    		/home/prow/go/src/k8s.io/kubernetes/staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go:150 +0x3e
    	k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff({0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x2, 0x0}, 0xc020507410)
    		/home/prow/go/src/k8s.io/kubernetes/staging/src/k8s.io/apimachinery/pkg/util/wait/backoff.go:477 +0x5a
    	k8s.io/client-go/util/retry.OnError({0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0}, 0xc02c2d5380?, 0x4?)
    		/home/prow/go/src/k8s.io/kubernetes/staging/src/k8s.io/client-go/util/retry/util.go:50 +0x96
    	k8s.io/client-go/util/retry.RetryOnConflict(...)
    		/home/prow/go/src/k8s.io/kubernetes/staging/src/k8s.io/client-go/util/retry/util.go:104
    	k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources.(*DynamicResources).bindClaim(0xc0024adb20, {0x69cea28, 0xc0061acbe0}, 0xc011876800, 0x0, 0xc025163408, {0xc021909d80, 0x8})
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources/dynamicresources.go:1207 +0x845
    	k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources.(*DynamicResources).PreBind-range1(...)
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources/dynamicresources.go:1073
    	k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources.(*DynamicResources).PreBind.(*claimStore).all.func2(...)
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources/claims.go:72
    	k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources.(*DynamicResources).PreBind(0xc0024adb20, {0x69cea28, 0xc0061acbe0}, {0x69fc840?, 0xc0286f2540?}, 0xc025163408, {0xc021909d80, 0x8})
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources/dynamicresources.go:1071 +0x246
    	k8s.io/kubernetes/pkg/scheduler/framework/runtime.(*frameworkImpl).runPreBindPlugin(0xc009628dc8, {0x69cea28, 0xc0061acbe0}, {0x7eabfddddb30, 0xc0024adb20}, {0x69fc840, 0xc0286f2540}, 0xc025163408, {0xc021909d80, 0x8})
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/framework/runtime/framework.go:1532 +0x2e2
    	k8s.io/kubernetes/pkg/scheduler/framework/runtime.(*frameworkImpl).RunPreBindPlugins.func2({0x7eabfddddb30, 0xc0024adb20})
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/framework/runtime/framework.go:1461 +0x1cf
    	k8s.io/kubernetes/pkg/scheduler/framework/runtime.(*frameworkImpl).RunPreBindPlugins(0xc009628dc8, {0x69cea28, 0xc0061ac690}, {0x69fc840, 0xc0286f2540}, 0xc025163408, {0xc021909d80, 0x8})
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/framework/runtime/framework.go:1484 +0x623
    	k8s.io/kubernetes/pkg/scheduler.(*Scheduler).bindingCycle(0xc02a6a7500, {0x69cea28, 0xc00f22a690}, {0x69fc840, 0xc0286f2540}, {0x6a32e60, 0xc009628dc8}, {{0xc021909d80, 0x8}, 0x8, ...}, ...)
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/schedule_one.go:457 +0x72a
    	k8s.io/kubernetes/pkg/scheduler.(*Scheduler).runBindingCycle(0xc02a6a7500, {0x69ce9f0?, 0xc0027e5470?}, {0x69fc840, 0xc0286f2540}, {0x6a32e60, 0xc009628dc8}, {{0xc021909d80, 0x8}, 0x8, ...}, ...)
    		/home/prow/go/src/k8s.io/kubernetes/pkg/scheduler/schedule_one.go:164 +0x1e8
2026-03-03 16:21:38 +01:00
Sunyanan Choochotkaew
e035c41256
DRA: Promote DRAConsumableCapacity to Beta
Signed-off-by: Sunyanan Choochotkaew <sunyanan.choochotkaew1@ibm.com>
2026-03-03 18:30:26 +09:00
Mads Jensen
f11bb48738 Remove redundant re-assignment in for-loops under pkg
This applies the forvar rule from modernize. The for-loop semantics changed
in Go 1.22, making this pattern obsolete.
2026-03-02 08:47:43 +01:00
Patrick Ohly
2b18086a25 DRA scheduler: update logging
The assume cache logs adding assumed claims at V(4) but there wasn't anything
about in-flight claims in the log for a scheduling failure where the same
device was allocated twice (https://github.com/kubernetes/kubernetes/issues/133602).

Debugging that issue depends on seeing all assume cache changes (not just the
single "Assumed object" entry) and the in-flight claims. We could make them
all V(4) (= "debug level"), but they seem more appropriate for V(5) (= "trace
level"), so the assume cache verbosity gets toned down to that.
2026-02-20 08:05:23 +01:00
Patrick Ohly
e6d9cbf729 DRA scheduler plugin: increase test coverage
Line coverage isn't much better (81.3% -> 81.8%) but it's not clear whether "in
flight claims" were considered by any test case.
2026-02-20 07:58:58 +01:00
Stephen Kitt
d42d1e3d1f
Deprecate obsolete slice utility functions
... and update users to use standard library functions.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2026-02-16 10:04:33 +01:00
Sunyanan Choochotkaew
e1a7952a6c
DRA: Rename GetBaseDeviceID to GetDeviceID for SharedDeviceID
Signed-off-by: Sunyanan Choochotkaew <sunyanan.choochotkaew1@ibm.com>
2026-02-12 13:47:11 +09:00
Kubernetes Prow Robot
1bb2e12490
Merge pull request #136734 from sunya-ch/sunya-ch/fix-gather-shared-id
Fix missing GetSharedDeviceIDs bug in GatherAllocatedState
2026-02-11 01:24:00 +05:30
Sunyanan Choochotkaew
92fc98de6f
DRA: Fix missing GetSharedDeviceIDs bug in GatherAllocatedState with consumable capacity
Signed-off-by: Sunyanan Choochotkaew <sunyanan.choochotkaew1@ibm.com>
2026-02-09 10:04:29 +09:00
Mads Jensen
95616cecda Use slices.Sort instead of sort.Slice.
There were only two instances of this in the entire code-base. Hence,
I have enabled the modernize rule/linter in golangci-lint.
2026-02-06 22:46:08 +01:00
Patrick Ohly
5c19239290 DRA allocator: promote experimental -> incubating -> stable
The previous incubating version becomes stable, and experimental becomes the
new incubating version. Experimental and incubating are now identical until we
merge more experimental changes again.

Specifically, these commands were used:

    rm -rf stable
    mv incubating stable
    mv stable/allocator_incubating.go stable/allocator_stable.go
    mv stable/pools_incubating.go stable/pools_stable.go
    sed -i -e 's/package incubating/package stable/' stable/*.go
    cp -a experimental incubating
    mv incubating/allocator_experimental.go incubating/allocator_incubating.go
    mv incubating/pools_experimental.go incubating/pools_incubating.go
    sed -i -e 's/package experimental/package incubating/' incubating/*.go

Some other packages then need to be adapted, in particular the
TestAllocatorSelection test.
2026-01-29 12:52:57 +01:00
Patrick Ohly
581ee0a2ec DRA scheduler: fix another root cause of double device allocation
GatherAllocatedState and ListAllAllocatedDevices need to collect information
from different sources (allocated devices, in-flight claims), potentially even
multiple times (GatherAllocatedState first gets allocated devices, then the
capacities).

The underlying assumption that nothing bad happens in parallel is not always
true. The following log snippet shows how an update of the assume
cache (feeding the allocated devices tracker) and in-flight claims lands such
that GatherAllocatedState doesn't see the device in that claim as allocated:

    dra_manager.go:263: I0115 15:11:04.407714      18778] scheduler: Starting GatherAllocatedState
    ...
    allocateddevices.go:189: I0115 15:11:04.407945      18066] scheduler: Observed device allocation device="testdra-all-usesallresources-hvs5d.driver/worker-5/worker-5-device-094" claim="testdra-all-usesallresources-hvs5d/claim-0553"
    dynamicresources.go:1150: I0115 15:11:04.407981      89109] scheduler: Claim stored in assume cache pod="testdra-all-usesallresources-hvs5d/my-pod-0553" claim="testdra-all-usesallresources-hvs5d/claim-0553" uid=<types.UID>: a84d3c4d-f752-4cfd-8993-f4ce58643685 resourceVersion="5680"
    dra_manager.go:201: I0115 15:11:04.408008      89109] scheduler: Removed in-flight claim claim="testdra-all-usesallresources-hvs5d/claim-0553" uid=<types.UID>: a84d3c4d-f752-4cfd-8993-f4ce58643685 version="1211"
    dynamicresources.go:1157: I0115 15:11:04.408044      89109] scheduler: Removed claim from in-flight claims pod="testdra-all-usesallresources-hvs5d/my-pod-0553" claim="testdra-all-usesallresources-hvs5d/claim-0553" uid=<types.UID>: a84d3c4d-f752-4cfd-8993-f4ce58643685 resourceVersion="5680" allocation=<
        	{
        	  "devices": {
        	    "results": [
        	      {
        	        "request": "req-1",
        	        "driver": "testdra-all-usesallresources-hvs5d.driver",
        	        "pool": "worker-5",
        	        "device": "worker-5-device-094"
        	      }
        	    ]
        	  },
        	  "nodeSelector": {
        	    "nodeSelectorTerms": [
        	      {
        	        "matchFields": [
        	          {
        	            "key": "metadata.name",
        	            "operator": "In",
        	            "values": [
        	              "worker-5"
        	            ]
        	          }
        	        ]
        	      }
        	    ]
        	  },
        	  "allocationTimestamp": "2026-01-15T14:11:04Z"
        	}
         >
    dra_manager.go:280: I0115 15:11:04.408085      18778] scheduler: Device is in flight for allocation device="testdra-all-usesallresources-hvs5d.driver/worker-5/worker-5-device-095" claim="testdra-all-usesallresources-hvs5d/claim-0086"
    dra_manager.go:280: I0115 15:11:04.408137      18778] scheduler: Device is in flight for allocation device="testdra-all-usesallresources-hvs5d.driver/worker-5/worker-5-device-096" claim="testdra-all-usesallresources-hvs5d/claim-0165"
    default_binder.go:69: I0115 15:11:04.408175      89109] scheduler: Attempting to bind pod to node pod="testdra-all-usesallresources-hvs5d/my-pod-0553" node="worker-5"
    dra_manager.go:265: I0115 15:11:04.408264      18778] scheduler: Finished GatherAllocatedState allocatedDevices=<map[string]interface {} | len:2>: {

Initial state: "worker-5-device-094" is in-flight, not in cache
- goroutine #1: starts GatherAllocatedState, copies cache
- goroutine #2: adds to assume cache, removes from in-flight
- goroutine #1: checks in-flight

=> device never seen as allocated

This is the second reason for double allocation of the same device in two
different claims. The other was timing in the assume cache. Both were
tracked down with an integration test (separate commit). It did not fail
all the time, but enough that regressions should show up as flakes.
2026-01-26 15:44:48 +01:00
Antoni Zawodny
833b7205fc Run PreBind plugins in parallel if feasible 2026-01-11 14:19:18 +01:00
Patrick Ohly
dfa6aa22b2 DRA scheduler: fix unit test flakes
Test_isSchedulableAfterClaimChange was sensitive to system load because of the
arbitrary delay when waiting for the assume cache to catch up. Running inside
a synctest bubble avoids this. While at it, the unit tests get converted
to ktesting (nicer failure output, no extra indentation needed for
tCtx.SyncTest).

TestPlugin/prebind-fail-with-binding-timeout relied on setting up a claim with
certain time stamps and then getting that test case tested within a certain
real-world time window. It's surprising that this didn't flake more often
because test execution order is random. Now the time stamp gets set right
before the test case is about to be tested. Conversion to a synctest would
be nicer, but synctests cannot have sub-tests, which are used here to track
where log output and failures come from within the larger test case.

Inside the plugin itself some log output gets added to explain why a claim is
unavailable on a node in case of a binding timeout or error during Filter.
2025-12-30 11:45:02 +01:00
Patrick Ohly
5d536bfb8e DRA: log more information
For debugging double allocation of the same
device (https://github.com/kubernetes/kubernetes/issues/133602) it is necessary
to have information about pools, devices and in-flight claims. Log calls get
extended and the config for DRA CI jobs updated to enable higher verbosity for
relevant source files.

Log output in such a cluster at verbosity 6 looks like this:

I1215 10:28:54.166872       1 allocator_incubating.go:130] "Gathered pool information" logger="FilterWithNominatedPods.Filter.DynamicResources" pod="dra-8841/tester-3" node="kind-worker2" pools={"count":1,"devices":["dra-8841.k8s.io/kind-worker2/device-00"],"meta":[{"InvalidReason":"","id":"dra-8841.k8s.io/kind-worker2","isIncomplete":false,"isInvalid":false}]}
I1215 10:28:54.166941       1 allocator_incubating.go:254] "Gathered information about devices" logger="FilterWithNominatedPods.Filter.DynamicResources" pod="dra-8841/tester-3" node="kind-worker2" allocatedDevices={"count":2,"devices":["dra-8841.k8s.io/kind-worker/device-00","dra-8841.k8s.io/kind-worker3/device-00"]} minDevicesToBeAllocated=1
2025-12-16 09:58:05 +01:00
bwsalmon
854e67bb51
KEP 5598: Opportunistic Batching (#135231)
* First version of batching w/out signatures.

* First version of pod signatures.

* Integrate batching with signatures.

* Fix merge conflicts.

* Fixes from self-review.

* Test fixes.

* Fix a bug that limited batches to size 2
Also add some new high-level logging and
simplify the pod affinity signature.

* Re-enable batching on perf tests for now.

* fwk.NewStatus(fwk.Success)

* Review feedback.

* Review feedback.

* Comment fix.

* Two plugin-specific unit tests.

* Add cycle state to the sign call, apply to topo spread.
Also add unit tests for several plugin signature
calls.

* Review feedback.

* Switch to distinct stats for hint and store calls.

* Switch signature from string to []byte

* Revert cyclestate in signs. Update node affinity.
Node affinity now sorts all of the various
nested arrays in the structure. CycleState no
longer in signature; revert to signing fewer
cases for pod spread.

* hack/update-vendor.sh

* Disable signatures when extenders are configured.

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update staging/src/k8s.io/kube-scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Disable node resource signatures when extended DRA enabled.

* Review feedback.

* Update pkg/scheduler/framework/plugins/imagelocality/image_locality.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/plugins/nodedeclaredfeatures/nodedeclaredfeatures.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Fixes for review suggestions.

* Add integration tests.

* Linter fixes, test fix.

* Whitespace fix.

* Remove broken test.

* Unschedulable test.

* Remove go.mod changes.

---------

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>
2025-11-12 21:51:37 -08:00
Kubernetes Prow Robot
0cfbf89e70
Merge pull request #134189 from mortent/NewUpdatePartitionableDevices
Updates to DRA Partitionable Devices feature
2025-11-06 16:10:53 -08:00
Kubernetes Prow Robot
6232175b94
Merge pull request #134935 from alaypatel07/refactor-dra-extended-resources
refactor dra extended resources implementation in scheduler plugin
2025-11-06 15:18:59 -08:00
Morten Torkildsen
38b5750e33 DRA: Update allocator for Partitionable Devices 2025-11-06 21:30:01 +00:00
Alay Patel
f8ccc4c4d7 dra scheduler plugin: refactor extendeddynamicresources.go for readability
Signed-off-by: Alay Patel <alayp@nvidia.com>
2025-11-06 15:49:33 -05:00
Kubernetes Prow Robot
22962087ec
Merge pull request #135186 from pohly/dra-scheduler-unit-test-flake
DRA: fix for scheduler unit test flake + logging
2025-11-06 12:43:23 -08:00
Alay Patel
da9f1d8eed dra scheduler plugin: move extended resources functions into separate file
Signed-off-by: Alay Patel <alayp@nvidia.com>
2025-11-06 14:58:59 -05:00
Patrick Ohly
1c4cab9dda DRA scheduler unit test: fix race with ResourceSlice informer
The test started without waiting for the ResourceSlice informer to have
synced. As a result, the "CEL-runtime-error-for-one-of-three-nodes" test case
failed randomly with a very low flake rate (less than 1% in local runs) because
CEL expressions never got evaluated due to not having the slices (yet).

Other tests were also less reliable, but not known to fail.
2025-11-06 18:40:35 +01:00
Ed Bartosh
edbc32fa60 DRA: implement scoring for extended resources
Updated extended resource allocation scorer to calculate
allocatable and requested values for DRA-backed resources.
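
The commit message does not spell out the scoring formula. As a rough illustration only, a least-allocated style score (an assumption for this sketch, not the actual scorer from this commit) could look like:

```go
package main

import "fmt"

// scoreDevice is a hypothetical least-allocated score in the range 0-100:
// nodes with more free DRA-backed capacity score higher. Illustrative
// sketch only; names and formula are not taken from the real plugin.
func scoreDevice(allocatable, requested int64) int64 {
	if allocatable <= 0 || requested > allocatable {
		return 0
	}
	return (allocatable - requested) * 100 / allocatable
}

func main() {
	// 4 of 10 devices requested -> 60% free capacity -> score 60.
	fmt.Println(scoreDevice(10, 4)) // prints: 60
}
```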
2025-11-06 10:40:52 +02:00
Kubernetes Prow Robot
7537d52c2e
Merge pull request #134882 from yliaog/initcon
Fix non-sidecar init container device requests
2025-11-05 21:57:04 -08:00
Kubernetes Prow Robot
f025bcace9
Merge pull request #135068 from pohly/dra-device-taints-1.35-full
DRA device taint eviction: several improvements
2025-11-05 18:52:58 -08:00
yliao
6676982316 fixed non-sidecar init container device requests and mappings 2025-11-05 22:48:50 +00:00
Kubernetes Prow Robot
cf37f0bf49
Merge pull request #135037 from yliaog/extendedresourcecache
pick one device class deterministically for extended resource
2025-11-05 14:16:58 -08:00
Patrick Ohly
eaee6b6bce DRA device taints: add separate feature gate for rules
Support for DeviceTaintRules depends on a significant amount of
additional code:
- ResourceSlice tracker is a NOP without it.
- Additional informers and corresponding permissions in scheduler and controller.
- Controller code for handling status.

Not all users necessarily need DeviceTaintRules, so adding a second feature
gate for that code makes it possible to limit the blast radius of bugs in that
code without having to turn off device taints and tolerations entirely.
2025-11-05 20:03:17 +01:00
Morten Torkildsen
fbfeb33231 DRA: Add scoring for Prioritized List feature 2025-11-05 17:18:38 +00:00
yliao
949be1d132 fixed comments due to switch from class name to class for GetDeviceClass 2025-11-05 15:08:38 +00:00
Ayato Tokubi
902c2e0c15 Fix lint errors in dynamicresources_test.go
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-11-05 10:44:50 +00:00
Ayato Tokubi
c5b1493925 Add test case for claim creation failure in DRAExtendedResources
Extend the `setup` function to support API reactors, allowing custom reactions in tests.

Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-11-05 09:55:28 +00:00
Ayato Tokubi
ea7561b243 Implement scheduler_resourceclaim_creates_total metrics for DRAExtendedResources 2025-11-05 09:53:33 +00:00
yliao
c67937dd35 switched from storing name to storing a pointer to the device class. 2025-11-04 17:51:12 +00:00
fj-naji
c438f8a983 scheduler: Add BindingTimeout args to DynamicResources plugin
Add a new `bindingTimeout` field to DynamicResources plugin args and wire it
into PreBind.

Changes:
- API: add `bindingTimeout` to DynamicResourcesArgs (staging + internal types).
- Defaults: default to 600 seconds when BOTH DRADeviceBindingConditions and
  DRAResourceClaimDeviceStatus are enabled.
- Validation: require >= 1s; forbid when either feature gate is disabled.
- Plugin: plumbs args into `pl.bindingTimeout` and uses it in
  `wait.PollUntilContextTimeout` for binding-condition wait logic.
- Plugin: remove legacy `BindingTimeoutDefaultSeconds`.

Tests:
- Add/adjust unit tests for validation and PreBind timeout path.
- Ensure <1s and negative values are rejected; ensure the field is forbidden when either gate is disabled.
2025-11-04 17:15:19 +00:00
yliao
b83a6a83f0 pick the device class created latest, or with name alphabetically sorted earlier 2025-11-03 19:28:18 +00:00
yliao
3eab698884 fixed unit test and integration test failures
Fix minor nits

Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>
2025-11-03 20:07:01 +05:30
Patrick Ohly
12a0c8ce17 DRA extended resource: chain event handlers
The cache and scheduler event handlers cannot be registered separately in the
informer; doing so leads to a race (the scheduler might schedule based on an
event before the cache is updated). Chaining the event handlers (cache first,
then scheduler) avoids this.

This also ensures that the cache is up-to-date before the scheduler
starts (HasSynced of the handler registration for the cache is checked).

Other changes:
- renamed package to avoid clash with other "cache" packages
- clarified nil handling
- feature gate check before instantiating the cache
- per-test logging
- utilruntime.HandleErrorWithLogger
- simpler cache.DeletedFinalStateUnknown
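
The chaining idea can be sketched outside of client-go like this (a toy dispatcher, assumed for illustration; the real code registers a single chained handler with the shared informer):

```go
package main

import "fmt"

// EventHandler processes one informer event; obj is simplified to a string.
type EventHandler func(obj string)

// Chain returns a single handler that invokes the given handlers in order,
// guaranteeing the cache handler runs before the scheduler handler for
// every event.
func Chain(handlers ...EventHandler) EventHandler {
	return func(obj string) {
		for _, h := range handlers {
			h(obj)
		}
	}
}

func main() {
	var order []string
	cacheHandler := func(obj string) { order = append(order, "cache:"+obj) }
	schedHandler := func(obj string) { order = append(order, "sched:"+obj) }

	// One registration point, fixed ordering: cache first, then scheduler.
	handler := Chain(cacheHandler, schedHandler)
	handler("resourceslice-add")
	fmt.Println(order) // prints: [cache:resourceslice-add sched:resourceslice-add]
}
```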

Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>
2025-11-03 12:31:17 +05:30
Sai Ramesh Vanka
d8c66ffb63 Add a global cache to support DRA's extended resource to the device
class mapping

- Add a new interface "DeviceClassResolver" in the scheduler framework
- Add a global cache of mapping between the extended resource and the
  device class
- Cache can be leveraged by the k8s api-server, controller-manager along with the scheduler
- This change helps delegate requests to the dynamicresource
  plugin based on the mapping during node update events, thus
  avoiding an extra scheduling cycle

Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>
2025-11-03 12:31:16 +05:30
Kubernetes Prow Robot
808d320de1
Merge pull request #134956 from yliaog/blockowner
removed BlockOwnerDeletion
2025-10-30 01:26:11 -07:00