kubernetes

mirror of https://github.com/kubernetes/kubernetes.git synced 2026-04-22 22:58:08 -04:00

Author	SHA1	Message	Date
Kubernetes Prow Robot	4a1558c545	Merge pull request #133967 from pohly/dra-allocator-selection DRA: allocator selection	2025-09-30 08:24:18 -07:00
Patrick Ohly	7f57730ba4	DRA scheduler: fix selection of "incubating" allocator implementation In 1.34, the default feature gate selection picked the "experimental" allocator implementation when it should have used the "incubating" allocator. No harm came from that because the experimental allocator has all the necessary if checks to disable the extra code and no bugs were introduced when implementing it, but it means that our safety net wasn't there when we expected it to be. The reason is that the "DRAResourceClaimDeviceStatus" feature gate is on by default and was only listed as supported by the experimental implementation. This could be fixed by listing it as supported also by the other implementation, but that would be a bit odd because there is nothing to support for it (the reason why this was missed in 1.34!). Instead, the allocator features are now only indirectly related to feature gates, with a single boolean controlling the implementation of binding conditions.	2025-09-30 16:53:38 +02:00
Patrick Ohly	b5bcac998d	DRA scheduler: clean up feature gate handling Copying from feature.Features to new fields in the plugin got a bit silly with the long list of features that we have now. Embedding feature.Features is simpler. Two fields in feature.Features weren't named according to the feature gate, now they are named consistently and the fields are sorted.	2025-09-30 16:53:38 +02:00
hojinchoi	7028ba09db	fix: duplicated 'the' in comment	2025-09-18 18:11:44 +09:00
yliao	74cf1db218	sort the device requests in the extended resource claim spec. removed the sortClaim in the unit test.	2025-09-11 16:55:58 +00:00
yliao	79f8d1b1c5	Fixed a bug so that the implicit extended resource name can always be used, whether or not the explicit extendedResourceName field in the device class is set.	2025-09-10 14:10:40 +00:00
Ania Borowiec	fadb40199f	Move interfaces: Handle and Plugin and related types from kubernetes/kubernetes to staging repo kube-scheduler	2025-09-02 09:42:53 +00:00
yliao	bf13cd1b81	added resourceClaimModified to bindClaim to decide whether to update assume cache	2025-08-29 16:12:55 +00:00
Abu Kashem	c8ab780edb	dra plugin: assume claim after api call in bindClaim	2025-08-13 16:35:35 -04:00
yliao	2a026f6d65	1/ added retries to AssumeClaimAfterAPICall for an object which is not present in the cache (dynamicresources.go) 2/ modified the assume cache verification to not error out as long as the expected claim is in the cache, whether or not its latest and API objects differ (dynamicresources_test.go). 3/ fixed nil panic as seen from https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/133321/pull-kubernetes-integration/1952472629470302208	2025-08-06 07:08:58 +00:00
yliao	0a12f00e9d	fix nil panic in hasBindingConditions, it cannot assume claim has allocations	2025-07-30 14:44:41 +09:00
Sunyanan Choochotkaew	7f052afaef	KEP 5075: implement scheduler Signed-off-by: Sunyanan Choochotkaew <sunyanan.choochotkaew1@ibm.com>	2025-07-30 09:52:49 +09:00
yliao	34a64db2c7	extended resource backed by DRA: implementation	2025-07-29 18:55:21 +00:00
Kobayashi,Daisuke	e8c3af1f5c	KEP-5007 DRA Device Binding Conditions: Implement scheduler logic	2025-07-29 11:34:30 +00:00
Kubernetes Prow Robot	a11bc701e8	Merge pull request #132457 from ania-borowiec/depends_on_cluster_move_podinfo Moving Scheduler interfaces to staging: Move PodInfo and NodeInfo interfaces (together with related types) to staging repo, leaving internal implementation in kubernetes/kubernetes/pkg/scheduler	2025-07-24 09:38:27 -07:00
Ania Borowiec	aecd37e6fb	Moving Scheduler interfaces to staging: Move PodInfo and NodeInfo interfaces (together with related types) to staging repo, leaving internal implementation in kubernetes/kubernetes/pkg/scheduler	2025-07-24 12:10:58 +00:00
Patrick Ohly	5c4f81743c	DRA: use v1 API As before when adding v1beta2, DRA drivers built using the k8s.io/dynamic-resource-allocation helper packages remain compatible with all Kubernetes releases >= 1.32. The helper code picks whatever API version is enabled from v1beta1/v1beta2/v1. However, the control plane now depends on v1, so a cluster configuration where only v1beta1 or v1beta2 are enabled without v1 won't work.	2025-07-24 08:33:45 +02:00
Patrick Ohly	bc338e7505	DRA scheduler: implement filter timeout and cancellation The intent is to catch abnormal runtimes with the generously large default timeout of 10 seconds. We have to set up a context with the configured timeout (optional!), then ensure that both CEL evaluation and the allocation logic itself properly return the context error. The scheduler plugin then can convert that into "unschedulable". The allocator and thus Filter now also check for context cancellation by the scheduler. This happens when enough nodes have been found.	2025-07-17 21:18:28 +02:00
Patrick Ohly	025c606e39	DRA scheduler: add plugin configuration The only option is the filter timeout. The implementation of it follows in a separate commit.	2025-07-17 16:47:47 +02:00
yliao	dd3691b169	refactor allocator, removed claimsToAllocate from NewAllocator(), instead, passed it through Allocate()	2025-07-16 15:11:11 +00:00
Kubernetes Prow Robot	ab685237f0	Merge pull request #132391 from sanposhiho/pre-bind-pre-flight feat: add PreBindPreFlight and implement in in-tree plugins	2025-07-15 04:06:23 -07:00
Patrick Ohly	5caf7bca15	DRA allocator: refactor code The goal is to maintain different versions of the allocator logic. We already had one incident where adding an alpha feature caused a regression also when it was disabled. Not everything can be implemented within obviously correct if branches. This also opens the door for implementing different alternatives. The code just gets moved around for now.	2025-07-10 17:34:21 +02:00
Kensei Nakada	ebae419337	feat: add PreBindPreFlight and implement in in-tree plugins	2025-07-05 17:14:21 -07:00
Ania Borowiec	ee8c265d35	Move Code and Status from pkg/scheduler/framework to k8s.io/kube-scheduler/framework	2025-06-30 10:06:22 +00:00
Ania Borowiec	00d3750503	Move ClusterEvent type to staging repo, leaving some functions (that contain logic internal to scheduler) in kubernetes/kubernetes (#132190) * Move ClusterEvent type to staging repo, leaving some functions (that contain logic internal to scheduler) in kubernetes/kubernetes apply review comment and fix linter warning * update-vendor.sh * update doc comments * run update-vendor.sh	2025-06-26 08:06:29 -07:00
Davanum Srinivas	03afe6471b	Add a replacement for cmp.Diff using json+go-difflib Co-authored-by: Jordan Liggitt <jordan@liggitt.net> Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2025-06-16 17:10:42 -04:00
Ania Borowiec	d75af825fb	Extract interface CycleState and move it to staging repo. CycleState implementation remains in k/k/pkg/scheduler/framework	2025-05-29 16:18:36 +00:00
Kubernetes Prow Robot	8a6b916765	Merge pull request #130720 from saintube/scheduler-expose-nodeinfo-in-prefilter Expose NodeInfo to PreFilter plugins	2025-04-23 13:31:29 -07:00
saintube	8dc6806d26	Expose NodeInfo to PreFilter plugins and Framework Co-authored-by: Zhan Sheng <49895476+AxeZhan@users.noreply.github.com> Co-authored-by: shenxin <rougang.hrg@alibaba-inc.com> Signed-off-by: saintube <saintube@foxmail.com>	2025-03-21 14:55:25 +08:00
Cici Huang	f04cfdf6e7	Update gofmt.	2025-03-19 23:21:30 +00:00
Cici Huang	6d7f11689d	Complete feature impl, fix issues, add perDeviceNodeSelection support, add tests, address comments, etc.	2025-03-19 22:10:48 +00:00
Morten Torkildsen	ecba6cde1d	Allocator updates	2025-03-19 22:10:48 +00:00
Jon Huhn	5760a4f282	DRA scheduler: device taints and tolerations Thanks to the tracker, the plugin sees all taints directly in the device definition and can compare them against the tolerations of a request while trying to find a device for the request. When the feature is turned off, taints are ignored during scheduling.	2025-03-19 09:18:38 +01:00
Patrick Ohly	dfb8ab6521	DRA scheduler: fail in PreFilter when DRAPrioritizedList is disabled and used This was previously caught during Filter by the allocator check. Doing it sooner avoids wasting resources on a pod which ultimately cannot get scheduled. While at it, be a bit more clear about which feature is disabled. The user might not know that.	2025-03-07 08:45:32 +01:00
Morten Torkildsen	2229a78dfe	DRA: Update allocator for Prioritized Alternatives in Device Requests	2025-02-28 19:30:10 +00:00
Kubernetes Prow Robot	fc268ecd09	Merge pull request #129823 from googs1025/chore/log_improve fix(dra plugin): when there is no resourceclaim, return directly	2025-02-02 16:28:56 -08:00
googs1025	ed826dddfe	fix(dra plugin): when there is no resourceclaim, return directly	2025-01-29 08:47:52 +08:00
Davanum Srinivas	4e05bc20db	Linter to ensure go-cmp/cmp is used ONLY in tests Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2025-01-24 20:49:14 -05:00
googs1025	77eae7c34f	feature(scheduler): remove dra plugin resourceslice QueueingHintFn	2025-01-08 16:24:28 +08:00
Patrick Ohly	33ea278c51	DRA: use v1beta1 API No code is left which depends on the v1alpha3, except of course the code implementing that version.	2024-11-06 13:03:19 +01:00
Kuba Tużnik	8d489425aa	scheduler/dynamicresources: extract obtaining and tracking in-memory modifications of DRA objects All logic related to obtaining DRA objects and tracking modifications to ResourceClaims in-memory is extracted to DefaultDRAManager, which implements framework.SharedDRAManager. This is intended to be a no-op in terms of the DRA plugin behavior.	2024-11-05 14:11:04 +01:00
Patrick Ohly	7863d9a381	DRA scheduler: refactor CEL compilation cache A better place is the cel package because a) the name can become shorter and b) it is tightly coupled with the compiler there. Moving the compilation into the cache simplifies the callers.	2024-11-05 08:34:42 +01:00
Patrick Ohly	6f07fa3a5e	DRA scheduler: update some stale comments	2024-11-01 13:23:42 +01:00
Patrick Ohly	ae6b5522ea	DRA scheduler: rename variable "Allocated devices" are the ones which can be observed from the informer. "All allocated devices" also includes those which are in flight and haven't been written back to the apiserver.	2024-11-01 13:23:42 +01:00
Patrick Ohly	0130ebba1d	DRA scheduler: refactor "allocated devices" lookup The logic for skipping "admin access" was repeated in three different places. A single foreachAllocatedDevices with a callback puts it into one function.	2024-11-01 13:23:28 +01:00
Patrick Ohly	bc55e82621	DRA scheduler: maintain a set of allocated device IDs Reacting to events from the informer cache (indirectly, through the assume cache) is more efficient than repeatedly listing its content and then converting to IDs with unique strings. goos: linux goarch: amd64 pkg: k8s.io/kubernetes/test/integration/scheduler_perf cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz │ before │ after │ │ SchedulingThroughput/Average │ SchedulingThroughput/Average vs base │ PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36 54.70 ± 6% 76.81 ± 6% +40.42% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36 106.4 ± 4% 105.6 ± 2% ~ (p=0.413 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36 120.0 ± 4% 118.9 ± 7% ~ (p=0.117 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36 112.5 ± 4% 105.9 ± 4% -5.87% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36 87.13 ± 4% 123.55 ± 4% +41.80% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36 113.4 ± 2% 103.3 ± 2% -8.95% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36 65.55 ± 3% 121.30 ± 3% +85.05% (p=0.002 n=6) geomean 90.81 106.8 +17.57%	2024-11-01 13:23:06 +01:00
Patrick Ohly	814c9428fd	DRA scheduler: cache compiled CEL expressions DeviceClasses and different requests are very likely to contain the same expression string. We don't need to compile that over and over again. To avoid hanging onto that cache longer than necessary, it's currently tied to each PreFilter/Filter combination. It might make sense to move this up into the scheduler plugin and thus reuse compiled expressions for different pods. goos: linux goarch: amd64 pkg: k8s.io/kubernetes/test/integration/scheduler_perf cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz │ before │ after │ │ SchedulingThroughput/Average │ SchedulingThroughput/Average vs base │ PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36 33.95 ± 4% 36.65 ± 2% +7.95% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36 105.8 ± 2% 106.7 ± 3% ~ (p=0.177 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36 100.7 ± 1% 119.7 ± 3% +18.82% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36 90.78 ± 1% 121.10 ± 4% +33.40% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36 50.51 ± 7% 63.72 ± 3% +26.17% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36 103.7 ± 5% 110.2 ± 2% +6.32% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36 28.50 ± 2% 28.16 ± 5% ~ (p=0.102 n=6) geomean 64.99 73.15 +12.56%	2024-11-01 13:20:06 +01:00
Patrick Ohly	941d17b3b8	DRA scheduler: code cleanups Looking up the slice can be avoided by storing it when allocating a device. The AllocationResult struct is small enough that it can be copied by value. goos: linux goarch: amd64 pkg: k8s.io/kubernetes/test/integration/scheduler_perf cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz │ before │ after │ │ SchedulingThroughput/Average │ SchedulingThroughput/Average vs base │ PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36 33.30 ± 2% 33.95 ± 4% ~ (p=0.288 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36 105.3 ± 2% 105.8 ± 2% ~ (p=0.524 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36 100.8 ± 1% 100.7 ± 1% ~ (p=0.738 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36 90.96 ± 2% 90.78 ± 1% ~ (p=0.952 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36 49.84 ± 4% 50.51 ± 7% ~ (p=0.485 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36 103.8 ± 1% 103.7 ± 5% ~ (p=0.582 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36 27.21 ± 7% 28.50 ± 2% ~ (p=0.065 n=6) geomean 64.26 64.99 +1.14%	2024-11-01 13:19:51 +01:00
Patrick Ohly	1246898315	DRA scheduler: ResourceSlice with unique strings Using unique strings instead of normal strings speeds up allocation with structured parameters because maps that use those strings as key no longer need to build hashes of the string content. However, care must be taken to call unique.Make as little as possible because it is costly. Pre-allocating the map of allocated devices reduces the need to grow the map when adding devices. goos: linux goarch: amd64 pkg: k8s.io/kubernetes/test/integration/scheduler_perf cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz │ before │ after │ │ SchedulingThroughput/Average │ SchedulingThroughput/Average vs base │ PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36 18.06 ± 2% 33.30 ± 2% +84.31% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36 104.7 ± 2% 105.3 ± 2% ~ (p=0.818 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36 96.62 ± 1% 100.75 ± 1% +4.28% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36 83.00 ± 2% 90.96 ± 2% +9.59% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36 32.45 ± 7% 49.84 ± 4% +53.60% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36 95.22 ± 7% 103.80 ± 1% +9.00% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36 9.111 ± 10% 27.215 ± 7% +198.69% (p=0.002 n=6) geomean 45.86 64.26 +40.12%	2024-11-01 13:19:48 +01:00
Patrick Ohly	7de6d070f2	DRA scheduler: avoid listing claims during Filter The Allocate call used to call back into the claim lister for each node. This was significant work which showed up at the top of the CPU profile. It's okay to list only once during PreFilter because the Filter call does not change the claim status between Allocate calls. goos: linux goarch: amd64 pkg: k8s.io/kubernetes/test/integration/scheduler_perf cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz │ before │ after │ │ SchedulingThroughput/Average │ SchedulingThroughput/Average vs base │ PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36 15.04 ± 0% 18.06 ± 2% +20.07% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36 105.5 ± 1% 104.7 ± 2% ~ (p=0.485 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36 95.83 ± 1% 96.62 ± 1% ~ (p=0.063 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36 79.67 ± 3% 83.00 ± 2% +4.18% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36 27.11 ± 5% 32.45 ± 7% +19.68% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36 84.00 ± 3% 95.22 ± 7% +13.36% (p=0.002 n=6) PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36 7.110 ± 6% 9.111 ± 10% +28.15% (p=0.002 n=6) geomean 41.05 45.86 +11.73%	2024-11-01 12:43:17 +01:00


124 commits