181 commits

Author SHA1 Message Date
Kubernetes Prow Robot
7f3a5ab96f
Merge pull request #136579 from romanbaron/reuse-scheduling-signature
Reuse pod scheduling signature for opportunistic batching
2026-03-13 20:15:39 +05:30
Roman Baron
c0e973dc70 scheduler: Replaced context.Context and testing.T parameters with ktesting.TContext in scheduling_queue_test.go 2026-03-12 17:31:11 +02:00
Bartosz
43c5d2a419
Add PlacementGeneratePlugin interface and runner 2026-03-12 09:33:05 +00:00
Roman Baron
de1385fe1b scheduler: Added ObserveFrameworkDurationAsync to metrics recorder 2026-03-12 10:31:38 +02:00
Roman Baron
7b00255135 scheduler: Removed plugin stats from pod signing process 2026-03-12 10:31:04 +02:00
Roman Baron
3a6c169034 scheduler: Reuse scheduling signature for opportunistic batching 2026-03-12 10:30:32 +02:00
Bartosz
f50ae7284a
Remove feature gate check for placement score plugin validation 2026-03-10 09:42:07 +00:00
Bartosz
335f043756
Add Min/MaxScore to replace Min/MaxNodeScore 2026-03-10 09:42:05 +00:00
Bartosz
db3c8f3a4b
Add PlacementScorePlugin interface and runner 2026-03-10 09:42:04 +00:00
Antoni Zawodny
3f094dc228
Create Workload API v1alpha2 (#136976)
* Drop WorkloadRef field and introduce SchedulingGroup field in Pod API

* Introduce v1alpha2 Workload and PodGroup APIs, drop v1alpha1 Workload API

Co-authored-by: yongruilin <yongrlin@outlook.com>

* Run hack/update-codegen.sh

* Adjust kube-scheduler code and integration tests to v1alpha2 API

* Drop v1alpha1 scheduling API group and run make update

---------

Co-authored-by: yongruilin <yongrlin@outlook.com>
2026-03-10 07:59:10 +05:30
Maciej Skoczeń
912bf9c4ef Use a single RunPermitPlugins function and call AddWaitingPod outside a framework 2026-03-02 15:14:12 +00:00
Kubernetes Prow Robot
8812ec563c
Merge pull request #134353 from skitt/drop-string-slice
Deprecate obsolete slice utility functions
2026-02-20 00:57:41 +05:30
Maciej Wyrzuc
4a326b0196 Preempt pods in prebind phase without delete calls.
This change allows preemption to preempt a pod that is not yet
bound (but is already in the prebind phase) without issuing a delete call to the
apiserver.

Pods are added to a special map of pods currently in the prebind phase, and
preemption can cancel the context that is used for a given pod's prebind phase,
allowing it to gracefully handle error in the same manner as errors
coming out from prebind plugins. This results in pods being pushed to
backoff queue, allowing them to be rescheduled in upcoming scheduling
cycles.
2026-02-18 09:00:23 +00:00
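The mechanism described in 4a326b0196 (a map of pods in the prebind phase, plus a per-pod context that preemption can cancel) might look roughly like the following. This is a minimal sketch, not the scheduler's code: `preBindTracker`, `errPreempted`, `runPreBind`, and the pod key format are all hypothetical names chosen for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// errPreempted is a hypothetical cancellation cause used only in this sketch.
var errPreempted = errors.New("preempted during prebind")

// preBindTracker is a hypothetical stand-in for the "special map of pods
// currently in prebind phase" mentioned in the commit message.
type preBindTracker struct {
	mu      sync.Mutex
	cancels map[string]context.CancelCauseFunc
}

func newPreBindTracker() *preBindTracker {
	return &preBindTracker{cancels: make(map[string]context.CancelCauseFunc)}
}

// start registers the pod and returns the context its prebind work should use.
func (t *preBindTracker) start(parent context.Context, podKey string) context.Context {
	ctx, cancel := context.WithCancelCause(parent)
	t.mu.Lock()
	defer t.mu.Unlock()
	t.cancels[podKey] = cancel
	return ctx
}

// finish removes the pod once prebind has ended and releases its context.
func (t *preBindTracker) finish(podKey string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if cancel, ok := t.cancels[podKey]; ok {
		cancel(nil) // no-op if the context was already cancelled
		delete(t.cancels, podKey)
	}
}

// preempt cancels the pod's prebind context instead of deleting the pod.
// It reports whether the pod was found in the prebind phase.
func (t *preBindTracker) preempt(podKey string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	cancel, ok := t.cancels[podKey]
	if ok {
		cancel(errPreempted)
	}
	return ok
}

// runPreBind simulates a prebind plugin that blocks until its work is done
// or its context is cancelled.
func runPreBind(ctx context.Context, done <-chan struct{}) error {
	select {
	case <-done:
		return nil
	case <-ctx.Done():
		return context.Cause(ctx)
	}
}

// demoPreempt reports whether a preempted prebind surfaces errPreempted.
func demoPreempt() bool {
	tracker := newPreBindTracker()
	ctx := tracker.start(context.Background(), "default/pod-a")
	result := make(chan error, 1)
	go func() { result <- runPreBind(ctx, make(chan struct{})) }()

	tracker.preempt("default/pod-a")
	err := <-result
	tracker.finish("default/pod-a")
	return errors.Is(err, errPreempted)
}

func main() {
	fmt.Println(demoPreempt()) // prints true
}
```

In this sketch the caller receives an ordinary error from the prebind step, so it can follow the same failure path as any other prebind error; the commit message says that in the scheduler this path ends with the pod in the backoff queue.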
Kubernetes Prow Robot
f9c9f03b05
Merge pull request #136618 from macsko/workload_scheduling_cycle
KEP-4671: Introduce Workload Scheduling Cycle
2026-02-17 15:21:04 +05:30
Maciej Skoczeń
6233b25907 Introduce Workload Scheduling Cycle
Add integration tests for gang and basic policy workload scheduling

Add more tests for cluster snapshot

Proceed to binding cycle just after pod group cycle

Enforce one scheduler name per pod group, rename workload cycle to pod group cycle

Add unit tests for pod group scheduling cycle

Run ScheduleOne tests treating pod as part of a pod group

Rename NeedsPodGroupCycle to NeedsPodGroupScheduling

Observe correct per-pod and per-podgroup metrics during pod group cycle

Rename pod group algorithm status to waiting_on_preemption

Mention forgotAllAssumedPods is a safety check
2026-02-17 09:02:32 +00:00
Stephen Kitt
d42d1e3d1f
Deprecate obsolete slice utility functions
... and update users to use standard library functions.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2026-02-16 10:04:33 +01:00
Kubernetes Prow Robot
49fe2ecce1
Merge pull request #135719 from Argh4k/waiting-pod-integration-test
Put pods preempted in WaitOnPermit to backoff queue
2026-01-30 23:36:24 +05:30
Maciej Wyrzuc
f1f3a08ba7 Put pods preempted in WaitOnPermit in backoff queue 2026-01-30 09:38:17 +00:00
Antoni Zawodny
833b7205fc Run PreBind plugins in parallel if feasible 2026-01-11 14:19:18 +01:00
Antoni Zawodny
16b375e4ef Generalize ErrorChannel to other underlying types 2026-01-11 13:58:06 +01:00
Patrick Ohly
7a4d650125 DRA extended resources: fix flake in unit tests
The tests assumed that instantiating a DRAManager followed by
informerFactory.WaitForCacheSync would be enough to have the manager
up-to-date, but that's not correct: the test only waits for informer *caches*
to be synced, but syncing *event handlers* like the one in the manager may
still be going on. The flake rate is low, though:

    $ GOPATH/bin/stress -p 256 ./noderesources.test
    5s: 0 runs so far, 0 failures, 256 active
    10s: 256 runs so far, 0 failures, 256 active
    15s: 256 runs so far, 0 failures, 256 active
    20s: 512 runs so far, 0 failures, 256 active
    25s: 567 runs so far, 0 failures, 256 active
    30s: 771 runs so far, 0 failures, 256 active

    /tmp/go-stress-20251226T181044-974980161
    --- FAIL: TestCalculateResourceAllocatableRequest (0.81s)
        --- FAIL: TestCalculateResourceAllocatableRequest/DRA-backed-resource-with-shared-device-allocation (0.00s)
            extendedresourcecache.go:197: I1226 18:11:14.431337] Updated extended resource cache for explicit mapping extendedResource="extended.resource.dra.io/something" deviceClass="device-class-name"
            extendedresourcecache.go:204: I1226 18:11:14.431380] Updated extended resource cache for default mapping extendedResource="deviceclass.resource.kubernetes.io/device-class-name" deviceClass="device-class-name"
            extendedresourcecache.go:220: I1226 18:11:14.431394] Updated device class mapping deviceClass="device-class-name" extendedResource="extended.resource.dra.io/something"
            resource_allocation_test.go:595: Expected requested=2, but got requested=1
    FAIL

It becomes higher when changing WaitForCacheSync such that it doesn't poll and
therefore returns more promptly, which is where this flake was first observed.

The fix is to run the test in a synctest bubble where Wait can be used to wait
for all background activity, including event handling, to be finished before
proceeding with the test.

synctest is less forgiving about lingering goroutines. A synctest bubble must
wait for goroutines to stop, which in this case means that there has to be
a way to wait for the metric recorder shutdown. Event handlers have to be
removed.

This could be done with plain Go, but here test/utils/ktesting is used instead
because it offers some advantages:
- less boilerplate code
- automatic cancellation of the context (i.e. less manual context.WithCancel)
- tCtx.SyncTest is a direct substitute for t.Run, which avoids re-indenting
  sub-tests. synctest itself needs another anonymous function, which makes
  the line too long and forces re-indentation:
     t.Run(... func(...) {
         synctest.Test(... func() {
         })
     })

For the sake of consistency all tests get updated.

While at it, some code gets improved:

- t.Fatal(err) is not a good way to report an error because
  there is no additional markup in the test output that indicates
  that there was an unexpected error. It just logs err.Error(),
  which might not be very informative and/or obvious.
- newTestDRAManager aborts in case of a failure instead of
  returning an error.
2025-12-27 09:47:56 +01:00
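The race described in 7a4d650125 (a cache counts as synced while event handlers are still running) can be illustrated with plain Go. The sketch below deliberately uses a `sync.WaitGroup` rather than synctest, so it is not the fix the commit applies; `toyInformer` and `manager` are hypothetical types, not client-go or scheduler APIs.

```go
package main

import (
	"fmt"
	"sync"
)

// manager is a hypothetical consumer that derives state from events.
type manager struct {
	mu    sync.Mutex
	total int
}

func (m *manager) onAdd(n int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.total += n
}

func (m *manager) get() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.total
}

// toyInformer is a hypothetical informer: its cache is filled synchronously,
// while handlers are delivered on background goroutines.
type toyInformer struct {
	cache   []int
	pending sync.WaitGroup
}

// add stores the item (the cache is "synced" as soon as this returns) and
// dispatches the handler asynchronously.
func (i *toyInformer) add(n int, handler func(int)) {
	i.cache = append(i.cache, n)
	i.pending.Add(1)
	go func() {
		defer i.pending.Done()
		handler(n)
	}()
}

// waitForHandlers blocks until every dispatched handler has returned.
func (i *toyInformer) waitForHandlers() { i.pending.Wait() }

// demoHandlers returns the manager's state after waiting for the handlers.
func demoHandlers() int {
	m := &manager{}
	inf := &toyInformer{}
	inf.add(1, m.onAdd)
	inf.add(1, m.onAdd)
	// Reading m.get() here could observe 0, 1, or 2: the cache already holds
	// both items, but the handlers may not have run yet.
	inf.waitForHandlers()
	return m.get()
}

func main() {
	fmt.Println(demoHandlers()) // prints 2
}
```

The explicit wait is the extra bookkeeping that, per the commit message, a synctest bubble makes unnecessary because its Wait covers all background activity.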
bwsalmon
854e67bb51
KEP 5598: Opportunistic Batching (#135231)
* First version of batching w/out signatures.

* First version of pod signatures.

* Integrate batching with signatures.

* Fix merge conflicts.

* Fixes from self-review.

* Test fixes.

* Fix a bug that limited batches to size 2
Also add some new high-level logging and
simplify the pod affinity signature.

* Re-enable batching on perf tests for now.

* fwk.NewStatus(fwk.Success)

* Review feedback.

* Review feedback.

* Comment fix.

* Two plugin-specific unit tests.

* Add cycle state to the sign call, apply to topo spread.
Also add unit tests for several plugin signature
calls.

* Review feedback.

* Switch to distinct stats for hint and store calls.

* Switch signature from string to []byte

* Revert cyclestate in signs. Update node affinity.
Node affinity now sorts all of the various
nested arrays in the structure. CycleState no
longer in signature; revert to signing fewer
cases for pod spread.

* hack/update-vendor.sh

* Disable signatures when extenders are configured.

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update staging/src/k8s.io/kube-scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Disable node resource signatures when extended DRA enabled.

* Review feedback.

* Update pkg/scheduler/framework/plugins/imagelocality/image_locality.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/plugins/nodedeclaredfeatures/nodedeclaredfeatures.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Fixes for review suggestions.

* Add integration tests.

* Linter fixes, test fix.

* Whitespace fix.

* Remove broken test.

* Unschedulable test.

* Remove go.mod changes.

---------

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>
2025-11-12 21:51:37 -08:00
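Two items in 854e67bb51 (signatures as `[]byte`, and node affinity sorting its nested arrays) suggest an order-independent signature. The following is a minimal sketch of that idea only: the `requirement` type, the `sign` function, and the JSON encoding are assumptions made for illustration and are not taken from the scheduler's implementation.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"slices"
	"strings"
)

// requirement is a hypothetical, simplified stand-in for a node selector
// requirement.
type requirement struct {
	Key    string   `json:"key"`
	Values []string `json:"values"`
}

// sign returns a canonical byte signature: values are sorted within each
// requirement and requirements are sorted by key, then by values, so inputs
// that differ only in ordering produce identical bytes.
func sign(reqs []requirement) ([]byte, error) {
	canon := make([]requirement, len(reqs))
	for i, r := range reqs {
		values := slices.Clone(r.Values) // do not mutate the caller's slice
		slices.Sort(values)
		canon[i] = requirement{Key: r.Key, Values: values}
	}
	slices.SortFunc(canon, func(a, b requirement) int {
		if c := strings.Compare(a.Key, b.Key); c != 0 {
			return c
		}
		return slices.Compare(a.Values, b.Values)
	})
	return json.Marshal(canon)
}

// sameSignature reports whether both inputs sign to the same bytes.
func sameSignature(a, b []requirement) bool {
	sa, errA := sign(a)
	sb, errB := sign(b)
	return errA == nil && errB == nil && bytes.Equal(sa, sb)
}

func main() {
	a := []requirement{
		{Key: "zone", Values: []string{"b", "a"}},
		{Key: "arch", Values: []string{"amd64"}},
	}
	b := []requirement{
		{Key: "arch", Values: []string{"amd64"}},
		{Key: "zone", Values: []string{"a", "b"}},
	}
	fmt.Println(sameSignature(a, b)) // prints true
}
```

Sorting before encoding means that, in this sketch, pods whose constraints are equivalent but listed in a different order still compare equal.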
Maciej Skoczeń
8d67173de0 Implement Gang scheduling in kube-scheduler 2025-11-06 10:47:29 +00:00
Hemant Kumar
002774c315 Address review comments 2025-11-04 11:16:43 -05:00
Hemant Kumar
fe3722dfa9 Address review comments
Change type name and stuff
2025-11-03 16:27:06 -05:00
Hemant Kumar
c71e45c735 Implement a csimanager for managing storage related assets 2025-10-31 11:06:58 -04:00
Hemant Kumar
7bbec73192 Add an interface for sharing CSINode objects between scheduler and CAS 2025-10-30 13:53:10 -04:00
Ania Borowiec
fadb40199f
Move interfaces: Handle and Plugin and related types from kubernetes/kubernetes to staging repo kube-scheduler 2025-09-02 09:42:53 +00:00
Kensei Nakada
ac9fad6030 feat: trigger PreFilterPreBind in the binding cycle 2025-07-29 19:01:02 +09:00
Maciej Skoczeń
17d733e243 KEP-5229: Send API calls through dispatcher and cache 2025-07-25 15:35:36 +00:00
Ania Borowiec
aecd37e6fb
Moving Scheduler interfaces to staging: Move PodInfo and NodeInfo interfaces (together with related types) to staging repo, leaving internal implementation in kubernetes/kubernetes/pkg/scheduler 2025-07-24 12:10:58 +00:00
Ania Borowiec
ee8c265d35
Move Code and Status from pkg/scheduler/framework to k8s.io/kube-scheduler/framework 2025-06-30 10:06:22 +00:00
Ania Borowiec
00d3750503
Move ClusterEvent type to staging repo, leaving some functions (that contain logic internal to scheduler) in kubernetes/kubernetes (#132190)
* Move ClusterEvent type to staging repo, leaving some functions (that contain logic internal to scheduler) in kubernetes/kubernetes

apply review comment and fix linter warning

* update-vendor.sh

* update doc comments

* run update-vendor.sh
2025-06-26 08:06:29 -07:00
Ania Borowiec
d75af825fb
Extract interface CycleState and move it to staging repo. CycleState implementation remains in k/k/pkg/scheduler/framework 2025-05-29 16:18:36 +00:00
Kubernetes Prow Robot
8a6b916765
Merge pull request #130720 from saintube/scheduler-expose-nodeinfo-in-prefilter
Expose NodeInfo to PreFilter plugins
2025-04-23 13:31:29 -07:00
saintube
8dc6806d26 Expose NodeInfo to PreFilter plugins and Framework
Co-authored-by: Zhan Sheng <49895476+AxeZhan@users.noreply.github.com>
Co-authored-by: shenxin <rougang.hrg@alibaba-inc.com>
Signed-off-by: saintube <saintube@foxmail.com>
2025-03-21 14:55:25 +08:00
dom4ha
4deb4f2b5f Trigger rescheduling on delete event also when unscheduled pod is removed 2025-03-10 15:03:50 +00:00
saintube
afb4e96510 Expose NodeInfo to Score plugins
Co-authored-by: shenxin <rougang.hrg@alibaba-inc.com>
Signed-off-by: saintube <saintube@foxmail.com>
2025-03-04 17:57:14 +08:00
Kensei Nakada
c322294883 implement PodActivator to activate when preemption fails 2024-11-07 14:09:35 +09:00
Kuba Tużnik
87cd496a29 scheduler/framework: introduce pluggable SharedDRAManager
SharedDRAManager will be used by the DRA plugin to obtain DRA
objects, and to track modifications to them in-memory. The current
DRA plugin behavior will be the default implementation of
SharedDRAManager.

Plugging a different implementation will allow Cluster Autoscaler
to provide a simulated state of DRA objects to the DRA plugin when
making scheduling simulations, as well as obtain the modifications
to DRA objects from the plugin.
2024-11-05 13:52:57 +01:00
Kubernetes Prow Robot
ea1143efc7
Merge pull request #126022 from macsko/new_node_to_status_map_structure
Change structure of NodeToStatus map in scheduler
2024-08-13 21:02:55 -07:00
Maciej Skoczeń
98be7dfc5d Change structure of NodeToStatus map in scheduler 2024-07-25 07:48:35 +00:00
googs1025
a3978e8315 scheduler: Add ctx param and error return to EnqueueExtensions.EventsToRegister() 2024-07-18 12:22:17 +08:00
Kubernetes Prow Robot
b6899c5e08
Merge pull request #122251 from olderTaoist/unschedulable-plugin
register unschedulable plugin for those plugins whose PreFilterResult filters out some nodes
2024-07-05 05:44:26 -07:00
olderTaoist
b478621596 register unschedulable plugin when prefilter with NodeNames 2024-07-02 13:02:45 +08:00
Kubernetes Prow Robot
8c478a06d8
Merge pull request #124595 from pohly/dra-scheduler-assume-cache-eventhandlers
DRA: scheduler event handlers via assume cache
2024-06-25 11:56:28 -07:00
Patrick Ohly
9a6f3b9388 scheduler: central ResourceClaim assume cache
This enables connecting the event handler for ResourceClaim to the assume
cache, which addresses a theoretical race condition.

It may also be useful for implementing the autoscaler support, because now
the autoscaler can modify the content of the cache.
2024-06-25 14:00:25 +02:00
NoicFank
31a4b13238 enhancement(scheduler): share waitingPods among profiles 2024-05-17 17:07:27 +08:00
kerthcet
84750fe52e Revert "enhancement(scheduler): share waitingPods among profiles"
This reverts commit 227c1915db.
2024-03-19 22:52:59 +01:00
NoicFank
227c1915db enhancement(scheduler): share waitingPods among profiles 2024-02-01 10:06:23 +08:00