Commit graph

196 commits

Author SHA1 Message Date
weizhoublue
4e64e19fb7
scheduler: log error instead of failing when GangScheduling plugin is missing
Signed-off-by: weizhoublue <weizhou.lan@daocloud.io>
2026-06-18 16:26:58 +08:00
Maciej Skoczeń
54ca619d4b Merge GangScheduling and WorkloadAwarePreemption feature gates into GenericWorkload 2026-06-15 11:42:10 +00:00
Bartosz
499e68fd23
Return error PlacementFeasible is not implemented in GangScheduling 2026-06-02 10:30:49 +00:00
Kubernetes Prow Robot
ae84ac1a16
Merge pull request #138274 from wtravO/wtravo/placement-cycle-state
Add PlacementCycleState to WAS scheduler framework
2026-05-28 16:28:48 +05:30
wtravO
b968273f03 Pass PlacementCycleState to PlacementFeasible plugins 2026-05-27 10:44:44 -04:00
wtravO
bb6664b1e2 Address review feedback 2026-05-27 10:25:23 -04:00
wtravO
101af7309c Keep PlacementCycleState out of PodGroupAssignments 2026-05-27 10:25:23 -04:00
wtravO
bd97e3f190 expose PodGroupCycleState via PlacementCycleState 2026-05-27 10:25:23 -04:00
Maciej Skoczeń
8eb66b73ef Add support for PodGroups in scheduling queue 2026-05-27 13:06:13 +00:00
Bartosz
1e1bad1dde
Add PlacementFeasible plugin to support early gang termination 2026-05-25 10:36:03 +00:00
Maciej Skoczeń
0d9aca8306 Handle opportunistic batching correctly for PodGroup scheduling cycle 2026-05-04 15:28:10 +00:00
Maciej Wyrzuc
1f15743e49 Add pod group preemption 2026-03-23 16:22:32 +00:00
Maciej Wyrzuc
1382c96217 Allow cycle state to skip all post filter plugins 2026-03-23 12:30:23 +00:00
Nour
aa5e5ea9d6
scheduler: use contextual logging for event emission
Signed-off-by: Nour <nurmn3m@gmail.com>
2026-03-19 14:33:09 +02:00
Omar Sayed
e1b18e34ff snapshot pod group state before scheduling cycle and embed pod group manager into cache 2026-03-13 21:44:17 +00:00
Kubernetes Prow Robot
7f3a5ab96f
Merge pull request #136579 from romanbaron/reuse-scheduling-signature
Reuse pod scheduling signature for opportunistic batching
2026-03-13 20:15:39 +05:30
Roman Baron
c0e973dc70 scheduler: Replaced context.Context and testing.T parameters with ktesting.TContext in scheduling_queue_test.go 2026-03-12 17:31:11 +02:00
Bartosz
43c5d2a419
Add PlacementGeneratePlugin interface and runner 2026-03-12 09:33:05 +00:00
Roman Baron
de1385fe1b scheduler: Added ObserveFrameworkDurationAsync to metrics recorder 2026-03-12 10:31:38 +02:00
Roman Baron
7b00255135 scheduler: Removed plugin stats from pod signing process 2026-03-12 10:31:04 +02:00
Roman Baron
3a6c169034 scheduler: Reuse scheduling signature for opportunistic batching 2026-03-12 10:30:32 +02:00
Bartosz
f50ae7284a
Remove feature gate check for placement score plugin validation 2026-03-10 09:42:07 +00:00
Bartosz
335f043756
Add Min/MaxScore to replace Min/MaxNodeScore 2026-03-10 09:42:05 +00:00
Bartosz
db3c8f3a4b
Add PlacementScorePlugin interface and runner 2026-03-10 09:42:04 +00:00
Antoni Zawodny
3f094dc228
Create Workload API v1alpha2 (#136976)
* Drop WorkloadRef field and introduce SchedulingGroup field in Pod API

* Introduce v1alpha2 Workload and PodGroup APIs, drop v1alpha1 Workload API

Co-authored-by: yongruilin <yongrlin@outlook.com>

* Run hack/update-codegen.sh

* Adjust kube-scheduler code and integration tests to v1alpha2 API

* Drop v1alpha1 scheduling API group and run make update

---------

Co-authored-by: yongruilin <yongrlin@outlook.com>
2026-03-10 07:59:10 +05:30
Maciej Skoczeń
912bf9c4ef Use a single RunPermitPlugins function and call AddWaitingPod outside a framework 2026-03-02 15:14:12 +00:00
Kubernetes Prow Robot
8812ec563c
Merge pull request #134353 from skitt/drop-string-slice
Deprecate obsolete slice utility functions
2026-02-20 00:57:41 +05:30
Maciej Wyrzuc
4a326b0196 Preempt pods in prebind phase without delete calls.
This change allows preemption to preempt a pod that is not yet
bound (but is already in the prebind phase) without issuing a delete call to the
apiserver.

Pods are added to a special map of pods currently in the prebind phase, and
preemption can cancel the context that is used for a given pod's prebind phase,
allowing it to gracefully handle error in the same manner as errors
coming out from prebind plugins. This results in pods being pushed to
backoff queue, allowing them to be rescheduled in upcoming scheduling
cycles.
2026-02-18 09:00:23 +00:00
Kubernetes Prow Robot
f9c9f03b05
Merge pull request #136618 from macsko/workload_scheduling_cycle
KEP-4671: Introduce Workload Scheduling Cycle
2026-02-17 15:21:04 +05:30
Maciej Skoczeń
6233b25907 Introduce Workload Scheduling Cycle
Add integration tests for gang and basic policy workload scheduling

Add more tests for cluster snapshot

Proceed to binding cycle just after pod group cycle

Enforce one scheduler name per pod group, rename workload cycle to pod group cycle

Add unit tests for pod group scheduling cycle

Run ScheduleOne tests treating pod as part of a pod group

Rename NeedsPodGroupCycle to NeedsPodGroupScheduling

Observe correct per-pod and per-podgroup metrics during pod group cycle

Rename pod group algorithm status to waiting_on_preemption

Mention forgotAllAssumedPods is a safety check
2026-02-17 09:02:32 +00:00
Stephen Kitt
d42d1e3d1f
Deprecate obsolete slice utility functions
... and update users to use standard library functions.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2026-02-16 10:04:33 +01:00
Kubernetes Prow Robot
49fe2ecce1
Merge pull request #135719 from Argh4k/waiting-pod-integration-test
Put pods preempted in WaitOnPermit to backoff queue
2026-01-30 23:36:24 +05:30
Maciej Wyrzuc
f1f3a08ba7 Put pods preempted in WaitOnPermit in backoff queue 2026-01-30 09:38:17 +00:00
Antoni Zawodny
833b7205fc Run PreBind plugins in parallel if feasible 2026-01-11 14:19:18 +01:00
Antoni Zawodny
16b375e4ef Generalize ErrorChannel to other underlying types 2026-01-11 13:58:06 +01:00
Patrick Ohly
7a4d650125 DRA extended resources: fix flake in unit tests
The tests assumed that instantiating a DRAManager followed by
informerFactory.WaitForCacheSync would be enough to have the manager
up-to-date, but that's not correct: the test only waits for informer *caches*
to be synced, but syncing *event handlers* like the one in the manager may
still be going on. The flake rate is low, though:

    $ GOPATH/bin/stress -p 256 ./noderesources.test
    5s: 0 runs so far, 0 failures, 256 active
    10s: 256 runs so far, 0 failures, 256 active
    15s: 256 runs so far, 0 failures, 256 active
    20s: 512 runs so far, 0 failures, 256 active
    25s: 567 runs so far, 0 failures, 256 active
    30s: 771 runs so far, 0 failures, 256 active

    /tmp/go-stress-20251226T181044-974980161
    --- FAIL: TestCalculateResourceAllocatableRequest (0.81s)
        --- FAIL: TestCalculateResourceAllocatableRequest/DRA-backed-resource-with-shared-device-allocation (0.00s)
            extendedresourcecache.go:197: I1226 18:11:14.431337] Updated extended resource cache for explicit mapping extendedResource="extended.resource.dra.io/something" deviceClass="device-class-name"
            extendedresourcecache.go:204: I1226 18:11:14.431380] Updated extended resource cache for default mapping extendedResource="deviceclass.resource.kubernetes.io/device-class-name" deviceClass="device-class-name"
            extendedresourcecache.go:220: I1226 18:11:14.431394] Updated device class mapping deviceClass="device-class-name" extendedResource="extended.resource.dra.io/something"
            resource_allocation_test.go:595: Expected requested=2, but got requested=1
    FAIL

It becomes higher when changing WaitForCacheSync such that it doesn't poll and
therefore returns more promptly, which is where this flake was first observed.

The fix is to run the test in a synctest bubble where Wait can be used to wait
for all background activity, including event handling, to be finished before
proceeding with the test.

synctest is less forgiving about lingering goroutines. A synctest bubble must
wait for goroutines to stop, which in this case means that there has to be
a way to wait for the metric recorder shutdown. Event handlers have to be
removed.

This could be done with plain Go, but here test/utils/ktesting is used instead
because it offers some advantages:
- less boilerplate code
- automatic cancellation of the context (i.e. less manual context.WithCancel)
- tCtx.SyncTest is a direct substitute for t.Run, which avoids re-indenting
  sub-tests. synctest itself needs another anonymous function, which makes
  the line too long and forces re-indentation:
     t.Run(... func(...) {
         synctest.Test(... func() {
         })
     })

For the sake of consistency all tests get updated.

While at it, some code gets improved:

- t.Fatal(err) is not a good way to report an error because
  there is no additional markup in the test output that indicates
  that there was an unexpected error. It just logs err.Error(),
  which might not be very informative and/or obvious.
- newTestDRAManager aborts in case of a failure instead of
  returning an error.
2025-12-27 09:47:56 +01:00
bwsalmon
854e67bb51
KEP 5598: Opportunistic Batching (#135231)
* First version of batching w/out signatures.

* First version of pod signatures.

* Integrate batching with signatures.

* Fix merge conflicts.

* Fixes from self-review.

* Test fixes.

* Fix a bug that limited batches to size 2
Also add some new high-level logging and
simplify the pod affinity signature.

* Re-enable batching on perf tests for now.

* fwk.NewStatus(fwk.Success)

* Review feedback.

* Review feedback.

* Comment fix.

* Two plugin-specific unit tests.

* Add cycle state to the sign call, apply to topo spread.
Also add unit tests for several plugin signature
calls.

* Review feedback.

* Switch to distinct stats for hint and store calls.

* Switch signature from string to []byte

* Revert cyclestate in signs. Update node affinity.
Node affinity now sorts all of the various
nested arrays in the structure. CycleState no
longer in signature; revert to signing fewer
cases for pod spread.

* hack/update-vendor.sh

* Disable signatures when extenders are configured.

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update staging/src/k8s.io/kube-scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Disable node resource signatures when extended DRA enabled.

* Review feedback.

* Update pkg/scheduler/framework/plugins/imagelocality/image_locality.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/interface.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/plugins/nodedeclaredfeatures/nodedeclaredfeatures.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Update pkg/scheduler/framework/runtime/batch.go

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>

* Review feedback.

* Fixes for review suggestions.

* Add integration tests.

* Linter fixes, test fix.

* Whitespace fix.

* Remove broken test.

* Unschedulable test.

* Remove go.mod changes.

---------

Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>
2025-11-12 21:51:37 -08:00
Maciej Skoczeń
8d67173de0 Implement Gang scheduling in kube-scheduler 2025-11-06 10:47:29 +00:00
Hemant Kumar
002774c315 Address review comments 2025-11-04 11:16:43 -05:00
Hemant Kumar
fe3722dfa9 Address review comments
Change type name and stuff
2025-11-03 16:27:06 -05:00
Hemant Kumar
c71e45c735 Implement a csimanager for managing storage related assets 2025-10-31 11:06:58 -04:00
Hemant Kumar
7bbec73192 Add an interface for sharing CSINode objects between scheduler and CAS 2025-10-30 13:53:10 -04:00
Ania Borowiec
fadb40199f
Move interfaces: Handle and Plugin and related types from kubernetes/kubernetes to staging repo kube-scheduler 2025-09-02 09:42:53 +00:00
Kensei Nakada
ac9fad6030 feat: trigger PreFilterPreBind in the binding cycle 2025-07-29 19:01:02 +09:00
Maciej Skoczeń
17d733e243 KEP-5229: Send API calls through dispatcher and cache 2025-07-25 15:35:36 +00:00
Ania Borowiec
aecd37e6fb
Moving Scheduler interfaces to staging: Move PodInfo and NodeInfo interfaces (together with related types) to staging repo, leaving internal implementation in kubernetes/kubernetes/pkg/scheduler 2025-07-24 12:10:58 +00:00
Ania Borowiec
ee8c265d35
Move Code and Status from pkg/scheduler/framework to k8s.io/kube-scheduler/framework 2025-06-30 10:06:22 +00:00
Ania Borowiec
00d3750503
Move ClusterEvent type to staging repo, leaving some functions (that contain logic internal to scheduler) in kubernetes/kubernetes (#132190)
* Move ClusterEvent type to staging repo, leaving some functions (that contain logic internal to scheduler) in kubernetes/kubernetes

apply review comment and fix linter warning

* update-vendor.sh

* update doc comments

* run update-vendor.sh
2025-06-26 08:06:29 -07:00
Ania Borowiec
d75af825fb
Extract interface CycleState and move it to staging repo. CycleState implementation remains in k/k/pkg/scheduler/framework 2025-05-29 16:18:36 +00:00
Kubernetes Prow Robot
8a6b916765
Merge pull request #130720 from saintube/scheduler-expose-nodeinfo-in-prefilter
Expose NodeInfo to PreFilter plugins
2025-04-23 13:31:29 -07:00