vshkrabkov
b78cdbfdf4
Add test cases for multiple PreEnqueue plugins
2026-01-09 15:35:48 +00:00
vshkrabkov
779ff43005
Add unschedulable pods metric drop for pod deletion
2026-01-07 15:17:27 +00:00
Manthan Parmar
41cde37f00
Update pkg/scheduler/backend/queue/scheduling_queue.go
...
Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>
2025-12-30 15:05:51 +00:00
Manuel Grandeit
66d4bd3206
Fix data race in PriorityQueue.UnschedulablePods()
...
The UnschedulablePods() function iterates over the unschedulablePods.podInfoMap
without holding any lock, while other goroutines may concurrently modify the map
via addOrUpdate(), delete(), or clear().
Other functions like PendingPods() and GetPod() correctly acquire p.lock.RLock()
before accessing unschedulablePods.podInfoMap, but UnschedulablePods() was
missing this.
Fix by adding p.lock.RLock()/RUnlock() to UnschedulablePods(), matching the
pattern used by PendingPods().
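A minimal sketch of the locking pattern described above, not the actual scheduler code. The type podQueue, its remove method, and the string payload are hypothetical stand-ins; only the idea of taking a read lock before iterating podInfoMap comes from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// podQueue is a hypothetical, simplified stand-in for the scheduler's
// PriorityQueue. Only the locking pattern is meant to be representative.
type podQueue struct {
	lock       sync.RWMutex
	podInfoMap map[string]string // pod key -> pod name (illustrative payload)
}

func newPodQueue() *podQueue {
	return &podQueue{podInfoMap: make(map[string]string)}
}

// addOrUpdate mutates the map, so it takes the write lock.
func (q *podQueue) addOrUpdate(key, name string) {
	q.lock.Lock()
	defer q.lock.Unlock()
	q.podInfoMap[key] = name
}

// remove also mutates the map under the write lock.
func (q *podQueue) remove(key string) {
	q.lock.Lock()
	defer q.lock.Unlock()
	delete(q.podInfoMap, key)
}

// UnschedulablePods iterates the map while holding the read lock, so
// concurrent writers cannot modify it mid-iteration. Without the
// RLock/RUnlock pair, this loop would race with addOrUpdate and remove.
func (q *podQueue) UnschedulablePods() []string {
	q.lock.RLock()
	defer q.lock.RUnlock()
	out := make([]string, 0, len(q.podInfoMap))
	for _, name := range q.podInfoMap {
		out = append(out, name)
	}
	return out
}

func main() {
	q := newPodQueue()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			q.addOrUpdate(fmt.Sprintf("ns/pod-%d", i), fmt.Sprintf("pod-%d", i))
			_ = q.UnschedulablePods() // concurrent reader
		}(i)
	}
	wg.Wait()
	fmt.Println(len(q.UnschedulablePods()))
}
```

Running a sketch like this with the Go race detector (go run -race) is one way to confirm that the read lock removes the reported race; dropping the RLock/RUnlock pair should make the detector complain.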
2025-12-20 13:46:58 +01:00
Kubernetes Prow Robot
1757c6358b
Merge pull request #135368 from vshkrabkov/fix/scheduler-queue-metric-sync
...
Scheduler: Fix GatedPods metric desync in unschedulable queue
2025-12-17 21:42:00 -08:00
Vlad Shkrabkov
5be527b78e
Scheduler: Fix GatedPods metric desync in unschedulable queue
...
Previously, when a Pod residing in the 'unschedulablePods' queue was updated and subsequently rejected by PreEnqueue plugins (returning 'Wait'), the logic in 'moveToActiveQ' would return early because the Pod was already present in the queue.
This caused the 'scheduler_gated_pods_total' metric to fail to increment, leading to metric inconsistencies (and potentially negative values upon Pod deletion).
This change adds a check to detect the transition from Ungated to Gated. If detected, the Pod is removed and re-added to the queue to ensure metrics are correctly swapped (Unschedulable-- and Gated++).
Added regression test 'TestSchedulingQueueMetrics_UngatedToGated' to verify the fix.
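The remove-and-re-add step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scheduler's implementation: the types queuedPod and unschedulableQueue, the plain integer counters standing in for the metrics, and the update signature are all hypothetical.

```go
package main

import "fmt"

// queuedPod is a hypothetical, simplified pod record.
type queuedPod struct {
	name  string
	gated bool
}

// unschedulableQueue is a hypothetical stand-in for the unschedulable
// pods structure. The two counters play the role of the Unschedulable
// and Gated metrics.
type unschedulableQueue struct {
	pods          map[string]*queuedPod
	unschedulable int
	gatedCount    int
}

func newUnschedulableQueue() *unschedulableQueue {
	return &unschedulableQueue{pods: make(map[string]*queuedPod)}
}

// add stores the pod and increments the counter matching its state.
func (q *unschedulableQueue) add(p *queuedPod) {
	q.pods[p.name] = p
	if p.gated {
		q.gatedCount++
	} else {
		q.unschedulable++
	}
}

// remove deletes the pod and decrements the counter matching the state
// it was stored with.
func (q *unschedulableQueue) remove(name string) {
	p, ok := q.pods[name]
	if !ok {
		return
	}
	delete(q.pods, name)
	if p.gated {
		q.gatedCount--
	} else {
		q.unschedulable--
	}
}

// update applies a fresh gating decision to a pod already in the queue.
// On an ungated -> gated transition the pod is removed and re-added so
// the counters swap (Unschedulable-- and Gated++) instead of returning
// early because the pod is already present.
func (q *unschedulableQueue) update(name string, nowGated bool) {
	p, ok := q.pods[name]
	if !ok {
		return
	}
	if !p.gated && nowGated {
		q.remove(name)
		p.gated = true
		q.add(p)
	}
}

func main() {
	q := newUnschedulableQueue()
	q.add(&queuedPod{name: "pod-a"})
	q.update("pod-a", true)
	fmt.Println(q.unschedulable, q.gatedCount)
	q.remove("pod-a")
	fmt.Println(q.unschedulable, q.gatedCount)
}
```

In this sketch, deleting the pod after the transition leaves both counters at zero, which is the property the commit message says was previously violated (a gated counter that could go negative).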
Signed-off-by: Vlad Shkrabkov <vshkrabkov@google.com>
2025-12-15 11:47:22 +00:00
Mohammad Varmazyar
4c2fff1934
Address comments, log level, test assertion consistency and remove unnecessary locks in TestFlushUnschedulablePodsLeftoverSetsFlag
2025-11-26 14:08:05 +01:00
Mohammad Varmazyar
4f455c9c0d
Refactor plugin clearing to use ClearRejectorPlugins method
2025-11-26 09:54:32 +01:00
Mohammad Varmazyar
d64e09c697
Clear plugins at handleSchedulingFailure and preserve both at Pop
2025-11-24 20:32:41 +01:00
Mohammad Varmazyar
ec05bcf186
test: simplify TestFlushUnschedulablePodsLeftoverSetsFlag
...
scheduler: add logging for pods scheduled after flush and preserve UnschedulablePlugins
2025-11-24 09:55:52 +01:00
Mohammad Varmazyar
e5e8ef993c
Add unit test for WasFlushedFromUnschedulable flag
2025-11-24 09:38:41 +01:00
Mohammad Varmazyar
6a1a71ddc5
Removing the redundant WasFlushedFromUnschedulable
2025-11-24 09:38:41 +01:00
Mohammad Varmazyar
bc632c72d0
scheduler: add metric for pods scheduled after flush
...
Add counter metric to track pods that schedule immediately after
being flushed from unschedulablePods due to timeout. Uses a boolean
flag that is cleared when pods return to queue or move via events.
2025-11-24 09:38:41 +01:00
Mohammad Varmazyar
b2a399cf30
scheduler: add metric for pods scheduled after flush
...
This metric tracks pods that successfully schedule after being
flushed from unschedulablePods due to timeout. High values may
indicate missing queue hint optimizations or event handling issues.
2025-11-24 09:38:40 +01:00
Kubernetes Prow Robot
597a684bb0
Merge pull request #133172 from ania-borowiec/move_handle_and_plugin
...
Move interfaces: Handle and Plugin and related types from kubernetes/kubernetes to staging repo kube-scheduler
2025-09-08 06:05:31 -07:00
Maciej Skoczeń
4babdf8026
Fix race in movePodsToActiveOrBackoffQueue
2025-09-02 11:57:18 +00:00
Ania Borowiec
fadb40199f
Move interfaces: Handle and Plugin and related types from kubernetes/kubernetes to staging repo kube-scheduler
2025-09-02 09:42:53 +00:00
Kubernetes Prow Robot
5fb3296920
Merge pull request #132451 from macsko/fix_race_in_scheduler_integration_tests
...
Fix race in scheduler integration tests
2025-08-31 05:03:09 -07:00
Maciej Skoczeń
46e10103ff
Take activeQ lock for part of the Update method
2025-08-25 12:30:43 +00:00
Maciej Skoczeń
8b0b0df431
Don't run PreEnqueue when pod is activated from backoffQ
2025-08-22 12:40:41 +00:00
Maciej Skoczeń
aa59f930b3
Add lock to TestAsyncPreemption to prevent races
2025-08-05 09:43:12 +00:00
Maciej Skoczeń
c5ef720837
Fix race in scheduler integration tests
2025-08-05 09:42:52 +00:00
yliao
34a64db2c7
extended resource backed by DRA: implementation
2025-07-29 18:55:21 +00:00
Maciej Skoczeń
17d733e243
KEP-5229: Send API calls through dispatcher and cache
2025-07-25 15:35:36 +00:00
Ania Borowiec
aecd37e6fb
Moving Scheduler interfaces to staging: Move PodInfo and NodeInfo interfaces (together with related types) to staging repo, leaving internal implementation in kubernetes/kubernetes/pkg/scheduler
2025-07-24 12:10:58 +00:00
Omar Nasser
45c355ca58
Move unschedulablePods struct to a separate file
2025-07-11 19:48:11 +03:00
Junhao Zou
1b730abf8d
cleanup: use HandleErrorWithXXX instead of logger.Error where errors are intentionally ignored
2025-07-08 09:34:49 +08:00
Ania Borowiec
ee8c265d35
Move Code and Status from pkg/scheduler/framework to k8s.io/kube-scheduler/framework
2025-06-30 10:06:22 +00:00
Ania Borowiec
00d3750503
Move ClusterEvent type to staging repo, leaving some functions (that contain logic internal to scheduler) in kubernetes/kubernetes (#132190)
...
* Move ClusterEvent type to staging repo, leaving some functions (that contain logic internal to scheduler) in kubernetes/kubernetes
apply review comment and fix linter warning
* update-vendor.sh
* update doc comments
* run update-vendor.sh
2025-06-26 08:06:29 -07:00
Kensei Nakada
f694c58c6c
feat: graduate QueueingHint to GA
2025-05-26 21:23:46 +02:00
Maciej Skoczeń
157903b09b
Skip backoff when PodMaxBackoffDuration is set to zero
2025-05-26 09:35:53 +00:00
Kensei Nakada
adc4916dfe
feat: introduce pInfo.UnschedulableCount to make the backoff calculation more appropriate
2025-05-17 12:39:58 +02:00
Kubernetes Prow Robot
0113538e59
Merge pull request #127180 from sanposhiho/general-gate
...
feat: introduce pInfo.GatingPlugin to filter out events more generally
2025-05-14 05:13:18 -07:00
Kensei Nakada
5140786829
feat: improve the backoff calculation to O(1)
2025-05-12 01:26:47 +02:00
Kensei Nakada
d28c8cd488
fix: not removing the plugin from the unsched plugins after PreEnqueue
2025-05-07 14:12:23 +02:00
Kensei Nakada
47d296d62d
feat: introduce pInfo.GatingPlugin to filter out events more generally
2025-05-07 13:54:47 +02:00
Ania Borowiec
17acc4a5ee
Move queue.Done() before Prebind, add tests
2025-03-20 22:14:36 +00:00
Maciej Skoczeń
c7919f5e22
Pop from the backoffQ when the activeQ is empty
2025-03-20 16:07:13 +00:00
Kubernetes Prow Robot
65d9066665
Merge pull request #130680 from macsko/update_backoffq_less_function_to_order_by_priority_in_windows
...
Update backoffQ's less function to order pods by priority in windows
2025-03-20 01:36:31 -07:00
Maciej Skoczeń
e367dca6c5
Change backoffQ less function to order pods by priority in windows
2025-03-19 13:04:15 +00:00
Maciej Skoczeń
1be3f8961b
Fix a race when closing activeQ
2025-03-18 10:25:56 +00:00
Maciej Skoczeń
9df0f6b604
Call PreEnqueue plugins before adding pod to backoffQ
2025-03-14 08:47:40 +00:00
carlory
aab7a079fa
make each scheduler test independent
...
Signed-off-by: carlory <baofa.fan@daocloud.io>
2025-03-13 14:39:50 +08:00
Maciej Skoczeń
2fc3cd90b1
Store pod backoff expiration time in QueuedPodInfo
2025-03-06 10:45:38 +00:00
Maciej Skoczeń
6975572a80
Add missing increments of queue_incoming_pods_total metric in scheduling queue
2025-03-04 12:37:22 +00:00
Maciej Skoczeń
0f24b9ff45
Split backoffQ into backoffQ and errorBackoffQ in scheduler
2025-02-24 14:11:26 +00:00
Maciej Skoczeń
0452ae402a
Use cached calculateResource result when removing pod from NodeInfo in preemption
2025-01-21 10:02:57 +00:00
Kubernetes Prow Robot
fb033826a8
Merge pull request #128170 from sanposhiho/async-preemption
...
feature(KEP-4832): asynchronous preemption
2024-11-07 19:44:54 +00:00
Kensei Nakada
b96eee847e
feat: graduate SchedulerQueueingHints to beta
2024-11-07 21:45:18 +09:00
Kensei Nakada
105d489aa4
chore: wording
2024-11-07 14:09:35 +09:00