Commit graph

6958 commits

Author SHA1 Message Date
Kubernetes Prow Robot
db63a581ca
Merge pull request #134366 from tallclair/feature-gates-test
Set multiple feature gates simultaneously in test
2025-10-13 13:11:33 -07:00
PersistentJZH
b738e8c3ca fix panic in cron.ParseStandard
Signed-off-by: PersistentJZH <zhihao.kan17@gmail.com>

fix

optimize logic

fix unit test
2025-10-10 23:51:05 +08:00
Kubernetes Prow Robot
ee1ff4866e
Merge pull request #134479 from pohly/dra-device-taint-no-execute-toleration-fix
DRA device taints: fix toleration of NoExecute
2025-10-10 00:47:00 -07:00
Patrick Ohly
6f51446802 DRA device taints: fix toleration of NoExecute
As usual, consumers of an allocated claim react to the information stored in
the status. In this case, the scheduler did not copy the tolerations into the
status and as a result a pod with a toleration for NoExecute got scheduled and
then immediately evicted.

Some additional logging gets added to make the handling easier to track in the
eviction controller. Example YAMLs allow reproducing the use case manually.
2025-10-08 13:13:47 +02:00
Adrian Moisey
ae25979790
Use a dedicated delete channel in HPA test
This is an attempt to fix a flake
2025-10-02 21:22:10 +02:00
Kubernetes Prow Robot
16eebeb5ee
Merge pull request #134379 from liggitt/gc-race
Lock all mutable fields when printing gc node
2025-10-02 11:40:56 -07:00
Jordan Liggitt
6d3d7553fb
Lock all mutable fields when printing gc node
2025-10-02 08:50:17 -04:00
Tim Allclair
4986abe0b8 Automated refactoring to use SetFeatureGatesDuringTest
2025-10-01 21:10:53 -07:00
Kubernetes Prow Robot
ef95e1fd7e
Merge pull request #134318 from xigang/disruption
disruption: remove unused pdb parameter from getExpectedScale method
2025-10-01 07:38:27 -07:00
Kubernetes Prow Robot
6bb0bd55a3
Merge pull request #134295 from omerap12/hpa-desiredReplicasCount-metric
Add desired_replicas gauge metric to HPA controller
2025-10-01 05:16:17 -07:00
Kubernetes Prow Robot
7353b6a93d
Merge pull request #134312 from alaypatel07/gc-resourceclaim-extendedresource
fix resource claims deallocation for extended resource when pod is completed
2025-09-30 01:38:20 -07:00
Alay Patel
8a03067211 fix resource claims deallocation for extended resource when pod is completed
Signed-off-by: Alay Patel <alayp@nvidia.com>
2025-09-29 15:15:40 -04:00
xigang
574ac5b497 disruption: remove unused pdb parameter from getExpectedScale method
Signed-off-by: xigang <wangxigang2014@gmail.com>
2025-09-28 17:41:36 +08:00
xigang
574b09b7de nodelifecycle: fix ComputeZoneState method comment
Signed-off-by: xigang <wangxigang2014@gmail.com>
2025-09-28 10:56:56 +08:00
Omer Aplatony
7af3377900 Add desired_replicas histogram metric to HPA controller
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-09-27 09:30:03 +00:00
Kubernetes Prow Robot
1eb2c4182d
Merge pull request #134102 from mayank-agrwl/namespace-nodelifecycle-contextual
Replace HandleError with HandleErrorWithContext
2025-09-23 18:50:17 -07:00
Omer Aplatony
a8a21aaf85
Add hpa object count metric (#134140)
* Add hpa object count metrics

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

* change name to num_horizontal_pod_autoscalers

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

* remove log line

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

---------

Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-09-22 06:10:19 -07:00
Kubernetes Prow Robot
b4943a9dc9
Merge pull request #134104 from aditigupta96/refactor-cloud-node-controllers
refactor(controller): Use WithContext variants in cloud node controllers
2025-09-18 10:32:22 -07:00
Kubernetes Prow Robot
d7bd2b0343
Merge pull request #134030 from richabanker/update-metrics-docs
Update metrics docs list for v1.34
2025-09-18 08:04:15 -07:00
Edwin Hernandez
fa9071302f
Adding metrics for Maxunavailable feature in StatefulSet (#130951)
* adding maxunavailable_violation metric

added metric to list of stable metrics

changed when metric gets incremented

addressed comments

fixed stable metrics list

* Update pkg/controller/statefulset/metrics/metrics.go

Co-authored-by: Filip Křepinský <fkrepins@redhat.com>

* Update the metric and log verbosity level

* Address false positives metric count

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Implement maxUnavailable and UnavailableReplicas metrics

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* fix lint fmt

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* update tests

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* set metrics to 1 as a default

* log for true validation only and update func sig.

* Move maxUnavailable metric to the updateStatefulSetStatus

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* change metrics stability level to Alpha

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* fix unit test

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* fix linting issue

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Address code review feedback

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

---------

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Co-authored-by: Filip Křepinský <fkrepins@redhat.com>
Co-authored-by: Heba Elayoty <heelayot@microsoft.com>
2025-09-17 05:34:14 -07:00
Aditi Gupta
f58d1e101f refactor(controller): Use WithContext variants in cloud node controllers
This change refactors the cloud-specific versions of the node lifecycle
and node IPAM controllers to use a context.Context for cancellation and
contextual logging, replacing the legacy stopCh pattern.

This is a follow-up to PR #133985, where these controllers were
separated out due to their use in the legacy Cloud Controller Manager
(CCM).

It is a known issue that the CCM's startup logic does not pass the
controller name via the context. This change proceeds with the
refactoring to unify the cancellation logic across controllers, while
acknowledging that contextual logs will be less detailed when these
controllers are run in the CCM.

Signed-off-by: Aditi Gupta <aditigpta@google.com>
2025-09-17 00:17:38 -07:00
Mayank Agrawal
d12eeb98d0 Replace HandleError with HandleErrorWithContext
2025-09-16 23:47:23 -07:00
Kubernetes Prow Robot
69e92c6827
Merge pull request #134022 from aditigupta96/cleanup-waitfornamedcachesync
refactor(controller): Use context-aware WaitForNamedCacheSync in resourcequota and HPA tests
2025-09-16 17:18:16 -07:00
Kubernetes Prow Robot
d03d25f47c
Merge pull request #133985 from aditigupta96/api-waitfornamedcachesync-with-context
Replace WaitForNamedCacheSync with WaitForNamedCacheSyncWithContext in pkg/controller/
2025-09-16 17:18:09 -07:00
Aditi Gupta
af231d2153 Replace WaitForNamedCacheSync with WaitForNamedCacheSyncWithContext in pkg/controller/
2025-09-16 14:51:34 -07:00
Kubernetes Prow Robot
12ddfaa5c7
Merge pull request #133984 from aditigupta96/add-context-to-waitfornamedcachesync
Replace WaitForNamedCacheSync with WaitForNamedCacheSyncWithContext in pkg/controller/garbagecollector
2025-09-16 13:48:10 -07:00
Richa Banker
c51a8734b1 Update documented metrics list
2025-09-16 11:52:14 -07:00
Aditi Gupta
1ce12710ec refactor(controller): Use context-aware WaitForNamedCacheSync in resourcequota and HPA tests
2025-09-12 12:37:54 -07:00
Kubernetes Prow Robot
118e833a0d
Merge pull request #133687 from soltysh/drop_PodIndexLabel
Drop PodIndexLabel after the feature GA-ed in 1.32
2025-09-12 07:30:11 -07:00
Kubernetes Prow Robot
44544abdc7
Merge pull request #133612 from michaelasp/discoveryCheck
feat: Add discovery check to SVM to ensure migration doesn't get stuck
2025-09-11 18:32:07 -07:00
Maciej Szulik
46cc610e6f
Drop PodIndexLabel after the feature GA-ed in 1.32
Signed-off-by: Maciej Szulik <soltysh@gmail.com>
2025-09-11 19:32:48 +02:00
Kubernetes Prow Robot
26b246ae66
Merge pull request #133191 from Jefftree/rev
Add jefftree to OWNERS
2025-09-11 07:06:11 -07:00
Kubernetes Prow Robot
bb12fee4c1
Merge pull request #133904 from aditigupta96/feat-auth-trust-wait-context
Change WaitForNamedCacheSync to WaitForNamedCacheSyncWithContext
2025-09-10 18:48:02 -07:00
Aditi Gupta
dfcadb4f89 Replace WaitForNamedCacheSync with WaitForNamedCacheSyncWithContext in pkg/controller/garbagecollector
2025-09-10 13:08:27 -07:00
Kubernetes Prow Robot
a8905a154b
Merge pull request #133179 from nmn3m/fix-strings-title
Replace deprecated strings.Title with cases.Title
2025-09-09 05:53:30 -07:00
Huan Yan
7aa6cabd63 fix typo for forceDetachTimeoutExpired
2025-09-07 16:37:34 +08:00
Michael Aspinwall
1a0813598b Update SVM Discovery checks in response to jpbetz and stlaz
2025-09-05 20:33:05 +00:00
Aditi Gupta
7d14367f57 Change WaitForNamedCacheSync to WaitForNamedCacheSyncWithContext.
This is part of the ongoing effort to adopt contextual logging
and utilities throughout the codebase.

Contributes to #126379

Signed-off-by: Aditi Gupta <aditigpta@google.com>
2025-09-05 18:49:31 +00:00
Michael Aspinwall
21359d7b1f Switch to resourceVersion controller
2025-09-04 18:17:00 +00:00
Omer Aplatony
fbd33bd6b3 hpa: prevent integer overflow in external metrics sum
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-09-04 08:36:53 +00:00
Kubernetes Prow Robot
1bcfd5cee7
Merge pull request #133741 from kincoy/hpa-cleanup-redundant-casts
cleanup: remove redundant type conversions in podautoscaler
2025-09-01 04:35:20 -07:00
Kubernetes Prow Robot
5c107f08e9
Merge pull request #133708 from ingvagabund/podautoscaler-dont-print-panic
fix(controller/podautoscaler): do not print panic when .status.lastScaleTime is not set
2025-09-01 04:35:13 -07:00
Nour
72847ee1f7
Replace deprecated strings.Title with cases.Title
2025-08-30 18:16:59 +03:00
Michael Aspinwall
e1218922db Add unit tests to isResourceUpdatable
2025-08-28 21:04:59 +00:00
Kubernetes Prow Robot
6b33567f9b
Merge pull request #133684 from soltysh/drop_StatefulSetAutoDeletePVC
Drop StatefulSetAutoDeletePVC after the feature GA-ed in 1.32
2025-08-28 10:49:15 -07:00
Maciej Szulik
09e357d31f
Drop StatefulSetAutoDeletePVC after the feature GA-ed in 1.32
Signed-off-by: Maciej Szulik <soltysh@gmail.com>
2025-08-28 13:35:16 +02:00
kincoy
12a784b46b cleanup: remove redundant type conversions in podautoscaler
Signed-off-by: kincoy <kincoyao@gmail.com>
2025-08-28 14:28:57 +08:00
Kubernetes Prow Robot
e8fb05e8a0
Merge pull request #133686 from soltysh/drop_CronJobsScheduledAnnotation
Drop CronJobsScheduledAnnotation after the feature GA-ed in 1.32
2025-08-27 20:24:28 -07:00
Kubernetes Prow Robot
b8f5561ab7
Merge pull request #133425 from jsafrane/selinux-e2e-driver
Fix SELinux label comparison
2025-08-27 17:18:56 -07:00
Kubernetes Prow Robot
5742171781
Merge pull request #133415 from AadiDev005/optimize-calculate-pod-requests
HPA: optimize calculatePodRequests for specific container lookups
2025-08-27 17:18:34 -07:00