6107 commits

Author SHA1 Message Date
Miciah Masters
fc18ffe58d TopologyAwareHints: Take lock in HasPopulatedHints
Prevent potential concurrent map access by taking a lock before reading the
topology cache's hintsPopulatedByService map.

* staging/src/k8s.io/endpointslice/topologycache/topologycache.go
(setHintsLocked, hasPopulatedHintsLocked): New helper functions.  These are
the same as the existing SetHints and HasPopulatedHints methods except that
these helpers assume that a lock is already held.
(SetHints): Use setHintsLocked.
(HasPopulatedHints): Take a lock and use hasPopulatedHintsLocked.
(AddHints): Take a lock and use setHintsLocked and hasPopulatedHintsLocked.
* staging/src/k8s.io/endpointslice/topologycache/topologycache_test.go
(TestTopologyCacheRace): Add a goroutine that calls HasPopulatedHints.
2023-08-31 16:38:51 -04:00
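
The locking convention described in fc18ffe58d can be sketched as follows. This is a minimal illustration and not the code from topologycache.go: the `hintCache` type, its fields, and the `bool` payload are hypothetical stand-ins. Only the method names (`SetHints`, `HasPopulatedHints`, `setHintsLocked`, `hasPopulatedHintsLocked`) come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// hintCache is a hypothetical stand-in for the topology cache.
type hintCache struct {
	lock      sync.Mutex
	populated map[string]bool // keyed by service; guarded by lock
}

func newHintCache() *hintCache {
	return &hintCache{populated: map[string]bool{}}
}

// setHintsLocked assumes the caller already holds c.lock.
func (c *hintCache) setHintsLocked(service string, populated bool) {
	c.populated[service] = populated
}

// hasPopulatedHintsLocked assumes the caller already holds c.lock.
func (c *hintCache) hasPopulatedHintsLocked(service string) bool {
	return c.populated[service]
}

// SetHints takes the lock, then delegates to the *Locked helper.
func (c *hintCache) SetHints(service string, populated bool) {
	c.lock.Lock()
	defer c.lock.Unlock()
	c.setHintsLocked(service, populated)
}

// HasPopulatedHints takes the lock before reading the map.
func (c *hintCache) HasPopulatedHints(service string) bool {
	c.lock.Lock()
	defer c.lock.Unlock()
	return c.hasPopulatedHintsLocked(service)
}

func main() {
	c := newHintCache()
	var wg sync.WaitGroup
	// Concurrent writers and readers, similar in spirit to a race test.
	for i := 0; i < 4; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); c.SetHints("default/svc", true) }()
		go func() { defer wg.Done(); _ = c.HasPopulatedHints("default/svc") }()
	}
	wg.Wait()
	fmt.Println(c.HasPopulatedHints("default/svc"))
}
```

Splitting each operation into an exported method that locks and a `...Locked` helper that does not lets a compound method such as `AddHints` take the lock once and call both helpers without re-acquiring a non-reentrant mutex.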
Kubernetes Prow Robot
de56018f04
Merge pull request #117269 from tnqn/automated-cherry-pick-of-#117245-#117249-upstream-release-1.27
Automated cherry pick of #117245: Fix TopologyAwareHint not working when zone label is added
#117249: Fix a data race in TopologyCache
2023-08-04 13:26:31 -07:00
Michal Wozniak
ed0cdc9e0b Include ignored pods when computing backoff delay for Job pod failures
# Conflicts:
#	pkg/controller/job/job_controller.go
2023-07-21 09:31:49 +02:00
Michal Wozniak
ae24a5cf74 Remarks 2023-07-21 09:29:47 +02:00
Michal Wozniak
9e1050b4d9 Adjust the algorithm for computing the pod finish time
Change-Id: Ic282a57169cab8dc498574f08b081914218a1039
2023-07-20 16:29:26 +02:00
Kubernetes Prow Robot
5ee5d7346e
Merge pull request #119096 from aleksandra-malinowska/automated-cherry-pick-of-#117865-upstream-release-1.27
Automated cherry pick of #117865: Parallel StatefulSet pod create & delete
2023-07-12 16:31:33 -07:00
Aleksandra Malinowska
28c79be674 Add unit tests for parallel StatefulSet create & delete 2023-07-10 12:31:07 +02:00
Aleksandra Malinowska
66f980be12 Parallel StatefulSet pod create & delete 2023-07-10 12:31:07 +02:00
Aleksandra Malinowska
288504fbf8 Refactor StatefulSet controller update logic 2023-07-10 12:31:07 +02:00
Aldo Culquicondor
92a0f58e2b
Only declare job as finished after removing all finalizers
Change-Id: Id4b01b0e6fabe24134e57e687356e0fc613cead4
2023-07-07 14:31:02 -04:00
Aldo Culquicondor
c655001fa4
Automated cherry pick of #118716 upstream release 1.27 (#118911)
* Skip terminal Pods with a deletion timestamp from the Daemonset sync

Change-Id: I64a347a87c02ee2bd48be10e6fff380c8c81f742

* Review comments and fix integration test

Change-Id: I3eb5ec62bce8b4b150726a1e9b2b517c4e993713

* Include deleted terminal pods in history

Change-Id: I8b921157e6be1c809dd59f8035ec259ea4d96301

* Exclude terminal pods from Daemonset e2e tests

Change-Id: Ic29ca1739ebdc54822d1751fcd56a99c628021c4
2023-07-06 18:57:02 -07:00
Maciej Szulik
b383755e46 Hide numberOfMissedSchedules as an algorithm internal number 2023-07-06 10:21:55 -07:00
Maciej Szulik
26db84e04c Update schedule logic to properly calculate missed schedules
Before this change we've assumed a constant time between schedule runs,
which is not true for cases like "30 6-16/4 * * 1-5".
The fix is to calculate the potential next run using the fixed schedule
as the baseline, and then go back one schedule and allow the cron
library to calculate the correct time.

This approach saves us from iterating multiple times between last
schedule time and now, if the cronjob for any reason wasn't running for
a significant amount of time.
2023-07-06 10:21:43 -07:00
Paco Xu
9ef90afb4f verifyVolumeNoStatusUpdateNeeded may cause flake and so only keep the last ones 2023-04-18 11:30:37 +02:00
Paco Xu
b598ea5c39 deflake: Add retry with timeout to wait for final conditions 2023-04-18 11:30:37 +02:00
Quan Tian
6f8ce72c0c Fix a data race in TopologyCache
The member variable `cpuRatiosByZone` should be accessed with the lock
acquired as it could be updated by `SetNodes` concurrently.

Signed-off-by: Quan Tian <qtian@vmware.com>
Co-authored-by: Antonio Ojea <aojea@google.com>
2023-04-13 11:13:02 +08:00
Quan Tian
668778d1bd Fix TopologyAwareHint not working when zone label is added after Node creation
The topology.kubernetes.io/zone label may be added by cloud provider
asynchronously after the Node is created. The previous code didn't
update the topology cache after receiving the Node update event, causing
TopologyAwareHint to not work until kube-controller-manager restarts or
other Node events trigger the update.

Signed-off-by: Quan Tian <qtian@vmware.com>
2023-04-13 11:13:00 +08:00
Harshal Patil
1972dd1005 Do not log entire pod struct while attaching the volume
Signed-off-by: Harshal Patil <harpatil@redhat.com>
2023-04-05 20:24:12 -04:00
Michal Wozniak
b5dd5f1f3a Investigate and fix the handling of Succeeded pods in DaemonSet 2023-04-04 19:21:15 +02:00
mantuliu
0567c93b2a Improve the performance of map usage
Signed-off-by: mantuliu <240951888@qq.com>
2023-03-21 20:37:53 +08:00
Sathyanarayanan Saravanamuthu
c84c8add70
Decouple batch/job back-off logic from workqueues (#114768)
* batch/job: decouple backoff from workqueue

Signed-off-by: Sathyanarayanan Saravanamuthu <sathyanarays@vmware.com>

* Resolving review comments

* Resolving more review comments

* Resolving review comments

Signed-off-by: Sathyanarayanan Saravanamuthu <sathyanarays@vmware.com>

* Computing finish time to now when FinishedAt is unix epoch

* Addressing review comments

Signed-off-by: Sathyanarayanan Saravanamuthu <sathyanarays@vmware.com>

---------

Signed-off-by: Sathyanarayanan Saravanamuthu <sathyanarays@vmware.com>
2023-03-16 10:15:21 -07:00
Kensei Nakada
543f15d10c HPA: expose the metrics "metric_computation_duration_seconds" and "metric_computation_total" from HPA controller 2023-03-14 22:47:24 +00:00
Kubernetes Prow Robot
27e23bad7d
Merge pull request #116529 from pohly/controllers-with-name
kube-controller-manager: convert to structured logging
2023-03-14 14:12:55 -07:00
Kubernetes Prow Robot
c0ef73222f
Merge pull request #116522 from robscott/topology-1-27-updates
Introducing Topology Mode Annotation, Deprecating Topology Hints Annotation
2023-03-14 14:12:48 -07:00
Ziqi Zhao
d1aa73312c
pkg/controller/util support contextual logging (#115049)
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2023-03-14 12:38:14 -07:00
Patrick Ohly
99151c39b7 kube-controller-manager: convert to structured logging
Most of the individual controllers were already converted earlier. Some log
calls were missed or added and then not updated during a rebase. Some of those
get updated here to fill those gaps.

Adding the name to the logger used by each controller gets
consolidated in this commit. By using the name under which the
controller is registered we ensure that the names in the log
are consistent.
2023-03-14 19:16:32 +01:00
Kubernetes Prow Robot
6a111bebe2
Merge pull request #116377 from kinvolk/rata/userns
KEP-127: user namespace support for stateless pods
2023-03-14 10:40:43 -07:00
Kubernetes Prow Robot
49649c89ea
Merge pull request #113584 from yangjunmyfm192085/volume-contextual-logging
volume: use contextual logging
2023-03-14 10:40:16 -07:00
Kensei Nakada
b49b34c03a
HPA: expose the metrics "reconciliations_total" and "reconciliation_duration_seconds" from HPA controller (#116010) 2023-03-14 09:39:42 -07:00
Kubernetes Prow Robot
f769c66aa8
Merge pull request #113622 from 249043822/br-context-logging-daemon
daemonset: use contextual logging
2023-03-14 09:38:28 -07:00
Patrick Ohly
29941b8d3e api: resource.k8s.io v1alpha1 -> v1alpha2
For Kubernetes 1.27, we intend to make some breaking API changes:
- rename PodScheduling -> PodSchedulingHints (https://github.com/kubernetes/kubernetes/issues/114283)
- extend ResourceClaimStatus (https://github.com/kubernetes/enhancements/pull/3802)

We need to switch from v1alpha1 to v1alpha2 for that.
2023-03-14 07:52:03 +01:00
Rob Scott
e23af041f5
Introducing Topology Mode Annotation, Deprecating Topology Hints
Annotation

As part of this change, kube-proxy accepts any value for either
annotation that is not "disabled".

Change-Id: Idfc26eb4cc97ff062649dc52ed29823a64fc59a4
2023-03-14 02:23:11 +00:00
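
The rule stated in e23af041f5, that any value other than "disabled" is accepted for either annotation, might look roughly like the sketch below. The function `topologyEnabled` is hypothetical and is not kube-proxy code. Two details are assumptions of this sketch rather than statements from the commit: that the deprecated annotation is consulted first when both are present, and that a missing or empty value counts as not enabled.

```go
package main

import "fmt"

const (
	// Annotation keys for the new topology mode annotation and the
	// deprecated topology hints annotation.
	topologyModeAnnotation    = "service.kubernetes.io/topology-mode"
	deprecatedHintsAnnotation = "service.kubernetes.io/topology-aware-hints"
)

// topologyEnabled reports whether a Service's annotations opt in to
// topology-aware routing. Precedence and empty-value handling are
// assumptions of this sketch.
func topologyEnabled(annotations map[string]string) bool {
	value, ok := annotations[deprecatedHintsAnnotation]
	if !ok {
		value = annotations[topologyModeAnnotation]
	}
	return value != "" && value != "disabled"
}

func main() {
	fmt.Println(topologyEnabled(map[string]string{topologyModeAnnotation: "auto"}))
	fmt.Println(topologyEnabled(map[string]string{topologyModeAnnotation: "anything-else"}))
	fmt.Println(topologyEnabled(map[string]string{deprecatedHintsAnnotation: "disabled"}))
	fmt.Println(topologyEnabled(nil))
}
```

Accepting any value other than "disabled" means values introduced later are treated as opting in rather than being silently ignored.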
ZhangKe10140699
7198bcffcd daemonset: use contextual logging 2023-03-14 08:50:27 +08:00
杨军10092085
361e4ff0fa volume: use contextual logging 2023-03-14 08:37:30 +08:00
Damien Grisonnet
ac394c5c19 Cleanup deprecated metrics
Remove the following deprecated metrics:
- node_collector_evictions_number
- scheduler_e2e_scheduling_duration_seconds

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2023-03-13 22:55:34 +01:00
Rodrigo Campos
8af3cce7fe
kubelet: remove GetHostIDsForPod()
Now KEP-127 relies on idmap mounts to do the ID translation and we won't
do any chowns in the kubelet.

This patch just removes the usage of GetHostIDsForPod() in
operationexecutor to do the chown, and also removes the
GetHostIDsForPod() method from the kubelet volume interface.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-03-13 22:28:03 +01:00
Kubernetes Prow Robot
c237ddb226
Merge pull request #116045 from sanposhiho/sanposhiho/message
fix(HPA): make a difference in SuccessfulRescale events between the resource metric and the container resource metric
2023-03-13 13:24:47 -07:00
Kubernetes Prow Robot
02a654a635
Merge pull request #116043 from sanposhiho/featuregate-check
fix(HPA): ignore the container resource metrics in HPA controller when the feature gate is disabled
2023-03-13 12:14:50 -07:00
Kubernetes Prow Robot
a0b1bee7c5
Merge pull request #115840 from atosatto/remove-taint-manager-cli
Remove enable-taint-manager and pod-eviction-timeout CLI flags
2023-03-13 08:13:10 -07:00
Kubernetes Prow Robot
492a08c916
Merge pull request #113525 from 249043822/br-context-logging-deployment
deployment controller: use contextual logging
2023-03-13 08:13:02 -07:00
Kubernetes Prow Robot
185cd95b9c
Merge pull request #113443 from yangjunmyfm192085/namespace-contextual-logging
namespace controller: use contextual logging
2023-03-13 04:34:44 -07:00
ZhangKe10140699
66bda6c092 deployment controller: use contextual logging 2023-03-13 19:00:44 +08:00
JunYang
f5bd8c86d4 namespace controller: use contextual logging 2023-03-13 14:59:17 +08:00
Kubernetes Prow Robot
16bc942a6b
Merge pull request #113464 from mengjiao-liu/contextual-logging-controller-bootstrap
Migrate `pkg/controller/bootstrap` to contextual logging
2023-03-12 20:12:42 -07:00
Mengjiao Liu
e56f3e0781 Migrate pkg/controller/bootstrap to contextual logging 2023-03-13 10:18:40 +08:00
Kensei Nakada
fafbed3b1d
fix the error message 2023-03-12 14:48:48 +09:00
Kubernetes Prow Robot
7529178924
Merge pull request #111372 from HeavenTonight/master
code cleanup
2023-03-10 11:44:40 -08:00
Kubernetes Prow Robot
c88b61f553
Merge pull request #113910 from mengjiao-liu/contextual-logging-pkg-controller-certificates
clusterroleaggregation: use contextual logging
2023-03-10 04:34:50 -08:00
Kubernetes Prow Robot
cb00077cd3
Merge pull request #113471 from ncdc/gc-contextual-logging
garbagecollector: use contextual logging
2023-03-10 04:34:39 -08:00
Kubernetes Prow Robot
ccba890df9
Merge pull request #114420 from bzsuni/bz/optimization
Cleanup: fix variable names in comments
2023-03-09 21:33:37 -08:00