The fields become beta, enabled by default. DeviceTaintRule gets
added to the v1beta2 API, but support for it must remain off by default
because that API group is also off by default.
The v1beta1 API is left unchanged. No-one should be using it
anymore (deprecated in 1.33, could be removed now if it wasn't for
reading old objects and version emulation).
To achieve consistent validation, declarative validation must be enabled also
for v1alpha3 (was already enabled for other versions). Otherwise,
TestVersionedValidationByFuzzing fails:
--- FAIL: TestVersionedValidationByFuzzing (0.09s)
--- FAIL: TestVersionedValidationByFuzzing/resource.k8s.io/v1beta2,_Kind=DeviceTaintRule (0.00s)
validation_test.go:109: different error count (0 vs. 1)
resource.k8s.io/v1alpha3: <no errors>
resource.k8s.io/v1beta2: "spec.taint.effect: Unsupported value: \"幤HxÒQP¹¬永唂ȳ垞ş]嘨鶊\": supported values: \"NoExecute\", \"NoSchedule\", \"None\""
...
When deleteObject returns a NotFound error (the object was externally deleted
between the GET and the DELETE), attemptToDeleteItem should enqueue a virtual
delete event and return enqueuedVirtualDeleteEventErr.
Cover both code paths:
- default (background propagation): item with dangling owner
- waitingForDependentsDeletion: item whose owner is foreground-deleting
This fixes a bug that caused log calls involving `klog.Logger` to not be
checked.
As a result we have to fix some code that is now considered faulty:
ERROR: pkg/controller/serviceaccount/tokens_controller.go:382:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
ERROR: func (e *TokensController) generateTokenIfNeeded(ctx context.Context, logger klog.Logger, serviceAccount *v1.ServiceAccount, cachedSecret *v1.Secret) ( /* retry */ bool, error) {
ERROR: ^
ERROR: pkg/controller/storageversionmigrator/storageversionmigrator.go:299:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
ERROR: func (svmc *SVMController) runMigration(ctx context.Context, logger klog.Logger, gvr schema.GroupVersionResource, resourceMonitor *garbagecollector.Monitor, toBeProcessedSVM *svmv1beta1.StorageVersionMigration, listResourceVersion string) (err error, failed bool) {
ERROR: ^
ERROR: pkg/proxy/node.go:121:3: logging function "Error" should not use format specifier "%q" (logcheck)
ERROR: klog.FromContext(ctx).Error(nil, "Timed out waiting for node %q to exist", nodeName)
ERROR: ^
ERROR: pkg/proxy/node.go:123:3: logging function "Error" should not use format specifier "%q" (logcheck)
ERROR: klog.FromContext(ctx).Error(nil, "Timed out waiting for node %q to be assigned IPs", nodeName)
ERROR: ^
ERROR: pkg/scheduler/backend/queue/scheduling_queue.go:610:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
ERROR: func (p *PriorityQueue) runPreEnqueuePlugin(ctx context.Context, logger klog.Logger, pl fwk.PreEnqueuePlugin, pInfo *framework.QueuedPodInfo, shouldRecordMetric bool) *fwk.Status {
ERROR: ^
ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:286:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
ERROR: func (pl *DynamicResources) deleteClaim(ctx context.Context, claim *resourceapi.ResourceClaim, logger klog.Logger) error {
ERROR: ^
ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:499:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
ERROR: func (pl *DynamicResources) waitForExtendedClaimInAssumeCache(
ERROR: ^
ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:528:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
ERROR: func (pl *DynamicResources) createExtendedResourceClaimInAPI(
ERROR: ^
ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:592:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
ERROR: func (pl *DynamicResources) unreserveExtendedResourceClaim(ctx context.Context, logger klog.Logger, pod *v1.Pod, state *stateData) {
ERROR: ^
ERROR: pkg/scheduler/framework/runtime/batch.go:171:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
ERROR: func (b *OpportunisticBatch) batchStateCompatible(ctx context.Context, logger klog.Logger, pod *v1.Pod, signature fwk.PodSignature, cycleCount int64, state fwk.CycleState, nodeInfos fwk.NodeInfoLister) bool {
ERROR: ^
ERROR: staging/src/k8s.io/component-base/featuregate/feature_gate.go:890:4: Additional arguments to Info should always be Key Value pairs. Please check if there is any key or value missing. (logcheck)
ERROR: logger.Info("Warning: SetEmulationVersionAndMinCompatibilityVersion will change already queried feature", "featureGate", feature, "oldValue", oldVal, newVal)
ERROR: ^
ERROR: test/images/sample-device-plugin/sampledeviceplugin.go:108:2: logging function "Info" should not use format specifier "%s" (logcheck)
ERROR: logger.Info("pluginSocksDir: %s", pluginSocksDir)
ERROR: ^
ERROR: test/images/sample-device-plugin/sampledeviceplugin.go:123:2: logging function "Info" should not use format specifier "%s" (logcheck)
ERROR: logger.Info("CDI_ENABLED: %s", cdiEnabled)
ERROR: ^
While waiting for this to merge, another call was added which also doesn't
follow conventions:
ERROR: pkg/kubelet/kubelet.go:2454:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
ERROR: func (kl *Kubelet) deletePod(ctx context.Context, logger klog.Logger, pod *v1.Pod) error {
ERROR: ^
Contextual logging has been beta and enabled by default for several releases
now. It's mostly just a matter of wrapping up and declaring it GA. Therefore
the calls which directly call WithName or WithValues (always have an effect)
are left as-is instead of converting them to use the klog wrappers (support
disabling the effect). To allow that, the linter gets reconfigured to not
complain about this anymore, anywhere.
The calls which would have to be fixed otherwise are:
ERROR: pkg/kubelet/cm/dra/claiminfo.go:170:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger = logger.WithName("dra-claiminfo")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/healthinfo.go:45:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger = logger.WithName("dra-healthinfo")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/healthinfo.go:89:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger = logger.WithName("dra-healthinfo")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/healthinfo.go:157:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger = logger.WithName("dra-healthinfo")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/manager.go:175:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger := klog.FromContext(ctx).WithName("dra-manager")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/manager.go:239:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger := klog.FromContext(ctx).WithName("dra-manager")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/manager.go:593:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger := klog.FromContext(ctx).WithName("dra-manager")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/manager.go:781:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger := klog.FromContext(context.Background()).WithName("dra-manager")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/manager.go:898:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger := klog.FromContext(ctx).WithName("dra-manager")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/manager_test.go:1638:15: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger := klog.FromContext(streamCtx).WithName(st.Name())
ERROR: ^
ERROR: pkg/kubelet/cm/dra/plugin/dra_plugin.go:77:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger := klog.FromContext(ctx).WithName("dra-plugin")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/plugin/dra_plugin.go:108:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger := klog.FromContext(ctx).WithName("dra-plugin")
ERROR: ^
ERROR: pkg/kubelet/cm/dra/plugin/dra_plugin.go:161:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger := klog.FromContext(ctx).WithName("dra-plugin")
ERROR: ^
ERROR: staging/src/k8s.io/dynamic-resource-allocation/resourceslice/tracker/tracker.go:695:14: function "WithValues" should be called through klogr.LoggerWithValues (logcheck)
ERROR: logger := logger.WithValues("device", deviceID)
ERROR: ^
ERROR: test/integration/apiserver/watchcache_test.go:42:54: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: etcd0URL, stopEtcd0, err := framework.RunCustomEtcd(klog.FromContext(ctx).WithName("etcd0"), "etcd_watchcache0", etcdArgs)
ERROR: ^
ERROR: test/integration/apiserver/watchcache_test.go:47:54: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: etcd1URL, stopEtcd1, err := framework.RunCustomEtcd(klog.FromContext(ctx).WithName("etcd1"), "etcd_watchcache1", etcdArgs)
ERROR: ^
ERROR: test/integration/scheduler_perf/scheduler_perf.go:1149:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
ERROR: logger = logger.WithName(tCtx.Name())
ERROR: ^
Raname this because facing lint error, counter metrics should have
"_total" suffix. Add the test `volume_operation_errors_total`
Marked `volume_operation_total_errors` as deprecated
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
When rapidly processing informer events it can happen that a pod gets scheduled
twice (seen only in the TestEviction/update unit test):
- Claim update observed, pod from informer cache with NodeName from update -> queue pod for eviction.
- Pod update observed, claim from informer cache -> queue pod again.
The effect is one additional Get call to the apiserver. We can avoid it by
maintaining an LRU cache with the UIDs of the pods which we have evicted and
thus don't need to do anything for.
In these cases it's certain that no time needs to pass, so Wait can
replace polling with Eventually. This also means that locking is
not necessary to prevent data races.
In particular with the builtin tCtx.Assert/Expect the assertions are also short
when using gomega and often more readable (no more confusion in Equal which one
is the expected and which the actual value).
We can observe the delay in the metric histogram. Because we run in a synctest
bubble, the delay is 100% predictable.
Unfortunately we cannot use the reactor mechanism of the fake client: that
delays while holding the fake's mutex. When some other goroutine (in this case,
the event recorder) calls the client, it gets blocked without being considered
durably blocked by synctest, so time does not advance and the test gets stuck.
Thanks for waiting for cache sync via channels the random delays caused by
polling are gone, making the initial setup including cache sync happen
"immediately" when a test starts (= same virtual time). This makes the tests
more predictable and simplifies making further assertions about when something
happens or how long it takes.
While at it, restore previous performance by setting feature gates once and
running tests in parallel again.
Added podToVolumes reverse index to optimize DeletePod.
Currently we simply iterate through all the volumes and remove the pod
being deleted from there. This is inefficient and takes longer the
longer the volume list becomes.
Keeping a map pod -> volumes makes removing a pod fast. We can just jump
to the relevant volumes directly and remove the pod from there.
When calling ControllerSELinuxTranslator.Conflicts(), the SELinux label
is repeatedly split into []string to detect conflicts. This causes a huge
number of allocations when there are many comparisons.
This is now made more efficient by pre-parsing the SELinux label and
storing it in podInfo as [4]string for fast comparison when needed.
Currently, we get each released PV every 15s, and in parallel. If there are a lot of released PV and we cannot finish all the get in 15s, it will starve other request by making the queue waiting for client-side throttling very long.
Even in a normal cluster, these requests are taking majority of all get requests from KCM (58% or 470 qps) in our stress test.
When objects are deleted externally (e.g., pods deleted directly after
job deletion), the garbage collector should not log errors. This change
adds explicit NotFound error handling in attemptToDeleteItem to enqueue
virtual delete events when deleteObject returns NotFound, treating
external deletion as a successful outcome.
This follows the same pattern already used when getObject returns
NotFound, ensuring consistency across the function.