In the DRA scheduler plugin, `SignalClaimPendingAllocation` and
`MaybeRemoveClaimPendingAllocation` coordinate in-flight allocations for
claims shared between Pods when scheduling a PodGroup. `Signal` is
called in Reserve to store an in-flight allocation and mark a claim as
being shared. `MaybeRemove` is called in Unreserve to release a Pod's
share of the claim and remove the in-flight allocation if no more Pods
are sharing the claim. `MaybeRemove` is also called at the end of
PreBind so successful binding cycles remove the in-flight allocation in
favor of the allocation stored in the API.

When a Pod transiently fails in PreBind, `MaybeRemove` is invoked twice:
first in PreBind, and again in Unreserve. Since shares are tracked with
a counter, these double-counted releases can cause the in-flight claim
to be removed from the cache prematurely and poison the AssumeCache with
a claim that has both an updated resourceVersion and no allocation,
halting progress.

This change refactors the `MaybeRemove` call in PreBind to only fire
when PreBind has finished successfully. When Pods sharing a claim are
binding concurrently, each one will either:
- Update the claim's allocation successfully and force-remove the
  in-flight allocation, or
- Fail to bind and wait until Unreserve to call `MaybeRemove`.
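
The double release described above can be reproduced with a minimal
sketch of a share counter. The `inFlightClaims` type and its methods are
hypothetical stand-ins for illustration, not the plugin's actual data
structures; only the counting behavior is taken from the description.

```go
package main

import "fmt"

// inFlightClaims is a hypothetical stand-in for the plugin's in-flight
// allocation tracking. The real implementation may look different.
type inFlightClaims struct {
	shares map[string]int // claim name -> number of Pods sharing it
}

func newInFlightClaims() *inFlightClaims {
	return &inFlightClaims{shares: map[string]int{}}
}

// signal records one more Pod sharing the claim (called from Reserve).
func (c *inFlightClaims) signal(claim string) {
	c.shares[claim]++
}

// maybeRemove releases one share and reports whether the in-flight
// allocation was removed because no shares remain.
func (c *inFlightClaims) maybeRemove(claim string) bool {
	n, ok := c.shares[claim]
	if !ok {
		return false
	}
	if n <= 1 {
		delete(c.shares, claim)
		return true
	}
	c.shares[claim] = n - 1
	return false
}

// inFlight reports whether an in-flight allocation is still tracked.
func (c *inFlightClaims) inFlight(claim string) bool {
	_, ok := c.shares[claim]
	return ok
}

// doubleRelease models the old flow: Pods A and B share a claim, Pod A
// fails in PreBind and releases its share in PreBind and in Unreserve.
func doubleRelease() bool {
	c := newInFlightClaims()
	c.signal("claim") // Pod A, Reserve
	c.signal("claim") // Pod B, Reserve
	c.maybeRemove("claim") // Pod A, end of failed PreBind
	c.maybeRemove("claim") // Pod A, Unreserve
	return c.inFlight("claim")
}

// singleRelease models the new flow: the failed Pod releases its share
// only once, in Unreserve.
func singleRelease() bool {
	c := newInFlightClaims()
	c.signal("claim") // Pod A, Reserve
	c.signal("claim") // Pod B, Reserve
	c.maybeRemove("claim") // Pod A, Unreserve only
	return c.inFlight("claim")
}

func main() {
	fmt.Println("old flow, still in flight for Pod B:", doubleRelease())
	fmt.Println("new flow, still in flight for Pod B:", singleRelease())
}
```

In the old flow the counter reaches zero while Pod B still relies on the
in-flight allocation; in the new flow Pod B's share survives.
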

The DRAListTypeAttributes feature gate was enabled but not passed
through AllocatorFeatures(), so the scheduler always selected the
incubating allocator, which doesn't support list-type attribute
intersection matching (KEP-5491). The experimental allocator has
the intersection logic but was never selected.
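
As a rough illustration of why a gate that is not passed through has no
effect, the following sketch models feature-based allocator selection.
All type and function names here (`gates`, `allocatorFeatures`,
`featuresBefore`, `featuresAfter`, `selectAllocator`) are hypothetical
and do not mirror the real scheduler code.

```go
package main

import "fmt"

// gates is a hypothetical snapshot of feature gate state.
type gates struct {
	DRAListTypeAttributes bool
}

// allocatorFeatures is a hypothetical subset of the features that
// drive allocator selection.
type allocatorFeatures struct {
	ListTypeAttributes bool
}

// featuresBefore models the bug: the gate is never copied, so the
// field always keeps its zero value.
func featuresBefore(g gates) allocatorFeatures {
	return allocatorFeatures{}
}

// featuresAfter models the fix: the gate is passed through.
func featuresAfter(g gates) allocatorFeatures {
	return allocatorFeatures{ListTypeAttributes: g.DRAListTypeAttributes}
}

// selectAllocator picks the most stable implementation that supports
// every requested feature (assumed selection rule).
func selectAllocator(f allocatorFeatures) string {
	if f.ListTypeAttributes {
		return "experimental"
	}
	return "incubating"
}

func main() {
	g := gates{DRAListTypeAttributes: true}
	fmt.Println("before:", selectAllocator(featuresBefore(g)))
	fmt.Println("after: ", selectAllocator(featuresAfter(g)))
}
```
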

As StorageCapacityScoring graduates to beta, its default shape is
space-spreading (prefer nodes with more available storage capacity).
However, the test code was still treating space-packing (prefer nodes
with less available storage capacity) as the default, a remnant from
the VolumeCapacityPriority era. That feature was absorbed into
StorageCapacityScoring.

This commit fixes that by aligning the default shape in the tests with
the actual default of StorageCapacityScoring.
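
The difference between the two shapes can be sketched as a
piecewise-linear mapping from utilization to score. The concrete point
values (a 0 to 10 score scale, endpoints at 0% and 100% utilization) and
the names `shapePoint` and `scoreFor` are assumptions for illustration,
not values taken from this commit.

```go
package main

import "fmt"

// shapePoint is a hypothetical (utilization %, score) pair.
type shapePoint struct {
	utilization int64
	score       int64
}

// Assumed shapes: spreading favors low utilization, packing favors
// high utilization.
var (
	spreading = []shapePoint{{0, 10}, {100, 0}}
	packing   = []shapePoint{{0, 0}, {100, 10}}
)

// scoreFor linearly interpolates between shape points. It expects a
// non-empty shape with strictly increasing utilization values.
func scoreFor(shape []shapePoint, u int64) int64 {
	for i, p := range shape {
		if u <= p.utilization {
			if i == 0 {
				return p.score
			}
			prev := shape[i-1]
			return prev.score + (p.score-prev.score)*(u-prev.utilization)/(p.utilization-prev.utilization)
		}
	}
	return shape[len(shape)-1].score
}

func main() {
	for _, u := range []int64{0, 20, 100} {
		fmt.Printf("utilization %3d%%: spreading=%d packing=%d\n",
			u, scoreFor(spreading, u), scoreFor(packing, u))
	}
}
```

A test that expects the packing scores while the plugin runs with the
spreading default would rank nodes in the opposite order.
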

Include image volumes in the image source count used by calculatePriority, so
the raw image score and max threshold are based on the same image sources.
Update ImageLocality tests to cover ImageVolume scoring against equivalent
regular container images.
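
To show why the count matters, here is a simplified sketch of a
threshold-based priority calculation. The function name
`calculatePrioritySketch`, the threshold constants, and the omission of
any per-image scaling are assumptions; the real calculatePriority may
differ in detail.

```go
package main

import "fmt"

const (
	mb int64 = 1024 * 1024
	// Assumed thresholds, for illustration only.
	minThreshold          = 23 * mb
	maxPerSourceThreshold = 1000 * mb
	maxNodeScore    int64 = 100
)

// calculatePrioritySketch scales the summed image score into
// [0, maxNodeScore]. The upper threshold grows with the number of image
// sources, so the count has to cover every source that contributed to
// sumScores. numSources is assumed to be at least 1.
func calculatePrioritySketch(sumScores int64, numSources int) int64 {
	maxThreshold := maxPerSourceThreshold * int64(numSources)
	if sumScores < minThreshold {
		sumScores = minThreshold
	} else if sumScores > maxThreshold {
		sumScores = maxThreshold
	}
	return maxNodeScore * (sumScores - minThreshold) / (maxThreshold - minThreshold)
}

func main() {
	// One container image and one image volume, 500 MB each, both
	// already present on the node.
	sum := 2 * 500 * mb

	// Counting only the container inflates the score.
	fmt.Println("containers only:", calculatePrioritySketch(sum, 1))
	// Counting the image volume matches two regular container images.
	fmt.Println("with image volume:", calculatePrioritySketch(sum, 2))
}
```

With the volume excluded from the count, the Pod hits the maximum score;
with it included, the score equals that of a Pod with two regular
container images of the same sizes.
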

Signed-off-by: Joshua Su <i@joshua.su>

Both kube-controller-manager and kube-scheduler create ResourceClaims. Using
the same metric (sub-system: "dynamic_resource_allocation", name:
"resourceclaim_creates_total") in both components simplifies aggregation across
the entire cluster.