It turned out that ginkgo.GinkgoT() wasn't as cheap as it should have been (fix
coming in Ginkgo 2.27.5). When instantiated once for each framework.Framework
instance during init by all workers at the same time, the resulting spike in
overall memory usage within the container caused OOM killing of workers in Prow
jobs like ci-kubernetes-e2e-gci-gce with very tight memory limits.
Even with the upcoming fix in Ginkgo it makes sense to set the TB field only
while it really is needed, i.e. while a test runs. This is conceptually similar
to setting and unsetting the test namespace. It may help to flush out incorrect
usage of TB outside of tests.
This makes it possible to call helper packages which expect a TContext from E2E
tests.
The implementation uses GinkgoT as TB and supports registering cleanup
callbacks which expect a context. These callbacks then run with a context that
comes from ginkgo.DeferCleanup, just as if they had called that directly.
(cherry picked from commit 47b613eded)
Refactoring the DRA upgrade/downgrade testing such that it runs as Go test
depended on supporting ktesting in the E2E framework. That change worked during
presubmit testing, but broke some periodic jobs. Therefore the relevant commits
from https://github.com/kubernetes/kubernetes/pull/135664/commits get reverted:
c47ad64820 DRA e2e+integration: test ResourceSlice controller
047682908d ktesting: replace Begin/End with TContext.Step
de47714879 DRA upgrade/downgrade: rewrite as Go unit test
7c7b1e1018 DRA e2e: make driver deployment possible in Go unit tests
65ef31973c DRA upgrade/downgrade: split out individual test steps
47b613eded e2e framework: support creating TContext
The last one is what must have caused the problem, but the other commits depend
on it.
This makes it possible to call helper packages which expect a TContext from E2E
tests.
The implementation uses GinkgoT as TB and supports registering cleanup
callbacks which expect a context. These callbacks then run with a context that
comes from ginkgo.DeferCleanup, just as if they had called that directly.
A special test (for example, one which manages its own cluster) could almost
construct a Framework instance because most fields are exported. The
`clientConfig` field isn't because REST configs often need to be deep-copied to
avoid accidentally updating some shared copy, so for this special case a
SetClientConfig is needed.
This consolidates timeout handling. In the future, configuration of all
timeouts via a configuration file might get added. For now, the same three
legacy command line flags for the timeouts that get moved continue to be
supported.
If we were to add new fields in TimeoutContext, the current users of
NewFrameworkWithCustomTimeouts might run into failures unless they get modified
to also set those new fields. This is error-prone.
A better approach is to let users of NewFrameworkWithCustomTimeouts override
fields by setting just those and use the normal defaults for the others.
All code must use the context from Ginkgo when doing API calls or polling for a
change, otherwise the code would not return immediately when the test gets
aborted.
It is set in all of the test/e2e* suites, but not in the ginkgo output
tests. This check is needed before adding a test case there which would trigger
this nil pointer access.
This addresses a problem caused by
https://github.com/kubernetes/kubernetes/pull/112043: because the AfterEach
which invokes AllNodesReady always runs, including tests that skipped early,
those tests ran into a nil pointer access. This increased the size of log
files. The tests still worked.
This helps getting rid of the ssh dependency. The same init package as for
dumping namespaces takes care of adding the functionality back to framework
instances.
This helps getting rid of the ssh dependency. The same init package as for
dumping namespaces takes care of adding the functionality back to framework
instances.
This reduces the size of the test/e2e/framework itself. Because it does not
gather metrics data anymore by default, E2E test suites must set their
callbacks function or set the original one by importing
"k8s.io/kubernetes/test/e2e/framework/todo/metrics/init".
This reduces the size of the test/e2e/framework itself. Because it does not
check nodes anymore by default, E2E test suites must set their own check
function or set the original one by importing
"k8s.io/kubernetes/test/e2e/framework/todo/node/init".
This reduces the size of the test/e2e/framework itself. Because it does not
dump anything anymore by default, E2E test suites must set their own dump
function or set the original one by importing
"k8s.io/kubernetes/test/e2e/framework/debug/init".
This will be used as mechanism for invoking some of the code which is currently
hard-coded in the framework once that code is placed in optional packages.
When Ginkgo shows a BeforeEach/AfterEach/DeferCleanup, then it can only show
the source code where the callback was registered because there is no
description parameter. This can be improved by passing a custom CodeLocation.
Because a description like "set up framework" might not be enough, the source
code is still shown, too.
The framework.AddCleanupAction API was a workaround for Ginkgo v1 not invoking
AfterEach callbacks after a test failure. Ginkgo v2 not only fixed that, but
also added a DeferCleanup API which can be used to run some code if (and only
if!) the corresponding setup code ran. In several cases that makes the test
cleanup simpler.
For cleanup purposes the ginkgo.DeferCleanup is a better replacement for
f.AddAfterEach:
- the cleanup only gets executed when the corresponding setup code ran
and can use the same local variables
- the callback runs after the test and before the framework
deletes namespaces (as before)
- if one callback fails, the others still get executed
For the original purpose (https://github.com/kubernetes/kubernetes/pull/86177 "This is
very useful for custom gathering scripts.") it is now possible to use
ginkgo.AfterEach because it will always get executed. Just beware that its
callbacks run in first-in-first-out order.
In contrast to ginkgo.AfterEach, ginkgo.DeferCleanup runs the callback in
first-in-last-out order. Using it makes the following test code work as
expected:
f := framework.NewDefaultFramework("some test")
ginkgo.AfterEach(func() {
// do something with f.ClientSet
})
Previously, f.ClientSet was already set to nil by the framework's cleanup code.
Besides, the using of method might lead to a `concurrent map writes`
issue per the discussion here: https://github.com/onsi/ginkgo/issues/970
Signed-off-by: Dave Chen <dave.chen@arm.com>
`FullStackTrace` is not available in v2 if no exception found
with test execution.
The change is needed for conformance test's spec validation.
pls see: https://github.com/onsi/ginkgo/issues/960 for details.
Signed-off-by: Dave Chen <dave.chen@arm.com>
- update all the import statements
- run hack/pin-dependency.sh to change pinned dependency versions
- run hack/update-vendor.sh to update go.mod files and the vendor directory
- update the method signatures for custom reporters
Signed-off-by: Dave Chen <dave.chen@arm.com>
The change from service account secrets to projected tokens and
the new dependency on kube-root-ca.crt to start pods with those
projected tokens means that e2e tests can start before
kube-root-ca.crt is created in a namespace. Wait for the default
service account AND the kube-root-ca.crt configmap in normal
e2e tests.
The previous approach with grabbing via a nginx proxy had some
drawbacks:
- it did not work when the pods only listened on localhost (as
configured by kubeadm) and the proxy got deployed on a different
node
- starting the proxy raced with starting the pods, causing
sporadic test failures because the proxy was not set up
properly unless it saw all pods when starting the e2e.test
- the proxy was always started, whether it is needed or not
- the proxy was left running after a test and then the next
test run triggered potentially confusing messages when
it failed to create objects for the proxy
The new approach is similar to "kubectl port-forward" + "kubectl get
--raw". It uses the port forwarding feature to establish a TCP
connection via a custom dialer, then lets client-go handle TLS and
credentials.
Somehow verifying the server certificate did not work. As this
shouldn't be a big concern for E2E testing, certificate checking gets
disabled on the client side instead of investigating this further.