-
Bug
-
Resolution: Done
-
Quality / Stability / Reliability
Description of problem:
The control plane deployment is stuck; only the following pods are deployed:
1. capi-provider
2. cluster-api
3. control-plane-operator
The control-plane-operator log is full of this error:
{"level":"error","ts":"2022-08-22T05:35:25Z","msg":"Reconciler error","controller":"hostedcontrolplane","controllerGroup":"hypershift.openshift.io","controllerKind":"HostedControlPlane","hostedControlPlane":{"name":"test-infra-cluster-e1635fda","namespace":"clusters-test-infra-cluster-e1635fda"},"namespace":"clusters-test-infra-cluster-e1635fda","name":"test-infra-cluster-e1635fda","reconcileID":"26fbb362-978f-4112-906c-f959732f5d11","error":"failed to update control plane: failed to reconcile ignition server: failed to reconcile ignition deployment: Deployment.apps \"ignition-server\" is invalid: spec.template.spec.containers[0].image: Required value","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234"}
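When the log is this noisy, it can help to collapse the repeated entries into a count per distinct error. The following is a minimal sketch, assuming the operator log is available as one JSON object per line with the `level`, `ts`, `msg` and `error` fields seen in the entry above; the function names are illustrative and not part of HyperShift.

```python
import json
import sys
from collections import Counter


def reconciler_errors(lines):
    """Yield (timestamp, error) for every "Reconciler error" entry.

    Lines that are not JSON objects are skipped, so the function can be
    fed a raw pod log that mixes structured and unstructured output.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if entry.get("level") == "error" and entry.get("msg") == "Reconciler error":
            yield entry.get("ts"), entry.get("error")


def summarize(lines):
    """Count how often each distinct error message occurs."""
    return Counter(error for _, error in reconciler_errors(lines))


if __name__ == "__main__":
    # Reads the log from standard input and prints the most frequent errors.
    for error, count in summarize(sys.stdin).most_common():
        print(f"{count:6d}  {error}")
```

Piping the control-plane-operator log into the script prints the most frequent reconcile errors first.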
It seems that this line returns an empty string:
https://github.com/openshift/hypershift/blob/7e9f12fdd326cd581b1f657c645e53b0586f6a3a/control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go#L625
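If the lookup does return an empty string, the failure only surfaces later as the API server's `image: Required value` validation error. A guard at the lookup site would name the missing component directly. The following is a sketch of that idea in Python, not the operator's actual Go code; `require_image`, `MissingImageError` and the shape of the `images` mapping are assumptions made for illustration.

```python
class MissingImageError(ValueError):
    """Raised when a component has no image pullspec in the lookup table."""


def require_image(images, component):
    """Return the pullspec for `component`, or fail loudly if it is empty.

    `images` is a hypothetical mapping of component name to pullspec,
    standing in for whatever the release payload lookup provides.
    """
    image = images.get(component, "")
    if not image:
        raise MissingImageError(
            f"no image found for component {component!r}; "
            "refusing to build a Deployment with an empty image"
        )
    return image
```

With a check like this, the reconcile error would point at the missing component instead of at the Deployment field.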
This is the link to the job artifacts:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-cluster-api-provider-agent-master-e2e-metal-assisted-capi-periodic/1561580772995895296/artifacts/e2e-metal-assisted-capi-periodic/assisted-baremetal-gather/artifacts/
Version-Release number of selected component (if applicable):
ocp-4.11
This is the job configuration:
https://github.com/openshift/release/blob/d82af179367ed18da189ddece26bc465c08d98c3/ci-operator/config/openshift/cluster-api-provider-agent/openshift-cluster-api-provider-agent-master.yaml#L117
So the release image
registry.build02.ci.openshift.org/ci-op-fxj5lnvy/release@sha256:1946d15fec9b196f46326ea7c271190a33e20e6b25793af091be26dd2f1c5356
should match
https://github.com/openshift/release/blob/d82af179367ed18da189ddece26bc465c08d98c3/ci-operator/config/openshift/cluster-api-provider-agent/openshift-cluster-api-provider-agent-master.yaml#L59
This is the create cluster command:
./bin/hypershift create cluster agent --pull-secret /tmp/tmpdmpfpx3s --name test-infra-cluster-e1635fda --agent-namespace assisted-spoke-cluster-ad00f487 --base-domain redhat.com --annotations hypershift.openshift.io/capi-provider-agent-image=registry.build02.ci.openshift.org/ci-op-fxj5lnvy/pipeline@sha256:dcb630bc03f5fe72a4a9c70421731cfa77a38a3b6130b0c65395a6438d0ef655 --release-image=registry.build02.ci.openshift.org/ci-op-fxj5lnvy/release@sha256:1946d15fec9b196f46326ea7c271190a33e20e6b25793af091be26dd2f1c5356 --ssh-key /tmp/tmp9bfk_295
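To check which release image a given run actually used, the flag value can be pulled out of the recorded command line and split into repository and digest. This is a small sketch using only the Python standard library; the helper names are illustrative, and it handles both the `--flag value` and `--flag=value` forms that appear in the command above.

```python
import shlex


def flag_value(command, flag):
    """Return the value passed to `flag` in a shell command line, or None."""
    tokens = shlex.split(command)
    for i, token in enumerate(tokens):
        if token == flag and i + 1 < len(tokens):
            return tokens[i + 1]          # "--flag value" form
        if token.startswith(flag + "="):
            return token[len(flag) + 1:]  # "--flag=value" form
    return None


def split_digest(pullspec):
    """Split "repo@sha256:..." into (repo, digest); digest is None if absent."""
    repo, sep, digest = pullspec.partition("@")
    return repo, (digest if sep else None)
```

For example, `split_digest(flag_value(cmd, "--release-image"))` yields the repository and the `sha256:` digest to compare against the release referenced in the job configuration.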
How reproducible:
70%
See the job history here:
https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-openshift-cluster-api-provider-agent-master-e2e-metal-assisted-capi-periodic
Steps to Reproduce:
1. Set up an ACM hub cluster with the infrastructure operator enabled
2. Create an InfraEnv and spin up an agent
3. Create a HyperShift HostedCluster using the command above
You can follow these instructions for setting up the env manually:
https://hypershift-docs.netlify.app/how-to/agent/create-agent-cluster/
Actual results:
The hosted control plane deployment hangs and never reaches ready status