Type: Bug
Resolution: Unresolved
Priority: Major
Affects Version: 4.18.z
Quality / Stability / Reliability
Severity: Important
Description of problem:
Ignition server returning 511 error when scaling up NodePool. The ignition-server logs show the machine-config-server (MCS) returning an unexpected 500 response; the full log line is reproduced under "Ignition-server logs" below.
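For reference, the ignition-server logs can be pulled from the hosted control plane namespace; a hedged example, assuming the HostedCluster is named c1-hosted-01 in the clusters namespace so the control plane namespace follows the usual clusters-<name> pattern:

oc -n clusters-c1-hosted-01 logs deployment/ignition-server --tail=50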
Version-Release number of selected component (if applicable):
Advanced Cluster Management for Kubernetes 2.13.0-80 (provided by Red Hat)
multicluster engine for Kubernetes 2.8.0-201 (provided by Red Hat)
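The installed versions can be confirmed on the hub cluster from the operators' CSVs; a quick check (CSV names and namespaces may vary by install):

oc get csv -A | grep -Ei 'advanced-cluster-management|multicluster-engine'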
How reproducible:
See Steps to Reproduce below.
Steps to Reproduce:
1. Use bare-metal servers and the Agent-Based Installer (ABI) to create a 3-node OCP cluster, so the control-plane nodes also carry the worker role.
2. Use UPI to add 3 worker nodes to the OCP cluster.
3. Mark the control plane unschedulable to remove the worker role from the control-plane nodes.
4. Install the latest ACM and MCE from the v4.18 catalog.
5. Create the hub cluster on the 3 worker nodes.
6. Add two new nodes to the host inventory.
7. Create an HCP cluster.
8. Try to add worker nodes to the HCP cluster; they fail in the Insufficient state (a sketch of the scale-up command follows this list).
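Step 8 amounts to scaling the NodePool up; a minimal sketch, assuming the NodePool is named c1-hosted-01 and lives in the clusters namespace:

oc -n clusters scale nodepool c1-hosted-01 --replicas 2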
Actual results:
The scale-up fails. The agents stay in the insufficient state:

oc -n c1-hosted-01 get agent -o jsonpath='{range .items[*]}BMH: {@.metadata.labels.agent-install\.openshift\.io/bmh} Agent: {@.metadata.name} State: {@.status.debugInfo.state}{"\n"}{end}'
BMH: Agent: 65fdfa2d-37a4-0080-aae4-d1b2cc58b1a2 State: insufficient
BMH: Agent: bfe9f471-34ac-3e3a-e5d6-928ca2c1bbe0 State: insufficient

The agent status shows:

state: insufficient
stateInfo: |-
  Host does not meet the minimum hardware requirements: This host has failed to download the ignition file from https://api.c1-hosted-01.p82.local:31615/ignition with the following error: ignition file download failed: bad status code: 511. server response: Token not found
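The full status of one of the stuck agents, including the validation details, can be dumped with (agent name taken from the output above):

oc -n c1-hosted-01 get agent 65fdfa2d-37a4-0080-aae4-d1b2cc58b1a2 -o yaml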
Since the token is always present, we suspect a cache miss somewhere in the HyperShift path the agents consume (a quick check on the token secrets is sketched below). Please help review this and provide a root cause and a workaround.
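As a first check on that theory, the token secrets held in the hosted control plane namespace can be listed; a sketch, assuming the clusters-c1-hosted-01 namespace and the token- name prefix the ignition server uses for its token secrets:

oc -n clusters-c1-hosted-01 get secrets | grep '^token-'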
Ignition-server logs:

{"level":"error","ts":"2025-03-10T01:52:11Z","logger":"get-payload","msg":"mcs returned unexpected response code","code":500}

Stack trace (unescaped for readability):
github.com/openshift/hypershift/ignition-server/controllers.(*LocalIgnitionProvider).GetPayload.func11.3
    /hypershift/ignition-server/controllers/local_ignitionprovider.go:676
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext
    /hypershift/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:154
k8s.io/apimachinery/pkg/util/wait.waitForWithContext
    /hypershift/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:207
k8s.io/apimachinery/pkg/util/wait.poll
    /hypershift/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:260
k8s.io/apimachinery/pkg/util/wait.PollUntilWithContext
    /hypershift/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:111
github.com/openshift/hypershift/ignition-server/controllers.(*LocalIgnitionProvider).GetPayload.func11
    /hypershift/ignition-server/controllers/local_ignitionprovider.go:660
github.com/openshift/hypershift/ignition-server/controllers.(*LocalIgnitionProvider).GetPayload
    /hypershift/ignition-server/controllers/local_ignitionprovider.go:695
github.com/openshift/hypershift/ignition-server/controllers.(*TokenSecretReconciler).Reconcile.func1
    /hypershift/ignition-server/controllers/tokensecret_controller.go:273
github.com/openshift/hypershift/ignition-server/controllers.(*TokenSecretReconciler).Reconcile
    /hypershift/ignition-server/controllers/tokensecret_controller.go:281
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222
Expected results:
Worker nodes can be added to the HCP cluster.
Additional info:
We also created a support case for this issue: https://access.redhat.com/support/cases/#/case/04078889
Clones: OCPBUGS-52564 "Ignition server error when scaling up NodePool" (Closed)