Bug
Resolution: Unresolved
Major
4.20
Quality / Stability / Reliability
False
0
Critical
OCPEDGE Sprint 280, OCPEDGE Sprint 281
2
Version-Release number of selected component (if applicable):
Description of problem:
The installation of a TNF cluster using the agent-based installer gets stuck because Pacemaker fails to synchronize entities.
journalctl shows failures related to a missing revision.json file.
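The failures mentioned above can be located by filtering the journal for the file name. This is a minimal sketch, assuming the messages contain the literal string revision.json. The helper name `filter_revision_errors` is hypothetical, and because this report does not say which unit logs the errors, the usage line searches the whole current boot:

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the lines on stdin that mention revision.json.
# -F treats the pattern as a fixed string, so the dot is not a regex wildcard.
filter_revision_errors() {
    grep -F 'revision.json'
}

# Usage on an affected node (assumption: the messages are in the current boot's journal):
#   sudo journalctl -b --no-pager | filter_revision_errors
```

The function only defines a filter and runs nothing by itself, so it can be pasted into a shell on either control-plane node.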
How reproducible:
Steps to Reproduce:
1. Install a cluster to act as the hub cluster.
2. Install MCE.
3. Provision the infrastructure for a new spoke cluster.
4. Apply the manifests that deploy a TNF cluster (javier-1_manifests.tgz).
5. After the nodes are installed, the ACI status is "finalizing", but Pacemaker on the hosts reports failed resource actions (see Actual results).
Actual results:
pcs status:
Full List of Resources:
  * Clone Set: kubelet-clone [kubelet]:
    * Started: [ javier-master-1-0 javier-master-1-1 ]
  * javier-master-1-0_redfish (stonith:fence_redfish): Started javier-master-1-0
  * javier-master-1-1_redfish (stonith:fence_redfish): Started javier-master-1-1
  * Clone Set: etcd-clone [etcd]:
    * Stopped: [ javier-master-1-0 javier-master-1-1 ]

Failed Resource Actions:
  * etcd start on javier-master-1-1 returned 'error' (podman failed to launch container (error code: 1)) at Thu Nov 6 15:48:25 2025 after 2m6.080s
  * etcd start on javier-master-1-0 could not be executed (Timed Out: Resource agent did not complete within 10m) at Thu Nov 6 15:48:25 2025 after 10m2ms
Expected results:
pcs status shows no failed resources and the TNF cluster is deployed successfully.
Additional info:
As a manual workaround, running "sudo pcs resource cleanup" allows the installation to continue.
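The workaround can be applied conditionally, so that the cleanup runs only when pcs status actually lists failed actions. This is a minimal sketch, not a fix for the underlying bug. The function names `has_failed_actions` and `cleanup_if_failed` are hypothetical, and the parsing assumes the "Failed Resource Actions:" layout shown under Actual results:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if the pcs status text on stdin lists
# at least one bullet entry under "Failed Resource Actions:".
has_failed_actions() {
    awk '
        /Failed Resource Actions:/     { in_failed = 1; next }  # section header
        in_failed && /^[[:space:]]*$/  { in_failed = 0 }        # blank line ends the section
        in_failed && /^[[:space:]]*\*/ { found = 1 }            # a failed-action entry
        END { exit (found ? 0 : 1) }
    '
}

# Hypothetical wrapper: run the manual workaround only when failures are reported.
cleanup_if_failed() {
    if sudo pcs status | has_failed_actions; then
        sudo pcs resource cleanup
    fi
}

# Usage on one of the cluster nodes:
#   cleanup_if_failed
```

Keeping the check separate from the cleanup means the parser can be exercised against saved pcs status output without touching a live cluster.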