- Bug
- Resolution: Cannot Reproduce
- Critical
- 4.13.0
- Quality / Stability / Reliability
- Rejected
Description of problem:
Installation fails when the realtime (RT) kernel is enabled.
Version-Release number of selected component (if applicable):
4.13.0-0.nightly-2023-03-11-033820
How reproducible:
Always
Steps to Reproduce:
1. Run "create install-config", then insert "credentialsMode: Manual" into install-config.yaml.
2. Run "create manifests", then create the manifest files that enable the RT kernel.
3. Create the required credentials manually.
4. Run "create cluster".
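The manifest files from step 2 are not attached to this report. As a rough sketch only, assuming the usual MachineConfig approach of setting `kernelType: realtime` for each pool, they could be generated as follows. The installation directory and the file names are hypothetical; both roles are included because both pools report the failure below.

```shell
#!/bin/sh
# Hypothetical installation directory; the report does not name one.
INSTALL_DIR="${INSTALL_DIR:-./install-dir}"

# "create manifests" normally creates this directory; mkdir -p keeps the sketch self-contained.
mkdir -p "${INSTALL_DIR}/openshift"

# Write one MachineConfig per role (assumed: master and worker).
for role in master worker; do
  cat > "${INSTALL_DIR}/openshift/99-${role}-realtime.yaml" <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "${role}"
  name: 99-${role}-realtime
spec:
  kernelType: realtime
EOF
done
```

The files must be in place before "create cluster" is run so that they are applied during installation.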
Actual results:
The installation failed, with the cluster operator (co) "machine-config" degraded.
Expected results:
The installation should succeed.
Additional info:
FYI the QE flexy-install job: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/185177/
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       False         22m     Error while reconciling 4.13.0-0.nightly-2023-03-11-033820: the cluster operator machine-config is degraded
$ oc get nodes
NAME                                        STATUS   ROLES                  AGE   VERSION
jiwei-0313a-zcmqc-master-0                  Ready    control-plane,master   63m   v1.26.2+bc894ae
jiwei-0313a-zcmqc-master-1                  Ready    control-plane,master   63m   v1.26.2+bc894ae
jiwei-0313a-zcmqc-master-2                  Ready    control-plane,master   63m   v1.26.2+bc894ae
jiwei-0313a-zcmqc-worker-us-east-1a-95gvm   Ready    worker                 37m   v1.26.2+bc894ae
jiwei-0313a-zcmqc-worker-us-east-1a-hsb9s   Ready    worker                 36m   v1.26.2+bc894ae
jiwei-0313a-zcmqc-worker-us-east-1b-tkgc2   Ready    worker                 35m   v1.26.2+bc894ae
$ oc get co machine-config
NAME             VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
machine-config   4.13.0-0.nightly-2023-03-11-033820   True        False         True       50m     Failed to resync 4.13.0-0.nightly-2023-03-11-033820 because: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 0)]
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-376fbf7e7ee4581bf68bc0a2686538ed   False     True       True       3              0                   0                     1                      59m
worker   rendered-worker-67c5d5a1689043d5419056c2ec3a83b3   False     True       True       3              0                   0                     1                      59m
$ oc describe co machine-config
Name:         machine-config
Namespace:
Labels:       <none>
Annotations:  exclude.release.openshift.io/internal-openshift-hosted: true
              include.release.openshift.io/self-managed-high-availability: true
              include.release.openshift.io/single-node-developer: true
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
...output omitted...
Status:
  Conditions:
    Last Transition Time:  2023-03-13T09:10:36Z
    Message:               Cluster version is 4.13.0-0.nightly-2023-03-11-033820
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2023-03-13T09:27:33Z
    Message:               Failed to resync 4.13.0-0.nightly-2023-03-11-033820 because: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 0)]
    Reason:                RequiredPoolsFailed
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2023-03-13T09:10:35Z
    Message:               Cluster has deployed [{operator 4.13.0-0.nightly-2023-03-11-033820}]
    Reason:                AsExpected
    Status:                True
    Type:                  Available
    Last Transition Time:  2023-03-13T09:17:38Z
    Message:               One or more machine config pools are degraded, please see `oc get mcp` for further details and resolve before upgrading
    Reason:                DegradedPool
    Status:                False
    Type:                  Upgradeable
  Extension:
    Master:  pool is degraded because nodes fail with "1 nodes are reporting degraded status on sync": "Node jiwei-0313a-zcmqc-master-0 is reporting: \"error running rpm-ostree override remove kernel kernel-core kernel-modules kernel-modules-extra --install kernel-rt-core --install kernel-rt-modules --install kernel-rt-modules-extra --install kernel-rt-kvm: \\x1b[0m\\x1b[31merror: \\x1b[0mPackage/capability 'kernel-rt-core' is already requested\\n: exit status 1\""
    Worker:  pool is degraded because nodes fail with "1 nodes are reporting degraded status on sync": "Node jiwei-0313a-zcmqc-worker-us-east-1a-95gvm is reporting: \"error running rpm-ostree override remove kernel kernel-core kernel-modules kernel-modules-extra --install kernel-rt-core --install kernel-rt-modules --install kernel-rt-modules-extra --install kernel-rt-kvm: \\x1b[0m\\x1b[31merror: \\x1b[0mPackage/capability 'kernel-rt-core' is already requested\\n: exit status 1\""
...output omitted...
$
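For triage, the degraded pools can be picked out of the `oc get mcp` output shown earlier with a small filter. This is only a sketch: `degraded_pools` is a hypothetical helper name, and it assumes the default tabular output with a `DEGRADED` header column.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get mcp` tabular output on stdin and
# prints the name of every pool whose DEGRADED column is "True".
degraded_pools() {
  awk '
    # Header row: remember which column is DEGRADED, then skip the row.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "DEGRADED") col = i; next }
    # Data rows: print the pool name when that column reads True.
    col && $col == "True" { print $1 }
  '
}
```

Usage: `oc get mcp | degraded_pools`. Against the output in this report it would list both `master` and `worker`.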