Bug
Resolution: Duplicate
Critical
4.13.0
Quality / Stability / Reliability
Important
Proposed
Description of problem:
With Confidential Computing enabled, IPI installation on GCP failed with the "monitoring" cluster operator degraded due to NoPodReady.
Version-Release number of selected component (if applicable):
4.13.0-0.nightly-2023-03-07-131556
How reproducible:
Always
Steps to Reproduce:
1. "create install-config"
2. Edit "install-config.yaml" to insert the Confidential Computing settings, for example:
$ yq-3.3.0 r test3/install-config.yaml platform
gcp:
  projectID: openshift-qe
  region: us-central1
  defaultMachinePlatform:
    confidentialCompute: Enabled
    onHostMaintenance: Terminate
    type: n2d-standard-4
$
3. "create cluster"
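The same Confidential Computing settings can be written into the file non-interactively. A sketch using yq 3.x write syntax (the report uses yq-3.3.0; field paths follow the GCP install-config schema shown above):

```shell
# Set the Confidential Computing fields in install-config.yaml in place
# (yq v3 "w -i" syntax; paths per the platform.gcp section above)
yq w -i install-config.yaml 'platform.gcp.defaultMachinePlatform.confidentialCompute' Enabled
yq w -i install-config.yaml 'platform.gcp.defaultMachinePlatform.onHostMaintenance' Terminate
yq w -i install-config.yaml 'platform.gcp.defaultMachinePlatform.type' n2d-standard-4
```

Note that GCP requires onHostMaintenance: Terminate when Confidential VM is enabled, since Confidential VMs do not support live migration.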
Actual results:
The installation failed, with the cluster operator "monitoring" degraded.
Expected results:
The installation should succeed.
Additional info:
The Prow CI job: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/37012/rehearse-37012-periodic-ci-openshift-verification-tests-master-installer-rehearse-4.13-installer-rehearse-gcp/1633286920605798400
$ ./oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          119m    Unable to apply 4.13.0-0.nightly-2023-03-07-131556: the cluster operator monitoring is not available
$ ./oc get nodes
NAME                                                          STATUS   ROLES                  AGE    VERSION
ci-op-knzzvf4r-47ef3-r9xz4-master-0.c.openshift-qe.internal   Ready    control-plane,master   115m   v1.26.2+bc894ae
ci-op-knzzvf4r-47ef3-r9xz4-master-1.c.openshift-qe.internal   Ready    control-plane,master   115m   v1.26.2+bc894ae
ci-op-knzzvf4r-47ef3-r9xz4-master-2.c.openshift-qe.internal   Ready    control-plane,master   115m   v1.26.2+bc894ae
ci-op-knzzvf4r-47ef3-r9xz4-worker-a-jtv4j                     Ready    worker                 101m   v1.26.2+bc894ae
ci-op-knzzvf4r-47ef3-r9xz4-worker-b-6k98z                     Ready    worker                 101m   v1.26.2+bc894ae
ci-op-knzzvf4r-47ef3-r9xz4-worker-c-h76sp                     Ready    worker                 101m   v1.26.2+bc894ae
$ ./oc get co
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.13.0-0.nightly-2023-03-07-131556   True        False         False      93m
baremetal                                  4.13.0-0.nightly-2023-03-07-131556   True        False         False      110m
cloud-controller-manager                   4.13.0-0.nightly-2023-03-07-131556   True        False         False      111m
cloud-credential                           4.13.0-0.nightly-2023-03-07-131556   True        False         False      118m
cluster-autoscaler                         4.13.0-0.nightly-2023-03-07-131556   True        False         False      109m
config-operator                            4.13.0-0.nightly-2023-03-07-131556   True        False         False      111m
console                                    4.13.0-0.nightly-2023-03-07-131556   True        False         False      97m
control-plane-machine-set                  4.13.0-0.nightly-2023-03-07-131556   True        False         False      108m
csi-snapshot-controller                    4.13.0-0.nightly-2023-03-07-131556   True        False         False      110m
dns                                        4.13.0-0.nightly-2023-03-07-131556   True        False         False      109m
etcd                                       4.13.0-0.nightly-2023-03-07-131556   True        False         False      109m
image-registry                             4.13.0-0.nightly-2023-03-07-131556   True        False         False      100m
ingress                                    4.13.0-0.nightly-2023-03-07-131556   True        False         False      98m
insights                                   4.13.0-0.nightly-2023-03-07-131556   True        False         False      104m
kube-apiserver                             4.13.0-0.nightly-2023-03-07-131556   True        False         False      99m
kube-controller-manager                    4.13.0-0.nightly-2023-03-07-131556   True        False         False      107m
kube-scheduler                             4.13.0-0.nightly-2023-03-07-131556   True        False         False      106m
kube-storage-version-migrator              4.13.0-0.nightly-2023-03-07-131556   True        False         False      111m
machine-api                                4.13.0-0.nightly-2023-03-07-131556   True        False         False      101m
machine-approver                           4.13.0-0.nightly-2023-03-07-131556   True        False         False      110m
machine-config                             4.13.0-0.nightly-2023-03-07-131556   True        False         False      92m
marketplace                                4.13.0-0.nightly-2023-03-07-131556   True        False         False      109m
monitoring                                                                      False       True          True       93m     NoPodReady: shard 0: pod prometheus-k8s-0: containers with incomplete status: [init-config-reloader]...
network                                    4.13.0-0.nightly-2023-03-07-131556   True        False         False      113m
node-tuning                                4.13.0-0.nightly-2023-03-07-131556   True        False         False      109m
openshift-apiserver                        4.13.0-0.nightly-2023-03-07-131556   True        False         False      102m
openshift-controller-manager               4.13.0-0.nightly-2023-03-07-131556   True        False         False      106m
openshift-samples                          4.13.0-0.nightly-2023-03-07-131556   True        False         False      103m
operator-lifecycle-manager                 4.13.0-0.nightly-2023-03-07-131556   True        False         False      110m
operator-lifecycle-manager-catalog         4.13.0-0.nightly-2023-03-07-131556   True        False         False      110m
operator-lifecycle-manager-packageserver   4.13.0-0.nightly-2023-03-07-131556   True        False         False      104m
service-ca                                 4.13.0-0.nightly-2023-03-07-131556   True        False         False      111m
storage                                    4.13.0-0.nightly-2023-03-07-131556   True        False         False      110m
$ ./oc describe co monitoring
Name: monitoring
Namespace:
Labels: <none>
Annotations: include.release.openshift.io/ibm-cloud-managed: true
include.release.openshift.io/self-managed-high-availability: true
include.release.openshift.io/single-node-developer: true
API Version: config.openshift.io/v1
Kind: ClusterOperator
Metadata:
Creation Timestamp: 2023-03-08T02:13:25Z
Generation: 1
Managed Fields:
API Version: config.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:include.release.openshift.io/ibm-cloud-managed:
f:include.release.openshift.io/self-managed-high-availability:
f:include.release.openshift.io/single-node-developer:
f:ownerReferences:
.:
k:{"uid":"cbb4bb0c-d5ec-4d29-ab4b-292d512073c1"}:
f:spec:
Manager: cluster-version-operator
Operation: Update
Time: 2023-03-08T02:13:25Z
API Version: config.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:extension:
f:relatedObjects:
Manager: cluster-version-operator
Operation: Update
Subresource: status
Time: 2023-03-08T02:13:26Z
API Version: config.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
f:conditions:
Manager: operator
Operation: Update
Subresource: status
Time: 2023-03-08T02:39:15Z
Owner References:
API Version: config.openshift.io/v1
Controller: true
Kind: ClusterVersion
Name: version
UID: cbb4bb0c-d5ec-4d29-ab4b-292d512073c1
Resource Version: 31168
UID: 0c45f5ca-5672-420f-8402-e8914ae08f7a
Spec:
Status:
Conditions:
Last Transition Time: 2023-03-08T02:39:15Z
Message: NoPodReady: shard 0: pod prometheus-k8s-0: containers with incomplete status: [init-config-reloader]
shard 0: pod prometheus-k8s-1: containers with incomplete status: [init-config-reloader]
Reason: UpdatingPrometheusK8SFailed
Status: False
Type: Available
Last Transition Time: 2023-03-08T02:39:15Z
Message: NoPodReady: shard 0: pod prometheus-k8s-0: containers with incomplete status: [init-config-reloader]
shard 0: pod prometheus-k8s-1: containers with incomplete status: [init-config-reloader]
Reason: UpdatingPrometheusK8SFailed
Status: True
Type: Degraded
Last Transition Time: 2023-03-08T02:22:47Z
Message: Rolling out the stack.
Reason: RollOutInProgress
Status: True
Type: Progressing
Last Transition Time: 2023-03-08T02:22:47Z
Status: Unknown
Type: Upgradeable
Extension: <nil>
Related Objects:
Group:
Name: openshift-monitoring
Resource: namespaces
Group:
Name: openshift-user-workload-monitoring
Resource: namespaces
Group: monitoring.coreos.com
Name:
Resource: servicemonitors
Group: monitoring.coreos.com
Name:
Resource: podmonitors
Group: monitoring.coreos.com
Name:
Resource: prometheusrules
Group: monitoring.coreos.com
Name:
Resource: alertmanagers
Group: monitoring.coreos.com
Name:
Resource: prometheuses
Group: monitoring.coreos.com
Name:
Resource: thanosrulers
Group: monitoring.coreos.com
Name:
Resource: alertmanagerconfigs
Events: <none>
$ ./oc get pods -n openshift-monitoring
NAME                                                    READY   STATUS     RESTARTS      AGE
alertmanager-main-0                                     6/6     Running    1 (96m ago)   96m
alertmanager-main-1                                     6/6     Running    1 (97m ago)   97m
cluster-monitoring-operator-6fc7495c9f-v26jg            1/1     Running    0             117m
kube-state-metrics-5844868575-88hr4                     3/3     Running    0             103m
node-exporter-52zf5                                     2/2     Running    0             103m
node-exporter-7rfb6                                     2/2     Running    0             103m
node-exporter-8xt24                                     2/2     Running    0             103m
node-exporter-9w4sl                                     2/2     Running    0             103m
node-exporter-bwvj8                                     2/2     Running    0             103m
node-exporter-k9tpb                                     2/2     Running    0             103m
openshift-state-metrics-8666f44bb-vxhwn                 3/3     Running    0             103m
prometheus-adapter-5c57687ddb-fr5rk                     1/1     Running    0             102m
prometheus-adapter-5c57687ddb-kkwjq                     1/1     Running    0             102m
prometheus-k8s-0                                        0/6     Init:0/1   0             102m
prometheus-k8s-1                                        0/6     Init:0/1   0             96m
prometheus-operator-58b5f659fd-hptbn                    2/2     Running    0             104m
prometheus-operator-admission-webhook-6c7b57b57-v4cxm   1/1     Running    0             114m
prometheus-operator-admission-webhook-6c7b57b57-wf6rs   1/1     Running    0             114m
telemeter-client-6bc4594c4-6g6l4                        3/3     Running    0             102m
thanos-querier-6d47f89d57-69925                         6/6     Running    0             102m
thanos-querier-6d47f89d57-dcbgp                         6/6     Running    0             102m
$ ./oc logs prometheus-k8s-0 -n openshift-monitoring
Error from server (BadRequest): container "prometheus" in pod "prometheus-k8s-0" is waiting to start: PodInitializing
$ ./oc logs prometheus-k8s-1 -n openshift-monitoring
Error from server (BadRequest): container "prometheus" in pod "prometheus-k8s-1" is waiting to start: PodInitializing
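The BadRequest errors above are expected while the pod is still in PodInitializing: `oc logs` defaults to the main "prometheus" container, which has not started. To see why the pods are stuck, the init container's own log and the pod events can be pulled instead (standard oc options; the container name comes from the NoPodReady message above):

```shell
# Log of the stuck init container named in the NoPodReady condition
oc logs prometheus-k8s-0 -n openshift-monitoring -c init-config-reloader

# Pod events, e.g. volume attach/mount failures that keep the init step pending
oc describe pod prometheus-k8s-0 -n openshift-monitoring
```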
$
duplicates:
- OCPBUGS-7582 RHCOS misses udev rules for GCE PD NVMe disks (Closed)

is related to:
- CORS-2553 CI Integration (Closed)
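Given the duplicate (OCPBUGS-7582, missing udev rules for GCE PD NVMe disks), one node-level check is whether the expected /dev/disk/by-id symlinks for the NVMe disks exist on an affected worker. A diagnostic sketch, not from the original report (node name taken from the `oc get nodes` output above):

```shell
# Inspect disk symlinks on the node host; missing google-* NVMe entries
# would point at the udev-rules gap described in OCPBUGS-7582
oc debug node/ci-op-knzzvf4r-47ef3-r9xz4-worker-a-jtv4j -- chroot /host ls -l /dev/disk/by-id/
```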