- Bug
- Resolution: Unresolved
- Major
- None
- 7.6.12
- None
- False
- False
After updating to 'rhsso-operator.7.6.12-opr-001', the following TargetDown alert is reported:
- alerts:
  - activeAt: "2025-08-21T14:07:13.890274457Z"
    annotations:
      description: 100% of the rhsso-operator-metrics/rhsso-operator-metrics targets in nn-rhsso namespace have been unreachable for more than 15 minutes. This may be a symptom of network connectivity issues, down nodes, or failures within these components. Assess the health of the infrastructure and nodes running these targets and then contact support.
      runbook_url: https://github.com/openshift/runbooks/blob/master/alerts/cluster-monitoring-operator/TargetDown.md
      summary: Some targets were not reachable from the monitoring server for an extended period of time.
    labels:
      alertname: TargetDown
      job: rhsso-operator-metrics
      namespace: nn-rhsso
      service: rhsso-operator-metrics
      severity: warning
    state: firing
    value: "1e+02"
  annotations:
    description: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.service }} targets in {{ $labels.namespace }} namespace have been unreachable for more than 15 minutes. This may be a symptom of network connectivity issues, down nodes, or failures within these components. Assess the health of the infrastructure and nodes running these targets and then contact support.'
    runbook_url: https://github.com/openshift/runbooks/blob/master/alerts/cluster-monitoring-operator/TargetDown.md
    summary: Some targets were not reachable from the monitoring server for an extended period of time.
  duration: 900
  evaluationTime: 0.014825242
  health: ok
  keepFiringFor: 0
  labels:
    severity: warning
  lastEvaluation: "2025-08-26T12:19:43.89164352Z"
  name: TargetDown
  query: 100 * ((1 - sum by (job, namespace, service) (up and on (namespace, pod) kube_pod_info) / count by (job, namespace, service) (up and on (namespace, pod) kube_pod_info)) or (count by (job, namespace, service) (up == 0) / count by (job, namespace, service) (up))) > 10
  state: firing
  type: alerting
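For reference, the reported value of "1e+02" is consistent with every scrape target of the job being down. The following is only a rough sketch of the first branch of the rule's query (100 * (1 - sum(up) / count(up))), not the Prometheus evaluation itself; the function name and the number of targets in the sample are assumptions made for illustration.

```python
THRESHOLD = 10  # taken from the "> 10" at the end of the rule's query


def target_down_percent(up_samples):
    """Percentage of unreachable targets, mirroring 100 * (1 - sum(up) / count(up)).

    up_samples holds one value per scrape target: 1 if the scrape succeeded, 0 if not.
    """
    if not up_samples:
        raise ValueError("at least one target is required")
    return 100 * (1 - sum(up_samples) / len(up_samples))


# Hypothetical sample: two metrics targets, both failing to scrape (up == 0).
percent = target_down_percent([0, 0])
print(percent, percent > THRESHOLD)
```

With all targets down the result is 100, which exceeds the threshold of 10, so the alert fires once the condition has held for the 900 second duration.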
The problem is that the Service and the ServiceMonitor are configured with ports tcp/8383 and tcp/8686, but the rhsso-operator only listens on tcp/8081:
- service
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2025-09-03T15:38:19Z"
  labels:
    monitoring-key: middleware
    name: rhsso-operator
  name: rhsso-operator-metrics
  namespace: <namespace>
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: rhsso-operator
    uid: 9b0716b2-5477-4527-8b7b-3a1abfbd306c
  resourceVersion: "1154230"
  uid: cd447a64-7e8a-4dcd-ae42-091d9ed254d6
spec:
  clusterIP: 172.30.73.224
  clusterIPs:
  - 172.30.73.224
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http-metrics
    port: 8383
    protocol: TCP
    targetPort: 8383
  - name: cr-metrics
    port: 8686
    protocol: TCP
    targetPort: 8686
  selector:
    name: rhsso-operator
  sessionAffinity: None
  type: ClusterIP
- operator listening ports:
oc debug node/<node>
sh-5.1# chroot /host
sh-5.1# NS=<namespace>
sh-5.1# POD=<rhsso-operator-pod-name>
sh-5.1# POD_ID=$( crictl pods --namespace=$NS --name=$POD -o json | jq -r '.items[].id' )
sh-5.1# crictl inspectp $POD_ID | jq -r '.info.runtimeSpec.linux.namespaces[] | select( .type=="network" ) | .path'
/var/run/netns/d900d288-f7c9-4ab0-b584-4e92928a3c43
sh-5.1# nsenter --net=/var/run/netns/d900d288-f7c9-4ab0-b584-4e92928a3c43 ss -tulpn
Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp   LISTEN 0      4096   *:8081             *:*               users:(("manager",pid=3503301,fd=3))
- On previous versions, for example 'rhsso-operator.7.6.11-opr-006', the rhsso-operator listens on tcp/8383 and tcp/8686, the ports defined in the Service and ServiceMonitor:
NS=<namespace>
POD=<rhsso-operator-pod-name>
sh-5.1# POD_ID=$( crictl pods --namespace=$NS --name=$POD -o json | jq -r '.items[].id' )
sh-5.1# crictl inspectp $POD_ID | jq -r '.info.runtimeSpec.linux.namespaces[] | select( .type=="network" ) | .path'
/var/run/netns/c486e14c-3c50-4a93-b541-6679a12e92c1
sh-5.1# nsenter --net=/var/run/netns/c486e14c-3c50-4a93-b541-6679a12e92c1 ss -tulpn
Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp   LISTEN 0      4096   *:8383             *:*               users:(("keycloak-operat",pid=3482046,fd=5))
tcp   LISTEN 0      4096   *:8686             *:*               users:(("keycloak-operat",pid=3482046,fd=6))
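The mismatch can also be checked mechanically by comparing the Service's targetPort values against the ports found in the `ss -tulpn` output. The snippet below is a minimal sketch under stated assumptions, not part of the operator or any Red Hat tooling: the helper names are hypothetical, the sample text is transcribed from the two listings above, and the parser assumes the whitespace-separated column layout shown there.

```python
def listening_ports(ss_output):
    """Return the set of local TCP ports in LISTEN state from `ss -tulpn` text."""
    ports = set()
    for line in ss_output.splitlines():
        fields = line.split()
        # Assumed columns: Netid, State, Recv-Q, Send-Q, Local Address:Port, ...
        if len(fields) < 5 or fields[0] != "tcp" or fields[1] != "LISTEN":
            continue  # skips the header row and any non-listening or UDP sockets
        port = fields[4].rsplit(":", 1)[-1]
        if port.isdigit():
            ports.add(int(port))
    return ports


def unreachable_target_ports(target_ports, ss_output):
    """Return the target ports that nothing in the pod is listening on, sorted."""
    return sorted(set(target_ports) - listening_ports(ss_output))


# targetPort values from the rhsso-operator-metrics Service.
SERVICE_TARGET_PORTS = [8383, 8686]

# Transcribed from the 7.6.12-opr-001 listing.
SS_7_6_12 = """\
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp LISTEN 0 4096 *:8081 *:* users:(("manager",pid=3503301,fd=3))
"""

# Transcribed from the 7.6.11-opr-006 listing.
SS_7_6_11 = """\
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp LISTEN 0 4096 *:8383 *:* users:(("keycloak-operat",pid=3482046,fd=5))
tcp LISTEN 0 4096 *:8686 *:* users:(("keycloak-operat",pid=3482046,fd=6))
"""

print("7.6.12:", unreachable_target_ports(SERVICE_TARGET_PORTS, SS_7_6_12))
print("7.6.11:", unreachable_target_ports(SERVICE_TARGET_PORTS, SS_7_6_11))
```

For the 7.6.12 listing both Service ports come back as unreachable, while for 7.6.11 the list is empty, which matches the point at which the TargetDown alert started firing.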
Thank you!