-
Bug
-
Resolution: Won't Do
-
Major
-
None
-
7.6.12
-
None
-
False
-
-
False
-
-
After updating to 'rhsso-operator.7.6.12-opr-001', the following alert is reported:
- alerts:
  - activeAt: "2025-08-21T14:07:13.890274457Z"
    annotations:
      description: 100% of the rhsso-operator-metrics/rhsso-operator-metrics targets in nn-rhsso namespace have been unreachable for more than 15 minutes. This may be a symptom of network connectivity issues, down nodes, or failures within these components. Assess the health of the infrastructure and nodes running these targets and then contact support.
      runbook_url: https://github.com/openshift/runbooks/blob/master/alerts/cluster-monitoring-operator/TargetDown.md
      summary: Some targets were not reachable from the monitoring server for an extended period of time.
    labels:
      alertname: TargetDown
      job: rhsso-operator-metrics
      namespace: nn-rhsso
      service: rhsso-operator-metrics
      severity: warning
    state: firing
    value: "1e+02"
  annotations:
    description: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.service }} targets in {{ $labels.namespace }} namespace have been unreachable for more than 15 minutes. This may be a symptom of network connectivity issues, down nodes, or failures within these components. Assess the health of the infrastructure and nodes running these targets and then contact support.'
    runbook_url: https://github.com/openshift/runbooks/blob/master/alerts/cluster-monitoring-operator/TargetDown.md
    summary: Some targets were not reachable from the monitoring server for an extended period of time.
  duration: 900
  evaluationTime: 0.014825242
  health: ok
  keepFiringFor: 0
  labels:
    severity: warning
  lastEvaluation: "2025-08-26T12:19:43.89164352Z"
  name: TargetDown
  query: 100 * ((1 - sum by (job, namespace, service) (up and on (namespace, pod) kube_pod_info) / count by (job, namespace, service) (up and on (namespace, pod) kube_pod_info)) or (count by (job, namespace, service) (up == 0) / count by (job, namespace, service) (up))) > 10
  state: firing
  type: alerting
The problem is that the Service and the ServiceMonitor have ports tcp/8383 and tcp/8686 configured, but the rhsso-operator only listens on tcp/8081:
- service:
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2025-09-03T15:38:19Z"
  labels:
    monitoring-key: middleware
    name: rhsso-operator
  name: rhsso-operator-metrics
  namespace: <namespace>
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: rhsso-operator
    uid: 9b0716b2-5477-4527-8b7b-3a1abfbd306c
  resourceVersion: "1154230"
  uid: cd447a64-7e8a-4dcd-ae42-091d9ed254d6
spec:
  clusterIP: 172.30.73.224
  clusterIPs:
  - 172.30.73.224
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http-metrics
    port: 8383
    protocol: TCP
    targetPort: 8383
  - name: cr-metrics
    port: 8686
    protocol: TCP
    targetPort: 8686
  selector:
    name: rhsso-operator
  sessionAffinity: None
  type: ClusterIP
- operator listening ports:
oc debug node/<node>
sh-5.1# chroot /host
sh-5.1# NS=<namespace>
sh-5.1# POD=<rhsso-operator-pod-name>
sh-5.1# POD_ID=$( crictl pods --namespace=$NS --name=$POD -o json | jq -r '.items[].id' )
sh-5.1# crictl inspectp $POD_ID | jq -r '.info.runtimeSpec.linux.namespaces[] | select( .type=="network" ) | .path'
/var/run/netns/d900d288-f7c9-4ab0-b584-4e92928a3c43
sh-5.1# nsenter --net=/var/run/netns/d900d288-f7c9-4ab0-b584-4e92928a3c43 ss -tulpn
Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp   LISTEN 0      4096               *:8081            *:*     users:(("manager",pid=3503301,fd=3))
- on previous versions, for example 'rhsso-operator.7.6.11-opr-006', the rhsso-operator listens on tcp/8383 and tcp/8686, which are the ports defined in the Service and the ServiceMonitor:
NS=<namespace>
POD=<rhsso-operator-pod-name>
sh-5.1# POD_ID=$( crictl pods --namespace=$NS --name=$POD -o json | jq -r '.items[].id' )
sh-5.1# crictl inspectp $POD_ID | jq -r '.info.runtimeSpec.linux.namespaces[] | select( .type=="network" ) | .path'
/var/run/netns/c486e14c-3c50-4a93-b541-6679a12e92c1
sh-5.1# nsenter --net=/var/run/netns/c486e14c-3c50-4a93-b541-6679a12e92c1 ss -tulpn
Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp   LISTEN 0      4096               *:8383            *:*     users:(("keycloak-operat",pid=3482046,fd=5))
tcp   LISTEN 0      4096               *:8686            *:*     users:(("keycloak-operat",pid=3482046,fd=6))
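The comparison between the Service targetPorts and the ports the operator actually listens on can also be done mechanically. The following is only a minimal sketch, assuming the `ss -tulpn` output has been captured as text. The helper names `listening_ports` and `missing_ports` are hypothetical and not part of any Red Hat tooling, and the sample input is the 7.6.12 `ss` line from this report.

```shell
#!/bin/sh
# Sketch: report Service targetPorts that nothing in the pod is listening on.
# Helper names are illustrative assumptions, not an existing tool.

# Read `ss -tulpn` output on stdin and print the local port of each
# TCP LISTEN socket, one per line, sorted and de-duplicated.
listening_ports() {
  awk '$1 == "tcp" && $2 == "LISTEN" { n = split($5, a, ":"); print a[n] }' | sort -n -u
}

# $1: newline-separated list of listening ports.
# Remaining arguments: expected ports (the Service targetPorts).
# Prints every expected port that is not in the listening list.
missing_ports() {
  mp_listening=$1
  shift
  for mp_port in "$@"; do
    printf '%s\n' "$mp_listening" | grep -qx -- "$mp_port" || echo "$mp_port"
  done
}

# Sample input: the socket reported for rhsso-operator.7.6.12-opr-001.
ss_output='tcp LISTEN 0 4096 *:8081 *:* users:(("manager",pid=3503301,fd=3))'

listening=$(printf '%s\n' "$ss_output" | listening_ports)

# 8383 and 8686 are the targetPorts of the rhsso-operator-metrics Service.
missing_ports "$listening" 8383 8686
```

With the 7.6.12 sample both 8383 and 8686 are printed as missing. Feeding the 7.6.11 `ss` output through the same helpers prints nothing, which matches the observation that the alert only appears after the update.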
Thank you!