  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-10038

throw events, add status to OpenStackControlPlane when something's wrong with DNS

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • infra-operator
    • Moderate

      I already created a similar ticket for Telemetry. It is actually the same request for all components, but to illustrate my point:

      [root@jumphost 05_control_plane]# oc get openstackcontrolplane
      NAME                      STATUS   MESSAGE
      openstack-control-plane   False    OpenStackControlPlane DNS in progress
      [root@jumphost 05_control_plane]# oc get events
      LAST SEEN   TYPE     REASON    OBJECT             MESSAGE
      6m22s       Normal   Killing   pod/ceilometer-0   Stopping container ceilometer-central-agent
      6m22s       Normal   Killing   pod/ceilometer-0   Stopping container sg-core
      6m22s       Normal   Killing   pod/ceilometer-0   Stopping container ceilometer-notification-agent
      6m22s       Normal   Killing   pod/ceilometer-0   Stopping container proxy-httpd
      [root@jumphost 05_control_plane]# 
      
      # oc describe openstackcontrolplane | less
      (...)
        Dns:
          Enabled:  true
          Template:
            Container Image:                registry.redhat.io/rhoso/openstack-neutron-server-rhel9@sha256:3c49822b33a4d9b05ee9946ca92923d6697c7c66787a02c69a8420ba2de94778
            Dns Data Label Selector Value:  dnsdata
            Options:
              Key:  server
              Values:
                8.8.8.8
              Key:  server
              Values:
                8.8.4.4
            Override:
              Service:
                Metadata:
                  Annotations:
                    metallb.universe.tf/address-pool:     ctlplane
                    metallb.universe.tf/allow-shared-ip:  ctlplane
                    metallb.universe.tf/loadBalancerIPs:  172.20.1.80
                Spec:
                  Type:  LoadBalancer
            Replicas:    2
      (...)
          Message:               OpenStackControlPlane DNS in progress
          Reason:                Requested
          Severity:              Info
          Status:                False
          Type:                  Ready
          Last Transition Time:  2024-09-08T16:34:34Z
      (...)
      
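To look at only the conditions rather than paging through the full `oc describe` output, a JSONPath query along these lines may help. This is a minimal sketch: the helper name `show_cr_conditions` is hypothetical, and it assumes the CR exposes a `status.conditions` list with `type`, `status`, `reason` and `message` fields, as the output above suggests.

```shell
# Hypothetical helper: print one line per condition (type, status, reason, message)
# for every OpenStackControlPlane returned by the list call.
# Extra arguments (for example -n <namespace>) are passed through to oc.
# Do not pass a resource name: the query expects a list with an .items field.
show_cr_conditions() {
    oc get openstackcontrolplane "$@" \
        -o jsonpath='{range .items[*]}{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}{end}'
}
```

Calling `show_cr_conditions` in the affected namespace prints the same condition fields as above, one condition per line. It still only shows what the operator chose to write into the CR, which is the gap this ticket is about.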

      As I mentioned in the other ticket, it is an anti-pattern in OpenShift to report issues only as errors in pod logs, and here is why:
      As a user of the OpenStack operator, I have no idea which of the many operator pods is actually responsible for driving the DNS progress forward. So I have to do something as absurd as:

      # oc get pods -n openstack-operators -o name | while read p; do echo "=== $p ==="; oc logs -n openstack-operators $p | grep -i dns; done | less
      

      ... in order to figure out that the infra-operator-controller-manager is responsible for this. Please make sure to update the OpenStackControlPlane CR with the actual status and failures, and please emit events as well.
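Until the CR and events carry this information, the lookup can at least be narrowed. The sketch below is not part of the operator: `pods_logging` is a hypothetical helper that prints only the pods whose logs match a pattern, and the `oc get events` filter in the comment is the query that would become useful once events are emitted for the OpenStackControlPlane object.

```shell
# Hypothetical helper: print the names of pods in a namespace whose logs
# contain a case-insensitive match for the given pattern.
#   $1 = namespace, $2 = pattern
pods_logging() {
    ns="$1"
    pattern="$2"
    oc get pods -n "$ns" -o name | while read -r p; do
        # </dev/null keeps oc from consuming the pod list on stdin
        if oc logs -n "$ns" "$p" </dev/null 2>/dev/null | grep -qi -- "$pattern"; then
            echo "$p"
        fi
    done
}

# Usage:
#   pods_logging openstack-operators dns
#
# Events recorded against the CR itself (none appear in the output above):
#   oc get events --field-selector involvedObject.kind=OpenStackControlPlane
```

This only shortens the search. It does not replace a meaningful status message and events on the CR.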

      Thanks!

              Assignee: Unassigned
              Reporter: Andreas Karis (akaris@redhat.com)
              rhos-dfg-ospk8s