  1. OpenShift Bugs
  2. OCPBUGS-18769

Documentation for Autoscaling an Ingress Controller is wrong


    • Important
    • No
    • False
    • N/A
    • Release Note Not Required

      Description of problem:

      The documentation at https://docs.openshift.com/container-platform/4.13/networking/ingress-operator.html#nw-autoscaling-ingress-controller_configuring-ingress is misleading and partially wrong.
      
      In Prerequisites, the item "You have the Custom Metrics Autoscaler Operator installed" should link to https://docs.openshift.com/container-platform/4.13/nodes/cma/nodes-cma-autoscaling-custom-install.html (or a similar page) so that customers can see how the Custom Metrics Autoscaler must be installed and configured before they start this procedure.
      
      Step 1, "Create a project in the openshift-ingress-operator namespace by running the following command:", is wrong. The step does not create that project (it exists by default); it switches to it. The command itself is correct, but the description of the step is not.
      
      Step 2, "Enable OpenShift monitoring for user-defined projects by creating and applying a config map", is not needed: the procedure also works without monitoring for user-defined projects being enabled, as the following output from a cluster without that setting shows.
      
      > $ oc get clusterversion
      > NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      > version   4.13.11   True        False         69m     Cluster version is 4.13.11
      
      > $ oc describe cm cluster-monitoring-config -n openshift-monitoring
      > Name:         cluster-monitoring-config
      > Namespace:    openshift-monitoring
      > Labels:       <none>
      > Annotations:  <none>
      > 
      > Data
      > ====
      > config.yaml:
      > ----
      > prometheusK8s:
      > 
      >   volumeClaimTemplate:
      >     metadata:
      >       name: prometheus-data
      >       annotations:
      >         openshift.io/cluster-monitoring-drop-pvc: "yes"
      >     spec:
      >       resources:
      >         requests:
      >           storage: 20Gi
      > 
      > BinaryData
      > ====
      > 
      > Events:  <none>
      
      > $ oc get kedacontroller -n openshift-keda keda -o yaml
      > apiVersion: keda.sh/v1alpha1
      > kind: KedaController
      > metadata:
      >   creationTimestamp: "2023-09-07T13:49:38Z"
      >   finalizers:
      >   - finalizer.kedacontroller.keda.sh
      >   generation: 1
      >   name: keda
      >   namespace: openshift-keda
      >   resourceVersion: "32960"
      >   uid: 61d0709b-b8ee-417d-bba0-0944cc863064
      > spec:
      >   admissionWebhooks:
      >     logEncoder: console
      >     logLevel: info
      >   metricsServer:
      >     logLevel: "0"
      >   operator:
      >     logEncoder: console
      >     logLevel: info
      >   watchNamespace: ""
      > status:
      >   phase: Installation Succeeded
      >   reason: KEDA v2.10.1 is installed in namespace 'openshift-keda'
      >   version: 2.10.1
      
      > $ oc get scaledobject -o yaml
      > apiVersion: v1
      > items:
      > - apiVersion: keda.sh/v1alpha1
      >   kind: ScaledObject
      >   metadata:
      [...]
      >     labels:
      >       scaledobject.keda.sh/name: ingress-scaler
      >     name: ingress-scaler
      >     namespace: openshift-ingress-operator
      >     resourceVersion: "61620"
      >     uid: dcc58b7f-cba6-44e0-8046-49406a009fc0
      >   spec:
      >     cooldownPeriod: 1
      >     maxReplicaCount: 20
      >     minReplicaCount: 1
      >     pollingInterval: 1
      >     scaleTargetRef:
      >       apiVersion: operator.openshift.io/v1
      >       envSourceContainerName: ingress-operator
      >       kind: IngressController
      >       name: default
      >     triggers:
      >     - authenticationRef:
      >         name: keda-trigger-auth-prometheus
      >       metadata:
      >         authModes: bearer
      >         metricName: kube-node-role
      >         namespace: openshift-ingress-operator
      >         query: sum(kube_node_role{role="worker",service="kube-state-metrics"})
      >         serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091
      >         threshold: "1"
      >       metricType: AverageValue
      >       type: prometheus
      >   status:
      >     conditions:
      >     - message: ScaledObject is defined correctly and is ready for scaling
      >       reason: ScaledObjectReady
      >       status: "True"
      >       type: Ready
      >     - message: Scaling is performed because triggers are active
      >       reason: ScalerActive
      >       status: "True"
      >       type: Active
      >     - message: No fallbacks are active on this scaled object
      >       reason: NoFallbackFound
      >       status: "False"
      >       type: Fallback
      >     externalMetricNames:
      >     - s0-prometheus-kube-node-role
      >     health:
      >       s0-prometheus-kube-node-role:
      >         numberOfFailures: 0
      >         status: Happy
      >     hpaName: keda-hpa-ingress-scaler
      >     lastActiveTime: "2023-09-07T14:51:42Z"
      >     originalReplicaCount: 2
      >     scaleTargetGVKR:
      >       group: operator.openshift.io
      >       kind: IngressController
      >       resource: ingresscontrollers
      >       version: v1
      >     scaleTargetKind: operator.openshift.io/v1.IngressController
      > kind: List
      > metadata:
      >   resourceVersion: ""
      
      > $ oc get pod -n openshift-ingress
      > NAME                              READY   STATUS    RESTARTS   AGE
      > router-default-69dc7ff9f9-mltsc   1/1     Running   0          82m
      > router-default-69dc7ff9f9-mqdzg   1/1     Running   0          82m
      > router-default-69dc7ff9f9-qt7ph   1/1     Running   0          51m
      
      > $ oc get nodes
      > NAME                                         STATUS   ROLES                  AGE   VERSION
      > ip-10-0-133-134.us-west-2.compute.internal   Ready    control-plane,master   88m   v1.26.7+0ef5eae
      > ip-10-0-165-185.us-west-2.compute.internal   Ready    worker                 76m   v1.26.7+0ef5eae
      > ip-10-0-170-80.us-west-2.compute.internal    Ready    control-plane,master   88m   v1.26.7+0ef5eae
      > ip-10-0-186-178.us-west-2.compute.internal   Ready    worker                 81m   v1.26.7+0ef5eae
      > ip-10-0-196-253.us-west-2.compute.internal   Ready    worker                 75s   v1.26.7+0ef5eae
      > ip-10-0-221-152.us-west-2.compute.internal   Ready    worker                 85s   v1.26.7+0ef5eae
      > ip-10-0-226-64.us-west-2.compute.internal    Ready    worker                 81m   v1.26.7+0ef5eae
      > ip-10-0-249-220.us-west-2.compute.internal   Ready    control-plane,master   88m   v1.26.7+0ef5eae
      
      > $ oc get pod -n openshift-ingress
      > NAME                              READY   STATUS    RESTARTS   AGE
      > router-default-69dc7ff9f9-mltsc   1/1     Running   0          85m
      > router-default-69dc7ff9f9-mmm7l   1/1     Running   0          88s
      > router-default-69dc7ff9f9-mqdzg   1/1     Running   0          85m
      > router-default-69dc7ff9f9-nf269   1/1     Running   0          58s
      > router-default-69dc7ff9f9-qt7ph   1/1     Running   0          53m
      
      Step 6, "$ oc adm policy add-role-to-user thanos-metrics-reader -z thanos --role=namespace=openshift-ingress-operator", is wrong and does not work because the option name is misspelled. The command should be "$ oc adm policy add-role-to-user thanos-metrics-reader -z thanos --role-namespace=openshift-ingress-operator".
      
      Step 7, "Create a new ScaledObject YAML file, ingress-autoscaler.yaml, that targets the default Ingress Controller deployment": the "serverAddress" value is not the cluster address and port but the "thanos-querier" service endpoint in the openshift-monitoring namespace, that is, "https://thanos-querier.openshift-monitoring.svc.cluster.local:9091" as shown above. Because this endpoint is the same on every OpenShift Container Platform 4 cluster, the example can state it directly.
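
      As a minimal sketch of what a corrected Step 7 example could look like, the following script writes ingress-autoscaler.yaml with the in-cluster Thanos Querier endpoint filled in. All field values are copied from the working ScaledObject output earlier in this report; the field layout is an assumption based on that output, not the wording of the official documentation.

```shell
#!/bin/sh
# Sketch only: write a ScaledObject manifest that uses the thanos-querier
# service endpoint as serverAddress. Values mirror the working object
# shown in this report (name, thresholds, query, TriggerAuthentication).
cat > ingress-autoscaler.yaml <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ingress-scaler
  namespace: openshift-ingress-operator
spec:
  scaleTargetRef:
    apiVersion: operator.openshift.io/v1
    kind: IngressController
    name: default
    envSourceContainerName: ingress-operator
  minReplicaCount: 1
  maxReplicaCount: 20
  cooldownPeriod: 1
  pollingInterval: 1
  triggers:
  - type: prometheus
    metricType: AverageValue
    metadata:
      serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091
      namespace: openshift-ingress-operator
      metricName: kube-node-role
      threshold: "1"
      query: sum(kube_node_role{role="worker",service="kube-state-metrics"})
      authModes: bearer
    authenticationRef:
      name: keda-trigger-auth-prometheus
EOF

# The script does not touch the cluster; it only prints the next step.
echo "Next: oc apply -f ingress-autoscaler.yaml"
```

      The heredoc is quoted ('EOF') so the shell does not expand anything inside the manifest, and the script only writes a local file, so it can be run without cluster access.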
      
      

      Version-Release number of selected component (if applicable):

       - OpenShift Container Platform 4.13 (but all versions are affected)
      
      

      How reproducible:

       - Always
      
      

      Steps to Reproduce:

      1. See above
      
      

      Actual results:

      The current documentation is misleading, and following it produces a configuration that does not work, specifically because the documented "serverAddress" is wrong. It also enables user workload monitoring, which is not required.
      
      

      Expected results:

      Fix the documentation so that the documented procedure works and does not enable components that are not required.
      
      

      Additional info:

       - N/A
      

              rhn-support-jdohmann Jesse Dohmann
              rhn-support-sreber Simon Reber
              Melvin Joseph Melvin Joseph