OpenShift Service Mesh / OSSM-4973

Detect oauth-proxy image after operator bootstrap


    • Type: Ticket
    • Resolution: Done
    • Priority: Major
    • OSSM 2.4.3
    • Maistra

      Description of problem:

      In disconnected environments certain conditions cause the updateOauthProxyConfig [1] function to fail at startup, e.g.:

      • Shortly after an SNO node reboot, the Image API may not be available yet, preventing the function from reading the dockerImageReference from the ImageStream.
      • The oauth-proxy ImageStream's dockerImageReference is empty due to a (temporary) pull failure during operator startup.

      Once the operator has started [2], the detection never runs again. As a result the managed prometheus pod remains in ImagePullBackOff since it cannot pull the oauth-proxy image.

      [1] https://github.com/maistra/istio-operator/blob/eb3f6c8b4ccdfedc12f16a284477e971f5872530/pkg/controller/versions/template.go#L179C13-L179C13

       [2] https://github.com/maistra/istio-operator/blob/eb3f6c8b4ccdfedc12f16a284477e971f5872530/pkg/controller/servicemesh/controlplane/reconciler.go#L150
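
The two failure conditions above can be illustrated with a minimal Go sketch. This is not the operator's actual code: the types TagEvent and NamedTagEventList are trimmed-down, hypothetical stand-ins for the status.tags[].items[] entries of an ImageStream, and resolveImage and its fetch parameter are names invented for this illustration. The sketch only shows that both an unavailable Image API and an empty dockerImageReference end in an error, which a startup-only caller never gets a second chance to recover from.

```go
package main

import (
	"errors"
	"fmt"
)

// TagEvent and NamedTagEventList are hypothetical, reduced stand-ins for the
// status.tags[].items[] entries of an ImageStream. Only the fields needed
// for this illustration are kept.
type TagEvent struct {
	DockerImageReference string
}

type NamedTagEventList struct {
	Tag   string
	Items []TagEvent
}

// ErrImageNotResolved signals that the tag exists but carries no usable
// image reference yet (for example after a temporary pull failure).
var ErrImageNotResolved = errors.New("image reference not resolved yet")

// resolveImage returns the first dockerImageReference recorded for tag.
// fetch stands in for the Image API call; an error from fetch models the
// API not being available yet.
func resolveImage(fetch func() ([]NamedTagEventList, error), tag string) (string, error) {
	tags, err := fetch()
	if err != nil {
		return "", fmt.Errorf("reading ImageStream: %w", err)
	}
	for _, t := range tags {
		if t.Tag != tag {
			continue
		}
		if len(t.Items) == 0 || t.Items[0].DockerImageReference == "" {
			return "", ErrImageNotResolved
		}
		return t.Items[0].DockerImageReference, nil
	}
	return "", fmt.Errorf("tag %q not found", tag)
}
```

If a function like this is called exactly once during operator startup, either error leaves the default image reference in place for the lifetime of the process.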

      Version-Release number of selected component (if applicable):

      • OCP 4.12.32
      • OSSM 2.4.3-0

       

      How reproducible:

      100% on disconnected environment

       

      Steps to Reproduce:

      $ curl -L https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.12.34/oc-mirror.tar.gz|tar -xzf - 
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100 52.9M  100 52.9M    0     0  19.0M      0  0:00:02  0:00:02 --:--:-- 19.0M
      $ chmod +x ./oc-mirror 
      $ ./oc-mirror version
      Client Version: version.Info{Major:"", Minor:"", GitVersion:"4.12.0-202308291001.p0.g3ac49d9.assembly.stream-3ac49d9", GitCommit:"3ac49d9bd7c2193ede794e328dfa1142d7735f2e", GitTreeState:"clean", BuildDate:"2023-08-29T13:29:58Z", GoVersion:"go1.19.10 X:strictfipsruntime", Compiler:"gc", Platform:"linux/amd64"}
      $ cat > ossm-is.yaml <<EOF
      kind: ImageSetConfiguration
      apiVersion: mirror.openshift.io/v1alpha2
      archiveSize: 4                                                      
      storageConfig:
        registry:
          imageURL: mirror.local/mirror/oc-mirror-metadata
          skipTLS: true
      mirror:
        platform:
          channels:
          - name: stable-4.12
            minVersion: 4.12.30
            type: ocp
          graph: true
        operators:
        - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.12
          packages:
          - name: servicemeshoperator
            channels:
            - name: stable
          - name: jaeger-product
            channels:
            - name: stable
          - name: kiali-ossm
            channels:
            - name: stable
          - name: elasticsearch-operator
            channels:
            - name: stable
      EOF
      $ ./oc-mirror --dest-skip-tls --config=./ossm-is.yaml docker://mirror.local/ossm-ocp-412
      <...>
      info: Mirroring completed in 7m0.03s (149.2MB/s)
      Rendering catalog image "mirror.local/ossm-ocp-412/redhat/redhat-operator-index:v4.12" with file-based catalog 
      Writing image mapping to oc-mirror-workspace/results-1695653324/mapping.txt
      Writing UpdateService manifests to oc-mirror-workspace/results-1695653324
      Writing CatalogSource manifests to oc-mirror-workspace/results-1695653324
      Writing ICSP manifests to oc-mirror-workspace/results-1695653324
      # installed a cluster with the imageContentSources
      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.12.32   True        False         6m43s   Cluster version is 4.12.32
      $ oc apply -f oc-mirror-workspace/results-1695653324/catalogSource-redhat-operator-index.yaml
      catalogsource.operators.coreos.com/redhat-operator-index created                                                       
      $ oc patch OperatorHub cluster --type json -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
      operatorhub.config.openshift.io/cluster patched                                                                        
      # installing all 4 operators through console
      $ oc get csv -n openshift-operators
      NAME                            DISPLAY                                          VERSION    REPLACES                                   PHASE
      elasticsearch-operator.v5.7.6   OpenShift Elasticsearch Operator                 5.7.6      elasticsearch-operator.v5.7.5              Succeeded
      jaeger-operator.v1.47.0-2       Red Hat OpenShift distributed tracing platform   1.47.0-2   jaeger-operator.v1.42.0-5-0.1687199951.p   Succeeded
      kiali-operator.v1.65.8          Kiali Operator                                   1.65.8     kiali-operator.v1.65.7                     Succeeded
      servicemeshoperator.v2.4.3      Red Hat OpenShift Service Mesh                   2.4.3-0    servicemeshoperator.v2.4.2                 Succeeded
      # create ServiceMeshControlPlanes through console
      $ oc new-project test-ossm
      $ oc apply -f - <<EOF
      apiVersion: maistra.io/v1
      kind: ServiceMeshMemberRoll
      metadata:
        name: default
        namespace: istio-system
      spec:
        members:
          # a list of projects joined into the service mesh
          - test-ossm
      EOF
      servicemeshmemberroll.maistra.io/default created
      $ oc get pods -n istio-system 
      NAME                            READY   STATUS             RESTARTS   AGE
      istiod-basic-74468b6865-bcqnt   1/1     Running            0          5m38s
      prometheus-54bddd4b76-pstrl     2/3     ImagePullBackOff   0          5m26s
      $ oc get deployment -n istio-system prometheus -o go-template='{{range .spec.template.spec.containers}}{{.name}}{{": "}}{{.image}}{{"\n"}}{{end}}'
      prometheus-proxy: registry.redhat.io/openshift4/ose-oauth-proxy:v4.9
      prometheus: registry.redhat.io/openshift4/ose-prometheus@sha256:203dd4282f288c5781ed20cb455e37ac82389bf8dc882d3858c9f609b7e06073
      config-reloader: registry.redhat.io/openshift4/ose-prometheus-config-reloader@sha256:f0bcbfb672d79ef087b785c26f0ea2976d690455d2ab1383cd40dcfc8fb7ea2a 

       Actual results:

      $ oc get events -n istio-system |grep oauth-proxy
      3h7m        Normal    BackOff            pod/prometheus-54bddd4b76-pstrl    Back-off pulling image "registry.redhat.io/openshift4/ose-oauth-proxy:v4.9"
      3h7m        Normal    Pulling            pod/prometheus-54bddd4b76-r7pmh    Pulling image "registry.redhat.io/openshift4/ose-oauth-proxy:v4.9"

       

      Expected results:

      prometheus pod should run successfully.

       

      Additional info:
       

      An earlier bug report [3] for this situation concluded that the ImageStream must be present before the operator is installed, and that restarting the operator works around the problem. That is not a viable option for managed/automated clusters.

      Ideally, updateOauthProxyConfig should run on every reconcile loop to ensure that the correct (mirrored) oauth-proxy image is eventually detected.
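
One way to picture the suggested behaviour is a small Go sketch that retries detection on each reconcile until it succeeds once, then reuses the result. Everything here is hypothetical and not taken from the operator: imageDetector, its detect field, and the Ensure method are names made up for this illustration, and detect stands in for whatever lookup updateOauthProxyConfig performs.

```go
package main

import (
	"errors"
	"sync"
)

// imageDetector is a hypothetical helper. It retries detection on each call
// until one attempt succeeds, then serves the cached result.
type imageDetector struct {
	mu     sync.Mutex
	detect func() (string, error) // stand-in for the ImageStream lookup
	image  string                 // empty until detection has succeeded
}

// Ensure is meant to be called at the start of every reconcile.
// It calls detect only while no image has been found yet.
func (d *imageDetector) Ensure() (string, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.image != "" {
		return d.image, nil
	}
	img, err := d.detect()
	if err != nil {
		return "", err
	}
	if img == "" {
		return "", errors.New("oauth-proxy image not detected yet")
	}
	d.image = img
	return img, nil
}
```

With this shape, a failed lookup at startup costs one extra lookup per reconcile until the Image API and the ImageStream are ready, and nothing after that. Whether a failed detection should block the reconcile or only be logged is a design choice this sketch leaves open.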

       

      [3] https://issues.redhat.com/browse/OSSM-247

              rhn-support-pyadav Prachi Yadav
              rhn-support-bverschu Bram Verschueren