OpenShift Bugs / OCPBUGS-2183

CCO fails to create credentials in a MITM proxy enabled GCP cluster

Type: Bug
Severity: Minor
Priority: Low
Resolution: Won't Do (Rejected)
Affects Version: 4.7.0
Category: Quality / Stability / Reliability

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1931032

In a GCP cluster with a MITM proxy enabled, the cloud-credential operator is degraded and unable to create credentials. From the ClusterOperator description we can see: 0 of 5 credentials requests provisioned, 5 reporting errors.

# oc describe co/cloud-credential
Name:         cloud-credential
Namespace:
Labels:       <none>
Annotations:  exclude.release.openshift.io/internal-openshift-hosted: true
              include.release.openshift.io/self-managed-high-availability: true
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2021-02-18T08:35:12Z
  Generation:          1
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:exclude.release.openshift.io/internal-openshift-hosted:
          f:include.release.openshift.io/self-managed-high-availability:
      f:spec:
      f:status:
        .:
        f:extension:
    Manager:      cluster-version-operator
    Operation:    Update
    Time:         2021-02-18T08:35:12Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
        f:relatedObjects:
        f:versions:
    Manager:      cloud-credential-operator
    Operation:    Update
    Time:         2021-02-18T08:37:06Z
  Resource Version:  800647
  Self Link:         /apis/config.openshift.io/v1/clusteroperators/cloud-credential
  UID:               12e7e082-401c-4899-b936-5f6c9ec0c7a4
Spec:
Status:
  Conditions:
    Last Transition Time:  2021-02-18T08:37:06Z
    Status:                True
    Type:                  Available
    Last Transition Time:  2021-02-18T09:10:18Z
    Message:               5 of 5 credentials requests are failing to sync.
    Reason:                CredentialsFailing
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-02-18T09:10:18Z
    Message:               0 of 5 credentials requests provisioned, 5 reporting errors.
    Reason:                Reconciling
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2021-02-18T08:37:06Z
    Status:                True
    Type:                  Upgradeable
  Extension:  <nil>
  Related Objects:
    Group:      operator.openshift.io
    Name:       cluster
    Resource:   cloudcredentials
    Group:
    Name:       openshift-cloud-credential-operator
    Resource:   namespaces
    Group:      cloudcredential.openshift.io
    Name:       cloud-credential-operator-iam-ro
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-cluster-csi-drivers
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-image-registry-openstack
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-ingress
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-machine-api-kubevirt
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-machine-api-openstack
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-vsphere-problem-detector
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       aws-ebs-csi-driver-operator
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-image-registry-azure
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-machine-api-azure
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-machine-api-ovirt
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-machine-api-vsphere
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-image-registry
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-ingress-gcp
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       ovirt-csi-driver-operator
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-network
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       cloud-credential-operator-gcp-ro-creds
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       manila-csi-driver-operator
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-gcp-pd-csi-driver-operator
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-image-registry-gcs
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-ingress-azure
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-machine-api-aws
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
    Group:      cloudcredential.openshift.io
    Name:       openshift-machine-api-gcp
    Namespace:  openshift-cloud-credential-operator
    Resource:   credentialsrequests
  Versions:
    Name:     operator
    Version:  4.7.0-0.nightly-2021-02-17-130606
Events:  <none>
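
To see which of the CredentialsRequests above are failing and why, the per-request status conditions can be dumped directly; a minimal sketch, with resource and namespace names taken from the Related Objects list above:

      # List the CredentialsRequests the CCO reconciles, then dump the status
      # conditions of the failing GCP machine-api request seen in the logs.
      oc -n openshift-cloud-credential-operator get credentialsrequests.cloudcredential.openshift.io
      oc -n openshift-cloud-credential-operator get credentialsrequest openshift-machine-api-gcp \
          -o jsonpath='{.status.conditions}'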

The pod log shows messages like the following:

      time="2021-02-18T08:58:13Z" level=warning msg="read-only creds not found, using root creds client" actuator=gcp cr=openshift-cloud-credential-operator/openshift-machine-api-gcp secret=openshift-cloud-credential-operator/cloud-credential-operator-gcp-ro-creds
      time="2021-02-18T08:58:59Z" level=info msg="validating cloud cred secret" controller=secretannotator
      time="2021-02-18T08:58:59Z" level=info msg="reconciling clusteroperator status"
      time="2021-02-18T08:58:59Z" level=info msg="requeueing all CredentialsRequests"
      time="2021-02-18T08:58:59Z" level=info msg="requeueing all CredentialsRequests"
      time="2021-02-18T08:58:59Z" level=info msg="clusteroperator status updated" controller=status
      time="2021-02-18T08:58:59Z" level=info msg="Verified cloud creds can be used for minting new creds" controller=secretannotator
      time="2021-02-18T09:00:10Z" level=info msg="calculating metrics for all CredentialsRequests" controller=metrics
      time="2021-02-18T09:00:10Z" level=info msg="reconcile complete" controller=metrics elapsed=4.450811ms
      time="2021-02-18T09:00:13Z" level=error msg="error determining whether a credentials update is needed" actuator=gcp cr=openshift-cloud-credential-operator/openshift-machine-api-gcp error="error gathering permissions for each role: error getting role details: context deadline exceeded"
      time="2021-02-18T09:00:13Z" level=error msg="error syncing credentials: error determining whether a credentials update is needed" controller=credreq cr=openshift-cloud-credential-operator/openshift-machine-api-gcp secret=openshift-machine-api/gcp-cloud-credentials
      time="2021-02-18T09:00:13Z" level=error msg="errored with condition: CredentialsProvisionFailure" controller=credreq cr=openshift-cloud-credential-operator/openshift-machine-api-gcp secret=openshift-machine-api/gcp-cloud-credentials

      Version-Release number of selected component (if applicable):
      4.7.0-0.nightly-2021-02-17-130606

      How reproducible:
      Always

Steps to Reproduce:
1. Install a GCP cluster in a restricted network with a MITM proxy enabled (see the install-config sketch below)
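
For reference, a minimal sketch of the relevant install-config.yaml stanzas for such an install; the proxy URL matches the pod environment shown below, and the trust bundle content is a placeholder rather than this cluster's actual CA:

      # Append the proxy and trust-bundle stanzas to install-config.yaml before
      # running 'openshift-install create cluster'. The certificate body is a
      # placeholder for the ssl-bump signing CA from the squid configuration.
      cat >> install-config.yaml <<'EOF'
      proxy:
        httpProxy: http://user:password@10.0.0.2:3129
        httpsProxy: http://user:password@10.0.0.2:3129
      additionalTrustBundle: |
        -----BEGIN CERTIFICATE-----
        <MITM proxy signing CA>
        -----END CERTIFICATE-----
      EOF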

Actual results:

Installation failed and many operators are in a bad state:

# oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                                                                 False       True          True       45h
baremetal                                  4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
cloud-credential                           4.7.0-0.nightly-2021-02-17-130606   True        True          True       45h
cluster-autoscaler                         4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
config-operator                            4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
console                                    4.7.0-0.nightly-2021-02-17-130606   True        False         True       45h
csi-snapshot-controller                    4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
dns                                        4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
etcd                                       4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
image-registry                                                                 False       True          True       45h
ingress                                    4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
insights                                   4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
kube-apiserver                             4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
kube-controller-manager                    4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
kube-scheduler                             4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
kube-storage-version-migrator              4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
machine-api                                4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
machine-approver                           4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
machine-config                             4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
marketplace                                4.7.0-0.nightly-2021-02-17-130606   True        False         False      4h20m
monitoring                                 4.7.0-0.nightly-2021-02-17-130606   True        False         False      4h20m
network                                    4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
node-tuning                                4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
openshift-apiserver                        4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
openshift-controller-manager               4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
openshift-samples                          4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
operator-lifecycle-manager                 4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
operator-lifecycle-manager-catalog         4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
operator-lifecycle-manager-packageserver   4.7.0-0.nightly-2021-02-17-130606   True        False         False      4h21m
service-ca                                 4.7.0-0.nightly-2021-02-17-130606   True        False         False      45h
storage                                    4.7.0-0.nightly-2021-02-17-130606   False       True          False      45h

Expected results:
Installation succeeds.

      Additional info:

The CCO pod configuration is as follows:

# oc describe pod/cloud-credential-operator-f595bf46f-r89k9
Name:                 cloud-credential-operator-f595bf46f-r89k9
Namespace:            openshift-cloud-credential-operator
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 yangyang-mitm-kdpg4-m-2.c.openshift-qe.internal/10.0.0.6
Start Time:           Thu, 18 Feb 2021 03:42:04 -0500
Labels:               app=cloud-credential-operator
                      control-plane=controller-manager
                      controller-tools.k8s.io=1.0
                      pod-template-hash=f595bf46f
Annotations:          k8s.v1.cni.cncf.io/network-status:
                        [{
                            "name": "",
                            "interface": "eth0",
                            "ips": [
                                "10.129.0.29"
                            ],
                            "default": true,
                            "dns": {}
                        }]
                      k8s.v1.cni.cncf.io/networks-status:
                        [{
                            "name": "",
                            "interface": "eth0",
                            "ips": [
                                "10.129.0.29"
                            ],
                            "default": true,
                            "dns": {}
                        }]
                      openshift.io/scc: restricted
Status:               Running
IP:                   10.129.0.29
IPs:
  IP:           10.129.0.29
Controlled By:  ReplicaSet/cloud-credential-operator-f595bf46f
Containers:
  kube-rbac-proxy:
    Container ID:  cri-o://32a6be5815484409583d1600b9eb059a1b614ffd85fd0693aced7f38f91b3bb8
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:286594a73fcc8d6be9933e64b24455e4d0ac1dd90e485ec8ea1927dc77f38f5c
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:286594a73fcc8d6be9933e64b24455e4d0ac1dd90e485ec8ea1927dc77f38f5c
    Port:          8443/TCP
    Host Port:     0/TCP
    Args:
      --secure-listen-address=0.0.0.0:8443
      --upstream=http://127.0.0.1:2112/
      --tls-cert-file=/etc/tls/private/tls.crt
      --tls-private-key-file=/etc/tls/private/tls.key
      --logtostderr=true
    State:          Running
      Started:      Thu, 18 Feb 2021 03:43:12 -0500
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     10m
      memory:  20Mi
    Environment:  <none>
    Mounts:
      /etc/tls/private from cloud-credential-operator-serving-cert (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cloud-credential-operator-token-9tcgd (ro)
  cloud-credential-operator:
    Container ID:  cri-o://5883e239afce9b1b11fe19e08ec7b8ef832dc9897659bf4fe7aab80f9f0541ea
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:49a43623fc2674e37ff4b4705421e6ec114780e8a84446c65eec6cb34a4b7c57
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:49a43623fc2674e37ff4b4705421e6ec114780e8a84446c65eec6cb34a4b7c57
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -ec
    Args:
      if [ -s /var/run/configmaps/trusted-ca-bundle/tls-ca-bundle.pem ]; then
        echo "Copying system trust bundle"
        cp -f /var/run/configmaps/trusted-ca-bundle/tls-ca-bundle.pem /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
      fi
      exec /usr/bin/cloud-credential-operator operator
    State:          Running
      Started:      Thu, 18 Feb 2021 03:43:37 -0500
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     10m
      memory:  150Mi
    Environment:
      RELEASE_VERSION:                 4.7.0-0.nightly-2021-02-17-130606
      AWS_POD_IDENTITY_WEBHOOK_IMAGE:  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:30d25b2bd9a7bd9ff59ed4fe15fe3b2628f09993004fd4bf318a80ab0715809c
      HTTP_PROXY:                      http://user:password@10.0.0.2:3129
      HTTPS_PROXY:                     http://user:password@10.0.0.2:3129
      NO_PROXY:                        .cluster.local,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,api-int.yangyang-mitm.qe.gcp.devcluster.openshift.com,localhost,metadata,metadata.google.internal,metadata.google.internal.,test.no-proxy.com
    Mounts:
      /var/run/configmaps/trusted-ca-bundle from cco-trusted-ca (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cloud-credential-operator-token-9tcgd (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  cco-trusted-ca:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cco-trusted-ca
    Optional:  true
  cloud-credential-operator-serving-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cloud-credential-operator-serving-cert
    Optional:    false
  cloud-credential-operator-token-9tcgd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cloud-credential-operator-token-9tcgd
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     node-role.kubernetes.io/master:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 120s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 120s
Events:          <none>
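
The entrypoint above only trusts the proxy if its signing CA made it into the cco-trusted-ca ConfigMap, which the container copies over the system bundle at startup. Whether the CA is present can be checked with something like the following (the ca-bundle.crt key name is assumed from the standard trusted-CA injection mechanism, not taken from this cluster):

      # Print the start of the injected trust bundle; the ssl-bump signing CA
      # should appear here for in-pod TLS connections through the proxy to verify.
      oc -n openshift-cloud-credential-operator get configmap cco-trusted-ca \
          -o jsonpath='{.data.ca-bundle\.crt}' | head -n 5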

The proxy configuration is as follows:

# cat /srv/squid/etc/squid.conf
#
# Recommended minimum configuration:
#

# Example rule allowing access from your local networks.
# Adapt to list your (internal) IP networks from where browsing
# should be allowed

      auth_param basic program /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwords
      auth_param basic realm proxy
      acl authenticated proxy_auth REQUIRED
      http_access allow authenticated

      sslcrtd_program /usr/lib64/squid/security_file_certgen -s /var/lib/ssl_db -M 4MB

      acl localnet src 0.0.0.1-0.255.255.255 # RFC 1122 "this" network (LAN)
      acl localnet src 10.0.0.0/8 # RFC 1918 local private network (LAN)
      acl localnet src 100.64.0.0/10 # RFC 6598 shared address space (CGN)
      acl localnet src 169.254.0.0/16 # RFC 3927 link-local (directly plugged) machines
      acl localnet src 172.16.0.0/12 # RFC 1918 local private network (LAN)
      acl localnet src 192.168.0.0/16 # RFC 1918 local private network (LAN)
      acl localnet src fc00::/7 # RFC 4193 local private network range
      acl localnet src fe80::/10 # RFC 4291 link-local (directly plugged) machines

      acl SSL_ports port 443
      acl Safe_ports port 80 # http
      acl Safe_ports port 21 # ftp
      acl Safe_ports port 443 # https
      acl Safe_ports port 70 # gopher
      acl Safe_ports port 210 # wais
      acl Safe_ports port 1025-65535 # unregistered ports
      acl Safe_ports port 280 # http-mgmt
      acl Safe_ports port 488 # gss-http
      acl Safe_ports port 591 # filemaker
      acl Safe_ports port 777 # multiling http
      acl CONNECT method CONNECT

#
# Recommended minimum Access Permission configuration:
#
# Deny requests to certain unsafe ports
http_access deny !Safe_ports

# Deny CONNECT to other than secure SSL ports
http_access deny CONNECT !SSL_ports

# Only allow cachemgr access from localhost
http_access allow localhost manager
http_access deny manager

# We strongly recommend the following be uncommented to protect innocent
# web applications running on the proxy server who think the only
# one who can access services on "localhost" is a local user
#http_access deny to_localhost

#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#

# Example rule allowing access from your local networks.
# Adapt localnet in the ACL section to list your (internal) IP networks
# from where browsing should be allowed
#http_access allow localnet
#http_access allow localhost

# And finally deny all other access to this proxy
http_access deny all

# Squid normally listens to port 3128
http_port 3128

      http_port 3129 ssl-bump generate-host-certificates=on dynamic_cert_mem_cache_size=4MB cert=/etc/squid/certs/intermediate.cert.pem key=/etc/squid/certs/intermediate.key.pem cafile=/etc/squid/certs/ca-chain.cert.pem

      #https_port 3130 cert=/etc/squid/certs/squid-server.cert.pem key=/etc/squid/certs/squid-server.key.pem cafile=/etc/squid/certs/ca-chain.cert.pem

      #https_port 3131 intercept ssl-bump generate-host-certificates=on dynamic_cert_mem_cache_size=4MB cert=/etc/squid/certs/squid-server.cert.pem key=/etc/squid/certs/squid-server.key.pem cafile=/etc/squid/certs/ca-chain.cert.pem

      acl step1 at_step SslBump1

      ssl_bump peek step1
      ssl_bump bump all

# Uncomment and adjust the following to add a disk cache directory.
#cache_dir ufs /var/spool/squid 100 16 256

# Leave coredumps in the first cache dir
coredump_dir /var/spool/squid

#
# Add any of your own refresh_pattern entries above these.
#
refresh_pattern ^ftp:             1440    20%     10080
refresh_pattern ^gopher:          1440    0%      1440
refresh_pattern -i (/cgi-bin/|\?) 0       0%      0
refresh_pattern .                 0       20%     4320
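
Since "ssl_bump bump all" re-signs every tunneled TLS connection with the intermediate certificate above, clients should see a forged chain for Google endpoints. One way to confirm the interception from a machine behind the proxy (a sketch; endpoint and proxy credentials as configured above):

      # Fetch the GCP IAM endpoint through the bumping proxy and show the
      # presented certificate subject/issuer; an issuer from /etc/squid/certs
      # instead of Google's CA confirms the MITM is active. --insecure is
      # needed when the forging CA is not in the local trust store.
      curl -v --insecure --proxy http://user:password@10.0.0.2:3129 \
          https://iam.googleapis.com/ 2>&1 | grep -E 'subject:|issuer:'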

Assignee: Akhil Rane (akhilrane)
Reporter: Yang Yang (yanyang@redhat.com)