OpenShift Bugs / OCPBUGS-8779

Pods fail to pull images when cluster proxy enabled and using self-signed cert w/MITM proxy


    • Low
    • None
    • x86_64

      Description of problem:

      When the cluster Proxy resource (OCP 4.4) is configured with a self-signed cert as the trustedCA and traffic goes through a MITM (man-in-the-middle) proxy, pods can no longer pull images. They fail with ErrImagePull and 'x509: certificate signed by unknown', although other components, such as the CVO, still work through the proxy. Possibly pods are not referencing the trustedCA ConfigMap cert while still routing through the proxy.

      Version-Release number of selected component (if applicable):
      quay.io/openshift-release-dev/ocp-release:4.4.3-x86_64

      How reproducible:
      100%

      Steps to Reproduce:
      [Setup MITM squid proxy]
      1. Install squid on a separate server and generate a self-signed certificate for it
      openssl req -new -newkey rsa:2048 -sha256 -days 365 -nodes -x509 -extensions v3_ca -keyout squid-ca-key.pem -out squid-ca-cert.pem
      cat squid-ca-cert.pem squid-ca-key.pem >> squid-ca-cert-key.pem
      sudo mkdir /etc/squid/certs
      sudo cp squid-ca-cert-key.pem /etc/squid/certs/.
      sudo chown squid:squid -R /etc/squid/certs

      2. Add the following to /etc/squid/squid.conf

        # comment out the existing http_port 3128 line
        # http_port 3128
        http_port 3128 ssl-bump \
        cert=/etc/squid/certs/squid-ca-cert-key.pem \
        generate-host-certificates=on dynamic_cert_mem_cache_size=16MB
        https_port 3129 intercept ssl-bump \
        cert=/etc/squid/certs/squid-ca-cert-key.pem \
        generate-host-certificates=on dynamic_cert_mem_cache_size=16MB \
        capath=/etc/squid/additional-certs
        acl step1 at_step SslBump1
        ssl_bump peek step1
        ssl_bump bump all
        ssl_bump splice all

      3. Confirm the configuration file is correct
      sudo squid -k parse

      4. Create the SSL database and make sure the squid user can access it

      sudo /usr/lib64/squid/security_file_certgen -c -s /var/spool/squid/ssl_db -M 4MB
      sudo chown -R squid:squid /var/spool/squid/ssl_db

      5. On the squid server, download ca-bundle.crt from the default-ingress-cert ConfigMap in the openshift-config-managed namespace and place the file in /etc/pki/ca-trust/source/anchors/

      6. Also copy the squid self-signed cert to the same location.
      sudo cp /etc/squid/certs/squid-ca-cert-key.pem /etc/pki/ca-trust/source/anchors/

      7. Run update-ca-trust
      sudo update-ca-trust extract

      8. If you are using a libvirt-managed cluster, either turn off firewalld on the libvirtd host or update the firewall rules so that the libvirt nodes can communicate with squid

      9. Start squid

      [Setup cluster proxy and trusted CA]

      10. Save squid's cert to a file named ca-bundle.crt and create a ConfigMap from it. ca-bundle.crt contains only squid's own self-signed cert, to enable TLS

      oc create configmap -n openshift-config user-ca-bundle --from-file=ca-bundle.crt

      11. Update the Proxy named cluster with following settings

      $ cat cluster-proxy.yaml
      apiVersion: config.openshift.io/v1
      kind: Proxy
      metadata:
        name: cluster
      spec:
        httpProxy: http://<squid>:3128
        httpsProxy: http://<squid>:3128
        noProxy: example.com
        trustedCA:
          name: user-ca-bundle

      oc apply -f cluster-proxy.yaml

      12. Wait for the MachineConfigPools (mcp) and nodes to finish applying the configuration (the nodes reboot)

      13. Check CVO and notice that it is able to connect to the RH Cincinnati endpoint without x509 untrusted errors
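Step 10 relies on ca-bundle.crt holding only squid's certificate, while step 1 also produces a combined cert-plus-key file. A minimal sketch of a check to run before creating the ConfigMap, assuming standard PEM markers; `pem_summary` is a hypothetical helper name, not something from the original steps:

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): count the PEM
# certificate and private-key blocks in a file.
# Prints "<certificates> <private-keys>".
pem_summary() {
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    certs=$(grep -c -e '-----BEGIN CERTIFICATE-----' "$1" || true)
    keys=$(grep -c -e '-----BEGIN .*PRIVATE KEY-----' "$1" || true)
    echo "$certs $keys"
}

# Demo on placeholder files (the bodies are dummy text, not real key material).
demo_dir=$(mktemp -d)
printf '%s\n' '-----BEGIN CERTIFICATE-----' 'dummy' '-----END CERTIFICATE-----' \
    > "$demo_dir/ca-bundle.crt"
{
    cat "$demo_dir/ca-bundle.crt"
    printf '%s\n' '-----BEGIN PRIVATE KEY-----' 'dummy' '-----END PRIVATE KEY-----'
} > "$demo_dir/squid-ca-cert-key.pem"

pem_summary "$demo_dir/ca-bundle.crt"          # prints "1 0"
pem_summary "$demo_dir/squid-ca-cert-key.pem"  # prints "1 1"
rm -rf "$demo_dir"
```

For the file passed to `oc create configmap`, the expected summary would be one certificate and zero private keys.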

      Actual results:

      Run 'oc get pods -A | grep -v -e Comp -e Run' and notice that many pods are stuck in the 'ImagePullBackOff' state:
      NAMESPACE               NAME                                           READY   STATUS             RESTARTS   AGE
      multicluster-endpoint   endpoint-appmgr-575b75c4b-k9px7                0/1     ImagePullBackOff   0          20m
      multicluster-endpoint   endpoint-certpolicyctrl-5fbf848789-qtr2w       0/1     ImagePullBackOff   0          20m
      multicluster-endpoint   endpoint-component-operator-7749c65cd5-x45tk   0/1     ImagePullBackOff   0          20m
      multicluster-endpoint   endpoint-connmgr-7f964ff9b9-dh57v              0/1     ImagePullBackOff   0          21m
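The grep filter above hides Completed and Running pods. As an alternative, here is a small sketch that selects pods by one exact status, assuming the default `oc get pods -A` column order shown above; `pods_with_status` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read `oc get pods -A` output on stdin and print
# "<namespace>/<name>" for every pod whose STATUS column equals $1.
pods_with_status() {
    # NR > 1 skips the header row; STATUS is the 4th whitespace-separated column.
    awk -v want="$1" 'NR > 1 && $4 == want { print $1 "/" $2 }'
}

# Demo on two rows copied from the output above; on a live cluster, use:
#   oc get pods -A | pods_with_status ImagePullBackOff
pods_with_status ImagePullBackOff <<'EOF'
NAMESPACE NAME READY STATUS RESTARTS AGE
multicluster-endpoint endpoint-appmgr-575b75c4b-k9px7 0/1 ImagePullBackOff 0 20m
multicluster-endpoint endpoint-connmgr-7f964ff9b9-dh57v 0/1 ImagePullBackOff 0 21m
EOF
```

Matching on the exact column value avoids accidentally hiding pods whose names happen to contain "Comp" or "Run".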

      The pod events and pod status show errors of the following type:
      message: 'rpc error: code = Unknown desc = error pinging docker registry registry.redhat.io:
      Get https://registry.redhat.io/v2/: x509: certificate signed by unknown

      Expected results:

      Pods adhere to the proxy and use the ca-bundle.crt configmap trustedCA to properly connect to the MITM proxy and pull images via proxy successfully.
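To separate "the proxy and CA bundle do not work together" from "the image pull path does not use the bundle", the same request can be made by hand through the proxy while trusting only ca-bundle.crt. This is a rough sketch, not a step from the original report: `probe_registry_via_proxy` is a hypothetical helper, and it only exercises curl on the host where it runs, not the node's container runtime.

```shell
#!/bin/sh
# Hypothetical helper: request the registry's /v2/ endpoint through the proxy,
# trusting only the given CA bundle. Returns 2 on bad usage, otherwise
# curl's own exit status.
probe_registry_via_proxy() {
    if [ "$#" -ne 3 ]; then
        echo "usage: probe_registry_via_proxy PROXY_URL CA_FILE REGISTRY_HOST" >&2
        return 2
    fi
    # --proxy routes the request through the proxy; --cacert makes curl verify
    # the presented certificate against the given file only.
    curl --silent --show-error --output /dev/null \
        --write-out '%{http_code}\n' \
        --proxy "$1" --cacert "$2" "https://$3/v2/"
}

# Example (placeholders as in cluster-proxy.yaml above):
#   probe_registry_via_proxy http://<squid>:3128 ca-bundle.crt registry.redhat.io
```

If this succeeds from a host while pods still fail with the x509 error, that would point at the pull path not picking up the trustedCA rather than at the proxy setup itself.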

      Additional info:

      We are following the OCP 4.4 cluster-wide proxy documentation [1]. If this isn't supported, the documentation should be updated to reflect that.

      Side note: I understand that using a MITM proxy with a self-signed cert is probably not a recommended production use case, but I can see it being used for internal testing (which is how we ran into it).

      [1] https://docs.openshift.com/container-platform/4.4/networking/enable-cluster-wide-proxy.html

              jboxman@redhat.com Jason Boxman
              chadcrum Chad Crum
              Red Hat Employee
              Jason Boxman