  1. OpenShift Bugs
  2. OCPBUGS-6800

AWS Local Zones must set the correct MTU when installing a cluster


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • None
    • 4.12
    • Documentation
    • None
    • None
    • Rejected
    • False

      Description of problem:

      When a cluster is installed into an existing VPC with worker nodes extended into AWS Local Zone subnets, the internal image registry does not work correctly from those nodes: pulling internal images fails with errors.

      Version-Release number of selected component (if applicable):

      4.12.z

      How reproducible:

      Always

      Steps to Reproduce:

      1. Install a cluster following the documentation [2]
      2. Log in to the node running in the Local Zone
      3. Try to pull any image from the internal registry

      Actual results:

      # Pick the first node labeled as an edge (Local Zone) node
      $ NODE_NAME=$(oc get nodes -l node-role.kubernetes.io/edge='' -o jsonpath='{.items[0].metadata.name}')
      $ KPASS=$(cat auth/kubeadmin-password)
      $ API_INT=$(oc get infrastructures cluster -o jsonpath='{.status.apiServerInternalURI}')
      
      # From the node, log in to the API and the internal registry, then pull an image
      $ oc debug node/${NODE_NAME} -- chroot /host /bin/bash -c "\
      oc login --insecure-skip-tls-verify -u kubeadmin -p ${KPASS} ${API_INT}; \
      podman login -u kubeadmin -p \$(oc whoami -t) image-registry.openshift-image-registry.svc:5000; \
      podman pull image-registry.openshift-image-registry.svc:5000/openshift/tests"
      
      (...)
      Error: authenticating creds for "image-registry.openshift-image-registry.svc:5000": pinging container registry image-registry.openshift-image-registry.svc:5000: Get "https://image-registry.openshift-image-registry.svc:5000/v2/": net/http: TLS handshake timeout
      Trying to pull image-registry.openshift-image-registry.svc:5000/openshift/tests:latest...
      time="2023-01-30T16:10:29Z" level=warning msg="Failed, retrying in 1s ... (1/3). Error: initializing source docker://image-registry.openshift-image-registry.svc:5000/openshift/tests:latest: pinging container registry image-registry.openshift-image-registry.svc:5000: Get \"https://image-registry.openshift-image-registry.svc:5000/v2/\": net/http: TLS handshake timeout"
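
A TLS handshake timeout against a service that otherwise accepts connections is often a sign that the path MTU is smaller than the cluster network assumes: small packets get through, while larger ones, such as the server certificate, are dropped. The following is a minimal dry-run sketch of how the path MTU could be probed from the Local Zone node. It only prints the `ping` commands and does not run them. `PEER_IP` is a hypothetical placeholder for a node in the parent Region, and the 1300-byte path MTU is an assumption, not a value measured in this report.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print ping probes that bracket a suspected path MTU.
# PEER_IP is a hypothetical placeholder; 1300 is an assumed path MTU.

# ICMP payload that exactly fills an IPv4 packet of the given MTU:
# MTU minus 20 bytes (IPv4 header) minus 8 bytes (ICMP header).
icmp_payload() {
  echo $(( $1 - 28 ))
}

# Build a ping command with "don't fragment" set (-M do), so that
# oversized packets are rejected instead of being fragmented.
probe_cmd() {
  local peer=$1 mtu=$2
  echo "ping -M do -c 3 -s $(icmp_payload "$mtu") ${peer}"
}

PEER_IP="10.0.0.10"    # hypothetical address of a node in the Region
ASSUMED_PATH_MTU=1300  # assumption, to be confirmed by the probes

# Should get replies if the path MTU is at least the assumed value.
probe_cmd "$PEER_IP" "$ASSUMED_PATH_MTU"
# Should fail or time out if the path MTU is exactly the assumed value.
probe_cmd "$PEER_IP" $(( ASSUMED_PATH_MTU + 1 ))
```

If the first probe succeeds and the second does not, the assumed value matches the real path MTU, and the cluster network MTU has to fit within it.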
      

      Expected results:

      The image is pulled successfully.

      Additional info:

      Steps to fix: Add a step to the procedure in the section "Creating the Kubernetes manifest files" [2] that adds the Cluster Network Operator (CNO) manifest, so that the cluster network MTU is set at install time (documentation reference [3]). Steps:
      
      . Set the MTU size to 1200
      +
      [source,terminal]
      ----
      $ cat <<EOF > manifests/cluster-network-03-config.yml
      apiVersion: operator.openshift.io/v1
      kind: Network
      metadata:
        name: cluster
      spec:
        defaultNetwork:
          ovnKubernetesConfig:
            mtu: 1200
      EOF
      ----
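
The value 1200 can be read as the path MTU minus the overlay overhead. The sketch below shows that arithmetic. Both inputs are assumptions taken from the referenced documentation rather than from this report: a 1300-byte MTU between a Local Zone instance and the Region [1], and a 100-byte overhead for OVN-Kubernetes [3]. The function name `cluster_mtu` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: derive the cluster network MTU from the path MTU and the
# overlay overhead. Both figures below are assumptions, see [1] and [3].

# Subtract the overlay encapsulation overhead from the path MTU.
cluster_mtu() {
  local path_mtu=$1 overlay_overhead=$2
  echo $(( path_mtu - overlay_overhead ))
}

LOCAL_ZONE_PATH_MTU=1300  # assumption: Local Zone <-> Region MTU per [1]
OVN_OVERHEAD=100          # assumption: OVN-Kubernetes overhead per [3]

MTU=$(cluster_mtu "$LOCAL_ZONE_PATH_MTU" "$OVN_OVERHEAD")
echo "mtu: ${MTU}"
```

With these inputs the script prints `mtu: 1200`, matching the manifest above. A different network plugin or a different path MTU would change the result, so the inputs should be checked against the current documentation before reuse.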

      [1] https://docs.aws.amazon.com/local-zones/latest/ug/how-local-zones-work.html
      [2] https://docs.openshift.com/container-platform/4.12/installing/installing_aws/installing-aws-localzone.html#installation-localzone-generate-k8s-manifestinstalling-aws-localzone
      [3] https://docs.openshift.com/container-platform/4.12/networking/changing-cluster-network-mtu.html#mtu-value-selection_changing-cluster-network-mtu

              rhn-support-mrbraga Marco Braga
              rhn-support-mrbraga Marco Braga
              Yunfei Jiang Yunfei Jiang