OpenShift Bugs / OCPBUGS-52363

[release-4.19] RHCOS worker image fails to find repo mirror


    • Type: Bug
    • Resolution: Done-Errata
    • Priority: Major
    • 4.19.0
    • RHCOS
    • Quality / Stability / Reliability
    • Proposed
    • In Progress
    • Release Note Not Required

      Description of problem:

      Deploying an updated ipsec OS extension (https://github.com/openshift/os/pull/1718) through a MachineConfig on a Prow cluster fails.
      
      Machine config to be deployed:
      
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: worker
        name: 80-ipsec-worker-extensions
      spec:
        config:
          ignition:
            version: 3.2.0
          systemd:
            units:
            - name: ipsecenabler.service
              enabled: true
              contents: |
               [Unit]
               Description=Enable ipsec service after os extension installation
               Before=kubelet.service

               [Service]
               Type=oneshot
               ExecStartPre=rm -f /etc/ipsec.d/cno.conf
               ExecStart=systemctl enable --now ipsec.service

               [Install]
               WantedBy=multi-user.target
        extensions:
          - ipsec
      
      
      The worker machine config pool then goes into a degraded state with the following error:
      
      conditions:
        - lastTransitionTime: "2025-03-05T10:59:18Z"
          message: ""
          reason: ""
          status: "False"
          type: RenderDegraded
        - lastTransitionTime: "2025-03-05T11:28:03Z"
          message: ""
          reason: ""
          status: "False"
          type: Updated
        - lastTransitionTime: "2025-03-05T11:28:03Z"
          message: All nodes are updating to MachineConfig rendered-worker-b7f59013a3075262a236d4c058174c7a
          reason: ""
          status: "True"
          type: Updating
        - lastTransitionTime: "2025-03-05T11:31:40Z"
          message: 'Node ip-10-0-39-84.ec2.internal is reporting: "error running rpm-ostree
            update --install NetworkManager-libreswan --install libreswan --install openvswitch3.5-ipsec:
            error: Updating rpm-md repo ''rhel-9.6-baseos'': cannot update repo ''rhel-9.6-baseos'':
            Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors
            were tried; Last error: Curl error (6): Couldn''t resolve host name for http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-baseos/repodata/repomd.xml
            [Could not resolve host: base-4-19-rhel96.ocp.svc.cluster.local]\n: exit status
            1"'
          reason: 1 nodes are reporting degraded status on sync
          status: "True"
          type: NodeDegraded
        - lastTransitionTime: "2025-03-05T11:31:40Z"
          message: 'Node ip-10-0-39-84.ec2.internal is reporting: "error running rpm-ostree
            update --install NetworkManager-libreswan --install libreswan --install openvswitch3.5-ipsec:
            error: Updating rpm-md repo ''rhel-9.6-baseos'': cannot update repo ''rhel-9.6-baseos'':
            Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors
            were tried; Last error: Curl error (6): Couldn''t resolve host name for http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-baseos/repodata/repomd.xml
            [Could not resolve host: base-4-19-rhel96.ocp.svc.cluster.local]\n: exit status
            1"'
          reason: ""
          status: "True"
          type: Degraded
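
When triaging a pool like this one, it can help to pull out only the conditions that are actually degraded. The following is a minimal sketch, not part of the original report. It assumes the pool has been exported as JSON (for example with `oc get machineconfigpool worker -o json`) and that conditions sit under `status.conditions`, as in the output above. The function name `degraded_conditions` is hypothetical.

```python
import json


def degraded_conditions(pool):
    """Return (type, message) pairs for *Degraded conditions whose status is "True"."""
    conditions = pool.get("status", {}).get("conditions", [])
    return [
        (cond.get("type", ""), cond.get("message", ""))
        for cond in conditions
        if cond.get("type", "").endswith("Degraded") and cond.get("status") == "True"
    ]


def degraded_from_json(text):
    """Convenience wrapper: parse the JSON text of a pool and filter it."""
    return degraded_conditions(json.loads(text))
```

Applied to the conditions shown above, this would report `NodeDegraded` and `Degraded` and skip `RenderDegraded`, whose status is "False".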
      
      
      Machine config daemon container logs:
      
      I0305 11:30:52.634215    2580 image_manager_helper.go:92] Running captured: podman create --net=none --annotation=org.openshift.machineconfigoperator.pivot=true --name ostree-container-pivot-a8165000-2e81-4b29-a5f1-ff46b51e9e7a registry.build06.ci.openshift.org/ci-ln-s0lbbib/stable@sha256:5fc84b3d068f6bd8a219f2da667a83de8d544895e1f01173e17aee2c871493fa
      I0305 11:30:52.687303    2580 run.go:19] Running: nice -- ionice -c 3 podman cp e94be9d866e522cf0e83aa8db091297a65674935c6e071a3c26087790e13b58a:/ /run/mco-extensions/os-extensions-content-1869168287
      I0305 11:31:01.355684    2580 update.go:2737] Running: chcon -R -t var_run_t /run/mco-extensions/os-extensions-content-1869168287
      I0305 11:31:01.618636    2580 update.go:2737] Running: rpm-ostree cleanup -p
      Deployments unchanged.
      I0305 11:31:01.771836    2580 update.go:1836] Applying extensions : ["update" "--install" "NetworkManager-libreswan" "--install" "libreswan" "--install" "openvswitch3.5-ipsec"]
      I0305 11:31:01.771857    2580 update.go:2737] Running: rpm-ostree update --install NetworkManager-libreswan --install libreswan --install openvswitch3.5-ipsec
      Pulling manifest: ostree-unverified-registry:registry.build06.ci.openshift.org/ci-ln-s0lbbib/stable@sha256:cbf7220ecb4531d2bcba90b8f0af549b37c95a3e61b0efb6e65bab8a827bd811
      Checking out tree 819387a...done
      Enabled rpm-md repositories: rhel-9.6-baseos rhel-9.6-appstream rhel-9.6-fast-datapath rhel-9.6-nfv rhel-9.6-highavailability rhel-9.6-server-ose-4.19 rhel-9.6-early-kernel rhel-9.4-baseos rhel-9.4-appstream rhel-9.4-fast-datapath rhel-9.4-nfv rhel-9.4-server-ose-4.19 coreos-extensions
      Updating metadata for 'rhel-9.6-baseos'...done
      I0305 11:31:05.750825    2580 update.go:2909] Rolling back applied changes to OS due to error: error running rpm-ostree update --install NetworkManager-libreswan --install libreswan --install openvswitch3.5-ipsec: error: Updating rpm-md repo 'rhel-9.6-baseos': cannot update repo 'rhel-9.6-baseos': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried; Last error: Curl error (6): Couldn't resolve host name for http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-baseos/repodata/repomd.xml [Could not resolve host: base-4-19-rhel96.ocp.svc.cluster.local]
      : exit status 1
      I0305 11:31:05.750846    2580 update.go:2737] Running: rpm-ostree cleanup -p
      Deployments unchanged.
      ...
      I0305 11:31:12.374944    2580 update.go:2272] Could not reset unit preset for zincati.service, skipping. (Error msg: error running preset on unit: Failed to preset unit: Unit file zincati.service does not exist.
      )
      I0305 11:31:12.374964    2580 file_writers.go:294] Writing systemd unit "kubelet-cleanup.service"
      I0305 11:31:12.382898    2580 file_writers.go:307] Disabling systemd unit kubelet-cleanup.service before re-writing it
      I0305 11:31:13.100774    2580 update.go:2213] Enabled systemd units: [NetworkManager-clean-initrd-state.service aws-kubelet-nodename.service aws-kubelet-providerid.service firstboot-osupdate.target kubelet-auto-node-size.service kubelet.service machine-config-daemon-firstboot.service machine-config-daemon-pull.service nmstate-configuration.service node-valid-hostname.service openvswitch.service ovs-configuration.service ovsdb-server.service wait-for-primary-ip.service kubelet-cleanup.service]
      I0305 11:31:13.450782    2580 update.go:2224] Disabled systemd units [kubens.service nodeip-configuration.service]
      I0305 11:31:13.450814    2580 update.go:1980] Deleting stale data
      I0305 11:31:13.818779    2580 update.go:2235] Preset systemd unit "ipsecenabler.service"
      I0305 11:31:13.818934    2580 update.go:2143] Removed stale systemd unit "/etc/systemd/system/ipsecenabler.service"
      I0305 11:31:13.819001    2580 update.go:2813] Removing SIGTERM protection
      E0305 11:31:13.819068    2580 writer.go:226] Marking Degraded due to: "error running rpm-ostree update --install NetworkManager-libreswan --install libreswan --install openvswitch3.5-ipsec: error: Updating rpm-md repo 'rhel-9.6-baseos': cannot update repo 'rhel-9.6-baseos': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried; Last error: Curl error (6): Couldn't resolve host name for http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-baseos/repodata/repomd.xml [Could not resolve host: base-4-19-rhel96.ocp.svc.cluster.local]\n: exit status 1"
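
The same failure is repeated several times in the daemon log, so a short script can reduce it to the repo and host involved. This is an illustrative sketch only. The regular expression is written against the error text quoted above and is an assumption about its shape, not a documented rpm-ostree format. The names `FAILURE` and `dns_failures` are hypothetical.

```python
import re

# Matches the rpm-ostree failure text as it appears in the daemon log lines above.
FAILURE = re.compile(
    r"cannot update repo '(?P<repo>[^']+)'.*?Could not resolve host: (?P<host>[^\]\s]+)"
)


def dns_failures(log_text):
    """Return unique (repo, host) pairs that failed DNS resolution, in first-seen order."""
    seen = []
    for match in FAILURE.finditer(log_text):
        pair = (match.group("repo"), match.group("host"))
        if pair not in seen:
            seen.append(pair)
    return seen
```

Fed the log excerpt above, this should yield a single pair: the `rhel-9.6-baseos` repo and the `base-4-19-rhel96.ocp.svc.cluster.local` host.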
      
      The worker node's ocp.repo yum repo file contains the following mirror entries:
      
      sh-5.1# cat /etc/yum.repos.d/*
      
      [rhel-9.6-baseos]
      id = rhel-9.6-baseos
      name = rhel-9.6-baseos
      baseurl = http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-baseos
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.6-appstream]
      id = rhel-9.6-appstream
      name = rhel-9.6-appstream
      baseurl = http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-appstream
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.6-fast-datapath]
      id = rhel-9.6-fast-datapath
      name = rhel-9.6-fast-datapath
      baseurl = http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-fast-datapath
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.6-nfv]
      id = rhel-9.6-nfv
      name = rhel-9.6-nfv
      baseurl = http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-nfv
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.6-highavailability]
      id = rhel-9.6-highavailability
      name = rhel-9.6-highavailability
      baseurl = http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-highavailability
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.6-server-ose-4.19]
      id = rhel-9.6-server-ose-4.19
      name = rhel-9.6-server-ose-4.19
      baseurl = http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-server-ose-4.19
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.6-early-kernel]
      id = rhel-9.6-early-kernel
      name = rhel-9.6-early-kernel
      baseurl = http://base-4-19-rhel96.ocp.svc.cluster.local/rhel-9.6-early-kernel
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.4-baseos]
      id = rhel-9.4-baseos
      name = rhel-9.4-baseos
      baseurl = http://base-4-19-rhel94.ocp.svc.cluster.local/rhel-9.4-baseos
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.4-appstream]
      id = rhel-9.4-appstream
      name = rhel-9.4-appstream
      baseurl = http://base-4-19-rhel94.ocp.svc.cluster.local/rhel-9.4-appstream
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.4-fast-datapath]
      id = rhel-9.4-fast-datapath
      name = rhel-9.4-fast-datapath
      baseurl = http://base-4-19-rhel94.ocp.svc.cluster.local/rhel-9.4-fast-datapath
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.4-nfv]
      id = rhel-9.4-nfv
      name = rhel-9.4-nfv
      baseurl = http://base-4-19-rhel94.ocp.svc.cluster.local/rhel-9.4-nfv
      enabled = 1
      gpgcheck = 0
      
      [rhel-9.4-server-ose-4.19]
      id = rhel-9.4-server-ose-4.19
      name = rhel-9.4-server-ose-4.19
      baseurl = http://base-4-19-rhel94.ocp.svc.cluster.local/rhel-9.4-server-ose-4.19
      enabled = 1
      gpgcheck = 0   
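
Every `baseurl` above points at a `*.ocp.svc.cluster.local` host, which matches the "Could not resolve host" error in the logs. As a rough way to confirm which mirror hosts a node cannot resolve, the sketch below parses repo-file text and attempts a DNS lookup for each host. It is an illustration rather than a fix, and it assumes one URL per `baseurl` entry, as in the listing above. The function names `repo_hosts` and `unresolvable` are hypothetical.

```python
import configparser
import socket
from urllib.parse import urlparse


def repo_hosts(repo_text):
    """Return the sorted, de-duplicated hostnames used by baseurl entries."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    hosts = set()
    for section in parser.sections():
        baseurl = parser.get(section, "baseurl", fallback="")
        host = urlparse(baseurl.strip()).hostname
        if host:
            hosts.add(host)
    return sorted(hosts)


def unresolvable(hosts):
    """Return the hosts for which a DNS lookup fails on this machine."""
    failed = []
    for host in hosts:
        try:
            socket.getaddrinfo(host, 80)
        except socket.gaierror:
            failed.append(host)
    return failed
```

To use it on a node, read the contents of a file such as `/etc/yum.repos.d/ocp.repo`, pass the text to `repo_hosts`, and pass the result to `unresolvable`. For the listing above, `repo_hosts` would return two hosts: `base-4-19-rhel94.ocp.svc.cluster.local` and `base-4-19-rhel96.ocp.svc.cluster.local`.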


              Prashanth Sundararaman (psundara@redhat.com)
              Periyasamy Palanisamy (pepalani@redhat.com)
              Michael Nguyen
              Votes: 0
              Watchers: 5