OpenShift Bugs / OCPBUGS-9971

[AWS] Machine does not boot if IMDSv2 is enabled via machineset and the boot image belongs to an OpenShift version lower than 4.7


    • Bug
    • Resolution: Won't Do
    • 4.11
    • Quality / Stability / Reliability
    • Moderate

      Description of problem:

      If a machineset has IMDSv2 enabled (https://docs.openshift.com/container-platform/4.11/machine_management/creating_machinesets/creating-machineset-aws.html#machineset-creating-imds-options_creating-machineset-aws) and its AMI belongs to an OpenShift version lower than 4.7, the machine will not boot because the CoreOS in that boot image does not support IMDSv2. See OCPBUGSM-20654 for additional details.
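The version condition above can be sketched as a small shell helper. The function name `bootimage_supports_imdsv2` and the assumption that the boot image's OpenShift release is available as a `major.minor` string are illustrative only and are not part of this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exits 0 when the OpenShift release a boot image was
# built for is 4.7 or newer (the threshold named in this report), 1 when it
# is older, and 2 when the input is not a dotted numeric version.
bootimage_supports_imdsv2() {
  local version="$1" major minor
  major="${version%%.*}"   # text before the first dot
  minor="${version#*.}"    # text after the first dot
  minor="${minor%%.*}"     # drop any patch component, e.g. 4.6.12 -> 6
  [[ "$major" =~ ^[0-9]+$ && "$minor" =~ ^[0-9]+$ ]] || return 2
  # 10# forces base 10 so a leading zero is not read as octal.
  (( 10#$major > 4 || (10#$major == 4 && 10#$minor >= 7) ))
}

# Example: the 4.2 boot image used in the reproduction steps is too old.
if ! bootimage_supports_imdsv2 "4.2"; then
  echo "boot image predates 4.7: do not require IMDSv2 for this machineset"
fi
```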

      Version-Release number of selected component (if applicable):

      OCP 4.11+

      How reproducible:

      Always

      Steps to Reproduce:

      On OpenShift 4.11+, create and scale up a machineset that uses an old AMI (from OCP 4.2, for example). To update an existing machineset, the script below may be used; it takes the machineset name as its first argument:

      # Convert the current worker user data to Ignition spec 2.2.0, the only spec family old boot images accept
      oc -n openshift-machine-api get -o json secret worker-user-data-managed | jq '.data.userData | @base64d | fromjson' | jq '.networkd = {} | .passwd = {} | .storage = {} | .systemd = {} | .ignition.version = "2.2.0" | .ignition.security.tls.certificateAuthorities[0].verification = {} | .ignition.config.append = .ignition.config.merge | del(.ignition.config.merge) | .ignition.config.append[0].verification = {}' >v2.2.json
      # Store the converted config in a new user-data secret
      oc -n openshift-machine-api create secret generic worker-user-data-22 --from-file=userData=v2.2.json --from-literal=disableTemplating=true
      
      
      REGION="$(oc get -o json infrastructure cluster | jq -r .status.platformStatus.aws.region)"
      AMI="$(curl -s https://raw.githubusercontent.com/openshift/installer/release-4.2/data/data/rhcos.json | jq -r ".amis[\"${REGION}\"].hvm")"
      
      echo "Region: $REGION"
      echo "AMI: $AMI"
      
      echo "patching machineset: $1"
      
      oc -n openshift-machine-api patch machineset "$1" --type json -p "[{\"op\": \"add\", \"path\": \"/spec/template/spec/providerSpec/value/ami\", \"value\": {\"id\": \"${AMI}\"}}]"
      oc -n openshift-machine-api patch machineset "$1" --type json -p "[{\"op\": \"add\", \"path\": \"/spec/template/spec/providerSpec/value/userDataSecret\", \"value\": {\"name\": \"worker-user-data-22\"}}]"
      oc -n openshift-machine-api patch machineset "$1" --type json -p "[{\"op\": \"add\", \"path\": \"/spec/template/spec/providerSpec/value/metadataServiceOptions\", \"value\": {\"authentication\": \"Required\"}}]"
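Hand-escaped JSON inside double quotes is easy to get wrong. As a sketch only, the patch documents could instead be produced by a small helper. `json_patch_add` is a hypothetical name, and the helper assumes the path and value contain nothing that needs further JSON escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a one-operation JSON Patch ("add") document.
#   $1 = JSON pointer path
#   $2 = value, given as a raw JSON fragment
json_patch_add() {
  printf '[{"op": "add", "path": "%s", "value": %s}]\n' "$1" "$2"
}

base=/spec/template/spec/providerSpec/value
json_patch_add "$base/metadataServiceOptions" '{"authentication": "Required"}'

# Possible use against a cluster (machineset name in $1, as in the script above):
#   oc -n openshift-machine-api patch machineset "$1" --type json \
#     -p "$(json_patch_add "$base/metadataServiceOptions" '{"authentication": "Required"}')"
```

Single-quoting the value fragment keeps the shell from interpreting the inner double quotes, so no backslashes are needed.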
      

      Actual results:

      `ignition-disks.service` fails during machine boot with a message like:

      Ignition has failed. Please ensure your config is valid. Note that only Ignition spec
      v2.x.x configs are accepted.
      A CLI validation tool to check this called ignition-validate can be downloaded from GitHub:
          https://github.com/coreos/ignition/releases
      Note that the v0.x Ignition releases have the correct validator for config spec v2.x.x.
      Here are the Ignition logs:
      Ignition 0.33.0
      reading system config file "/usr/lib/ignition/base.ign"
      parsing config with SHA512: f5b0d067579d19bcee06ea95bcc9ed79a838db83f9dc9788af6e01229519ebf50ee96d330203701c481b1e11471704783f9996597eb2084d6294c7ba7f4db58e
      parsed url from cmdline: ""
      no config URL provided
      reading system config file "/usr/lib/ignition/user.ign"
      no config at "/usr/lib/ignition/user.ign"
      GET http://169.254.169.254/2009-04-04/user-data: attempt #1
      GET error: Get http://169.254.169.254/2009-04-04/user-data: dial tcp 169.254.169.254:80: connect: network is unreachable
      GET http://169.254.169.254/2009-04-04/user-data: attempt #2
      GET error: Get http://169.254.169.254/2009-04-04/user-data: dial tcp 169.254.169.254:80: connect: network is unreachable
      GET http://169.254.169.254/2009-04-04/user-data: attempt #3
      GET error: Get http://169.254.169.254/2009-04-04/user-data: dial tcp 169.254.169.254:80: connect: network is unreachable
      GET http://169.254.169.254/2009-04-04/user-data: attempt #4
      GET error: Get http://169.254.169.254/2009-04-04/user-data: dial tcp 169.254.169.254:80: connect: network is unreachable
      GET http://169.254.169.254/2009-04-04/user-data: attempt #5
      GET result: Unauthorized
      failed to fetch config: failed to fetch resource
      failed to acquire config: failed to fetch resource
      Ignition failed: failed to fetch resource
              2023-03-10T14:19:30+00:00
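The log shows the pattern to look for: a token-less (IMDSv1-style) GET of the user data from 169.254.169.254 that ends in `Unauthorized`, which is the response an instance with IMDSv2 required gives to requests that carry no session token. A minimal sketch for spotting this in a saved console log follows. The function name and the idea of saving the console output to a file are assumptions, not part of the report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit 0 if an Ignition console log (file in $1, or stdin
# when no argument is given) contains a metadata-service user-data GET that is
# later answered with "GET result: Unauthorized"; exit 1 otherwise.
ignition_log_shows_imds_unauthorized() {
  awk '
    /GET http:\/\/169\.254\.169\.254\/.*user-data/ { seen_get = 1 }
    seen_get && /GET result: Unauthorized/          { found = 1 }
    END { exit (found ? 0 : 1) }
  ' "${1:--}"
}

# Possible use, with console.log being a hypothetical file name:
#   ignition_log_shows_imds_unauthorized console.log \
#     && echo "boot image likely cannot use IMDSv2"
```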
      
      

      Expected results:

      Machine boots and joins the cluster

      Additional info:

      As of OCP 4.13 (March 2023), boot image upgrades are not supported, so clusters that were originally installed as OCP 4.6 or below are affected and cannot use IMDSv2.
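One way to check whether a cluster may fall into this group is to look at its update history. The sketch below assumes that `status.history` of the `ClusterVersion` object lists the most recent update first, so the last entry is the oldest recorded version. It requires `jq`, and the function name is hypothetical. On long-lived clusters the history may have been pruned, and a machineset's AMI may have been changed by hand, so treat the result as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read ClusterVersion JSON on stdin and print the oldest
# version recorded in .status.history (assumed to be ordered newest first).
initial_cluster_version() {
  jq -r '.status.history[-1].version'
}

# Possible use against a cluster:
#   oc get clusterversion version -o json | initial_cluster_version
```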

      Related bugs:

              mimccune@redhat.com Michael McCune
              dmoiseev Denis Moiseev (Inactive)
              Zhaohua Sun Zhaohua Sun