Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-45264

rendered machine config fails to apply when performance profile contains very big list of cpus

XMLWordPrintable

    • None
    • x86_64
    • False
    • Hide

      None

      Show
      None

      Description of problem:

          When Applying profile with isolated field containing huge cpu  list, profile doesn't apply and no errors is reported 

      Version-Release number of selected component (if applicable):

          4.18.0-0.nightly-2024-11-26-075648

      How reproducible:

          Everytime.

      Steps to Reproduce:

          1. Create a profile as specified below:
      apiVersion: performance.openshift.io/v2
      kind: PerformanceProfile
      metadata:
        annotations:
          kubeletconfig.experimental: '{"topologyManagerPolicy":"restricted"}'
        creationTimestamp: "2024-11-27T10:25:13Z"
        finalizers:
        - foreground-deletion
        generation: 61
        name: performance
        resourceVersion: "3001998"
        uid: 8534b3bf-7bf7-48e1-8413-6e728e89e745
      spec:
        cpu:
          isolated: 25,281,117,373,127,383,3,259,30,286,77,333,23,279,21,277,66,322,12,268,15,271,61,317,120,376,35,291,62,318,93,349,126,382,19,275,52,308,110,366,50,306,92,348,124,380,119,375,2,258,29,285,60,316,115,371,118,374,104,360,108,364,70,326,72,328,76,332,96,352,99,355,64,320,80,336,97,353,8,264,11,267,38,294,53,309,57,313,103,359,14,270,87,343,7,263,40,296,51,307,94,350,116,372,39,295,46,302,90,346,101,357,107,363,26,282,67,323,98,354,106,362,113,369,6,262,10,266,20,276,33,289,112,368,85,341,121,377,68,324,71,327,79,335,81,337,83,339,88,344,9,265,89,345,91,347,100,356,54,310,31,287,58,314,59,315,22,278,47,303,105,361,17,273,114,370,111,367,28,284,49,305,55,311,84,340,27,283,95,351,5,261,36,292,41,297,43,299,45,301,75,331,102,358,109,365,37,293,56,312,63,319,65,321,74,330,125,381,13,269,42,298,44,300,78,334,122,378,4,260,16,272,34,290,123,379,18,274,48,304,69,325,82,338,24,280,32,288,73,329,86,342,220,476,251,507,206,462,226,482,229,485,221,477,253,509,255,511,135,391,144,400,183,439,233,489,137,393,186,442,198,454,190,446,234,490,147,403,163,419,172,428,148,404,149,405,250,506,151,407,217,473,238,494,131,387,230,486,235,491,246,502,145,401,194,450,199,455,143,399,169,425,231,487,245,501,129,385,142,398,179,435,225,481,236,492,152,408,203,459,214,470,166,422,207,463,212,468,130,386,155,411,215,471,188,444,201,457,210,466,193,449,200,456,248,504,141,397,167,423,191,447,181,437,222,478,252,508,128,384,139,395,174,430,164,420,168,424,187,443,232,488,133,389,157,413,208,464,140,396,185,441,241,497,219,475,175,431,184,440,213,469,154,410,197,453,249,505,209,465,218,474,227,483,244,500,134,390,153,409,178,434,160,416,195,451,196,452,211,467,132,388,136,392,146,402,138,394,150,406,239,495,173,429,192,448,202,458,205,461,216,472,158,414,159,415,176,432,189,445,237,493,242,498,177,433,182,438,204,460,240,496,254,510,162,418,171,427,180,436,243,499,156,412,165,421,170,426,228,484,247,503,161,417,223,479,224,480
          reserved: 0,256,1,257
        hugepages:
          defaultHugepagesSize: 1G
          pages:
          - count: 20
            size: 2M
        machineConfigPoolSelector:
          machineconfiguration.openshift.io/role: worker-cnf
        net:
          userLevelNetworking: true
        nodeSelector:
          node-role.kubernetes.io/worker-cnf: ""
        numa:
          topologyPolicy: restricted
        realTimeKernel:
          enabled: false
        workloadHints:
          highPowerConsumption: true
          perPodPowerManagement: false
          realTime: true
      
          2. The worker-cnf node doesn't contain any kernel args associated with the above profile.
          3.
          

      Actual results:

          System doesn't boot with kernel args associated with above profile

      Expected results:

          System should boot with Kernel args presented from Performance Profile.

      Additional info:

      We can see MCO gets the details and creates the mc:
      
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com root[18779]: machine-config-daemon[9550]: "Running rpm-ostree [kargs --delete=systemd.unified_cgroup_hierarchy=1 --delete=cgroup_no_v1=\"all\" --delete=psi=0 --delete=skew_tick=1 --delete=tsc=reliable --delete=rcupda>
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com root[18779]: cbs=25,281,117,373,127,383,3,259,30,286,77,333,23,279,21,277,66,322,12,268,15,271,61,317,120,376,35,291,62,318,93,349,126,382,19,275,52,308,110,366,50,306,92,348,124,380,119,375,2,258,29,285,60,316,115,3>
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com root[18779]: 4,190,446,234,490,147,403,163,419,172,428,148,404,149,405,250,506,151,407,217,473,238,494,131,387,230,486,235,491,246,502,145,401,194,450,199,455,143,399,169,425,231,487,245,501,129,385,142,398,179,435,2>
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com root[18779]: systemd.cpu_affinity=0,1,256,257 --append=iommu=pt --append=amd_pstate=guided --append=tsc=reliable --append=nmi_watchdog=0 --append=mce=off --append=processor.max_cstate=1 --append=idle=poll --append=is>
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com root[18779]: ,78,334,122,378,4,260,16,272,34,290,123,379,18,274,48,304,69,325,82,338,24,280,32,288,73,329,86,342,220,476,251,507,206,462,226,482,229,485,221,477,253,509,255,511,135,391,144,400,183,439,233,489,137,393>
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com root[18779]: 510,162,418,171,427,180,436,243,499,156,412,165,421,170,426,228,484,247,503,161,417,223,479,224,480 --append=nohz_full=25,281,117,373,127,383,3,259,30,286,77,333,23,279,21,277,66,322,12,268,15,271,61,317>
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com root[18779]: ,476,251,507,206,462,226,482,229,485,221,477,253,509,255,511,135,391,144,400,183,439,233,489,137,393,186,442,198,454,190,446,234,490,147,403,163,419,172,428,148,404,149,405,250,506,151,407,217,473,238,49>
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com root[18779]: ppend=nosoftlockup --append=skew_tick=1 --append=rcutree.kthread_prio=11 --append=default_hugepagesz=1G --append=hugepagesz=2M --append=hugepages=20]"
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com rpm-ostree[18750]: client(id:machine-config-operator dbus:1.336 unit:crio-36c845a9c9a58a79a0e09dab668f8b21b5e46e5734a527c269c6a5067faa423b.scope uid:0) added; new total=1
      Dec 02 08:59:43 cnfdd11.t5g-dev.eng.rdu2.dc.redhat.com rpm-ostree[18750]: Loaded sysroot
      
      Actual Kernel args:
      BOOT_IMAGE=(hd1,gpt3)/boot/ostree/rhcos-854dd632224b34d5f4df1884c4ba8c2f9527422b37744b83e7b1b98172586ff4/vmlinuz-5.14.0-427.44.1.el9_4.x86_64 rw ostree=/ostree/boot.0/rhcos/854dd632224b34d5f4df1884c4ba8c2f9527422b37744b83e7b1b98172586ff4/0 ignition.platform.id=metal ip=dhcp root=UUID=0068e804-432c-409d-aabc-260aa71e3669 rw rootflags=prjquota boot=UUID=7797d927-876e-426b-9a30-d1e600c1a382 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=0 skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 nohz=on
      
          

              msivak@redhat.com Martin Sivak
              mniranja Mallapadi Niranjan
              Mallapadi Niranjan Mallapadi Niranjan
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Created:
                Updated: