OpenShift Bugs / OCPBUGS-56199

TNF: tnf-after-setup- pod timeout


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • 4.20.0
    • 4.19.0
    • Two Node Fencing
    • None
    • Quality / Stability / Reliability
    • False
    • 3
    • In Progress
    • Release Note Not Required

      Description of problem

      Deploying a TNF (Two Node Fencing) cluster with dev-scripts using the following config:

      [frmoreno@titan60 dev-scripts]$ grep -e "^export" config_frmoreno.sh
      export CI_TOKEN='<deleted>'
      export OPENSHIFT_RELEASE_STREAM=4.19
      export OPENSHIFT_RELEASE_TYPE=nightly
      export CLUSTER_NAME="tnf"
      export BASE_DOMAIN="titan60.local"
      export IP_STACK=v4
      export FEATURE_SET="DevPreviewNoUpgrade"
      export NUM_MASTERS=2
      export MASTER_MEMORY=32768
      export MASTER_DISK=100
      export NUM_WORKERS=0

      finishes successfully and the cluster is active, but the following pods are not in the Running or Succeeded phase:

      [frmoreno@titan60 dev-scripts]$ oc get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded
      NAMESPACE                           NAME                              READY   STATUS    RESTARTS   AGE
      openshift-etcd                      tnf-after-setup-master-0-24sz5    0/1     Error     0          39m
      openshift-etcd                      tnf-after-setup-master-1-rxzjj    0/1     Error     0          39m
      openshift-kube-controller-manager   installer-3-master-0              0/1     Error     0          38m
      openshift-kube-controller-manager   installer-3-retry-1-master-0      0/1     Error     0          37m
      openshift-kube-controller-manager   installer-3-retry-2-master-0      0/1     Error     0          36m
      openshift-monitoring                metrics-server-6c8849d45b-jwvtw   0/1     Pending   0          2m36s

       

      Version-Release number of selected component (if applicable):

      [frmoreno@titan60 dev-scripts]$ git log -1
      commit 333707ec15638941f39c7be3c70f72d4705d4286 (HEAD -> master, origin/master, origin/HEAD)
      Author: Daniel Erez <danielerez@gmail.com>
      Date:   Tue May 13 16:32:20 2025 +0300

          appliance: remove cache/temp dirs (#1763)

          Delete the unused cache/temp directories to avoid
          storage overconsumption on the CI machine.
      

      How reproducible:

      10/10 on my setup (titan60.lab.eng.tlv2.redhat.com)

      Steps to Reproduce:

      Set up a config_<user>.sh file with the settings above and run "make".

      Actual results:

      The OpenShift cluster comes up, but some pods are in Error state.

      Expected results:

      No pods in Error state after the cluster installation finishes.
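
The expected result can be checked with the same field selector used in the description. This is a minimal sketch, not part of dev-scripts: it assumes `oc` is already logged in to the cluster, and the function name `check_no_unhealthy_pods` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report pods whose phase is neither Running nor Succeeded.
# Returns 0 if there are none, 1 if some exist, 2 if the oc call itself fails.
check_no_unhealthy_pods() {
  local out count
  # With --no-headers, stdout is expected to be empty when nothing matches.
  out=$(oc get pods --all-namespaces --no-headers \
        --field-selector='status.phase!=Running,status.phase!=Succeeded') || return 2
  if [ -z "$out" ]; then
    echo "OK: no pods outside Running/Succeeded"
    return 0
  fi
  count=$(printf '%s\n' "$out" | wc -l | tr -d ' ')
  echo "FAIL: $count pod(s) outside Running/Succeeded"
  printf '%s\n' "$out"
  return 1
}
```

Running `check_no_unhealthy_pods` after "make" finishes would flag the state reported here, since the pods shown with Error and Pending status are both excluded by the Running/Succeeded filter.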

      Additional info:

      [frmoreno@titan60 dev-scripts]$ oc logs tnf-after-setup-master-0-24sz5 -n openshift-etcd
      I0514 14:36:12.456170    5688 runner.go:21] Setting up clients etc. for TNF after setup job
      I0514 14:36:12.459144    5688 runner.go:46] Running TNF after setup
      I0514 14:36:12.459206    5688 runner.go:48] Waiting for completed setup job
      W0514 14:36:12.469463    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:36:27.492949    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:36:42.468379    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:36:57.473053    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:37:12.483202    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:37:27.467775    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:37:42.465685    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:37:57.470420    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:38:12.467511    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:38:27.466861    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:38:42.465858    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:38:57.465596    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:39:12.464010    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:39:27.465894    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:39:42.466055    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:39:57.466981    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:40:12.465135    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:40:27.465901    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:40:42.464409    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:40:57.466121    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:41:12.465634    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:41:27.466970    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:41:42.465342    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:41:57.466299    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:42:12.465636    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:42:27.466331    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:43:42.466263    5688 runner.go:54] Failed to list jobs: the server was unable to return a response in the time allotted, but may still be processing the request (get jobs.batch)
      W0514 14:43:47.484449    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:43:57.464631    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:44:12.466499    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:44:27.471363    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:44:42.470107    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:44:57.465304    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:45:12.474078    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:45:27.467673    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:45:42.465881    5688 runner.go:62] Job tnf-setup not complete
      W0514 14:46:05.492669    5688 runner.go:62] Job tnf-setup not complete
      E0514 14:46:12.459471    5688 runner.go:70] Timed out waiting for setup job to complete: context deadline exceeded
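
The timestamps in the log above suggest that the pod polled about every 15 seconds and gave up 10 minutes after it started (14:36:12 to 14:46:12). The following is a small sketch for summarizing such a log. The function name `summarize_tnf_log` is hypothetical, and it assumes klog-style lines (severity letter plus MMDD, then HH:MM:SS.micros) that do not cross midnight.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a klog-style log on stdin and print the first and
# last timestamps, the elapsed whole seconds between them, and the number of
# "not complete" poll warnings.
summarize_tnf_log() {
  awk '
    $1 ~ /^[IWEF][0-9][0-9][0-9][0-9]$/ {
      split($2, t, /[:.]/)                      # t[1]=HH, t[2]=MM, t[3]=SS
      secs = t[1] * 3600 + t[2] * 60 + t[3]
      if (!seen) { seen = 1; first = secs; first_ts = $2 }
      last = secs; last_ts = $2
      if ($0 ~ /not complete/) polls++
    }
    END {
      if (!seen) { print "no klog lines found"; exit 1 }
      printf "first=%s last=%s elapsed_s=%d polls=%d\n", first_ts, last_ts, last - first, polls
    }'
}
```

For example: `oc logs tnf-after-setup-master-0-24sz5 -n openshift-etcd | summarize_tnf_log`. Since the pod only waits for the `tnf-setup` job, inspecting that job (for instance with `oc describe job tnf-setup -n openshift-etcd`, assuming it lives in the same namespace) is probably the more useful next step.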
          

              slintes Marc Sluiter
              frmoreno Francisco Javier Moreno Moreno
              None
              None
              Douglas Hensel Douglas Hensel
              None