• Icon: Sub-task Sub-task
    • Resolution: Unresolved
    • Testing sub-task - see parent for rel note status
    • Done

      Verify story COO-498 add coo gather scripts

       

            [COO-518] Testing of COO resources must gather

            Hongyan Li added a comment -

            updated bug COO-584


            Hongyan Li added a comment -

            closed COO-574 and filed COO-584


            Hongyan Li added a comment -

            Filed bug https://issues.redhat.com/browse/COO-574

            Hongyan Li added a comment - - edited

            Verified with FBC image quay.io/redhat-user-workloads/cluster-observabilit-tenant/coo-fbc-v4-17@sha256:bea28f342b28e5065ded9510ac96dcb3644807348dffee362057f0389e54f35a, but must-gather still fails to run:

            % oc adm must-gather --image=quay.io/redhat-user-workloads/cluster-observabilit-tenant/cluster-observability-operator-container@sha256:1f9fd6dfed54e704dbd97a121736f8cde74ddf72d09006a52f7a62997f6f08b3 -- /usr/bin/gather
            [must-gather      ] OUT 2024-11-28T11:47:46.270936Z Using must-gather plug-in image: quay.io/redhat-user-workloads/cluster-observabilit-tenant/cluster-observability-operator-container@sha256:1f9fd6dfed54e704dbd97a121736f8cde74ddf72d09006a52f7a62997f6f08b3
            When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
            ClusterID: 9005cb41-2154-4c6f-a0ed-eafac28e39a7
            ClientVersion: 4.17.3
            ClusterVersion: Stable at "4.17.6"
            ClusterOperators:
            	All healthy and stable
            
            
            
            
            [must-gather      ] OUT 2024-11-28T11:47:48.26632Z namespace/openshift-must-gather-5jk2r created
            [must-gather      ] OUT 2024-11-28T11:47:48.504043Z clusterrolebinding.rbac.authorization.k8s.io/must-gather-j2mgf created
            [must-gather      ] OUT 2024-11-28T11:47:49.117795Z pod for plug-in image quay.io/redhat-user-workloads/cluster-observabilit-tenant/cluster-observability-operator-container@sha256:1f9fd6dfed54e704dbd97a121736f8cde74ddf72d09006a52f7a62997f6f08b3 created
            [must-gather-npwtn] POD 2024-11-28T11:47:50.548256903Z volume percentage checker started.....
            [must-gather-npwtn] POD 2024-11-28T11:47:50.549274294Z /bin/bash: line 14: /usr/bin/gather: Permission denied
            [must-gather-npwtn] POD 2024-11-28T11:47:50.559870985Z volume usage percentage 0
            [must-gather-npwtn] OUT 2024-11-28T11:48:00.100712Z waiting for gather to complete
            [must-gather-npwtn] OUT 2024-11-28T11:48:00.347023Z downloading gather output
            WARNING: cannot use rsync: rsync not available in container
            WARNING: cannot use tar: tar not available in container
            WARNING: cannot use rsync: rsync not available in container
            WARNING: cannot use tar: tar not available in container
            [must-gather-npwtn] OUT 2024-11-28T11:48:19.688838Z gather output not downloaded: No available strategies to copy.
            [must-gather-npwtn] OUT 2024-11-28T11:48:19.688985Z 
            [must-gather      ] OUT 2024-11-28T11:48:19.939127Z namespace/openshift-must-gather-5jk2r deleted
            
            
            
            
            Reprinting Cluster State:
            When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
            ClusterID: 9005cb41-2154-4c6f-a0ed-eafac28e39a7
            ClientVersion: 4.17.3
            ClusterVersion: Stable at "4.17.6"
            ClusterOperators:
            	All healthy and stable
            
            
            
            
            error: unable to download output from pod must-gather-npwtn: No available strategies to copy.
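
            The failure strings in the log above can be told apart mechanically. A minimal sketch, assuming the must-gather output has been saved as text; the function name `classify_gather_log` and the labels it prints are made up for illustration and are not part of `oc`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a saved must-gather log on stdin and prints one
# label per failure string that appears in the logs quoted in this issue.
classify_gather_log() {
  local log matched=no
  log="$(cat)"
  # The gather script exists but lacks the execute bit.
  if [[ $log == *"/usr/bin/gather: Permission denied"* ]]; then
    echo "gather-not-executable"; matched=yes
  fi
  # The gather script is not in the image at the expected path.
  if [[ $log == *"/usr/bin/gather: No such file or directory"* ]]; then
    echo "gather-missing"; matched=yes
  fi
  # Neither rsync nor tar was usable for copying the results out of the pod.
  if [[ $log == *"No available strategies to copy"* ]]; then
    echo "no-copy-tool"; matched=yes
  fi
  if [[ $matched == no ]]; then
    echo "unknown"
  fi
}
```

            Fed the 2024-11-28 log above (for example `classify_gather_log < must-gather.log`, where the file name is only an example), it would print `gather-not-executable` and `no-copy-tool`, since both strings appear in that run.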


            Hongyan Li added a comment -

            The upstream image hits the following issue:

            % oc adm must-gather --image=quay.io/rhobs/observability-operator:0.4.3-241120152757 -- /usr/bin/gather
            ......
            [must-gather      ] OUT 2024-11-25T03:02:34.540285Z namespace/openshift-must-gather-fg95n deleted
            
            
            Reprinting Cluster State:
            When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
            ClusterID: 021969ed-0e19-4939-aa8c-94ef5f6a4f03
            ClientVersion: 4.17.3
            ClusterVersion: Stable at "4.16.20"
            ClusterOperators:
            	All healthy and stable
            
            
            
            
            error: unable to download output from pod must-gather-49l8m: No available strategies to copy. 

             

             


            Jan Fajerski added a comment -

            Thanks, the fix is in https://github.com/rhobs/konflux-coo/pull/184

            Hongyan Li added a comment - - edited

            Tested with the latest downstream image: the "error: gather not start" problem is fixed, but a new issue appears:

            % oc adm must-gather --image=quay.io/redhat-user-workloads/cluster-observabilit-tenant/cluster-observability-operator-container@sha256:3bc7c7d06c15d6af604bf36510c0025acc42f94e83b5d23bb23dacb94c143fdb -- /usr/bin/gather
            [must-gather      ] OUT 2024-11-19T14:14:05.38695Z Using must-gather plug-in image: quay.io/redhat-user-workloads/cluster-observabilit-tenant/cluster-observability-operator-container@sha256:3bc7c7d06c15d6af604bf36510c0025acc42f94e83b5d23bb23dacb94c143fdb
            When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
            ClusterID: 02919076-cd48-44ce-b45e-618b63007972
            ClientVersion: 4.17.3
            ClusterVersion: Stable at "4.16.20"
            ClusterOperators:
            	All healthy and stable
            
            
            
            
            [must-gather      ] OUT 2024-11-19T14:14:08.218731Z namespace/openshift-must-gather-k2jvv created
            [must-gather      ] OUT 2024-11-19T14:14:08.4627Z clusterrolebinding.rbac.authorization.k8s.io/must-gather-6hdrm created
            [must-gather      ] OUT 2024-11-19T14:14:09.901493Z pod for plug-in image quay.io/redhat-user-workloads/cluster-observabilit-tenant/cluster-observability-operator-container@sha256:3bc7c7d06c15d6af604bf36510c0025acc42f94e83b5d23bb23dacb94c143fdb created
            [must-gather-vnd8q] POD 2024-11-19T14:14:10.559894058Z volume percentage checker started.....
            [must-gather-vnd8q] POD 2024-11-19T14:14:10.560986327Z /bin/bash: line 14: /usr/bin/gather: No such file or directory
            [must-gather-vnd8q] POD 2024-11-19T14:14:10.568118593Z volume usage percentage 0
            [must-gather-vnd8q] OUT 2024-11-19T14:14:21.182818Z waiting for gather to complete
            [must-gather-vnd8q] OUT 2024-11-19T14:14:21.425203Z downloading gather output
            WARNING: cannot use rsync: rsync not available in container
            WARNING: cannot use tar: tar not available in container
            WARNING: cannot use rsync: rsync not available in container
            WARNING: cannot use tar: tar not available in container
            [must-gather-vnd8q] OUT 2024-11-19T14:14:55.195915Z gather output not downloaded: No available strategies to copy.
            [must-gather-vnd8q] OUT 2024-11-19T14:14:55.196036Z 
            [must-gather      ] OUT 2024-11-19T14:14:55.447612Z namespace/openshift-must-gather-k2jvv deleted
            
            
            
            
            Reprinting Cluster State:
            When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
            ClusterID: 02919076-cd48-44ce-b45e-618b63007972
            ClientVersion: 4.17.3
            ClusterVersion: Stable at "4.16.20"
            ClusterOperators:
            	All healthy and stable
            
            
            
            
            error: unable to download output from pod must-gather-vnd8q: No available strategies to copy. 


            Jan Fajerski added a comment -

            Actually, the bash error only happens with our upstream Dockerfile, since it uses a distroless container. Will change it to busybox.

            Jan Fajerski added a comment - - edited

            Here is the real error

            pod-must-gather-zx46g.yaml

            container create failed: time="2024-11-13T09:33:26Z" level=error msg="runc create failed: unable to start container process: exec: \"/bin/bash\": stat /bin/bash: no such file or directory"
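
            Taken together, the errors in this issue suggest three things the must-gather image needs: `/bin/bash` (the runc error above), an executable `/usr/bin/gather`, and `tar` or `rsync` for copying the output. A rough sketch that checks an unpacked image filesystem for them; the function name is hypothetical, and it assumes the image has already been unpacked into a local directory (for example from a `podman export` tarball):

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check, not part of oc or the operator.
# $1 is a directory holding the unpacked root filesystem of the image.
check_must_gather_rootfs() {
  local root="$1" status=0 tool found=no
  # The pod is started through /bin/bash, so it has to exist and be executable.
  if [ ! -x "$root/bin/bash" ]; then
    echo "missing or not executable: /bin/bash"; status=1
  fi
  # The gather script must be present with its execute bit set.
  if [ ! -x "$root/usr/bin/gather" ]; then
    echo "missing or not executable: /usr/bin/gather"; status=1
  fi
  # The log warns about both rsync and tar; this assumes either one suffices.
  for tool in rsync tar; do
    if [ -x "$root/usr/bin/$tool" ] || [ -x "$root/bin/$tool" ]; then
      found=yes
    fi
  done
  if [ "$found" = no ]; then
    echo "missing: rsync or tar"; status=1
  fi
  return "$status"
}
```

            The function prints one line per missing piece and returns non-zero if anything is missing. Absolute symlinks inside the unpacked tree are not resolved relative to the image root, so treat a failure as a hint rather than proof.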

             
             


            Jan Fajerski added a comment -

            I can reproduce this in a cluster-bot cluster.

            QoS Class:                   BestEffort
            Node-Selectors:              kubernetes.io/os=linux
                                         node-role.kubernetes.io/master=
            Tolerations:                 op=Exists
            Events:
              Type     Reason            Age    From               Message
              ----     ------            ----   ----               -------
              Warning  FailedScheduling  6m19s  default-scheduler  0/1 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
              Warning  FailedScheduling  54s    default-scheduler  0/1 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.

            Not sure right now whether that is something we need to fix in our container or whether must-gather needs to be run differently.
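
            The FailedScheduling events above say that the only node did not match the pod's node selector (`kubernetes.io/os=linux` plus `node-role.kubernetes.io/master=`). A small sketch of that matching rule, useful for comparing the selector against a node's labels as listed by `oc get nodes --show-labels`; the function name `selector_matches` is hypothetical and the check only covers plain `key=value` selectors, not affinity expressions:

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if every key=value pair of the selector
# appears in the node's labels, 1 otherwise.
# Both arguments are comma-separated "key=value" lists.
selector_matches() {
  local selector="$1" labels=",$2,"
  local pair
  local IFS=','
  # Unquoted expansion splits the selector on commas (IFS is local to this function).
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;      # this pair is present on the node
      *) return 1 ;;       # one missing pair is enough to fail the match
    esac
  done
  return 0
}
```

            With the selector from the pod description, a node labelled only `kubernetes.io/os=linux,node-role.kubernetes.io/worker=` would not match, which is consistent with the scheduler message above.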

              hongyli@redhat.com Hongyan Li