OpenShift Monitoring / MON-4314

Fix OCP-48942 failure in 4.20


    • Type: Task
    • Resolution: Done
    • Quality / Stability / Reliability
    • MON Sprint 274, MON Sprint 275

      https://qe-component-readiness.dptools.openshift.org/sippy-ng/component_readiness/test_details?Architecture=amd64&FeatureSet=default&Installer=ipi&Network=ovn&Platform=aws&Suite=unknown&Topology=ha&Upgrade=none&baseEndTime=2025-06-17%2023%3A59%3A59&baseRelease=4.19&baseStartTime=2025-05-18%2000%3A00%3A00&capability=Other&columnGroupBy=Architecture%2CNetwork%2CPlatform&component=Monitoring&confidence=95&dbGroupBy=Platform%2CArchitecture%2CNetwork%2CTopology%2CFeatureSet%2CUpgrade%2CSuite%2CInstaller&environment=Architecture%3Aamd64%20FeatureSet%3Adefault%20Installer%3Aipi%20Network%3Aovn%20Platform%3Aaws%20Suite%3Aunknown%20Topology%3Aha%20Upgrade%3Anone&flakeAsFailure=false&ignoreDisruption=true&ignoreMissing=false&includeMultiReleaseAnalysis=false&includeVariant=Architecture%3Aamd64&includeVariant=FeatureSet%3Adefault&includeVariant=Installer%3Aipi&includeVariant=Installer%3Aupi&includeVariant=Network%3Aovn&includeVariant=Owner%3Aqe&includeVariant=Owner%3Aservice-delivery&includeVariant=Platform%3Aaws&includeVariant=Platform%3Aazure&includeVariant=Platform%3Agcp&includeVariant=Platform%3Arosa&includeVariant=Platform%3Avsphere&includeVariant=Topology%3Aha&minFail=3&passRateAllTests=0&passRateNewTests=0&pity=5&sampleEndTime=2025-08-06%2023%3A59%3A59&sampleRelease=4.20&sampleStartTime=2025-07-30%2000%3A00%3A00&testBasisRelease=4.19&testId=Cluster_Observability%3A2b696481e75709238bcaf5c11e36def4&testName=OCP-48942%3Atagao%3ACluster_Observability%3A%5Bsig-monitoring%5D%20Cluster_Observability%20parallel%20monitoring%20validation%20for%20scrapeTimeout%20and%20relabel%20configs


      OCP-48942 validation for scrapeTimeout and relabel configs

      The test has a 0% pass rate on 4.20. The failed run reports this error:

      OCP-48942:tagao:Cluster_Observability:[sig-monitoring] Cluster_Observability parallel monitoring validation for scrapeTimeout and relabel configs
      4m6s

                      {  fail [github.com/openshift/openshift-tests-private/test/extended/util/assert.go:30]: Unexpected error:
          <*errors.errorString | 0xc001ba2050>: 
          case: [sig-monitoring] Cluster_Observability parallel monitoring Author:tagao-Medium-48942-validation for scrapeTimeout and relabel configs
          error: failed to find "error="scrapeTimeout \"120s\" greater than scrapeInterval \"30s\""" in the pod logs
          {
              s: "case: [sig-monitoring] Cluster_Observability parallel monitoring Author:tagao-Medium-48942-validation for scrapeTimeout and relabel configs\nerror: failed to find \"error=\"scrapeTimeout \\\"120s\\\" greater than scrapeInterval \\\"30s\\\"\"\" in the pod logs",
          }
      occurred} 

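      Checked in 4.20.0-0.nightly-2025-07-31-063120: the error message is more specific in 4.20. It now carries an endpoints[0]: prefix, so the string the test expects no longer appears in the log: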

      error="endpoints[0]: scrapeTimeout \"120s\" greater than scrapeInterval \"30s\""

      see:

      $ oc -n openshift-monitoring logs -c prometheus-operator $(oc -n openshift-monitoring get pod -l app.kubernetes.io/name=prometheus-operator --no-headers | awk '{print $1}') | grep scrapeTimeout
      ts=2025-08-06T14:29:01.719461089Z level=warn caller=/go/src/github.com/coreos/prometheus-operator/pkg/prometheus/resource_selector.go:186 msg="skipping object" component=prometheus-controller key=openshift-monitoring/k8s kind=ServiceMonitor error="endpoints[0]: scrapeTimeout \"120s\" greater than scrapeInterval \"30s\"" object=openshift-console/console-test
      ts=2025-08-06T14:29:01.719598951Z level=info caller=/go/src/github.com/coreos/prometheus-operator/vendor/k8s.io/client-go/tools/record/event.go:389 msg="Event occurred" object.name=console-test object.namespace=openshift-console fieldPath="" kind=ServiceMonitor apiVersion=monitoring.coreos.com/v1 type=Warning reason=InvalidConfiguration message="\"openshift-console/console-test\" was rejected due to invalid configuration: endpoints[0]: scrapeTimeout \"120s\" greater than scrapeInterval \"30s\"" 

      https://github.com/openshift/openshift-tests-private/blob/release-4.20/test/extended/monitoring/monitoring.go#L429

      Changing the check at that line to the following would pass:

      checkLogWithLabel(oc, "openshift-monitoring", "app.kubernetes.io/name=prometheus-operator", "prometheus-operator", `scrapeTimeout \"120s\" greater than scrapeInterval \"30s\""`, true)
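
      The mismatch can be reproduced outside the cluster. The sketch below assumes that checkLogWithLabel does a plain substring search on the pod logs (not confirmed here); matches is a hypothetical stand-in for it. log420 is copied from the operator output above, while log419 is an assumed pre-4.20 format inferred from the string the test used to expect.

```go
package main

import (
	"fmt"
	"strings"
)

// Raw strings keep the backslashes that the operator writes when it
// quotes the error message inside the log line.
const (
	// Fragment of the operator log line observed on 4.20 (from the ticket).
	log420 = `kind=ServiceMonitor error="endpoints[0]: scrapeTimeout \"120s\" greater than scrapeInterval \"30s\"" object=openshift-console/console-test`

	// ASSUMPTION: pre-4.20 format, inferred from the substring the test expected.
	log419 = `kind=ServiceMonitor error="scrapeTimeout \"120s\" greater than scrapeInterval \"30s\"" object=openshift-console/console-test`

	// Substring named in the failure message: anchored on `error="`.
	oldNeedle = `error="scrapeTimeout \"120s\" greater than scrapeInterval \"30s\""`

	// Substring proposed in the ticket: the `error="` anchor is dropped.
	newNeedle = `scrapeTimeout \"120s\" greater than scrapeInterval \"30s\""`
)

// matches is a hypothetical stand-in for the substring check that
// checkLogWithLabel is assumed to perform.
func matches(logs, needle string) bool {
	return strings.Contains(logs, needle)
}

func main() {
	fmt.Println("old needle, 4.20 log:", matches(log420, oldNeedle)) // false
	fmt.Println("new needle, 4.20 log:", matches(log420, newNeedle)) // true
	fmt.Println("new needle, assumed 4.19 log:", matches(log419, newNeedle)) // true
}
```

      Dropping the error=" anchor is what makes the check insensitive to the new prefix. If the assumed 4.19 format is right, the same substring would keep matching on older releases.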
      

      Please fix.

       

              tagao@redhat.com Tai Gao
              juzhao@redhat.com Junqi Zhao