OpenShift Bugs / OCPBUGS-53432

CI fails on TestIngressControllerCustomEndpoints


    • Quality / Stability / Reliability
    • False
    • Low
    • Rejected
    • NI&D Sprint 275, NI&D Sprint 276
    • 2
    • In Progress
    • Release Note Not Required

      Description of problem

      CI is flaky because of test failures such as the following:

      === RUN   TestAll/serial/TestIngressControllerCustomEndpoints
          operator_test.go:2644: Expected conditions: map[Admitted:True Available:True DNSManaged:True DNSReady:True LoadBalancerManaged:True LoadBalancerReady:True]
               Current conditions: map[Admitted:True Available:False DNSManaged:True DNSReady:False Degraded:True DeploymentAvailable:True DeploymentReplicasAllAvailable:True DeploymentReplicasMinAvailable:True DeploymentRollingOut:False EvaluationConditionsDetected:False LoadBalancerManaged:True LoadBalancerProgressing:False LoadBalancerReady:False Progressing:False Upgradeable:True]
          operator_test.go:2644: Ingress Controller openshift-ingress-operator/test-custom-endpoints status: {
                "availableReplicas": 1,
                "selector": "ingresscontroller.operator.openshift.io/deployment-ingresscontroller=test-custom-endpoints",
                "domain": "test-custom-endpoints.ci-op-y9qzniri-9e7c5.origin-ci-int-aws.dev.rhcloud.com",
                "endpointPublishingStrategy": {
                  "type": "LoadBalancerService",
                  "loadBalancer": {
                    "scope": "External",
                    "providerParameters": {
                      "type": "AWS",
                      "aws": {
                        "type": "Classic",
                        "classicLoadBalancer": {
                          "connectionIdleTimeout": "0s"
                        }
                      }
                    },
                    "dnsManagementPolicy": "Managed"
                  }
                },
                "conditions": [
                  {
                    "type": "Admitted",
                    "status": "True",
                    "lastTransitionTime": "2025-03-20T18:14:42Z",
                    "reason": "Valid"
                  },
                  {
                    "type": "DeploymentAvailable",
                    "status": "True",
                    "lastTransitionTime": "2025-03-20T18:15:46Z",
                    "reason": "DeploymentAvailable",
                    "message": "The deployment has Available status condition set to True"
                  },
                  {
                    "type": "DeploymentReplicasMinAvailable",
                    "status": "True",
                    "lastTransitionTime": "2025-03-20T18:15:46Z",
                    "reason": "DeploymentMinimumReplicasMet",
                    "message": "Minimum replicas requirement is met"
                  },
                  {
                    "type": "DeploymentReplicasAllAvailable",
                    "status": "True",
                    "lastTransitionTime": "2025-03-20T18:15:46Z",
                    "reason": "DeploymentReplicasAvailable",
                    "message": "All replicas are available"
                  },
                  {
                    "type": "DeploymentRollingOut",
                    "status": "False",
                    "lastTransitionTime": "2025-03-20T18:15:46Z",
                    "reason": "DeploymentNotRollingOut",
                    "message": "Deployment is not actively rolling out"
                  },
                  {
                    "type": "LoadBalancerManaged",
                    "status": "True",
                    "lastTransitionTime": "2025-03-20T18:14:43Z",
                    "reason": "WantedByEndpointPublishingStrategy",
                    "message": "The endpoint publishing strategy supports a managed load balancer"
                  },
                  {
                    "type": "LoadBalancerReady",
                    "status": "False",
                    "lastTransitionTime": "2025-03-20T18:14:43Z",
                    "reason": "LoadBalancerPending",
                    "message": "The LoadBalancer service is pending"
                  },
                  {
                    "type": "LoadBalancerProgressing",
                    "status": "False",
                    "lastTransitionTime": "2025-03-20T18:14:43Z",
                    "reason": "LoadBalancerNotProgressing",
                    "message": "LoadBalancer is not progressing"
                  },
                  {
                    "type": "DNSManaged",
                    "status": "True",
                    "lastTransitionTime": "2025-03-20T18:14:43Z",
                    "reason": "Normal",
                    "message": "DNS management is supported and zones are specified in the cluster DNS config."
                  },
                  {
                    "type": "DNSReady",
                    "status": "False",
                    "lastTransitionTime": "2025-03-20T18:14:43Z",
                    "reason": "RecordNotFound",
                    "message": "The wildcard record resource was not found."
                  },
                  {
                    "type": "Available",
                    "status": "False",
                    "lastTransitionTime": "2025-03-20T18:14:43Z",
                    "reason": "IngressControllerUnavailable",
                    "message": "One or more status conditions indicate unavailable: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)"
                  },
                  {
                    "type": "Progressing",
                    "status": "False",
                    "lastTransitionTime": "2025-03-20T18:15:46Z"
                  },
                  {
                    "type": "Degraded",
                    "status": "True",
                    "lastTransitionTime": "2025-03-20T18:16:13Z",
                    "reason": "DegradedConditions",
                    "message": "One or more other status conditions indicate a degraded state: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)"
                  },
                  {
                    "type": "Upgradeable",
                    "status": "True",
                    "lastTransitionTime": "2025-03-20T18:14:43Z",
                    "reason": "Upgradeable",
                    "message": "IngressController is upgradeable."
                  },
                  {
                    "type": "EvaluationConditionsDetected",
                    "status": "False",
                    "lastTransitionTime": "2025-03-20T18:14:43Z",
                    "reason": "NoEvaluationCondition",
                    "message": "No evaluation condition is detected."
                  }
                ],
                "tlsProfile": {
                  "ciphers": [
                    "ECDHE-ECDSA-AES128-GCM-SHA256",
                    "ECDHE-RSA-AES128-GCM-SHA256",
                    "ECDHE-ECDSA-AES256-GCM-SHA384",
                    "ECDHE-RSA-AES256-GCM-SHA384",
                    "ECDHE-ECDSA-CHACHA20-POLY1305",
                    "ECDHE-RSA-CHACHA20-POLY1305",
                    "DHE-RSA-AES128-GCM-SHA256",
                    "DHE-RSA-AES256-GCM-SHA384",
                    "TLS_AES_128_GCM_SHA256",
                    "TLS_AES_256_GCM_SHA384",
                    "TLS_CHACHA20_POLY1305_SHA256"
                  ],
                  "minTLSVersion": "VersionTLS12"
                },
                "observedGeneration": 1
              }
          operator_test.go:2645: failed to observe expected conditions: timed out waiting for the condition
          operator_test.go:2647: deleted ingresscontroller test-custom-endpoints
      

      This particular failure comes from https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_cluster-ingress-operator/1152/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator-techpreview/1902755473363308544. Search.ci has other similar failures.
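The expected and current condition maps in the output above are long enough that the mismatches are easy to miss. The following is a minimal sketch of a comparison that lists only the conditions that differ. The helper name `mismatchedConditions` is hypothetical and is not taken from the operator's test code; the map values are copied from the failure output.

```go
package main

import (
	"fmt"
	"sort"
)

// mismatchedConditions is a hypothetical helper. It returns the sorted names
// of the condition types whose current status differs from the expected
// status, or that are missing from the current conditions entirely.
func mismatchedConditions(expected, current map[string]string) []string {
	var names []string
	for condType, want := range expected {
		if got, ok := current[condType]; !ok || got != want {
			names = append(names, condType)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Expected conditions, as printed by the failing test.
	expected := map[string]string{
		"Admitted":            "True",
		"Available":           "True",
		"DNSManaged":          "True",
		"DNSReady":            "True",
		"LoadBalancerManaged": "True",
		"LoadBalancerReady":   "True",
	}
	// Current conditions, limited here to the types the test waits on.
	current := map[string]string{
		"Admitted":            "True",
		"Available":           "False",
		"DNSManaged":          "True",
		"DNSReady":            "False",
		"LoadBalancerManaged": "True",
		"LoadBalancerReady":   "False",
	}
	for _, name := range mismatchedConditions(expected, current) {
		fmt.Printf("%s: expected %s, got %s\n", name, expected[name], current[name])
	}
}
```

For this failure, the comparison reports Available, DNSReady, and LoadBalancerReady as the conditions that never reached the expected status.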

      Version-Release number of selected component (if applicable)

      I have seen this in 4.19 CI jobs.

      How reproducible

      Presently, search.ci shows the following stats for the past 14 days:

      pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator-techpreview (all) - 53 runs, 64% failed, 6% of failures match = 4% impact
      pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator (all) - 52 runs, 46% failed, 8% of failures match = 4% impact
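The impact figures above appear to be the product of the two percentages on each line, rounded to a whole number. The sketch below reproduces that arithmetic under that assumption; `impactPercent` is a hypothetical name, not part of search.ci.

```go
package main

import (
	"fmt"
	"math"
)

// impactPercent assumes that "impact" is the share of all runs that hit the
// matching failure: (percent of runs that failed) x (percent of failures that
// match), rounded to the nearest whole percent.
func impactPercent(failedPct, matchPct float64) int {
	return int(math.Round(failedPct * matchPct / 100))
}

func main() {
	// e2e-aws-operator-techpreview: 64% failed, 6% of failures match.
	fmt.Printf("techpreview impact: %d%%\n", impactPercent(64, 6))
	// e2e-aws-operator: 46% failed, 8% of failures match.
	fmt.Printf("default impact: %d%%\n", impactPercent(46, 8))
}
```

Both jobs work out to roughly 4% under this reading, which matches the figures reported by search.ci.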
      

      Steps to Reproduce

      1. Post a PR and have bad luck.
      2. Check search.ci.

      Actual results

      CI fails.

      Expected results

      CI passes, or fails on some other test failure.

      Additional info

      In the search.ci results, all of the failures are in e2e-aws-operator jobs.

      The test output isn't very helpful in diagnosing the failures. The output shows LoadBalancerReady:False with the status condition message "The LoadBalancer service is pending" and DNSReady:False with the status condition message "The wildcard record resource was not found." Unfortunately, the must-gather archives did not capture any relevant ingress-operator logs or DNSRecord CRs. It would be useful if the test output included the DNSRecord CR manifest.
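As an illustration of the suggestion above, the following sketch formats an object as indented JSON so that it could be written to the test log when the wait times out. Everything here is an assumption for illustration: `dnsRecord` is a trimmed-down, hypothetical stand-in for the real DNSRecord type, `formatManifest` is a hypothetical helper, and the sample values are placeholders. A real change would fetch the DNSRecord CR with the test's Kubernetes client and log it with `t.Logf`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dnsRecord is a hypothetical, simplified stand-in for the DNSRecord CR.
// It is not the operator's actual API type.
type dnsRecord struct {
	Name    string   `json:"name"`
	DNSName string   `json:"dnsName"`
	Targets []string `json:"targets"`
}

// formatManifest is a hypothetical helper that renders obj as indented JSON,
// prefixed with its kind, in a form suitable for a test log line.
func formatManifest(kind string, obj interface{}) string {
	data, err := json.MarshalIndent(obj, "", "  ")
	if err != nil {
		return fmt.Sprintf("%s manifest: failed to marshal: %v", kind, err)
	}
	return fmt.Sprintf("%s manifest: %s", kind, data)
}

func main() {
	// Placeholder values; a real test would use the object read from the cluster.
	record := dnsRecord{
		Name:    "example-wildcard",
		DNSName: "*.example.test.",
		Targets: []string{"lb.example.test"},
	}
	fmt.Println(formatManifest("DNSRecord", record))
}
```

Logging the manifest (or the "not found" error from the lookup) next to the IngressController status would show whether the DNSRecord was never created or was created but never published.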

              rh-ee-rpchevuz Ricardo Pchevuzinske Katz
              mmasters1@redhat.com Miciah Masters
              Ishmam Amin Ishmam Amin