-
Story
-
Resolution: Done
-
Undefined
-
None
-
None
-
Quality / Stability / Reliability
-
False
-
-
False
-
2
-
1
-
None
-
None
-
OpenShift SPLAT - Sprint 268
User Story:
As an OpenShift Engineer I want
to review CI failures in platform-external clusters with prefix "[sig-network][Feature:Router] The HAProxy router should expose"[1]
so
that we can get confidence to the platform type External installations.
Description:
< Record any background information >
Jobs impacted for the same reason:
[sig-network][Feature:Router] The HAProxy router should expose a health check on the metrics port [Skipped:Disconnected] [Suite:openshift/conformance/parallel]
[sig-network][Feature:Router] The HAProxy router should expose prometheus metrics for a route [apigroup:route.openshift.io] [Skipped:Disconnected] [Suite:openshift/conformance/parallel]
[sig-network][Feature:Router] The HAProxy router should expose the profiling endpoints [Skipped:Disconnected] [Suite:openshift/conformance/parallel]
Failure cause:
[FAILED] Unexpected error: <*errors.errorString | 0xc0017b80a0>: host command failed: error running /usr/bin/kubectl --server=https://kubernetes.default.svc:443 --kubeconfig=/tmp/shared/kubeconfig --namespace=e2e-test-router-metrics-725qj exec execpod -- /bin/sh -x -c curl -s -o /dev/null -w '%{http_code}' "http://10.0.50.70:1936/healthz": Command stdout: 000 stderr: + curl -s -o /dev/null -w '%{http_code}' http://10.0.50.70:1936/healthz command terminated with exit code 28 error: exit status 28 000 {
When opening the TCP port 1936 to allow ingress traffic between worker node's Security Groups, the issue is cleaned, running openshift-tests directly:
$ cat test_haproxy.txt "[sig-network][Feature:Router] The HAProxy router should expose the profiling endpoints [Skipped:Disconnected] [Suite:openshift/conformance/parallel]" "[sig-network][Feature:Router] The HAProxy router should expose prometheus metrics for a route [apigroup:route.openshift.io] [Skipped:Disconnected] [Suite:openshift/conformance/parallel]" "[sig-network][Feature:Router] The HAProxy router should expose a health check on the metrics port [Skipped:Disconnected] [Suite:openshift/conformance/parallel]" ./openshift-tests run -f test_haproxy.txt --monitor="etcd-log-analyzer" --junit-dir=./test-haproxy-sg1936 | tee -a test-haproxy-sg1936.txt passed: (18s) 2025-03-15T06:31:02 "[sig-network][Feature:Router] The HAProxy router should expose a health check on the metrics port [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"passed: (22.2s) 2025-03-15T06:31:06 "[sig-network][Feature:Router] The HAProxy router should expose the profiling endpoints [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"passed: (1m4s) 2025-03-15T06:31:47 "[sig-network][Feature:Router] The HAProxy router should expose prometheus metrics for a route [apigroup:route.openshift.io] [Skipped:Disconnected] [Suite:openshift/conformance/parallel]" I0315 03:31:47.557251 1703452 reflector.go:311] Stopping reflector *v1.Pod (0s) from k8s.io/client-go@v0.31.1/tools/cache/reflector.go:243 3 pass, 0 skip (4m5s)
Acceptance Criteria:
- PR adding the port 1936 to the CI automation (cloudformation templates): https://github.com/openshift-eng/installer-labs/blob/main/installer-upi/aws/cloudformation/templates/03_cluster_security.yaml
- PR removing skips in the e2e workflow: https://github.com/openshift/release/blob/172a19c099e2c413c11f702a0c533054a074cc57/ci-operator/step-registry/openshift/e2e/external/aws/openshift-e2e-external-aws-workflow.yaml#L21
- Ensure platform type External e2e and opct workflows on AWS are passing
- Document it somewhere as inbound 1936/TCP port isn't expected to be opened in the docs[1], but there are some references in the OpenStack and metal[2]
Other Information:
< Record anything else that may be helpful to someone else picking up the card >
issue created by splat-bot
- duplicates
-
SPLAT-1854 [platform-external] investigate permanent failures in CI jobs caused by 'HA Proxy' tests on AWS provider
-
- Closed
-
- is related to
-
SPLAT-1854 [platform-external] investigate permanent failures in CI jobs caused by 'HA Proxy' tests on AWS provider
-
- Closed
-
-
SPLAT-2093 [platform-external] investigate e2e failure: [sig-network][Feature:Router] The HAProxy router should expose
-
- Closed
-
- relates to
-
SPLAT-2055 [platfotm-external][CI] Improve test scope of platform type External
-
- New
-
-
OCPBUGS-31407 [platform-external] e2e permanent test failures in OPCT e2e conformance workflow (openshift/conformance suite) in cluster installed w/ plaftorm type External cluster on provider AWS
-
- Closed
-
- links to