- Bug
- Resolution: Done
- Critical
job link
another one
The aws-ovn-serial job is one we want to promote to a payload blocker, so its stability is critical.
It was stable when first created and was run dozens of times with cluster-bot. It was also stable at the beginning of its
history in production, but recently it seems to be close to perma-fail. Sippy link to the nightly
version.
Essentially, a test case that looks for too many occurrences of events/errors is turning up this:
{ 2 events happened too frequently

event happened 25 times, something is wrong: ns/openshift-ovn-kubernetes service/ovnkube-db - reason/FailedToUpdateEndpointSlices Error updating Endpoint Slices for Service openshift-ovn-kubernetes/ovnkube-db: node "ip-10-0-177-156.ec2.internal" not found

event happened 25 times, something is wrong: ns/openshift-ovn-kubernetes service/ovn-kubernetes-master - reason/FailedToUpdateEndpointSlices Error updating Endpoint Slices for Service openshift-ovn-kubernetes/ovn-kubernetes-master: node "ip-10-0-177-156.ec2.internal" not found}
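For triage across several failed runs, it may help to pull the fields out of that failure text. The following is only a rough sketch: the regular expression is modelled solely on the message quoted above, and the names `EVENT_RE`, `parse_frequent_events`, and `SAMPLE` are hypothetical helpers, not part of the test suite. The real test output may take other shapes that this pattern does not cover.

```python
import re

# Pattern derived only from the failure text quoted in this ticket
# (an assumption; other failures may be formatted differently).
EVENT_RE = re.compile(
    r'event happened (?P<count>\d+) times, something is wrong: '
    r'ns/(?P<namespace>\S+) service/(?P<service>\S+) - '
    r'reason/(?P<reason>\S+) (?P<message>.*?)'
    # Stop at the next "event happened N times" entry or at the closing brace.
    r'(?=\s*event happened \d+ times|\s*\}?\s*$)',
    re.DOTALL,
)

# Failure text copied from this ticket.
SAMPLE = (
    '{ 2 events happened too frequently '
    'event happened 25 times, something is wrong: '
    'ns/openshift-ovn-kubernetes service/ovnkube-db - '
    'reason/FailedToUpdateEndpointSlices Error updating Endpoint Slices for '
    'Service openshift-ovn-kubernetes/ovnkube-db: '
    'node "ip-10-0-177-156.ec2.internal" not found '
    'event happened 25 times, something is wrong: '
    'ns/openshift-ovn-kubernetes service/ovn-kubernetes-master - '
    'reason/FailedToUpdateEndpointSlices Error updating Endpoint Slices for '
    'Service openshift-ovn-kubernetes/ovn-kubernetes-master: '
    'node "ip-10-0-177-156.ec2.internal" not found}'
)


def parse_frequent_events(failure_text):
    """Return one dict per 'event happened N times' entry in the text."""
    events = []
    for match in EVENT_RE.finditer(failure_text):
        event = match.groupdict()
        event["count"] = int(event["count"])
        # Extract the node name when the message is a "node not found" error.
        node = re.search(r'node "([^"]+)" not found', event["message"])
        event["missing_node"] = node.group(1) if node else None
        events.append(event)
    return events


if __name__ == "__main__":
    for event in parse_frequent_events(SAMPLE):
        print(event["service"], event["reason"], event["count"], event["missing_node"])
```

Grouping the parsed entries by `missing_node` across runs would show whether the repeated events always point at a single node that is no longer present, as they do in this instance.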
link to this job's testgrid for reference.
- blocks
  - SDN-3115 increase aws-ovn-serial trt alert threshold (Closed)
- links to