Details
Story
Resolution: Done
Minor
Description
As an OpenShift engineer, I would like a tool that can take a packet capture and look for known "red flags" so that I can quickly rule in or rule out classes of errors.
For instance, we could show statistics on:
- Retries
- Excessive SYNs
- Bad packets
- Unexpected ICMP types
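As a rough illustration of what such statistics could look like, here is a minimal Python sketch. It assumes the capture has already been decoded into simple records. The `Packet` type, the `red_flags` function, the SYN threshold, and the set of "expected" ICMP types are all hypothetical choices made for this example, not part of any existing tool. "Retries" is interpreted here as TCP retransmissions (a data segment seen twice with the same flow and sequence number), which is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Packet:
    """Hypothetical, already-decoded packet record (not a real library type)."""
    src: str
    dst: str
    proto: str                      # "tcp" or "icmp"
    sport: int = 0
    dport: int = 0
    syn: bool = False
    ack: bool = False
    seq: int = 0
    payload_len: int = 0
    icmp_type: Optional[int] = None
    checksum_ok: bool = True


# Assumption: only echo reply (0) and echo request (8) count as "expected".
EXPECTED_ICMP_TYPES = {0, 8}
# Assumption: this many bare SYNs on one flow counts as "excessive".
SYN_THRESHOLD = 3


def red_flags(packets: Iterable[Packet]) -> dict:
    """Count simple red flags in an iterable of decoded packets."""
    stats = {
        "retransmissions": 0,
        "excessive_syn_flows": 0,
        "bad_packets": 0,
        "unexpected_icmp": 0,
    }
    seen_segments = set()   # (flow, seq) pairs of data segments already seen
    syn_counts = {}         # flow -> number of SYNs without ACK

    for p in packets:
        if not p.checksum_ok:
            stats["bad_packets"] += 1
            continue
        if p.proto == "tcp":
            flow = (p.src, p.sport, p.dst, p.dport)
            if p.syn and not p.ack:
                syn_counts[flow] = syn_counts.get(flow, 0) + 1
            elif p.payload_len > 0:
                key = (flow, p.seq)
                if key in seen_segments:
                    stats["retransmissions"] += 1
                seen_segments.add(key)
        elif p.proto == "icmp" and p.icmp_type not in EXPECTED_ICMP_TYPES:
            stats["unexpected_icmp"] += 1

    stats["excessive_syn_flows"] = sum(
        1 for count in syn_counts.values() if count >= SYN_THRESHOLD
    )
    return stats
```

A real tool would need a proper pcap decoder in front of this and more careful retransmission detection (sequence wraparound, keepalives, and so on); the sketch only shows the shape of the per-class counters.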
We should also provide a script that runs tcpdump but first asks about the context: which node it is running on, which IP addresses are of concern, when the error was seen, and so on. We should also provide sample queries and pre-filtering command lines that help reduce the search space in tcpdump, along with tips on how to use it (e.g. change the timestamp format so you can correlate two traces).
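One possible shape for such a wrapper is sketched below in Python. The function names (`gather_context`, `capture_command`, `read_command`), the questions asked, and the output file naming are hypothetical and only illustrate the idea. The tcpdump flags used (`-i`, `-nn`, `-w`, `-r`, `-tttt`) and the filter expressions follow standard tcpdump and pcap-filter syntax. Note that `-tttt` affects printed output only, so it is applied when reading a capture back, which is where correlating two traces happens.

```python
def gather_context(ask=input):
    """Ask the operator for context; `ask` is injectable so it can be scripted."""
    node = ask("Node name: ").strip()
    interface = ask("Interface to capture on: ").strip()
    ips = ask("IP addresses of concern (space separated): ").split()
    seen_at = ask("When was the error seen: ").strip()
    return {"node": node, "interface": interface, "ips": ips, "seen_at": seen_at}


def build_filter(ips):
    """Pre-filter: keep only traffic to or from the addresses of concern."""
    return " or ".join(f"host {ip}" for ip in ips)


def capture_command(ctx):
    """Build the tcpdump argument list for capturing to a file."""
    out_file = f"{ctx['node']}-capture.pcap"   # hypothetical naming scheme
    cmd = ["tcpdump", "-i", ctx["interface"], "-nn", "-w", out_file]
    if ctx["ips"]:
        cmd.append(build_filter(ctx["ips"]))
    return cmd


def read_command(pcap_file, query=""):
    """Build the argument list for reading a capture with date-stamped times."""
    cmd = ["tcpdump", "-nn", "-tttt", "-r", pcap_file]
    if query:
        cmd.append(query)
    return cmd


# Sample queries that narrow a capture to likely red flags.
SAMPLE_QUERIES = {
    "syn_only": "tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn",
    "resets": "tcp[tcpflags] & tcp-rst != 0",
    "icmp_non_echo": (
        "icmp and icmp[icmptype] != icmp-echo "
        "and icmp[icmptype] != icmp-echoreply"
    ),
}
```

The argument lists could be passed to `subprocess.run` on the node. Returning them instead of running them keeps the sketch side-effect free and lets the operator review the command first.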
This should be done somewhere alongside a release so you can run the latest tooling against an older release.
Doing: SDN-1838. We could include tcpdump in the network-tools image and launch it at a given moment through either a dedicated CRD or a label on a node, with a dedicated controller in network-tools watching for the CRD or the label.