Details

Type: Story
Resolution: Done
Priority: Minor
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- qe-ack

Story Points:
5
Blocked:
False
Ready:
False
Epic Link:
Core Networking Debuggability
Acceptance Criteria:
Running `oc adm must-gather --image network-tools -- tcpdump -i any -vvnne -w /root/must-gather/my_tcpdump.pcap` should start a packet trace on all cluster nodes, run it for a fixed duration (X, to be defined), and then download all pcaps locally. It should also be possible to target a specific node, as with any other must-gather.
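
A rough sketch of how a time-bounded, single-node variant of this command could be assembled. Assumptions not in this issue: the duration is bounded by wrapping tcpdump in coreutils `timeout` (which would have to be present in the image), the node is selected with the existing `--node-name` flag of `oc adm must-gather`, and `build_must_gather_cmd` plus the example values are hypothetical. The function only prints the command; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a must-gather command that captures
# on one node for a bounded number of seconds.
build_must_gather_cmd() {
    node="$1"      # node to target (passed to --node-name)
    seconds="$2"   # capture duration; the issue leaves the real value open
    printf 'oc adm must-gather --image network-tools --node-name %s -- timeout %s tcpdump -i any -vvnne -w /root/must-gather/my_tcpdump.pcap\n' \
        "$node" "$seconds"
}
```

For example, `build_must_gather_cmd worker-0 60` prints a command for a 60-second capture on `worker-0`. One thing to check before relying on this: `timeout` exits non-zero when it has to stop the command, so the gather step may need to tolerate that exit status.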
Feature Link:
OCPPLAN-6007 - OpenShift Core Networking Improvements
Release Note Text:
Undefined

Cost of Delay:
0
WSJF:
0.0

SFDC Cases Links:
SFDC Cases Counter:

Description

As an OpenShift engineer, I would like a tool that can take a packet capture and look for known "red flags" so that I can quickly rule in or rule out classes of errors.

For instance, we could show statistics on:

- Retries
- Excessive SYNs
- Bad packets
- Unexpected ICMP types
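
As an illustration only, the statistics above could be approximated by counting packets that match Wireshark display filters with `tshark`. Assumptions not in this issue: `tshark` is available, "retries" means TCP retransmissions, "bad packets" means packets the dissector marks as malformed, and any ICMP type other than echo request/reply counts as "unexpected". The function names are hypothetical, and what counts as "excessive" is left to the reader.

```shell
#!/bin/sh
# Hypothetical helper: map a red-flag name to a Wireshark display filter.
red_flag_filter() {
    case "$1" in
        retries)   printf '%s\n' 'tcp.analysis.retransmission' ;;
        syns)      printf '%s\n' 'tcp.flags.syn == 1 && tcp.flags.ack == 0' ;;
        malformed) printf '%s\n' '_ws.malformed' ;;
        icmp)      printf '%s\n' 'icmp && !(icmp.type == 0 || icmp.type == 8)' ;;
        *)         return 1 ;;   # unknown red-flag name
    esac
}

# Hypothetical helper: print one "name count" line per red flag for a pcap.
red_flag_report() {
    pcap="$1"
    for flag in retries syns malformed icmp; do
        # tshark prints one line per matching packet; count the lines.
        count=$(tshark -r "$pcap" -Y "$(red_flag_filter "$flag")" | wc -l | tr -d ' ')
        printf '%s %s\n' "$flag" "$count"
    done
}
```

Usage would be `red_flag_report my_tcpdump.pcap`. The counts are raw packet counts, so they are mostly useful for comparing a suspect trace against a known-good one.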

We should also provide a script that runs tcpdump after asking about the context: which node it is on, which IP addresses are of concern, what time the error was seen, and so on. In addition, we should provide sample queries and pre-filtering command lines that reduce the search space in tcpdump, along with usage tips (e.g. change the timestamp format so that two traces can be correlated).
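
A minimal sketch of what such a helper might look like, assuming a plain shell script is acceptable. All function names and the `/tmp/<node>.pcap` output path are hypothetical. The helpers only print commands rather than running them, and the "time the error was seen" question is left out to keep the sketch short. `-tttt` is the tcpdump option that prints full date-and-time timestamps when reading a trace back.

```shell
#!/bin/sh
# Hypothetical helper: join IP addresses into a pcap filter expression.
build_host_filter() {
    filter=""
    for ip in "$@"; do
        if [ -z "$filter" ]; then
            filter="host $ip"
        else
            filter="$filter or host $ip"
        fi
    done
    printf '%s\n' "$filter"
}

# Hypothetical helper: print (not run) a pre-filtered capture command.
build_capture_cmd() {
    node="$1"; shift
    printf 'tcpdump -i any -nn -w /tmp/%s.pcap %s\n' "$node" "$(build_host_filter "$@")"
}

# Hypothetical helper: print a command that reads a trace back with
# date-and-time timestamps so two traces can be lined up.
build_read_cmd() {
    printf 'tcpdump -tttt -nn -r %s\n' "$1"
}

# Hypothetical interactive front end: ask for context, then print the command.
ask_context() {
    printf 'Node name: ' >&2; read -r node
    printf 'IP addresses of concern (space separated): ' >&2; read -r ips
    # Word splitting of $ips is intentional: one argument per address.
    build_capture_cmd "$node" $ips
}
```

Calling `ask_context` prompts on stderr and prints the resulting capture command on stdout, so the output can be reviewed before it is run on a node.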

This should be done somewhere alongside a release so you can run the latest tooling against an older release.

Doing: ~~SDN-1838~~ we could include tcpdump in the network-tools image and trigger it at a given moment through either a dedicated CRD or a label on a node, with a dedicated controller in network-tools watching for the CRD or the label.
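
To make the label variant concrete, here is a rough polling sketch in shell. It stands in for the controller idea and is not a real controller. Assumptions not in this issue: the label key `example.com/tcpdump` is made up, the function names are hypothetical, and the per-node command reuses the must-gather invocation from the acceptance criteria with the existing `--node-name` flag. Commands are printed, not executed.

```shell
#!/bin/sh
# Hypothetical label key; the issue does not name one.
CAPTURE_LABEL="example.com/tcpdump"

# List the nodes that currently carry the label, one bare name per line.
labeled_nodes() {
    oc get nodes -l "$CAPTURE_LABEL" -o name | sed 's|^node/||'
}

# Print (not run) one capture command per labeled node.
plan_captures() {
    labeled_nodes | while read -r node; do
        printf 'oc adm must-gather --image network-tools --node-name %s -- tcpdump -i any -vvnne -w /root/must-gather/my_tcpdump.pcap\n' "$node"
    done
}
```

An admin would mark a node with `oc label node <node> example.com/tcpdump=true`, and `plan_captures` would then list what a watcher would launch. A CRD-based design would replace the label query with a watch on the custom resource.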

Attachments

SDN-1760.txt
9 kB
2021/11/30 9:33 PM

Issue Links

is blocked by

SDN-1838 Split out network-diagnostics from the CNO

Closed

is cloned by

SDN-3229 [Tools] backport options for multi-node, host-network must-gather

Closed

People

Assignee: Alexander Constantinescu Birhala (Inactive)

Reporter: Ben Bennett

Votes: 0

Watchers: 7

Dates

Created: 2021/04/08 11:59 AM

Updated: 2023/01/03 5:13 AM

Resolved: 2021/10/11 7:11 AM