Task
Resolution: Done
Major
rhel-net-ovn
ssg_networking
OVN FDP Sprint 9
OVN observability is a first step toward supporting network observability for RHOSO. It consists mainly of three parts:
- Monitoring data collection from OVS/OVN
- Network policy correlation
- Topology context for OVN, VM endpoints, service/application
This ticket focuses on the data collection part and aims to identify the gaps between the required metrics and the metrics already exposed by OVN.
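The gap analysis described above amounts to a set difference between the required metric names and the names already exposed. A minimal sketch, assuming both lists are available as plain strings; the function name `find_metric_gaps` and the sample inputs are hypothetical and used only for illustration:

```python
def find_metric_gaps(required, available):
    """Return the required metric names that are not in the available set, sorted."""
    return sorted(set(required) - set(available))


# Hypothetical inputs: a few names from the required list and an assumed
# subset that is already exposed. These are not real measurement results.
required = ["db_size", "cluster_term", "cluster_leader", "ovsdb_monitors"]
available = ["db_size", "cluster_term"]

print(find_metric_gaps(required, available))
```

Sorting the result keeps the gap report stable between runs, which makes it easier to diff as the exposed metrics change.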
Below is the list of the metrics expected to support network observability for RHOSO.
1. ovn-db-raft metrics
Name | Definition |
---|---|
build_info | A metric with a constant '1' value labeled by ovsdb-server version and NB and SB schema version |
db_size | The size of the database file associated with the OVN DB component. |
cluster_election_timer | A metric that returns the current election timer value labeled by database name, cluster uuid, and server uuid |
cluster_id | A metric with a constant '1' value labeled by database name and cluster uuid |
cluster_server_id | A metric with a constant '1' value labeled by database name, cluster uuid and server uuid |
cluster_server_role | A metric with a constant '1' value labeled by database name, cluster uuid, server uuid and server role |
cluster_server_status | A metric with a constant '1' value labeled by database name, cluster uuid, server uuid and server status |
cluster_server_vote | A metric with a constant '1' value labeled by database name, cluster uuid, server uuid and server vote |
cluster_term | A metric that returns the current election term value labeled by database name, cluster uuid, and server uuid |
cluster_leader | Identifies whether this pod is a leader for given database |
cluster_inbound_connections_error_total | A metric that returns the total number of failed inbound connections to the server labeled by database name, cluster uuid, and server uuid |
cluster_inbound_connections_total | A metric that returns the total number of inbound connections to the server labeled by database name, cluster uuid, and server uuid |
cluster_log_index_next | A metric that returns the log entry index next value labeled by database name, cluster uuid, and server uuid |
cluster_log_index_start | A metric that returns the log entry index start value labeled by database name, cluster uuid, and server uuid |
cluster_log_not_applied | A metric that returns the number of log entries not applied labeled by database name, cluster uuid, and server uuid |
cluster_log_not_committed | A metric that returns the number of log entries not committed labeled by database name, cluster uuid, and server uuid |
cluster_outbound_connections_error_total | A metric that returns the total number of failed outbound connections from the server labeled by database name, cluster uuid, and server uuid |
cluster_outbound_connections_total | A metric that returns the total number of outbound connections from the server labeled by database name, cluster uuid, and server uuid |
jsonrpc_server_sessions | Active number of JSON RPC Server sessions to the DB |
log_entry_index | The index of the log entry currently exposed to clients. This value should be close on all instances of the db; otherwise the instances are said to be lagging behind each other. |
ovsdb_monitors | Number of OVSDB Monitors on the server |
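The `log_entry_index` row above says the value should be close on all instances of a database. A minimal sketch of such a check, assuming one sampled index per server; the helper name `lagging_instances`, the server names, and the `max_lag` threshold are all hypothetical and not defined by this ticket:

```python
def lagging_instances(log_index_by_server, max_lag):
    """Return servers whose log_entry_index trails the highest index by more than max_lag."""
    if not log_index_by_server:
        return []
    highest = max(log_index_by_server.values())
    return sorted(
        server
        for server, index in log_index_by_server.items()
        if highest - index > max_lag
    )


# Hypothetical samples of log_entry_index from three instances of one database.
samples = {"sb-0": 1502, "sb-1": 1500, "sb-2": 1320}
print(lagging_instances(samples, max_lag=50))
```

The comparison is made against the highest observed index rather than against the leader, so the check does not depend on knowing which instance currently holds the leader role.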
2. ovn-controller metrics
Name | Definition |
---|---|
integration_bridge_patch_ports_total | Captures the number of patch ports that connect br-int OVS bridge to physical OVS bridge and br-local OVS bridge |
integration_bridge_openflow_total | The total number of OpenFlow flows in the integration bridge |
integration_bridge_geneve_ports_total | Total number of OVN geneve ports on the node |
lflow_run | Number of times ovn-controller has translated the Logical_Flow table in the OVN SB database into OpenFlow flows |
remote_probe_interval | The maximum number of milliseconds of idle time on the connection to the OVN SB DB before sending an inactivity probe message. |
openflow_probe_interval | The maximum number of milliseconds of idle time on the OpenFlow connection to the OVS bridge before sending an inactivity probe message. |
monitor_all | Specifies whether ovn-controller should monitor all records of tables in the OVN SB DB. A value of 0 means it conditionally monitors only the records that are needed by the current chassis. |
encap_ip | A metric with a constant '1' value labeled by ipadress that specifies the encapsulation ip address configured on that node |
sb_connection_method | A metric with a constant '1' value labeled by sb_connectio_method that specifies the ovn-remote value configured on that node |
encap_type | A metric with a constant '1' value labeled by type that specifies the encapsulation type that a chassis should use to connect to this node. |
bridge_mappings | A metric with a constant '1' value labeled by mapping that specifies a list of key-value pairs that map a physical network name to a local ovs bridge that provides connectivity to that network. |
packet_in | Specifies the number of times ovn-controller has handled the packet-ins from ovs-vswitchd. |
packet_in_drop | Specifies the number of times the ovn-controller has dropped the packet-ins from ovs-vswitchd due to resource constraints |
rconn_sent | Specifies the number of messages that have been sent to the underlying virtual connection (unix, tcp, or ssl) to OpenFlow devices |
rconn_queued | Specifies the number of messages that have been queued because they couldn't be sent using the underlying virtual connection to OpenFlow devices |
rconn_discarded | Specifies the number of messages that have been dropped because the send queue had to be flushed because of reconnection. |
rconn_overflow | Specifies the number of messages that have been dropped because of the queue overflow |
vconn_open | Specifies the number of attempts to connect to an OpenFlow Device |
vconn_sent | Specifies the number of messages sent to the OpenFlow Device |
vconn_received | Specifies the number of messages received from the OpenFlow Device |
stream_open | Specifies the number of attempts to connect to a remote peer (active connection) |
txn_success | Specifies the number of times the OVSDB transaction has successfully completed |
txn_error | Specifies the number of times the OVSDB transaction has errored out |
txn_uncommitted | Specifies the number of times the OVSDB transaction was uncommitted |
txn_unchanged | Specifies the number of times the OVSDB transaction resulted in no change to the database |
txn_incomplete | Specifies the number of times the OVSDB transaction did not complete and the client had to re-try |
txn_aborted | Specifies the number of times the OVSDB transaction has been aborted |
txn_try_again | Specifies the number of times the OVSDB transaction failed and the client had to re-try |
netlink_sent | Number of netlink messages sent to the kernel |
netlink_recv | Number of netlink messages received from the kernel |
netlink_recv_jumbo | Number of netlink messages received from the kernel that were larger than the allocated buffer |
netlink_overflow | Netlink messages dropped by the daemon due to buffer overflow |
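Several rows above (for example `encap_type` and `bridge_mappings`) describe metrics with a constant '1' value whose information is carried in the labels. A minimal sketch of how one such sample could be rendered in the Prometheus text exposition format; the `ovn_controller_` name prefix and the helper names are assumptions for illustration, not something this ticket specifies:

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline for a Prometheus label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_info_metric(name, labels):
    """Render one sample line with a constant value of 1 and the given labels."""
    label_text = ",".join(
        f'{key}="{escape_label_value(str(val))}"' for key, val in labels.items()
    )
    return f"{name}{{{label_text}}} 1"


# The metric name prefix below is an assumed naming convention.
print(render_info_metric("ovn_controller_encap_type", {"type": "geneve"}))
```

Because the value never changes, these series are useful mainly for joining their labels onto other metrics or for detecting configuration changes when a label value changes.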
3. ovn-northd metrics
Name | Definition |
---|---|
status | Specifies whether this instance of ovn-northd is standby(0) or active(1) or paused(2) |
probe_interval | The maximum number of milliseconds of idle time on connection to the OVN SB and NB DB before sending an inactivity probe message. |
pstream_open | Specifies the number of times passive connections were opened for the remote peer to connect |
stream_open | Specifies the number of attempts to connect to a remote peer |
txn_success | Specifies the number of times the OVSDB transaction has successfully completed |
txn_error | Specifies the number of times the OVSDB transaction has errored out |
txn_uncommitted | Specifies the number of times the OVSDB transaction was uncommitted |
txn_unchanged | Specifies the number of times the OVSDB transaction resulted in no change to the database |
txn_incomplete | Specifies the number of times the OVSDB transaction did not complete and the client had to re-try |
txn_aborted | Specifies the number of times the OVSDB transaction has been aborted |
txn_try_again | Specifies the number of times the OVSDB transaction failed and the client had to re-try |
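The `status` row above encodes the ovn-northd state as a number: standby(0), active(1), paused(2). A minimal sketch of that mapping, assuming the state is available as a text string; the names `NORTHD_STATUS_VALUES` and `northd_status_value` are hypothetical:

```python
# Numeric encoding taken from the status row: standby(0), active(1), paused(2).
NORTHD_STATUS_VALUES = {"standby": 0, "active": 1, "paused": 2}


def northd_status_value(status_text):
    """Map a status string to its numeric metric value; raise on unknown input."""
    key = status_text.strip().lower()
    if key not in NORTHD_STATUS_VALUES:
        raise ValueError(f"unknown ovn-northd status: {status_text!r}")
    return NORTHD_STATUS_VALUES[key]


print(northd_status_value("active"))
```

Raising on an unknown string, rather than defaulting to 0, avoids silently reporting an unrecognized state as standby.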
- is depended on by OSPRH-9185 Support Prometheus exporter for OVS/OVN metrics (ovs-vswitchd, ovsdb-server) (Closed)