OpenShift Request For Enhancement: RFE-7888

Collect additional metrics from the DU SNO


    • Feature Request
    • Resolution: Unresolved
    • Normal
    • 4.18
    • Product / Portfolio Work

      The Verizon Far Edge Operations Performance and Engineering team has requested the collection of additional metrics from the Far Edge single-node OpenShift (SNO) clusters to improve monitoring and proactive maintenance. These metrics are not included in the default allow-list.

       

      Current list of Additional Metrics to be collected:

      metrics_list.yaml:


      names:
        - node_network_up
        - kube_pod_container_status_waiting_reason
        - kube_pod_status_phase
        - acm_managed_cluster_info
        - ALERTS_FOR_STATE
        - acm_remote_write_requests_total
        - container_runtime_crio_containers_oom_total
        - prometheus_operator_reconcile_operations_total
        - kube_pod_container_status_last_terminated_reason
        - node_vmstat_oom_kill
        - container_start_time_seconds
        - coredns_dns_requests_total
        - coredns_dns_request_duration_seconds_sum
        - coredns_dns_request_duration_seconds_count
        - coredns_dns_responses_total
        - coredns_forward_responses_total
        - coredns_forward_requests_total
        - service_assisted_installer_monitored_clusters
        - service_assisted_installer_monitored_hosts
        - service_assisted_installer_operation_duration_miliseconds_sum
        - container_runtime_crio_containers_oom
        - container_network_receive_bytes_total
        - container_network_transmit_bytes_total
        - node_network_carrier_changes_total
        - pod:container_cpu_usage:sum
        - container_memory_usage_bytes
        - container_memory_working_set_bytes
        - container_cpu_cfs_throttled_seconds_total
        - kubelet_container_log_filesystem_used_bytes
        - openshift_ptp_delay_ns
        - openshift_ptp_frequency_adjustment_ns
        - openshift_ptp_interface_role
        - openshift_ptp_offset_ns
        - openshift_ptp_process_restart_count
        - openshift_ptp_process_status
        - openshift_ptp_clock_class
        - openshift_ptp_clock_state
      collect_rules:
        - group: -SNOResourceUsage

       

      New list of additional metrics to be collected (including the above):
      1. Physical network interfaces ens{1|2|3}f{0|1|2|3}
              received packets or bytes
              transmitted packets or bytes
              received errors
              transmitted errors
              received drops
              transmitted drops
      instance_device:node_network_receive_bytes_phy:rate1m
      instance_device:node_network_transmit_bytes_phy:rate1m
      instance_device:node_network_receive_errs_phy:rate1m
      instance_device:node_network_transmit_errs_phy:rate1m
      instance_device:node_network_receive_drop_phy:rate1m
      instance_device:node_network_transmit_drop_phy:rate1m
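
      The six names above follow the Prometheus recording-rule naming convention (level:metric:operations), which suggests they would have to be defined as rules rather than simply listed under names. A minimal sketch of how that might look in metrics_list.yaml follows. It assumes the allow-list accepts a recording_rules section, that the underlying node_exporter counters are the source, and that a 1-minute rate summed per instance and device is what is wanted. The expressions are illustrative and were not part of the request.

      ```yaml
      # Hypothetical recording rules; the expressions are assumptions, not taken from the request.
      # The device regex mirrors ens{1|2|3}f{0|1|2|3} from the list above.
      recording_rules:
        - record: instance_device:node_network_receive_bytes_phy:rate1m
          expr: 'sum by (instance, device) (rate(node_network_receive_bytes_total{device=~"ens[1-3]f[0-3]"}[1m]))'
        - record: instance_device:node_network_transmit_bytes_phy:rate1m
          expr: 'sum by (instance, device) (rate(node_network_transmit_bytes_total{device=~"ens[1-3]f[0-3]"}[1m]))'
        - record: instance_device:node_network_receive_errs_phy:rate1m
          expr: 'sum by (instance, device) (rate(node_network_receive_errs_total{device=~"ens[1-3]f[0-3]"}[1m]))'
        - record: instance_device:node_network_transmit_errs_phy:rate1m
          expr: 'sum by (instance, device) (rate(node_network_transmit_errs_total{device=~"ens[1-3]f[0-3]"}[1m]))'
        - record: instance_device:node_network_receive_drop_phy:rate1m
          expr: 'sum by (instance, device) (rate(node_network_receive_drop_total{device=~"ens[1-3]f[0-3]"}[1m]))'
        - record: instance_device:node_network_transmit_drop_phy:rate1m
          expr: 'sum by (instance, device) (rate(node_network_transmit_drop_total{device=~"ens[1-3]f[0-3]"}[1m]))'
      ```

      Prometheus label regexes are fully anchored, so the pattern should match only the listed physical interfaces and leave out virtual or pod interfaces.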
       
      2. Container network interfaces
              received errors
              transmitted errors
              received drops
              transmitted drops
      container_network_receive_errors_total
      container_network_transmit_errors_total
      container_network_receive_packets_dropped_total
      container_network_transmit_packets_dropped_total
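
      Unlike the physical-interface rules, these four are plain metric names, so they could presumably be appended to the existing names list. A sketch of the addition, assuming the same metrics_list.yaml layout as the current list:

      ```yaml
      # Sketch only: entries to append under the existing names list.
      names:
        - container_network_receive_errors_total
        - container_network_transmit_errors_total
        - container_network_receive_packets_dropped_total
        - container_network_transmit_packets_dropped_total
      ```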
       
       

       

              rolove Robert Love
              rhn-gps-ncocker Nabeel Cocker