Uploaded image for project: 'Fast Datapath Product'
  1. Fast Datapath Product
  2. FDP-692

During RHOSP upgrade fresh version of OVN controller failed to establish relationships with old control plane

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Not a Bug
    • Icon: Major Major
    • None
    • None
    • ovn-2021, ovn23.09
    • False
    • Hide

      None

      Show
      None
    • False
    • Important
    • -

      VMs in customer's RHOSP deployment lost network connectivity during FFU 16.2 (ovn-2021) -> 17.1 (ovn23.09) procedure: after ovn-controller container was updated and restarted it failed to establish relationships with control plane. On all nodes running ovn-controller process same sequence of log messages is observed:

      2024-06-10T18:20:22.655Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovn-controller.log
      2024-06-10T18:20:22.655Z|00002|reconnect|INFO|unix:/run/openvswitch/db.sock: connecting...
      2024-06-10T18:20:22.655Z|00003|reconnect|INFO|unix:/run/openvswitch/db.sock: connected
      2024-06-10T18:20:22.662Z|00004|main|INFO|OVN internal version is : [22.12.2-20.27.0-70.6]
      2024-06-10T18:20:22.662Z|00005|main|INFO|OVS IDL reconnected, force recompute.
      2024-06-10T18:20:22.775Z|00006|reconnect|INFO|ssl:IP_ADDR:6642: connecting...
      2024-06-10T18:20:22.775Z|00007|main|INFO|OVNSB IDL reconnected, force recompute.
      2024-06-10T18:20:23.776Z|00008|reconnect|INFO|ssl:IP_ADDR:6642: connection attempt timed out
      2024-06-10T18:20:24.777Z|00009|reconnect|INFO|ssl:IP_ADDR:6642: connecting...
      2024-06-10T18:20:25.779Z|00010|reconnect|INFO|ssl:IP_ADDR:6642: connection attempt timed out
      2024-06-10T18:20:25.779Z|00011|reconnect|INFO|ssl:IP_ADDR:6642: waiting 2 seconds before reconnect
      2024-06-10T18:20:27.782Z|00012|reconnect|INFO|ssl:IP_ADDR:6642: connecting...
      2024-06-10T18:20:29.785Z|00013|reconnect|INFO|ssl:IP_ADDR:6642: connection attempt timed out
      2024-06-10T18:20:29.785Z|00014|reconnect|INFO|ssl:IP_ADDR:6642: waiting 4 seconds before reconnect
      2024-06-10T18:20:33.789Z|00015|memory|INFO|11080 kB peak resident set size after 11.1 seconds
      2024-06-10T18:20:33.789Z|00016|memory|INFO|idl-cells-Open_vSwitch:8014
      2024-06-10T18:20:33.790Z|00017|reconnect|INFO|ssl:IP_ADDR:6642: connecting...
      2024-06-10T18:20:37.795Z|00018|reconnect|INFO|ssl:IP_ADDR:6642: connection attempt timed out
      2024-06-10T18:20:37.795Z|00019|reconnect|INFO|ssl:IP_ADDR:6642: continuing to reconnect in the background but suppressing further logging
      2024-06-10T18:21:06.243Z|00020|reconnect|INFO|ssl:IP_ADDR:6642: connected
      2024-06-10T18:21:06.251Z|00021|ovsdb_idl|WARN|OVN_Southbound database lacks Chassis_Template_Var table (database needs upgrade?)
      2024-06-10T18:21:06.251Z|00022|ovsdb_idl|WARN|Load_Balancer table in OVN_Southbound database lacks datapath_group column (database needs upgrade?)
      2024-06-10T18:21:06.251Z|00023|ovsdb_idl|WARN|MAC_Binding table in OVN_Southbound database lacks timestamp column (database needs upgrade?)
      2024-06-10T18:21:06.251Z|00024|ovsdb_idl|WARN|OVN_Southbound database lacks Mirror table (database needs upgrade?)
      2024-06-10T18:21:06.251Z|00025|ovsdb_idl|WARN|Port_Binding table in OVN_Southbound database lacks additional_chassis column (database needs upgrade?)
      2024-06-10T18:21:06.251Z|00026|ovsdb_idl|WARN|Port_Binding table in OVN_Southbound database lacks additional_encap column (database needs upgrade?)
      2024-06-10T18:21:06.251Z|00027|ovsdb_idl|WARN|Port_Binding table in OVN_Southbound database lacks mirror_rules column (database needs upgrade?)
      2024-06-10T18:21:06.251Z|00028|ovsdb_idl|WARN|Port_Binding table in OVN_Southbound database lacks port_security column (database needs upgrade?)
      2024-06-10T18:21:06.251Z|00029|ovsdb_idl|WARN|Port_Binding table in OVN_Southbound database lacks requested_additional_chassis column (database needs upgrade?)
      2024-06-10T18:21:06.251Z|00030|ovsdb_idl|WARN|OVN_Southbound database lacks Static_MAC_Binding table (database needs upgrade?)
      2024-06-10T18:21:12.362Z|00031|memory|INFO|peak resident set size grew 217% in last 38.6 seconds, from 11080 kB to 35104 kB
      2024-06-10T18:21:12.362Z|00032|memory|INFO|idl-cells-Open_vSwitch:8014
      2024-06-10T18:21:20.899Z|00033|features|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting to switch
      2024-06-10T18:21:20.900Z|00034|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
      2024-06-10T18:21:20.908Z|00035|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
      2024-06-10T18:21:20.911Z|00036|features|INFO|OVS Feature: ct_zero_snat, state: supported
      2024-06-10T18:21:20.912Z|00037|main|INFO|OVS feature set changed, force recompute.
      2024-06-10T18:21:20.913Z|00038|ofctrl|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting to switch
      2024-06-10T18:21:20.915Z|00039|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
      2024-06-10T18:21:20.917Z|00040|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
      2024-06-10T18:21:20.918Z|00041|ofctrl|INFO|ofctrl-wait-before-clear is now 8000 ms (was 0 ms)
      2024-06-10T18:21:20.921Z|00042|main|INFO|OVS OpenFlow connection reconnected,force recompute.
      2024-06-10T18:21:20.965Z|00001|pinctrl(ovn_pinctrl0)|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting to switch
      2024-06-10T18:21:20.965Z|00002|rconn(ovn_pinctrl0)|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
      2024-06-10T18:21:20.965Z|00003|rconn(ovn_pinctrl0)|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
      2024-06-10T18:21:20.989Z|00043|timeval|WARN|Unreasonably long 1128ms poll interval (923ms user, 198ms system)
      2024-06-10T18:21:20.989Z|00044|timeval|WARN|faults: 88544 minor, 0 major
      2024-06-10T18:21:20.990Z|00045|timeval|WARN|disk: 0 reads, 8 writes
      2024-06-10T18:21:20.991Z|00046|timeval|WARN|context switches: 0 voluntary, 10 involuntary
      2024-06-10T18:21:20.994Z|00047|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=97cf558d:
      2024-06-10T18:21:20.994Z|00048|coverage|INFO|cmap_expand                0.0/sec     0.000/sec        0.0000/sec   total: 3
      2024-06-10T18:21:20.994Z|00049|coverage|INFO|netlink_sent               0.0/sec     0.000/sec        0.0000/sec   total: 12
      2024-06-10T18:21:20.994Z|00050|coverage|INFO|netlink_recv_jumbo         0.0/sec     0.000/sec        0.0000/sec   total: 1
      2024-06-10T18:21:20.994Z|00051|coverage|INFO|netlink_received           0.0/sec     0.000/sec        0.0000/sec   total: 12
      2024-06-10T18:21:20.994Z|00052|coverage|INFO|vconn_sent                 0.0/sec     0.000/sec        0.0000/sec   total: 5
      2024-06-10T18:21:20.994Z|00053|coverage|INFO|vconn_received             0.0/sec     0.000/sec        0.0000/sec   total: 4
      2024-06-10T18:21:20.994Z|00054|coverage|INFO|vconn_open                 0.0/sec     0.000/sec        0.0000/sec   total: 3
      2024-06-10T18:21:20.994Z|00055|coverage|INFO|util_xalloc              396182.8/sec 53981.183/sec      899.6864/sec   total: 5873685
      2024-06-10T18:21:20.994Z|00056|coverage|INFO|stream_open                0.0/sec     0.117/sec        0.0019/sec   total: 10
      2024-06-10T18:21:20.994Z|00057|coverage|INFO|pstream_open               0.0/sec     0.017/sec        0.0003/sec   total: 1
      2024-06-10T18:21:20.994Z|00058|coverage|INFO|seq_change                22.8/sec     3.400/sec        0.0567/sec   total: 243
      2024-06-10T18:21:20.994Z|00059|coverage|INFO|rconn_sent                 0.0/sec     0.000/sec        0.0000/sec   total: 2
      2024-06-10T18:21:20.994Z|00060|coverage|INFO|rconn_queued               0.0/sec     0.000/sec        0.0000/sec   total: 2
      2024-06-10T18:21:20.994Z|00061|coverage|INFO|poll_zero_timeout         22.0/sec     2.833/sec        0.0472/sec   total: 196
      2024-06-10T18:21:20.994Z|00062|coverage|INFO|poll_create_node          69.2/sec    10.300/sec        0.1717/sec   total: 719
      2024-06-10T18:21:20.994Z|00063|coverage|INFO|txn_incomplete             0.0/sec     0.000/sec        0.0000/sec   total: 2
      2024-06-10T18:21:20.994Z|00064|coverage|INFO|txn_unchanged             22.8/sec     3.267/sec        0.0544/sec   total: 223
      2024-06-10T18:21:20.994Z|00065|coverage|INFO|hmap_reserve               0.0/sec     0.000/sec        0.0000/sec   total: 76
      2024-06-10T18:21:20.994Z|00066|coverage|INFO|hmap_expand              8996.4/sec  1176.017/sec       19.6003/sec   total: 91957
      2024-06-10T18:21:20.994Z|00067|coverage|INFO|hmap_pathological          0.4/sec     0.817/sec        0.0136/sec   total: 257
      2024-06-10T18:21:20.994Z|00068|coverage|INFO|miniflow_malloc            0.0/sec     0.000/sec        0.0000/sec   total: 272
      2024-06-10T18:21:20.994Z|00069|coverage|INFO|physical_run               0.0/sec     0.000/sec        0.0000/sec   total: 1
      2024-06-10T18:21:20.994Z|00070|coverage|INFO|consider_logical_flow      0.0/sec     0.000/sec        0.0000/sec   total: 70000
      2024-06-10T18:21:20.994Z|00071|coverage|INFO|lflow_run                  0.0/sec     0.000/sec        0.0000/sec   total: 1
      2024-06-10T18:21:20.994Z|00072|coverage|INFO|116 events never hit
      2024-06-10T18:21:25.915Z|00073|memory|INFO|peak resident set size grew 653% in last 13.6 seconds, from 35104 kB to 264164 kB
      2024-06-10T18:21:25.915Z|00074|memory|INFO|idl-cells-OVN_Southbound:862796 idl-cells-Open_vSwitch:8014 ofctrl_desired_flow_usage-KB:65 ofctrl_sb_flow_ref_usage-KB:20
      2024-06-10T18:21:52.209Z|00004|fatal_signal(ovn_pinctrl0)|WARN|terminating with signal 15 (Terminated)

       

              ovnteam@redhat.com OVN Team
              rhn-support-astupnik Aleksei Stupnikov
              Jianlin Shi Jianlin Shi
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

                Created:
                Updated:
                Resolved: