  1. Fast Datapath Product
  2. FDP-610

ovn-controller sometimes crashes after debug/pause and debug/resume

    • FDP 24.E

      ovn-controller sometimes crashes when it is paused and resumed and load balancers are created.

       

      Steps to reproduce: I can reproduce this most of the time in the OVN sandbox (but could not reproduce it with a normal deployment).

      1. Start a sandbox: make sandbox
      2. Run the attached script. It creates a few logical switches, load balancers, and fake VMs.
      3. Pause ovn-controller: ovn-appctl -t ovn-controller debug/pause
      4. Delete a load balancer: ovn-nbctl lb-del lb1
      5. Resume ovn-controller: ovn-appctl -t ovn-controller debug/resume
      6. Run ovn-appctl -t ovn-controller version a few times to see whether ovn-controller has crashed.
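Steps 3 to 6 above can be wrapped in a small helper so they are easy to repeat. This is only a sketch: it assumes the sandbox from step 1 is already running and the attached script from step 2 has been run, and the function names (run, repro_pause_resume, check_alive) and the DRY_RUN switch are made up for illustration. The ovn-appctl and ovn-nbctl invocations are the ones listed in the steps.

```shell
#!/bin/sh
# Hypothetical helpers for steps 3-6; not part of OVN.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Pause ovn-controller, delete one load balancer, then resume.
# $1: load balancer name (defaults to lb1, as in step 4).
repro_pause_resume() {
    lb="${1:-lb1}"
    run ovn-appctl -t ovn-controller debug/pause
    run ovn-nbctl lb-del "$lb"
    run ovn-appctl -t ovn-controller debug/resume
}

# Query the version a few times; a failing call suggests that
# ovn-controller is no longer running.
# $1: number of attempts (defaults to 5, an arbitrary choice).
check_alive() {
    attempts="${1:-5}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        run ovn-appctl -t ovn-controller version || return 1
        i=$((i + 1))
    done
}
```

Inside the sandbox shell, `repro_pause_resume lb1 && check_alive 5` runs one attempt. Because the crash does not happen every time, several attempts may be needed, and the attached script may have to be rerun between attempts to recreate the deleted load balancer.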

       

      Below is the AddressSanitizer trace:

       

      =================================================================
      ==398765==ERROR: AddressSanitizer: heap-use-after-free on address 0x6130000b7fd0 at pc 0x000000533fc5 bp 0x7ffc0bf6c3e0 sp 0x7ffc0bf6c3d8
      READ of size 4 at 0x6130000b7fd0 thread T0
          #0 0x533fc4 in uuid_hash /home/nusiddiq/workspace_cpp/ovn-org/ovn/ovs/lib/uuid.h:51
          #1 0x535567 in objdep_mgr_find_resources ../lib/objdep.c:183
          #2 0x5352a6 in objdep_mgr_remove_obj ../lib/objdep.c:134
          #3 0x4ac637 in lb_data_local_lb_remove ../controller/ovn-controller.c:2979
          #4 0x4acba6 in en_lb_data_run ../controller/ovn-controller.c:3062
          #5 0x53ff4d in engine_recompute ../lib/inc-proc-eng.c:415
          #6 0x5406c4 in engine_run_node ../lib/inc-proc-eng.c:477
          #7 0x540924 in engine_run ../lib/inc-proc-eng.c:528
          #8 0x4be074 in main ../controller/ovn-controller.c:5803
          #9 0x7f83f54e0149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
          #10 0x7f83f54e020a in __libc_start_main_impl (/lib64/libc.so.6+0x2820a) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
          #11 0x408704 in _start (/home/nusiddiq/workspace_cpp/ovn-org/ovn/_gcc/controller/ovn-controller+0x408704) (BuildId: b90f1ae40d039fda24c9d4773f37c8e63a6fe44e)

      0x6130000b7fd0 is located 16 bytes inside of 376-byte region [0x6130000b7fc0,0x6130000b8138)
      freed by thread T0 here:
          #0 0x7f83f5ed7fb8 in __interceptor_free.part.0 (/lib64/libasan.so.8+0xd7fb8) (BuildId: 7fcb7759bc17ef47f9682414b6d99732d6a6ab0c)
          #1 0x7300f4 in ovsdb_idl_track_clear__ ../lib/ovsdb-idl.c:1404
          #2 0x72b5f5 in ovsdb_idl_clear ../lib/ovsdb-idl.c:433
          #3 0x7308f7 in ovsdb_idl_parse_update ../lib/ovsdb-idl.c:1526
          #4 0x72b8b2 in ovsdb_idl_run ../lib/ovsdb-idl.c:470
          #5 0x73f3c5 in ovsdb_idl_loop_run ../lib/ovsdb-idl.c:4373
          #6 0x4bd598 in main ../controller/ovn-controller.c:5659
          #7 0x7f83f54e0149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
          #8 0x7f83f54e020a in __libc_start_main_impl (/lib64/libc.so.6+0x2820a) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
          #9 0x408704 in _start (/home/nusiddiq/workspace_cpp/ovn-org/ovn/_gcc/controller/ovn-controller+0x408704) (BuildId: b90f1ae40d039fda24c9d4773f37c8e63a6fe44e)

      previously allocated by thread T0 here:
          #0 0x7f83f5ed8cc7 in calloc (/lib64/libasan.so.8+0xd8cc7) (BuildId: 7fcb7759bc17ef47f9682414b6d99732d6a6ab0c)
          #1 0x7871e4 in xcalloc__ ../lib/util.c:124
          #2 0x787228 in xzalloc__ ../lib/util.c:134
          #3 0x787308 in xzalloc ../lib/util.c:168
          #4 0x7351ff in ovsdb_idl_row_create__ ../lib/ovsdb-idl.c:2303
          #5 0x7352d1 in ovsdb_idl_row_create ../lib/ovsdb-idl.c:2316
          #6 0x730eff in ovsdb_idl_process_update ../lib/ovsdb-idl.c:1633
          #7 0x73060b in ovsdb_idl_parse_update__ ../lib/ovsdb-idl.c:1489
          #8 0x73092d in ovsdb_idl_parse_update ../lib/ovsdb-idl.c:1528
          #9 0x72b8b2 in ovsdb_idl_run ../lib/ovsdb-idl.c:470
          #10 0x73f3c5 in ovsdb_idl_loop_run ../lib/ovsdb-idl.c:4373
          #11 0x4bd598 in main ../controller/ovn-controller.c:5659
          #12 0x7f83f54e0149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
          #13 0x7f83f54e020a in __libc_start_main_impl (/lib64/libc.so.6+0x2820a) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
          #14 0x408704 in _start (/home/nusiddiq/workspace_cpp/ovn-org/ovn/_gcc/controller/ovn-controller+0x408704) (BuildId: b90f1ae40d039fda24c9d4773f37c8e63a6fe44e)

      SUMMARY: AddressSanitizer: heap-use-after-free /home/nusiddiq/workspace_cpp/ovn-org/ovn/ovs/lib/uuid.h:51 in uuid_hash
      Shadow bytes around the buggy address:
        0x6130000b7d00: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x6130000b7d80: fd fd fd fd fd fa fa fa fa fa fa fa fa fa fa fa
        0x6130000b7e00: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x6130000b7e80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x6130000b7f00: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
      =>0x6130000b7f80: fa fa fa fa fa fa fa fa fd fd[fd]fd fd fd fd fd
        0x6130000b8000: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x6130000b8080: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x6130000b8100: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
        0x6130000b8180: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x6130000b8200: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
      Shadow byte legend (one shadow byte represents 8 application bytes):
        Addressable:           00

      -------

       

            amusil@redhat.com Ales Musil
            nusiddiq@redhat.com Siddique Numan