Type: Bug
Resolution: Done-Errata
Priority: Major
Fix Version/s: 2.6.1
Affects Version/s: 2.5.6
Component/s: Controller
Labels:
- no-qe
- vmware

Activity Type:
Incidents & Support
Blocked:
False
Blocked Reason:
None
Ready:
False
Git Pull Request:
https://github.com/kubev2v/forklift/pull/848, https://github.com/kubev2v/forklift/pull/879
Severity:
Important

The controller crashes in a loop after being connected to a vSphere provider:

[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x193da35]
goroutine 725 [running]:
github.com/konveyor/forklift-controller/pkg/controller/provider/container/vsphere.(*VmAdapter).updateDisks(0xc00227a200, 0xc0006b5360?)
        /remote-source/app/pkg/controller/provider/container/vsphere/model.go:705 +0x355
github.com/konveyor/forklift-controller/pkg/controller/provider/container/vsphere.(*VmAdapter).Apply(0xc00227a200, {{}, {0xc002aefae0, 0x5}, {{0xc002aefb10, 0xe}, {0xc002aefb30, 0x7}}, {0xc002afc600, 0x1b, ...}, ...})
        /remote-source/app/pkg/controller/provider/container/vsphere/model.go:669 +0x20b5
github.com/konveyor/forklift-controller/pkg/controller/provider/container/vsphere.Collector.applyEnter({{0xc00086cac8, 0x16}, 0xc0005fe400, 0xc00217c140, {0x312a368, 0xc0021469a0}, {0x3120fb8, 0xc00283a120}, 0xc0002448e0, 0xc0028367f0, ...}, ...)
        /remote-source/app/pkg/controller/provider/container/vsphere/collector.go:851 +0x9e
github.com/konveyor/forklift-controller/pkg/controller/provider/container/vsphere.(*Collector).apply(0xc0028295c0, {0x0?, 0x0?}, 0x0?, {0xc002eae000?, 0x64, 0xa?})
        /remote-source/app/pkg/controller/provider/container/vsphere/collector.go:734 +0x15b
github.com/konveyor/forklift-controller/pkg/controller/provider/container/vsphere.(*Collector).getUpdates(0xc0028295c0, {0x311de68, 0xc00280fe50})
        /remote-source/app/pkg/controller/provider/container/vsphere/collector.go:396 +0xab4
github.com/konveyor/forklift-controller/pkg/controller/provider/container/vsphere.(*Collector).Start.func1()
        /remote-source/app/pkg/controller/provider/container/vsphere/collector.go:301 +0xfd
created by github.com/konveyor/forklift-controller/pkg/controller/provider/container/vsphere.(*Collector).Start
        /remote-source/app/pkg/controller/provider/container/vsphere/collector.go:316 +0xb9

The segfault is at model.go:705, marked with an arrow in the listing:

   676  // Update virtual disk devices.
   677  func (v *VmAdapter) updateDisks(devArray *types.ArrayOfVirtualDevice) {
   678          disks := []model.Disk{}
   679          for _, dev := range devArray.VirtualDevice {
   680                  switch dev.(type) {
   681                  case *types.VirtualDisk:
   682                          disk := dev.(*types.VirtualDisk)
   683                          switch disk.Backing.(type) {
   684                          case *types.VirtualDiskFlatVer1BackingInfo:
   685                                  backing := disk.Backing.(*types.VirtualDiskFlatVer1BackingInfo)
   686                                  md := model.Disk{
   687                                          Key:      disk.Key,
   688                                          File:     backing.FileName,
   689                                          Capacity: disk.CapacityInBytes,
   690                                          Datastore: model.Ref{
   691                                                  Kind: model.DsKind,
   692                                                  ID:   backing.Datastore.Value,
   693                                          },
   694                                  }
   695                                  disks = append(disks, md)
   696                          case *types.VirtualDiskFlatVer2BackingInfo:
   697                                  backing := disk.Backing.(*types.VirtualDiskFlatVer2BackingInfo)
   698                                  md := model.Disk{
   699                                          Key:      disk.Key,
   700                                          File:     backing.FileName,
   701                                          Capacity: disk.CapacityInBytes,
   702                                          Shared:   backing.Sharing != "sharingNone",
   703                                          Datastore: model.Ref{
   704                                                  Kind: model.DsKind,
   705                                                  ID:   backing.Datastore.Value,        <--------
   706                                          },
   707                                  }
   708                                  disks = append(disks, md)

The crash is triggered by this malformed disk returned by vSphere:

{
  "level": "info",
  "ts": "2024-04-03 08:13:39.876",
  "logger": "debug",
  "msg": "backing-debug",
  "disk": {
    "Key": 2001,
    "DeviceInfo": {
      "Label": "Hard disk 2",
      "Summary": "0 KB"
    },
    "Backing": {
      "FileName": "[] ...vmdk",
      "Datastore": null,                <---------- here is our SIGSEGV when trying to dereference backing.Datastore.Value
      "BackingObjectId": "",
      "DiskMode": "persistent",
      "Split": false,
      ....

Upon further investigation, the VM does not have a "Hard disk 2". The volume 5b9e1ae2-851ecad0-faa4-6805ca242b07 does not exist, and neither does that vmdk. Note that the disk size is also zero.

This appears to be a bug or a problematic VM on the VMware side, but our controller should not crash because of it: the crash loop prevents migrating any VMs to OCP. This particular VM may not migrate well, but a single bad disk should not make the product totally unusable.

Please investigate how to make the controller more resilient: it should log an error for such a disk or VM and still allow the customer to migrate the other VMs, which are fine.
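
One possible shape for such a guard is sketched below. It is only an illustration, not necessarily what the linked pull requests implement. The types ManagedObjectReference, FlatVer2Backing, Disk and device are simplified stand-ins for the govmomi and forklift model types, the helpers buildDisk and collectDisks are hypothetical, and the file names are made up. The idea is to check backing.Datastore for nil before reading .Value, skip the bad disk, and report it instead of panicking.

```go
package main

import "fmt"

// ManagedObjectReference and FlatVer2Backing are simplified stand-ins for the
// govmomi types used in model.go; only the fields relevant here are kept.
type ManagedObjectReference struct {
	Type  string
	Value string
}

type FlatVer2Backing struct {
	FileName  string
	Datastore *ManagedObjectReference // nil for the broken disk in this report
	Sharing   string
}

// Disk is a simplified stand-in for model.Disk.
type Disk struct {
	Key       int32
	File      string
	Shared    bool
	Datastore string
}

// device pairs a disk key with its backing (hypothetical helper type).
type device struct {
	Key     int32
	Backing *FlatVer2Backing
}

// buildDisk converts one backing into a Disk. It returns an error instead of
// dereferencing a nil backing or a nil datastore reference.
func buildDisk(key int32, backing *FlatVer2Backing) (Disk, error) {
	if backing == nil {
		return Disk{}, fmt.Errorf("disk %d: missing backing", key)
	}
	if backing.Datastore == nil {
		return Disk{}, fmt.Errorf("disk %d (%q): backing has no datastore", key, backing.FileName)
	}
	return Disk{
		Key:       key,
		File:      backing.FileName,
		Shared:    backing.Sharing != "sharingNone",
		Datastore: backing.Datastore.Value,
	}, nil
}

// collectDisks keeps every valid disk and returns one error per skipped disk,
// so a single malformed device does not abort the whole inventory update.
func collectDisks(devs []device) ([]Disk, []error) {
	disks := []Disk{}
	var skipped []error
	for _, dev := range devs {
		disk, err := buildDisk(dev.Key, dev.Backing)
		if err != nil {
			skipped = append(skipped, err)
			continue
		}
		disks = append(disks, disk)
	}
	return disks, skipped
}

func main() {
	devs := []device{
		// Healthy disk (hypothetical values).
		{Key: 2000, Backing: &FlatVer2Backing{
			FileName:  "[ds1] vm/disk-1.vmdk",
			Datastore: &ManagedObjectReference{Type: "Datastore", Value: "datastore-1"},
			Sharing:   "sharingNone",
		}},
		// Disk shaped like the one in the log: no datastore reference.
		{Key: 2001, Backing: &FlatVer2Backing{FileName: "[] vm/disk-2.vmdk"}},
	}
	disks, skipped := collectDisks(devs)
	fmt.Printf("kept %d disk(s)\n", len(disks))
	for _, err := range skipped {
		fmt.Println("skipped:", err)
	}
}
```

Returning the skipped disks as errors, rather than silently dropping them, lets the caller log them against the VM so that the customer can see which VM is affected while the remaining inventory still loads.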

links to

[KCS] MTV forklift-controller pod crashes in a loop after configuring vSphere provider

RHBA-2024:130587 MTV 2.6.1 Images
