OpenShift Bugs / OCPBUGS-43997

Registry operator panics when Azure storage account network access is switched back and forth between Internal and External in an Azure private cluster


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • 4.17.z, 4.18
    • Image Registry
    • Low
    • No
    • False

      Description of problem:

          The registry operator panics when the Azure storage account network access is switched back and forth between Internal and External in an Azure private cluster.

      Version-Release number of selected component (if applicable):

          4.18.0-0.nightly-2024-10-29-112337

      How reproducible:

          always

      Steps to Reproduce:

          1. Set networkAccess to Internal and set networkResourceGroupName; the vnet will be discovered by tag in the wxj-1030-rg resource group.
       $ oc patch config.image/cluster -p '{"spec":{"storage":{"azure":{"networkAccess":{"type":"Internal","internal":{"networkResourceGroupName": "wxj-1030-rg"}}}}}}' --type=merge
          2. Once the image registry has switched to Internal network access successfully, set the network access back to External, then keep switching between Internal and External.
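The merge-patch payloads used in the steps above can be generated instead of typed by hand. The following is a minimal Go sketch under stated assumptions: the struct names and the `patchFor` helper are hypothetical, only the fields that appear in the report's patch are modeled, and the External payload is an assumption because the report shows only the Internal patch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical structs that mirror only the fields seen in the report's patch.
type internalAccess struct {
	NetworkResourceGroupName string `json:"networkResourceGroupName"`
}

type networkAccess struct {
	Type     string          `json:"type"`
	Internal *internalAccess `json:"internal,omitempty"`
}

type azureStorage struct {
	NetworkAccess networkAccess `json:"networkAccess"`
}

type storage struct {
	Azure azureStorage `json:"azure"`
}

type spec struct {
	Storage storage `json:"storage"`
}

type patch struct {
	Spec spec `json:"spec"`
}

// patchFor builds the JSON body for `oc patch ... --type=merge`.
// Assumption: an External patch carries only the type; the report does not
// show the exact External payload that was used.
func patchFor(accessType, resourceGroup string) (string, error) {
	na := networkAccess{Type: accessType}
	if accessType == "Internal" {
		na.Internal = &internalAccess{NetworkResourceGroupName: resourceGroup}
	}
	p := patch{Spec: spec{Storage: storage{Azure: azureStorage{NetworkAccess: na}}}}
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Internal -> External -> Internal, as in step 2.
	for _, t := range []string{"Internal", "External", "Internal"} {
		body, err := patchFor(t, "wxj-1030-rg")
		if err != nil {
			panic(err)
		}
		fmt.Println(body)
	}
}
```

Each printed line can be passed as the `-p` argument of the `oc patch config.image/cluster --type=merge` command from step 1.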
          

      Actual results:

      The image registry operator eventually panics.
      $ oc get pods
      NAME                                              READY   STATUS             RESTARTS      AGE
      cluster-image-registry-operator-67759775f-p6hq6   0/1     CrashLoopBackOff   5 (65s ago)   92m
      image-registry-b9c8d88c6-9wxt6                    1/1     Running            1 (38m ago)   59m
      image-registry-b9c8d88c6-zl9bp                    1/1     Running            1 (38m ago)   59m
      node-ca-dv7qs                                     1/1     Running            0             76m
      node-ca-hdrfx                                     1/1     Running            0             76m
      node-ca-kl9hf                                     1/1     Running            0             76m
      node-ca-rkfkv                                     1/1     Running            0             76m
      node-ca-wxzc7                                     1/1     Running            0             76m
      node-ca-xp8t8                                     1/1     Running            0             76m 
      
      $ oc logs -f cluster-image-registry-operator-67759775f-p6hq6
      Overwriting root TLS certificate authority trust store
      I1030 09:38:20.307177       1 leaderelection.go:121] The leader election gives 4 retries and allows for 30s of clock skew. The kube-apiserver downtime tolerance is 78s. Worst non-graceful lease acquisition is 2m43s. Worst graceful lease acquisition is {26s}.
      I1030 09:38:20.308614       1 observer_polling.go:159] Starting file observer
      I1030 09:38:20.338367       1 leaderelection.go:250] attempting to acquire leader lease openshift-image-registry/openshift-master-controllers...
      I1030 09:40:37.702156       1 leaderelection.go:260] successfully acquired lease openshift-image-registry/openshift-master-controllers
      I1030 09:40:37.702277       1 event.go:377] Event(v1.ObjectReference{Kind:"Lease", Namespace:"openshift-image-registry", Name:"openshift-master-controllers", UID:"53c56afc-2f7c-4cdc-b91e-dbd58d497e32", APIVersion:"coordination.k8s.io/v1", ResourceVersion:"176338", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' cluster-image-registry-operator-67759775f-p6hq6_5e7a68a7-ed84-4e5e-a6e4-f2cecd9a3b01 became leader
      I1030 09:40:37.702377       1 main.go:33] Cluster Image Registry Operator Version: v4.18.0-202410251041.p0.g4c7e4ef.assembly.stream.el9-dirty
      I1030 09:40:37.702413       1 main.go:34] Go Version: go1.22.7 (Red Hat 1.22.7-1.el9_5) X:strictfipsruntime
      I1030 09:40:37.702419       1 main.go:35] Go OS/Arch: linux/amd64
      I1030 09:40:37.702424       1 main.go:66] Watching files [/var/run/configmaps/trusted-ca/tls-ca-bundle.pem /etc/secrets/tls.crt /etc/secrets/tls.key]...
      I1030 09:40:37.713656       1 simple_featuregate_reader.go:171] Starting feature-gate-detector
      I1030 09:40:37.716335       1 starter.go:88] FeatureGates initialized: knownFeatureGates=[AWSEFSDriverVolumeMetrics AdminNetworkPolicy AlibabaPlatform AzureWorkloadIdentity BareMetalLoadBalancer BuildCSIVolumes ChunkSizeMiB CloudDualStackNodeIPs DisableKubeletCloudCredentialProviders GCPLabelsTags HardwareSpeed IngressControllerLBSubnetsAWS KMSv1 ManagedBootImages MultiArchInstallAWS MultiArchInstallGCP NetworkDiagnosticsConfig NetworkLiveMigration NodeDisruptionPolicy OpenShiftPodSecurityAdmission PrivateHostedZoneAWS SetEIPForNLBIngressController VSphereControlPlaneMachineSet VSphereDriverConfiguration VSphereStaticIPs ValidatingAdmissionPolicy AWSClusterHostedDNS AdditionalRoutingCapabilities AutomatedEtcdBackup BootcNodeManagement CSIDriverSharedResource ClusterAPIInstall ClusterAPIInstallIBMCloud ClusterMonitoringConfig DNSNameResolver DynamicResourceAllocation EtcdBackendQuota EventedPLEG Example ExternalOIDC GCPClusterHostedDNS GatewayAPI ImageStreamImportMode IngressControllerDynamicConfigurationManager InsightsConfig InsightsConfigAPI InsightsOnDemandDataGather InsightsRuntimeExtractor MachineAPIMigration MachineAPIOperatorDisableMachineHealthCheckController MachineAPIProviderOpenStack MachineConfigNodes ManagedBootImagesAWS MaxUnavailableStatefulSet MetricsCollectionProfiles MixedCPUsAllocation MultiArchInstallAzure NetworkSegmentation NewOLM NodeSwap OVNObservability OnClusterBuild PersistentIPsForVirtualization PinnedImages PlatformOperators ProcMountType RouteAdvertisements RouteExternalCertificate ServiceAccountTokenNodeBinding SignatureStores SigstoreImageVerification TranslateStreamCloseWebsocketRequests UpgradeStatus UserNamespacesPodSecurityStandards UserNamespacesSupport VSphereMultiNetworks VSphereMultiVCenters VolumeGroupSnapshot]
      I1030 09:40:37.716393       1 event.go:377] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-image-registry", Name:"cluster-image-registry-operator", UID:"a536c929-e5ca-44fa-b15f-63a1cbfdfe6a", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'FeatureGatesInitialized' FeatureGates updated to featuregates.Features{Enabled:[]v1.FeatureGateName{"AWSEFSDriverVolumeMetrics", "AdminNetworkPolicy", "AlibabaPlatform", "AzureWorkloadIdentity", "BareMetalLoadBalancer", "BuildCSIVolumes", "ChunkSizeMiB", "CloudDualStackNodeIPs", "DisableKubeletCloudCredentialProviders", "GCPLabelsTags", "HardwareSpeed", "IngressControllerLBSubnetsAWS", "KMSv1", "ManagedBootImages", "MultiArchInstallAWS", "MultiArchInstallGCP", "NetworkDiagnosticsConfig", "NetworkLiveMigration", "NodeDisruptionPolicy", "OpenShiftPodSecurityAdmission", "PrivateHostedZoneAWS", "SetEIPForNLBIngressController", "VSphereControlPlaneMachineSet", "VSphereDriverConfiguration", "VSphereStaticIPs", "ValidatingAdmissionPolicy"}, Disabled:[]v1.FeatureGateName{"AWSClusterHostedDNS", "AdditionalRoutingCapabilities", "AutomatedEtcdBackup", "BootcNodeManagement", "CSIDriverSharedResource", "ClusterAPIInstall", "ClusterAPIInstallIBMCloud", "ClusterMonitoringConfig", "DNSNameResolver", "DynamicResourceAllocation", "EtcdBackendQuota", "EventedPLEG", "Example", "ExternalOIDC", "GCPClusterHostedDNS", "GatewayAPI", "ImageStreamImportMode", "IngressControllerDynamicConfigurationManager", "InsightsConfig", "InsightsConfigAPI", "InsightsOnDemandDataGather", "InsightsRuntimeExtractor", "MachineAPIMigration", "MachineAPIOperatorDisableMachineHealthCheckController", "MachineAPIProviderOpenStack", "MachineConfigNodes", "ManagedBootImagesAWS", "MaxUnavailableStatefulSet", "MetricsCollectionProfiles", "MixedCPUsAllocation", "MultiArchInstallAzure", "NetworkSegmentation", "NewOLM", "NodeSwap", "OVNObservability", "OnClusterBuild", "PersistentIPsForVirtualization", "PinnedImages", "PlatformOperators", "ProcMountType", "RouteAdvertisements", "RouteExternalCertificate", "ServiceAccountTokenNodeBinding", "SignatureStores", "SigstoreImageVerification", "TranslateStreamCloseWebsocketRequests", "UpgradeStatus", "UserNamespacesPodSecurityStandards", "UserNamespacesSupport", "VSphereMultiNetworks", "VSphereMultiVCenters", "VolumeGroupSnapshot"}}
      I1030 09:40:37.717385       1 metrics.go:88] Starting MetricsController
      I1030 09:40:37.717415       1 nodecadaemon.go:204] Starting NodeCADaemonController
      I1030 09:40:37.717418       1 clusteroperator.go:143] Starting ClusterOperatorStatusController
      I1030 09:40:37.717432       1 imageregistrycertificates.go:211] Starting ImageRegistryCertificatesController
      I1030 09:40:37.717448       1 imageconfig.go:100] Starting ImageConfigController
      I1030 09:40:37.717488       1 base_controller.go:67] Waiting for caches to sync for LoggingSyncer
      I1030 09:40:37.717511       1 azurestackcloud.go:174] Starting AzureStackCloudController
      I1030 09:40:37.717528       1 azurepathfixcontroller.go:261] Starting AzurePathFixController
      I1030 09:40:37.717548       1 awstagcontroller.go:160] Starting AWS Tag Controller
      I1030 09:40:37.817617       1 clusteroperator.go:150] Started ClusterOperatorStatusController
      I1030 09:40:37.817648       1 awstagcontroller.go:167] Started AWS Tag Controller
      I1030 09:40:37.817669       1 controllerimagepruner.go:386] Starting ImagePrunerController
      I1030 09:40:37.817723       1 azurepathfixcontroller.go:268] Started AzurePathFixController
      I1030 09:40:37.817729       1 imageregistrycertificates.go:218] Started ImageRegistryCertificatesController
      I1030 09:40:37.817759       1 azurestackcloud.go:181] Started AzureStackCloudController
      I1030 09:40:37.817797       1 nodecadaemon.go:211] Started NodeCADaemonController
      I1030 09:40:37.817910       1 metrics.go:94] Started MetricsController
      I1030 09:40:37.817955       1 imageconfig.go:107] Started ImageConfigController
      I1030 09:40:37.817705       1 base_controller.go:73] Caches are synced for LoggingSyncer
      I1030 09:40:37.817990       1 base_controller.go:110] Starting #1 worker of LoggingSyncer controller ...
      I1030 09:40:37.818704       1 controller.go:454] Starting Controller
      I1030 09:40:37.818805       1 azure.go:1030] setting azure storage account tags
      E1030 09:40:38.833340       1 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
      goroutine 628 [running]:
      k8s.io/apimachinery/pkg/util/runtime.logPanic({0x2f27da0, 0x5cd6310})
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:75 +0x85
      k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x3fcdf40?})
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:49 +0x6b
      panic({0x2f27da0?, 0x5cd6310?})
      	/usr/lib/golang/src/runtime/panic.go:770 +0x132
      github.com/openshift/cluster-image-registry-operator/pkg/storage/azure.(*driver).CreateStorage(0xc0009b4ee0, 0xc0018cc408)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/storage/azure/azure.go:1096 +0x8f9
      github.com/openshift/cluster-image-registry-operator/pkg/resource.(*Generator).syncStorage(0xc000993680, 0xc0018cc408)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/resource/generator.go:157 +0x284
      github.com/openshift/cluster-image-registry-operator/pkg/resource.(*Generator).Apply(0xc000993680, 0xc0018cc408)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/resource/generator.go:227 +0x25
      github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).createOrUpdateResources(0xc000fe8780, 0xc0018cc408)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:211 +0x247
      github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).sync(0xc000fe8780)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:265 +0x285
      github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).eventProcessor.func1(0xc000fe8780, {0x2d13b20, 0x3fcdf40})
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:377 +0xb6
      github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).eventProcessor(0xc000fe8780)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:384 +0x2d
      k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:226 +0x33
      k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0018badb0, {0x3fd4980, 0xc001399980}, 0x1, 0xc0011703c0)
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:227 +0xaf
      k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0018badb0, 0x3b9aca00, 0x0, 0x1, 0xc0011703c0)
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:204 +0x7f
      k8s.io/apimachinery/pkg/util/wait.Until(...)
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:161
      created by github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).Run in goroutine 274
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:455 +0x18d
      panic: runtime error: invalid memory address or nil pointer dereference [recovered]
      	panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x26c3259]

      goroutine 628 [running]:
      k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x3fcdf40?})
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:56 +0xcd
      panic({0x2f27da0?, 0x5cd6310?})
      	/usr/lib/golang/src/runtime/panic.go:770 +0x132
      github.com/openshift/cluster-image-registry-operator/pkg/storage/azure.(*driver).CreateStorage(0xc0009b4ee0, 0xc0018cc408)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/storage/azure/azure.go:1096 +0x8f9
      github.com/openshift/cluster-image-registry-operator/pkg/resource.(*Generator).syncStorage(0xc000993680, 0xc0018cc408)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/resource/generator.go:157 +0x284
      github.com/openshift/cluster-image-registry-operator/pkg/resource.(*Generator).Apply(0xc000993680, 0xc0018cc408)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/resource/generator.go:227 +0x25
      github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).createOrUpdateResources(0xc000fe8780, 0xc0018cc408)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:211 +0x247
      github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).sync(0xc000fe8780)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:265 +0x285
      github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).eventProcessor.func1(0xc000fe8780, {0x2d13b20, 0x3fcdf40})
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:377 +0xb6
      github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).eventProcessor(0xc000fe8780)
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:384 +0x2d
      k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:226 +0x33
      k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0018badb0, {0x3fd4980, 0xc001399980}, 0x1, 0xc0011703c0)
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:227 +0xaf
      k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0018badb0, 0x3b9aca00, 0x0, 0x1, 0xc0011703c0)
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:204 +0x7f
      k8s.io/apimachinery/pkg/util/wait.Until(...)
      	/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:161
      created by github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).Run in goroutine 274
      	/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:455 +0x18d

      Expected results:

      The operator should not panic.
      It would be better to show a warning when network access is set back to External, since reverting a network access change is not supported.
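The expected behavior can be illustrated with a small Go sketch. It is not the operator's code: the report does not identify which pointer is nil at azure.go:1096, so the types and both helpers below (`resourceGroupFor`, `transitionWarning`) are hypothetical. The sketch only shows the general pattern of returning an error instead of dereferencing a nil pointer, and of warning on an Internal to External change.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, simplified stand-ins for the registry config types.
type internalAccess struct {
	NetworkResourceGroupName string
}

type networkAccess struct {
	Type     string
	Internal *internalAccess
}

var errInternalMissing = errors.New(`networkAccess.type is "Internal" but networkAccess.internal is not set`)

// resourceGroupFor checks every pointer before use, so a missing value
// surfaces as an error the controller can report rather than a SIGSEGV.
func resourceGroupFor(na *networkAccess) (string, error) {
	if na == nil || na.Type != "Internal" {
		return "", nil // nothing to look up for External or unset access
	}
	if na.Internal == nil {
		return "", errInternalMissing
	}
	return na.Internal.NetworkResourceGroupName, nil
}

// transitionWarning returns a non-empty message for the transition the
// report describes as unsupported (Internal back to External).
func transitionWarning(oldType, newType string) string {
	if oldType == "Internal" && newType == "External" {
		return "changing networkAccess from Internal back to External is not supported"
	}
	return ""
}

func main() {
	// Internal access without the internal block: an error, not a panic.
	_, err := resourceGroupFor(&networkAccess{Type: "Internal"})
	fmt.Println("error:", err)

	if w := transitionWarning("Internal", "External"); w != "" {
		fmt.Println("warning:", w)
	}
}
```

A controller written this way could surface the error as a degraded condition and keep running, which matches the expectation that the operator does not crash.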
      
      

      Additional info:

          Cannot reproduce on 4.16 or 4.15.
      must-gather log: https://drive.google.com/file/d/11ApMD65BSb5But194MKM3Vd1MKZliGtp/view?usp=sharing

              fmissi Flavian Missi
              rh-ee-xiuwang XiuJuan Wang
              XiuJuan Wang XiuJuan Wang