Infinispan / ISPN-15350

Operator Backup CR does not work when xsite replication is enabled

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Component: Operator

      Backup CR example:

      ```
      kubectl -n keycloak-blue get backups backup-2023-11-29-15-27-28 -o yaml
      apiVersion: infinispan.org/v2alpha1
      kind: Backup
      metadata:
        annotations:
        creationTimestamp: "2023-11-29T23:27:32Z"
        generation: 1
        name: backup-2023-11-29-15-27-28
        namespace: keycloak-blue
        resourceVersion: "456093296"
        uid: 76348b9c-7dc0-4fd2-ac69-aba0c0acd579
      spec:
        cluster: infinispan-blue
        container:
          memory: 1Gi
        resources:
          caches:
          - '*'
        volume:
          storage: 1Gi
      status:
        phase: Failed
        reason: 'unable to retrieve Backup with name ''backup-2023-11-29-15-27-28'' due
          to server error: ''500 Internal Server Error'''
      ```

      Operator config:
      ```
      apiVersion: infinispan.org/v1
      kind: Infinispan
      metadata:
        name: infinispan-blue
        annotations:
          infinispan.org/monitoring: 'true'
      spec:
        replicas: 2
        version: 14.0.19
        configMapName: cluster-config
        dependencies:
          # Add libraries so infinispan knows how to marshall jboss encoded objects
          # Source: https://github.com/keycloak/keycloak/issues/20031
          artifacts:
            - maven: org.keycloak:keycloak-model-infinispan:21.1.1
            - maven: org.keycloak:keycloak-server-spi:21.1.1
            - maven: org.keycloak:keycloak-server-spi-private:21.1.1
            - maven: org.keycloak:keycloak-core:21.1.1
        security:
          endpointSecretName: infinispan-credentials
        container:
          cpu: "2000m:100m"
          memory: "2Gi:250Mi"
        service:
          container:
            storage: 10Gi
            ephemeralStorage: false
          type: DataGrid
          # Docs: https://infinispan.org/docs/infinispan-operator/main/operator.html#configuring-sites-in-clusters_cross-site
          sites:
            local:
              name: SiteBlue
              expose:
                type: ClusterIP
              # It is recommended to configure all nodes as relay nodes.
              # Docs: https://infinispan.org/docs/stable/titles/xsite/xsite.html#cross-site-relay-nodes_cross-site-replication
              # tl;dr this number should match replicas
              maxRelayNodes: 2
            locations:
              - name: SiteGreen
                clusterName: infinispan-green
                namespace: keycloak-green
        logging:
          categories:
            org.jgroups.protocols.TCP: error
            org.jgroups.protocols.relay.RELAY2: error
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 100
                podAffinityTerm:
                  labelSelector:
                    matchLabels:
                      app: infinispan-pod
                      clusterName: infinispan-blue
                      infinispan_cr: infinispan-blue
                  topologyKey: "topology.kubernetes.io/zone"
              - weight: 90
                podAffinityTerm:
                  labelSelector:
                    matchLabels:
                      app: infinispan-pod
                      clusterName: infinispan-blue
                      infinispan_cr: infinispan-blue
                  topologyKey: "kubernetes.io/hostname"
      ```

      We are working on getting Infinispan running in Kubernetes via the operator, and we would like to have both xsite replication and scheduled backups enabled. We added xsite replication first. When we create a Backup CR, the operator picks it up and creates the zero-capacity pod. The pod seems to connect to the cluster, but then logs the error below, and the backup cache data is never written to the PV. It may be possible to work around this by creating backups through the REST API (we think so, but haven't tried it yet), but we would much prefer to use the operator-managed backup/restore feature.
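For reference, the REST workaround mentioned above might look roughly like the following. This is an untested sketch, not a confirmed fix: the endpoint path is assumed from the Infinispan 14.x REST documentation and should be checked against the docs for the server version in use, the cache manager name `default` and port `11222` are assumed server defaults, the service hostname is hypothetical, and authentication is left out because it depends on how the endpoint is configured. The helper name `build_backup_request` is made up for illustration.

```python
import urllib.parse
import urllib.request


def build_backup_request(base_url, backup_name, cache_manager="default"):
    """Build (but do not send) a POST request asking the server to create a backup.

    Assumption: Infinispan 14.x exposes backup creation at
    /rest/v2/cache-managers/{cacheManager}/backups/{backupName}.
    Verify the path against the REST docs for your server version.
    """
    name = urllib.parse.quote(backup_name, safe="")
    manager = urllib.parse.quote(cache_manager, safe="")
    url = f"{base_url.rstrip('/')}/rest/v2/cache-managers/{manager}/backups/{name}"
    # An empty body requests a backup with the server's default settings.
    return urllib.request.Request(url, data=b"", method="POST")


if __name__ == "__main__":
    # Hypothetical in-cluster service address built from the names in this report.
    request = build_backup_request(
        "http://infinispan-blue.keycloak-blue.svc:11222",
        "backup-2023-11-29-15-27-28",
    )
    print(request.get_method(), request.full_url)
    # Sending the request requires credentials that match the endpoint's
    # authentication mechanism (omitted here):
    # urllib.request.urlopen(request)
```

The request is only constructed and printed; nothing is sent until the `urlopen` call is enabled and credentials are supplied.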

      ```
      19:10:21,964 ERROR (blocking-thread--p3-t1) [org.infinispan.CONFIG] ISPN000660: Cache sessions start failed, stopping any running components org.infinispan.commons.CacheConfigurationException: ISPN000571: RELAY2 not found in the protocol stack. Cannot perform cross-site operations.

      at org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkCrossSiteAvailable(JGroupsTransport.java:457)
      at org.infinispan.container.versioning.irac.DefaultIracVersionGenerator.start(DefaultIracVersionGenerator.java:63)
      at org.infinispan.container.versioning.irac.CorePackageImpl$2.start(CorePackageImpl.java:68)
      at org.infinispan.container.versioning.irac.CorePackageImpl$2.start(CorePackageImpl.java:60)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.invokeStart(BasicComponentRegistryImpl.java:616)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.doStartWrapper(BasicComponentRegistryImpl.java:607)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.startWrapper(BasicComponentRegistryImpl.java:576)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl$ComponentWrapper.running(BasicComponentRegistryImpl.java:807)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.startDependencies(BasicComponentRegistryImpl.java:634)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.doStartWrapper(BasicComponentRegistryImpl.java:598)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.startWrapper(BasicComponentRegistryImpl.java:576)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl$ComponentWrapper.running(BasicComponentRegistryImpl.java:807)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.startDependencies(BasicComponentRegistryImpl.java:649)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.doStartWrapper(BasicComponentRegistryImpl.java:598)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.startWrapper(BasicComponentRegistryImpl.java:576)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl$ComponentWrapper.running(BasicComponentRegistryImpl.java:807)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.startDependencies(BasicComponentRegistryImpl.java:634)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.doStartWrapper(BasicComponentRegistryImpl.java:598)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl.startWrapper(BasicComponentRegistryImpl.java:576)
      at org.infinispan.factories.impl.BasicComponentRegistryImpl$ComponentWrapper.running(BasicComponentRegistryImpl.java:807)
      at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:379)
      at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:252)
      at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:222)
      at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:1009)
      at org.infinispan.cache.impl.AbstractDelegatingCache.start(AbstractDelegatingCache.java:504)
      at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:727)
      at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:673)
      at org.infinispan.manager.DefaultCacheManager.internalGetCache(DefaultCacheManager.java:562)
      at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:525)
      at org.infinispan.security.actions.GetCacheAction.run(GetCacheAction.java:26)
      at org.infinispan.security.actions.GetCacheAction.run(GetCacheAction.java:14)
      at org.infinispan.security.Security.doPrivileged(Security.java:56)
      at org.infinispan.globalstate.impl.SecurityActions.doPrivileged(SecurityActions.java:30)
      at org.infinispan.globalstate.impl.SecurityActions.getCache(SecurityActions.java:39)
      at org.infinispan.globalstate.impl.VolatileLocalConfigurationStorage.lambda$createCache$0(VolatileLocalConfigurationStorage.java:87)
      at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
      at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
      at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
      at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
      at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
      at java.base/java.lang.Thread.run(Thread.java:840)
      ```

      Assignee: Pedro Ruivo (pruivo@redhat.com)
      Reporter: Matthew Fletcher (fletcm, inactive)