For the Jaeger part of the SMCP resource, when the following spec is added, the operator fails to provision a usable Elasticsearch instance:
storage:
  esIndexCleaner:
    enabled: true
    numberOfDays: 7
    schedule: "55 23 * * *"
  type: Elasticsearch
  elasticsearch:
    properties:
      doNotProvision: false
      name: elasticsearch
    nodeCount: 1
    resources:
      requests:
        cpu: "500m"
        memory: "8Gi"
      limits:
        cpu: "1"
        memory: "16Gi"
    redundancyPolicy: ZeroRedundancy
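For context, this storage block normally sits under spec.addons.jaeger.install in a maistra.io/v2 ServiceMeshControlPlane. The sketch below only shows the assumed surrounding structure; the control-plane name (basic) and namespace (istio-system) are inferred from the pod names later in this report and may differ:

apiVersion: maistra.io/v2
kind: ServiceMeshControlPlane
metadata:
  name: basic                  # assumed from the istiod-basic pod name below
  namespace: istio-system      # assumed from the elasticsearch-cdm-istiosystemjaeger pod name
spec:
  tracing:
    type: Jaeger
  addons:
    jaeger:
      install:
        storage:
          # ... the storage block quoted above goes here ...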
Furthermore, only redundancyPolicy and nodeCount are actually applied; the rest of the configuration is simply ignored.
The resources specification is ignored entirely: it has to be added to the Elasticsearch deployment manually afterwards for the Elasticsearch pod to come up and for the Jaeger collector and query pods to work properly. This manual workaround is not an acceptable solution, because any re-deployment of Elasticsearch through the operator fails in the same way again, since the amended deployment is not persisted anywhere.
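As a rough illustration of that manual workaround (not a recommended fix), the resources can be set on the generated deployment along these lines. The deployment name is the one from the reproducer output further down; the namespace, the container name (elasticsearch) and the values (which simply mirror the spec above) are assumptions, and, as described above, the operator does not persist the change:

# namespace, container name and values are assumptions;
# the deployment name matches the reproducer output below
oc -n istio-system set resources deployment/elasticsearch-cdm-istiosystemjaeger-1 \
  --containers=elasticsearch \
  --requests=cpu=500m,memory=8Gi \
  --limits=cpu=1,memory=16Gi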
Reproducer Steps:
- I provided the same set of configuration in the SMCP, including Elasticsearch's resource limits and requests:
storage:
  type: Elasticsearch
  esIndexCleaner:
    enabled: true
    numberOfDays: 7
    schedule: "55 23 * * *"
  elasticsearch:
    properties:
      doNotProvision: false
      name: elasticsearch
    nodeCount: 1
    resources:
      requests:
        cpu: "500m"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
    redundancyPolicy: ZeroRedundancy
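A minimal sketch of creating the SMCP and waiting for it to become ready; the file name, control-plane name and namespace are assumptions:

# file name, SMCP name and namespace are assumptions
oc apply -n istio-system -f smcp.yaml
oc wait --for condition=Ready -n istio-system smcp/basic --timeout=300s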
- Once the SMCP was created, I checked its status and could not see any of the configuration other than the following:
storage:
  elasticsearch:
    nodeCount: 1
    redundancyPolicy: ZeroRedundancy
  type: Elasticsearch
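The section above was read back from the SMCP resource itself; a command along these lines shows it (the SMCP name and namespace are assumptions):

# SMCP name and namespace are assumptions; compare the storage section
# under .spec with what appears under .status
oc -n istio-system get smcp basic -o yaml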
- After creating the SMCP CR, I found that the Elasticsearch pod was stuck in a 'Pending' state, complaining about insufficient memory on the nodes, even though the nodes had sufficient memory to host the pod:
# oc get pods
NAME                                                     READY   STATUS    RESTARTS   AGE
elasticsearch-cdm-istiosystemjaeger-1-5bfbcfbccc-f5psk   0/2     Pending   0          4m

Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  3m55s  default-scheduler  0/6 nodes are available: 3 Insufficient memory, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 3 No preemption victims found for incoming pod, 3 Preemption is not helpful for scheduling..
  Warning  FailedScheduling  3m31s  default-scheduler  0/6 nodes are available: 3 Insufficient memory, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 3 No preemption victims found for incoming pod, 3 Preemption is not helpful for scheduling..
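To confirm that the Pending state comes from the requests the operator actually rendered rather than from genuinely full nodes, a comparison along these lines can help; the namespace is an assumption and the deployment name is the one from the output above:

# requests/limits the operator actually rendered for each container
oc -n istio-system get deployment elasticsearch-cdm-istiosystemjaeger-1 \
  -o jsonpath='{.spec.template.spec.containers[*].resources}'

# allocatable/allocated memory on the worker nodes, for comparison
oc describe nodes -l node-role.kubernetes.io/worker | grep -A 8 "Allocated resources"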
- Then I had to edit the Elasticsearch Deployment and adjust the memory slightly; immediately afterwards, the Elasticsearch pod came up (a sketch of the edited fragment follows the output below):
# oc get pods
NAME                                                     READY   STATUS    RESTARTS   AGE
elasticsearch-cdm-istiosystemjaeger-1-796c9f5985-p2944   2/2     Running   0          14m
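For clarity, the manual edit amounts to a resources stanza like the following on the Elasticsearch container of the deployment. The exact values used in the edit above were not recorded, so the ones shown simply mirror the SMCP spec from the reproducer and should be treated as an assumption, as should the container name:

# fragment of the manually edited Elasticsearch deployment; values are assumed
spec:
  template:
    spec:
      containers:
      - name: elasticsearch        # container name is an assumption
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "1Gi"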
- After this, all the pods were running fine:
# oc get pods
NAME                                                     READY   STATUS    RESTARTS      AGE
elasticsearch-cdm-istiosystemjaeger-1-796c9f5985-p2944   2/2     Running   0             14m
grafana-5494f7549c-prnfm                                 2/2     Running   0             21m
istio-egressgateway-b4448476d-7jtk8                      1/1     Running   0             21m
istio-ingressgateway-785b7d86d9-2w54g                    1/1     Running   0             21m
istiod-basic-6df6d4f674-72xb6                            1/1     Running   0             21m
jaeger-collector-5fdd4f94dd-tpt5t                        1/1     Running   7 (12m ago)   18m
jaeger-query-75cb6cc4b4-wf2qs                            3/3     Running   7 (11m ago)   18m
kiali-586d8dd847-twv7r                                   1/1     Running   0             17m
prometheus-8454b64867-ttmpg                              2/2     Running   0             21m
Expectations from this Bug:
Elasticsearch instances should reach a running state without any manual modification of the Elasticsearch deployment, and the SMCP should respect the resource limits and requests specified for Elasticsearch.
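In other words, the expectation is roughly that the operator propagates the SMCP values into the Elasticsearch resource it creates, along the lines of the sketch below. The field layout follows my understanding of the OpenShift Elasticsearch CR (logging.openshift.io/v1) and, together with the namespace, should be treated as an assumption rather than the operator's exact output:

apiVersion: logging.openshift.io/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
  namespace: istio-system       # namespace is an assumption
spec:
  managementState: Managed
  redundancyPolicy: ZeroRedundancy
  nodeSpec:
    # resources copied through from the SMCP instead of being dropped
    resources:
      requests:
        cpu: "500m"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
  nodes:
  - nodeCount: 1
    roles:
    - client
    - data
    - master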