-
Bug
-
Resolution: Unresolved
-
Normal
Moderate
Steps to reproduce:
1. Deploy OpenStackControlPlane with the default settings from the `openstack-galera-network-isolation` example CR.
2. Deploy an OpenStackDataPlaneNodeSet and an OpenStackDataPlaneDeployment (I used `make edpm_deploy` from install_yamls for that).
3. Once everything has deployed successfully, create a new OpenStackDataPlaneDeployment with `servicesOverride` set to additionally install neutron-sriov-agent, for example:
    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: openstack-edpm-add-sriov-agent
    spec:
      nodeSets:
        - openstack-edpm-ipam
      servicesOverride:
        - neutron-sriov
4. Check the status of the `neutron-sriov-openstack-edpm-add-sriov-agent-openstack-edpm` pod; it will be in the Error state.
5. Check the logs of that pod; the error will look something like:
    TASK [osp.edpm.edpm_container_manage : Create containers managed by Podman for /var/lib/edpm-config/container-startup-config/neutron_sriov_agent] ***
    Thursday 08 August 2024 08:18:50 +0000 (0:00:00.102) 0:00:26.445 *******
    [WARNING]: ERROR: Container neutron_sriov_agent exited with code 125 when runed
    stderr: time="2024-08-08T08:18:51Z" level=info msg="podman filtering at log level info"
    time="2024-08-08T08:18:51Z" level=info msg="Using sqlite as database backend"
    time="2024-08-08T08:18:51Z" level=info msg="Not using native diff for overlay, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled"
    time="2024-08-08T08:18:51Z" level=info msg="Setting parallel job count to 7"
    time="2024-08-08T08:18:51Z" level=info msg="Sysctl net.ipv4.ping_group_range=0 0 ignored in containers.conf, since Network Namespace set to host"
    Error: statfs /var/lib/openstack/cacerts/neutron-sriov/tls-ca-bundle.pem: no such file or directory
    fatal: [edpm-compute-0]: FAILED! => {"changed": false, "msg": "Failed containers: neutron_sriov_agent"}
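Steps 4 and 5 can be scripted roughly as below. This is only a sketch: the `openstack` namespace and the helper names `find_sriov_pods` and `show_sriov_logs` are assumptions, not something taken from the deployment above, and the pod is matched by name prefix because the real pod name may carry a generated suffix.

```shell
#!/bin/sh
# Assumption: the data plane job pods live in the "openstack" namespace.
NAMESPACE="${NAMESPACE:-openstack}"
# Pod name prefix from step 4.
POD_PREFIX="neutron-sriov-openstack-edpm-add-sriov-agent-openstack-edpm"

# find_sriov_pods (hypothetical helper)
# Prints "<pod name> <status>" for every pod whose name starts with POD_PREFIX.
find_sriov_pods() {
    oc get pods -n "$NAMESPACE" --no-headers |
        awk -v p="$POD_PREFIX" 'index($1, p) == 1 { print $1, $3 }'
}

# show_sriov_logs POD_NAME (hypothetical helper)
# Dumps the logs of the given pod.
show_sriov_logs() {
    oc logs -n "$NAMESPACE" "$1"
}

# Example:
#   find_sriov_pods
#   show_sriov_logs <pod name printed by find_sriov_pods>
```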
I was able to work around it by SSHing into the EDPM node and running `sudo mkdir /var/lib/openstack/cacerts/neutron-sriov; sudo cp /var/lib/openstack/cacerts/neutron-metadata/tls-ca-bundle.pem /var/lib/openstack/cacerts/neutron-sriov/`, after which the ansible runner container finished the job without any errors.
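The same workaround can be wrapped in a small re-runnable script. This is a sketch only: the helper name `seed_ca_bundle` and the `CACERTS_ROOT` override are made up for illustration, and it merely copies the existing neutron-metadata bundle, as in the manual commands, rather than fixing the underlying ordering problem.

```shell
#!/bin/sh
# CACERTS_ROOT defaults to the directory from the error message; the
# override exists only so the helper can be tried against a scratch dir.
CACERTS_ROOT="${CACERTS_ROOT:-/var/lib/openstack/cacerts}"

# seed_ca_bundle SRC_SERVICE DST_SERVICE (hypothetical helper)
# Copies tls-ca-bundle.pem from SRC_SERVICE's directory into DST_SERVICE's
# directory, creating the latter if needed. Leaves an existing target
# bundle untouched.
seed_ca_bundle() {
    src="$CACERTS_ROOT/$1/tls-ca-bundle.pem"
    dst_dir="$CACERTS_ROOT/$2"

    if [ ! -f "$src" ]; then
        echo "source bundle not found: $src" >&2
        return 1
    fi
    if [ -f "$dst_dir/tls-ca-bundle.pem" ]; then
        echo "bundle already present in $dst_dir"
        return 0
    fi
    mkdir -p "$dst_dir" && cp "$src" "$dst_dir/"
}

# On the EDPM node, as root:
#   seed_ca_bundle neutron-metadata neutron-sriov
```

After running it, the failed deployment job would need to be retried, as with the manual commands.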
I didn't check this in a deployment where the neutron-sriov agent is enabled from the beginning. The issue may occur only when the service is added through `servicesOverride`; this still needs to be checked.
- depends on: OSPRH-4767 Some services have to be deployed only after other services reach specific state
- Backlog