Spike
Resolution: Unresolved
Epic Goal
- Enable SR-IOV operator support on OCP on IBM Z, with Network Express card hardware support integrated
Why is this important?
- On OpenShift, this would make certain types of PCI devices (the NED card in this case) available for direct attachment to pods, improving performance
Scenarios
1. …
Acceptance Criteria
- SR-IOV operator enabled on OCP on IBM Z
- Able to attach a PCI device (NED card) directly to pods
- Documentation updated with the support statement
Dependencies (internal and external)
1. z17 Hardware with NED card enabled
2. mstflint (Mellanox Firmware Tool): s390x support required
- The Mellanox SR-IOV plugin depends on mstflint for firmware queries and device capability detection.
- We must enable or validate s390x support in mstflint to ensure correct introspection on IBM Z hardware.
3. Multi-Arch CI / Build Tooling in Downstream.
Risks:
1. Firmware / driver compatibility
- Mellanox firmware capabilities on s390x platforms may differ from those on x86/ARM.
- Validation is required to avoid incorrect VF creation or device misconfiguration.
2. Kernel and operator behavior on s390x may differ from other architectures, so validating the operator there carries risk. Any issues found during conformance or e2e testing must be identified and resolved before the operator can be considered functional on s390x.
3. For CI enablement, IBM Z hardware along with SR-IOV-capable Network Express cards needs to be moved to the Orange Zone. If this is not completed on time, CI will instead be enabled directly on the IBM Z platform, and the test results will be generated and shared from there.
Mitigation plan:
1. Run full conformance and e2e testing on s390x to surface architecture-specific issues.
2. Investigate and fix kernel/operator inconsistencies in collaboration with upstream maintainers.
3. Validate each fix on an isolated IBM Z environment to avoid regression risks.
4. Re-run tests until the operator behaves consistently across all architectures.
5. If the Orange Zone transition is delayed, enable CI on the IBM Z platform as a fallback; generate, validate, and share the test results, and use them to continuously improve test coverage and stability.
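The VF validation called out under risk 1 could start from a small check like the one below. It is only a sketch: it assumes the standard Linux PCI sysfs attributes `sriov_totalvfs` and `sriov_numvfs` are exposed for the NED card on s390x as they are on other architectures, which this spike still has to confirm. The function names and the device path in the usage note are hypothetical.

```python
from pathlib import Path


def read_int(path: Path) -> int:
    """Read a single integer from a sysfs-style attribute file."""
    return int(path.read_text().strip())


def check_vf_config(device_dir: Path, expected_vfs: int) -> list[str]:
    """Compare the configured VF count against the expected one.

    Returns a list of problems; an empty list means the check passed.
    Assumption: device_dir contains sriov_totalvfs and sriov_numvfs,
    as on other Linux architectures.
    """
    total_file = device_dir / "sriov_totalvfs"
    num_file = device_dir / "sriov_numvfs"
    if not total_file.exists() or not num_file.exists():
        return [f"{device_dir}: SR-IOV attributes not present"]

    total = read_int(total_file)
    current = read_int(num_file)
    problems = []
    if expected_vfs > total:
        problems.append(
            f"expected {expected_vfs} VFs but device supports at most {total}"
        )
    if current != expected_vfs:
        problems.append(
            f"expected {expected_vfs} VFs but {current} are configured"
        )
    return problems
```

On a node, this might be called as `check_vf_config(Path("/sys/class/net/<iface>/device"), 4)`, where the interface name and the expected VF count come from the node policy under test.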
Previous Work (Optional):
1. …
Open questions:
1. …
Done Checklist
- CI - For new features (non-enablement), existing Multi-Arch CI jobs are not broken by the Epic
- Release Enablement: <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - If the Epic is adding a new stream, downstream build attached to advisory: <link to errata>
- QE - Test plans in Test Plan tracking software (e.g. Polarion, RQM, etc.): <link or reference to the Test Plan>
- QE - Automated tests merged: <link or reference to automated tests>
- QE - QE to verify documentation when testing
- DOC - Downstream documentation merged: <link to meaningful PR>
- All the stories, tasks, sub-tasks and bugs that belong to this epic need to have been completed and indicated by a status of 'Done'.