Feature Request
Resolution: Unresolved
Normal
sat-proton
Problem Statement
The customer needs to comply with "SYS.1.6.A3 Secure deployment of containerized IT systems" (URL: https://access.redhat.com/articles/7045834#containerization). The requirement is: "During ongoing operations, the performance and condition of the containerized IT systems SHOULD be monitored (so-called health checks)."
Currently, the Lightspeed in Satellite containers (formerly Insights on-premises) do not meet this requirement.
User Experience & Workflow
- The customer would like to receive a notification within the Satellite WebUI when a container is down or is not working as expected.
- In addition to reporting the issue, the WebUI should recommend troubleshooting steps to resolve it, e.g. restarting the container or the host.
- When a container is not running, this should be reported in the Satellite WebUI in:
  - Monitor/Dashboard
  - Notifications (the bell in the upper right of the UI)
  - Red Hat Lightspeed/<service name>
Requirements
- Notification of container status, e.g.
  - <service name> <green light> meaning all containers operational
  - <service name> <red light> <container name> down. Please try … to resolve the issue
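The status lines above could be derived roughly as follows. This is only a sketch of the requested behaviour, not how Satellite implements anything: the names `ServiceStatus`, `summarize` and `render` are hypothetical, the container names in the usage example are made up, and `[green]`/`[red]` are plain-text stand-ins for the lights.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceStatus:
    """Service-level health derived from its containers (hypothetical type)."""

    service: str            # display name, e.g. the <service name> in the UI
    down: tuple[str, ...]   # containers that are currently not running

    @property
    def healthy(self) -> bool:
        # A service is healthy only if no container is reported down.
        return not self.down


def summarize(service: str, containers: dict[str, bool]) -> ServiceStatus:
    """Collapse per-container 'running' flags into one service status."""
    down = tuple(sorted(name for name, running in containers.items() if not running))
    return ServiceStatus(service=service, down=down)


def render(status: ServiceStatus) -> str:
    """Render the green/red status line described in the requirement."""
    if status.healthy:
        return f"{status.service} [green] all containers operational"
    return f"{status.service} [red] {', '.join(status.down)} down"
```

For example, `render(summarize("Advisor", {"example-api": True, "example-db": False}))` yields a red line naming `example-db`; the troubleshooting hint would be appended by whatever component knows the recovery steps.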
Business Impact
- Delivering this feature enables our customers to meet regulatory requirements.
- Delivering it enables them to comply with "SYS.1.6.A3 Secure deployment of containerized IT systems".
- When this feature *is not included*, it's not possible for the customer to determine whether all containers are up and running.
- I cannot assess whether or how this could negatively affect other Red Hat projects.
Additional Information
The following information is taken from my lab. It does not contain any customer data.
- When all containers are up, the view in Red Hat Lightspeed/Recommendations looks as shown in the attached screenshot 'iop-advisor-working.png'.
- When I stop some containers related to the service, it looks as shown in the attached screenshot 'iop-advisor-broken.png'.
- The message 'check our status page for known outages.' is not helpful for on-premises services that are down.
- This page should contain information about which systemd services (i.e. podman containers) are not functional and need troubleshooting.
- Advice on how to recover from the situation, e.g. by restarting service units, would be helpful.
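To illustrate the kind of check the page could be backed by, here is a minimal sketch that asks systemd for the state of each unit and proposes a restart for the ones that are not active. It assumes the containers run as systemd units and relies only on `systemctl is-active` and `systemctl restart`; the function names are hypothetical and the real unit names are not listed here, so the caller has to supply them.

```python
from __future__ import annotations

import subprocess
from typing import Callable, Iterable


def systemctl_state(unit: str) -> str:
    """Return the state word printed by `systemctl is-active <unit>`."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit code only means "not active"
        )
    except FileNotFoundError:
        # systemctl is not installed on this host
        return "unknown"
    return result.stdout.strip() or "unknown"


def find_broken_units(
    units: Iterable[str],
    get_state: Callable[[str], str] = systemctl_state,
) -> dict[str, str]:
    """Map every unit that is not 'active' to its reported state."""
    states = {unit: get_state(unit) for unit in units}
    return {unit: state for unit, state in states.items() if state != "active"}


def recovery_hints(broken: dict[str, str]) -> list[str]:
    """Build one human-readable hint per broken unit (restart is an assumed remedy)."""
    return [
        f"{unit} is {state}; try: systemctl restart {unit}"
        for unit, state in sorted(broken.items())
    ]
```

The `get_state` parameter exists so the logic can be exercised with canned states on a machine without systemd; on a Satellite host the default would query the real units.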