Feature
Resolution: Unresolved
Major
rhos-18.0.14 FR 4
Important
L
10% To Do, 45% In Progress, 45% Done
rhos-conplat-observability
Red Hat OpenStack Services on OpenShift (formerly Red Hat OpenStack Platform)
Technology Preview
Feature Overview
Provide the ability to request data from the data storage domains in order to generate cloud resource usage reports and billing data that external FinOps or billing systems can consume.
We are initiating a project to build chargeback capability for our OpenStack environment. Customers need the ability to understand their cloud usage for the purposes of:
- Transparent Cost Recovery
- Customer Trust and Competitive Differentiation
Chargeback provides visibility into cloud resource consumption across the various customer domains.
Why does the customer need this? (List the business requirements)
Transparent Cost Recovery
- As a cloud provider, the organization operates at large scale, offering on-demand compute, storage, and network resources to multiple tenants. Showback provides a clear, itemized breakdown of resource usage per tenant, ensuring each customer understands what they are consuming and the associated costs.
- By attributing real-time or monthly costs to usage (e.g., compute hours, storage GB-month, bandwidth), the cloud provider can recoup operational expenses without surprising customers with opaque billing.
- Showback also encourages optimized resource usage. When tenants see how their consumption affects costs, they can make informed decisions (e.g., rightsizing virtual machines or archiving stale data).
Customer Trust and Competitive Differentiation
- Offering transparent consumption reports increases customer trust. Tenants value clarity on how they are being charged or how their resource usage impacts their budget.
- In a competitive market, having robust chargeback capabilities can differentiate the cloud provider by demonstrating cost visibility.
- Chargeback enables additional use cases and makes it possible for the customer to operate as a cloud provider.
Goals
- This feature should allow tenant billing and resource planning based on tenant usage.
- Usage should be available as raw data output from the CLI.
- Usage should be based on the metrics agreed upon in the Requirements section below.
Requirements
To achieve a comprehensive view of IaaS consumption, we require the collection of specific metrics from the following services:
- Nova (Compute)
- Cinder (Block Storage)
- Manila (Shared File Systems)
- Neutron (Networking)
- Keystone (Identity)
- Ironic (Bare Metal)
- Octavia (Load Balancing, including OVN load balancers)
- Swift (Object Storage)
- Glance (Image Service)
- Barbican (Key Management)
- Designate (DNS)
- Heat (Orchestration)
| Service | Metric | MVP? | Available in CloudKitty | Work to include (S-XL) | Comment |
| --- | --- | --- | --- | --- | --- |
| Nova (Compute) | Instance hours (total hours instances are running) | Yes | yes | This will provide lifetime but not uptime | |
| | vCPU usage (allocated CPU count or CPU time) | | Same as above | | |
| | Memory usage (RAM allocated in MB/GB) | | yes | | |
| | Ephemeral storage usage (local disk usage) | Yes | yes | | |
| Cinder (Block Storage) | Volume hours (allocated volume lifespan in hours) | | yes | | |
| | Volume size (allocated GB) | Yes | Same as above | | |
| | Snapshot size (allocated GB for snapshots) | Yes | yes | | |
| Manila (Shared File Systems) | Share hours (allocated share lifespan in hours) | | | | |
| | Share size (allocated GB) | Yes | | | |
| | Snapshot size (GB for snapshots) | | | | |
| Neutron (Networking) | Floating IP hours (number of hours floating IPs are assigned) | Yes | yes | | |
| | Network port usage (count of ports per tenant, possibly in hours) | | | | |
| | Ingress/Egress bandwidth (volume of data in/out) | Yes | yes? | Should be checked that this can be mapped to the tenant | |
| Keystone (Identity) | | | | | |
| Ironic (Bare Metal) | Bare metal node hours (physical node allocation hours) | | | | |
| | CPU/Memory resources (total CPU cores, memory in GB) | | | | |
| | Node state changes (provisioning states, cleaning, etc.) | | | | |
| Octavia (Load Balancing) | Load balancer hours (amphora or OVN LB usage) | Yes | no | There is an OVN exporter that we should investigate | |
| | Listener hours (each load balancer can have multiple listeners) | | no | | |
| | Pool hours (hours each pool is active); VIP usage (virtual IP addresses, similar to floating IPs) | | no | | |
| OVN Load Balancers (for deployments leveraging OVN as a driver) | OVN LB creation/tear-down events | | | | |
| | OVN LB VIP or router port usage (as applicable) | | | | |
| | Traffic metrics (if exposed via OVN) | | | | |
| Swift (Object Storage) | Object storage usage (GB stored × hours) | | yes | | |
| | Object count (objects per container) | | yes | | |
| | Container count (containers per tenant) | | yes | | |
| Glance (Image Service) | Image storage usage (GB stored × hours) | Yes | yes | | |
| | Image count (number of active images per tenant) | | Same as above | | |
| Barbican (Key Management) | Secrets count (number of secrets stored per tenant) | | no | | |
| | Container usage (if using Barbican containers for key groups) | | no | | |
| Designate (DNS) | DNS zone hours (each zone allocated × hours) | Yes | no | | |
| | Record count (number of DNS records per zone, per tenant) | | no | | |
| | Zone create/delete events (track frequency of new zones or removed zones) | Yes | no | | |
| Heat (Orchestration) | Stack hours (how long each stack is active) | | no | | |
| | Stack count (total active stacks per tenant) | | no | | |
| | Stack events (creation, updates, deletions) | | no | | |
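Several metrics in the table are lifetime-based ("instance hours", "volume hours") or size-weighted ("GB stored × hours"). As a rough illustration only, the sketch below shows how such quantities could be derived from a resource's creation and deletion timestamps within a reporting window. The function names and parameters are hypothetical and are not CloudKitty's API; the calculation measures lifetime, not uptime, in line with the note on instance hours.

```python
from datetime import datetime, timezone
from typing import Optional


def lifetime_hours(created_at: datetime, deleted_at: Optional[datetime],
                   window_start: datetime, window_end: datetime) -> float:
    """Hours a resource existed inside [window_start, window_end).

    Hypothetical helper: counts lifetime (allocated time), not uptime.
    A resource that has not been deleted is counted up to window_end.
    """
    start = max(created_at, window_start)
    end = min(deleted_at, window_end) if deleted_at else window_end
    # Clamp to zero when the resource lies entirely outside the window.
    return max((end - start).total_seconds(), 0.0) / 3600.0


def gb_hours(size_gb: float, created_at: datetime,
             deleted_at: Optional[datetime],
             window_start: datetime, window_end: datetime) -> float:
    """Size-weighted usage (GB x hours), assuming a constant size."""
    return size_gb * lifetime_hours(created_at, deleted_at,
                                    window_start, window_end)


if __name__ == "__main__":
    june = datetime(2025, 6, 1, tzinfo=timezone.utc)
    july = datetime(2025, 7, 1, tzinfo=timezone.utc)
    created = datetime(2025, 6, 30, 12, tzinfo=timezone.utc)
    print(lifetime_hours(created, None, june, july))
    print(gb_hours(50, june, None, june, july))
```

Clipping to the window keeps resources that span several reporting periods from being counted twice across consecutive reports.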
Use Cases
- Collection of usage rates and data export based on flavor type
- As the administrator and customer of RHOSO, I want to reliably generate and access itemized raw data output (preferably JSON) of OpenStack resource usage for all my tenants, with compute resources, <metric_1>, <metric_2>, ..., and <metric_n> broken down by instance flavor and identified by project name (resolved from project IDs for readability), so that I can implement transparent cost recovery for my tenants, integrate this data into my internal FinOps or billing solutions, and support my strategic capacity planning and operational efficiency initiatives.
- Report generation will use UUIDs by default, but it should be possible to look up the human-readable name for each UUID.
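The use cases above could be sketched roughly as follows: raw usage records are summed per project and flavor, project UUIDs are resolved to names where a lookup is available, and the result is emitted as JSON. The record fields (`project_id`, `flavor`, `metric`, `qty`) and the `project_names` mapping are assumptions made for illustration; they do not reflect the actual output schema of CloudKitty or the CLI.

```python
import json
from collections import defaultdict


def summarize_by_project_and_flavor(records, project_names):
    """Sum usage per (project, flavor, metric) and attach project names.

    `records` is a list of dicts with hypothetical keys:
    project_id, flavor, metric, qty.
    `project_names` maps project UUID -> human-readable name; when a
    UUID is missing from the mapping, the UUID itself is used.
    """
    totals = defaultdict(float)
    for rec in records:
        key = (rec["project_id"], rec["flavor"], rec["metric"])
        totals[key] += rec["qty"]

    rows = []
    for (project_id, flavor, metric), qty in sorted(totals.items()):
        rows.append({
            "project_id": project_id,
            "project_name": project_names.get(project_id, project_id),
            "flavor": flavor,
            "metric": metric,
            "qty": qty,
        })
    return rows


if __name__ == "__main__":
    # Illustrative data only; not real usage figures.
    sample = [
        {"project_id": "p1", "flavor": "m1.small", "metric": "instance_hours", "qty": 24.0},
        {"project_id": "p1", "flavor": "m1.small", "metric": "instance_hours", "qty": 12.0},
        {"project_id": "p2", "flavor": "m1.large", "metric": "instance_hours", "qty": 5.0},
    ]
    names = {"p1": "finance"}
    print(json.dumps(summarize_by_project_and_flavor(sample, names), indent=2))
```

Keeping the UUID in every row alongside the resolved name lets an external billing system join on a stable identifier even if a project is renamed.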
Out of Scope
- Providing a billing interface and report generation that could be used for billing directly from the implementation.
- Interfacing with Horizon or other GUI systems.
Documentation Considerations
- Product documentation
- Minimal amount of information allowing for data collection enablement and access to creating the queries (ratings) required for the data export.
- Product Management to create a blog post that provides a walkthrough of a general use case, with examples and output from the CLI.
Questions to Answer
- Can we get functionality merged and backported that allows retrieval of flavor attributes from the compute systems?
- Is it possible to get Manila data?
- Can we get Loki team sign off and an internal support agreement for usage of Loki within OpenStack customer environments?
- Do we need Swift/S3 sign-off?
- Do we need Nova sign-off for flavor?
Risks
- Upstream patch to provide a storage adapter for Loki.
- Downstream Loki being deployable on CRC so that it can be enabled in the testing framework.
- Deployment of a production-style compact OpenShift cluster with a production-style Loki environment for review and sign-off by the Loki team.
- Getting CloudKitty image builds early enough in the development timeline to allow for testing and validation prior to shipping FR4.
Background and Strategic Fit
This feature is highly important for many new customers. One example use case is building cloud services on their own machines, for which billing would be crucial.
Customer Considerations
This feature is built from customer needs and should be continuously evaluated with the customer.
- relates to: COST-5067 Virtual Machines: Cost Management for OpenStack (Backlog)
-
- links to