Goal Summary:
Investigate and validate how RHACS could support AI workload identification and AI BOM ingestion in the future by researching model formats, metadata standards, and integration patterns, and by documenting clear recommendations, constraints, and next steps.
Goals and expected user outcomes:
Goals:
- Evaluate approaches for identifying AI workloads and ingesting AI BOMs in RHACS.
- Investigate model formats, metadata standards, and integration patterns relevant to AI workloads.
- Document feasibility, constraints, and architectural trade-offs for potential implementation approaches.
Expected end-user outcomes (enabled by future implementation informed by this discovery):
- End users will be able to identify which workloads in their clusters contain AI/ML models.
- End users will have centralized visibility into AI models and associated metadata through RHACS.
- End users will be able to associate externally generated AI BOMs with running workloads and images.
- End users will gain a clearer inventory of AI-related artifacts to support governance, audit, and risk assessment.
Possible approaches to evaluate:
- Option A: Representing AI workloads as existing Kubernetes objects (Deployments, StatefulSets, Pods) annotated with AI metadata.
- Option B: Introducing a new AI workload abstraction that can coexist with existing workloads and store AI-specific metadata such as model formats, provenance, and AI BOM associations.
- Option C: Hybrid approaches, where existing workloads are enriched with associated AI metadata objects to track AI models without modifying workload types.
This discovery does not directly deliver the expected end-user outcomes listed above; it defines the path and requirements needed to enable them in a subsequent implementation.
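As a rough illustration of Option C, the hybrid approach could be sketched as a side index that attaches AI metadata records to existing workloads without changing the workload types. All class and field names here (`WorkloadRef`, `AIModelMetadata`, `AIMetadataIndex`, `ai_bom_ref`) are hypothetical and are not part of RHACS; this is a sketch for discussion, not a proposed schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class WorkloadRef:
    """Identifies an existing Kubernetes workload; the workload type is unchanged."""
    cluster: str
    namespace: str
    kind: str  # e.g. "Deployment", "StatefulSet", "Pod"
    name: str


@dataclass
class AIModelMetadata:
    """Hypothetical record holding AI-specific metadata for one model."""
    model_name: str
    model_format: str                 # e.g. "safetensors", "gguf", "onnx"
    provenance: Optional[str] = None
    ai_bom_ref: Optional[str] = None  # pointer to an externally generated AI BOM


@dataclass
class AIMetadataIndex:
    """Maps each workload to zero or more AI metadata records (Option C)."""
    _by_workload: Dict[WorkloadRef, List[AIModelMetadata]] = field(default_factory=dict)

    def attach(self, workload: WorkloadRef, meta: AIModelMetadata) -> None:
        self._by_workload.setdefault(workload, []).append(meta)

    def models_for(self, workload: WorkloadRef) -> List[AIModelMetadata]:
        return list(self._by_workload.get(workload, []))

    def is_ai_workload(self, workload: WorkloadRef) -> bool:
        # A workload counts as "AI" once at least one record is attached.
        return bool(self._by_workload.get(workload))
```

Under this sketch, Option A would correspond to storing the same fields directly on the workload, and Option B to promoting `AIModelMetadata` to a first-class workload type.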
Acceptance Criteria:
- Evaluate and document technical approaches for representing AI workloads and AI BOMs in RHACS.
- Identify architectural, security, performance, and maintainability considerations for handling AI model artifacts.
- Assess integration patterns and constraints for future ingestion of externally generated AI BOMs and scanner outputs.
- Create a lightweight prototype showing how this would work within RHACS.
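A starting point for the AI BOM part of the prototype might look like the sketch below. It assumes the externally generated AI BOM is a CycloneDX JSON document, where models appear as components of type `machine-learning-model`; this ticket does not fix a BOM format, so that choice is an assumption. The `associate` function and the shape of the record it returns are hypothetical.

```python
from typing import Any, Dict, Iterator, List

# CycloneDX component type used for ML models (available from spec version 1.5).
ML_MODEL_TYPE = "machine-learning-model"


def iter_components(components: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every component in the BOM, including nested ones."""
    for component in components:
        yield component
        yield from iter_components(component.get("components", []))


def extract_models(bom: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return name/version pairs for all ML model components in the BOM."""
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX document")
    return [
        {"name": c.get("name"), "version": c.get("version")}
        for c in iter_components(bom.get("components", []))
        if c.get("type") == ML_MODEL_TYPE
    ]


def associate(bom: Dict[str, Any], image_digest: str) -> Dict[str, Any]:
    """Hypothetical association record linking an image digest to BOM models."""
    return {"image_digest": image_digest, "models": extract_models(bom)}
```

How a BOM is matched to an image or deployment (digest supplied by the uploader, reference embedded in the BOM, or something else) is one of the open questions this discovery is expected to answer; the sketch simply takes the digest as an argument.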
Success Criteria or KPIs measured:
- Completion of a lightweight prototype that demonstrates how AI workload identification and AI BOM ingestion could work in RHACS.
- Alignment across Product, Engineering, and Architecture on the findings and recommendations from the Discovery.
- Stakeholder review and sign-off confirming the discovery objectives have been met.
Use Cases (Optional):
- Security Engineer: Wants visibility into which workloads in the cluster are running AI models so they can plan governance and risk mitigation.
- Cluster Administrator: Needs to inventory AI workloads and associated metadata to ensure proper management and compliance across the environment.
- DevSecOps Team Member: Wants to understand how externally generated AI BOMs could be associated with deployments or images to prepare for policy enforcement in CI/CD pipelines.
- Product Manager / Technical Stakeholder: Wants a prototype that demonstrates potential approaches for AI workload identification and BOM ingestion to inform roadmap decisions.
Out of Scope (Optional):
- Delivery of production-ready AI workload discovery or AI BOM ingestion features.
- Vulnerability scanning or CVE correlation for AI models.
- Behavioral, safety, bias, fairness, or hallucination evaluation of AI models.
- Native execution, deserialization, or analysis of AI models within RHACS.
- Integration with specific third-party AI artifact scanners for production use.
- Performance, scalability, or reliability guarantees for any future implementation.
- Future Direction / Step 2: Integration with external AI artifact scanners (e.g., modelscan) and policy evaluation of ingested findings — to be considered in a separate implementation phase.
Additional Context:
MODEL METADATA SOURCES
| Category | Metadata Source | What Metadata It Provides | Link |
|---|---|---|---|
| Metadata Source | Hugging Face Model Cards | Training data summary, intended use, risks, limitations, license, evaluation metrics. | https://huggingface.co/docs/hub/model-cards |
| Metadata Source | ONNX Model Metadata (ModelProto) | Opset version, graph structure, type info, custom metadata fields. | https://onnx.ai/onnx/repo-docs/IR.html |
| Metadata Source | Safetensors Metadata Header | Embedded key-value metadata alongside tensor storage. | https://huggingface.co/docs/safetensors/index |
| Metadata Source | TensorFlow SavedModel Metadata | Signatures, input/output shapes, function definitions, graph info. | https://www.tensorflow.org/guide/saved_model |
| Metadata Source | MLflow MLmodel YAML | Model signatures, input/output schema, environment/runtime metadata. | https://mlflow.org/docs/latest/models.html |
| Metadata Source | Keras Model Config | Architecture, training configuration, preprocessing details. | https://keras.io/guides/serialization_and_saving/ |
| Metadata Source | KServe / Kubeflow InferenceService Spec | Runtime type, model URI, predictor config, supported protocols. | https://kserve.github.io/ |
| Metadata Source | Registry Manifests (e.g., Hugging Face Hub, ONNX Model Zoo) | Model version, tags, authors, datasets, license, provenance. | https://huggingface.co/models / https://github.com/onnx/models |
| Metadata Source | Modelscan / AI BOM Extractors | Architecture, tokenizer, parameters, quantization, risks, provenance, reproducibility info. | https://github.com/protectai/modelscan |
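
To make one of the sources above concrete, the Safetensors header can be read without loading any tensor data: the file starts with an 8-byte little-endian unsigned integer giving the header length, followed by that many bytes of UTF-8 JSON, in which the optional `__metadata__` key holds string key-value pairs. The sketch below relies only on that layout; the function name and the `MAX_HEADER_BYTES` cap are assumptions made for this example.

```python
import json
import struct
from typing import Dict

# Assumed safety cap for this sketch, to avoid reading an absurdly large header.
MAX_HEADER_BYTES = 100 * 1024 * 1024


def read_safetensors_metadata(path: str) -> Dict[str, str]:
    """Return the free-form metadata from a Safetensors header, without tensors."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short to contain a header length")
        (header_len,) = struct.unpack("<Q", prefix)  # little-endian uint64
        if header_len > MAX_HEADER_BYTES:
            raise ValueError("header length exceeds the configured cap")
        raw = f.read(header_len)
        if len(raw) != header_len:
            raise ValueError("truncated header")
    header = json.loads(raw.decode("utf-8"))
    # "__metadata__" is optional; all other keys describe individual tensors.
    return dict(header.get("__metadata__", {}))
```

Because only the JSON header is parsed, this kind of extraction stays clear of model deserialization, which is listed as out of scope.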
FORMATS
| Category | Name / Format | Description | Link |
|---|---|---|---|
| Model Format | ONNX (Open Neural Network Exchange) | Portable, framework-agnostic format for exchanging deep learning models. | https://onnx.ai/ |
| Model Format | PyTorch (.pt, .pth) | PyTorch-native serialized model formats; can include TorchScript. | https://pytorch.org/tutorials/beginner/saving_loading_models.html |
| Model Format | TorchScript | PyTorch scripting/tracing format for serialized models. | https://pytorch.org/docs/stable/jit.html |
| Model Format | Safetensors | Hugging Face secure tensor serialization format for PyTorch / Transformers. | https://huggingface.co/docs/safetensors/index |
| Model Format | GGUF | Llama.cpp / GGML-based format; includes structured metadata for LLMs. | https://github.com/ggerganov/ggml/blob/master/docs/gguf.md |
| Model Format | TensorFlow SavedModel | TensorFlow’s standard serialized model format including graph, variables, assets. | https://www.tensorflow.org/guide/saved_model |
| Model Format | TensorFlow Frozen Graph (.pb) | TensorFlow serialized graph format (deprecated but still in use). | https://www.tensorflow.org/guide/intro_to_graphs |
| Model Format | TensorFlow Lite (.tflite) | Lightweight ML model format for mobile / edge deployment. | https://www.tensorflow.org/lite/models |
| Model Format | Keras HDF5 (.h5) | Keras / TensorFlow serialization including architecture, weights, training config. | https://keras.io/api/models/model_saving_apis/ |
| Model Format | Flax / JAX Checkpoints | Checkpoint format for Flax models; stores parameters and optimizer states. | https://flax.readthedocs.io/en/latest/guides/use_checkpointing.html |
| Model Format | Orbax Checkpointing | Newer JAX / Flax checkpointing framework for distributed training. | https://github.com/google/orbax |
| Model Format | XGBoost Models (.json, binary) | Gradient boosting models; can store tree structure, parameters, metadata. | https://xgboost.readthedocs.io/en/stable/tutorials/saving_model.html |
| Model Format | LightGBM Models (.txt, binary) | Gradient boosting models; supports textual and binary serialization. | https://lightgbm.readthedocs.io/en/latest/Advanced-Topics.html#model-format |
| Model Format | CatBoost (.cbm) | Gradient boosting model format with metadata for reproducibility. | https://catboost.ai/en/docs/concepts/python-reference_catboost_save_model |
| Model Format | Scikit-Learn Pickle / Joblib (.pkl, .joblib) | Serialized models for classical ML; includes architecture and parameters. | https://scikit-learn.org/stable/model_persistence.html |
| Model Format | Core ML (.mlmodel) | Apple ML format for iOS / macOS; includes model description, metadata, weights. | https://developer.apple.com/documentation/coreml |
| Model Format | RKNN (Rockchip NPU) | Optimized model format for Rockchip NPUs / embedded inference. | https://github.com/rockchip-linux/rknn-toolkit2 |
| Model Format | NNAPI Models | Android Neural Networks API runtime model format for mobile inference. | https://developer.android.com/ndk/guides/neuralnetworks |
| Model Format | TensorRT Engine (.engine) | NVIDIA optimized model format for TensorRT inference; stores network and weights. | https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html |
| Model Format | NVIDIA Triton Model Repository Format | Repository structure for serving models with Triton Inference Server. | https://github.com/triton-inference-server/server |
| Model Format | DeepLearning4J Models (.zip) | JVM-based neural network models with architecture, weights, and config. | https://deeplearning4j.konduit.ai/ |
| Model Format | PMML (Predictive Model Markup Language) | XML-based format for classical ML / statistical models; stores architecture, parameters, metadata. | https://dmg.org/pmml/v4-4-1/GeneralStructure.html |
| Model Format | Caffe (.caffemodel, .prototxt) | Caffe deep learning model and network definition files. | https://caffe.berkeleyvision.org/ |
| Model Format | MXNet (.params + .json) | Apache MXNet serialized model format. | https://mxnet.apache.org/versions/1.9.1/api/faq/model_load_save |
| Model Format | NNEF (Neural Network Exchange Format) | Khronos standardized model format for portable NN graphs. | https://www.khronos.org/nnef |
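
One low-cost way to flag candidate AI workloads from the formats above is to match file names found in image layers or mounted volumes against known extensions, with a magic-byte check where a format defines one. The sketch below covers only a subset of the table and is an assumption about how a prototype might work, not an RHACS feature. The `.onnx`, `.safetensors`, and `.gguf` extensions and the 4-byte `GGUF` magic are conventional for those formats rather than taken from the table; ambiguous extensions such as `.json`, `.txt`, `.zip`, and `.pb` are deliberately left out to limit false positives.

```python
from pathlib import PurePosixPath
from typing import Optional

# Subset of the formats table; extensions with many non-model uses are omitted.
EXTENSION_TO_FORMAT = {
    ".onnx": "ONNX",
    ".pt": "PyTorch",
    ".pth": "PyTorch",
    ".safetensors": "Safetensors",
    ".gguf": "GGUF",
    ".tflite": "TensorFlow Lite",
    ".h5": "Keras HDF5",
    ".cbm": "CatBoost",
    ".pkl": "Scikit-Learn Pickle / Joblib",
    ".joblib": "Scikit-Learn Pickle / Joblib",
    ".mlmodel": "Core ML",
    ".engine": "TensorRT Engine",
    ".caffemodel": "Caffe",
}

GGUF_MAGIC = b"GGUF"  # GGUF files begin with these four bytes


def guess_model_format(filename: str, head: bytes = b"") -> Optional[str]:
    """Guess a model format from the first bytes (if given) or the extension."""
    if head.startswith(GGUF_MAGIC):
        return "GGUF"
    suffix = PurePosixPath(filename).suffix.lower()
    return EXTENSION_TO_FORMAT.get(suffix)
```

Extension matching alone will produce both false positives (a `.pkl` file need not be a model) and false negatives (models fetched at runtime), so a result from `guess_model_format` is best treated as a hint to combine with other signals such as a KServe InferenceService spec.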
Issue Links:
- Is cloned by: ROX-32143 [TechPreview] AI Workload Discovery and AI BOM Ingestion in RHACS (New)