-
Bug
-
Resolution: Done
-
Major
-
None
-
Sprint 2.6, Model Serving Sprint 2.7, Model Serving Sprint 2.8, Model Serving Sprint 2.9-1, Model Serving Sprint 2.9-2, Model Serving Sprint Q2-2, Model Serving Sprint Q2-3
When KServe and ModelMesh are running in the same namespace, the ModelMesh controller shows these errors:
{"level":"error","ts":"2023-12-07T11:33:47Z","msg":"Reconciler error","controller":"predictor","controllerGroup":"serving.kserve.io","controllerKind":"Predictor","Predictor":{"name":"caikit-tgis-example-isvc","namespace":"isvc_kserve-demo"},"namespace":"isvc_kserve-demo","name":"caikit-tgis-example-isvc","reconcileID":"868aa907-1733-408b-a8cd-482ac234f616","error":"failed to remove corresponding VModel for deleted Predictor kserve-demo/caikit-tgis-example-isvc: rpc error: code = Unavailable desc = last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 10.128.0.84:8033: i/o timeout\"","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/root/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/root/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\ns... 
{"level":"error","ts":"2023-12-07T11:33:47Z","msg":"Reconciler error","controller":"predictor","controllerGroup":"serving.kserve.io","controllerKind":"Predictor","Predictor":{"name":"example-onnx-mnist","namespace":"isvc_kserve-demo"},"namespace":"isvc_kserve-demo","name":"example-onnx-mnist","reconcileID":"4736d0d4-e010-4915-a537-07634c94d85f","error":"failed to SetVModel for InferenceService example-onnx-mnist: rpc error: code = Unavailable desc = last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 10.128.0.84:8033: i/o timeout\"","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/root/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/root/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/contro...
This happens because the kserve-demo namespace is a member of the ServiceMeshMemberRoll, so traffic from the modelmesh-controller pod cannot reach the ModelMesh runtime pod. As a workaround, the following NetworkPolicy can be created in the kserve-demo namespace to allow ingress traffic from the opendatahub namespace:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-from-opendatahub-ns
  namespace: kserve-demo
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: opendatahub
  policyTypes:
    - Ingress
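If KServe and ModelMesh share more than one namespace, the same policy would presumably be needed in each of them. The sketch below is not part of the original report. It builds an equivalent manifest for an arbitrary target namespace using only the Python standard library. The function name `allow_from_namespace_policy` and its parameters are hypothetical. The output is JSON, which `oc apply -f -` and `kubectl apply -f -` accept in place of YAML.

```python
import json


def allow_from_namespace_policy(target_ns: str, source_ns: str = "opendatahub") -> dict:
    """Build a NetworkPolicy that allows ingress to every pod in target_ns
    from any pod in source_ns (hypothetical helper, mirrors the manifest above)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"allow-from-{source_ns}-ns",
            "namespace": target_ns,
        },
        "spec": {
            # An empty podSelector selects all pods in the target namespace.
            "podSelector": {},
            "ingress": [
                {
                    "from": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {
                                    "kubernetes.io/metadata.name": source_ns
                                }
                            }
                        }
                    ]
                }
            ],
            "policyTypes": ["Ingress"],
        },
    }


if __name__ == "__main__":
    # Print the manifest for the namespace from this report.
    print(json.dumps(allow_from_namespace_policy("kserve-demo"), indent=2))
```

Assuming the script is saved under a name such as `make_policy.py`, its output could be piped straight into `oc apply -f -`.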
See the following thread for more details:
https://redhat-internal.slack.com/archives/C065ARTVA80/p1702293019814919?thread_ts=1701693652.733169&cid=C065ARTVA80
- is depended on by
-
RHOAIENG-1244 Test co-existence of KServe and ModelMesh in the same NS
- Review
- split to
-
RHOAIENG-2653 Model mesh controller monitors and reports logs about KServe model resources when in the same namespace
- New