SR-IOV cluster:

* Borrowed BM cluster from Zhanqi:

$ oc get node
NAME                                      STATUS   ROLES    AGE     VERSION
dell-per740-13.rhts.eng.pek2.redhat.com   Ready    master   2d17h   v1.24.0+9546431
dell-per740-14.rhts.eng.pek2.redhat.com   Ready    worker   2d16h   v1.24.0+9546431
dell-per740-31.rhts.eng.pek2.redhat.com   Ready    master   2d17h   v1.24.0+9546431
dell-per740-32.rhts.eng.pek2.redhat.com   Ready    master   2d17h   v1.24.0+9546431
dell-per740-35.rhts.eng.pek2.redhat.com   Ready    worker   2d16h   v1.24.0+9546431

* One node, dell-per740-14.rhts.eng.pek2.redhat.com, has SR-IOV enabled (label feature.node.kubernetes.io/sriov-capable=true):

$ oc get nodes --show-labels
NAME                                      STATUS   ROLES    AGE     VERSION           LABELS
dell-per740-13.rhts.eng.pek2.redhat.com   Ready    master   2d17h   v1.24.0+9546431   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=dell-per740-13.rhts.eng.pek2.redhat.com,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
dell-per740-14.rhts.eng.pek2.redhat.com   Ready    worker   2d16h   v1.24.0+9546431   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,feature.node.kubernetes.io/sriov-capable=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=dell-per740-14.rhts.eng.pek2.redhat.com,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos
dell-per740-31.rhts.eng.pek2.redhat.com   Ready    master   2d17h   v1.24.0+9546431   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=dell-per740-31.rhts.eng.pek2.redhat.com,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
dell-per740-32.rhts.eng.pek2.redhat.com   Ready    master   2d17h   v1.24.0+9546431   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=dell-per740-32.rhts.eng.pek2.redhat.com,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
dell-per740-35.rhts.eng.pek2.redhat.com   Ready    worker   2d16h   v1.24.0+9546431   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=dell-per740-35.rhts.eng.pek2.redhat.com,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos

* OCP version: Server Version: 4.11.0-0.nightly-2022-08-11-023608

* SR-IOV Operator is installed:

$ oc get pods -n openshift-sriov-network-operator
NAME                                     READY   STATUS    RESTARTS   AGE
network-resources-injector-cf5n8         1/1     Running   2          2d16h
network-resources-injector-rmsh9         1/1     Running   1          2d16h
network-resources-injector-x8dg8         1/1     Running   1          2d16h
operator-webhook-8x4jq                   1/1     Running   1          2d16h
operator-webhook-fscvd                   1/1     Running   1          2d16h
operator-webhook-wdz4n                   1/1     Running   2          2d16h
sriov-device-plugin-f7csv                1/1     Running   0          12h
sriov-network-config-daemon-cjlvm        3/3     Running   9          2d16h
sriov-network-config-daemon-h87wb        3/3     Running   6          2d16h
sriov-network-operator-bb7ff449b-zjsmx   1/1     Running   0          27h

* An additional step is required in the BM environment to provision NFS server storage. Used this script to deploy the nfs-server: https://gitlab.cee.redhat.com/-/ide/project/wduan/openshift_storage/tree/master/-/nfs/deploy_nfs_provisioner.sh
  With the above script the NFS server pod should be running in the nfs-provisioner namespace.

* Create a PV. Replace the server IP with the NFS service IP (a scripted way to fill it in is sketched after this step):

$ oc get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                                                                                                     AGE
nfs-provisioner   ClusterIP   172.30.61.95   <none>        2049/TCP,2049/UDP,32803/TCP,32803/UDP,20048/TCP,20048/UDP,875/TCP,875/UDP,111/TCP,111/UDP,662/TCP,662/UDP   119m

$ cat ~/workspaces/cluster_bot/pv-nfs.yaml
{
  "apiVersion": "v1",
  "kind": "PersistentVolume",
  "metadata": {
    "name": "nfs"
  },
  "spec": {
    "capacity": {
      "storage": "5Gi"
    },
    "accessModes": [
      "ReadWriteOnce"
    ],
    "nfs": {
      "path": "/",
      "server": "172.30.61.95"
    },
    "persistentVolumeReclaimPolicy": "Recycle"
  }
}

Source: https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/storage/nfs/auto-nfs-recycle-rwo.json

$ oc apply -f ~/workspaces/cluster_bot/pv-nfs.yaml
persistentvolume/nfs created

$ oc get pv
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
nfs    5Gi        RWO            Recycle          Available                                   6s
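Optionally, the server IP in pv-nfs.yaml can be filled in programmatically instead of editing it by hand (just a convenience sketch; it assumes the nfs-provisioner service lives in the nfs-provisioner namespace):

# Look up the ClusterIP of the NFS service and patch it into the PV definition
$ NFS_IP=$(oc get svc nfs-provisioner -n nfs-provisioner -o jsonpath='{.spec.clusterIP}')
$ sed -i "s/\"server\": \"[^\"]*\"/\"server\": \"$NFS_IP\"/" ~/workspaces/cluster_bot/pv-nfs.yaml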
"spec": { "capacity": { "storage": "5Gi" }, "accessModes": [ "ReadWriteOnce" ], "nfs": { "path": "/", "server": "172.30.61.95" }, "persistentVolumeReclaimPolicy": "Recycle" } } Source: https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/storage/nfs/auto-nfs-recycle-rwo.json $ oc apply -f ~/workspaces/cluster_bot/pv-nfs.yaml persistentvolume/nfs created $ oc get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE nfs 5Gi RWO Recycle Available 6s * Create PVC for loki: $ cat ~/workspaces/cluster_bot/1-loki-storage-bm.yaml apiVersion: v1 kind: PersistentVolumeClaim metadata: name: loki-store spec: resources: requests: storage: 1G volumeMode: Filesystem accessModes: - ReadWriteOnce% $ oc apply -f ~/workspaces/cluster_bot/1-loki-storage-bm.yaml persistentvolumeclaim/loki-store created $ oc get pvc -n nfs-provisioner NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE loki-store Bound pvc-b3db44f2-0c84-4058-b76b-8be7732ef1ad 1G RWO nfs 33m * Deployed loki: curl -S -L https://raw.githubusercontent.com/netobserv/documents/main/examples/zero-click-loki/2-loki.yaml > ~/workspaces/cluster_bot/2-loki.yaml $ oc apply -f ~/workspaces/cluster_bot/2-loki.yaml configmap/loki-config created pod/loki created service/loki created $ oc get pods NAME READY STATUS RESTARTS AGE loki 1/1 Running 0 57s nfs-provisioner-7bb7786897-wmg65 1/1 Running 0 125m * Deployed flowcollector as: $ oc get flowcollector -o yaml apiVersion: v1 items: - apiVersion: flows.netobserv.io/v1alpha1 kind: FlowCollector metadata: creationTimestamp: "2022-08-16T15:24:25Z" finalizers: - flows.netobserv.io/finalizer generation: 4 name: cluster resourceVersion: "1555033" uid: faf1aae0-00e3-43e1-88bf-efe21dc095a0 spec: agent: ebpf clusterNetworkOperator: namespace: openshift-network-operator consolePlugin: image: quay.io/netobserv/network-observability-console-plugin:v0.1.4 imagePullPolicy: IfNotPresent logLevel: info port: 9001 portNaming: enable: true portNames: "3100": loki register: true replicas: 1 resources: limits: memory: 100Mi requests: cpu: 100m memory: 50Mi ebpf: cacheActiveTimeout: 5s cacheMaxFlows: 1000 excludeInterfaces: - lo image: quay.io/netobserv/netobserv-ebpf-agent:v0.1.2 imagePullPolicy: IfNotPresent logLevel: info resources: {} flowlogsPipeline: dropUnusedFields: true enableKubeProbes: true healthPort: 8080 image: quay.io/netobserv/flowlogs-pipeline:v0.1.3 imagePullPolicy: IfNotPresent kind: DaemonSet logLevel: info port: 2055 prometheusPort: 9102 replicas: 1 resources: limits: memory: 300Mi requests: cpu: 100m memory: 100Mi ipfix: cacheActiveTimeout: 20s cacheMaxFlows: 400 forceSampleAll: false sampling: 2 kafka: address: kafka-cluster-kafka-bootstrap.network-observability enable: false tls: caCert: certFile: ca.crt name: kafka-cluster-cluster-ca-cert type: secret enable: false insecureSkipVerify: false userCert: certFile: user.crt certKey: user.key name: flp-kafka type: secret topic: network-flows loki: batchSize: 102400 batchWait: 1s maxBackoff: 5m0s maxRetries: 10 minBackoff: 1s staticLabels: app: netobserv-flowcollector tenantID: netobserv timeout: 10s tls: caCert: certFile: service-ca.crt name: loki type: configmap enable: false insecureSkipVerify: false userCert: {} url: http://loki.nfs-provisioner.svc:3100/ namespace: network-observability ovnKubernetes: containerName: ovnkube-node daemonSetName: ovnkube-node namespace: ovn-kubernetes status: conditions: - lastTransitionTime: "2022-08-17T08:17:04Z" message: "" reason: Ready status: "True" type: Ready namespace: 
* Deployed flowcollector as:

$ oc get flowcollector -o yaml
apiVersion: v1
items:
- apiVersion: flows.netobserv.io/v1alpha1
  kind: FlowCollector
  metadata:
    creationTimestamp: "2022-08-16T15:24:25Z"
    finalizers:
    - flows.netobserv.io/finalizer
    generation: 4
    name: cluster
    resourceVersion: "1555033"
    uid: faf1aae0-00e3-43e1-88bf-efe21dc095a0
  spec:
    agent: ebpf
    clusterNetworkOperator:
      namespace: openshift-network-operator
    consolePlugin:
      image: quay.io/netobserv/network-observability-console-plugin:v0.1.4
      imagePullPolicy: IfNotPresent
      logLevel: info
      port: 9001
      portNaming:
        enable: true
        portNames:
          "3100": loki
      register: true
      replicas: 1
      resources:
        limits:
          memory: 100Mi
        requests:
          cpu: 100m
          memory: 50Mi
    ebpf:
      cacheActiveTimeout: 5s
      cacheMaxFlows: 1000
      excludeInterfaces:
      - lo
      image: quay.io/netobserv/netobserv-ebpf-agent:v0.1.2
      imagePullPolicy: IfNotPresent
      logLevel: info
      resources: {}
    flowlogsPipeline:
      dropUnusedFields: true
      enableKubeProbes: true
      healthPort: 8080
      image: quay.io/netobserv/flowlogs-pipeline:v0.1.3
      imagePullPolicy: IfNotPresent
      kind: DaemonSet
      logLevel: info
      port: 2055
      prometheusPort: 9102
      replicas: 1
      resources:
        limits:
          memory: 300Mi
        requests:
          cpu: 100m
          memory: 100Mi
    ipfix:
      cacheActiveTimeout: 20s
      cacheMaxFlows: 400
      forceSampleAll: false
      sampling: 2
    kafka:
      address: kafka-cluster-kafka-bootstrap.network-observability
      enable: false
      tls:
        caCert:
          certFile: ca.crt
          name: kafka-cluster-cluster-ca-cert
          type: secret
        enable: false
        insecureSkipVerify: false
        userCert:
          certFile: user.crt
          certKey: user.key
          name: flp-kafka
          type: secret
      topic: network-flows
    loki:
      batchSize: 102400
      batchWait: 1s
      maxBackoff: 5m0s
      maxRetries: 10
      minBackoff: 1s
      staticLabels:
        app: netobserv-flowcollector
      tenantID: netobserv
      timeout: 10s
      tls:
        caCert:
          certFile: service-ca.crt
          name: loki
          type: configmap
        enable: false
        insecureSkipVerify: false
        userCert: {}
      url: http://loki.nfs-provisioner.svc:3100/
    namespace: network-observability
    ovnKubernetes:
      containerName: ovnkube-node
      daemonSetName: ovnkube-node
      namespace: ovn-kubernetes
  status:
    conditions:
    - lastTransitionTime: "2022-08-17T08:17:04Z"
      message: ""
      reason: Ready
      status: "True"
      type: Ready
    namespace: network-observability
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

$ oc get pods
NAME                                            READY   STATUS    RESTARTS      AGE
flowlogs-pipeline-2bxnm                         1/1     Running   0             33m
flowlogs-pipeline-dtqg4                         1/1     Running   2 (33m ago)   33m
flowlogs-pipeline-n95db                         1/1     Running   2 (34m ago)   34m
flowlogs-pipeline-v468q                         1/1     Running   0             33m
flowlogs-pipeline-wqd4k                         1/1     Running   3 (34m ago)   34m
network-observability-plugin-7866fcd6b7-gsm96   1/1     Running   0             34m

$ oc get pods -n network-observability-privileged
NAME                         READY   STATUS    RESTARTS   AGE
netobserv-ebpf-agent-hb87x   1/1     Running   1          26h
netobserv-ebpf-agent-j7hnd   1/1     Running   2          26h
netobserv-ebpf-agent-j9nk5   1/1     Running   0          26h
netobserv-ebpf-agent-m847f   1/1     Running   0          26h
netobserv-ebpf-agent-p77df   1/1     Running   0          26h

* Verified end to end: flows are received and displayed correctly in both the Table view and the Topology view.
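As a CLI-level cross-check, flows stored in Loki can also be queried directly through its HTTP API (a sketch; it port-forwards the loki service from the nfs-provisioner namespace and reuses the staticLabels/tenantID values from the FlowCollector spec above; the X-Scope-OrgID header is harmless if Loki runs with auth disabled):

# Forward the Loki service locally, then ask for a few recent flow records
$ oc port-forward -n nfs-provisioner svc/loki 3100:3100 &
$ curl -s -G 'http://localhost:3100/loki/api/v1/query_range' \
    -H 'X-Scope-OrgID: netobserv' \
    --data-urlencode 'query={app="netobserv-flowcollector"}' \
    --data-urlencode 'limit=5'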