- Bug
- Resolution: Done
- Undefined
- rhelai-1.5
- None
- False
- False
- Approved
From: https://gitlab.com/redhat/rhel-ai/diip/-/jobs/10000155067
dk-bench is failing with:
ERROR 2025-05-12 06:16:58,330 instructlab.cli.model.evaluate:313: An error occurred during evaluation: zstd C API versions mismatch; Python bindings were not compiled/linked against expected zstd version (10501 returned by the lib, 10501 hardcoded in zstd headers, 10506 hardcoded in the cext)
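The integers in the error follow zstd's ZSTD_VERSION_NUMBER encoding (major*10000 + minor*100 + release), so the runtime library and bundled headers report zstd 1.5.1 while the compiled C extension expects 1.5.6. A minimal sketch decoding them (the helper name is mine, not from the ticket):

```python
def decode_zstd_version(n: int) -> str:
    """Decode a ZSTD_VERSION_NUMBER-style integer: major*10000 + minor*100 + release."""
    major, rest = divmod(n, 10000)
    minor, release = divmod(rest, 100)
    return f"{major}.{minor}.{release}"

# Values from the error message above:
print(decode_zstd_version(10501))  # -> 1.5.1 (returned by the lib and the zstd headers)
print(decode_zstd_version(10506))  # -> 1.5.6 (hardcoded in the cext)
```

This is why python-zstandard refuses to load: its C extension was compiled against newer zstd headers (1.5.6) than the zstd library it is linked against at runtime (1.5.1).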
running:

curl -s https://raw.githubusercontent.com/instructlab/instructlab/main/scripts/test-data/dk-bench-questions.jsonl > ${HOME_DIR}/dk-bench-questions.jsonl
curl -s https://raw.githubusercontent.com/instructlab/instructlab/main/scripts/test-data/dk-bench-questions-with-responses.jsonl > ${HOME_DIR}/dk-bench-questions-with-responses.jsonl
export ILAB_ADDITIONAL_ENV="OPENAI_API_KEY='$OPENAI_API_KEY'" && ilab model evaluate --model ${trained_model} --benchmark dk_bench --input-questions ${HOME_DIR}/dk-bench-questions.jsonl --output-file-formats ${dk_bench_output_formats} 2>&1 | tee dk_bench.log
export ILAB_ADDITIONAL_ENV="OPENAI_API_KEY='$OPENAI_API_KEY'" && ilab model evaluate --model ${trained_model} --benchmark dk_bench --input-questions ${HOME_DIR}/dk-bench-questions-with-responses.jsonl --output-file-formats ${dk_bench_output_formats} 2>&1 | tee dk_bench_with_responses.log

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 8 CUDA devices:
  Device 0: NVIDIA A100-SXM4-80GB, compute capability 8.0, VMM: yes
  (Devices 1-7 identical: NVIDIA A100-SXM4-80GB, compute capability 8.0, VMM: yes)

Platform:
  sys.version: 3.11.7 (main, Jan 8 2025, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)]
  sys.platform: linux
  os.name: posix
  platform.release: 5.14.0-427.65.1.el9_4.x86_64
  platform.machine: x86_64
  platform.node: instructlab-ci-8xa100-preserve
  platform.python_version: 3.11.7
  os-release.ID: rhel
  os-release.VERSION_ID: 9.4
  os-release.PRETTY_NAME: Red Hat Enterprise Linux 9.4 (Plow)
  memory.total: 1259.87 GB
  memory.available: 1250.45 GB
  memory.used: 2.45 GB

InstructLab:
  instructlab.version: 0.26.1
  instructlab-dolomite.version: 0.2.0
  instructlab-eval.version: 0.5.1
  instructlab-quantize.version: 0.1.0
  instructlab-schema.version: 0.4.2
  instructlab-sdg.version: 0.8.2
  instructlab-training.version: 0.10.2

Torch:
  torch.version: 2.6.0
  torch.backends.cpu.capability: AVX512
  torch.version.cuda: 12.4
  torch.version.hip: None
  torch.cuda.available: True
  torch.backends.cuda.is_built: True
  torch.backends.mps.is_built: False
  torch.backends.mps.is_available: False
  torch.cuda.bf16: True
  torch.cuda.current.device: 0
  torch.cuda.0.name: NVIDIA A100-SXM4-80GB, free: 78.7 GB, total: 79.1 GB, capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
  (torch.cuda.1 through torch.cuda.7 identical: NVIDIA A100-SXM4-80GB, free: 78.7 GB, total: 79.1 GB, capability: 8.0)

llama_cpp_python:
  llama_cpp_python.version: 0.3.6
  llama_cpp_python.supports_gpu_offload: True
- relates to: AIPCC-1352 zstandard package does not use system zstd (Closed)
- mentioned on (1)