Red Hat Enterprise Linux AI / RHELAI-2548

SDG on granite-8b does not choose the proper chat template


    • Priority: Important
    • Status: Approved

      To Reproduce
      Steps to reproduce the behavior:

      Run full-scale agentic SDG.
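      For example (a sketch assuming the standard ilab entry point for full-pipeline SDG; exact flags can vary by release):

        ilab data generate --pipeline full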

       

      Observe that the pretraining output does not use the legacy template that SDG should apply for the granite-8b-starter and granite-7b-starter models: the records contain <|start_of_role|>user<|end_of_role|> tags instead of the appropriate legacy tags '<|user|>' and '<|assistant|>'.

      https://github.com/instructlab/sdg/blob/eef8baedf36f21da73bda39270ed7378558541ad/src/instructlab/sdg/datamixing.py#L387-L389
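      For illustration, a minimal sketch of the difference between the two renderings (the helper names are hypothetical, not the actual datamixing.py code; the tag strings are the ones quoted above):

        # Hypothetical helpers contrasting the two chat-template renderings.
        def render_legacy(user: str, assistant: str) -> str:
            # Legacy tags that granite-7b/8b-starter pretraining data should use.
            return f"<|user|>\n{user}\n<|assistant|>\n{assistant}"

        def render_granite3(user: str, assistant: str) -> str:
            # Newer role tags currently observed in the SDG pretraining output.
            return (
                f"<|start_of_role|>user<|end_of_role|>{user}\n"
                f"<|start_of_role|>assistant<|end_of_role|>{assistant}"
            )

        print(render_legacy("What is SDG?", "Synthetic data generation."))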

       

      Expected behavior

      • With granite-8b-starter, SDG should use the legacy template ('<|user|>' / '<|assistant|>') in the pretraining output; a quick check is sketched below.
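      A quick way to verify, assuming the pretraining samples land in a JSONL file (the path below is illustrative, not a guaranteed location):

        import os

        # Illustrative path; substitute the pretraining output file from the SDG run.
        path = os.path.expanduser(
            "~/.local/share/instructlab/datasets/knowledge_train_msgs.jsonl"
        )

        bad = total = 0
        with open(path, encoding="utf-8") as f:
            for line in f:
                total += 1
                # Any record carrying the new granite role tags is mis-templated.
                if "<|start_of_role|>" in line:
                    bad += 1
        print(f"{bad}/{total} records use non-legacy tags (expect 0 after the fix)")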

      Device Info (please complete the following information):

      • Hardware Specs: 8x NVIDIA A100 machine (IBM Cloud)
      • OS Version: RHEL AI 1.3
      • InstructLab Version: ilab, version 0.21.0
      • Bootc image: "registry.redhat.io/rhelai1/bootc-ibm-nvidia-rhel9:1.3"
      • Output of ilab system info:

      Platform:
        sys.version: 3.11.7 (main, Oct  9 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)]
        sys.platform: linux
        os.name: posix
        platform.release: 5.14.0-427.42.1.el9_4.x86_64
        platform.machine: x86_64
        platform.node: tyler-machine-boot-6
        platform.python_version: 3.11.7
        os-release.ID: rhel
        os-release.VERSION_ID: 9.4
        os-release.PRETTY_NAME: Red Hat Enterprise Linux 9.4 (Plow)
        memory.total: 1259.87 GB
        memory.available: 1196.92 GB
        memory.used: 38.83 GB

       

      InstructLab:
        instructlab.version: 0.21.0
        instructlab-dolomite.version: 0.2.0
        instructlab-eval.version: 0.4.1
        instructlab-quantize.version: 0.1.0
        instructlab-schema.version: 0.4.1
        instructlab-sdg.version: 0.6.1
        instructlab-training.version: 0.6.1

       

      Torch:
        torch.version: 2.4.1
        torch.backends.cpu.capability: AVX512
        torch.version.cuda: 12.4
        torch.version.hip: None
        torch.cuda.available: True
        torch.backends.cuda.is_built: True
        torch.backends.mps.is_built: False
        torch.backends.mps.is_available: False
        torch.cuda.bf16: True
        torch.cuda.current.device: 0
        torch.cuda.0.name: NVIDIA A100-SXM4-80GB
        torch.cuda.0.free: 69.5 GB
        torch.cuda.0.total: 79.1 GB
        torch.cuda.0.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.1.name: NVIDIA A100-SXM4-80GB
        torch.cuda.1.free: 69.4 GB
        torch.cuda.1.total: 79.1 GB
        torch.cuda.1.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.2.name: NVIDIA A100-SXM4-80GB
        torch.cuda.2.free: 69.4 GB
        torch.cuda.2.total: 79.1 GB
        torch.cuda.2.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.3.name: NVIDIA A100-SXM4-80GB
        torch.cuda.3.free: 69.4 GB
        torch.cuda.3.total: 79.1 GB
        torch.cuda.3.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.4.name: NVIDIA A100-SXM4-80GB
        torch.cuda.4.free: 69.4 GB
        torch.cuda.4.total: 79.1 GB
        torch.cuda.4.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.5.name: NVIDIA A100-SXM4-80GB
        torch.cuda.5.free: 69.4 GB
        torch.cuda.5.total: 79.1 GB
        torch.cuda.5.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.6.name: NVIDIA A100-SXM4-80GB
        torch.cuda.6.free: 69.4 GB
        torch.cuda.6.total: 79.1 GB
        torch.cuda.6.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.7.name: NVIDIA A100-SXM4-80GB
        torch.cuda.7.free: 69.3 GB
        torch.cuda.7.total: 79.1 GB
        torch.cuda.7.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)

      llama_cpp_python:
        llama_cpp_python.version: 0.2.79
        llama_cpp_python.supports_gpu_offload: True

              Assignee: Tyler Lisowski (lisowskiibm)
              Reporter: Tyler Lisowski (lisowskiibm)
              Votes: 0
              Watchers: 7