/usr/lib64/python3.11/inspect.py:389: FutureWarning: `torch.distributed.reduce_op` is deprecated, please use `torch.distributed.ReduceOp` instead
  return isinstance(object, types.FunctionType)
Detected capabilities: [-cpu -gaudi -gaudi2 +gaudi3 +index_reduce]
DEBUG 01-31 03:52:19 scripts.py:139] Setting VLLM_WORKER_MULTIPROC_METHOD to 'spawn'
INFO 01-31 03:52:19 api_server.py:592] vLLM API server version 0.6.4.post2
INFO 01-31 03:52:19 api_server.py:593] args: Namespace(subparser='serve', model_tag='instructlab/granite-7b-lab', config='', host=None, port=8000, uvicorn_log_level='info', allow_credentials=False, allowed_origins=['*'], allowed_methods=['*'], allowed_headers=['*'], api_key=None, lora_modules=None, prompt_adapters=None, chat_template=None, chat_template_content_format='auto', response_role='assistant', ssl_keyfile=None, ssl_certfile=None, ssl_ca_certs=None, ssl_cert_reqs=0, root_path=None, middleware=[], return_tokens_as_token_ids=False, disable_frontend_multiprocessing=False, enable_auto_tool_choice=False, tool_call_parser=None, tool_parser_plugin='', model='instructlab/granite-7b-lab', task='auto', tokenizer=None, skip_tokenizer_init=False, revision=None, code_revision=None, tokenizer_revision=None, tokenizer_mode='auto', trust_remote_code=False, allowed_local_media_path=None, download_dir=None, load_format='auto', weights_load_device=None, config_format=, dtype='bfloat16', kv_cache_dtype='auto', quantization_param_path=None, max_model_len=None, guided_decoding_backend='outlines', distributed_executor_backend=None, worker_use_ray=False, pipeline_parallel_size=1, tensor_parallel_size=4, max_parallel_loading_workers=None, ray_workers_use_nsight=False, block_size=128, enable_prefix_caching=False, disable_sliding_window=False, use_v2_block_manager=False, use_padding_aware_scheduling=False, num_lookahead_slots=0, seed=0, swap_space=4, cpu_offload_gb=0, gpu_memory_utilization=0.9, num_gpu_blocks_override=None, max_num_batched_tokens=None, max_num_seqs=256, max_num_prefill_seqs=None, max_logprobs=20, disable_log_stats=False, quantization=None, rope_scaling=None, rope_theta=None, hf_overrides=None, enforce_eager=False, max_seq_len_to_capture=8192, disable_custom_all_reduce=False, tokenizer_pool_size=0, tokenizer_pool_type='ray', tokenizer_pool_extra_config=None, limit_mm_per_prompt=None, mm_processor_kwargs=None, enable_lora=False, enable_lora_bias=False, max_loras=1, max_lora_rank=16, lora_extra_vocab_size=256, lora_dtype='auto', long_lora_scaling_factors=None, max_cpu_loras=None, fully_sharded_loras=False, enable_prompt_adapter=False, max_prompt_adapters=1, max_prompt_adapter_token=0, device='auto', num_scheduler_steps=1, multi_step_stream_outputs=True, scheduler_delay_factor=0.0, enable_chunked_prefill=None, speculative_model=None, speculative_model_quantization=None, num_speculative_tokens=None, speculative_disable_mqa_scorer=False, speculative_draft_tensor_parallel_size=None, speculative_max_model_len=None, speculative_disable_by_batch_size=None, ngram_prompt_lookup_max=None, ngram_prompt_lookup_min=None, spec_decoding_acceptance_method='rejection_sampler', typical_acceptance_sampler_posterior_threshold=None, typical_acceptance_sampler_posterior_alpha=None, disable_logprobs_during_spec_decoding=None, model_loader_extra_config=None, ignore_patterns=[], preemption_mode=None, served_model_name=None, qlora_adapter_name_or_path=None, otlp_traces_endpoint=None, collect_detailed_traces=None, disable_async_output_proc=False, scheduling_policy='fcfs', override_neuron_config=None, override_pooler_config=None, disable_log_requests=False, max_log_len=None, disable_fastapi_docs=False, enable_prompt_tokens_details=False, dispatch_function=)
INFO 01-31 03:52:19 __init__.py:31] No plugins found.
INFO 01-31 03:52:19 api_server.py:176] Multiprocessing frontend to use ipc:///tmp/be4dad1c-69d9-4e51-b88a-1e8b7b4d5ff5 for IPC Path.
INFO 01-31 03:52:19 api_server.py:195] Started engine process with PID 34250
INFO 01-31 03:52:20 config.py:1874] Downcasting torch.float32 to torch.bfloat16.
/usr/lib64/python3.11/inspect.py:389: FutureWarning: `torch.distributed.reduce_op` is deprecated, please use `torch.distributed.ReduceOp` instead
  return isinstance(object, types.FunctionType)
Detected capabilities: [-cpu -gaudi -gaudi2 +gaudi3 +index_reduce]
INFO 01-31 03:52:26 __init__.py:31] No plugins found.
INFO 01-31 03:52:27 config.py:1874] Downcasting torch.float32 to torch.bfloat16.
INFO 01-31 03:52:28 config.py:350] This model supports multiple tasks: {'embedding', 'generate'}. Defaulting to 'generate'.
INFO 01-31 03:52:28 config.py:1017] Defaulting to use mp for distributed inference
WARNING 01-31 03:52:28 arg_utils.py:1092] [DEPRECATED] Block manager v1 has been removed, and setting --use-v2-block-manager to True or False has no effect on vLLM behavior. Please remove --use-v2-block-manager in your engine argument. If your use case is not supported by SelfAttnBlockSpaceManager (i.e. block manager v2), please file an issue with detailed information.
You are using the default legacy behaviour of the . This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
INFO 01-31 03:52:34 config.py:350] This model supports multiple tasks: {'embedding', 'generate'}. Defaulting to 'generate'.
INFO 01-31 03:52:34 config.py:1017] Defaulting to use mp for distributed inference
WARNING 01-31 03:52:34 arg_utils.py:1092] [DEPRECATED] Block manager v1 has been removed, and setting --use-v2-block-manager to True or False has no effect on vLLM behavior.
Please remove --use-v2-block-manager in your engine argument. If your use case is not supported by SelfAttnBlockSpaceManager (i.e. block manager v2), please file an issue with detailed information.
INFO 01-31 03:52:34 llm_engine.py:250] Initializing an LLM engine (v0.6.4.post2) with config: model='instructlab/granite-7b-lab', speculative_config=None, tokenizer='instructlab/granite-7b-lab', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=4096, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=4, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, weights_load_device=hpu, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=hpu, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=instructlab/granite-7b-lab, num_scheduler_steps=1, chunked_prefill_enabled=False multi_step_stream_outputs=True, enable_prefix_caching=False, use_async_output_proc=True, use_cached_outputs=True, mm_processor_kwargs=None, pooler_config=None)
You are using the default legacy behaviour of the . This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
WARNING 01-31 03:52:34 multiproc_gpu_executor.py:56] Reducing Torch parallelism from 144 threads to 1 to avoid unnecessary CPU contention.
Set OMP_NUM_THREADS in the external environment to tune this value as needed.
INFO 01-31 03:52:34 custom_cache_manager.py:17] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
INFO 01-31 03:52:34 __init__.py:31] No plugins found.
WARNING 01-31 03:52:34 utils.py:754] Pin memory is not supported on HPU.
INFO 01-31 03:52:34 selector.py:174] Using HPUAttention backend.
============================= HABANA PT BRIDGE CONFIGURATION ===========================
 PT_HPU_LAZY_MODE = 1
 PT_RECIPE_CACHE_PATH =
 PT_CACHE_FOLDER_DELETE = 0
 PT_HPU_RECIPE_CACHE_CONFIG =
 PT_HPU_MAX_COMPOUND_OP_SIZE = 9223372036854775807
 PT_HPU_LAZY_ACC_PAR_MODE = 1
 PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES = 0
 PT_HPU_EAGER_PIPELINE_ENABLE = 1
 PT_HPU_EAGER_COLLECTIVE_PIPELINE_ENABLE = 1
---------------------------: System Configuration :---------------------------
Num CPU Cores : 288
CPU RAM : 1979158020 KB
------------------------------------------------------------------------------
VLLM_PROMPT_BS_BUCKET_MIN=1 (default:1)
VLLM_PROMPT_BS_BUCKET_STEP=32 (default:32)
VLLM_PROMPT_BS_BUCKET_MAX=256 (default:256)
VLLM_DECODE_BS_BUCKET_MIN=1 (default:1)
VLLM_DECODE_BS_BUCKET_STEP=32 (default:32)
VLLM_DECODE_BS_BUCKET_MAX=256 (default:256)
VLLM_PROMPT_SEQ_BUCKET_MIN=128 (default:128)
VLLM_PROMPT_SEQ_BUCKET_STEP=128 (default:128)
VLLM_PROMPT_SEQ_BUCKET_MAX=1024 (default:1024)
VLLM_DECODE_BLOCK_BUCKET_MIN=128 (default:128)
VLLM_DECODE_BLOCK_BUCKET_STEP=128 (default:128)
VLLM_DECODE_BLOCK_BUCKET_MAX=4096 (default:4096)
Prompt bucket config (min, step, max_warmup) bs:[1, 32, 256], seq:[128, 128, 1024]
Decode bucket config (min, step, max_warmup) bs:[1, 32, 256], block:[128, 128, 4096]
DEBUG 01-31 03:52:35 parallel_state.py:983] world_size=4 rank=0 local_rank=0 distributed_init_method=tcp://127.0.0.1:58449 backend=hccl
DEBUG 01-31 03:52:38 client.py:186] Waiting for output from MQLLMEngine.
/usr/lib64/python3.11/inspect.py:389: FutureWarning: `torch.distributed.reduce_op` is deprecated, please use `torch.distributed.ReduceOp` instead
  return isinstance(object, types.FunctionType)
/usr/lib64/python3.11/inspect.py:389: FutureWarning: `torch.distributed.reduce_op` is deprecated, please use `torch.distributed.ReduceOp` instead
  return isinstance(object, types.FunctionType)
/usr/lib64/python3.11/inspect.py:389: FutureWarning: `torch.distributed.reduce_op` is deprecated, please use `torch.distributed.ReduceOp` instead
  return isinstance(object, types.FunctionType)
Detected capabilities: [-cpu -gaudi -gaudi2 +gaudi3 +index_reduce]
(VllmWorkerProcess pid=34393) INFO 01-31 03:52:41 __init__.py:31] No plugins found.
Detected capabilities: [-cpu -gaudi -gaudi2 +gaudi3 +index_reduce]
(VllmWorkerProcess pid=34392) INFO 01-31 03:52:41 __init__.py:31] No plugins found.
Detected capabilities: [-cpu -gaudi -gaudi2 +gaudi3 +index_reduce]
(VllmWorkerProcess pid=34391) INFO 01-31 03:52:41 __init__.py:31] No plugins found.
(VllmWorkerProcess pid=34393) WARNING 01-31 03:52:41 utils.py:754] Pin memory is not supported on HPU.
(VllmWorkerProcess pid=34393) INFO 01-31 03:52:41 selector.py:174] Using HPUAttention backend.
(VllmWorkerProcess pid=34393) VLLM_PROMPT_BS_BUCKET_MIN=1 (default:1)
(VllmWorkerProcess pid=34393) VLLM_PROMPT_BS_BUCKET_STEP=32 (default:32)
(VllmWorkerProcess pid=34393) VLLM_PROMPT_BS_BUCKET_MAX=256 (default:256)
(VllmWorkerProcess pid=34393) VLLM_DECODE_BS_BUCKET_MIN=1 (default:1)
(VllmWorkerProcess pid=34393) VLLM_DECODE_BS_BUCKET_STEP=32 (default:32)
(VllmWorkerProcess pid=34393) VLLM_DECODE_BS_BUCKET_MAX=256 (default:256)
(VllmWorkerProcess pid=34393) VLLM_PROMPT_SEQ_BUCKET_MIN=128 (default:128)
(VllmWorkerProcess pid=34393) VLLM_PROMPT_SEQ_BUCKET_STEP=128 (default:128)
(VllmWorkerProcess pid=34393) VLLM_PROMPT_SEQ_BUCKET_MAX=1024 (default:1024)
(VllmWorkerProcess pid=34393) VLLM_DECODE_BLOCK_BUCKET_MIN=128 (default:128)
(VllmWorkerProcess pid=34393) VLLM_DECODE_BLOCK_BUCKET_STEP=128 (default:128)
(VllmWorkerProcess pid=34393) VLLM_DECODE_BLOCK_BUCKET_MAX=4096 (default:4096)
(VllmWorkerProcess pid=34393) Prompt bucket config (min, step, max_warmup) bs:[1, 32, 256], seq:[128, 128, 1024]
(VllmWorkerProcess pid=34393) Decode bucket config (min, step, max_warmup) bs:[1, 32, 256], block:[128, 128, 4096]
(VllmWorkerProcess pid=34393) INFO 01-31 03:52:41 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
(VllmWorkerProcess pid=34392) WARNING 01-31 03:52:41 utils.py:754] Pin memory is not supported on HPU.
(VllmWorkerProcess pid=34392) INFO 01-31 03:52:41 selector.py:174] Using HPUAttention backend.
(VllmWorkerProcess pid=34392) VLLM_PROMPT_BS_BUCKET_MIN=1 (default:1)
(VllmWorkerProcess pid=34392) VLLM_PROMPT_BS_BUCKET_STEP=32 (default:32)
(VllmWorkerProcess pid=34392) VLLM_PROMPT_BS_BUCKET_MAX=256 (default:256)
(VllmWorkerProcess pid=34392) VLLM_DECODE_BS_BUCKET_MIN=1 (default:1)
(VllmWorkerProcess pid=34392) VLLM_DECODE_BS_BUCKET_STEP=32 (default:32)
(VllmWorkerProcess pid=34392) VLLM_DECODE_BS_BUCKET_MAX=256 (default:256)
(VllmWorkerProcess pid=34392) VLLM_PROMPT_SEQ_BUCKET_MIN=128 (default:128)
(VllmWorkerProcess pid=34392) VLLM_PROMPT_SEQ_BUCKET_STEP=128 (default:128)
(VllmWorkerProcess pid=34392) VLLM_PROMPT_SEQ_BUCKET_MAX=1024 (default:1024)
(VllmWorkerProcess pid=34392) VLLM_DECODE_BLOCK_BUCKET_MIN=128 (default:128)
(VllmWorkerProcess pid=34392) VLLM_DECODE_BLOCK_BUCKET_STEP=128 (default:128)
(VllmWorkerProcess pid=34392) VLLM_DECODE_BLOCK_BUCKET_MAX=4096 (default:4096)
(VllmWorkerProcess pid=34392) Prompt bucket config (min, step, max_warmup) bs:[1, 32, 256], seq:[128, 128, 1024]
(VllmWorkerProcess pid=34392) Decode bucket config (min, step, max_warmup) bs:[1, 32, 256], block:[128, 128, 4096]
(VllmWorkerProcess pid=34392) INFO 01-31 03:52:41 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
(VllmWorkerProcess pid=34391) WARNING 01-31 03:52:41 utils.py:754] Pin memory is not supported on HPU.
(VllmWorkerProcess pid=34391) INFO 01-31 03:52:41 selector.py:174] Using HPUAttention backend.
(VllmWorkerProcess pid=34391) VLLM_PROMPT_BS_BUCKET_MIN=1 (default:1)
(VllmWorkerProcess pid=34391) VLLM_PROMPT_BS_BUCKET_STEP=32 (default:32)
(VllmWorkerProcess pid=34391) VLLM_PROMPT_BS_BUCKET_MAX=256 (default:256)
(VllmWorkerProcess pid=34391) VLLM_DECODE_BS_BUCKET_MIN=1 (default:1)
(VllmWorkerProcess pid=34391) VLLM_DECODE_BS_BUCKET_STEP=32 (default:32)
(VllmWorkerProcess pid=34391) VLLM_DECODE_BS_BUCKET_MAX=256 (default:256)
(VllmWorkerProcess pid=34391) VLLM_PROMPT_SEQ_BUCKET_MIN=128 (default:128)
(VllmWorkerProcess pid=34391) VLLM_PROMPT_SEQ_BUCKET_STEP=128 (default:128)
(VllmWorkerProcess pid=34391) VLLM_PROMPT_SEQ_BUCKET_MAX=1024 (default:1024)
(VllmWorkerProcess pid=34391) VLLM_DECODE_BLOCK_BUCKET_MIN=128 (default:128)
(VllmWorkerProcess pid=34391) VLLM_DECODE_BLOCK_BUCKET_STEP=128 (default:128)
(VllmWorkerProcess pid=34391) VLLM_DECODE_BLOCK_BUCKET_MAX=4096 (default:4096)
(VllmWorkerProcess pid=34391) Prompt bucket config (min, step, max_warmup) bs:[1, 32, 256], seq:[128, 128, 1024]
(VllmWorkerProcess pid=34391) Decode bucket config (min, step, max_warmup) bs:[1, 32, 256], block:[128, 128, 4096]
(VllmWorkerProcess pid=34391) INFO 01-31 03:52:41 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
============================= HABANA PT BRIDGE CONFIGURATION ===========================
 PT_HPU_LAZY_MODE = 1
 PT_RECIPE_CACHE_PATH =
 PT_CACHE_FOLDER_DELETE = 0
 PT_HPU_RECIPE_CACHE_CONFIG =
 PT_HPU_MAX_COMPOUND_OP_SIZE = 9223372036854775807
 PT_HPU_LAZY_ACC_PAR_MODE = 1
 PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES = 0
 PT_HPU_EAGER_PIPELINE_ENABLE = 1
 PT_HPU_EAGER_COLLECTIVE_PIPELINE_ENABLE = 1
---------------------------: System Configuration :---------------------------
Num CPU Cores : 288
CPU RAM : 1979158020 KB
------------------------------------------------------------------------------
(VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:41 parallel_state.py:983] world_size=4 rank=3 local_rank=3 distributed_init_method=tcp://127.0.0.1:58449 backend=hccl
============================= HABANA PT BRIDGE CONFIGURATION ===========================
 PT_HPU_LAZY_MODE = 1
 PT_RECIPE_CACHE_PATH =
 PT_CACHE_FOLDER_DELETE = 0
 PT_HPU_RECIPE_CACHE_CONFIG =
 PT_HPU_MAX_COMPOUND_OP_SIZE = 9223372036854775807
 PT_HPU_LAZY_ACC_PAR_MODE = 1
 PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES = 0
 PT_HPU_EAGER_PIPELINE_ENABLE = 1
 PT_HPU_EAGER_COLLECTIVE_PIPELINE_ENABLE = 1
---------------------------: System Configuration :---------------------------
Num CPU Cores : 288
CPU RAM : 1979158020 KB
------------------------------------------------------------------------------
(VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:41 parallel_state.py:983] world_size=4 rank=2 local_rank=2 distributed_init_method=tcp://127.0.0.1:58449 backend=hccl
============================= HABANA PT BRIDGE CONFIGURATION ===========================
 PT_HPU_LAZY_MODE = 1
 PT_RECIPE_CACHE_PATH =
 PT_CACHE_FOLDER_DELETE = 0
 PT_HPU_RECIPE_CACHE_CONFIG =
 PT_HPU_MAX_COMPOUND_OP_SIZE = 9223372036854775807
 PT_HPU_LAZY_ACC_PAR_MODE = 1
 PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES = 0
 PT_HPU_EAGER_PIPELINE_ENABLE = 1
 PT_HPU_EAGER_COLLECTIVE_PIPELINE_ENABLE = 1
---------------------------: System Configuration :---------------------------
Num CPU Cores : 288
CPU RAM : 1979158020 KB
------------------------------------------------------------------------------
(VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:42 parallel_state.py:983] world_size=4 rank=1 local_rank=1 distributed_init_method=tcp://127.0.0.1:58449 backend=hccl
DEBUG 01-31 03:52:42 shm_broadcast.py:196] Binding to tcp://127.0.0.1:55747
INFO 01-31 03:52:42 shm_broadcast.py:236] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1, 2, 3], buffer=, local_subscribe_port=55747, remote_subscribe_port=None)
(VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:42 shm_broadcast.py:260] Connecting to tcp://127.0.0.1:55747
(VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:42 shm_broadcast.py:260] Connecting to
tcp://127.0.0.1:55747 (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:42 shm_broadcast.py:260] Connecting to tcp://127.0.0.1:55747 DEBUG 01-31 03:52:43 decorators.py:84] Inferred dynamic dimensions for forward method of : ['input_ids', 'positions', 'intermediate_tensors', 'inputs_embeds'] DEBUG 01-31 03:52:43 custom_op.py:66] custom op rotary_embedding enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom 
op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op 
silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op 
silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled INFO 01-31 03:52:43 loader.py:340] Loading weights on hpu... INFO 01-31 03:52:43 weight_utils.py:243] Using model weights format ['*.safetensors'] (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 decorators.py:84] Inferred dynamic dimensions for forward method of : ['input_ids', 'positions', 'intermediate_tensors', 'inputs_embeds'] (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rotary_embedding enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 decorators.py:84] Inferred dynamic dimensions for forward method of : ['input_ids', 'positions', 'intermediate_tensors', 'inputs_embeds'] (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 decorators.py:84] Inferred dynamic dimensions for forward method of : ['input_ids', 'positions', 'intermediate_tensors', 'inputs_embeds'] (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op 
rotary_embedding enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rotary_embedding enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 
01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled 
(VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] 
custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 
03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess 
pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm 
enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] 
custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 
03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess 
pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm 
enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 
custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess 
pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled 
(VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34392) INFO 01-31 03:52:43 loader.py:340] Loading weights on hpu... (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34393) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34393) INFO 01-31 03:52:43 loader.py:340] Loading weights on hpu... 
(VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op silu_and_mul enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) DEBUG 01-31 03:52:43 custom_op.py:66] custom op rms_norm enabled (VllmWorkerProcess pid=34391) INFO 01-31 03:52:43 loader.py:340] Loading weights on hpu... Loading safetensors checkpoint shards: 0% Completed | 0/3 [00:00