Bug
Resolution: Done-Errata
Description of problem:
vLLM doesn't do its own hostname resolution on this input (it comes from the CLI), and the hostname itself is a Kubernetes service. So ZeroMQ not doing a DNS lookup on bind breaks behavior that is expected to work upstream.
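For illustration only, one possible workaround is for the caller to resolve the hostname itself and hand ZeroMQ a numeric address. This is a minimal sketch, not the fix applied for this bug and not something vLLM is known to do. The helper name resolve_bind_endpoint is hypothetical, and the sketch assumes IPv4 and that the resolved address is one the process can actually bind to, which may not hold for every Kubernetes service address.

```python
import socket


def resolve_bind_endpoint(host: str, port: int) -> str:
    """Resolve `host` to an IPv4 address and build a tcp:// endpoint string.

    Hypothetical helper: assumes IPv4 and takes the first address returned.
    """
    # getaddrinfo performs the name lookup up front, so the endpoint handed
    # to ZeroMQ no longer contains a hostname.
    infos = socket.getaddrinfo(
        host, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    ip = infos[0][4][0]  # sockaddr of the first result is (address, port)
    return f"tcp://{ip}:{port}"
```

The returned string could then be passed to socket.bind(...) in place of the hostname-based endpoint, for example resolve_bind_endpoint("localhost", 0) instead of "tcp://localhost:0".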
Version numbers (base image, wheels, builder, etc):
quay.io/aipcc/rhaiis/cuda-ubi9:3.2.2-175769903
pyzmq
Issue observed when libzmq is 4.3.4; not observed in upstream images, where libzmq is 4.3.5.
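A quick way to flag a potentially affected image is to compare the reported libzmq version against 4.3.5. The sketch below is an assumption-laden check: the function name libzmq_predates_fix is hypothetical, and the 4.3.5 threshold comes only from the two versions named in this report (versions older than 4.3.4 were not tested here).

```python
def libzmq_predates_fix(version: str, fixed_in: tuple = (4, 3, 5)) -> bool:
    """Return True if a dotted libzmq version string is older than `fixed_in`.

    Assumption: 4.3.5 is treated as the first unaffected version, based only
    on the versions mentioned in this report.
    """
    # Compare the first three numeric components as a tuple, e.g. (4, 3, 4).
    parts = tuple(int(p) for p in version.split(".")[:3])
    return parts < fixed_in
```

Inside an image, the input would be the string printed by zmq.zmq_version(), for example libzmq_predates_fix(zmq.zmq_version()).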
Steps to Reproduce:
import subprocess


def test_zmq_in_container(name, image, should_fail):
    # Reconstructed: the start of this function is missing from the report.
    print(f"{name}: {image}")

    # Script executed inside the container via `python3 -c`.
    test_script = '''
import zmq
import socket as sock
import platform

print("=== Environment Info ===")
print("ZMQ version:", zmq.zmq_version())
print("PyZMQ version:", zmq.pyzmq_version())
print("Platform:", platform.platform())
print("Python version:", platform.python_version())

# Check hostname resolution
try:
    localhost_ip = sock.gethostbyname("localhost")
    print("localhost resolves to:", localhost_ip)
except Exception as e:
    print("localhost resolution failed:", str(e))

print("\\n=== ZMQ Binding Tests ===")
ctx = zmq.Context()

# Test localhost binding
socket = ctx.socket(zmq.REP)
try:
    socket.bind("tcp://localhost:0")
    port = socket.getsockopt(zmq.LAST_ENDPOINT).decode()
    print("✅ localhost bind succeeded on", port)
    localhost_success = True
except Exception as e:
    print("❌ localhost bind failed -", str(e))
    localhost_success = False
socket.close()

# Test IP binding
socket2 = ctx.socket(zmq.REP)
try:
    socket2.bind("tcp://127.0.0.1:0")
    port2 = socket2.getsockopt(zmq.LAST_ENDPOINT).decode()
    print("✅ IP bind succeeded on", port2)
except Exception as e:
    print("❌ IP bind failed -", str(e))
socket2.close()

# Test wildcard binding
socket3 = ctx.socket(zmq.REP)
try:
    socket3.bind("tcp://*:0")
    port3 = socket3.getsockopt(zmq.LAST_ENDPOINT).decode()
    print("✅ Wildcard bind succeeded on", port3)
except Exception as e:
    print("❌ Wildcard bind failed -", str(e))
socket3.close()

ctx.term()
'''

    cmd = ['podman', 'run', '--rm', '--entrypoint=', image, 'python3', '-c', test_script]

    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
        print(result.stdout)
        if result.stderr:
            print("STDERR:", result.stderr)
    except subprocess.TimeoutExpired:
        print("❌ Container test timed out")
    except Exception as e:
        print(f"❌ Error running container: {e}")

    print()


def main():
    print("Testing ZMQ hostname binding across images...")
    print()

    images = [
        ("AIPCC Image (Expected to FAIL on localhost)", "quay.io/aipcc/rhaiis/cuda-ubi9:3.2.2-1757699034", True),
        ("WSEATON Image (Expected to SUCCEED on localhost)", "quay.io/wseaton/vllm:llmdnixlfix-01", False),
        ("LLM-D Image (Expected to SUCCEED on localhost)", "ghcr.io/llm-d/llm-d-dev:sha-b3f0b0d", False),
    ]

    for name, image, should_fail in images:
        test_zmq_in_container(name, image, should_fail)


if __name__ == "__main__":
    main()
Actual results:
The tcp://localhost:0 bind fails on the AIPCC image.
Expected results:
localhost bind succeeds
Additional info:
- blocks: AIPCC-3181 Support for llm-d (Closed)
- links to: RHBA-2025:154563, Update ZeroMQ to 4.3.5