InferX — Serverless GPU Inference Platform for Production Workloads

Snapshot ID:       public/ActionAnalytics/CR-70B/54#computeinstance-e00r2jrqynf83a8b4f
Nodename:          computeinstance-e00r2jrqynf83a8b4f
State:             Ready
Pageable (MB):     5464.0
Pinned (MB):       4.328125
Docker Image Name: vllm/vllm-openai:v0.9.0
Build ID:          [129, 57, 99, 194, 104, 242, 237, 165, 233, 123, 11, 232, 127, 111, 105, 112, 15, 123, 205, 59]

GPU memory:
  ID  Memory Size (MB)
  0   71202
  1   71202
  2   71202
  3   71202
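To make the table's shape concrete, here is a minimal sketch of how such a snapshot record could be modeled in code. This is illustrative only: the class and field names (`SnapshotRecord`, `GpuMemory`, `pageable_mb`, etc.) are assumptions chosen to mirror the columns above, not InferX's actual data model or API.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class GpuMemory:
    # One row of the per-GPU memory table (ID, Memory Size (MB)).
    gpu_id: int
    memory_size_mb: int

@dataclass
class SnapshotRecord:
    # Hypothetical record mirroring the snapshot table's columns.
    snapshot_id: str
    nodename: str
    state: str
    gpus: List[GpuMemory]
    pageable_mb: float
    pinned_mb: float
    docker_image: str
    build_id: List[int]

snapshot = SnapshotRecord(
    snapshot_id="public/ActionAnalytics/CR-70B/54#computeinstance-e00r2jrqynf83a8b4f",
    nodename="computeinstance-e00r2jrqynf83a8b4f",
    state="Ready",
    gpus=[GpuMemory(i, 71202) for i in range(4)],  # four GPUs, 71202 MB each
    pageable_mb=5464.0,
    pinned_mb=4.328125,
    docker_image="vllm/vllm-openai:v0.9.0",
    build_id=[129, 57, 99, 194, 104, 242, 237, 165, 233, 123,
              11, 232, 127, 111, 105, 112, 15, 123, 205, 59],
)

# Total GPU memory across all four devices, in MB.
total_gpu_mb = sum(g.memory_size_mb for g in snapshot.gpus)
print(total_gpu_mb)  # 4 * 71202 = 284808
```

A structured record like this makes it straightforward to, for example, check that a snapshot is in the `Ready` state before routing inference traffic to it.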