InferX — Serverless GPU Inference Platform for Production Workloads

Model IntelliAsk-Qwen3-32B-450-Merged

Namespace | Model Name | Type | Standby GPU | Standby Pageable | Standby Pinned Memory | GPU Count | vRam (MB) | CPU | Memory (MB) | State | Revision
Qwen | IntelliAsk-Qwen3-32B-450-Merged | text2text | Mem | File | File | 2 | 58000 | 12.0 | 80000 | Normal | 76


Sample REST Call
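The page does not reproduce the sample call itself. A minimal sketch in Python, assuming a hypothetical gateway host and an OpenAI-style JSON body — the URL path, port, and field names below are assumptions, not a documented InferX API:

```python
import json
import urllib.request

# Hypothetical endpoint; substitute the real InferX gateway address.
URL = "http://inferx-gateway.local/v1/completions"

# Request body built from the model table above; field names are assumed.
payload = {
    "namespace": "Qwen",
    "model": "IntelliAsk-Qwen3-32B-450-Merged",
    "revision": 76,
    "prompt": "Hello",
    "max_tokens": 64,
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment once a real endpoint is available
```

Because the pod is in Standby, the first call after idle would be expected to pay a cold-start cost while the snapshot is restored to GPU memory.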

Pods

Tenant: public
Namespace: Qwen
Pod Name: public/Qwen/IntelliAsk-Qwen3-32B-450-Merged/76/252
State: Standby
Node Name: computeinstance-e00r2jrqynf83a8b4f

Required Resource
  CPU: 12000
  Mem: 80000
  CacheMem: 0
  GPU Type: Any
  GPU Count: 2
  GPU vRam: 58000
  GPU Contexts: 0

Allocated Resource
  CPU: 0
  Memory: 0
  Cache Memory: 0

GPU
  GPU Type: NVIDIA H100 80GB HBM3
  vRam: 0
  Slot Size: 268435456
  Total Slot Count: 285
  Max Context Per GPU: 1
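The GPU fields above imply a simple slot-based vRAM accounting: the slot size of 268435456 bytes is 256 MiB, and 285 slots cover 71.25 GiB per H100. A sketch of that arithmetic, assuming the model table's 58000 MB vRam figure is counted in MiB (the unit convention is not stated on this page):

```python
import math

SLOT_SIZE = 268_435_456       # bytes per slot, from the pod's GPU section
TOTAL_SLOTS = 285             # slots reported per H100

slot_mib = SLOT_SIZE // 2**20                # 256 MiB per slot
total_gib = TOTAL_SLOTS * SLOT_SIZE / 2**30  # 71.25 GiB addressable via slots

# Assumed interpretation: the model's vRam requirement, treated as MiB.
model_vram_mib = 58_000
slots_needed = math.ceil(model_vram_mib / slot_mib)

print(slot_mib, total_gib, slots_needed)  # 256 71.25 227
```

Under that assumption the model would consume 227 of 285 slots on each of its two GPUs, leaving headroom for cache or additional contexts.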

Logs

tenant | namespace | model name | revision | id | node name | create time | exit info | state
public | Qwen | IntelliAsk-Qwen3-32B-450-Merged | 76 | 86 | computeinstance-e00r2jrqynf83a8b4f | 2026-03-01 16:09:29 | None | log

Snapshot History

tenant | namespace | model name | revision | nodename | state | detail | updatetime
public | Qwen | IntelliAsk-Qwen3-32B-450-Merged | 76 | computeinstance-e00r2jrqynf83a8b4f | Waiting | Resource is busy | 2026-03-01 15:41:56
public | Qwen | IntelliAsk-Qwen3-32B-450-Merged | 76 | computeinstance-e00r2jrqynf83a8b4f | Done | Done | 2026-03-01 16:09:29
public | Qwen | IntelliAsk-Qwen3-32B-450-Merged | 76 | computeinstance-e00r2jrqynf83a8b4f | Scheduled | Scheduled | 2026-03-01 16:23:44
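The history shows the revision waiting on a busy node before the snapshot completed; the gap between the Waiting and Done entries gives the time spent, which is quick to check with Python's datetime:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
waiting = datetime.strptime("2026-03-01 15:41:56", FMT)  # Waiting entry
done = datetime.strptime("2026-03-01 16:09:29", FMT)     # Done entry

# Elapsed time between the two snapshot-history rows.
elapsed = done - waiting
print(elapsed)  # 0:27:33
```

Note that the Done timestamp matches the log entry's create time above, consistent with the snapshot finishing when that pod instance was recorded.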

Model Spec


Policy