InferX — Serverless GPU Inference Platform for Production Workloads

Model: Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated

Namespace: Trial
Model Name: Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated
Type: text2text
Standby GPU: Mem
Standby Pageable: File
Standby Pinned Memory: File
GPU Count: 4
vRam (MB): 45000
CPU: 12.0
Memory (MB): 100000
State: Normal
Revision: 255


Prompt



Sample REST Call
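This deployment's sample call was not captured on the page. Below is a minimal sketch of what a text2text inference request might look like; the gateway host, endpoint path, and payload field names (prompt, max_tokens) are placeholders, not InferX's documented API, so substitute the values shown in your deployment's Sample REST Call section.

```python
# Hypothetical REST call to the deployed text2text model. The URL and
# JSON field names are assumptions, not InferX's documented interface.
import json
import urllib.request

INFERX_URL = "http://<inferx-gateway>/<text2text-endpoint>"  # placeholder

payload = {
    "prompt": "Summarize slot-based GPU memory allocation.",  # assumed field
    "max_tokens": 256,                                        # assumed field
}

req = urllib.request.Request(
    INFERX_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))
```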

Pods

Tenant: public
Namespace: Trial
Pod Name: public/Trial/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated/255/276
State: Snapshotting

Required Resource:
  CPU: 12000
  Mem: 100000
  CacheMem: 0
  GPU Type: Any
  GPU Count: 4
  GPU vRam: 45000
  GPU Contexts: 0

Allocated Resource:
  Node Name: computeinstance-e00r2jrqynf83a8b4f
  CPU: 12000
  Memory: 180000
  Cache Memory: 0

GPU:
  Type: NVIDIA H100 80GB HBM3
  vRam: 45056
  Slot Size: 268435456
  Total Slot Count: 285
  Max Context Per GPU: 1
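The allocated vRam (45056 MB) is slightly larger than the requested 45000 MB. The Slot Size and Total Slot Count fields suggest why: GPU memory appears to be allocated in fixed 256 MiB slots (268435456 bytes), so a request is rounded up to a whole number of slots. A quick check of that arithmetic in Python:

```python
# Slot-based vRAM rounding, inferred from the pod's GPU detail above.
SLOT_SIZE_BYTES = 268_435_456                      # Slot Size field
SLOT_SIZE_MB = SLOT_SIZE_BYTES // (1024 * 1024)    # 256 MiB per slot

requested_mb = 45_000                              # Required Resource: GPU vRam
slots = -(-requested_mb // SLOT_SIZE_MB)           # ceiling division -> 176 slots
allocated_mb = slots * SLOT_SIZE_MB                # 176 * 256 = 45056

print(slots, allocated_mb)                         # 176 45056, matching the pod
```

By the same arithmetic, the node's Total Slot Count of 285 would correspond to 285 × 256 MiB = 72960 MB of allocatable vRAM per H100, leaving some headroom below the card's nominal 80 GB.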

Logs

Tenant: public
Namespace: Trial
Model Name: Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated
Revision: 255
ID: 258
Node Name: computeinstance-e00r2jrqynf83a8b4f
Create Time: 2026-03-01 17:18:43
Exit Info: Error("DockerContainerWaitError { error: \"\", code: 139 }")
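The container's exit code 139 follows the shell convention of 128 + signal number: 139 - 128 = 11, meaning the model container was killed by SIGSEGV (a segmentation fault) rather than exiting cleanly. A minimal sketch of decoding such codes:

```python
import signal

def decode_exit_code(code: int) -> str:
    """Interpret a Docker-style container exit code.

    Codes above 128 follow the shell convention 128 + signal number,
    so 139 -> 128 + 11 -> SIGSEGV (segmentation fault).
    """
    if code > 128:
        sig = signal.Signals(code - 128)
        return f"killed by {sig.name} (signal {sig.value})"
    return f"exited with status {code}"

print(decode_exit_code(139))  # killed by SIGSEGV (signal 11)
```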

Snapshot History

Tenant: public
Namespace: Trial
Model Name: Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated
Revision: 255
Node Name: computeinstance-e00r2jrqynf83a8b4f

State       Detail            Update Time
Waiting     Resource is busy  2026-03-01 16:41:44
Scheduled   Scheduled         2026-03-01 16:41:47

Model Spec


Policy