GPU · dedicated cards
Dedicated RTX Pro Blackwell cards for GPU workloads.
Every GPU instance gets a whole NVIDIA RTX Pro Blackwell card, not a time-slice of one. The VM underneath is NVMe-backed, the meter is hourly, and access is reviewed so capacity goes to real workloads.
GPU access starts with an email to support@excloud.dev — details below.
nv2a.xlarge · RTX 4500 Pro Blackwell · 32 GiB VRAM
- one hour: ₹44.554
- a working day (8 h): ₹356.43
- around the clock (24 h): ₹1,069.30
The cards
Pick the card that fits your model's VRAM budget.
All three are current-generation RTX Pro Blackwell cards. The docs list the same supported workloads across the range: LLM inference, fine-tuning, and professional visualization.
| Card | Instance | vCPU | RAM | VRAM | On-demand |
|---|---|---|---|---|---|
| RTX 4500 Pro Blackwell | nv2a.xlarge | 4 | 16 GiB | 32 GiB | ₹44.554/hr |
| RTX 5000 Pro Blackwell | nv3a.2xlarge | 8 | 32 GiB | 48 GiB | ₹63.849/hr |
| RTX 6000 Pro Blackwell | nv1a.4xlarge | 16 | 64 GiB | 96 GiB | ₹126.784/hr |
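As a rough illustration of picking by VRAM, the table can be turned into a small lookup that returns the cheapest card with enough memory. This is only a sketch: the `CARDS` list copies the table values, while `Card`, `pick_card`, and `vram_needed_gib` are hypothetical names used for illustration, not part of any Excloud tooling.

```python
from typing import NamedTuple, Optional


class Card(NamedTuple):
    name: str
    instance: str
    vram_gib: int
    inr_per_hour: float


# Values copied from the card table.
CARDS = [
    Card("RTX 4500 Pro Blackwell", "nv2a.xlarge", 32, 44.554),
    Card("RTX 5000 Pro Blackwell", "nv3a.2xlarge", 48, 63.849),
    Card("RTX 6000 Pro Blackwell", "nv1a.4xlarge", 96, 126.784),
]


def pick_card(vram_needed_gib: float) -> Optional[Card]:
    """Return the cheapest card whose VRAM covers the requirement, or None."""
    fitting = [c for c in CARDS if c.vram_gib >= vram_needed_gib]
    return min(fitting, key=lambda c: c.inr_per_hour) if fitting else None


choice = pick_card(40)  # hypothetical workload that needs about 40 GiB
print(choice.instance if choice else "no single card fits")
```

A requirement above 96 GiB returns `None`, since the sketch only considers one card per instance.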
Disks use the same EBS NVMe volume model as compute, at ₹4/GB·mo for NVMe block storage. Egress is a flat ₹1/GiB and ingress is free. Detailed units live in the compute billing docs.
The math
The bill is just hours times the hourly rate.
Hourly billing means you can work out the cost of a GPU job before submitting it. An eight-hour fine-tune on an nv2a.xlarge is 8 × ₹44.554, which comes to ₹356.43. If the run finishes early, so does the bill.
The 96 GiB card follows the same logic. A full day on an nv1a.4xlarge is 24 × ₹126.784 = ₹3,042.82, plus disk storage and any egress.
- 8 h fine-tune · nv2a.xlarge: 8 × ₹44.554 = ₹356.43
- 24 h inference · nv1a.4xlarge: 24 × ₹126.784 = ₹3,042.82
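The same arithmetic can be scripted before a job is submitted. The sketch below uses the hourly rates and the ₹1/GiB egress rate quoted on this page; the function names are hypothetical, rounding to two decimals is an assumption based on how the totals are displayed here, and disk cost is left out because the proration of the monthly storage rate is not stated.

```python
from decimal import Decimal, ROUND_HALF_UP

# On-demand hourly rates in INR, copied from the card table.
HOURLY_RATE_INR = {
    "nv2a.xlarge": Decimal("44.554"),
    "nv3a.2xlarge": Decimal("63.849"),
    "nv1a.4xlarge": Decimal("126.784"),
}
EGRESS_INR_PER_GIB = Decimal("1")


def job_cost_inr(instance: str, hours, egress_gib=0) -> Decimal:
    """GPU hours times the hourly rate, plus egress; disk is not included."""
    gpu = HOURLY_RATE_INR[instance] * Decimal(str(hours))
    egress = EGRESS_INR_PER_GIB * Decimal(str(egress_gib))
    # Assumption: totals are rounded half-up to two decimals.
    return (gpu + egress).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(job_cost_inr("nv2a.xlarge", 8))    # 8 h fine-tune
print(job_cost_inr("nv1a.4xlarge", 24))  # 24 h inference
```

`Decimal` is used instead of floats so the three-decimal rates multiply exactly before the final rounding.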
Access
New accounts start at zero GPU quota; we raise it on request.
Every GPU here is a physical card in our racks in Mumbai. We'd rather allocate them to people who have a job to run than let a signup script hold them idle. Write to us, tell us roughly what you're running and which card you want, and a human reads it and replies.
What people run
Inference, visualization, or just tokens: there is an option for each.
Self-host with Ollama
The 32 and 48 GiB cards are a comfortable fit for open-weight models served through Ollama; the 96 GiB card takes the larger quantizations. The docs walk through the whole setup on an Excloud VM.
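To get a feel for which models fit which card, a back-of-the-envelope estimate of weight size can help. The sketch below is an assumption-heavy approximation, not a figure from the docs: it counts only the model weights at a nominal number of bits per weight, and the 1.2 `headroom` factor for KV cache and runtime overhead is a guess that should be tuned for real workloads. The function names are hypothetical.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, headroom: float = 1.2) -> bool:
    """True if the weights, padded by an assumed headroom factor, fit in VRAM."""
    return weights_gib(params_billion, bits_per_weight) * headroom <= vram_gib


# Hypothetical examples: a 27B model at 4 bits and a 70B model at 8 bits.
for params, bits in [(27, 4), (70, 8)]:
    for vram in (32, 48, 96):
        print(f"{params}B @ {bits}-bit on {vram} GiB: {fits(params, bits, vram)}")
```

Real quantization formats often use slightly more than their nominal bit width, and long contexts grow the KV cache, so treat a borderline result as a likely no.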
Visualization and rendering
The same cards the docs list for inference also carry professional visualization work. A dedicated card means your viewport isn't sharing the GPU with a stranger's training job.
Skip the card entirely
If you only need tokens, our hosted Qwen3.6-27B costs ₹20 per million input tokens and ₹60 per million output tokens. No quota request, no idle hours.
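Token pricing works out the same way as the hourly math. A minimal sketch, using the two per-million rates quoted above; the function name and the example token counts are hypothetical.

```python
from decimal import Decimal

# Hosted model rates in INR per million tokens, as quoted above.
INPUT_INR_PER_MILLION = Decimal("20")
OUTPUT_INR_PER_MILLION = Decimal("60")


def token_cost_inr(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in INR for a given number of input and output tokens."""
    million = Decimal(1_000_000)
    return (Decimal(input_tokens) * INPUT_INR_PER_MILLION
            + Decimal(output_tokens) * OUTPUT_INR_PER_MILLION) / million


# Hypothetical workload: 2 million input tokens, 500 thousand output tokens.
print(token_cost_inr(2_000_000, 500_000))
```

Comparing this figure with the hourly rate of a card gives a rough sense of when a dedicated instance starts to pay off, though that depends on throughput this page does not state.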
Get started
Request quota for a dedicated GPU instance.
Email support@excloud.dev with the card you need and the workload you plan to run. Once quota is approved, launch from the console or CLI.