One command to go from bare hardware to a fully working local AI API and management dashboard. No cloud required. No API keys. No data leaving your network.
Three steps from bare hardware to a working AI API. No PhD required.
Run the one-line installer. WarpHost detects Docker, checks for NVIDIA GPUs, and sets up everything automatically.
WarpHost scans your hardware — GPU model, VRAM, CPU, RAM — and recommends the best models for your setup.
Pull a model and start serving. You get an OpenAI-compatible API and a management dashboard instantly.
Drop-in replacement for OpenAI's API. Point any client at localhost:8811/v1 and it just works.
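Any OpenAI-style client or a plain HTTP POST will do. Here is a minimal sketch using only the Python standard library; the model name is a placeholder, so substitute one you have actually pulled in WarpHost.

```python
import json
import urllib.request

# WarpHost's OpenAI-compatible endpoint on the local machine.
BASE_URL = "http://localhost:8811/v1"

# "llama3.1:8b" is a hypothetical model identifier; use one from your catalog.
payload = {
    "model": "llama3.1:8b",
    "messages": [{"role": "user", "content": "Say hello in one word."}],
}

# Build a POST request in the standard chat-completions shape.
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# Uncomment with WarpHost running to send the request:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```

Existing tooling that takes a `base_url` (the official OpenAI SDKs, LangChain, and similar) can be pointed at the same address with any dummy API key, since no key is checked locally.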
Automatically detects your NVIDIA GPU, VRAM, and system specs. Recommends the best models for your hardware.
Clean web UI to monitor your system, manage models, and test with a built-in chat playground.
Browse a curated catalog, pull models with one click, switch between them instantly.
Runs in Docker with the NVIDIA Container Toolkit for GPU passthrough. Clean, isolated, and easy to update.
No data leaves your network. No API keys. No cloud dependency. Your hardware, your models, your data.
18 curated models from 3B to 70B. From laptop-friendly to datacenter-grade.
Meta's edge model. Runs on any GPU or CPU-only. Great starting point.
Punches way above its weight. Thinking modes, coding, multilingual.
Best all-rounder at 8B. The sweet spot for most hardware.
Google's standout. 128K context and 140+ languages.
Outperforms OpenAI o1-mini. Best reasoning you can run locally.
Top-ranked open-source model. Rivals models 10x its size.
Matches GPT-4o on coding benchmarks. State of the art for code.
Meta's best. Rivals Llama 3.1 405B. Top-tier local quality.
WarpHost is free, open source, and ready to run on your hardware today.