
Setting Up vLLM with Open-WebUI Using Docker Compose
Overview
Leverage vLLM and Open-WebUI with Docker Compose for a streamlined, containerized deployment. This approach simplifies setup, ensures reproducibility, and offers easy scalability.
Why Use Docker Compose?
✅ Simple Setup: Manage everything with one command.
✅ Reproducibility: Consistent environments across deployments.
✅ Isolation: Separate services in containers.
✅ Scalability: Add or remove services easily.
✅ Easy Maintenance: Restart, update, or remove containers effortlessly.
1. Prerequisites
- Docker: Install Docker Engine or Docker Desktop for your platform.
- Docker Compose: Bundled with Docker Desktop, or install the Compose plugin separately.
- NVIDIA Container Toolkit: Required for GPU acceleration; follow NVIDIA's installation guide (you can verify GPU access as shown below).
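If you plan to run on a GPU, it is worth confirming that containers can actually see it before continuing. The CUDA base image tag below is just an example; use any tag that matches your driver:
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
If nvidia-smi prints your GPU details, the NVIDIA Container Toolkit is working.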
2. Docker Compose Setup
Step 1: Create Project Directory
mkdir vllm-openwebui
cd vllm-openwebui
Step 2: Create Docker Compose File
Create a file named docker-compose.yml:
version: '3.8'

services:
  vllm:
    image: vllm/vllm-openai:latest
    # The image's entrypoint already launches the OpenAI-compatible API server,
    # so only the server arguments are passed here.
    command: ["--model", "meta-llama/Llama-2-7b-chat-hf"]
    ports:
      - "8000:8000"
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    ipc: host   # recommended by the vLLM docs for PyTorch shared memory
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: always

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      # Open-WebUI connects to any OpenAI-compatible backend via these variables.
      - OPENAI_API_BASE_URL=http://vllm:8000/v1
      - OPENAI_API_KEY=dummy-key   # vLLM ignores the key unless started with --api-key
    ports:
      - "3000:8080"
    depends_on:
      - vllm
    restart: always
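Note that meta-llama/Llama-2-7b-chat-hf is a gated model on Hugging Face, so the vllm container needs a token that has been granted access to the repo. One approach, assuming the token is exported in your host shell, is to add it to the vllm service's existing environment list:
      - HUGGING_FACE_HUB_TOKEN=${HUGGING_FACE_HUB_TOKEN}
Alternatively, point --model at an ungated model first to verify the stack works.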
Step 3: Launch the Services
docker-compose up -d
- Open-WebUI will be available at: http://localhost:3000
- vLLM API will be available at: http://localhost:8000/v1
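Once the model has finished loading (the first start can take a while as weights download), you can sanity-check the API from the host; the model name and prompt below are just examples:
curl http://localhost:8000/v1/models
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-2-7b-chat-hf", "messages": [{"role": "user", "content": "Hello!"}]}'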
3. Secure and Scale
Secure with NGINX (Optional)
- Add a reverse proxy with HTTPS using NGINX and Let’s Encrypt.
- Enable basic authentication for the WebUI and API endpoints (a minimal NGINX service sketch follows this list).
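Here is a minimal sketch of an nginx service you could add to the services section of the same docker-compose.yml, assuming you supply your own nginx.conf and certificates (the file and directory names below are placeholders):
  nginx:
    image: nginx:latest
    ports:
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro   # your reverse-proxy configuration
      - ./certs:/etc/nginx/certs:ro             # TLS certificates, e.g. from Let's Encrypt
    depends_on:
      - open-webui
    restart: always
With a proxy in front, you would typically stop publishing ports 3000 and 8000 on the host and let NGINX terminate TLS for both services.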
Scale with Multiple GPUs
Modify the vllm command in docker-compose.yml:
command: ["--model", "meta-llama/Llama-2-7b-chat-hf", "--tensor-parallel-size", "2"]
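Tensor parallelism requires the container to actually see two GPUs. If you kept the deploy block from the earlier compose file, make sure the device reservation requests at least that many (the count shown here is illustrative):
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 2
              capabilities: [gpu]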
Add Persistent Volumes
volumes:
  vllm-models:
  openwebui-data:

services:
  vllm:
    volumes:
      - vllm-models:/root/.cache/huggingface
  open-webui:
    volumes:
      - openwebui-data:/app/backend/data   # Open-WebUI keeps its data under /app/backend/data
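With these volumes in place, downloaded model weights and WebUI data survive container recreation and updates. You can confirm they were created with:
docker volume ls
Compose typically prefixes the volume names with the project directory name, e.g. vllm-openwebui_vllm-models.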
4. Maintenance Commands
- Check Logs:
docker-compose logs -f
- Stop Services:
docker-compose down
- Update Containers:
docker-compose pull
docker-compose up -d
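- Remove Services and Volumes:
docker-compose down -v
The -v flag also deletes the named volumes defined above, including any cached model weights, so use it deliberately.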
5. Conclusion
Using Docker Compose for vLLM and Open-WebUI provides simplicity, scalability, and easy maintenance. You can quickly deploy your LLM interface with minimal overhead while retaining full control and privacy.
📩 Need Help Setting Up?
- Full Docker Compose Configurations & Optimizations
- Secure, Scalable Deployments (on-premises or cloud)
- Integrations with Slack, Notion, or other tools
- Ongoing Support and Performance Tuning
Contact Me – Let’s make your LLM solution a reality!