# Docker Compose
MemGhost uses Docker Compose for production deployment. The compose file defines seven services across multiple profiles, allowing you to start with the core platform and add AI and voice features as needed.
## Services
### db (PostgreSQL + pgvector)

```yaml
db:
  image: pgvector/pgvector:pg15
  restart: unless-stopped
```

- Always starts (no profile required)
- Uses the pgvector image for vector search support (AI embeddings)
- Health check: `pg_isready` every 5 seconds
- Data persists in the `postgres-data` Docker volume
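The health check above can be sketched in compose syntax like this (the exact stanza in MemGhost's compose file may differ; the `-U postgres` user is an assumption):

```yaml
db:
  image: pgvector/pgvector:pg15
  restart: unless-stopped
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U postgres"]  # user flag is an assumption
    interval: 5s      # matches the documented 5-second cadence
    timeout: 5s
    retries: 5
  volumes:
    - postgres-data:/var/lib/postgresql/data
```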
### migrate (Database Migrations)
- Uses the same image as the API
- Runs once on startup, then exits
- Applies all pending SQL migrations
- The API waits for this service to complete before starting
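This run-once-then-gate ordering is the standard Compose `service_completed_successfully` pattern; a sketch (image tag and command are illustrative, not MemGhost's actual values):

```yaml
migrate:
  image: memghost/api:latest           # hypothetical tag; same image as the api service
  command: ["/app/server", "migrate"]  # illustrative migration entrypoint
  depends_on:
    db:
      condition: service_healthy

api:
  depends_on:
    migrate:
      condition: service_completed_successfully  # api starts only after migrate exits 0
```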
### api (Go Backend)

- Production image with the compiled Go binary
- Waits for migrations to complete before starting
- Serves the REST API, SSE events, TTS/STT proxy, MCP endpoints, and theme assets on port `8080`
### web (Next.js Frontend)

- Production image with the pre-built Next.js standalone output
- Serves the web UI on port `3000`
- In production, all API routing goes through Caddy (no CORS needed)
### caddy (Reverse Proxy)

- Only starts with the `standalone` profile
- Routes requests to the API and web services
- Auto-provisions HTTPS certificates for public domains via Let's Encrypt
- Configurable via the `Caddyfile`
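A minimal Caddyfile matching this routing could look like the following sketch (the domain and path prefix are assumptions, not MemGhost's shipped config):

```
example.com {
    # API traffic to the Go backend (path prefix is an assumption)
    handle /api/* {
        reverse_proxy api:8080
    }

    # everything else to the Next.js frontend
    handle {
        reverse_proxy web:3000
    }
}
```

Because the site address is a public hostname, Caddy provisions a Let's Encrypt certificate for it automatically.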
### ollama (LLM Inference)

- Only starts with the `ai` profile
- Provides local LLM inference for chat and embeddings
- Models are stored in the `ollama-data` volume and persist across restarts
- Keeps up to 2 models loaded in memory simultaneously
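The two-model limit maps to Ollama's `OLLAMA_MAX_LOADED_MODELS` setting, presumably wired into the service environment roughly like this:

```yaml
ollama:
  image: docker.io/ollama/ollama:latest
  environment:
    - OLLAMA_MAX_LOADED_MODELS=2   # keep up to 2 models resident in memory
  volumes:
    - ollama-data:/root/.ollama    # default model storage path in the ollama image
```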
### kokoro (Text-to-Speech)

- Only starts with the `voice` profile
- Kokoro-FastAPI: a lightweight 82M-parameter TTS model
- Supports 67+ voices across multiple languages
- Runs on CPU with typical latency under 1 second per sentence
### whisper (Speech-to-Text)

- Only starts with the `voice` profile
- Whisper.cpp server for fast audio transcription
- Runs the `whisper-large-v3-turbo` model on CPU
## Profiles

| Profile | Services Started | Use Case |
|---|---|---|
| (none) | db, migrate, api, web | Core platform (needs external proxy) |
| `standalone` | + caddy | Built-in reverse proxy with HTTPS |
| `ai` | + ollama | AI chat agents and semantic search |
| `voice` | + kokoro, whisper | Text-to-speech and speech-to-text |
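Inside the compose file, a service joins a profile via the `profiles` key; services without the key (db, migrate, api, web) always start. A sketch of how the optional services are likely gated:

```yaml
caddy:
  profiles: ["standalone"]

ollama:
  profiles: ["ai"]

kokoro:
  profiles: ["voice"]

whisper:
  profiles: ["voice"]
```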
Profiles can be combined:

```sh
# Core with reverse proxy
docker compose --profile standalone up -d

# Core + AI
docker compose --profile standalone --profile ai up -d

# Everything
docker compose --profile standalone --profile ai --profile voice up -d
```

## Volumes
| Volume | Purpose |
|---|---|
| `postgres-data` | PostgreSQL data persistence |
| `import-data` | Catalog import data files |
| `caddy-data` | Caddy TLS certificates and state |
| `caddy-config` | Caddy configuration cache |
| `ollama-data` | Ollama model files (~5 GB with default models) |
| `kokoro-data` | Kokoro TTS voice data |
| `whisper-models` | Whisper STT model files |
| `tts-audio` | Synthesized TTS audio output |
## Networking
Docker Compose automatically creates a bridge network for the stack. Services communicate by hostname:
| From | To | Address |
|---|---|---|
| API | Database | `db:5432` |
| API | Ollama | `ollama:11434` |
| API | Kokoro TTS | `kokoro:8880` |
| API | Whisper STT | `whisper:8178` |
| Caddy | API | `api:8080` |
| Caddy | Web | `web:3000` |
External access goes through Caddy on ports 80/443 (or your own reverse proxy).
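In practice this means the API's connection settings reference service hostnames rather than `localhost`; for example, a database URL might look like this (variable name and credentials are illustrative, not MemGhost's actual settings):

```sh
DATABASE_URL=postgres://memghost:changeme@db:5432/memghost?sslmode=disable
```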
## Customizing

### Change Ports
Edit your `.env` file:

```sh
PORT=8080
HTTPS_PORT=8443
```

### Use a Different Chat Model
```sh
# Pull the model
docker compose exec ollama ollama pull llama3.1:8b

# Update .env
AI_LLM_MODEL=llama3.1:8b

# Restart the API
docker compose restart api
```

### External Ollama
If Ollama runs on a separate machine, skip the `ai` profile and set the URLs directly in `.env`:

```sh
AI_ENABLED=true
AI_EMBEDDING_BASE_URL=http://192.168.1.100:11434
AI_LLM_BASE_URL=http://192.168.1.100:11434
```

### GPU Support for Ollama
Add a GPU reservation to the ollama service in your compose file:

```yaml
ollama:
  image: docker.io/ollama/ollama:latest
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: 1
            capabilities: [gpu]
```