System Overview
MemGhost is a self-hosted personal knowledge management platform. It captures raw items (notes, bookmarks, clippings, webhooks) into a vault, classifies them, routes them to topical hub pages maintained by an AI pipeline, and lets you chat with your knowledge through AI agents.
Architecture Philosophy
Modular Monolith with Event Sourcing
MemGhost uses a modular monolith architecture with event sourcing and CQRS patterns:
- Single Go Binary — easy deployment and operation
- PostgreSQL — event store, read models, and vector search (pgvector) in one database
- Event Bus — in-process event routing with SSE bridge for real-time UI updates
- Domain Modules — clear boundaries with enforced separation
- AI Pipeline — Ollama for embeddings and chat, Kokoro for TTS
Why This Architecture?
Operational Simplicity
- Single deployment unit (one Go binary)
- One database to manage (PostgreSQL)
- Minimal resource footprint for self-hosting
- Easy local development
Event Sourcing Benefits
- Complete audit log of all state changes
- Time-travel debugging (replay events to any point)
- Multiple read models from the same event stream (notes view, hub pages, search index)
- Vault as single source of truth with all views as projections
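The replay idea behind these benefits can be sketched in a few lines of Go. The `Event` and `ItemView` structs below are illustrative assumptions, not MemGhost's actual schema; only the event type names (`item.created.v1`, etc.) come from this document.

```go
package main

import "fmt"

// Event is a minimal event envelope (a sketch, not the real schema).
type Event struct {
	Type string
	Data map[string]string
}

// ItemView is one possible read model projected from the event stream.
type ItemView struct {
	ID     string
	Kind   string
	Status string
}

// Replay folds an ordered event stream into the current view. Rebuilding a
// projection is just replaying from the start; time-travel debugging is
// replaying a prefix of the log.
func Replay(events []Event) ItemView {
	var v ItemView
	for _, e := range events {
		switch e.Type {
		case "item.created.v1":
			v.ID = e.Data["id"]
			v.Status = "captured"
		case "item.classified.v1":
			v.Kind = e.Data["kind"]
			v.Status = "classified"
		case "item.routed.v1":
			v.Status = "routed"
		}
	}
	return v
}

func main() {
	log := []Event{
		{Type: "item.created.v1", Data: map[string]string{"id": "a1"}},
		{Type: "item.classified.v1", Data: map[string]string{"kind": "note"}},
		{Type: "item.routed.v1"},
	}
	// The same log can feed multiple projections: notes view, hub pages, search.
	fmt.Println(Replay(log))
}
```

Because the log is append-only, a new read model added later can be populated by replaying the full history, with no data migration.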
Future Flexibility
- Modules can be extracted to independent services later
- Clear boundaries enforced by architecture tests
- New item types and integrations plug into the same pipeline
System Components
Backend (Go)
The backend is a single Go binary that includes:
- HTTP Server — OpenAPI 3.0 generated handlers, JWT validation, authorization middleware
- Domain Modules — Vault, Hubs, Spaces, Note, Agent, TTS, Auth, Setup
- Event Bus — in-process event routing with SSE bridge
- Event Store — PostgreSQL-based event storage
- Read Models — optimized projections for queries
- AI Services — embedding generation, vector search, LLM routing, MCP tool servers
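The in-process event bus pattern can be illustrated with a minimal pub/sub sketch. The `Bus` type and its methods are assumptions for illustration, not MemGhost's actual API; the point is that an SSE bridge is just another subscriber.

```go
package main

import (
	"fmt"
	"sync"
)

// Bus is a minimal in-process pub/sub bus (a sketch of the pattern only).
type Bus struct {
	mu   sync.RWMutex
	subs map[string][]func(payload string)
}

func NewBus() *Bus {
	return &Bus{subs: make(map[string][]func(string))}
}

// Subscribe registers a handler for one event type. An SSE bridge would be
// just another subscriber that forwards payloads to connected browsers.
func (b *Bus) Subscribe(eventType string, fn func(string)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[eventType] = append(b.subs[eventType], fn)
}

// Publish delivers an event synchronously to every subscriber of its type.
func (b *Bus) Publish(eventType, payload string) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, fn := range b.subs[eventType] {
		fn(payload)
	}
}

func main() {
	bus := NewBus()
	bus.Subscribe("item.created.v1", func(p string) {
		fmt.Println("projection saw:", p)
	})
	bus.Publish("item.created.v1", `{"id":"a1"}`)
}
```

Keeping the bus in-process avoids a message broker while the system is a monolith; if a module is later extracted to a service, only the transport behind this interface needs to change.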
Frontend (Next.js)
The web interface is built with Next.js 14+ (App Router), TypeScript, Tailwind CSS, shadcn/ui components, React Query for server state, and Server-Sent Events for real-time updates.
Database (PostgreSQL)
A single PostgreSQL database stores:
- Event Store — all state changes as immutable events
- Read Models — optimized views for queries (note_views, hub_nodes, hub_edges)
- Vector Embeddings — pgvector for semantic search and hub routing
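A nearest-neighbour lookup against such a store might be built like the sketch below. The table and column names (note_views, embedding, title) are assumptions for illustration, not MemGhost's actual schema; `<=>` is pgvector's cosine-distance operator.

```go
package main

import "fmt"

// nearestNotesSQL builds a pgvector nearest-neighbour query of the kind a
// semantic-search read model could use. $1 would be bound to the query
// embedding produced by the embedding model.
func nearestNotesSQL(limit int) string {
	return fmt.Sprintf(
		`SELECT id, title FROM note_views ORDER BY embedding <=> $1 LIMIT %d`,
		limit)
}

func main() {
	fmt.Println(nearestNotesSQL(5))
}
```

Keeping events, read models, and vectors in one PostgreSQL instance means a query like this can join semantic ranking with ordinary relational filters in a single statement.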
Key Data Flow
Input (note, bookmark, webhook) → Vault (store as item, emit item.created.v1) → Classification pipeline (type, tags, metadata → item.classified.v1) → Hub routing (assign to topic node → item.routed.v1) → Materialization (synthesize hub page content) → Note projection (populate note_views table) → Embedding projection (generate vectors for search)

All writes enter through the vault. Hubs, notes, and search are downstream projections of vault events. This ensures one source of truth with a complete audit trail.
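The write side of this flow can be sketched as a pipeline that appends one versioned event per stage. Stage internals (LLM classification, hub routing) are stubbed out here; only the event ordering from the flow above is illustrated.

```go
package main

import "fmt"

// ingest sketches the write-side pipeline: each stage appends its versioned
// event to the log. Real stages would carry payloads and call the AI pipeline.
func ingest(itemID string) []string {
	var log []string
	log = append(log, "item.created.v1")    // vault stores the raw item
	log = append(log, "item.classified.v1") // classification assigns type, tags, metadata
	log = append(log, "item.routed.v1")     // hub routing picks a topic node
	// Downstream projections (hub pages, note_views, embeddings) consume the log.
	return log
}

func main() {
	fmt.Println(ingest("a1"))
}
```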
Modules
| Module | Purpose | Key Aggregates |
|---|---|---|
| Vault | Canonical item store, ingest and classification pipeline | VaultItem |
| Hubs | Topic graph, AI routing, content materialization | HubNode, HubEdge |
| Spaces | AI conversation workspaces with personas and artifacts | Space |
| Note | Read model for note-type items, tags, folders, pin/unpin | Note (projection) |
| Agent | AI chat sessions, MCP tool servers, semantic search | AgentSession |
| TTS | Voice synthesis and voice management | Voice |
| Auth | JWT authentication, sessions, roles | User, Session |
| Setup | First-run wizard, initial data import | — |
See Domain Modules for detailed module documentation.
AI & Voice
MemGhost includes optional AI capabilities. Local models (Ollama, Kokoro) run entirely on your hardware; hosted providers (OpenRouter, Anthropic) can optionally be used for chat:
| Feature | Technology | Purpose |
|---|---|---|
| AI Chat | OpenRouter / Anthropic / Ollama | Conversational agents with tool use |
| Semantic Search | Ollama (nomic-embed-text) + pgvector | Natural-language search across all data |
| Hub Routing | LLM classification | Route items to the right topic page |
| Content Synthesis | LLM generation | Materialize hub pages from source items |
| Text-to-Speech | Kokoro (82M params) | Spoken responses from AI agents |
| MCP Tools | Go SDK | Per-module tool definitions for AI agents |
See the AI Features and Voice guides for details.
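The ranking behind semantic search can be shown without a database: embeddings are compared by angle rather than keywords, which is what pgvector's cosine distance computes. The 3-dimensional vectors below are toy values; real nomic-embed-text embeddings have hundreds of dimensions.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between two vectors:
// 1 for identical direction, 0 for orthogonal. Semantic search ranks
// documents by this score against the query embedding.
func cosineSimilarity(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	query := []float64{0.9, 0.1, 0.0}
	noteA := []float64{0.8, 0.2, 0.1} // similar direction to the query
	noteB := []float64{0.0, 0.1, 0.9} // mostly orthogonal to the query
	fmt.Println(cosineSimilarity(query, noteA) > cosineSimilarity(query, noteB)) // true: noteA ranks higher
}
```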
Technology Stack
| Layer | Technologies |
|---|---|
| Backend | Go 1.25+, PostgreSQL 15+ (pgvector), OpenAPI 3.0 |
| Frontend | Next.js 14+, TypeScript, Tailwind CSS, shadcn/ui, React Query, Zustand |
| AI | Ollama, pgvector, MCP (Model Context Protocol), Kokoro TTS |
| Infrastructure | Docker, Docker Compose, Caddy, OpenTelemetry |
| Development | Dev Containers, Taskfile, Air (hot reload) |
Deployment
Self-hosted (Production)
- Single Go binary + PostgreSQL
- Docker Compose for easy deployment
- Minimal resource requirements (1–2 CPU cores, 512 MB–2 GB RAM without AI)
Development
- Docker Compose for local services
- Dev Container for consistent environment
- Hot reload for rapid development