System Overview
MemGhost is a self-hosted personal knowledge management platform. It captures raw items (notes, bookmarks, clippings, webhooks) into a vault, classifies them, routes them to topical hub pages maintained by an AI pipeline, and lets you chat with your knowledge through AI agents.
Architecture Philosophy
Modular Monolith with Event Sourcing
MemGhost uses a modular monolith architecture with event sourcing and CQRS patterns:
- Single Go Binary — easy deployment and operation
- PostgreSQL — event store, read models, and vector search (pgvector) in one database
- Event Bus — in-process event routing with SSE bridge for real-time UI updates
- Domain Modules — clear boundaries with enforced separation
- AI Pipeline — Ollama for embeddings and chat, Kokoro for TTS
Why This Architecture?
Operational Simplicity
- Single deployment unit (one Go binary)
- One database to manage (PostgreSQL)
- Minimal resource footprint for self-hosting
- Easy local development
Event Sourcing Benefits
- Complete audit log of all state changes
- Time-travel debugging (replay events to any point)
- Multiple read models from the same event stream (notes view, hub pages, search index)
- Vault as single source of truth with all views as projections
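The replay idea behind these benefits can be sketched in a few lines of Go. The `Event` and `ItemView` structs below are illustrative assumptions, not MemGhost's actual schema; only the event type names (`item.created.v1`, etc.) come from this document.

```go
package main

import "fmt"

// Event is a minimal event envelope (a sketch, not the real schema).
type Event struct {
	Type string
	Data map[string]string
}

// ItemView is one possible read model projected from the event stream.
type ItemView struct {
	ID     string
	Kind   string
	Status string
}

// Replay folds an ordered event stream into the current view. Rebuilding a
// projection is just replaying from the start; time-travel debugging is
// replaying a prefix of the log.
func Replay(events []Event) ItemView {
	var v ItemView
	for _, e := range events {
		switch e.Type {
		case "item.created.v1":
			v.ID = e.Data["id"]
			v.Status = "captured"
		case "item.classified.v1":
			v.Kind = e.Data["kind"]
			v.Status = "classified"
		case "item.routed.v1":
			v.Status = "routed"
		}
	}
	return v
}

func main() {
	log := []Event{
		{Type: "item.created.v1", Data: map[string]string{"id": "a1"}},
		{Type: "item.classified.v1", Data: map[string]string{"kind": "note"}},
		{Type: "item.routed.v1"},
	}
	// The same log can feed multiple projections: notes view, hub pages, search.
	fmt.Println(Replay(log))
}
```

Because the log is append-only, a new read model added later can be populated by replaying the full history, with no data migration.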
Future Flexibility
- Modules can be extracted to independent services later
- Clear boundaries enforced by architecture tests
- New item types and integrations plug into the same pipeline
System Components
Backend (Go)
The backend is a single Go binary that includes:
- HTTP Server — OpenAPI 3.0 generated handlers, JWT validation, authorization middleware
- Domain Modules — Vault, Hubs, Spaces, Note, Agent, TTS, Auth, Setup
- Event Bus — in-process event routing with SSE bridge
- Event Store — PostgreSQL-based event storage
- Read Models — optimized projections for queries
- AI Services — embedding generation, vector search, LLM routing, MCP tool servers
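The in-process event bus pattern can be illustrated with a minimal pub/sub sketch. The `Bus` type and its methods are assumptions for illustration, not MemGhost's actual API; the point is that an SSE bridge is just another subscriber.

```go
package main

import (
	"fmt"
	"sync"
)

// Bus is a minimal in-process pub/sub bus (a sketch of the pattern only).
type Bus struct {
	mu   sync.RWMutex
	subs map[string][]func(payload string)
}

func NewBus() *Bus {
	return &Bus{subs: make(map[string][]func(string))}
}

// Subscribe registers a handler for one event type. An SSE bridge would be
// just another subscriber that forwards payloads to connected browsers.
func (b *Bus) Subscribe(eventType string, fn func(string)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[eventType] = append(b.subs[eventType], fn)
}

// Publish delivers an event synchronously to every subscriber of its type.
func (b *Bus) Publish(eventType, payload string) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, fn := range b.subs[eventType] {
		fn(payload)
	}
}

func main() {
	bus := NewBus()
	bus.Subscribe("item.created.v1", func(p string) {
		fmt.Println("projection saw:", p)
	})
	bus.Publish("item.created.v1", `{"id":"a1"}`)
}
```

Keeping the bus in-process avoids a message broker while the system is a monolith; if a module is later extracted to a service, only the transport behind this interface needs to change.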
Frontend (Next.js)
The web interface is built with Next.js 14+ (App Router), TypeScript, Tailwind CSS, shadcn/ui components, React Query for server state, and Server-Sent Events for real-time updates.
Database (PostgreSQL)
A single PostgreSQL database stores:
- Event Store — all state changes as immutable events
- Read Models — optimized views for queries (note_views, hub_nodes, hub_edges)
- Vector Embeddings — pgvector for semantic search and hub routing
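A nearest-neighbour lookup against such a store might be built like the sketch below. The table and column names (note_views, embedding, title) are assumptions for illustration, not MemGhost's actual schema; `<=>` is pgvector's cosine-distance operator.

```go
package main

import "fmt"

// nearestNotesSQL builds a pgvector nearest-neighbour query of the kind a
// semantic-search read model could use. $1 would be bound to the query
// embedding produced by the embedding model.
func nearestNotesSQL(limit int) string {
	return fmt.Sprintf(
		`SELECT id, title FROM note_views ORDER BY embedding <=> $1 LIMIT %d`,
		limit)
}

func main() {
	fmt.Println(nearestNotesSQL(5))
}
```

Keeping events, read models, and vectors in one PostgreSQL instance means a query like this can join semantic ranking with ordinary relational filters in a single statement.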
Key Data Flow
Input (note, bookmark, webhook) → Vault (store as item, emit item.created.v1) → Classification pipeline (type, tags, metadata → item.classified.v1) → Hub routing (assign to topic node → item.routed.v1) → Materialization (synthesize hub page content) → Note projection (populate note_views table) → Embedding projection (generate vectors for search)

All writes enter through the vault. Hubs, notes, and search are downstream projections of vault events. This ensures one source of truth with a complete audit trail.
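The write side of this flow can be sketched as a pipeline that appends one versioned event per stage. Stage internals (LLM classification, hub routing) are stubbed out here; only the event ordering from the flow above is illustrated.

```go
package main

import "fmt"

// ingest sketches the write-side pipeline: each stage appends its versioned
// event to the log. Real stages would carry payloads and call the AI pipeline.
func ingest(itemID string) []string {
	var log []string
	log = append(log, "item.created.v1")    // vault stores the raw item
	log = append(log, "item.classified.v1") // classification assigns type, tags, metadata
	log = append(log, "item.routed.v1")     // hub routing picks a topic node
	// Downstream projections (hub pages, note_views, embeddings) consume the log.
	return log
}

func main() {
	fmt.Println(ingest("a1"))
}
```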
Modules
| Module | Purpose | Key Aggregates |
|---|---|---|
| Vault | Canonical item store, ingest and classification pipeline | VaultItem |
| Hubs | Topic graph, AI routing, content materialization | HubNode, HubEdge |
| Spaces | AI conversation workspaces with personas and artifacts | Space |
| Note | Read model for note-type items, tags, folders, pin/unpin | Note (projection) |
| Agent | AI chat sessions, MCP tool servers, semantic search | AgentSession |
| TTS | Voice synthesis and voice management | Voice |
| Auth | JWT authentication, sessions, roles | User, Session |
| Setup | First-run wizard, initial data import | — |
See Domain Modules for detailed module documentation.
AI & Voice
MemGhost includes optional AI capabilities. Local models (Ollama, Kokoro) run entirely on your hardware; hosted providers (OpenRouter, Anthropic) can optionally be used for chat:
| Feature | Technology | Purpose |
|---|---|---|
| AI Chat | OpenRouter / Anthropic / Ollama | Conversational agents with tool use |
| Semantic Search | Ollama (nomic-embed-text) + pgvector | Natural-language search across all data |
| Hub Routing | LLM classification | Route items to the right topic page |
| Content Synthesis | LLM generation | Materialize hub pages from source items |
| Text-to-Speech | Kokoro (82M params) | Spoken responses from AI agents |
| MCP Tools | Go SDK | Per-module tool definitions for AI agents |
See the AI Features and Voice guides for details.
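The ranking behind semantic search can be shown without a database: embeddings are compared by angle rather than keywords, which is what pgvector's cosine distance computes. The 3-dimensional vectors below are toy values; real nomic-embed-text embeddings have hundreds of dimensions.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between two vectors:
// 1 for identical direction, 0 for orthogonal. Semantic search ranks
// documents by this score against the query embedding.
func cosineSimilarity(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	query := []float64{0.9, 0.1, 0.0}
	noteA := []float64{0.8, 0.2, 0.1} // similar direction to the query
	noteB := []float64{0.0, 0.1, 0.9} // mostly orthogonal to the query
	fmt.Println(cosineSimilarity(query, noteA) > cosineSimilarity(query, noteB)) // true: noteA ranks higher
}
```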
Technology Stack
| Layer | Technologies |
|---|---|
| Backend | Go 1.25+, PostgreSQL 15+ (pgvector), OpenAPI 3.0 |
| Frontend | Next.js 14+, TypeScript, Tailwind CSS, shadcn/ui, React Query, Zustand |
| AI | Ollama, pgvector, MCP (Model Context Protocol), Kokoro TTS |
| Infrastructure | Docker, Docker Compose, Caddy, OpenTelemetry |
| Development | Dev Containers, Taskfile, Air (hot reload) |
Deployment
Self-hosted (Production)
- Single Go binary + PostgreSQL
- Docker Compose for easy deployment
- Minimal resource requirements (1–2 CPU cores, 512 MB–2 GB RAM without AI)
Development
- Docker Compose for local services
- Dev Container for consistent environment
- Hot reload for rapid development