Local AI Voice Memo Pipeline
A fully self-hosted system that captures voice memos from my phone, transcribes them on my home server, extracts topics and summaries, and stores everything in a vector database I can query and visualize as a knowledge graph. No data ever leaves my network.
The Problem
Quick voice notes are the fastest way to capture ideas on the go, but they end up as an unsearchable pile of audio files. Existing transcription services send your data to the cloud, and none of them connect your notes into a structured knowledge base you can actually explore.
The Solution
A local-only pipeline triggered by Wi-Fi sync. When I return home, voice memos automatically transfer to my home server, where they flow through transcription, summarization, topic clustering, and embedding, finally landing in a vector database that surfaces connections between ideas over time.
How It Works
- Capture & Sync — Record a voice memo on my phone; it syncs to the home server once I connect to my private Wi-Fi
- Transcription — Whisper converts the audio to text locally
- Summarization — Ollama (running a local LLM) generates a concise summary of each transcript
- Topic Clustering — BERTopic with T5 embeddings identifies thematic clusters across all notes
- Storage & Retrieval — Embeddings and cluster metadata are stored in Qdrant, forming a queryable knowledge graph
- Visualization — Query and explore the graph to surface connections between ideas, recurring themes, and evolving topics
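The transcription and summarization steps above can be sketched in a few lines. This is a minimal illustration, assuming the openai-whisper Python package and Ollama's default HTTP API on `localhost:11434`; the prompt text, model names, and helper functions are my own placeholders, not taken from the project:

```python
# Sketch of the transcribe -> summarize steps. SUMMARY_PROMPT, the "base"
# Whisper model, and the "llama3" Ollama model are illustrative choices.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint
SUMMARY_PROMPT = "Summarize this voice memo in 2-3 sentences:\n\n{transcript}"

def transcribe(audio_path: str) -> str:
    """Run Whisper locally on one memo. The import is deferred so the
    sketch loads even without whisper installed."""
    import whisper
    model = whisper.load_model("base")
    return model.transcribe(audio_path)["text"]

def build_summary_request(transcript: str, model: str = "llama3") -> dict:
    """Pure helper: the JSON body Ollama's /api/generate expects."""
    return {
        "model": model,
        "prompt": SUMMARY_PROMPT.format(transcript=transcript),
        "stream": False,  # ask for a single response, not a token stream
    }

def summarize(transcript: str) -> str:
    """POST the transcript to the local Ollama server and return its reply."""
    body = json.dumps(build_summary_request(transcript)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Using the standard library for the Ollama call keeps the pipeline dependency-light; only Whisper itself is a heavyweight import.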
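The clustering and storage steps might look like the following sketch, assuming the bertopic, sentence-transformers, and qdrant-client packages. Here `sentence-t5-base` stands in for the T5 embedding model, and the collection name and payload fields are illustrative, not the project's actual schema:

```python
# Sketch of the topic-clustering -> storage steps. Heavy imports are
# deferred inside the functions so the sketch loads without the libraries.

def cluster_memos(transcripts: list[str]):
    """Embed all transcripts with a T5-based sentence encoder, then fit
    BERTopic over the same embeddings to get a topic id per memo."""
    from bertopic import BERTopic
    from sentence_transformers import SentenceTransformer
    embedder = SentenceTransformer("sentence-t5-base")
    embeddings = embedder.encode(transcripts)
    topic_model = BERTopic(embedding_model=embedder)
    topics, _ = topic_model.fit_transform(transcripts, embeddings)
    return topics, embeddings, topic_model

def to_point(memo_id: int, vector, topic: int,
             transcript: str, summary: str) -> dict:
    """Pure helper: one Qdrant point as a plain dict (vector + payload)."""
    return {
        "id": memo_id,
        "vector": list(vector),
        "payload": {"topic": topic, "transcript": transcript, "summary": summary},
    }

def store(points: list[dict], collection: str = "voice_memos") -> None:
    """Upsert points into a local Qdrant instance (assumes the
    collection already exists with a matching vector size)."""
    from qdrant_client import QdrantClient
    from qdrant_client.models import PointStruct
    client = QdrantClient("localhost", port=6333)
    client.upsert(collection_name=collection,
                  points=[PointStruct(**p) for p in points])
```

Storing the topic id alongside the embedding in each point's payload is what lets later queries filter by cluster as well as by vector similarity.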
Key Design Decisions
- Fully local — No cloud services and no calls to external APIs; all processing runs on the home server
- Qdrant as vector database — Enables semantic search and nearest-neighbor queries across all memos
- BERTopic + T5 — Produces human-readable topic labels rather than opaque cluster IDs
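To make the "semantic search and nearest-neighbor queries" concrete, here is a toy pure-Python cosine-similarity ranking. It is the same similarity measure Qdrant applies at scale over the stored memo embeddings; the vectors and memo ids below are invented for illustration:

```python
# Toy nearest-neighbor search: rank memo vectors by cosine similarity
# to a query vector, mimicking what Qdrant does over real embeddings.
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query: list[float], memos: list[dict], k: int = 2) -> list[str]:
    """Return the ids of the k memos most similar to the query vector."""
    ranked = sorted(memos, key=lambda m: cosine(query, m["vector"]), reverse=True)
    return [m["id"] for m in ranked[:k]]

memos = [
    {"id": "gardening", "vector": [0.9, 0.1, 0.0]},
    {"id": "server-upgrade", "vector": [0.0, 0.2, 0.9]},
    {"id": "compost", "vector": [0.8, 0.3, 0.1]},
]
print(nearest([1.0, 0.0, 0.0], memos))  # the two gardening-related memos rank first
```

In the real pipeline the query vector would itself come from embedding a search phrase with the same T5 model, so "things I said about the garden" lands near the gardening memos.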