My Data Flow
Data Harmony and Flow
Controls downstream propagation for personal-flagged sources — iMessage, email (personal threads), PML / PKL media, personal cloud-storage roots. Ingest stays alive either way; with the toggle off, those rows still land in items.inbox_items but the extractor, observation synthesis, and My Portfolio push all skip them.
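The gating rule described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the row shape, the isPersonal field, the stage names, and the personalFlowEnabled parameter are all hypothetical, since the source names only the items.inbox_items table and the stages that are skipped.

```typescript
// Hypothetical row shape: the source names items.inbox_items but not its columns.
type InboxRow = { id: number; source: string; isPersonal: boolean };

// Stage names are illustrative labels for ingest, the extractor,
// observation synthesis, and the My Portfolio push.
type Stage = "ingest" | "extract" | "observe" | "portfolioPush";

const ALL_STAGES: Stage[] = ["ingest", "extract", "observe", "portfolioPush"];

// Ingest always runs. Every downstream stage skips personal-flagged rows
// while the toggle is off.
function shouldRun(stage: Stage, row: InboxRow, personalFlowEnabled: boolean): boolean {
  if (stage === "ingest") return true;
  return !row.isPersonal || personalFlowEnabled;
}

// Returns the stages a given row will pass through.
function stagesFor(row: InboxRow, personalFlowEnabled: boolean): Stage[] {
  return ALL_STAGES.filter((stage) => shouldRun(stage, row, personalFlowEnabled));
}
```

With the toggle off, a personal row yields only the ingest stage, which matches the behavior above: the row still lands in the inbox table but nothing downstream touches it.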
text-embedding-3-small, then runs cosine-similarity ANN search against items.embeddings (pgvector + HNSW index). Returns top-k hits across tasks, commitments, decisions, entities, documents, and inbox_items. You can filter by ?tenant=lfi, ?table=tasks, ?since_days=90, or ?entity_id=….
POST /api/ingest/m365?mode=cron&hours=48 - backfill outlook → inbox
POST /api/admin/close-stale-items?apply=1 - close tasks > 14d
POST /api/extract - manual extractor pass (dev only)
items-hub-ask <question> SKILL - synthesize via plan auth
/search or /ask. Embedding writes happen during extraction: the Mac-local items-hub-extract SKILL fires on a 2-hour cron and calls upsertEmbeddings() per processed inbox row. Synthesis runs under plan auth; only the OpenAI embedding calls touch a paid API (cheap, ~$0.02 per 1M tokens).
app/api/search/route.ts (GET /api/search?q=…) and reused by app/api/ask/route.ts. Embedding writes: app/lib/embeddings.ts via upsertEmbeddings() called from app/lib/extractor.ts. Frontend: the global SearchBar at the top of every dashboard tab + dedicated /search page.
curl -H "x-ingest-secret: $SECRET" "https://<host>/api/search?q=tidewater&limit=10"
SELECT * FROM items.embeddings ORDER BY embedding <=> '[…]'::vector LIMIT 10 (use postgres npm pkg, not psql; channel-binding fails).
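The search filters and the curl call above could be wrapped in a small client along these lines. This is a sketch under stated assumptions: buildSearchUrl, search, and the SearchFilters interface are hypothetical names, the filters are assumed to be combinable, and the response shape is not documented here, so it is left as unknown. The query parameter names and the x-ingest-secret header are taken from the text.

```typescript
// Filters named in the text; all are optional.
// Assumption: they can be combined in one request.
interface SearchFilters {
  tenant?: string;    // e.g. "lfi"
  table?: string;     // e.g. "tasks"
  sinceDays?: number; // sent as since_days
  entityId?: string;  // sent as entity_id
  limit?: number;     // as used in the curl example
}

// Builds the GET /api/search URL for a query and optional filters.
function buildSearchUrl(baseUrl: string, q: string, f: SearchFilters = {}): string {
  const url = new URL("/api/search", baseUrl);
  url.searchParams.set("q", q);
  if (f.tenant !== undefined) url.searchParams.set("tenant", f.tenant);
  if (f.table !== undefined) url.searchParams.set("table", f.table);
  if (f.sinceDays !== undefined) url.searchParams.set("since_days", String(f.sinceDays));
  if (f.entityId !== undefined) url.searchParams.set("entity_id", f.entityId);
  if (f.limit !== undefined) url.searchParams.set("limit", String(f.limit));
  return url.toString();
}

// Calls the endpoint with the shared-secret header from the curl example.
async function search(
  baseUrl: string,
  secret: string,
  q: string,
  f: SearchFilters = {},
): Promise<unknown> {
  const res = await fetch(buildSearchUrl(baseUrl, q, f), {
    headers: { "x-ingest-secret": secret },
  });
  if (!res.ok) throw new Error(`search failed with status ${res.status}`);
  // Response shape is not specified in the source, so it stays untyped.
  return res.json();
}
```

For example, buildSearchUrl("https://example.test", "tidewater", { table: "tasks", limit: 10 }) produces https://example.test/api/search?q=tidewater&table=tasks&limit=10, where example.test stands in for the real host.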