# Consuming Items Hub
Who this is for. You (or a teammate, or an LLM agent, or another app like Brickston) want to read from the Items Hub data lake once sources have landed rows in the inbox and the extractor has normalized them. This page shows the three supported access patterns, with working examples.
The short version. Three ways in, ranked by coupling:

- Direct SQL to Neon — tightest, lowest-latency, full expressive power. For services you own end-to-end (Brickston, internal scripts).
- REST `/api/*` on the Items Hub Vercel deployment — safest for external or LLM consumers. Serverless, rate-limited by Vercel, no secrets sprayed around.
- Brickston's proxy router (`/api/items-hub/*` on the Brickston FastAPI) — only relevant if you're already inside Brickston's auth/RBAC surface. This is how Brickston surfaces Items Hub context on property / entity / deal pages.
All three read the same Neon schema. There is one source of truth (`items.*` in Neon); these are just three ways to get at it.
## Pattern 1 — Direct SQL to Neon
When to use: internal services that already have Neon creds (Brickston backend, admin scripts, scheduled-task runners). Shortest path, full SQL expressivity.
Credentials. `DATABASE_URL` for runtime pooled traffic, `DATABASE_URL_UNPOOLED` for migrations / long txns. Read-only? Make a dedicated Neon role — the project's pooled connection is owner-level and should not leave trusted boundaries.
```sh
# Mac / Linux quick one-liner
URL="postgresql://…"   # from .env.local or Keychain
psql "$URL" -c "SELECT display_name, tenant, status FROM items.source_config ORDER BY sort_order;"
```
### The tables (stable contract)

- `items.inbox_items` — raw-ish payloads, one row per event (email, meeting, file, chat line, …). `payload jsonb` has a `kind` discriminator (see `docs/DATA_DICTIONARY.md`).
- `items.tasks` — extracted action items. `title`, `due_date`, `priority_score` (0–1), `tenant`, `entity_id`, `status` (open/done/cancelled), `source_item_id`.
- `items.commitments` — outbound promises you made. `promised_to`, `promise_text`, `deadline`, `fulfilled_at`.
- `items.decisions` — resolved things with context. `title`, `context`, `decided_at`, `entity_ids uuid[]`.
- `items.entities` — people / companies / portfolios / markets / properties. `name`, `type`, `tenant`, `canonical_email`, `aliases[]`, `metadata jsonb`.
- `items.documents` — file references. `title`, `source_uri`, `content_hash`, `entity_ids[]`.
- `items.embeddings` — 1536-dim vectors (OpenAI `text-embedding-3-small`) with `source_db`/`source_table`/`source_id`/`text`/`tenant`/`occurred_at`/`metadata`. HNSW-indexed on `embedding vector_cosine_ops`.
- `items.source_runs` — operational log, one row per scheduled-task fire.
- `items.source_config` — one row per source; this is the registry the dashboard + data dictionary read from.
### Useful query cookbook

```sql
-- All open tasks for a tenant, prioritized the way the dashboard does
SELECT t.title, t.due_date, t.priority_score, e.name AS entity
FROM items.tasks t
LEFT JOIN items.entities e ON e.id = t.entity_id
WHERE t.status = 'open' AND t.tenant = 'lfi'
ORDER BY
  CASE WHEN t.due_date < NOW() THEN 0 ELSE 1 END,
  t.due_date ASC NULLS LAST,
  COALESCE(t.priority_score, 0) DESC;
```
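If you are consuming tasks over HTTP rather than SQL, the same ordering can be reproduced client-side. A minimal Python sketch, assuming rows shaped like the `items.tasks` columns above (the sample rows themselves are illustrative, not real data):

```python
from datetime import date

def dashboard_sort_key(task, today=None):
    """Mirror the SQL ORDER BY: overdue first, then earliest due date
    with NULL (None) due dates last, then highest priority_score."""
    today = today or date.today()
    due = task.get("due_date")  # a datetime.date or None
    overdue = 0 if (due is not None and due < today) else 1
    # `due is None` sorts True after False, pushing undated tasks last
    return (overdue, due is None, due or date.max,
            -(task.get("priority_score") or 0))

tasks = [
    {"title": "Send draw request", "due_date": date(2026, 1, 5), "priority_score": 0.4},
    {"title": "File insurance cert", "due_date": None, "priority_score": 0.9},
    {"title": "Review rent roll", "due_date": date(2026, 2, 1), "priority_score": 0.8},
]
ranked = sorted(tasks, key=lambda t: dashboard_sort_key(t, today=date(2026, 1, 10)))
# overdue "Send draw request" ranks first despite its lower priority_score
```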
```sql
-- Everything that mentions an entity (name match, tenant-scoped)
WITH target AS (
  SELECT id FROM items.entities
  WHERE tenant = 'mosser' AND name ILIKE '%Ji Won%' LIMIT 1
)
SELECT 'task' AS kind, t.title AS body, t.created_at FROM items.tasks t
WHERE t.entity_id = (SELECT id FROM target)
UNION ALL
SELECT 'decision' AS kind, d.title, d.created_at FROM items.decisions d
WHERE (SELECT id FROM target) = ANY(d.entity_ids)
UNION ALL
SELECT 'document' AS kind, dc.title, dc.extracted_at FROM items.documents dc
WHERE (SELECT id FROM target) = ANY(dc.entity_ids)
ORDER BY 3 DESC LIMIT 100;
```
```sql
-- Semantic nearest-neighbor over everything (pgvector)
-- Requires: the query vector as text like '[0.123,-0.456,...]' (1536 floats)
SELECT source_table, source_id, text,
       1 - (embedding <=> $1::vector) AS similarity
FROM items.embeddings
WHERE tenant IN ('lfi','mosser')
ORDER BY embedding <=> $1::vector
LIMIT 20;
```
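The `$1::vector` parameter is plain text in pgvector's input format. A small helper to turn an embedding (a list of 1536 floats) into that literal — the function name and formatting choices here are ours, not part of the schema:

```python
def to_pgvector_literal(vec, dim=1536):
    """Render a float list as pgvector text input, e.g. '[0.1,-0.25,...]'."""
    if len(vec) != dim:
        raise ValueError(f"expected {dim} floats, got {len(vec)}")
    return "[" + ",".join(f"{float(x):.6g}" for x in vec) + "]"

# Pass as the bind parameter, e.g. with psycopg:
#   cur.execute("... ORDER BY embedding <=> %s::vector LIMIT 20",
#               (to_pgvector_literal(emb),))
lit = to_pgvector_literal([0.0] * 1536)
```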
```sql
-- Extractor queue depth per source
-- COUNT(ii.id) rather than COUNT(*): don't count the null-extended
-- row as queue depth for sources with no inbox rows at all
SELECT c.display_name, c.status,
       COUNT(ii.id) FILTER (WHERE ii.processed_at IS NULL) AS queue,
       COUNT(ii.id) FILTER (WHERE ii.error IS NOT NULL) AS errors
FROM items.source_config c
LEFT JOIN items.inbox_items ii ON ii.source = c.source
GROUP BY c.display_name, c.status
ORDER BY queue DESC;
```
### Pooled vs unpooled

- Pooled (`-pooler` host) — use for runtime Vercel + short queries. Default `DATABASE_URL`.
- Unpooled — required for `drizzle-kit push`/`generate`, long transactions (LISTEN/NOTIFY, advisory locks), and session-scoped `SET` (GUC) work.
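Neon's pooled endpoint is distinguishable from the hostname alone, which makes a startup-time sanity check cheap. A sketch — the `-pooler` hostname convention is Neon's; the helper name and sample URL are ours:

```python
from urllib.parse import urlparse

def is_pooled_neon_url(url: str) -> bool:
    """True when the connection string points at Neon's PgBouncer pooler
    (hostname like ep-xxx-pooler.<region>.aws.neon.tech)."""
    host = urlparse(url).hostname or ""
    return "-pooler" in host

# Guard example: a migration runner could refuse a pooled URL up front.
url = "postgresql://user:pw@ep-calm-sea-123456-pooler.us-east-2.aws.neon.tech/neondb"
pooled = is_pooled_neon_url(url)  # True for this sample hostname
```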
## Pattern 2 — REST `/api/*` on Items Hub
When to use: any consumer that shouldn't have Neon creds — LLM agents, third-party dashboards, mobile clients, curl from a workstation. These endpoints are the stable public contract.
Base URL. https://lfiqdataflow-git-main-lfiq.vercel.app
Auth. Read endpoints (search, sources, overall) are public for now. Write endpoints (`POST /api/ingest/*`) require `x-ingest-secret: $INGEST_SECRET`. If you expose this deployment externally, gate reads with Vercel Auth or an Edge-middleware check.
### Available endpoints
| Method | Path | Purpose |
|---|---|---|
| GET | /api/health | Liveness + DB ping. |
| GET | /api/overall | Aggregate counters the home header uses (today / 7d / queue / open tasks / …). |
| GET | /api/sources | One row per source in source_config, with recent counts + health shape the dashboard grid consumes. |
| GET | /api/search?q=&tenant=&table=&since_days=&limit= | Semantic search. Returns source_db/source_table/source_id/text/similarity hits; caller hydrates from the owning table. |
| POST | /api/search with { query, tenant?, table?, since_days?, limit?, entity_id? } | Same as GET but body-based, used by Brickston's proxy. |
| POST | /api/extract with { limit? } | Manually drain the inbox queue (Mac-local scheduled task uses this; external callers normally don't). |
| GET | /api/priorities/digest?format=html\|json&tenant=? | Daily priorities digest. html returns email-safe inline-styled HTML (fetch from PKM-AI-Workflow → forward to Outlook); json returns {tasks, commitments, counts}. Public read, no auth. |
| POST | /api/ingest/<source> | Write rows into items.inbox_items. Secret required. This is how data gets in, not out. |
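For GET consumers, the query string is easiest to assemble with stdlib tooling instead of hand-escaping. A sketch — the parameter names match the table above; the helper itself is illustrative:

```python
from urllib.parse import urlencode

BASE = "https://lfiqdataflow-git-main-lfiq.vercel.app"

def search_url(q, tenant=None, table=None, since_days=None, limit=None):
    """Build /api/search?q=... including only the parameters actually set."""
    params = {"q": q, "tenant": tenant, "table": table,
              "since_days": since_days, "limit": limit}
    qs = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE}/api/search?{qs}"

url = search_url("870 Oak loan draw schedule", tenant="lfi", limit=5)
```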
### Example — semantic search from a shell

```sh
curl -sS 'https://lfiqdataflow-git-main-lfiq.vercel.app/api/search?q=Ji+Won+follow+up+on+870+Oak&tenant=lfi&limit=5' \
  | jq '.results[] | { similarity, source_table, occurred_at, text }'
```
### Example — semantic search from Python

```python
import httpx

r = httpx.post(
    "https://lfiqdataflow-git-main-lfiq.vercel.app/api/search",
    json={
        "query": "870 Oak loan draw schedule",
        "tenant": "lfi",
        "since_days": 180,
        "limit": 10,
    },
    timeout=30,
)
for hit in r.json()["results"]:
    print(f"{hit['similarity']:.3f}  {hit['source_table']}  {hit['text'][:120]}")
```
### Example — embed an Items Hub source card in another app

```ts
const res = await fetch("https://lfiqdataflow-git-main-lfiq.vercel.app/api/sources");
const { rows } = await res.json();
const granola = rows.find((r: any) => r.source === "granola");
// { displayName, lastSuccessAt, last7dRecords, backlog, ... } — render directly
```
## Pattern 3 — Brickston proxy (`/api/items-hub/*`)
When to use: you're building inside Brickston and want to surface Items Hub context on existing pages. The proxy wraps direct Neon reads + the Items Hub search endpoint so Brickston can enforce its own RBAC and session auth before letting a user see any rows.
Where it lives. Brickston backend at `02-brickston2.ai/backend/app/routers/items_hub.py`. Requires two env vars to be set on the Brickston backend:

```sh
BRICKSTON_ITEMS_HUB_DATABASE_URL=postgresql://…   # same Neon URL as Items Hub
BRICKSTON_ITEMS_HUB_INGEST_SECRET=…               # to proxy writes if ever needed
```
### Endpoints (on the Brickston backend)
| Method | Path | Purpose |
|---|---|---|
| GET | /api/items-hub/health | Confirms the second asyncpg pool is alive. |
| GET | /api/items-hub/entity/{name}/activity?since_days=90&limit=50 | Fuzzy-match entity by name, return open tasks + decisions + recent inbox mentions. |
| GET | /api/items-hub/property/{code}/activity?since_days=180&limit=50 | Same shape, keyed on property code / address. |
| GET | /api/items-hub/tasks/open?tenant=&limit=100 | Open-task roll-up. |
| POST | /api/items-hub/search | Proxies Items Hub /api/search. |
| POST | /api/items-hub/events/market | Proxies writes to /api/ingest/market with the ingest secret injected server-side. |
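The secret-injection step in the market-events proxy amounts to rewriting headers before forwarding. A minimal sketch of that logic only — the function and variable names are ours, not the actual router code:

```python
import os

def outbound_headers(incoming: dict) -> dict:
    """Forward content-negotiation headers only: never pass through the
    caller's auth, and attach the ingest secret server-side."""
    allowed = {k: v for k, v in incoming.items()
               if k.lower() in {"content-type", "accept"}}
    allowed["x-ingest-secret"] = os.environ["BRICKSTON_ITEMS_HUB_INGEST_SECRET"]
    return allowed
```

The point of the allowlist is that the user's Brickston session token stays on the Brickston side; only the server-held ingest secret ever reaches Items Hub.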
### How Brickston UI consumes it

`apps/web/lib/api/hooks/use-items-hub.ts` exposes TanStack Query hooks:

```tsx
import { useEntityActivity, usePropertyActivity, useItemsHubSearch } from "@/lib/api/hooks/use-items-hub";

// On a property page:
<ItemsHubActivity propertyCode={property.name} sinceDays={180} />
```

The ItemsHubActivity component (`apps/web/components/domain/items-hub-activity.tsx`) renders a two-column "Activity" tab with open tasks + decisions + recent mentions. Graceful fallback: if the backend is missing `BRICKSTON_ITEMS_HUB_DATABASE_URL`, the tab shows a "not configured" hint instead of 500-ing.
## Writing into Items Hub (for completeness)
Any new source follows the convergent pattern:
- Trigger runs (Anthropic cloud / Windows Claude Desktop / Mac login-item).
- Trigger POSTs JSON to one of the `/api/ingest/<source>` routes with `x-ingest-secret`.
- Row lands in `items.inbox_items` with a source-specific idempotency key.
- Mac-local scheduled task `items-hub-extract` drains the queue every 10 min.
- Normalized rows land in `items.tasks`/`commitments`/`decisions`/`entities`/`documents` + a row in `items.embeddings`.
- Dashboard + consumers see the new data on the next query.
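The idempotency key in the flow above is whatever stable ID the source has; when it has none, a content hash is a reasonable stand-in. A hedged sketch of payload construction — the field names and fallback scheme here are illustrative, and the actual per-source contract lives in docs/DATA_DICTIONARY.md:

```python
import hashlib
import json

def build_ingest_row(source, payload, external_id=None):
    """One inbox row: a stable idempotency key plus the raw payload.
    Falls back to SHA-256 of the canonicalized payload when the source
    has no native ID, so retried POSTs dedupe to the same row."""
    key = external_id or hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()
    ).hexdigest()
    return {"source": source,
            "idempotency_key": f"{source}:{key}",
            "payload": payload}

row = build_ingest_row("granola", {"kind": "meeting", "title": "870 Oak draw call"})
```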
New source checklist (for future expansion):
- Add a row to `app/lib/sources.ts` with `status: 'designed'` and the next free `sort_order`.
- Run `npm run bootstrap:sources` (upserts into `items.source_config`).
- Add a new `/api/ingest/<source>/route.ts`, or reuse `/api/ingest/market` if the payload is a summary-style blob.
- Add a payload `kind` clause to the extractor's prompt dispatch in `app/lib/extractor.ts`.
- Register a trigger (cloud scheduled task, Windows Claude Desktop, or LaunchAgent) that posts.
- First successful POST auto-promotes the card to `live`.
- Update `docs/DATA_DICTIONARY.md` with the new source's row + payload contract.
## When things look broken
- No rows for a source you expected to ingest. Check `items.source_runs` for the source; check the trigger's log; check the dashboard detail page `/sources/<source>` — the "Retry failed" button clears errors and re-drains.
- `/api/search` returns empty. Either the query is semantically odd, or `items.embeddings` doesn't have that row type indexed. Run `scripts/backfill_embeddings.ts` for a full pass.
- Extractor queue keeps growing. The Mac-local `items-hub-extract` scheduled task needs to be running. Open Claude Code, check the scheduled-task sidebar, click "Run now" if needed.
- Brickston Activity tab says "not configured". Set the two env vars above and restart the backend.
## See also

- `docs/DATA_DICTIONARY.md` — every source, every payload shape, every scope / retention caveat.
- `docs/WINDOWS_CLAUDE_DESKTOP.md` — M365 trigger operator guide.
- `docs/CRON_HISTORY.md` — trigger / cron changelog, including the 2026-04-22 Craigslist + PBI fix.
- `app/lib/db/schema.ts` — Drizzle schema (ground truth for table shapes).
- `prompt.md` — the living plan; status snapshot + phase roadmap.