
# Consuming Items Hub

`docs/CONSUMING_ITEMS_HUB.md`

**Who this is for.** You (or a teammate, or an LLM agent, or another app like Brickston) want to read from the Items Hub data lake once sources have landed rows in the inbox and the extractor has normalized them. This page shows the three supported access patterns, with working examples.

**The short version.** Three ways in, ranked by coupling:

  1. Direct SQL to Neon — tightest, lowest-latency, full expressive power. For services you own end-to-end (Brickston, internal scripts).
  2. REST `/api/*` on the Items Hub Vercel deployment — safest for external or LLM consumers. Serverless, rate-limited by Vercel, no secrets sprayed around.
  3. Brickston's proxy router (`/api/items-hub/*` on the Brickston FastAPI) — only relevant if you're already inside Brickston's auth/RBAC surface. This is how Brickston surfaces Items Hub context on property / entity / deal pages.

All three read the same Neon schema. There is one source of truth (`items.*` in Neon); these are just three ways to get at it.


## Pattern 1 — Direct SQL to Neon

**When to use:** internal services that already have Neon creds (Brickston backend, admin scripts, scheduled-task runners). Shortest path, full SQL expressivity.

**Credentials.** `DATABASE_URL` for runtime pooled traffic, `DATABASE_URL_UNPOOLED` for migrations / long transactions. Read-only? Make a dedicated Neon role — the project's pooled connection is owner-level and should not leave trusted boundaries.

```bash
# Mac / Linux quick one-liner
URL="postgresql://…"  # from .env.local or Keychain
psql "$URL" -c "SELECT display_name, tenant, status FROM items.source_config ORDER BY sort_order;"
```

### The tables (stable contract)

### Useful query cookbook

```sql
-- All open tasks for a tenant, prioritized the way the dashboard does
SELECT t.title, t.due_date, t.priority_score, e.name AS entity
FROM items.tasks t
LEFT JOIN items.entities e ON e.id = t.entity_id
WHERE t.status = 'open' AND t.tenant = 'lfi'
ORDER BY
  CASE WHEN t.due_date < NOW() THEN 0 ELSE 1 END,
  t.due_date ASC NULLS LAST,
  COALESCE(t.priority_score, 0) DESC;
```
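If a consumer wants the same ordering client-side, the `ORDER BY` above translates to a sort key like this — a sketch; the sample rows and function name are made up:

```python
from datetime import date

def priority_key(task: dict, today: date):
    """Mirror the SQL ORDER BY: overdue first, then due date ascending
    with missing due dates last, then priority_score descending."""
    due = task.get("due_date")
    overdue = 0 if (due is not None and due < today) else 1
    # Python can't compare None with dates, so push missing due dates to the end.
    due_key = (due is None, due or date.max)
    score = -(task.get("priority_score") or 0)
    return (overdue, due_key, score)

tasks = [
    {"title": "File permit", "due_date": date(2024, 1, 5), "priority_score": 2},
    {"title": "No due date", "due_date": None, "priority_score": 9},
    {"title": "Overdue rent demand", "due_date": date(2023, 12, 1), "priority_score": 1},
]
ordered = sorted(tasks, key=lambda t: priority_key(t, today=date(2024, 1, 1)))
# Overdue rent demand, then File permit, then No due date
```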

```sql
-- Everything that mentions an entity (name match, tenant-scoped)
WITH target AS (
  SELECT id FROM items.entities
  WHERE tenant = 'mosser' AND name ILIKE '%Ji Won%' LIMIT 1
)
SELECT 'task'     AS kind, t.title AS body, t.created_at FROM items.tasks t
  WHERE t.entity_id = (SELECT id FROM target)
UNION ALL
SELECT 'decision' AS kind, d.title, d.created_at FROM items.decisions d
  WHERE (SELECT id FROM target) = ANY(d.entity_ids)
UNION ALL
SELECT 'document' AS kind, dc.title, dc.extracted_at FROM items.documents dc
  WHERE (SELECT id FROM target) = ANY(dc.entity_ids)
ORDER BY 3 DESC LIMIT 100;
```

```sql
-- Semantic nearest-neighbor over everything (pgvector)
-- Requires: the query vector as text like '[0.123,-0.456,...]' (1536 floats)
SELECT source_table, source_id, text,
       1 - (embedding <=> $1::vector) AS similarity
FROM items.embeddings
WHERE tenant IN ('lfi','mosser')
ORDER BY embedding <=> $1::vector
LIMIT 20;
```
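The `$1` parameter above is the query embedding rendered as a pgvector text literal. A small helper for building one — a sketch; the helper name is illustrative, and the 1536-dim check comes from the comment above:

```python
def to_pgvector_literal(vec: list[float]) -> str:
    """Format a float list as the '[f1,f2,...]' text literal pgvector accepts."""
    if len(vec) != 1536:
        raise ValueError(f"expected 1536 dims, got {len(vec)}")
    # repr() keeps full float precision; pgvector parses plain decimal notation.
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"

literal = to_pgvector_literal([0.0] * 1536)
# pass `literal` as the $1 bind parameter, cast server-side with ::vector
```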

```sql
-- Extractor queue depth per source
SELECT c.display_name, c.status,
       COUNT(*) FILTER (WHERE ii.processed_at IS NULL) AS queue,
       COUNT(*) FILTER (WHERE ii.error IS NOT NULL)     AS errors
FROM items.source_config c
LEFT JOIN items.inbox_items ii ON ii.source = c.source
GROUP BY c.display_name, c.status
ORDER BY queue DESC;
```

### Pooled vs unpooled


## Pattern 2 — REST `/api/*` on Items Hub

**When to use:** any consumer that shouldn't have Neon creds — LLM agents, third-party dashboards, mobile clients, curl from a workstation. These endpoints are the stable public contract.

**Base URL.** `https://lfiqdataflow-git-main-lfiq.vercel.app`

**Auth.** Read endpoints (search, sources, overall) are public for now. Write endpoints (`POST /api/ingest/*`) require `x-ingest-secret: $INGEST_SECRET`. If you expose this deployment externally, gate reads with Vercel Auth or an Edge-middleware check.

### Available endpoints

| Method | Path | Purpose |
| --- | --- | --- |
| GET | `/api/health` | Liveness + DB ping. |
| GET | `/api/overall` | Aggregate counters the home header uses (today / 7d / queue / open tasks / …). |
| GET | `/api/sources` | One row per source in `source_config`, with recent counts + the health shape the dashboard grid consumes. |
| GET | `/api/search?q=&tenant=&table=&since_days=&limit=` | Semantic search. Returns `source_db`/`source_table`/`source_id`/`text`/`similarity` hits; caller hydrates from the owning table. |
| POST | `/api/search` with `{ query, tenant?, table?, since_days?, limit?, entity_id? }` | Same as GET but body-based; used by Brickston's proxy. |
| POST | `/api/extract` with `{ limit? }` | Manually drain the inbox queue (the Mac-local scheduled task uses this; external callers normally don't). |
| GET | `/api/priorities/digest?format=html\|json&tenant=?` | Daily priorities digest. `html` returns email-safe inline-styled HTML (fetch from PKM-AI-Workflow → forward to Outlook); `json` returns `{tasks, commitments, counts}`. Public read, no auth. |
| POST | `/api/ingest/<source>` | Write rows into `items.inbox_items`. Secret required. This is how data gets in, not out. |
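A write through `POST /api/ingest/<source>` looks like this. A sketch that only assembles the request — `build_ingest_request` and the payload shape are illustrative; the header name and path come from the table above, and actually sending it (e.g. with `httpx.post`) is left to the caller:

```python
import json
import os

BASE = "https://lfiqdataflow-git-main-lfiq.vercel.app"

def build_ingest_request(source: str, payload: dict, secret: str):
    """Assemble the URL, headers, and JSON body for POST /api/ingest/<source>."""
    url = f"{BASE}/api/ingest/{source}"
    headers = {"content-type": "application/json", "x-ingest-secret": secret}
    return url, headers, json.dumps(payload)

url, headers, body = build_ingest_request(
    "market",
    {"summary": "example row"},               # payload shape is illustrative
    os.environ.get("INGEST_SECRET", "dev"),   # real secret comes from the env
)
```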

### Example — semantic search from a shell

```bash
curl -sS 'https://lfiqdataflow-git-main-lfiq.vercel.app/api/search?q=Ji+Won+follow+up+on+870+Oak&tenant=lfi&limit=5' \
  | jq '.results[] | { similarity, source_table, occurred_at, text }'
```

### Example — semantic search from Python

```python
import httpx

r = httpx.post(
    "https://lfiqdataflow-git-main-lfiq.vercel.app/api/search",
    json={
        "query": "870 Oak loan draw schedule",
        "tenant": "lfi",
        "since_days": 180,
        "limit": 10,
    },
    timeout=30,
)
for hit in r.json()["results"]:
    print(f"{hit['similarity']:.3f}  {hit['source_table']}  {hit['text'][:120]}")
```
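Search returns pointers (`source_table`, `source_id`) plus snippet text; the caller hydrates full rows from the owning table. One way to do that efficiently is to bucket hits per table first, so each table needs one `WHERE id = ANY($1)` query instead of N point lookups. A sketch with made-up hits:

```python
from collections import defaultdict

def group_for_hydration(hits: list[dict]) -> dict[str, list[str]]:
    """Bucket search hits by source_table for batched hydration queries."""
    by_table: dict[str, list[str]] = defaultdict(list)
    for hit in hits:
        by_table[hit["source_table"]].append(hit["source_id"])
    return dict(by_table)

hits = [
    {"source_table": "tasks", "source_id": "t1"},
    {"source_table": "documents", "source_id": "d1"},
    {"source_table": "tasks", "source_id": "t2"},
]
buckets = group_for_hydration(hits)
# {"tasks": ["t1", "t2"], "documents": ["d1"]}
```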

### Example — embed an Items Hub source card in another app

```ts
const res = await fetch("https://lfiqdataflow-git-main-lfiq.vercel.app/api/sources");
const { rows } = await res.json();
const granola = rows.find((r: any) => r.source === "granola");
// { displayName, lastSuccessAt, last7dRecords, backlog, ... } — render directly
```

## Pattern 3 — Brickston proxy (`/api/items-hub/*`)

**When to use:** you're building inside Brickston and want to surface Items Hub context on existing pages. The proxy wraps direct Neon reads + the Items Hub search endpoint so Brickston can enforce its own RBAC and session auth before letting a user see any rows.

**Where it lives.** Brickston backend at `02-brickston2.ai/backend/app/routers/items_hub.py`. Requires two env vars to be set on the Brickston backend:

```bash
BRICKSTON_ITEMS_HUB_DATABASE_URL=postgresql://…   # same Neon URL as Items Hub
BRICKSTON_ITEMS_HUB_INGEST_SECRET=…               # to proxy writes if ever needed
```

### Endpoints (on the Brickston backend)

| Method | Path | Purpose |
| --- | --- | --- |
| GET | `/api/items-hub/health` | Confirms the second asyncpg pool is alive. |
| GET | `/api/items-hub/entity/{name}/activity?since_days=90&limit=50` | Fuzzy-match entity by name; return open tasks + decisions + recent inbox mentions. |
| GET | `/api/items-hub/property/{code}/activity?since_days=180&limit=50` | Same shape, keyed on property code / address. |
| GET | `/api/items-hub/tasks/open?tenant=&limit=100` | Open-task roll-up. |
| POST | `/api/items-hub/search` | Proxies Items Hub `/api/search`. |
| POST | `/api/items-hub/events/market` | Proxies writes to `/api/ingest/market` with the ingest secret injected server-side. |

### How Brickston UI consumes it

`apps/web/lib/api/hooks/use-items-hub.ts` exposes TanStack Query hooks:

```tsx
import { useEntityActivity, usePropertyActivity, useItemsHubSearch } from "@/lib/api/hooks/use-items-hub";

// On a property page:
<ItemsHubActivity propertyCode={property.name} sinceDays={180} />
```

The `ItemsHubActivity` component (`apps/web/components/domain/items-hub-activity.tsx`) renders a two-column "Activity" tab with open tasks + decisions + recent mentions. Graceful fallback: if the backend is missing `BRICKSTON_ITEMS_HUB_DATABASE_URL`, the tab shows a "not configured" hint instead of 500-ing.


## Writing into Items Hub (for completeness)

Any new source follows the convergent pattern:

  1. A trigger runs (Anthropic cloud / Windows Claude Desktop / Mac login-item).
  2. The trigger POSTs JSON to `/api/ingest/<source>` with `x-ingest-secret`.
  3. The row lands in `items.inbox_items` with a source-specific idempotency key.
  4. The Mac-local scheduled task `items-hub-extract` drains the queue every 10 minutes.
  5. Normalized rows land in `items.tasks` / `commitments` / `decisions` / `entities` / `documents`, plus a row in `items.embeddings`.
  6. Dashboard + consumers see the new data on the next query.
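Step 3 depends on a source-specific idempotency key so a re-delivered record upserts instead of duplicating. The actual scheme lives in each ingest route; a plausible sketch (function name and key format are hypothetical) hashes the source plus the upstream record's stable id:

```python
import hashlib

def idempotency_key(source: str, external_id: str) -> str:
    """Hypothetical key scheme: stable per (source, external id), so the same
    upstream record re-POSTed maps to the same inbox row."""
    digest = hashlib.sha256(f"{source}:{external_id}".encode()).hexdigest()
    return f"{source}:{digest[:16]}"

k1 = idempotency_key("granola", "meeting-2024-01-05T09:00")
k2 = idempotency_key("granola", "meeting-2024-01-05T09:00")
assert k1 == k2  # deterministic: re-ingesting the same record is a no-op
```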

New source checklist (for future expansion):

  1. Add a row to `app/lib/sources.ts` with `status: 'designed'` and the next free `sort_order`.
  2. Run `npm run bootstrap:sources` (upserts into `items.source_config`).
  3. Add a new `/api/ingest/<source>/route.ts`, or reuse `/api/ingest/market` if the payload is a summary-style blob.
  4. Add a payload kind clause to the extractor's prompt dispatch in `app/lib/extractor.ts`.
  5. Register a trigger (cloud scheduled task, Windows Claude Desktop, or LaunchAgent) that POSTs to the new endpoint.
  6. The first successful POST auto-promotes the card to `live`.
  7. Update `docs/DATA_DICTIONARY.md` with the new source's row + payload contract.

## When things look broken


## See also