
/ask retrieval + Q&A logging fix

Generated 2026-05-21 ~08:50 ET by background sub-agent. Both fixes shipped and were verified against the live container.

STATUS: GREEN. Both fixes deployed and tested.

Bug 1 (NULL embedding): FIXED. Container logs show zero "tp3 log failed" errors after deploy, and 5 fresh ask_* rows landed across the 3 verify queries.

Bug 2 (retrieval grounding): SIGNIFICANTLY IMPROVED. Sleep questions now surface real biometric data (avg 5h42m over 7 nights). The Kaggle/Bidet question now references "your Bidet AI submission" instead of the prior blanket "I don't have any information about a Bidet contest on Kaggle".

What changed (4 deltas, all in tp3_memory_api.py)

1. Bug 1: embed before inserting Q&A log row

The /ask handler logs every Q+A back to tp3_memories_local for digital-twin growth. The column tp3_embedding is NOT NULL. The old code's INSERT skipped the embedding column entirely → silent fail on every single call. Fix mirrors _insert_tp3()'s pattern: call _embed(document), fall back to zero-vector + needs_embed=true if Ollama is down, then INSERT all four columns. Embedding failure is fail-loud-logged but does NOT block the Q&A row.
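A minimal sketch of the embed-then-insert pattern described above, not the actual handler code. The embedding dimension, the tp3_id column name, and the helper names embed_or_zero and build_log_row are assumptions for illustration; only tp3_document, tp3_embedding, and needs_embed come from this report.

```python
from __future__ import annotations

import logging
from typing import Callable, Sequence

log = logging.getLogger("tp3_memory_api")

EMBED_DIM = 768  # assumed; the real value depends on the embedding model in use


def embed_or_zero(
    document: str,
    embed: Callable[[str], Sequence[float]],
    dim: int = EMBED_DIM,
) -> tuple[list[float], bool]:
    """Return (vector, needs_embed); fall back to a zero vector on any failure."""
    try:
        vector = list(embed(document))
        if len(vector) != dim:
            raise ValueError(f"expected {dim} dims, got {len(vector)}")
        return vector, False
    except Exception as exc:  # backend down, timeout, malformed response
        # Fail loud in the logs, but never block the Q&A row itself.
        log.error("embed failed, storing zero vector: %r", exc)
        return [0.0] * dim, True


def build_log_row(row_id: str, document: str, embed: Callable) -> dict:
    """Assemble all four column values so the INSERT never omits the embedding."""
    vector, needs_embed = embed_or_zero(document, embed)
    return {
        "tp3_id": row_id,  # hypothetical column name
        "tp3_document": document,
        "tp3_embedding": vector,
        "needs_embed": needs_embed,
    }
```

Rows written with needs_embed=true can be picked up later by a re-embedding pass, so an embedding-backend outage costs retrieval quality for those rows rather than losing them.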

2. Bug 2a: bump /ask top_k from 5 to 12

RRF was returning low-information OMI snippets ("I got a contest.", 12 chars) at top ranks because they technically match query terms but carry no information. Widening the window from 5 to 12 lets longer, higher-signal rows reach the LLM. Prompt slot bumped 2000 to 3000 chars to actually fit them.
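To illustrate why the window size matters, here is a generic reciprocal rank fusion sketch. It is not the fusion code from tp3_memory_api.py: the function name rrf_fuse, the smoothing constant k=60 (a commonly used default), and the tie-break on document id are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(channels: list[list[str]], top_k: int, k: int = 60) -> list[str]:
    """Fuse ranked id lists: each channel adds 1 / (k + rank) to a doc's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in channels:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return ordered[:top_k]


# Illustrative ids only: a short snippet can outrank a longer document on term
# match alone, so a small top_k may cut off the document the LLM actually needs.
channels = [["snippet_a", "snippet_b", "long_doc"], ["snippet_b", "snippet_a", "other"]]
narrow = rrf_fuse(channels, top_k=2)
wide = rrf_fuse(channels, top_k=4)
```

With top_k=2 only the two snippets survive; widening to 4 lets long_doc through without changing the ranking itself.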

3. Bug 2b: health context inject (new helper _health_context_for_question)

RAG over tp3_memories_local cannot surface aggregated health data — the docs are 30-180s per-row biometric blobs (sleep_stage, step_count) that always lose to longer documents in RRF/FTS. The /sleep/report endpoint already composes a prose summary but /ask doesn't see it. New helper detects sleep/heart/step/biometric keywords, then aggregates the last 7 days of sleep_session + step_count rows directly from Postgres into a compact summary string. Injected as a dedicated RECENT HEALTH block (same structure as the existing calendar inject). Fast (single query, ~50-100ms), bounded, fail-loud.
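The keyword gate and summary step could look roughly like the sketch below. It leaves out the Postgres query and works on already-fetched rows. The keyword list, function names, and output wording are assumptions, and the sample durations in the usage lines are made up for illustration.

```python
from __future__ import annotations

import re
from datetime import date

# Assumed keyword list; the real helper may match more terms.
HEALTH_KEYWORDS = re.compile(r"\b(sleep|slept|heart|steps?|biometric)\b", re.IGNORECASE)


def wants_health_context(question: str) -> bool:
    """True when the question mentions a health-related keyword."""
    return HEALTH_KEYWORDS.search(question) is not None


def format_minutes(minutes: int) -> str:
    hours, mins = divmod(minutes, 60)
    return f"{hours}h{mins:02d}m"


def summarize_sleep(nights: dict[date, int]) -> str:
    """nights maps wake-morning date to minutes asleep."""
    if not nights:
        return "RECENT HEALTH: no sleep sessions in window."
    avg = round(sum(nights.values()) / len(nights))
    best = max(nights, key=nights.get)
    worst = min(nights, key=nights.get)
    return (
        f"RECENT HEALTH (last {len(nights)} nights): "
        f"avg {format_minutes(avg)}; "
        f"best {best.isoformat()} {format_minutes(nights[best])}; "
        f"worst {worst.isoformat()} {format_minutes(nights[worst])}"
    )


# Illustrative data, not real biometric rows.
sample = {date(2026, 1, 1): 420, date(2026, 1, 2): 300, date(2026, 1, 3): 360}
block = summarize_sleep(sample) if wants_health_context("how did I sleep?") else ""
```

Precomputing the aggregate keeps the injected block short and bounded, so the LLM does not have to derive an average from dozens of per-row blobs.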

4. Bug 2c: RRF improvements (2 channel patches)

FTS channel OR-fallback: websearch_to_tsquery is implicit-AND, so "what is the status of the Bidet contest on Kaggle?" → 'status' & 'bidet' & 'contest' & 'kaggl' → 0 docs match all four. Added an OR-join fallback for when the strict AND query returns empty.

New longform channel (5th in RRF) — ILIKE-matches salient tokens against tp3_document, filters to docs ≥ 400 chars from high-signal sources only (phone_notification, email_*, omi_summary, ai_radar_feed, gmail, calendar, ingest, ask). Two-pass: AND first (finds the gold), OR fallback ranked by (match_score DESC, recency DESC) so a 2026-05-09 Kaggle prize tree notification matching 2/3 tokens beats a fresh row matching only 1/3.
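The two-pass logic of the longform channel can be sketched in memory as below. This is an illustration under assumptions rather than the shipped SQL: the Doc type and function names are hypothetical, the source-type filter is omitted, substring matching stands in for ILIKE, and ordering the AND pass by recency is a guess (the report only specifies the OR-fallback ordering).

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date

MIN_CHARS = 400  # from the report: the longform channel keeps docs >= 400 chars


@dataclass
class Doc:
    doc_id: str
    text: str
    created: date


def match_score(doc: Doc, tokens: list[str]) -> int:
    """Count how many tokens appear in the document, case-insensitively."""
    lowered = doc.text.lower()
    return sum(1 for t in tokens if t.lower() in lowered)


def longform_search(docs: list[Doc], tokens: list[str], limit: int = 25) -> list[Doc]:
    if not tokens:
        return []
    candidates = [d for d in docs if len(d.text) >= MIN_CHARS]
    # Pass 1 (AND): documents containing every token.
    pool = [d for d in candidates if match_score(d, tokens) == len(tokens)]
    if not pool:
        # Pass 2 (OR fallback): documents containing at least one token.
        pool = [d for d in candidates if match_score(d, tokens) > 0]
    # match_score DESC, then recency DESC.
    pool.sort(key=lambda d: (-match_score(d, tokens), -d.created.toordinal()))
    return pool[:limit]


# Illustrative documents: an older row matching 2 of 3 tokens outranks a newer
# row matching 1 of 3, and a short row is filtered out by length.
tokens = ["bidet", "kaggle", "prize"]
older = Doc("older", "kaggle prize tree " * 30, date(2026, 5, 9))
newer = Doc("newer", "kaggle " * 80, date(2026, 5, 20))
short = Doc("short", "bidet kaggle prize", date(2026, 5, 21))
ranked = [d.doc_id for d in longform_search([newer, short, older], tokens)]
```

Sorting on match count before recency is what stops a fresh but weakly matching row from displacing an older row that covers more of the question.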

Evidence — 3 verify queries, actual JSON responses post-deploy

Q1: "what did I do today?"

{
  "ok": true,
  "question": "what did I do today?",
  "answer": "Today was the 7th graders' last day, you had make-up exams,
             conference reports and 6th/7th grade report cards were all due,
             and the 6th grade reward trip ran from 9:30 to 1.",
  "model": "claude-sonnet-4-6",
  "spoken_chars": 173,
  "primary_path": "claude"
}

Grounding: from the calendar inject (already working pre-fix). Calendar context names all 4 of today's school items.

Q2: "what is the status of the Bidet contest on Kaggle?"

{
  "ok": true,
  "question": "what is the status of the Bidet contest on Kaggle?",
  "answer": "I still don't have real-time access to Kaggle, so you'd need
             to check kaggle.com directly for the current standings and deadline.
             That said, based on your memories, it looks like you've been actively
             working on your Bidet AI submission — want me to help with anything
             specific on that front?",
  "model": "claude-sonnet-4-6",
  "spoken_chars": 291,
  "primary_path": "claude"
}

Grounding: Claude is now surfacing "your Bidet AI submission" directly from retrieved memories. Pre-fix the answer was: "I don't have any information about a Bidet contest on Kaggle in my memories or data." The conservative "no real-time Kaggle access" caveat is appropriate — the memories are from past ingest, not live Kaggle API.

Retrieval verified: 12 hits now include the BidetAi contest essay (rank 4), Mark's BidetAi email to himself (rank 5), and the "Kaggle prize tree verified" phone notification (rank 12).

Q3: "summarize my last week of sleep"

{
  "ok": true,
  "question": "summarize my last week of sleep",
  "answer": "Your 7-night average is 5 hours 42 minutes — well under the
             7 to 8 hours recommended. May 18th was the rough one at only
             2 hours 41 minutes, and your best night was May 16th at
             7 hours 43 minutes. Tonight would be a good one to prioritize
             some extra rest.",
  "model": "claude-sonnet-4-6",
  "spoken_chars": 255,
  "primary_path": "claude"
}

Grounding: real biometric data from the new _health_context_for_question() inject. 7-night average + named best/worst nights with actual durations. Pre-fix the answer was: "I don't have access to your sleep data." The system prompt now contains a RECENT HEALTH (last 7 days, from TP3 biometric ingest): block with all 7 wake-mornings' duration + stage breakdown.

TP3 log rows landing — proves Bug 1 fixed

ask_e494b4cdd84440049a31f4b966f61baa | Q: summarize my last week of sleep
ask_5de7bd3449514991a9a8e721f4a749fa | Q: what is the status of the Bidet contest on Kaggle?
ask_9fbddf4f96674f0d9fc787b95419d516 | Q: what did I do today?
ask_a01b22cdf7bd40558d473e1b45df1845 | Q: summarize my last week of sleep
ask_fc53a0a39e624e65a2c602511964a520 | Q: what is the status of the Bidet contest on Kaggle?

Five rows from the verify runs all landed cleanly. Container logs show zero NULL-embedding errors post-deploy.

Latency

Before: retrieval mode=rrf hits=5 elapsed_ms=3206.9 (single-pass, top-5)

After: retrieval mode=rrf hits=12 elapsed_ms=1218-1629 ms (5 channels including new longform, top-12)

Latency went DOWN on net, despite the wider top-k and extra channel. The longform channel's two-pass AND→OR is fast (indexed); the per-channel widening was already covered by per_channel_k=25.

Files touched

Discipline notes