phase0 review: tighten phase 2 row + add Q9, Q10, sharpen Q6 #1

Merged
marfrit merged 1 commits from review/mcp-phase-questions into main 2026-05-10 11:00:35 +00:00
Collaborator

What this is

A one-time PR-flow exercise (the project is otherwise non-PR per CLAUDE.md §11) opened for readable feedback on the open questions surfaced by review of 013c625 (the MCP phase amendment). Discussion forum is the PR review UI rather than the Claude Code transcript.

Diff is the questions

Single file edit, docs/PHASE0.md. Three findings from the review:

  • §11 Phase 2 row — spelled out two scope items the original row left implicit: system-prompt rewrite to declare the tools schema (Phase 0's CMD: contract is hard-coded into the prompt), and safety.lua extension to gate tool calls (per Q8).
  • §13 Q6 — sharpened to flag that choosing "retire CMD:" is a §3 invariant amendment, not a Phase 2 internal call.
  • §13 Q9 (new) — MCP system-prompt augmentation locus: static block in broker.lua / per-request assembly from connected servers / hybrid. Real architectural call with token-cost tradeoff.
  • §13 Q10 (new) — tool-call streaming vs the Phase 1 SSE substrate. Phase-ordering question: Phase 2 on the blocking Phase 0 broker (refit later), or Phase 1 SSE first so tool-call deltas stream from day one?

What kind of feedback

  • Inline comments on the diff for individual question wording, scope, or alternatives.
  • Top-level comments for cross-question concerns or PR-flow process feedback (this being a one-time exercise, the meta-process is itself up for review).

If the questions land as written, merge — they take effect as §13 open questions for the eventual Phase 2 doc to resolve. If they need rewording, request changes; I'll iterate. If they're the wrong questions entirely, close — and we discuss what the right ones are.

Phase 0 implementation per the locked manifest is the next move regardless; this PR doesn't block it.

## What this is A one-time PR-flow exercise (the project is otherwise non-PR per CLAUDE.md §11) opened for **readable feedback** on the open questions surfaced by review of `013c625` (the MCP phase amendment). Discussion forum is the PR review UI rather than the Claude Code transcript. ## Diff is the questions Single file edit, `docs/PHASE0.md`. Three findings from the review: - **§11 Phase 2 row** — spelled out two scope items the original row left implicit: system-prompt rewrite to declare the tools schema (Phase 0's `CMD:` contract is hard-coded into the prompt), and `safety.lua` extension to gate tool calls (per Q8). - **§13 Q6** — sharpened to flag that choosing "retire `CMD:`" is a §3 invariant amendment, not a Phase 2 internal call. - **§13 Q9** (new) — MCP system-prompt augmentation locus: static block in `broker.lua` / per-request assembly from connected servers / hybrid. Real architectural call with token-cost tradeoff. - **§13 Q10** (new) — tool-call streaming vs the Phase 1 SSE substrate. Phase-ordering question: Phase 2 on the blocking Phase 0 broker (refit later), or Phase 1 SSE first so tool-call deltas stream from day one? ## What kind of feedback - Inline comments on the diff for individual question wording, scope, or alternatives. - Top-level comments for cross-question concerns or PR-flow process feedback (this being a one-time exercise, the meta-process is itself up for review). If the questions land as written, merge — they take effect as `§13` open questions for the eventual Phase 2 doc to resolve. If they need rewording, request changes; I'll iterate. If they're the wrong questions entirely, close — and we discuss what the right ones are. Phase 0 implementation per the locked manifest is the next move regardless; this PR doesn't block it.
claude-noether added 1 commit 2026-05-10 06:23:11 +00:00
Captures three findings from the review of 013c625 ("phase0 amendment:
insert MCP phase 2"). Opening as a PR rather than direct-to-main: the
non-PR-flow convention works fine for autonomous work, but feedback-
required iteration needs a readable medium that isn't the Claude Code
transcript.

§11 phase 2 row: spell out two scope items the original row left implicit —
the system-prompt rewrite to declare the tools schema (Phase 0's `CMD:`
contract is hard-coded into the prompt) and `safety.lua` extension to
gate tool calls (per Q8).

§13 Q6: explicit note that choosing "retire `CMD:`" requires a §3
invariant amendment in the same commit — keeps the substrate-vs-phase
boundary honest. Adds (§3 if retiring) to the impact column.

§13 Q9 (new): MCP system-prompt augmentation locus — static block in
broker.lua / per-request assembly from connected servers / hybrid.
Real architectural call with token-cost tradeoff per option.

§13 Q10 (new): tool-call streaming vs the Phase 1 SSE substrate —
phase-ordering question. Either Phase 2 lands on the blocking Phase 0
broker and refits when SSE arrives, or Phase 1 SSE moves before MCP
so tool-call deltas stream from day one.
Author
Collaborator

Architect review — Sonnet 4.6 (outside-eyes pass)

This PR is a net improvement: Q6's sharpening is exactly right, and Q9/Q10 are questions that needed to be on paper before Phase 2 design work starts. A few positions and one structural objection below.


Q6 — CMD: coexistence strategy

Pick: both survive into Phase 2; retire CMD: no earlier than Phase 3, and only if the tool-call path fully covers the use case.

Rationale: CMD: is zero-infrastructure — it works with any model that can follow text instructions, no tool-calling support required. MCP tool calls require the model to emit a structured JSON object in a specific schema, which local quantized models handle unreliably at small parameter counts. Retiring CMD: in Phase 2 would break the fast/local model path. The PR's note that retiring requires a §3 invariant amendment in the same commit is correct and important — but the right Phase 2 answer is coexistence, not retirement. Revisit at Phase 3 when the Norris planner needs a single dispatch path.

One thing Q6 still doesn't surface: if both mechanisms coexist, the system prompt in Phase 2 must instruct the model when to use each. That framing question should be added to Q6's text. "Both" is not a complete answer without a dispatch rule.


Q9 — System-prompt augmentation locus

Pick: per-request assembly, scoped to the connected server set at request time.

Rationale: static drifts from server reality and is worse than the token cost; hybrid increases complexity without a clear win for a Phase 2 first cut. The real question is scope of assembly: tool schemas from all connected servers on every turn, or only servers relevant to the current task? For Phase 2, all connected is fine — the server set is small and known. The token cost is bounded by the number of connected MCP servers, and lmcp's schema surface area is not large.

Where the PR's Q9 framing is weak: it treats this as purely a broker.lua vs mcp.lua question. It's actually a context.lua question too — the assembled tool block is logically part of the system message, which §6 says is "prepended on every request and not stored as a history turn." Q9 should name context.lua in its impact column, because whoever owns system_prompt assembly owns this feature.


Q10 — Tool-call streaming vs Phase 1 SSE ordering

Pick: require Phase 1 SSE substrate first; do not implement Phase 2 tool calls on the blocking broker.

Rationale: tool-call responses from the model arrive as streamed delta chunks in the tool_calls field (OpenAI wire format). Consuming them correctly without SSE requires buffering the entire completion before parsing — which defeats the purpose and makes the Phase 1 refit a full rewrite of the tool-call parsing path, not a simple substrate swap. The Phase 0 blocking broker was designed for text responses; bolting tool-call delta reassembly onto it creates technical debt that the refit will have to undo.

The framing Q10 is missing, as flagged in the task brief: do tool calls even need streaming for the first cut? They don't — you can set stream: false for Phase 2 tool-call turns only, collect the full response, parse tool_calls[] out of the non-streamed completion, and add streaming later. This gives you Phase 2 tool calls on the Phase 0 blocking broker without the delta-reassembly debt. That's a third option Q10 should enumerate before the phase-ordering question is answered.

So the real answer is: add the non-streaming option to Q10, then pick it for the Phase 2 first cut, with SSE streaming of tool-call deltas deferred to Phase 1 (or a Phase 2.1 pass). This also means Phase 2 can proceed independently of Phase 1 — the ordering dependency Q10 raises goes away.


Architectural risk: "system prompt rewrite" wording in the Phase 2 row

The Phase 2 row now says "system prompt rewrite to declare the tools schema." This is overloaded. In OpenAI-compatible APIs, tool schemas are declared in the tools field of the request body — not in the system prompt. Injecting tool descriptions into the system message is a fallback for models that don't understand the tools field natively. The Phase 2 row should distinguish:

  • Structured path: pass tools: [...] in the request body; the model emits tool_calls[] objects (standard, works with llama.cpp when the model supports it)
  • Fallback path: inject tool descriptions as system-prompt text; the model emits CMD:-style or JSON-in-text tool invocations (works with any model)

As written, "system prompt rewrite" implies only the fallback path, which underspecifies Phase 2 and conflicts with Q9's assumption that tool schema augmentation is a system-prompt concern. This needs a clarifying sentence in the Phase 2 row or a Q11.

Q8 × Q9 interaction — not surfaced

If tool schemas are declared per-request (Q9's correct answer), and the gate in safety.lua can revoke a tool at runtime (Q8), then a revoked tool must be omitted from the assembled schema block in that request, not just ignored when the model calls it. Otherwise the model will still see the tool as available and call it; the gate then has to reject at invocation time, which is a worse UX than never offering the tool. This interaction between Q8's gate and Q9's assembly locus should be noted in at least one of the two questions.


Merge recommendation

Request changes — minor. The PR is directionally correct and the questions it adds are necessary. Required before merge:

  1. Add a third option to Q10: stream: false for tool-call turns, Phase 1 SSE not required.
  2. Add a note to Q9's impact column: context.lua (system_prompt assembly).
  3. Either add a sentence to the Phase 2 row clarifying structured (tools field) vs fallback (system-prompt injection) paths, or open Q11 for it.
  4. Add one sentence to Q6 or Q8 noting the gate × schema-assembly interaction.

Items 2–4 are one-liners. Item 1 changes the answer to Q10's phase-ordering question and is architecturally load-bearing.


Posted by claude-noether acting as outside-eyes architect (Sonnet 4.6), not as PR author.

## Architect review — Sonnet 4.6 (outside-eyes pass) This PR is a net improvement: Q6's sharpening is exactly right, and Q9/Q10 are questions that needed to be on paper before Phase 2 design work starts. A few positions and one structural objection below. --- ### Q6 — CMD: coexistence strategy **Pick: both survive into Phase 2; retire `CMD:` no earlier than Phase 3, and only if the tool-call path fully covers the use case.** Rationale: `CMD:` is zero-infrastructure — it works with any model that can follow text instructions, no tool-calling support required. MCP tool calls require the model to emit a structured JSON object in a specific schema, which local quantized models handle unreliably at small parameter counts. Retiring `CMD:` in Phase 2 would break the fast/local model path. The PR's note that retiring requires a §3 invariant amendment in the same commit is correct and important — but the right Phase 2 answer is coexistence, not retirement. Revisit at Phase 3 when the Norris planner needs a single dispatch path. One thing Q6 still doesn't surface: if both mechanisms coexist, the system prompt in Phase 2 must instruct the model when to use each. That framing question should be added to Q6's text. "Both" is not a complete answer without a dispatch rule. --- ### Q9 — System-prompt augmentation locus **Pick: per-request assembly, scoped to the connected server set at request time.** Rationale: static drifts from server reality and is worse than the token cost; hybrid increases complexity without a clear win for a Phase 2 first cut. The real question is scope of assembly: tool schemas from *all* connected servers on every turn, or only servers relevant to the current task? For Phase 2, all connected is fine — the server set is small and known. The token cost is bounded by the number of connected MCP servers, and lmcp's schema surface area is not large. Where the PR's Q9 framing is weak: it treats this as purely a `broker.lua` vs `mcp.lua` question. It's actually a `context.lua` question too — the assembled tool block is logically part of the system message, which §6 says is "prepended on every request and not stored as a history turn." Q9 should name `context.lua` in its impact column, because whoever owns `system_prompt` assembly owns this feature. --- ### Q10 — Tool-call streaming vs Phase 1 SSE ordering **Pick: require Phase 1 SSE substrate first; do not implement Phase 2 tool calls on the blocking broker.** Rationale: tool-call responses from the model arrive as streamed delta chunks in the `tool_calls` field (OpenAI wire format). Consuming them correctly without SSE requires buffering the entire completion before parsing — which defeats the purpose and makes the Phase 1 refit a full rewrite of the tool-call parsing path, not a simple substrate swap. The Phase 0 blocking broker was designed for text responses; bolting tool-call delta reassembly onto it creates technical debt that the refit will have to undo. The framing Q10 is missing, as flagged in the task brief: **do tool calls even need streaming for the first cut?** They don't — you can set `stream: false` for Phase 2 tool-call turns only, collect the full response, parse `tool_calls[]` out of the non-streamed completion, and add streaming later. This gives you Phase 2 tool calls on the Phase 0 blocking broker without the delta-reassembly debt. That's a third option Q10 should enumerate before the phase-ordering question is answered. So the real answer is: **add the non-streaming option to Q10, then pick it for the Phase 2 first cut, with SSE streaming of tool-call deltas deferred to Phase 1 (or a Phase 2.1 pass).** This also means Phase 2 can proceed independently of Phase 1 — the ordering dependency Q10 raises goes away. --- ### Architectural risk: "system prompt rewrite" wording in the Phase 2 row The Phase 2 row now says "system prompt rewrite to declare the tools schema." This is overloaded. In OpenAI-compatible APIs, tool schemas are declared in the `tools` field of the request body — not in the system prompt. Injecting tool descriptions into the system message is a fallback for models that don't understand the `tools` field natively. The Phase 2 row should distinguish: - **Structured path**: pass `tools: [...]` in the request body; the model emits `tool_calls[]` objects (standard, works with llama.cpp when the model supports it) - **Fallback path**: inject tool descriptions as system-prompt text; the model emits `CMD:`-style or JSON-in-text tool invocations (works with any model) As written, "system prompt rewrite" implies only the fallback path, which underspecifies Phase 2 and conflicts with Q9's assumption that tool schema augmentation is a system-prompt concern. This needs a clarifying sentence in the Phase 2 row or a Q11. ### Q8 × Q9 interaction — not surfaced If tool schemas are declared per-request (Q9's correct answer), and the gate in `safety.lua` can revoke a tool at runtime (Q8), then a revoked tool must be *omitted from the assembled schema block* in that request, not just ignored when the model calls it. Otherwise the model will still see the tool as available and call it; the gate then has to reject at invocation time, which is a worse UX than never offering the tool. This interaction between Q8's gate and Q9's assembly locus should be noted in at least one of the two questions. --- ### Merge recommendation **Request changes — minor.** The PR is directionally correct and the questions it adds are necessary. Required before merge: 1. Add a third option to Q10: `stream: false` for tool-call turns, Phase 1 SSE not required. 2. Add a note to Q9's impact column: `context.lua` (system_prompt assembly). 3. Either add a sentence to the Phase 2 row clarifying structured (`tools` field) vs fallback (system-prompt injection) paths, or open Q11 for it. 4. Add one sentence to Q6 or Q8 noting the gate × schema-assembly interaction. Items 2–4 are one-liners. Item 1 changes the answer to Q10's phase-ordering question and is architecturally load-bearing. --- _Posted by claude-noether acting as outside-eyes architect (Sonnet 4.6), not as PR author._
marfrit merged commit fcfc23eef2 into main 2026-05-10 11:00:35 +00:00
Sign in to join this conversation.