iter1 phase5: second-model review (Plan/sonnet)

Plan subagent with model:sonnet reviewed Phase 0-4 with raw artifacts (no curation per dev process). Two findings: 1. H.264 SSIM precedent framing — "mirrors fresnel iter1 exactly" overstates the cross-host link. Two independent empirical data points on two different rkvdec generations (RK3399 0.6431 vs RK3588 0.6676) that converge, not a cross-verified anchor. Reframe in iter1_close.md. 2. C7 deferral framing — "rig blocker" misframes a deliberate scope boundary (headless-ssh by design) as a rig deficit. Reframe in iter1_close.md. Both findings are pure framing changes (no code/data/struct claim), so no test-verify gate. Per empirical-over-theoretical guardrail, adoption is direct. Reviewer also confirmed the codec fragmentation rationale and kernel-agent-vs-backend issue assignment as correct (no cross-codec coupling problem). Plan sound, proceed to Phase 6 with the two amendments applied. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 07:26:30 +00:00
parent 73726112e7
commit ad913dc14b
1 changed files with 37 additions and 0 deletions
@@ -0,0 +1,37 @@
+# Phase 5 — Second-model review (iter1)
+
+Reviewed 2026-05-16 by `Plan` subagent with `model: sonnet`. Per `feedback_dev_process.md` Phase 5 doctrine ("Claude does not curate the artifact going to the reviewer"), the full Phase 0/1/2/3/4 markdown was pasted verbatim — no summarization, no pre-framing. Reviewer was given the empirical-over-theoretical guardrail (`feedback_review_empirical_over_theoretical`) and the "empty review is a strong signal, don't pad" guardrail explicitly.
+
+Reviewer findings:
+
+## Finding 1 — H.264 SSIM precedent chain needs a flag
+
+> The fresnel iter1 0.6431 datum is used to justify "mirrors fresnel, known drift." That's empirically grounded. However fresnel ran on RK3399 (rkvdec v1-era), ampere runs on RK3588 (rkvdec v2). The fact that the two drift values are close (0.643 vs 0.668) is reassuring, but it doesn't rule out a substrate question: if the two rkvdec generations differ in their deblocking or entropy-decode precision, you'd expect them to converge on similar-but-not-identical drift vs libavcodec SW — which is exactly what you see. So "mirrors fresnel" is a correct characterization for the purpose of iter1, but I'd suggest the iter1_close.md note explicitly records that this is two independent empirical data points on two different SoCs converging, not a single cross-verified anchor. The distinction matters if iter2+ starts investigating the drift.
+
+**Verdict: adopt.** This is a framing tightening that aligns with the Phase 0 carryover rule ("data does not carry across hosts; numbers are in-session-acquired only"). Phase 3 + Phase 4 language understates that distinction — I called it "replicates fresnel exactly" which over-promises a cross-verified anchor when it's actually two independent measurements on different chips that happen to converge.
+
+**Phase 6 amendment**: in `iter1_close.md` H.264 drift section, replace "mirrors fresnel iter1 exactly" with "is an independent RK3588 empirical observation, separate from fresnel iter1's RK3399 observation; the two converge but neither is the other's reference anchor." Also in `phase4_plan.md` Section 1's H.264 row, mark the "fresnel iter1 0.6431" cell as "independent RK3399 data, convergent not anchoring."
+
+No measurement / code change required to adopt — pure framing in the iter1 close docs.
+
+## Finding 2 — C7 deferral framing is subtly wrong
+
+> Phase 2 lists "no Wayland session active on ampere" as the rig blocker for C7. […] The right close language is "C7 requires a compositor; iter1 is headless by design; C7 moves to whichever iteration first runs a compositor on ampere, not to a future iter1 rig-fix." Calling it a "rig blocker" implies it should have been fixed before Phase 3. It shouldn't — a Wayland session would have introduced scope creep with no benefit to C1-C6. The deferral decision is correct; just reframe it as a scope boundary, not a missing prerequisite.
+
+**Verdict: adopt.** Phase 3's "Rig is incomplete" wording suggests the campaign owes a future rig-prep iteration to close C7. Reviewer is right that the *decision* (not to prep Wayland) is correct and the *language* should reflect it as a scope boundary, not a deficit.
+
+**Phase 6 amendment**: in `iter1_close.md`, C7 section title becomes "C7 — out of iter1 scope (headless by design)" instead of "C7 deferred — Wayland prerequisite missing." Reword the body to say "C7 needs a compositor on ampere; iter1 ran headless-ssh by design (no UX requirement); C7 anchors to whichever future iteration first runs a graphical session — likely the firefox-consumer iteration that follows AV1 backend iter39, or a UX-sanity sub-iteration in its own right." No need to file a separate "rig prep" task — it's not a prep task, it's a scope boundary.
+
+## Informational note (no action)
+
+> The three deferred codecs (HEVC, VP9, AV1) each block for independent, distinct reasons: HEVC is a kernel OOPS in a specific RPS assembly path; VP9 is absent from the rkvdec format registration entirely; AV1 is a backend fd-slot count limit. None of these share a fix path with each other or with the iter1 codecs. The iter2/3/4 assignment to kernel-agent vs backend repo is also correct: HEVC and VP9 are kernel module changes (right team), AV1 is userspace backend (right repo). No coupling problem.
+
+Acknowledged, no change needed. The fragmentation rationale is now reviewer-confirmed.
+
+## Summary
+
+Reviewer's overall verdict: **plan is sound, proceed to Phase 6.** Two framing amendments adopted (both apply to Phase 6 artifacts, not to Phase 4 algorithmic plan). Per the empirical-over-theoretical guardrail, neither amendment is a code/data/struct claim, so no test-verify gate triggers. Per global CLAUDE.md "Reviews are never skippable" plus "An empty review (no findings) is itself a strong signal" — this review wasn't empty but the two findings strengthen rather than redirect the plan, which is the same value-of-outside-look the rule was written to capture.
+
+## Phase 5 close
+
+Review run, findings recorded, both adopted. Ready for Phase 6 implementation (with the two framing amendments applied to the close artifact).