iter2 phase5: second-model review, 1 finding rebut, 3 findings adopt

Plan subagent (sonnet) reviewed Phase 0-4 with raw artifacts. Four
findings:

1. GstH265SPSEXT data-source gap — REBUT with empirical evidence.
   Reviewer confused GstH265SPSEXT (SPS extension params, separate
   struct) with GstH265ShortTermRefPicSetExt (RPS-extended struct,
   contains the 4 fields they flagged as missing). Both
   ShortTermRefPicSet AND ShortTermRefPicSetExt ARE in the vendored
   gsth265parser.h. Direct read of fetched header confirmed.
2. Per-fd storage for has_hevc_ext_sps_rps — ADOPT. Mirror iter38
   pattern of storing rkvdec + hantro fds separately. Add explicit
   driver_kind=='r' gate for human-readable intent.
3. SPS NAL caching strategy — ADOPT, critical. SPS NALs only arrive
   at IDR frames; per-frame walk would submit zero-filled RPS for
   non-IDR frames and re-OOPS. Parse-and-cache at first IDR, reuse
   on subsequent frames.
4. C3 prediction caveat — ADOPT. Anchor SHA per-clip (HEVC HW vs
   HEVC SW) not cross-codec; iter1's shared SHA across codecs was
   lucky empirical convergence, not guaranteed.

Three Phase 4 amendments applied as appendix to phase4_plan_iter2.md:
  - §Step 3 — per-driver-kind probe storage (pair instead of scalar)
  - §Step 4 — explicit two-struct mapping table; SPS parse-and-cache
  - §Phase 7 predictions C3 — anchor per-clip

Risk register gains risk #6 (SPS absent on non-IDR frames).

Per feedback_review_empirical_over_theoretical: the Finding #1
rebut was done by reading the actual vendored header file content,
not by source-reading the reviewer's argument. Empirical evidence
won, as the memory rule requires.

Plan sound with amendments. Phase 6 can proceed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
2026-05-16 08:57:06 +00:00
parent a575e37190
commit d2ff661554
2 changed files with 159 additions and 0 deletions
+57
View File
@@ -172,3 +172,60 @@ Commit message: "README: document vendored GStreamer H.265 parser."
## Phase 4 close
5 commits, 8.7k LOC, 5 file modifications. Phase 7 predictions concrete (C1-C8 mapped). 5 risks named, 4 mitigated. Ready for Phase 5 second-model review.
---
## Phase 5 amendments (applied 2026-05-16 post-review)
### Amendment §Step 3 — per-driver-kind probe storage
Replace single `bool has_hevc_ext_sps_rps;` field on `struct request_data` with a pair:
```c
bool has_hevc_ext_sps_rps_rkvdec;
bool has_hevc_ext_sps_rps_hantro;
```
Probe runs at the time each fd is opened (init for primary, alt-probe for secondary per iter38). Storage parallels `video_fd_rkvdec` / `video_fd_hantro`. h265_set_controls consults only the rkvdec entry (HEVC routes through rkvdec only).
§Step 4 control-submission guard becomes explicit:
```c
if (driver_data->driver_kind == 'r' && driver_data->has_hevc_ext_sps_rps_rkvdec) {
/* … populate + submit EXT_SPS_*_RPS controls … */
}
```
The `driver_kind == 'r'` clause is logically redundant given the probe-result gate but provides reviewer-flagged human-readable intent.
### Amendment §Step 4 — explicit two-struct usage + SPS caching
The GStreamer reference `fill_ext_sps_rps` reads from BOTH `GstH265ShortTermRefPicSet` AND `GstH265ShortTermRefPicSetExt` (both structs are in the vendored gsth265parser.h — see Phase 5 Finding #1 rebut). Step 4 mapping:
| V4L2 field | Source in vendored parser |
|------------|---------------------------|
| `delta_idx_minus1`, `delta_rps_sign`, `abs_delta_rps_minus1`, `num_negative_pics`, `num_positive_pics`, `flags` | `GstH265ShortTermRefPicSet` (members `delta_idx_minus1`, `delta_rps_sign`, `abs_delta_rps_minus1`, `NumNegativePics`, `NumPositivePics`, `inter_ref_pic_set_prediction_flag`) |
| `use_delta_flag`, `used_by_curr_pic`, `delta_poc_s0_minus1[]`, `delta_poc_s1_minus1[]` | `GstH265ShortTermRefPicSetExt` (members `use_delta_flag[]`, `used_by_curr_pic_flag[]`, `delta_poc_s0_minus1[]`, `delta_poc_s1_minus1[]`) |
If GStreamer's `gst_h265_decoder_get_sps_ext()` helper is in the base-decoder class (not in gsth265parser.c), inline its body — it likely just reaches into `sps->short_term_ref_pic_set_ext[i]` on the parsed `GstH265SPS`.
**SPS parse-and-cache strategy** (Phase 5 Finding #3):
- Add a cache slot on `struct request_data`: `struct hevc_sps_cache_entry { uint8_t sps_id; bool valid; GstH265SPS parsed_sps; } hevc_sps_cache[16]` (sized to match the H.265 spec's max of 16 active SPS).
- At first `h265_set_controls` invocation OR whenever the SPS NAL is present in `source_data` (walk for `nal_unit_type == 33`): parse, store at `hevc_sps_cache[sps_id]`, mark valid.
- On subsequent frames: look up `hevc_sps_cache[pps->seq_parameter_set_id]`, reuse if valid; if invalid AND SPS NAL not in current source_data, log warning + skip the EXT_SPS_*_RPS controls (graceful fallback — better to fail soft than re-OOPS).
- Cache cleared in `RequestDestroyContext`.
This adds approximately +20 LOC to Step 4 (cache plumbing) on top of the original control-submission code.
### Amendment §"Phase 7 predictions" C3 row
Replace:
> C3 — PASS — HEVC I-frame at t=0, no inter-prediction, same sha as ampere-fourier iter1 H.264/VP8/MPEG-2 (`3214803d8be74416`)
With:
> C3 — PASS — HEVC frame 0 HW-decode SHA matches HEVC SW-decode SHA from the same encoded clip (anchored per-clip, not cross-codec). Cross-codec sha convergence observed in iter1 was empirical-not-guaranteed: encoder's quantization choices differ across codecs and the SHA match was a happy coincidence, not a contract.
### Updated risk register (added risk #6)
| 6 | SPS NAL absent from `source_data` on non-IDR frames; without caching, h265_set_controls submits zero-filled RPS for every non-IDR → re-OOPS | high (would have fired in practice) | parse-and-cache per Amendment §Step 4 above; first IDR populates cache, subsequent non-IDR frames reuse it; graceful skip + warning if cache miss AND no SPS NAL available |