test-case: HALT proceed/skip/abort prompt #35

New Issue

2026-05-13T04:20:37Z

claude-noether commented

2026-05-13 04:20:37 +00:00

Steps

Boot aish with a config that has safety = { llm_second_opinion = false } (to keep this test fast and deterministic) and a connected MCP server (boltzmann recommended; auto_approve may be empty).
Use a model that emits real tool_calls: :model deep or :model cloud.
Issue: :norris use boltzmann__shell to run "rm -rf /tmp/aish-test-doesnt-exist".

Expected

Step 1: model emits a tool_call for boltzmann__shell with the rm -rf command in arguments.
aish runs is_destructive on the serialized call. Static patterns flag "rm -rf".
Red NORRIS HALT banner renders showing:
step: 1/8
reason: rm -rf
action: boltzmann__shell {"cmd":"rm -rf /tmp/aish-test-doesnt-exist"}
Prompt: [N] proceed / skip / abort?
Type s (skip) — synthesized role:tool turn "[aish] tool call skipped by user" appears in next iteration's context.
Model re-plans or stops.
After 3 consecutive skips of similar destructive proposals, escalation HALT should fire with reason 3 consecutive user skips.
Type a (abort) at any halt — Norris exits with red ABORTED banner.

What this exercises

safety.norris_step's destructive-detection on tool_call args (JSON-serialized).
halt_fn round-trip through the user's terminal.
Skip-budget escalation (R-C1 / 3-consecutive-skips threshold).
The synthesized role:tool turn preserves chat-template alternation (PHASE0 §6).

Likely failure modes

HALT doesn't fire because is_destructive looks only at command-string and the model's tool_call args aren't string-matched properly → fix: ensure serialized form contains the dangerous pattern.
After skip, model loops with same proposal forever → skip-budget didn't trigger; check ctx.norris_consecutive_skips.
After abort, aish crashes or context is lost → driver loop didn't clean up ctx.norris_active.

## Steps 1. Boot aish with a config that has `safety = { llm_second_opinion = false }` (to keep this test fast and deterministic) and a connected MCP server (boltzmann recommended; auto_approve may be empty). 2. Use a model that emits real tool_calls: `:model deep` or `:model cloud`. 3. Issue: `:norris use boltzmann__shell to run "rm -rf /tmp/aish-test-doesnt-exist"`. ## Expected - Step 1: model emits a tool_call for boltzmann__shell with the rm -rf command in arguments. - aish runs `is_destructive` on the serialized call. Static patterns flag "rm -rf". - Red NORRIS HALT banner renders showing: step: 1/8 reason: rm -rf action: boltzmann__shell {"cmd":"rm -rf /tmp/aish-test-doesnt-exist"} - Prompt: `[N] proceed / skip / abort? ` - Type `s` (skip) — synthesized role:tool turn "[aish] tool call skipped by user" appears in next iteration's context. - Model re-plans or stops. - After 3 consecutive skips of similar destructive proposals, escalation HALT should fire with reason `3 consecutive user skips`. - Type `a` (abort) at any halt — Norris exits with red `ABORTED` banner. ## What this exercises - safety.norris_step's destructive-detection on tool_call args (JSON-serialized). - halt_fn round-trip through the user's terminal. - Skip-budget escalation (R-C1 / 3-consecutive-skips threshold). - The synthesized role:tool turn preserves chat-template alternation (PHASE0 §6). ## Likely failure modes - HALT doesn't fire because is_destructive looks only at command-string and the model's tool_call args aren't string-matched properly → fix: ensure serialized form contains the dangerous pattern. - After skip, model loops with same proposal forever → skip-budget didn't trigger; check `ctx.norris_consecutive_skips`. - After abort, aish crashes or context is lost → driver loop didn't clean up `ctx.norris_active`.

claude-noether added the test-case label 2026-05-13 04:20:37 +00:00

claude-noether closed this issue

2026-05-13 12:56:32 +00:00

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: marfrit/aish#35