Implement Streamable HTTP properly (persistent SSE, sessions) #16

Closed
opened 2026-05-17 15:56:10 +00:00 by claude-noether · 1 comment
Collaborator

Implement Streamable HTTP properly — persistent server-initiated SSE channel, session resumption via Mcp-Session-Id, and the full bidirectional message flow the spec requires.

Goal

lmcp's current /mcp handlers are request-response only:

  • GET /mcp opens an SSE connection, sends one endpoint event, then client:close() (lmcp.lua:240–244). Nothing server-initiated can flow.
  • POST /mcp handles a single JSON-RPC request, optionally emits the response as a single SSE event, then closes.
  • Mcp-Session-Id is in the CORS allow-list but not issued or honoured.

The spec's Streamable HTTP transport keeps the GET connection open as a long-lived event stream, lets the server push notifications/requests to the client at any time, and binds them to a session via Mcp-Session-Id. Many of the other open issues (Sampling, Roots, Progress, Cancellation, Logging delivery) are gated on this.

Required behaviour

| Endpoint | Message kind | Behaviour |
|---|---|---|
| POST /mcp | request | Either a JSON response (Content-Type: application/json) or an SSE stream that ends after the response. |
| POST /mcp | notification | 202 Accepted, no body. |
| GET /mcp | – | Open an SSE stream that lives until either side closes. The server can push notifications/* and server-initiated requests (sampling, roots/list) at any time. The client correlates by id. |
| DELETE /mcp | – | Close the session. |
| All requests | – | If the Mcp-Session-Id header is absent on a non-initialize request, return 400 so the client re-initializes. The server issues the id in the initialize response. |
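
The rules in the table above can be sketched as a single decision function. This is a minimal illustration, not lmcp's code: plan_response, the shape of req, and the sessions argument are hypothetical names, the 405 fallback is an assumption, and the 404 for an unknown id and the 204 for DELETE are taken from the implementation comment further down rather than from the table.

```lua
-- Hypothetical helper: decides the HTTP outcome for one parsed /mcp request.
-- req = { method = "POST"|"GET"|"DELETE",
--         kind = "request"|"notification",   -- JSON-RPC message kind (POST only)
--         rpc_method = "initialize"|...,     -- JSON-RPC method name (POST only)
--         session_id = string|nil }          -- value of the Mcp-Session-Id header
local function plan_response(req, sessions)
  -- initialize is the only call allowed without a session; it creates one.
  if req.method == "POST" and req.rpc_method == "initialize" then
    return { status = 200, issue_session = true }
  end

  -- Every other request must carry a session id the server knows.
  if req.session_id == nil then
    return { status = 400 }
  end
  if sessions[req.session_id] == nil then
    return { status = 404 }
  end

  if req.method == "GET" then
    return { status = 200, stream = true }          -- long-lived SSE stream
  elseif req.method == "DELETE" then
    return { status = 204, close_session = true }
  elseif req.method == "POST" then
    if req.kind == "notification" then
      return { status = 202 }                       -- accepted, no body
    end
    return { status = 200 }                         -- JSON body or short SSE stream
  end

  return { status = 405 }                           -- assumption: other methods rejected
end
```

Keeping the decision separate from socket handling means the status logic can be exercised without opening a connection.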

Implementation outline

  1. Maintain a server-local sessions[session_id] = { client, queue } map.
    queue is a list of pending server→client messages.
  2. GET /mcp no longer closes; it calls client:settimeout(0) and loops, flushing queue as messages arrive. The loop exits when the client closes the TCP connection.
  3. POST /mcp enqueues responses on the session's queue if there's a live GET stream, else writes them inline.
  4. Issue a new session id (UUID-shape; os.time() .. "-" .. random) on initialize and write it back in Mcp-Session-Id response header.
  5. Server-initiated requests (sampling, roots/list) get fresh ids via a server-side id counter; responses come back through POST /mcp and are matched by the session's id→handler map.
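
A minimal sketch of outline steps 1, 4 and 5, assuming plain Lua tables for state. The function names (open_session, server_request, handle_client_reply), the "srv-" id prefix and the pending field are hypothetical; the session id follows the outline's os.time() .. "-" .. random shape and is not cryptographically strong.

```lua
local sessions = {}          -- session_id -> { client, queue, pending }
local next_request_id = 0    -- server-side counter for server-initiated requests

-- Session id in the outline's "time-random" shape (illustrative only).
local function new_session_id()
  return string.format("%d-%08x", os.time(), math.random(0, 2147483647))
end

-- Step 1 + 4: create the session record and return the id to send back
-- in the Mcp-Session-Id response header.
local function open_session()
  local id
  repeat id = new_session_id() until sessions[id] == nil
  sessions[id] = { client = nil, queue = {}, pending = {} }
  return id
end

-- Step 5: queue a server-initiated request and remember who wants the answer.
local function server_request(session_id, method, params, on_result)
  local sess = assert(sessions[session_id], "unknown session")
  next_request_id = next_request_id + 1
  local id = "srv-" .. next_request_id
  sess.pending[id] = on_result
  sess.queue[#sess.queue + 1] =
    { jsonrpc = "2.0", id = id, method = method, params = params }
  return id
end

-- Step 5, return path: a POST body with an id but no method is treated as
-- the client's reply to a server-initiated request.
local function handle_client_reply(session_id, msg)
  local sess = sessions[session_id]
  local handler = sess and msg.method == nil and sess.pending[msg.id]
  if not handler then
    return false                 -- not a reply we are waiting for
  end
  sess.pending[msg.id] = nil     -- each reply is delivered at most once
  handler(msg.result, msg.error)
  return true
end
```

A call such as server_request(sid, "roots/list", nil, callback) leaves the message on the session queue for the GET stream to flush; the callback fires when the matching reply arrives through POST /mcp.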

Concerns

  • Concurrency. lmcp's accept loop is single-threaded; a long-lived GET will starve POSTs. Either run each connection in a coroutine with non-blocking I/O, or use luasocket's select. This is the load-bearing complexity of the rewrite.
  • Heartbeat. Spec recommends periodic SSE comments to keep the connection alive across proxies; trivial to add to the loop.
  • Backpressure. Without it, an idle client lets the queue grow forever. Cap queue length per session; drop notifications (not requests) when overflowing.
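
To make the concurrency concern concrete, here is a rough sketch of one pass of a select-based loop. The names tick, conns and on_readable are hypothetical, and the LuaSocket module is passed in as the socket argument (rather than required inside the function) so the same code can be driven by a stub. It only shows how one thread can watch the listener and every open connection at once; parsing and writing are left to the callback.

```lua
-- One iteration of a select-based event loop (sketch only).
--   socket      : the LuaSocket module, i.e. require("socket")
--   listener    : a bound, listening TCP socket
--   conns       : set of open client sockets (conn -> true)
--   on_readable : callback invoked for each client socket with data waiting
--   timeout     : max seconds to block, so timers such as heartbeats still run
local function tick(socket, listener, conns, on_readable, timeout)
  -- Watch the listening socket plus every open client connection.
  local recvt = { listener }
  for conn in pairs(conns) do
    recvt[#recvt + 1] = conn
  end

  -- Returns the subset of recvt that can be read without blocking.
  local readable = socket.select(recvt, nil, timeout)

  for _, sock in ipairs(readable) do
    if sock == listener then
      local conn = listener:accept()
      if conn then
        conn:settimeout(0)   -- never let one client block the loop
        conns[conn] = true
      end
    else
      on_readable(sock)      -- e.g. advance that connection's parser
    end
  end
end
```

With real sockets this would be called in a loop after socket.bind(host, port) and listener:settimeout(0). A long-lived GET then just stays in conns while POSTs on other connections are still accepted and read.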

Scope (v1)

  • Bidirectional /mcp with session-id issue + honour.
  • select-based event loop (no luasocket coroutine framework dependency).
  • Heartbeat every 30s.
  • Per-session queue cap; drop-oldest for notifications, error-back for requests.
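
One possible reading of the queue-cap rule, sketched under assumptions: the cap is passed in by the caller (the issue does not fix a number), anything without an id counts as a notification, and "error-back" means returning nil plus a message so the caller can fail the request. The name enqueue is hypothetical.

```lua
-- A JSON-RPC message without an id is a notification and may be dropped.
local function is_notification(msg)
  return msg.id == nil
end

-- Append msg to a session queue that holds at most `cap` messages.
-- Returns true if msg was queued, or nil plus a reason if it was not.
local function enqueue(queue, msg, cap)
  if #queue < cap then
    queue[#queue + 1] = msg
    return true
  end

  -- Queue is full. Requests and responses are never silently dropped:
  -- report the failure so the caller can error back.
  if not is_notification(msg) then
    return nil, "queue full"
  end

  -- Incoming notification: evict the oldest queued notification, if any.
  for i = 1, #queue do
    if is_notification(queue[i]) then
      table.remove(queue, i)
      queue[#queue + 1] = msg
      return true
    end
  end

  -- Nothing droppable is queued, so the new notification is discarded.
  return nil, "dropped"
end
```

The linear scan is acceptable for small caps; a larger cap would probably want separate notification and request lists.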

Out of scope

  • HTTP/2 multiplexing — luasocket doesn't speak HTTP/2 natively.
  • Last-Event-ID reconnect-resume semantics (defer to a v2).

Depends on / unblocks

  • Unblocks: Sampling, Roots, Progress notifications, Cancellation (server-side), Logging delivery, Resource subscribe/update.

Priority

High in terms of leverage. Without it, half of the other issues are no-ops. But it's also the largest single rewrite on the list: 1–2 days of careful work plus testing.

Author
Collaborator

Implemented. Replaced the accept-serialised lmcp:run() with a select()-based event loop and per-connection FSM (reading_head → reading_body → dispatching → writing | sse_open). Highlights:

Delivered:

  • initialize issues Mcp-Session-Id; subsequent requests carry it. Unknown id on non-initialize → 404 per spec. Sessionless POST auto-issues (backwards compat).
  • GET /mcp is now a persistent SSE stream — stays open until client closes or DELETE removes the session.
  • 30s heartbeat (: keep-alive) on every open stream.
  • 60s idle session expiry.
  • DELETE /mcp closes the session (204).
  • 409 Conflict on second GET stream per session.
  • Notification queue draining: global _notify_queue fans out to ALL open SSE; per-session sess.notify_q only to that session (used by the new server:server_request() helper for sampling/roots).
  • _find_pending routes incoming POST that is actually a server-initiated response to its callback.
  • Partial-send tracking via write_buf (append-only / :sub(offset+1) invariant documented in code).
  • Read/body caps (64 KiB header, 8 MiB body, 1 MiB write_buf).
  • CORS OPTIONS preserved + DELETE added to allowed methods.
  • Stdio transport unaffected; all 11 previously-closed issues regression-tested live and pass.
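
The partial-send bullet can be illustrated with a small sketch. This is not the code from lmcp.lua and does not claim to reproduce its documented invariant; it only assumes LuaSocket's send(data, i) contract, where a successful call returns the index of the last byte sent and a failed or timed-out call returns nil, an error string, and the index of the last byte that did go out. The names queue_write, flush and the conn fields are hypothetical; the 1 MiB cap is the figure from the list above.

```lua
local WRITE_BUF_CAP = 1024 * 1024   -- 1 MiB cap on unsent bytes

-- conn = { sock = <socket>, write_buf = "", offset = 0 }
-- write_buf is append-only between flushes; offset counts bytes already sent.
local function queue_write(conn, data)
  if #conn.write_buf - conn.offset + #data > WRITE_BUF_CAP then
    return nil, "write buffer full"
  end
  conn.write_buf = conn.write_buf .. data
  return true
end

-- Try to push the unsent tail of write_buf to a non-blocking socket.
-- Returns "done", "pending", or nil plus an error string.
local function flush(conn)
  local last, err, partial = conn.sock:send(conn.write_buf, conn.offset + 1)
  conn.offset = last or partial or conn.offset

  if conn.offset >= #conn.write_buf then
    conn.write_buf, conn.offset = "", 0   -- everything went out; reset
    return "done"
  end
  if err and err ~= "timeout" then
    return nil, err                       -- e.g. "closed"
  end
  return "pending"                        -- retry when the socket is writable
end
```

In a select-based loop, a connection whose flush returns "pending" would be added to the write set and flushed again once select reports it writable.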

Honest limit: the event loop makes I/O concurrent, NOT handler execution. A synchronous tool handler (shell sleep 3) blocks the loop for its duration; concurrent fast POSTs serialise behind it. My Phase 1 success criterion #4 ("fast POST doesn't wait for slow POST") was an overreach; meeting it is a different rewrite. Filed as follow-up #20 (Concurrent handler dispatch).

Unblocks the bidirectional-transport half of issues #9 (sampling), #10 (roots), and the delivery path for #8 (logging notifications) and #11 (progress).

Memory: project_io_vs_handler_concurrency.md captures the I/O-vs-handler distinction so future work doesn't reintroduce the confusion.


Reference: marfrit/lmcp#16