Add sampling capability (server-initiated sampling/createMessage) #9
Add the Sampling capability: the server can ask the client's LLM to generate text on its behalf (server-initiated sampling/createMessage request).

Goal
Some tools want to reason mid-execution without requiring the human to drive each step. A web_search tool could ask the client LLM to pick the best 3 of 25 raw results before returning. An analyze_log tool could ask for a one-line summary. With sampling, the server makes the request and waits for the client's response.

Methods to add
sampling/createMessage
{ messages: [{ role, content: { type, text } }], modelPreferences?, systemPrompt?, includeContext?, temperature?, maxTokens, stopSequences? } → { role, content, model, stopReason? }

API for lmcp
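A minimal sketch of how a tool handler might issue a sampling request. It borrows the server:sample(session_id, opts, on_response) form reported in the implementation notes, which may differ from the signature originally proposed here. The stub server, the analyze_log handler, and the error strings are hypothetical and exist only so the example runs on its own.

```lua
-- Stub server for illustration only; not lmcp's real implementation.
local server = { sent = {} }

-- Mirrors the reported signature: sample(session_id, opts, on_response).
-- The error strings below are made up for this sketch.
function server:sample(session_id, opts, on_response)
  if type(opts) ~= "table" or type(opts.messages) ~= "table" or #opts.messages == 0 then
    return false, "opts.messages is required"
  end
  if type(opts.maxTokens) ~= "number" then
    return false, "opts.maxTokens is required"
  end
  -- Record what would be sent as a sampling/createMessage request.
  self.sent[#self.sent + 1] = {
    session_id = session_id,
    method = "sampling/createMessage",
    params = opts,
    on_response = on_response,
  }
  return true
end

-- Hypothetical tool handler: ctx carries the session the request arrived on.
function server:analyze_log(args, ctx)
  return self:sample(ctx.session_id, {
    messages = {
      { role = "user", content = { type = "text", text = "Summarize in one line:\n" .. args.log } },
    },
    maxTokens = 64,
  }, function(result)
    -- Would receive { role, content, model, stopReason } once the client answers.
  end)
end

local ok = server:analyze_log({ log = "disk full on /var" }, { session_id = "sess-1" })
local bad, err = server:sample("sess-1", { messages = {} }, nil)
```

The handler passes ctx.session_id through so the request is tied to the connection it was served on.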
ctx is needed because the request being served must remain open while the server awaits the client's response; only that connection can route the answer back.

Capabilities (advertised by client, server checks)
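A rough sketch of the capability gate, assuming the client's initialize params carry a capabilities table with a sampling key (the capabilities:{sampling:{}} form from the verification notes). The Server table, new_server, and on_initialize are hypothetical names. Only _client_caps, the sample signature, and the error string come from the implementation notes.

```lua
-- Hypothetical stand-in for the lmcp server object.
local Server = {}
Server.__index = Server

local function new_server()
  return setmetatable({ _client_caps = {} }, Server)
end

-- Capture the client's advertised capabilities from the initialize params.
function Server:on_initialize(params)
  self._client_caps = (params and params.capabilities) or {}
end

-- Refuse immediately when sampling was not advertised, instead of sending
-- a request the client would never answer.
function Server:sample(session_id, opts, on_response)
  if self._client_caps.sampling == nil then
    return false, "client did not advertise sampling capability"
  end
  -- A real implementation would emit sampling/createMessage here.
  return true
end

local with_caps = new_server()
with_caps:on_initialize({ capabilities = { sampling = {} } })

local without_caps = new_server()
without_caps:on_initialize({ capabilities = {} })

local ok1 = with_caps:sample("sess-1", {}, function() end)
local ok2, err2 = without_caps:sample("sess-2", {}, function() end)
```

An empty sampling = {} table counts as advertised because the check tests for presence, not contents.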
If absent, server:sample returns an error immediately rather than hanging.

Scope (v1)
modelPreferences passed through verbatim; server doesn't validate.

Depends on
Priority
Medium. High-leverage when it works (unlocks agentic tools), but heavily transport-gated. Pair with the Streamable HTTP work.
Implemented. Builds on the bidirectional transport from #16.
Added in lmcp.lua:
- self._client_caps captured from initialize request params
- server:sample(session_id, opts, on_response): thin wrapper over server_request that enforces the client claimed sampling capability and validates opts.messages + opts.maxTokens
- ctx now exposes session_id so handlers can call self:sample(ctx.session_id, ...)

Verified end-to-end (no real client LLM needed):
- capabilities:{sampling:{}} → session created
- server:sample(...): server emits a sampling/createMessage JSON-RPC request on the SSE stream with id srv-N
- {jsonrpc, id:"srv-N", result:{role, content, model, stopReason}} → 202 Accepted
- on_response callback fires with parsed result

Capability gate verified: tool calls sample(...) when init didn't claim sampling → returns false, "client did not advertise sampling capability".

Honest limit (same as #11): tool handlers cannot await the sampling response in the current single-threaded event loop. The handler must dispatch + return immediately; the callback fires later (out-of-band). Real await patterns wait on #20 (concurrent handler dispatch). Today's value: fire-and-forget patterns.
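The fire-and-forget flow can be sketched as a table of pending callbacks keyed by request id. This is an illustration under assumptions, not lmcp's actual code: Dispatcher, new_dispatcher, request, handle_response, and the stub model name are hypothetical. Only the srv-N id scheme and the result fields come from the notes above.

```lua
-- Hypothetical dispatcher: tracks server-initiated requests awaiting a reply.
local Dispatcher = {}
Dispatcher.__index = Dispatcher

-- `send` is whatever writes a message to the client (e.g. onto the SSE stream).
local function new_dispatcher(send)
  return setmetatable({ _next_id = 0, _pending = {}, _send = send }, Dispatcher)
end

-- Emit a request and remember the callback; returns without waiting.
function Dispatcher:request(method, params, on_response)
  self._next_id = self._next_id + 1
  local id = "srv-" .. self._next_id
  self._pending[id] = on_response
  self._send({ jsonrpc = "2.0", id = id, method = method, params = params })
  return id
end

-- Called later, when the client posts a response carrying a matching id.
function Dispatcher:handle_response(msg)
  local cb = self._pending[msg.id]
  if not cb then
    return false -- unknown or already-answered id
  end
  self._pending[msg.id] = nil
  cb(msg.result, msg.error)
  return true
end

local sent = {}
local dispatcher = new_dispatcher(function(msg) sent[#sent + 1] = msg end)

local got
local id = dispatcher:request("sampling/createMessage", {
  messages = { { role = "user", content = { type = "text", text = "Pick the best 3 results." } } },
  maxTokens = 128,
}, function(result) got = result end)

-- The tool handler has already returned; the client's answer arrives out-of-band.
local handled = dispatcher:handle_response({
  jsonrpc = "2.0",
  id = id,
  result = {
    role = "assistant",
    content = { type = "text", text = "1, 4, 9" },
    model = "stub-model",
    stopReason = "endTurn",
  },
})
```

Because the callback runs after the handler has returned, its result cannot feed into that tool call's own return value, which is the limitation described above.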