Start here: need to register a service and create a plan first? Follow the
5-minute setup.
Runnable tutorial
langchain-langsmith-deployment-py — a deliberately minimal LangGraph agent
with POST /threads/{id}/runs/wait gated by Nevermined x402. Clone, fill in
.env, run poetry run buyer to drive the full 402 → token-acquisition →
settlement round-trip in five numbered steps.
This page covers deployment-time gating, which is separate from the @requires_payment decorator covered in LangChain; both can coexist, and they protect different layers.
| Layer | Tool-time (LangChain) | Deployment-time (this page) |
|---|---|---|
| Code surface | @requires_payment on individual @tool functions | PaymentMiddleware mounted via langgraph.json http.app |
| Gated unit | A single tool call inside the agent | The agent’s HTTP entry point (e.g. POST /threads/{id}/runs/wait) |
| Charge frequency | Once per tool invocation | Once per HTTP request to the deployment |
| Runtime | Any LangChain / LangGraph host | LangSmith Deployment, langgraph dev, langgraph up |
Install
Install payments-py with the [langsmith] extra, which pulls in fastapi, starlette, and langsmith.
Python only. LangSmith Deployment’s custom-app surface is documented by
LangChain as Python-only. A TypeScript variant is tracked in our LangChain
integration epic but blocked on LangChain shipping a TS runtime.
Define the middleware app
Create nvm_app.py next to your langgraph.json. Four lines of glue:
build_payment_app returns a FastAPI app pre-wired with PaymentMiddleware. Mount it from langgraph.json:
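As a rough illustration of the mount step, the sketch below builds a langgraph.json document in Python and prints it. The file layout is an assumption made for the example: agent.py exposing a compiled graph named graph, and nvm_app.py exposing the FastAPI app as a module-level variable named app. Only the http.app field is the part this page depends on.

```python
import json

# Hypothetical project layout (assumed for illustration only):
#   ./agent.py    exposes the compiled graph as `graph`
#   ./nvm_app.py  exposes the FastAPI app as `app`
config = {
    "dependencies": ["."],
    "graphs": {"agent": "./agent.py:graph"},
    # http.app tells LangGraph which custom app to mount around its own routes.
    "http": {"app": "./nvm_app.py:app"},
}

# Render the document that would be saved as langgraph.json.
langgraph_json = json.dumps(config, indent=2)
print(langgraph_json)
```

Adjust the module paths and variable names to match your own project before saving the output as langgraph.json.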
langgraph dev (local) and langgraph up (Docker) both honor the http.app field; the middleware composes around LangSmith Deployment’s built-in routes (/runs, /threads/{id}/runs, /assistants, etc.).
Why FastAPI? Some langgraph-api versions crash on plain Starlette http.app
wrappers due to an upstream OpenAPI generation bug. FastAPI takes a clean path
through app.openapi(). The build_payment_app factory returns a FastAPI app,
so you do not need to handle this yourself; PaymentMiddleware itself is a
BaseHTTPMiddleware and works on both.
The 402 round-trip
The middleware relies on payments_py.x402.resolve_scheme.resolve_network to pick the right enrolled payment method from the plan metadata.
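To make the buyer-side round-trip concrete, here is a minimal in-process simulation with no network and no SDK calls. The header names payment-signature and payment-response follow the names used on this page; the payment-required header, the token value, and the acquire_token helper are assumptions invented for the sketch and are not part of the Nevermined API.

```python
# In-process stand-in for a gated endpoint; nothing here touches the network.
VALID_TOKEN = "demo-token"  # hypothetical token value


def gated_endpoint(headers: dict) -> tuple[int, dict, str]:
    """Return (status, response_headers, body) the way a gated route might."""
    if headers.get("payment-signature") != VALID_TOKEN:
        # Missing or wrong token: answer 402 and advertise what is required.
        # The "payment-required" header name is an assumption.
        return 402, {"payment-required": "plan=demo-plan"}, ""
    # Token accepted: the agent runs, then settlement is reported in a header.
    return 200, {"payment-response": "settled"}, "agent output"


def acquire_token(requirements: str) -> str:
    """Hypothetical token acquisition; a real buyer would call the SDK here."""
    return VALID_TOKEN


def buyer_round_trip() -> list[int]:
    """Drive the 402 -> token acquisition -> paid retry sequence."""
    statuses = []
    status, resp_headers, _ = gated_endpoint({})
    statuses.append(status)
    if status == 402:
        token = acquire_token(resp_headers["payment-required"])
        status, resp_headers, _ = gated_endpoint({"payment-signature": token})
        statuses.append(status)
    return statuses


if __name__ == "__main__":
    print(buyer_round_trip())
```

The first call fails with 402 because no token is attached; the retry carries the token and succeeds.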
Per-route pricing
RouteConfig accepts a static int or a callable for credits:
Route patterns can use :param or FastAPI/LangGraph {param} syntax; both match by position. Routes not listed pass through ungated.
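The matching and pricing rules above can be sketched in plain Python. This is not the SDK's RouteConfig implementation: the "METHOD /path" key format, the helper names, and the deep_research request field are assumptions chosen for illustration. It shows positional matching of :param and {param} segments, static versus callable credits, and unlisted routes resolving to no charge.

```python
import re
from typing import Callable, Optional, Union

Credits = Union[int, Callable[[dict], int]]


def compile_pattern(pattern: str) -> re.Pattern:
    """Treat ':id' and '{id}' segments as single-segment wildcards."""
    parts = []
    for segment in pattern.strip("/").split("/"):
        is_colon_param = segment.startswith(":")
        is_brace_param = segment.startswith("{") and segment.endswith("}")
        if is_colon_param or is_brace_param:
            parts.append("[^/]+")  # parameter name is ignored; position matters
        else:
            parts.append(re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "$")


def resolve_credits(routes: dict, method: str, path: str, request: dict) -> Optional[int]:
    """Return the credits to charge, or None when the route is ungated."""
    for key, credits in routes.items():
        route_method, route_pattern = key.split(" ", 1)
        if route_method == method and compile_pattern(route_pattern).match(path):
            return credits(request) if callable(credits) else credits
    return None


# Hypothetical routes map: one static price, one computed per request.
ROUTES = {
    "POST /threads/{id}/runs/wait": 1,
    "POST /threads/:id/runs/stream": lambda request: 5 if request.get("deep_research") else 2,
}
```

With this map, a POST to /threads/abc/runs/wait costs 1 credit, while GET /ok is not listed and so passes through without a charge.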
Lifecycle
The middleware implements the canonical x402 verify → agent runs → settle ordering inside one HTTP cycle. Failed agent runs (non-2xx) skip settlement so buyers are not charged. Settle failures after a successful 2xx are logged but do not surface to the client, because the buyer already received the value. For the full step-by-step diagram, see chapter 13 of the SDK docs.
Why /runs/wait specifically
LangSmith Deployment exposes three run-execution shapes:
| Endpoint | Behavior | Works with this middleware? |
|---|---|---|
| POST /threads/{id}/runs/wait | Synchronous; blocks until the agent finishes, returns final state | Yes: the only path that fits verify-then-work-then-settle in one HTTP cycle |
| POST /threads/{id}/runs | Background; returns 202 immediately with a run_id | No: settle would fire before the agent did the work |
| POST /threads/{id}/runs/stream | Server-sent events; streams agent output | Partially: the middleware buffers the response body to attach the settlement header, which negates streaming |
Gate /runs/wait for a clean demo. The middleware passes through /threads, /assistants/search, /info, /ok, and other non-billable endpoints automatically.
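The verify → agent runs → settle ordering, and the reason it needs a synchronous endpoint, can be sketched as a single function. This is a simplified model under stated assumptions, not the middleware's code: verify, run_agent, and settle are hypothetical callables standing in for the SDK calls and the downstream handler.

```python
import logging

logger = logging.getLogger("nvm.sketch")


def handle_gated_request(verify, run_agent, settle, token):
    """Model one gated HTTP cycle: verify, run the agent, settle on success."""
    if not verify(token):
        return 402, "payment required"  # the agent never runs
    status, body = run_agent()  # must block until the work is finished
    if not 200 <= status < 300:
        return status, body  # failed run: skip settlement, buyer is not charged
    try:
        settle(token)
    except Exception:
        # The buyer already received the value, so log instead of failing.
        logger.exception("settle failed after a successful run")
    return status, body
```

If run_agent returned immediately with a 202, as the background /runs endpoint does, settle would fire before any work had happened, which is why only the blocking /runs/wait shape fits this ordering.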
Observability
When LANGSMITH_TRACING=true is set, the middleware emits two top-level traces per gated request:
The graph run appears as its own trace because langgraph-api initiates it at the graph-invocation boundary, independent of our middleware’s trace context. Both nvm spans plus the parent carry searchable nvm.* metadata; the raw payment-signature token is abbreviated in the style of eyJ4NDAyVmVyc2lvb…bsig so it can be cross-referenced without exposure.
Verification failures raise PaymentRequiredError inside verify_span so LangSmith marks the parent + child as failed via the canonical context-manager exit path. Settle failures after a successful 2xx mark only the settle child as failed; the parent stays successful (matching the buyer-visible 200).
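One way to produce an abbreviated token like the one shown above is to keep a short prefix and suffix and drop the middle. The helper below is a hypothetical sketch; the 17-character head and 4-character tail are inferred from the example string and are an assumption about the real lengths.

```python
def abbreviate_token(token: str, head: int = 17, tail: int = 4) -> str:
    """Keep the first `head` and last `tail` characters, elide the rest."""
    if len(token) <= head + tail:
        # Too short to abbreviate without revealing everything: mask it fully.
        return "…"
    return f"{token[:head]}…{token[-tail:]}"
```

The abbreviated form is still distinctive enough to match a trace against server logs, while the full credential never lands in trace metadata.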
Host a chat UI on top of the deployment
Runnable tutorial
langchain-chat-ui-nvm — a Next.js fork of LangChain’s agent-chat-ui
with a card-delegation popup and an x402-aware proxy. Pairs with the
langchain-research-agent-py companion (in-tool @requires_payment)
for a full browser demo.
The langchain-chat-ui-nvm tutorial hosts a chat UI on the deployment by forking langchain-ai/agent-chat-ui and adding a handful of Next.js API routes plus a popup target.
This section describes a chat-UI host built on top of the in-tool gating
pattern (@requires_payment from LangChain) rather than the route-level
middleware on the rest of this page. The middleware gates every HTTP request,
which is fine for “every call is paid” pricing, but it forces users to pay
before they can ask the agent what it does. The chat-UI flow benefits from
letting the LLM act as a free concierge (introspection, capability discovery)
and only charging when a paid tool actually fires. Pick by UX: the same
PaymentMiddleware would work if you want a hard paywall in front of
/runs/stream.
- The user opens the chat and clicks Authorize on a top banner. A popup opens at https://embed.<tier-host>/cards/setup?sessionToken=…&returnUrl=…/x402-callback&state=… (e.g. embed.nevermined.dev for staging, embed.nevermined.app for production; this is the standalone embed app that replaced the webapp’s removed /embed/* routes).
- They enrol a card on the embed page (Stripe, Braintree, or Visa Intelligent Commerce) and pick a budget, e.g. $10 / 24 h, then submit.
- Nevermined redirects the popup back to the chat UI’s callback page with paymentMethodId, delegationId, and the round-tripped state nonce. The callback validates state, postMessages the IDs to window.opener, then closes itself.
- The chat UI’s Next.js server mints an x402 access token from the delegationId (payments.x402.getX402AccessToken(planId, agentId, { delegationConfig: { delegationId } }), pattern B in @nevermined-io/payments) and stores it in an httpOnly cookie. The browser never sees the raw token.
- From then on, the catch-all /api/[..._path] proxy reads the cookie on every outgoing LangGraph run and JSON-injects it into the run body at config.configurable.payment_token, the contract @requires_payment reads from. The agent’s tool runs verify → execute → settle internally; the LLM concierge handles everything else for free.
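The JSON-injection in the last step is a small body transformation. The tutorial's proxy is a Next.js route; the sketch below renders the same idea in Python for consistency with the rest of this page. The function name is hypothetical, and only the config.configurable.payment_token path comes from the text above.

```python
import json


def inject_payment_token(raw_body: bytes, token: str) -> bytes:
    """Return the run body with the token set at config.configurable.payment_token."""
    body = json.loads(raw_body or b"{}")
    # Tolerate a missing or null "config" / "configurable" in the incoming body.
    config = body.get("config") or {}
    configurable = config.get("configurable") or {}
    configurable["payment_token"] = token
    config["configurable"] = configurable
    body["config"] = config
    return json.dumps(body).encode()


if __name__ == "__main__":
    outgoing = inject_payment_token(b'{"input": {"question": "hi"}}', "token-from-cookie")
    print(outgoing.decode())
```

Because the token is added server-side on every outgoing run, the browser only ever sends the cookie and never handles the token itself.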
NVM_API_KEY lives only on the Next.js server. The browser holds, at most, the short-lived sessionToken for the duration of the popup.
Because gating is in-graph (no http.app on langgraph.json), neither /runs/wait nor /runs/stream needs to be in any routes map; the chat UI’s useStream hits /runs/stream and the tool itself decides whether to charge.
- The widget session uses the self-mint endpoint (POST /api/v1/widgets/session/self), which restricts returnUrl to localhost / 127.0.0.1 / [::1]. Deploying the chat UI to a real domain requires the widget-key flow instead.
- The agent’s paid tool is bound to a single plan in this demo (the chat UI reads NVM_PLAN_ID from env, the tool uses the matching plan id in @requires_payment(plan_id=…)). Multi-tool agents with mixed pricing need one accepts[] entry per paid tool.
- The popup pattern needs same-origin between the chat UI and the callback page (both served by Next.js).
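The returnUrl restriction in the first item is easy to check before you deploy. The helper below is a hypothetical client-side preflight check, not the server's validation logic; it only tests whether a URL's host is one of the three loopback hosts named above.

```python
from urllib.parse import urlparse

# Hosts the self-mint endpoint is described as accepting for returnUrl.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_loopback_return_url(return_url: str) -> bool:
    """True when the URL's host is a loopback host (brackets around IPv6 are stripped)."""
    return urlparse(return_url).hostname in LOOPBACK_HOSTS


if __name__ == "__main__":
    for url in ("http://localhost:3000/x402-callback", "https://chat.example.com/x402-callback"):
        print(url, is_loopback_return_url(url))
```

A return URL that fails this check is a sign that the deployment needs the widget-key flow rather than the self-mint endpoint.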
Combining with @requires_payment
The middleware and the LangChain decorator can be used together — the middleware gates the agent’s HTTP entry point, the decorator gates individual tools inside the agent. Each layer charges independently. Common pattern: charge a flat rate per agent invocation (middleware) plus dynamic per-tool credits (decorator) for expensive tool calls.
The chat-UI tutorial above is an example of using the decorator alone at deployment-time (no middleware on the HTTP layer). That choice is driven by the UX — free introspection, paid execution — and it composes cleanly with vanilla LangSmith Deployment because the agent doesn’t need a custom http.app.
Limitations
- Streaming responses are buffered. The middleware reads the downstream response body in full before attaching the payment-response settlement header. SSE / /runs/stream endpoints become blocking-then-bulk. Gate /runs/wait only, or accept the trade-off.
- Python only. TypeScript variant tracked but blocked on LangChain shipping a TS runtime.
- Sync I/O is wrapped. The four sync SDK calls (resolve_scheme, resolve_network, verify_permissions, settle_permissions) run via asyncio.to_thread(...) so they don’t block the event loop. langgraph dev’s blocking-call detector treats unwrapped sync HTTP as fatal warnings.
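The asyncio.to_thread wrapping mentioned in the last item follows a standard pattern. In the sketch below, blocking_verify is a hypothetical stand-in for a synchronous SDK call; it sleeps instead of doing real HTTP. The only real API used is asyncio.to_thread from the standard library (Python 3.9+).

```python
import asyncio
import time


def blocking_verify(token: str) -> bool:
    """Stand-in for a synchronous SDK call that performs blocking HTTP."""
    time.sleep(0.05)  # simulates network latency
    return token == "good"


async def verify_async(token: str) -> bool:
    """Run the blocking call in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(blocking_verify, token)


async def main() -> list[bool]:
    # Both verifications run in worker threads and overlap in time.
    return await asyncio.gather(verify_async("good"), verify_async("bad"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling blocking_verify directly inside an async handler would stall every other request on the loop for the duration of the call, which is the behaviour the blocking-call detector flags.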
See also
- LangChain: tool-time @requires_payment decorator
- payments-py[langsmith] SDK docs (chapter 13): full API reference
- langchain-langsmith-deployment-py tutorial: runnable CLI end-to-end demo of route-level middleware gating (this page)
- langchain-research-agent-py tutorial: companion in-tool gating demo (freemium ReAct agent) used by the chat UI
- langchain-chat-ui-nvm tutorial: browser chat UI with card-delegation popup
- Card delegation (white-label redirect): full reference for the embed-app /cards/setup flow used by the chat UI popup