Frontend Integration — Conversation API

The conversation REST API is the recommended way to integrate a frontend application with Swarmd. It’s designed to survive real-world HTTP conditions — CDN cutoffs, mobile network churn, browser tab suspension — that a long-lived synchronous call cannot. Every conversation is identified by a stable contextId. You start it once, send messages against it, and poll its state URL until the aggregate goes terminal. The relay handles the orchestration in the background on a virtual thread; your frontend never has to hold the socket open for the whole run.

Coming soon: a first-party client library (@swarmd/client for TypeScript, swarmd for Python) will handle the create/send/poll cycle, the 200 vs 202 signal, token refresh, and HITL bubble state transitions for you. If you’re integrating today, follow the raw HTTP flow below — the library will be a drop-in replacement when it ships.

Access Types

The conversation API works for both channel-based access and user-based access. Only the URL prefix and the auth token differ.

	Channel Access	User Access
Use case	Apps, bots, services, embedded chat	Dashboard, admin tools
Auth	OAuth2 client credentials (`channel-{channelId}` client)	User login (Bearer token)
URL prefix	`/relay/v1/channels/{channelId}/conversations`	`/relay/v1/human/conversations`

Examples in this guide use the channel prefix; substitute the human prefix and its Bearer token for user flows.
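Because only the prefix differs between the two access types, a client can isolate that choice in one helper. The following is a minimal sketch; the `Access` type and the function names are illustrative and not part of the Swarmd API.

```typescript
// Access descriptor: channel-based or user-based. Names are illustrative.
type Access =
  | { kind: 'channel'; channelId: string }
  | { kind: 'human' };

// Returns the conversations URL prefix for the given access type,
// using the two prefixes listed in the table above.
function conversationsPrefix(access: Access): string {
  return access.kind === 'channel'
    ? `/relay/v1/channels/${encodeURIComponent(access.channelId)}/conversations`
    : '/relay/v1/human/conversations';
}

// Builds the state URL for a conversation under either prefix.
function stateUrl(base: string, access: Access, contextId: string): string {
  return `${base}${conversationsPrefix(access)}/${encodeURIComponent(contextId)}/state`;
}
```

The rest of the client can then take an `Access` value instead of a raw `channelId`, so the same code serves both flows.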

The Three Endpoints

The API is deliberately small — three operations cover the whole lifecycle.

Verb	Path	Purpose
`POST`	`/conversations`	Create a conversation; returns `{ contextId, ... }`
`POST`	`/conversations/{contextId}/messages`	Send a user message; returns `ConversationState`
`GET`	`/conversations/{contextId}/state`	Poll the current aggregate + latest reply

There is also GET /conversations/{contextId}/messages for paginated history if you need it — most frontends don’t; the messages[] array on ConversationState covers UI redraws.

The Send-and-Poll Flow

Every user turn follows the same three-step pattern.

POST /conversations                                → contextId
POST /conversations/{contextId}/messages           → 200 (terminal)   ← done
                                                   → 202 (working)    ← poll
GET  /conversations/{contextId}/state    (repeat)  → aggregate flips terminal

Two response codes on the send, one shape.

Status	Meaning	Client action
`200 OK`	Middleware chain reached a terminal aggregate within the early-return window (30 s by default)	Render the reply; done for this turn
`202 Accepted`	Chain still running; response body carries `aggregateState=WORKING`	Start polling `GET /state` every ~2 s until aggregate is terminal

The response body is a ConversationState in both cases — same JSON shape, no polymorphism. Your code inspects aggregateState (or reads the HTTP status as a fast path) and decides whether to render or poll.
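The render-or-poll decision can be written as a small pure function. This sketch treats the HTTP status as the fast path and falls back to the body; the fallback for a 200 with a non-terminal aggregate is a defensive assumption, not documented behaviour.

```typescript
type AggregateState = 'UNKNOWN' | 'WORKING' | 'HITL_HELD' | 'FAILED' | 'COMPLETED';

// Terminal aggregates are the ones a client stops polling on.
function isTerminal(agg: AggregateState): boolean {
  return agg === 'COMPLETED' || agg === 'FAILED';
}

// Decides what to do with a send response.
function nextAction(httpStatus: number, agg: AggregateState): 'render' | 'poll' {
  if (httpStatus === 202) return 'poll';      // fast path: the relay said "still running"
  return isTerminal(agg) ? 'render' : 'poll'; // defensive: trust the body over the status
}
```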

Step 1 — Create the Conversation

Once per user session (or once per widget mount), mint a conversation bound to the agent you want to talk to.

POST /relay/v1/channels/{channelId}/conversations
Authorization: Bearer <channel-access-token>
Content-Type: application/json

{
  "agentId": "4a411d26-dec0-4553-a164-96ccecf0ecb9"
}

201 Created

{
  "id": "b8a7...",
  "contextId": "17aa30cf-9a10-4c25-8e8d-2b31ff31a4c1",
  "source": { "kind": "CHANNEL", "id": "..." },
  "sink":   { "kind": "AGENT",   "id": "4a411d26-..." },
  "createdAt": "2026-07-03T10:10:33Z"
}

Cache the contextId client-side. Every follow-up message in this conversation reuses the same value.
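One way to cache the value is a get-or-create helper over a key-value store. This is a sketch under assumptions: the `KeyValueStore` interface, the key format, and the `create` callback are all hypothetical. In a browser, `sessionStorage` satisfies the interface and can be passed in directly.

```typescript
// Minimal subset of the Web Storage API (hypothetical interface name).
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Returns the cached contextId, or calls `create` once and caches the result.
async function getOrCreateContextId(
  store: KeyValueStore,
  channelId: string,
  agentId: string,
  create: () => Promise<{ contextId: string }>,
): Promise<string> {
  const key = `swarmd:ctx:${channelId}:${agentId}`; // assumed key format
  const cached = store.getItem(key);
  if (cached) return cached;
  const { contextId } = await create();
  store.setItem(key, contextId);
  return contextId;
}
```

Passing the creation call as a callback keeps the helper independent of how the conversation is minted.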

Step 2 — Send a Message

POST /relay/v1/channels/{channelId}/conversations/{contextId}/messages
Authorization: Bearer <channel-access-token>
Content-Type: application/json

{
  "message": {
    "messageId": "msg-a1b2c3d4",
    "role": "user",
    "kind": "message",
    "contextId": "17aa30cf-9a10-4c25-8e8d-2b31ff31a4c1",
    "parts": [
      { "kind": "text", "text": "Shift RES-000108 from 12 to 19 August. Rate difference?" }
    ]
  }
}

Two possible responses — same JSON body shape, different HTTP status:

Fast: 200 OK

The chain finished within the early-return window (30 s default). The body is a terminal ConversationState. Render the reply and you’re done.

{
  "id": "b8a7...",
  "contextId": "17aa30cf-...",
  "aggregateState": "COMPLETED",
  "parentState": "COMPLETED",
  "messageCount": 2,
  "messages": [
    { "messageId": "msg-a1b2c3d4", "role": "user",  "parts": [{ "kind": "text", "text": "..." }] },
    { "messageId": "msg-e5f6g7h8", "role": "agent", "parts": [{ "kind": "text", "text": "Rate difference is $102/night ..." }] }
  ],
  "tasks": [
    { "taskId": "d49a82c0-...", "sinkAgentId": "4a411d26-...", "state": "COMPLETED", "createdAt": "..." }
  ],
  "latestTask": {
    "id": "d49a82c0-...",
    "status": {
      "state": "completed",
      "message": {
        "messageId": "msg-e5f6g7h8",
        "parts": [{ "kind": "text", "text": "Rate difference is $102/night ..." }]
      }
    }
  }
}

Slow: 202 Accepted

The chain overshot the early-return window. The relay handed the socket back so your frontend isn’t stuck holding it (this avoids CDN 524s, mobile-network cutoffs, and browser tab suspension). The middleware chain is still running on a background virtual thread. The body’s aggregateState is WORKING.

{
  "id": "b8a7...",
  "contextId": "17aa30cf-...",
  "aggregateState": "WORKING",
  "parentState": "CREATED",
  "messageCount": 1,
  "messages": [
    { "messageId": "msg-a1b2c3d4", "role": "user", "parts": [{ "kind": "text", "text": "..." }] }
  ],
  "tasks": [
    { "taskId": "d49a82c0-...", "sinkAgentId": "4a411d26-...", "state": "CREATED", "createdAt": "..." }
  ],
  "latestTask": {
    "id": "d49a82c0-...",
    "status": { "state": "working" }
  }
}

Start polling GET /state until aggregate flips terminal.

The 202 signal is what makes this integration robust. In the old JSON-RPC path (a2a/0.3.0), a slow orchestration meant your frontend held a 60–120 s HTTP call, and anything longer than 100 s was cut off by Cloudflare’s origin timeout (524). With 202 + polling, the initial POST always returns within ~30 s regardless of how long the chain takes.

Step 3 — Poll for the Final Reply

While the aggregate is non-terminal, poll:

GET /relay/v1/channels/{channelId}/conversations/{contextId}/state
Authorization: Bearer <channel-access-token>

Response is a ConversationState — same shape as the send response.

{
  "contextId": "17aa30cf-...",
  "aggregateState": "COMPLETED",
  "parentState": "COMPLETED",
  "messageCount": 2,
  "messages": [ /* full conversation history */ ],
  "tasks":    [ /* per-hop task summaries — see below */ ],
  "latestTask": { /* the final agent reply */ }
}

Stop polling when aggregateState is in { COMPLETED, FAILED }. Otherwise wait ~2 s and try again.

The `aggregateState` field

Value	Terminal?	What it means
`UNKNOWN`	No	Conversation just created, no tasks yet — treat as WORKING
`WORKING`	No	Chain is executing — keep polling
`HITL_HELD`	No	A human reviewer needs to approve/reject — keep polling; show a “held for review” bubble
`COMPLETED`	Yes	Read `latestTask.status.message.parts[0].text` for the final reply. A HITL-rejected chain also lands here — check `latestTask.metadata.relay_reason === 'HITL_REJECTED'` to distinguish a rejection from a normal reply
`FAILED`	Yes	Something upstream errored — surface a generic error to the user
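The table above maps naturally onto a small UI-state function. The sketch below is illustrative only: the `UiState` variants and `StateView` type are assumptions, while the field paths and the `HITL_REJECTED` check come from the table.

```typescript
type AggregateState = 'UNKNOWN' | 'WORKING' | 'HITL_HELD' | 'FAILED' | 'COMPLETED';

// Only the fields this function reads; the real ConversationState carries more.
interface StateView {
  aggregateState: AggregateState;
  latestTask?: {
    status?: { message?: { parts?: Array<{ text?: string }> } };
    metadata?: { relay_reason?: string };
  };
}

// Hypothetical UI states for a chat widget.
type UiState =
  | { kind: 'pending' }
  | { kind: 'held' }
  | { kind: 'reply'; text: string }
  | { kind: 'rejected' }
  | { kind: 'error' };

function toUiState(s: StateView): UiState {
  switch (s.aggregateState) {
    case 'UNKNOWN': // treated as WORKING, per the table
    case 'WORKING':
      return { kind: 'pending' };
    case 'HITL_HELD':
      return { kind: 'held' };
    case 'FAILED':
      return { kind: 'error' };
    case 'COMPLETED':
      // A HITL rejection also lands on COMPLETED; relay_reason tells them apart.
      if (s.latestTask?.metadata?.relay_reason === 'HITL_REJECTED') return { kind: 'rejected' };
      return { kind: 'reply', text: s.latestTask?.status?.message?.parts?.[0]?.text ?? '' };
  }
}
```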

One Conversation, Many Tasks

Under one contextId there can be many RelayTask rows — one for every delegation hop. Say your user asks the reservations agent to move a booking that’s owned by a partner tenant. Behind the scenes:

reservation_hub  → guest_directory        (task #1)
reservation_hub  → hilton_reservations    (task #2)
hilton_reservations → availability_mcp    (task #3)
hilton_reservations → bookings_mcp        (task #4)
reservation_hub  → user (final reply)

Every hop persists its own RelayTask with its own state. ConversationState.tasks[] gives you the full breakdown:

"tasks": [
  { "taskId": "d49a82c0-...", "sinkAgentId": "4a411d26-...", "state": "COMPLETED", "createdAt": "..." },
  { "taskId": "45e57e95-...", "sinkAgentId": "be5c4923-...", "state": "COMPLETED", "createdAt": "..." },
  { "taskId": "d08328ff-...", "sinkAgentId": "be5c4923-...", "state": "COMPLETED", "createdAt": "..." }
]

The relay rolls those per-task states up into a single aggregateState using this precedence:

HITL_HELD > WORKING > FAILED > COMPLETED

So if any task is HITL_HELD, the aggregate is HITL_HELD. Otherwise, if any is still WORKING, it’s WORKING. Otherwise, if any FAILED, it’s FAILED. The aggregate is COMPLETED only when every task is COMPLETED. Most frontends only need to render latestTask.status.message (the user-visible reply) and the aggregate, but the per-task breakdown is available if you want to show a progress list (“guest lookup ✓ · Hilton availability ✓ · quote pending…”).
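The precedence rule can be sketched as a first-match scan. This illustrates the rule as described here, not the relay’s actual implementation. It assumes task states are limited to the four values in the precedence list, so how the relay maps other per-task values, such as the CREATED state seen in sample payloads, is left out.

```typescript
type TaskState = 'HITL_HELD' | 'WORKING' | 'FAILED' | 'COMPLETED';
type AggregateState = TaskState | 'UNKNOWN';

// Highest precedence first: HITL_HELD > WORKING > FAILED > COMPLETED.
const PRECEDENCE: TaskState[] = ['HITL_HELD', 'WORKING', 'FAILED', 'COMPLETED'];

// Returns the highest-precedence state present among the tasks.
function rollUp(states: TaskState[]): AggregateState {
  for (const candidate of PRECEDENCE) {
    if (states.some((s) => s === candidate)) return candidate;
  }
  return 'UNKNOWN'; // no tasks yet
}
```

For example, `rollUp(['COMPLETED', 'FAILED'])` yields `'FAILED'`, while adding a `'WORKING'` task to that list yields `'WORKING'`.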

The Full Client Loop (TypeScript)

A minimal implementation of the pattern, ending in a sendUserMessage function you call once per user turn.

interface ConversationState {
  contextId: string;
  aggregateState: 'UNKNOWN' | 'WORKING' | 'HITL_HELD' | 'FAILED' | 'COMPLETED';
  latestTask?: {
    status?: { state?: string; message?: { parts?: Array<{ kind?: string; text?: string }> } };
    metadata?: { relay_reason?: string };
  };
  // ... messages[], tasks[], etc.
}

const BASE   = 'https://api.swarmd.ai';
const POLL_INTERVAL_MS = 2_000;
const POLL_DEADLINE_MS = 5 * 60 * 1_000;   // 5 min for non-HITL; extend for HITL bubbles

async function createConversation(channelId: string, agentId: string, token: string) {
  const r = await fetch(`${BASE}/relay/v1/channels/${channelId}/conversations`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({ agentId }),
  });
  if (!r.ok) throw new Error(`createConversation failed: ${r.status}`);
  return (await r.json()) as { contextId: string };
}

async function sendMessage(
  channelId: string,
  contextId: string,
  text: string,
  token: string,
): Promise<{ state: ConversationState; pending: boolean }> {
  const r = await fetch(
    `${BASE}/relay/v1/channels/${channelId}/conversations/${encodeURIComponent(contextId)}/messages`,
    {
      method: 'POST',
      headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({
        message: {
          messageId: crypto.randomUUID(),
          role: 'user',
          kind: 'message',
          contextId,
          parts: [{ kind: 'text', text }],
        },
      }),
    },
  );
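  // 200 → terminal, 202 → still running (poll). Both bodies are ConversationState.
  // Any other status (401, 403, 404, ...) is an error; see Errors and Retries.
  if (!r.ok) throw new Error(`sendMessage failed: ${r.status}`);
  const state = (await r.json()) as ConversationState;
  return { state, pending: r.status === 202 };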
}

async function pollUntilTerminal(
  channelId: string,
  contextId: string,
  token: string,
): Promise<ConversationState> {
  const deadline = Date.now() + POLL_DEADLINE_MS;
  while (Date.now() < deadline) {
    const r = await fetch(
      `${BASE}/relay/v1/channels/${channelId}/conversations/${encodeURIComponent(contextId)}/state`,
      { headers: { Authorization: `Bearer ${token}` } },
    );
    if (!r.ok) throw new Error(`state poll failed: ${r.status}`);
    const state = (await r.json()) as ConversationState;
    const agg = state.aggregateState;
    if (agg === 'COMPLETED' || agg === 'FAILED') return state;
    await new Promise(res => setTimeout(res, POLL_INTERVAL_MS));
  }
  throw new Error('polling deadline exceeded');
}

// Top-level: mint a conversation once per session; call this per user turn.
export async function sendUserMessage(
  channelId: string, contextId: string, text: string, token: string,
): Promise<string> {
  const { state, pending } = await sendMessage(channelId, contextId, text, token);
  const finalState = pending ? await pollUntilTerminal(channelId, contextId, token) : state;
  return finalState.latestTask?.status?.message?.parts?.[0]?.text ?? '[no reply]';
}

That’s the whole integration — three endpoints, one shape, one status-code branch.

HITL: What Changes for Your UI

When a policy holds a task for human review, aggregateState goes to HITL_HELD. It’s non-terminal, so your poll loop keeps ticking. The response carries an explicit relay reason so you can render the right UI:

{
  "aggregateState": "HITL_HELD",
  "latestTask": {
    "status": { "state": "working" },
    "metadata": {
      "relay_reason": "HITL_HELD",
      "policy_name": "high_value_refund",
      "policy_version": "v3",
      "policy_level": "HITL"
    }
  }
}

Two flavours you’ll see:

`relay_reason`	UI state
`HITL_HELD`	Reviewer approval required — show “Awaiting human approval” bubble
`HITL_HELD_AGENT_INPUT_REQUIRED`	Agent asking the caller for input — show “Agent needs your confirmation” bubble
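The two flavours can be mapped to bubble text with a simple lookup. The function name is hypothetical; the labels are the ones from the table above.

```typescript
// Returns the bubble label for a held task, or null when no HITL bubble applies.
function hitlBubble(relayReason: string | undefined): string | null {
  switch (relayReason) {
    case 'HITL_HELD':
      return 'Awaiting human approval';
    case 'HITL_HELD_AGENT_INPUT_REQUIRED':
      return 'Agent needs your confirmation';
    default:
      return null;
  }
}
```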

On resolution, the aggregate flips to COMPLETED in both cases — the reject signal is in latestTask:

Approved → latestTask.status.message.parts[0].text carries the agent’s reply.
Rejected → latestTask.metadata.relay_reason === 'HITL_REJECTED' and latestTask.status.state === 'canceled'; body is empty.

For HITL flows, extend your polling deadline — analyst approvals can take hours. A common pattern: 5-min deadline by default, sliding 30-min deadline while the aggregate is HITL_HELD.
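The sliding deadline can be kept as a pure function that the poll loop calls after each response. This is a sketch of one possible reading of the pattern: every poll that observes HITL_HELD pushes the deadline out to at least 30 minutes from that moment. The function and constant names are illustrative.

```typescript
type AggregateState = 'UNKNOWN' | 'WORKING' | 'HITL_HELD' | 'FAILED' | 'COMPLETED';

const DEFAULT_DEADLINE_MS = 5 * 60 * 1_000; // initial budget for non-HITL turns
const HITL_DEADLINE_MS = 30 * 60 * 1_000;   // sliding budget while held for review

// Returns the deadline to use after observing `agg` at time `now` (ms since epoch).
function nextDeadline(currentDeadline: number, agg: AggregateState, now: number): number {
  if (agg !== 'HITL_HELD') return currentDeadline;
  // Never shorten an existing deadline; only slide it forward.
  return Math.max(currentDeadline, now + HITL_DEADLINE_MS);
}
```

Inside a poll loop, this would look like `deadline = nextDeadline(deadline, state.aggregateState, Date.now())` after each response, with `deadline` starting at `Date.now() + DEFAULT_DEADLINE_MS`.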

Errors and Retries

Situation	Response	Recommended handling
Bad auth token	`401 Unauthorized`	Refresh the token, retry once
Wrong tenant / wrong channel for this contextId	`404 Not Found`	Do not retry — auth misconfiguration
Policy blocked the request at ingress	`400` or `403` with `POLICY_BLOCKED` body	Surface the policy message to the user; do not retry
Poll returns 502 / 503	Transport hiccup	Retry with backoff; abort after ~3 consecutive failures
Timed out after `POLL_DEADLINE_MS`	Client-side deadline	Show “still processing” to the user; you can resume polling later on the same contextId — the state is durable server-side

The contextId is the durable handle for a conversation. If your user closes the tab or your app crashes mid-poll, you can pick up exactly where you left off by re-polling the same GET /state URL.
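The handling rules in the table above can be encoded as a status classifier plus a backoff schedule. This is a sketch: the `Handling` labels, the exponential schedule, and the 1 s base and 8 s cap are assumptions, while the "abort after ~3 consecutive failures" threshold comes from the table.

```typescript
type Handling = 'ok' | 'refresh-token' | 'retry-backoff' | 'fatal';

// Maps an HTTP status to the recommended handling.
function classifyStatus(status: number): Handling {
  if (status >= 200 && status < 300) return 'ok';
  if (status === 401) return 'refresh-token';                    // refresh, retry once
  if (status === 502 || status === 503) return 'retry-backoff';  // transport hiccup
  return 'fatal'; // 400/403 policy blocks, 404, and anything else: do not retry
}

const MAX_CONSECUTIVE_FAILURES = 3;

// Exponential backoff: 1 s, 2 s, 4 s, ... capped at capMs (assumed values).
function backoffMs(consecutiveFailures: number, baseMs = 1_000, capMs = 8_000): number {
  return Math.min(capMs, baseMs * 2 ** (consecutiveFailures - 1));
}

function shouldAbort(consecutiveFailures: number): boolean {
  return consecutiveFailures >= MAX_CONSECUTIVE_FAILURES;
}
```

A poll loop would reset its failure counter on every `'ok'` response, so only consecutive 502/503 responses count toward the abort threshold.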

Conversation Continuity

Follow-up user turns reuse the same contextId — the agent retains full history.

POST /relay/v1/channels/{channelId}/conversations/{contextId}/messages

{
  "message": {
    "messageId": "msg-b2c3d4e5",
    "role": "user",
    "kind": "message",
    "contextId": "17aa30cf-...",
    "parts": [{ "kind": "text", "text": "What about the following week?" }]
  }
}

There’s no need to mint a new conversation per turn. One contextId per user session (or per widget mount) is the typical pattern.

Comparison with the Legacy JSON-RPC Path

If you’re on the older JSON-RPC path (POST /relay/v1/…/agents/{agentId}/a2a/0.3.0 with message/send + tasks/get), here’s what carries over:

Concept	JSON-RPC (legacy)	Conversation REST (this guide)
Session handle	`contextId` on the result envelope	`contextId` from `POST /conversations`
Send	`POST … a2a/0.3.0` with `method: "message/send"`	`POST /conversations/{ctx}/messages`
Poll	`POST … a2a/0.3.0` with `method: "tasks/get"`	`GET /conversations/{ctx}/state`
Task identity	Single `result.id` per turn	Multi-task per contextId; aggregate + per-task states
Long-running signal	`result.status.state == "working"` + `metadata.relay_reason == "TIMEOUT"`	HTTP 202 + `aggregateState == "WORKING"`

The legacy path still works and is documented at HITL Frontend Integration. New integrations should prefer the conversation REST API — it survives CDN cutoffs cleanly, is more explicit about the “keep polling” signal, and is what the upcoming first-party client library will target.

Summary

Mint a conversation once with POST /conversations. Keep the contextId.
Send messages with POST /conversations/{contextId}/messages. Inspect the HTTP status: 200 = done, 202 = poll.
Poll GET /conversations/{contextId}/state until aggregateState is COMPLETED or FAILED.
The response body is a ConversationState in every case — same shape, no polymorphism.
One contextId groups many RelayTask rows; the relay rolls them up into an aggregateState for you.
Coming soon: first-party client library that wraps all of this.

Next Steps

Your First Agent — provision a channel or user and mint an access token.
Human-in-the-Loop — background on how HITL policies feed the states above.
Monitoring and Audit — inspect the per-task audit trail behind ConversationState.tasks[].
