ADR-012 — Support Agent Architecture

Status: Accepted
Date: 2026-05-21

Context

M3 introduces the support agent that turns inbound customer messages into supervised drafts. Key decisions: call pattern, confidence scoring, intent handling, prompt structure, and history window.

Decisions

Single-call architecture

The agent uses a single LLM call per inbound message — no tool-calling loops.

ProcessInboundMessageJob already pre-fetches orders and conversation history before the agent runs. The agent receives a fully-assembled context block and returns one structured JSON response.

Why not tool-calling: ZAI GLM and Kimi (the dog-food models) have unreliable tool-calling support via OpenRouter. Single-call is deterministically testable with one Http::fake() stub.
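
The testability argument can be illustrated with a minimal sketch. It is written in Python purely for illustration (the Http::fake() reference suggests the project itself is Laravel/PHP), and `LlmClient`, `FakeLlmClient`, and `run_agent` are hypothetical names, not the project's API. The point is that one canned response is enough to exercise the whole agent path:

```python
import json
from typing import Protocol


class LlmClient(Protocol):
    """Hypothetical stand-in for LlmClientContract."""

    def complete(self, system_prompt: str, user_content: str) -> str: ...


class FakeLlmClient:
    """Test double: returns one canned response and counts calls."""

    def __init__(self, canned: str) -> None:
        self.canned = canned
        self.calls = 0

    def complete(self, system_prompt: str, user_content: str) -> str:
        self.calls += 1
        return self.canned


def run_agent(client: LlmClient, system_prompt: str, context_block: str) -> dict:
    """Exactly one LLM call per inbound message; no tool-calling loop."""
    raw = client.complete(system_prompt, context_block)
    return json.loads(raw)
```

Because there is no loop, a test needs only a single stubbed response and can assert that the client was called exactly once.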

Response schema

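The LLM must return valid JSON matching this schema. For intent and action_type the string lists the allowed values, separated by |; the remaining fields show example values: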

json
{
  "intent":        "order_status | refund_request | product_qa | other",
  "action_type":   "reply | escalate | refund | cancel | resolve",
  "confidence":    85,
  "draft":         "Hi Sarah, your order #1042 shipped yesterday...",
  "internal_note": "Customer asking about delayed delivery. Order shows shipped via Leopards."
}
  • draft is an empty string when action_type is escalate
  • internal_note is posted to the channel (Chatwoot) as an internal note: visible to the human reviewer, not to the customer
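
A minimal sketch of validating a response against this schema, in Python for illustration only. `AgentResponse` and `parse_agent_response` are hypothetical names (the project's own type is AgentResponseDTO), and raising ValueError on a mismatch is an assumption: the ADR does not say how an invalid response is handled.

```python
import json
from dataclasses import dataclass

# Allowed values, taken from the schema above.
INTENTS = ("order_status", "refund_request", "product_qa", "other")
ACTION_TYPES = ("reply", "escalate", "refund", "cancel", "resolve")


@dataclass(frozen=True)
class AgentResponse:
    intent: str
    action_type: str
    confidence: int
    draft: str
    internal_note: str


def parse_agent_response(raw: str) -> AgentResponse:
    """Parse and validate the LLM's JSON; raise ValueError on any mismatch."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")

    if data.get("intent") not in INTENTS:
        raise ValueError(f"unknown intent: {data.get('intent')!r}")
    if data.get("action_type") not in ACTION_TYPES:
        raise ValueError(f"unknown action_type: {data.get('action_type')!r}")

    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, int):
        raise ValueError("confidence must be an integer")
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")

    for field in ("draft", "internal_note"):
        if not isinstance(data.get(field), str):
            raise ValueError(f"{field} must be a string")

    return AgentResponse(
        intent=data["intent"],
        action_type=data["action_type"],
        confidence=confidence,
        draft=data["draft"],
        internal_note=data["internal_note"],
    )
```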

Confidence scoring

Self-reported: the LLM returns confidence as an integer 0–100.

Hard overrides (applied in code after parsing):

  • action_type = refund → force confidence = 0 (always requires human approval)
  • action_type = cancel → force confidence = 0 (always requires human approval)
  • confidence < 80 → set action_type = escalate, clear draft

This means refunds and cancellations always enter the approval queue, regardless of how confident the LLM claims to be.
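
The overrides can be sketched as a pure function (Python for illustration; `AgentResponse` and `apply_overrides` are hypothetical names). One point the rules above leave open is ordering: if the threshold rule ran after the forced zero, every refund would be rewritten to escalate. This sketch assumes the refund and cancel rule takes precedence and keeps action_type and draft intact, so the reviewer still sees the proposed action.

```python
from dataclasses import dataclass, replace

APPROVAL_ONLY_ACTIONS = ("refund", "cancel")
CONFIDENCE_THRESHOLD = 80


@dataclass(frozen=True)
class AgentResponse:
    intent: str
    action_type: str
    confidence: int
    draft: str
    internal_note: str


def apply_overrides(resp: AgentResponse) -> AgentResponse:
    """Apply the hard overrides to a parsed response."""
    # Refunds and cancellations: force confidence to 0 so they always
    # require human approval. Assumption: action_type and draft are kept.
    if resp.action_type in APPROVAL_ONLY_ACTIONS:
        return replace(resp, confidence=0)
    # Any other action below the threshold becomes an escalation, no draft.
    if resp.confidence < CONFIDENCE_THRESHOLD:
        return replace(resp, action_type="escalate", draft="")
    return resp
```

Keeping the function free of I/O means the override logic can be tested without stubbing the LLM at all.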

Prompt structure

System prompt (~500 tokens, static per tenant):

  • Role: "You are a customer support agent for {store_name}. Tone: {store_tone}."
  • Output contract: exact JSON schema above
  • Escalation rule: "If confidence < 80 or you cannot answer from the provided context, set confidence low and leave draft empty"
  • Hard refusal: "Never promise a refund or cancellation unless the customer has explicitly requested it"
  • Supported intents list

Runtime context block (injected per call):

<context>
Orders: [{ reference, status, items, total, placed_at }]
Products: [{ name, sku, price, stock_status }]  ← only if message contains product keywords
Conversation (last {history_window} messages):
  [customer]: ...
  [agent]: ...
</context>
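
A minimal sketch of assembling this block, in Python for illustration. `build_context_block` is a hypothetical name, and serializing orders and products as JSON is an assumption; the ADR fixes only the overall layout.

```python
import json
from typing import Sequence


def build_context_block(
    orders: Sequence[dict],
    products: Sequence[dict],
    history: Sequence[tuple[str, str]],
    history_window: int = 10,
) -> str:
    """Render the runtime context block from pre-fetched data.

    `history` holds (role, text) pairs, oldest first, already truncated
    to the history window.
    """
    lines = ["<context>", f"Orders: {json.dumps(list(orders))}"]
    # The Products line appears only when a product lookup was made.
    if products:
        lines.append(f"Products: {json.dumps(list(products))}")
    lines.append(f"Conversation (last {history_window} messages):")
    for role, text in history:
        lines.append(f"  [{role}]: {text}")
    lines.append("</context>")
    return "\n".join(lines)
```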

History window

Default: last 10 messages (5 exchanges). Configurable per tenant via agent_config.history_window.

Truncation: oldest messages dropped first. The current inbound message is always included.
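
The truncation rule amounts to keeping the tail of the conversation. A sketch in Python for illustration, where `window_history` is a hypothetical name. It assumes messages are ordered oldest to newest, that the list ends with the current inbound message, and that this message counts toward the window; the ADR does not state whether it does.

```python
def window_history(messages: list[str], history_window: int = 10) -> list[str]:
    """Keep the newest `history_window` messages; the oldest drop first.

    Because the current inbound message is last in the list, any window
    of at least 1 always includes it.
    """
    if history_window < 1:
        raise ValueError("history_window must be at least 1")
    return messages[-history_window:]
```

With 12 messages and the default window, the two oldest are dropped and the remaining 10 keep their order.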

Consequences

  • SupportAgentService builds the context block and system prompt, calls LlmClientContract, applies hard confidence overrides
  • ProcessInboundMessageJob delegates to SupportAgentService (replaces the // TODO (M3) stub)
  • AgentResponseDTO is a typed DTO carrying all five response fields
  • Product search (ProductAdapterContract::searchProducts) is called only when the message body contains product-like keywords — avoids unnecessary API calls for pure order-status queries
  • Tests: Http::fake() the OpenRouter endpoint; assert correct ApprovalRequest payload and confidence override logic
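
The keyword gate in front of the product search could look like the sketch below, written in Python for illustration. The keyword list and the name `needs_product_search` are hypothetical; the ADR does not define which keywords count as product-like.

```python
import re

# Hypothetical keyword list, for illustration only.
PRODUCT_KEYWORDS = ("price", "stock", "available", "size", "color", "colour")

# Whole-word, case-insensitive match on any keyword.
_PRODUCT_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in PRODUCT_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def needs_product_search(message_body: str) -> bool:
    """Return True when the message mentions a product-like keyword."""
    return _PRODUCT_PATTERN.search(message_body) is not None
```

Only when this returns True would the caller invoke the product search; a pure order-status message skips the extra API call.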