# Keep Your Agent Accurate and On-Brand with Guardrails

> Prevent hallucinations and keep your agent on-brand using approved-source grounding, confidence thresholds, topic controls, and escalation rules.

Guardrails keep your agent trustworthy. Without them, an AI agent can drift into guesswork, touch topics it shouldn't, or respond in ways that conflict with your brand or policies. MerchantAI's guardrails address each of these risks by grounding every answer in sources you have approved, requiring a minimum confidence score before the agent responds, blocking topics you have ruled out, and handing the conversation to a human whenever the agent encounters something it shouldn't handle alone.

## Approved-source grounding

Every answer your agent gives must be traceable back to a source you have approved in the knowledge base. MerchantAI does not draw on general world knowledge, training data, or any content outside of your approved sources when forming a response. If a visitor asks a question and the agent cannot construct a confident answer from your approved content, it will escalate the conversation rather than guess.

This means you control the scope of what your agent knows, and your visitors only ever receive answers backed by your content.
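To make the grounding rule concrete, here is a minimal sketch of the decision it implies. It is not MerchantAI's implementation: the `Passage` type, the source IDs, and the `answer_or_escalate` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """A retrieved snippet and the source it came from (hypothetical shape)."""
    source_id: str
    text: str


def answer_or_escalate(retrieved, approved_source_ids):
    """Answer only from approved sources; escalate when none are usable."""
    # Discard anything that does not come from an approved source.
    usable = [p for p in retrieved if p.source_id in approved_source_ids]
    if not usable:
        # Nothing approved to ground the answer in, so do not guess.
        return {"action": "escalate", "sources": []}
    # Every answer carries the IDs of the sources that back it.
    return {"action": "answer", "sources": [p.source_id for p in usable]}
```

In this sketch, a passage from an unapproved source is dropped before any answer is formed, so the list of cited sources can only ever contain content you approved.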

## Confidence thresholds

MerchantAI assigns a confidence score to every answer it considers giving. You can configure the minimum score required before the agent delivers a response in **Configuration → Guardrails → Confidence Threshold**. If the score falls below your threshold, the agent escalates to a human rather than returning a low-confidence answer.
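The threshold check described above can be sketched as a single comparison. This is an illustration only: the `decide` function is hypothetical, and the 0.0 to 1.0 scale is an assumption, since the scale MerchantAI uses is not stated here.

```python
def decide(confidence: float, threshold: float) -> str:
    """Return "respond" or "escalate" for one candidate answer.

    Assumes both values are on a 0.0 to 1.0 scale; the real scale may differ.
    """
    for name, value in (("confidence", confidence), ("threshold", threshold)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    # Only a score strictly below the threshold escalates.
    return "escalate" if confidence < threshold else "respond"
```

With a threshold of `0.75`, a score of `0.82` yields `"respond"` and a score of `0.60` yields `"escalate"`. A score exactly equal to the threshold responds in this sketch, because only scores below the threshold escalate.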

<Accordion title="Choosing the right threshold">
  A higher threshold means the agent escalates more often but is less likely to give an incorrect or misleading answer. A lower threshold means the agent handles more queries on its own but carries a greater risk of an incorrect or misleading answer.
</Accordion>

<Tip>
  Start with a strict (high) confidence threshold when you first go live. Review the missed-answer and escalation logs in Analytics after your first week, then lower the threshold incrementally as your knowledge base matures and you confirm the agent is performing well on the topics you care about most.
</Tip>
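One way to reason about loosening the threshold is to replay reviewed conversations against a few candidate values. The sketch below assumes you have a list of confidence scores paired with your own verdict on whether each answer was correct; that data shape and the `summarize_threshold` function are hypothetical and do not reflect an Analytics export format.

```python
def summarize_threshold(reviewed, threshold):
    """Summarize what a candidate threshold would have done.

    reviewed: list of (confidence, was_correct) pairs from manual review.
    """
    if not reviewed:
        raise ValueError("reviewed must contain at least one conversation")
    # Answers at or above the threshold would have been delivered.
    answered = [was_correct for score, was_correct in reviewed if score >= threshold]
    escalated = len(reviewed) - len(answered)
    return {
        "threshold": threshold,
        "escalation_rate": escalated / len(reviewed),
        # Accuracy is undefined when nothing would have been answered.
        "answered_accuracy": sum(answered) / len(answered) if answered else None,
    }


# Illustrative data only, not real results.
reviewed = [(0.95, True), (0.88, True), (0.72, True), (0.61, False), (0.45, False)]
for candidate in (0.9, 0.7, 0.5):
    print(summarize_threshold(reviewed, candidate))
```

On this made-up sample, lowering the threshold from `0.9` to `0.7` cuts escalations without delivering any incorrect answers, while `0.5` starts to let an incorrect answer through. That is the point at which you would stop loosening.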

## Topic controls

You can prevent your agent from engaging with specific topics entirely by adding them to the topic blocklist. Any query that matches a blocked topic receives a polite decline or triggers a handover to a human, depending on how you configure the rule.

Common topics to consider blocking or controlling:

* **Competitor mentions** — Decline to compare your products with competitors or discuss competitor pricing.
* **Off-topic questions** — Redirect visitors who ask questions unrelated to your business.
* **Sensitive subjects** — Block topics such as medical advice, legal guidance, or financial advice that fall outside your scope.
* **Specific keywords** — Block individual terms or phrases that should never appear in agent responses.

To manage your blocklist, go to **Configuration → Guardrails → Topic Controls**.
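As a rough mental model of a blocklist rule, the sketch below pairs each blocked topic with an action. It is deliberately simplified: it uses plain substring matching, whereas how MerchantAI actually detects topics is not described here, and the `TopicRule` type and example phrases are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class TopicRule:
    """One blocklist entry (hypothetical shape)."""
    topic: str
    phrases: Tuple[str, ...]
    action: str  # "decline" or "handover"


def match_blocked_topic(query: str, rules: Sequence[TopicRule]) -> Optional[TopicRule]:
    """Return the first rule whose phrase appears in the query, else None."""
    normalized = query.casefold()  # case-insensitive comparison
    for rule in rules:
        if any(phrase.casefold() in normalized for phrase in rule.phrases):
            return rule
    return None


rules = [
    TopicRule("competitors", ("acme corp",), "decline"),
    TopicRule("medical advice", ("dosage", "diagnosis"), "handover"),
]
```

Here a question mentioning "ACME Corp" would be politely declined, a question about dosage would be handed over, and an unrelated question such as "Where is my order?" would match no rule and proceed as normal.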

## Profanity filters

MerchantAI includes a built-in profanity filter that activates automatically. When a visitor sends abusive or profane input, the agent responds calmly, without repeating the language or mirroring the visitor's tone. If the interaction continues to be abusive, the agent may offer a handover to a human team member. You do not need to configure this filter: it is always on.

## Escalation rules

Escalation rules define the precise conditions under which a conversation is handed over to your human team. You can configure rules based on:

* **Confidence falling below threshold** — the primary guardrail trigger
* **Blocked topic matched** — any query that hits your topic blocklist
* **Specific keywords detected** — trigger words that indicate the visitor needs human support
* **Repeated failed answers** — the agent has tried and failed to answer the same query multiple times

To view and edit your escalation rules, go to **Configuration → Guardrails → Escalation Rules**.
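The four triggers listed above can be pictured as independent checks on the current state of a conversation. The sketch below is an illustration under assumed names: `TurnState`, its fields, the reason labels, and the `max_failed_attempts` parameter are all hypothetical and do not correspond to MerchantAI settings.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TurnState:
    """What is known about the conversation at this turn (hypothetical shape)."""
    confidence: float            # score of the best candidate answer
    blocked_topic_matched: bool  # result of the topic blocklist check
    message: str                 # the visitor's latest message
    failed_attempts: int         # failed answers to the same query so far


def escalation_reasons(
    state: TurnState,
    threshold: float,
    trigger_keywords: Sequence[str],
    max_failed_attempts: int,
) -> List[str]:
    """Return every rule that fires; an empty list means no handover."""
    reasons = []
    if state.confidence < threshold:
        reasons.append("low_confidence")
    if state.blocked_topic_matched:
        reasons.append("blocked_topic")
    text = state.message.casefold()
    if any(keyword.casefold() in text for keyword in trigger_keywords):
        reasons.append("trigger_keyword")
    if state.failed_attempts >= max_failed_attempts:
        reasons.append("repeated_failures")
    return reasons
```

Returning every matching reason, rather than stopping at the first, is a design choice made for this sketch: it lets the human who picks up the conversation see all the conditions that led to the handover.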

<Note>
  Topic controls and confidence threshold settings are available on all paid plans. Free plan users have access to the default guardrail configuration, including profanity filtering and approved-source grounding, but cannot customise thresholds or add topic blocklist entries.
</Note>