Summary: What You’ll Find Inside
We built a fully custom agentic AI chatbot for our own website, and this post breaks down exactly how we did it. We cover why we chose the ReAct architecture over simpler approaches, how we structured specialized domain tools instead of one bloated system prompt, and how dual-layer memory makes follow-up questions feel natural. We also walk through the guardrails keeping the bot on topic, the CRM integration that turns every conversation into a captured lead, and the observability setup that makes debugging straightforward. Real architectural decisions, real tradeoffs, and real production code throughout.

The chatbot sitting on the main page of the Genetech Solutions website is not a widget from a SaaS platform. It is not a scripted FAQ bot. It is a fully custom agentic AI system built by our own team, running in production, handling real queries from real visitors every day. It reasons. It routes. It remembers. And when the moment is right, it captures a lead and notifies the right person on our team within seconds.
Across this series, we have covered what agentic AI is and why it matters for business leaders [link], how to choose the right frameworks, and the different types of agentic systems. This episode is where we get concrete. We are pulling back the curtain on our own chatbot — the decisions we made, the architecture we chose, the code we wrote, and what we would do differently if we were starting today.
This is not a high-level overview. Throughout this breakdown, we will show you exactly how the system works, including real architectural decisions, implementation patterns, and production code snippets from the chatbot running on our website today. If you are building something similar, you should be able to reuse parts of this directly.
If you are a developer building something similar, the technical sections will give you a working reference. If you are a business or product leader trying to understand whether this kind of system makes sense for your organization, the earlier sections will tell you what questions to ask and why the answers matter.
Introducing: The Genetech Solutions Agentic AI Chatbot
The Genetech Solutions website has a custom-built agentic AI chatbot that handles visitor queries, captures leads, and routes conversations to the right team — all without a human on standby.
What it does
At its core, the chatbot answers any question a visitor might have about Genetech — services, pricing, portfolio, team, industries, and contact information. But it goes well beyond a knowledge base. It actively guides visitors toward the next step: booking a consultation, connecting with the sales team, submitting a project brief, or reporting an issue.
How it works

Video: Introducing the agentic AI chatbot
The chatbot operates on two tracks. For common queries and predictable journeys, it uses button-based tree navigation with no LLM involved, which keeps responses instant and costs low. For open-ended, free-form questions, it invokes the language model, which reasons across the conversation and routes to the appropriate response or action. Both tracks terminate at the same place: a form that captures structured data.
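The two-track split can be sketched as a small dispatcher. This is a minimal illustration, not the production code: `BUTTON_TREE`, `handle_event`, and the `ask_llm` callback are hypothetical names, and the menu entries are placeholders.

```python
from typing import Callable, Dict

# Hypothetical static menu tree: button id -> canned reply and next form (if any).
BUTTON_TREE: Dict[str, dict] = {
    "services": {"reply": "Here are our service categories.", "form": None},
    "get_quote": {"reply": "", "form": "lead"},
}


def handle_event(event: dict, ask_llm: Callable[[str], str]) -> dict:
    """Route button clicks to the static tree and free text to the LLM."""
    if event["type"] == "button":
        node = BUTTON_TREE.get(event["value"])
        if node is not None:
            # Track 1: no model call, so the reply is instant and costs nothing.
            return {"source": "tree", "response": node["reply"], "show_form": node["form"]}
    # Track 2: free-form text (or an unknown button) goes to the agent.
    return {"source": "llm", "response": ask_llm(event["value"]), "show_form": None}


def fake_llm(text: str) -> str:
    """Stub standing in for the real agent call."""
    return f"LLM answer for: {text}"
```

Falling back to the LLM for an unrecognized button is one possible design choice; a stricter system might return an error instead.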
Every conversation path ends in a form — lead generation, consultation scheduling, sales connection, or issue reporting. The forms are NLP-driven, meaning the agent extracts relevant information from the conversation before presenting them, so visitors are not repeating themselves.

Video: Interacting with the Genetech Solutions agentic AI chatbot
Guardrails keep the LLM focused on Genetech. It will not invent services that do not exist, drift into off-topic territory, or produce responses that contradict what is on the website.
Personalized engagement
Beyond the chat widget, the system includes behavior-triggered pop-ups. When a visitor reads a specific service page and scrolls past 60% or shows exit intent, a contextual prompt appears tied to exactly what they were reading — with two CTAs: connect with the sales team, or access a relevant resource like a checklist guide.
CRM integration

Image: Every lead falls into the integrated company website CRM
Every form submission flows into the Genetech CRM with the full conversation history attached. Leads, sales queries, consultation requests, and issues each land in separate tables so every team sees only what is relevant to them. The conversation transcript paired with the form entry means the sales team picks up a lead knowing exactly what the visitor was looking for, what questions they asked, and where the conversation went.
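One way to picture the separate-tables routing is a lookup from form type to table. The table names are the ones this post uses for the CRM (leads, consultant, sales, issues); the `FORM_TABLES` mapping, the `build_insert` helper, and the partial column lists are assumptions made for illustration.

```python
from typing import Dict, Tuple

# Form type -> (table, columns). Column lists are partial and illustrative.
FORM_TABLES: Dict[str, Tuple[str, Tuple[str, ...]]] = {
    "lead": ("leads", ("name", "email", "phone")),
    "consultation": ("consultant", ("name", "email")),
    "sales": ("sales", ("name", "email", "reason_for_connecting")),
    "issue": ("issues", ("issue_description",)),
}


def build_insert(form_type: str, session_id: str, form_data: Dict[str, str]) -> Tuple[str, tuple]:
    """Return a parameterized INSERT and its values for one form submission."""
    table, columns = FORM_TABLES[form_type]  # KeyError on an unknown form type
    all_columns = ("session_id",) + columns
    placeholders = ", ".join(["%s"] * len(all_columns))
    sql = f"INSERT INTO {table} ({', '.join(all_columns)}) VALUES ({placeholders})"
    # Values are passed separately so the database driver handles escaping.
    values = (session_id,) + tuple(form_data[c] for c in columns)
    return sql, values
```

Table and column names come from the fixed mapping, never from user input, so only the values need to be parameterized.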
Who it is for
Any visitor to the Genetech website — developers evaluating technical capabilities, founders comparing vendors, operations leads with specific questions, or businesses exploring AI and software services for the first time. The agentic AI chatbot meets each of them where they are.
Now: What Kind of AI Agent Is This?
Before we go into how we built it, it is worth being clear about what category of system this is — because not all agentic AI is built the same way, and the category you are building in determines most of your technical decisions.
Agentic AI systems generally fall into six types:
- Conversational AI Agents — chat-based systems that use an LLM to reason across multi-turn conversations, understand intent, and respond or act accordingly. This is what the Genetech agentic AI chatbot is.
- Task Automation Agents — systems that execute structured, multi-step workflows triggered by an event. Invoice processing, lead routing, employee onboarding.
- Multi-Agent Systems — teams of specialist agents coordinated by an orchestrator. Each agent handles a specific domain; the orchestrator manages the workflow.
- RAG-Based Knowledge Agents — agents connected to a vector database of proprietary documents. They retrieve relevant information before generating a response, grounding the LLM in your actual data.
- Code Agents — agents that write, execute, test, and debug code in a sandboxed environment. Used in software development and QA pipelines.
- Computer Use / Browser Agents — agents that interact with websites and applications the way a human does: clicking, navigating, filling forms. Used when no API exists.
The Genetech chatbot is a conversational AI agent. Its primary mode of interaction is natural language. Its job is to understand what a visitor wants, route them through the right workflow, and capture structured data at the end. But it borrows from other types too — it uses task automation patterns for form routing and CRM writes, and RAG principles in how its tools are structured around curated knowledge bases.
Understanding which type you are building is not an academic exercise. It tells you which framework to use, how to structure your memory system, what your testing approach should look like, and where the most likely failure points are. We are building a conversational agent, so everything that follows is shaped by that.
What Makes a Chatbot ‘Agentic’?
The word gets used loosely, so it is worth being precise. A traditional chatbot follows a script — if the user says X, respond with Y. It does not interpret. It does not adapt. It does not remember what you said two messages ago. It matches inputs to pre-written outputs.
A conversational agentic AI chatbot works differently. At its core is an LLM acting as a reasoning engine. Instead of matching to a script, it reads the conversation, classifies what the user wants, selects the appropriate tool or response path, and generates an output tailored to that specific moment. When the next message arrives, it does the same thing again — but now with the full context of everything that came before.
The three capabilities that define a genuinely agentic conversational system are:
- Reasoning across turns — the agent understands follow-up questions, pronoun references, and context shifts. ‘What is the pricing for that?’ only makes sense if the agent remembers what ‘that’ refers to.
- Dynamic tool selection — rather than a fixed pipeline, the agent decides in real time which capability to invoke based on what the user actually said.
- Goal-directed action — the agent is not just generating responses. It is working toward an outcome: in our case, capturing a qualified lead or booking a consultation.
A chatbot that lacks any of these is not really an agent — it is a more sophisticated FAQ bot. The Genetech chatbot has all three. Here is how we built it.
Before You Write a Single Line of Code
Most agentic AI projects that struggle in production share a common cause: the team started building before they finished thinking. The most valuable investment you can make before touching any framework is a clear, written answer to four questions.
What is the specific problem this agentic AI chatbot is solving?
Not ‘we want AI on our website.’ Something specific: visitors are landing on our services pages and leaving without a conversion action, and we have no way to capture intent or route them appropriately at scale. The more precisely you can state the problem, the more clearly the solution reveals itself.
For Genetech, the problem was conversion and intent segmentation. That single definition told us we needed lead capture forms, routing logic by service type, and a way to engage visitors before they left. Everything in the architecture traces back to that problem statement.
Who is going to use it, and what do they expect?
A chatbot serving technical users can tolerate more friction than one serving business decision-makers who are evaluating vendors. The Genetech agentic AI chatbot serves a mixed audience — developers, founders, operations leads, marketers — and the interaction design reflects that. The button navigation handles users who want to move quickly without typing. The open query input is designed for users with specific questions. Both paths are necessary because the audience is not uniform.
What data do you need to capture, and where does it need to go?
Agentic chatbots are data pipelines as much as they are conversation interfaces. Define upfront what you need to collect — name, email, query type, service interest, issue description — and where each type of data needs to land. If you design conversation flows first and figure out the data destination later, you will build the wrong forms, in the wrong order, connected to nothing useful.
How will you measure whether it is working?
Conversion rate on leads captured. Time to first response for sales follow-up. Drop-off rate at each stage of the conversation flow. Define these before you deploy. Without baseline metrics, you cannot make the case that the system is working, and you cannot identify what to improve.
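The drop-off and conversion metrics are simple ratios over stage counts. The sketch below assumes you can count how many sessions reached each stage; the stage names and numbers are invented for illustration and are not real traffic data.

```python
from typing import Dict, List, Tuple


def funnel_metrics(stage_counts: List[Tuple[str, int]]) -> Dict[str, float]:
    """Compute per-stage drop-off and overall conversion from ordered stage counts."""
    metrics: Dict[str, float] = {}
    for (prev_name, prev_n), (name, n) in zip(stage_counts, stage_counts[1:]):
        # Share of sessions that reached prev_name but not the next stage.
        metrics[f"dropoff:{prev_name}->{name}"] = 1 - n / prev_n if prev_n else 0.0
    first, last = stage_counts[0][1], stage_counts[-1][1]
    metrics["conversion"] = last / first if first else 0.0
    return metrics


# Illustrative numbers only.
example = [("chat_opened", 200), ("form_shown", 50), ("form_submitted", 20)]
result = funnel_metrics(example)
print(result)
```

Capturing these numbers before launch gives the baseline that later comparisons depend on.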
With these four questions answered, the architecture choices become much cleaner. For Genetech, the answers pointed directly to a ReAct agent with specialized domain tools, dual-layer memory, and a CRM-connected form system. Here is how each of those pieces was built.
1. Building the Agentic AI Chatbot Workflow (Architecture)
1.1 Choosing the Right Architecture: Why ReAct?
When building the Genetech agentic AI chatbot, we evaluated several agentic architectures: Chain-of-Thought prompting, simple tool-use pipelines, plan-and-execute agents, and routing-based multi-agent systems. Each has its place, but for a customer-facing chatbot that must answer diverse questions, collect leads, route to the right information, and handle follow-up context, we selected the ReAct (Reasoning + Acting) architecture.
What is ReAct Architecture?
ReAct is an agent pattern that interleaves reasoning (Thought) with action (Act) in an iterative loop. Instead of executing a fixed pipeline, the LLM reasons about what to do next, calls a tool, observes the result, and reasons again — continuing this loop until it produces a final answer.
| Thought → Act → Observe → Thought → Act → Observe → … → Final Answer |
This is fundamentally different from a static prompt pipeline. The agent dynamically decides which tools to invoke based on user intent, conversation history, and tool outputs.
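To make the loop concrete, here is a framework-free sketch of Thought → Act → Observe. It is not how LangGraph implements the pattern: `fake_policy` stands in for the LLM and `TOOLS` holds two toy tools, all of them hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Toy tools; a real agent would call domain tools here.
TOOLS: Dict[str, Callable[[str], str]] = {
    "pricing_info": lambda q: "Hourly and project-based models are available.",
    "services_info": lambda q: "Web, mobile, and AI development.",
}


def fake_policy(question: str, observations: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: decide the next action from what is known so far."""
    if observations:
        # Enough information gathered: finish with an answer.
        return "final", observations[-1]
    tool = "pricing_info" if "pric" in question.lower() else "services_info"
    return "act", tool


def react_loop(question: str, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Run Thought -> Act -> Observe until the policy returns a final answer."""
    observations: List[str] = []
    trace: List[str] = []
    for _ in range(max_steps):
        kind, value = fake_policy(question, observations)  # Thought
        if kind == "final":
            trace.append("final")
            return value, trace
        trace.append(f"act:{value}")
        observations.append(TOOLS[value](question))  # Act, then Observe
    # Step budget exhausted without a final answer.
    return "Sorry, I could not complete that request.", trace
```

The `max_steps` cap is worth keeping in any real loop, since an agent that never decides it is finished would otherwise call tools indefinitely.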
Why Not the Alternatives?
Before settling on ReAct, we evaluated other approaches:
- Simple Chain Prompting — a single LLM call with a large system prompt containing all company knowledge. The problem: massive context bloat, high token cost, no structured action capability.
- Plan-and-Execute — the agent plans all steps upfront, then executes. The problem: brittle for conversational flows where the user changes direction mid-conversation.
- Static Tool Routing — hardcoded rules to route queries to handlers. The problem: fails on nuanced or ambiguous queries and cannot handle follow-ups naturally.
- Multi-Agent Systems — multiple specialized agents coordinating. The problem: overkill complexity for a single-domain chatbot; harder to debug and maintain.
ReAct addresses each of these by letting the LLM reason about user intent and dynamically select the right tool within a single, coherent reasoning loop.
How the Main LLM Acts as a Query Router
In the Genetech agentic AI chatbot, GPT-4o serves as the central orchestrator. It receives every user message along with the conversation history and the full list of available tools. It then reads the user’s intent, reasons about which specialized tool can best answer, calls that tool with relevant parameters, observes the output, and formulates a final response. The LLM is not just a text generator — it is a dynamic query router that dispatches to specialized domain handlers.
1.2 ReAct Agent Flow Diagram

Diagram: Each iteration of the Reason → Act → Observe loop lets the agent adjust based on tool results
1.3 Code Snippet: ReAct Agent Initialization in LangGraph
# ── ReAct Agent Initialization (app.py) ──────────────────────────────────
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

# Short-Term Memory (STM): persists conversation per session thread
checkpointer = MemorySaver()

# LLM Orchestrator: GPT-4o as the central reasoning engine
llm = ChatOpenAI(
    model='gpt-4o',
    temperature=0.0,  # Low-variance routing; does not by itself rule out hallucination
    max_tokens=500
)

# Register all specialized domain tools (each is defined elsewhere with @tool)
tools = [
    handle_greeting_feedbacks,           # Greetings & casual messages
    services_info,                       # Knowledge: Genetech services
    company_portfolio,                   # Knowledge: Portfolio by domain
    Company_Industries,                  # Knowledge: Industries served
    pricing_info,                        # Knowledge: Pricing models
    company_contact_info,                # Knowledge: Office locations
    Awards,                              # Knowledge: Company awards
    about_genetech,                      # Knowledge: Company overview
    show_lead_form,                      # Action: Capture project lead
    show_consultation_form,              # Action: Capture consultation
    show_sales_form,                     # Action: Connect to sales team
    show_issue_form,                     # Action: Report website issue
    handle_irrelevant_queries,           # Guardrail: Out-of-scope filter
    business_recommendations_solutions,  # Knowledge: Industry-to-service recommendations
    looking_job_opportunity,             # Knowledge: Job opportunities
    client_reviews_and_ratings,          # Knowledge: Client reviews and ratings
    Podcast_Blogs,                       # Knowledge: Podcast and blog content
    Company_Founder_and_Teams,           # Knowledge: Founder and teams
]

# Create the ReAct Agent with LangGraph
agent_executor = create_react_agent(
    model=llm,
    tools=tools,
    checkpointer=checkpointer,    # Enables STM across turns
    state_modifier=system_prompt  # Company context injected (newer LangGraph releases name this parameter `prompt`)
)

# Invoke with per-session thread ID for isolated memory
# (thread_id and user_input come from the incoming chat request)
config = {'configurable': {'thread_id': thread_id}}
response = agent_executor.invoke(
    {'messages': [HumanMessage(content=user_input)]},
    config=config
)
Code: ReAct agent initialization with GPT-4o as the reasoning engine, all domain tools registered, and MemorySaver for session persistence
2. Binding the ReAct Agent with Specialized Domain Tools
2.1 Why Specialized Tools Instead of One Big System Prompt?
The most common mistake in building LLM-powered chatbots is loading everything into a single, monolithic system prompt. Imagine injecting all of Genetech’s service descriptions, pricing tables, portfolio links, industry data, contact information, awards, and CRM form logic into one giant context window. This creates two critical problems.
| Context Rot: As the context grows, the LLM’s attention degrades on earlier parts of the prompt. Important instructions get ‘forgotten’ in long conversations. |
| Context Bloat: Each API call becomes expensive and slow because unnecessary tokens are processed on every single turn. |
The solution is specialization. Instead of one large context, we define compact, focused tools — each a domain expert with its own curated knowledge. The ReAct agent only loads what it needs for the current query. This is both cheaper and more accurate.
- Reduced Token Cost — only the relevant tool’s knowledge is processed per query
- Higher Accuracy — each tool is optimized for its specific domain, no dilution
- Easier Maintenance — update a tool’s knowledge without touching the entire agent
- Better Testability — each tool can be unit-tested in isolation
- Cleaner Guardrails — out-of-scope queries are rejected by a dedicated tool
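The token-cost argument is back-of-the-envelope arithmetic. The sketch below compares input tokens per turn for the two designs; every size in it is a made-up placeholder, so measure your own prompts with a tokenizer before drawing conclusions.

```python
from typing import Dict


def tokens_per_turn_monolithic(knowledge_tokens: Dict[str, int], message_tokens: int) -> int:
    """Every turn carries the knowledge of every domain."""
    return sum(knowledge_tokens.values()) + message_tokens


def tokens_per_turn_specialized(
    knowledge_tokens: Dict[str, int],
    selected: str,
    router_tokens: int,
    message_tokens: int,
) -> int:
    """Every turn carries a small router prompt plus one domain's knowledge."""
    return router_tokens + knowledge_tokens[selected] + message_tokens


# Illustrative sizes only, not measurements from the Genetech chatbot.
sizes = {"services": 3000, "pricing": 800, "portfolio": 2000, "contact": 300}
mono = tokens_per_turn_monolithic(sizes, message_tokens=50)
spec = tokens_per_turn_specialized(sizes, "pricing", router_tokens=600, message_tokens=50)
print(mono, spec)
```

This deliberately ignores output tokens and the fact that a tool making its own LLM call sees the user message a second time, so treat it as a rough comparison rather than a billing estimate.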
2.2 Tool Categories in the Genetech Agentic AI Chatbot
A) Knowledge Base Tools
Knowledge tools provide domain-specific information by combining curated company data with LLM-powered natural language generation. Each tool maintains its own prompt template and knowledge base:
- services_info — deep knowledge of all 16+ Genetech services. Uses a dedicated services_data string and a PromptTemplate to generate context-aware responses.
- company_portfolio — validates whether Genetech offers a service or industry, then provides the appropriate portfolio link. Maintains a 66+ industry catalog.
- Company_Industries — industry-specific deep dives for Healthcare, Fintech, Logistics, Education, and Hospitality, plus portfolio links for 50+ additional verticals.
- pricing_info — handles all pricing queries: hourly (USD 50), dedicated resources (USD 2,500–4,000/month), and project-based estimates. Includes objection handling.
- company_contact_info — contact details for 4 offices (2 Karachi, 1 Skardu, 1 Ann Arbor USA) with strict social media policy enforcement.
- about_genetech, Awards, Company_Founder_and_Teams, client_reviews_and_ratings, Podcast_Blogs, looking_job_opportunity — each serves a focused information domain.
- business_recommendations_solutions — maps user industry context to the most relevant Genetech services. Triggered when a user says ‘I am in healthcare, how can you help me?’
B) Action-Based Tools (CRM Integration)
Action tools trigger interactive forms that collect leads and save data directly into PostgreSQL CRM tables. These are the bridge between conversation and business value:
- show_lead_form — triggered when a user wants to hire Genetech for a software project. Returns ‘SHOW_LEAD_FORM’ signal; the Flask backend detects this and renders the lead capture form. Data is saved to the leads table.
- show_consultation_form — captures consultation requests. Saved to the consultant table with session_id, name, email, and consultation type.
- show_sales_form — connects users with the sales team. Saves name, email, and reason_for_connecting to the sales table.
- show_issue_form — reports Genetech-specific issues. Saved to the issues table with issue_description.
| Every form submission is tracked with the user’s session_id, enabling full conversation-to-lead tracing. The ConversationTracker saves every user message, bot response, button click, and form submission — creating a complete client interaction audit trail for lead generation analytics. |
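Tracing a lead back to its conversation is then a lookup by session_id. The sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs anywhere; the tables are trimmed to a few columns, the sample rows are invented, and `transcript_for_lead` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # Stand-in for the PostgreSQL CRM
conn.executescript("""
CREATE TABLE leads (id INTEGER PRIMARY KEY, name TEXT, email TEXT, session_id TEXT);
CREATE TABLE conversation_history (
    id INTEGER PRIMARY KEY, session_id TEXT, message_type TEXT,
    user_message TEXT, bot_response TEXT
);
""")
conn.execute(
    "INSERT INTO leads (name, email, session_id) VALUES (?, ?, ?)",
    ("Ada", "ada@example.com", "s-1"),
)
conn.executemany(
    "INSERT INTO conversation_history "
    "(session_id, message_type, user_message, bot_response) VALUES (?, ?, ?, ?)",
    [
        ("s-1", "user_message", "Do you build mobile apps?", None),
        ("s-1", "bot_response", None, "Yes, for iOS and Android."),
        ("s-2", "user_message", "Unrelated session", None),
    ],
)


def transcript_for_lead(email: str) -> list:
    """Return the ordered conversation rows that belong to a lead's session."""
    return conn.execute(
        """
        SELECT h.message_type, COALESCE(h.user_message, h.bot_response)
        FROM leads AS l
        JOIN conversation_history AS h ON h.session_id = l.session_id
        WHERE l.email = ?
        ORDER BY h.id
        """,
        (email,),
    ).fetchall()
```

SQLite uses `?` placeholders where psycopg2 uses `%s`; the join itself carries over to PostgreSQL unchanged.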
2.3 Tool Connection Diagram — ReAct Loop
The following diagram shows how tools are dynamically selected inside the ReAct loop:

Diagram: The LLM selects the appropriate tool dynamically — based on intent, not hardcoded rules
2.4 Code Snippet: Tool Registration and Binding
Each tool is defined using LangChain’s @tool decorator. Below are examples of both a Knowledge Base tool and an Action-based tool from the Genetech chatbot:
# ── Knowledge Base Tool: services_info ──────────────────────────────────
from datetime import datetime
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.prompts import PromptTemplate
from langchain_core.tools import tool

@tool
def services_info(user_message: str = '') -> str:
    '''AI-Powered tool for Genetech service inquiries.
    Handles: web development, mobile, AI, DevOps, cybersecurity queries.
    '''
    services_data = '''
    1. Web Development: React, Node.js, Python - custom websites & web apps
    2. Mobile App Development: iOS, Android, Flutter - cross-platform solutions
    3. AI Integration: ML, Chatbots, NLP, Computer Vision, Generative AI
    4. Cybersecurity: Security audits, penetration testing, compliance
    5. DevOps: CI/CD, Docker, Kubernetes, infrastructure automation
    ... (16+ services with detailed descriptions)
    '''
    prompt = PromptTemplate(
        template='You are Genetech Solutions AI assistant... {services_data}... {user_message}',
        input_variables=['user_message', 'services_data']
    )
    chain = prompt | llm  # llm is the ChatOpenAI instance created in app.py
    return chain.invoke({'user_message': user_message, 'services_data': services_data}).content

# ── Action-Based Tool: show_lead_form ────────────────────────────────────
@tool
def show_lead_form() -> str:
    '''Triggered when user wants to hire Genetech for software development.
    Returns signal string that Flask backend intercepts to render the form UI.
    '''
    return 'SHOW_LEAD_FORM'  # Signal detected in process_user_message()

# ── Flask: Signal Detection and Form Routing ─────────────────────────────
def process_user_message(user_input, client_session_id, thread_id):
    config = {'configurable': {'thread_id': thread_id}}
    response = agent_executor.invoke(
        {'messages': [HumanMessage(content=user_input)]},
        config=config
    )
    # Scan the most recent messages for a form signal emitted by an action tool
    for msg in reversed(response['messages'][-3:]):
        if isinstance(msg, ToolMessage):
            if 'SHOW_LEAD_FORM' in msg.content:
                return {'show_form': 'lead', 'response': ''}
            elif 'SHOW_CONSULTATION_FORM' in msg.content:
                return {'show_form': 'consultation', 'response': ''}
    # No form signal: the last message is the agent's final answer
    agent_response = response['messages'][-1].content
    return {'show_form': None, 'response': agent_response}

# ── CRM: Lead saved to PostgreSQL ────────────────────────────────────────
@app.route('/submit_lead_form', methods=['POST'])
def submit_lead_form():
    # Excerpt: name, email, phone, session_id, and form_data are read from
    # the submitted form earlier in this handler (omitted here)
    cursor.execute('''
        INSERT INTO leads (date, name, email, phone, status, session_id)
        VALUES (%s, %s, %s, %s, %s, %s)
    ''', (datetime.now(), name, email, phone, 'New Lead', session_id))
    ConversationTracker.track_form_submit(session_id, 'lead', form_data)
    send_form_notification_async('lead', form_data)  # Email alert
Code: Knowledge tool using PromptTemplate + LLM chain; action tool returning a signal string; Flask signal detection and PostgreSQL CRM write
3. Designing the Agentic AI Chatbot Memory System
3.1 What Makes This a Real AI Agent?
The difference between a simple chatbot and a true AI agent often comes down to one capability: memory. Without it, every message is a brand-new, isolated interaction. The agent cannot remember what was said two messages ago, cannot build on previous context, and cannot maintain the natural flow of a real conversation.
| Without memory, ask the chatbot ‘Tell me about your web development services’ and then follow up with ‘What was the pricing for that?’ — and it has no idea what ‘that’ refers to. This breaks the user experience entirely and makes the chatbot feel robotic and unhelpful. |
For the Genetech agentic AI chatbot to behave like a real intelligent assistant — understanding follow-ups, routing based on conversation context, and tracking leads across a session — memory is non-negotiable.
3.2 Short-Term Memory vs Long-Term Memory
Short-Term Memory (STM)
Stores the active conversation history within a single user session. Enables the agent to understand follow-up questions, pronoun references, and context switches. In LangGraph, this is implemented as an in-memory checkpointer keyed by a unique thread_id per session.
Long-Term Memory (LTM)
Persists data across sessions and stores structured business data permanently. Does not expire when the session ends. In the Genetech chatbot, LTM is implemented as a PostgreSQL database that stores leads, consultations, issues, sales contacts, and complete conversation histories for analytics and follow-up.
3.3 Our Memory Implementation
The Genetech chatbot implements both STM and LTM in a complementary architecture:
- STM via LangGraph MemorySaver: Each user gets a unique thread_id. MemorySaver persists all messages in-memory for that thread. When the agent is invoked with the same thread_id, it automatically receives the full conversation history — enabling natural follow-ups and context-aware routing.
- LTM via PostgreSQL CRM: Every form submission is saved permanently to PostgreSQL with the session_id as the linking key. The ConversationTracker writes every user message, bot response, button click, and form submission to the conversation_history table.
| The combination of STM + LTM means: the chatbot remembers you within this conversation (STM), and Genetech’s sales team can see your full interaction history later (LTM). This is what transforms a chatbot into a lead generation engine. |
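The thread isolation behind STM can be illustrated without LangGraph. The class below is a toy stand-in for a checkpointer, not the MemorySaver API: it simply keeps one message list per thread_id, and all names in it are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


class ToyThreadMemory:
    """Toy stand-in for a checkpointer: one message history per thread_id."""

    def __init__(self) -> None:
        self._threads: Dict[str, List[str]] = defaultdict(list)

    def append(self, thread_id: str, message: str) -> None:
        self._threads[thread_id].append(message)

    def history(self, thread_id: str) -> List[str]:
        # Return a copy so callers cannot mutate stored state by accident.
        return list(self._threads[thread_id])


memory = ToyThreadMemory()
memory.append("visitor-a", "Tell me about your web development services")
memory.append("visitor-a", "What was the pricing for that?")
memory.append("visitor-b", "Where are your offices?")
```

Like any in-process store, this history disappears when the process restarts, which is why anything that must outlive the session belongs in the database layer.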
3.4 Memory System Diagram

Diagram: Short-term memory for live context. Long-term memory for everything that outlasts the session.
3.5 Code Snippet: STM + LTM Integration
# ── SHORT-TERM MEMORY (STM): LangGraph MemorySaver ──────────────────────
from datetime import datetime
from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()  # In-memory state store (cleared on process restart)
agent_executor = create_react_agent(
    model=llm,
    tools=tools,
    checkpointer=checkpointer,  # STM enabled
    state_modifier=system_prompt
)

# Each user session gets an isolated thread with unique ID
def get_or_create_thread_id(client_session_id):
    return session_manager.get_or_create_session(client_session_id)

# STM is automatically loaded when agent is invoked with same thread_id
config = {'configurable': {'thread_id': thread_id}}
response = agent_executor.invoke(
    {'messages': [HumanMessage(content=user_input)]},
    config=config  # LangGraph restores full message history from MemorySaver
)

# ── LONG-TERM MEMORY (LTM): PostgreSQL CRM ───────────────────────────────
cursor.execute('''
    CREATE TABLE IF NOT EXISTS conversation_history (
        id SERIAL PRIMARY KEY,
        session_id VARCHAR(255) NOT NULL,
        timestamp TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
        message_type VARCHAR(50) NOT NULL, -- user_message | bot_response | button_click | form_submit
        user_message TEXT,
        bot_response TEXT,
        button_clicked VARCHAR(255),
        form_type VARCHAR(50),
        menu_action VARCHAR(255)
    )
''')

class ConversationTracker:
    @staticmethod
    def track_user_message(session_id, user_message):
        cursor.execute('''
            INSERT INTO conversation_history
            (session_id, timestamp, message_type, user_message)
            VALUES (%s, %s, %s, %s)
        ''', (session_id, datetime.now(), 'user_message', user_message))

    @staticmethod
    def track_form_submit(session_id, form_type, form_data):
        # The form payload is stored as text in the user_message column
        cursor.execute('''
            INSERT INTO conversation_history
            (session_id, timestamp, message_type, form_type, user_message)
            VALUES (%s, %s, %s, %s, %s)
        ''', (session_id, datetime.now(), 'form_submit', form_type, str(form_data)))
Code: MemorySaver for per-session STM; PostgreSQL schema and ConversationTracker for persistent LTM
4. Building Guardrails
4.1 What Are Guardrails?
A guardrail is a safety boundary that defines the scope of your chatbot’s behavior. After you build powerful tools covering all business domains, you still face a fundamental challenge: users will ask things completely outside your chatbot’s purpose — the weather, how to fix their phone, cryptocurrency prices, general machine learning concepts. None of which the Genetech chatbot is designed to answer.
Without guardrails, the LLM might hallucinate an answer, share incorrect information, or respond in ways that are off-brand and confusing. With guardrails, the chatbot politely redirects these queries while keeping the conversation on topic.
| Guardrails define the boundary between what the chatbot CAN do and what it SHOULD NOT do — and ensure both cases are handled gracefully and professionally. |
4.2 Guardrails as a Tool in the Genetech Agentic AI Chatbot
In the Genetech chatbot, guardrails are implemented as a specialized tool called handle_irrelevant_queries. This design means the guardrail participates in the same ReAct loop as all other tools — the LLM itself decides when a query is out of scope and routes it to the guardrail tool.
The system prompt explicitly instructs the agent to classify every query before routing, and mandates use of the guardrail tool for anything not related to Genetech Solutions business:
- General knowledge questions — history, science, cooking
- Technical tutorials unrelated to our services — how to code, how ML works
- Hardware issues — phone repair, laptop problems
- Physical construction or non-software services
- Personal advice, health, or entertainment queries
The guardrail tool generates a polite, professional response that acknowledges the query, explains the chatbot’s scope, and gently redirects toward Genetech’s actual services. For example: ‘I’m sorry, but I can only help with Genetech Solutions topics. Would you like to know about our AI solutions or web development services?’
4.3 Guardrails Flow Diagram

Diagram: Out-of-scope queries are intercepted within the ReAct loop — the LLM routes to the guardrail tool before generating any response
4.4 Code Snippet: Guardrail Tool and System Prompt Enforcement
# ── Guardrail Tool: handle_irrelevant_queries ────────────────────────────
@tool
def handle_irrelevant_queries(user_message: str = '') -> str:
    '''
    Guardrail: Handles ANY query NOT related to Genetech Solutions business.
    Triggered by LLM when it determines the query is out of scope.
    Returns polite, professional redirect response.
    Examples of triggers:
    - General knowledge: 'What is the capital of France?'
    - Hardware: 'How do I fix my phone?'
    - Non-tech: 'Can you build me a smart watch?'
    - Personal advice: 'What are good meditation techniques?'
    '''
    irrelevant_prompt = PromptTemplate(
        template='''You are Genetech Solutions AI assistant.
        USER MESSAGE: {user_message}
        RESPONSE PATTERNS:
        1. Hardware issues → 'I'm sorry, I can only help with Genetech Solutions
           topics. For device issues, please consult the manufacturer.'
        2. Non-software build requests → 'Sorry, I only help with software and
           AI solutions. Want to build a website or mobile app?'
        3. General queries → Acknowledge, redirect to Genetech services.
        Keep response to 1-2 sentences. End with service invitation.
        ''',
        input_variables=['user_message']
    )
    chain = irrelevant_prompt | llm
    return chain.invoke({'user_message': user_message}).content

# ── System Prompt Guardrail Enforcement ──────────────────────────────────
system_prompt = '''
CRITICAL QUERY VALIDATION — APPLY BEFORE EVERY RESPONSE:
RELEVANT queries (use appropriate tools):
- Services, portfolio, pricing, team, awards, contact info
- Hiring, consultation, sales connection, issue reporting
- Software/web/mobile/AI/cloud development inquiries
IRRELEVANT queries (MUST use handle_irrelevant_queries tool):
- General knowledge, tutorials, personal advice
- Hardware issues, physical construction, non-tech services
- ANY topic not involving Genetech Solutions business
RULE: If query is about ANYTHING other than Genetech business,
call handle_irrelevant_queries tool IMMEDIATELY
'''
Code: Guardrail tool using PromptTemplate + LLM for professional redirects; system prompt enforcement ensuring consistent classification
5. Testing, Monitoring, and Observability
5.1 Testing the Chatbot
Testing an AI agent is fundamentally different from testing traditional software. You cannot rely on unit tests that assert exact response text, because LLM responses are probabilistic and context-dependent; only deterministic pieces, such as the signal string an action tool returns, can be checked exactly. Instead, the recommended approach is to build a custom evaluation dataset and benchmark the agent against it.
Building a Custom Evaluation Dataset
For the Genetech agentic AI chatbot, testing involves creating a dataset of representative question-answer pairs across all tool domains:
- Knowledge queries — ‘What services does Genetech offer?’ → Expected: mentions web, mobile, AI, DevOps, cybersecurity
- Action triggers — ‘I want to build a website’ → Expected: SHOW_LEAD_FORM signal
- Guardrail cases — ‘What is the capital of France?’ → Expected: polite redirect, no Genetech answer
- Context follow-ups — ‘Tell me about your web services’ → ‘What is the pricing?’ → Expected: web-relevant pricing
- Industry routing — ‘I am in healthcare, how can you help?’ → Expected: healthcare-specific service recommendations
A well-tested agent should achieve >90% correct tool routing on the evaluation dataset before going live. Re-running the evaluation after adding new tools or modifying prompts catches regressions before they reach production.
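The categories above can be turned into a small routing benchmark. The sketch below is illustrative only: `EvalCase`, `routing_accuracy`, and `stub_route` are hypothetical names, the `industry_services` tool name is an assumption, and the keyword-based stub stands in for the real agent so the example runs without an LLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    message: str        # what the visitor types
    expected_tool: str  # tool the agent should route to


# Sample cases mirroring the categories above.
# 'industry_services' is a hypothetical tool name used for illustration.
CASES: List[EvalCase] = [
    EvalCase("What services does Genetech offer?", "services_info"),
    EvalCase("What is the capital of France?", "handle_irrelevant_queries"),
    EvalCase("I am in healthcare, how can you help?", "industry_services"),
]


def routing_accuracy(cases: List[EvalCase], route: Callable[[str], str]) -> float:
    """Fraction of cases where route(message) returns the expected tool name."""
    if not cases:
        raise ValueError("evaluation set is empty")
    hits = sum(1 for case in cases if route(case.message) == case.expected_tool)
    return hits / len(cases)


def stub_route(message: str) -> str:
    """Stand-in for the real agent: naive keyword routing, for demonstration only."""
    text = message.lower()
    if "genetech" in text or "service" in text:
        return "services_info"
    return "handle_irrelevant_queries"


accuracy = routing_accuracy(CASES, stub_route)
ready_to_ship = accuracy > 0.90  # the go-live threshold described above
print(f"routing accuracy: {accuracy:.0%}, ready to ship: {ready_to_ship}")
```

In practice `stub_route` would be replaced by a function that invokes the agent and reports which tool it called. The stub deliberately misroutes the healthcare question, so the benchmark falls below the threshold and blocks the release.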
5.2 Monitoring and Observability with LangSmith
AI agents have a fundamental challenge: they are black boxes. When a user reports that the chatbot gave a wrong answer, how do you debug it? Was it a bad prompt? Wrong tool selected? Tool returned incorrect data? Without observability, you are guessing.
LangSmith is the observability platform for LangChain and LangGraph applications. It solves the black-box problem by capturing complete execution traces for every agent invocation. A trace captures the user’s input message, the LLM’s reasoning at each ReAct step, every tool call and its output, and the final response, along with token usage, per-step latency, and total cost.
For the Genetech agentic AI chatbot, this means if a user asked about pricing and got the wrong answer, you can open LangSmith, find the trace, see exactly which tool was called, what it returned, and how the LLM synthesized the final response. Debugging becomes deterministic rather than guesswork.
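To make the debugging workflow concrete, here is a simplified model of what a trace contains and how it narrows down a failure. This is not LangSmith's actual schema or API: `Step`, `summarize`, and `diagnose` are hypothetical names, the `pricing_info` tool name is an assumption, and the latency and token figures are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    kind: str        # "llm" or "tool"
    name: str        # step label or tool name
    output: str      # what the step produced
    latency_ms: int
    tokens: int = 0  # tool calls consume no LLM tokens in this model


def summarize(steps: List[Step]) -> Dict[str, object]:
    """Collapse a trace into the figures you would check first."""
    return {
        "tools": [s.name for s in steps if s.kind == "tool"],
        "total_latency_ms": sum(s.latency_ms for s in steps),
        "total_tokens": sum(s.tokens for s in steps),
    }


def diagnose(steps: List[Step], expected_tool: str) -> str:
    """Separate a routing failure from a tool-output or synthesis failure."""
    called = [s.name for s in steps if s.kind == "tool"]
    if expected_tool not in called:
        return f"wrong tool: expected {expected_tool}, got {called}"
    return "routing ok: inspect the tool output and the final synthesis"


# Illustrative trace for a pricing question that was routed to the wrong tool.
trace = [
    Step("llm", "reason", "User asks about pricing; choosing a tool", 420, 310),
    Step("tool", "services_info", "Web, mobile, AI, DevOps, cybersecurity", 35),
    Step("llm", "respond", "We offer web, mobile, and AI services.", 510, 280),
]

summary = summarize(trace)
print(summary)
print(diagnose(trace, "pricing_info"))
```

The same three questions from the section above map onto the trace: the tool list answers "wrong tool selected?", the tool step's output answers "tool returned incorrect data?", and the final LLM step shows how the answer was synthesized.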
5.3 LangSmith Monitoring Diagram

Diagram: LangSmith captures every step of every agent invocation — making debugging deterministic
5.4 Code Snippet: LangSmith Observability Setup
# ── LangSmith Observability Setup (app.py) ───────────────────────────────
import os
from dotenv import load_dotenv
load_dotenv()
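# Enable LangSmith tracing. load_dotenv() has already copied the .env values into
# os.environ; assigning os.getenv(...) back would raise TypeError if a value were missing.
if not os.getenv('LANGCHAIN_API_KEY'):
    raise RuntimeError('LANGCHAIN_API_KEY is not set; add it to .env')
os.environ.setdefault('LANGCHAIN_PROJECT', 'genetech-chatbot')  # example project name
os.environ['LANGCHAIN_TRACING_V2'] = 'true'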
# Once enabled, EVERY agent.invoke() call is automatically traced
# No other code changes needed — LangGraph handles instrumentation
# ── LangSmith Evaluation: Benchmark Against Custom Dataset ───────────────
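from langchain_core.messages import HumanMessage
from langsmith import Client
from langsmith.evaluation import evaluate
client = Client()
dataset = client.create_dataset('genetech-chatbot-evals')
client.create_examples(
    inputs=[
        {'message': 'What web development services do you offer?'},
        {'message': 'I want to build a mobile app'},
        {'message': 'What is the capital of France?'},  # Guardrail test
    ],
    outputs=[
        {'expected_tool': 'services_info'},
        {'expected_signal': 'SHOW_LEAD_FORM'},
        {'expected_tool': 'handle_irrelevant_queries'},
    ],
    dataset_id=dataset.id
)
# Target: run the agent and return plain values the evaluator can inspect
def run_agent(inputs):
    result = agent_executor.invoke({'messages': [HumanMessage(content=inputs['message'])]})
    tools = [tc['name'] for m in result['messages'] for tc in (getattr(m, 'tool_calls', None) or [])]
    return {'response': result['messages'][-1].content, 'tools_called': tools}
# evaluate() takes callables, not string names. Illustrative evaluator; it
# assumes the SHOW_LEAD_FORM signal appears in the final response text.
def tool_selection_accuracy(run, example):
    expected_tool = example.outputs.get('expected_tool')
    if expected_tool:
        ok = expected_tool in run.outputs['tools_called']
    else:
        ok = example.outputs['expected_signal'] in run.outputs['response']
    return {'key': 'tool_selection_accuracy', 'score': int(ok)}
results = evaluate(
    run_agent,
    data=dataset.name,
    evaluators=[tool_selection_accuracy],  # a response_relevance evaluator (e.g. LLM-as-judge) can be added the same way
    experiment_prefix='genetech-v1'
)
Code: Three environment variables enable full LangSmith tracing. The evaluation run benchmarks tool selection accuracy against the custom dataset; a response-relevance evaluator can be added the same way.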
What You End Up With
Work through these five pieces — ReAct architecture, specialized domain tools, dual-layer memory, guardrails, and observability — and you end up with a system that is meaningfully different from a chatbot in the traditional sense. It reasons. It selects tools based on what the user actually said, not what you hardcoded it to do. It remembers the conversation across turns and stores every interaction permanently. It stays on topic when pushed off it. And when something breaks, you can trace exactly what happened and why.
That is the Genetech chatbot. Every code snippet in this post is drawn from the actual implementation. Every architectural decision reflects a real tradeoff we made. The system is live at genetechsolutions.com, and if you want to understand what it feels like from the user side before you commit to building something like it, the fastest way is to open the chat widget on this page and try it with a real question.
What you experience as a user (the way it handles ambiguity, the way it routes between knowledge queries and action triggers, the way a conversation ends in a lead form) is the system this post describes, running in real time.
Build Yours With Us
If you have worked through this post and have a use case forming in your head — a specific workflow, a conversion problem, a customer experience you want to improve — the next step is a conversation, not a commitment. We built this system for ourselves. We know what it takes to build one for a business with real requirements, real constraints, and real users.
Use the chatbot on this page to start that conversation. It will route you to the right person. Or if you would rather go direct:


