Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is an AI architecture that improves large language model (LLM) responses by first retrieving relevant documents from a knowledge base, then supplying those documents as context when the model generates an answer. Instead of relying solely on the model's training data, RAG grounds responses in your actual documentation, which reduces hallucinations and improves accuracy.
How RAG works in customer support
When a customer asks a question, the RAG pipeline converts the query into a vector embedding, searches your indexed documentation for semantically similar passages, and feeds the most relevant passages to the LLM along with the original question. The model then generates a response that directly references your docs, providing accurate, sourced answers rather than generic responses.
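The retrieval steps described here can be sketched in a few lines of Python. This is a minimal illustration, not EchoSDK's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, and `retrieve`, `build_prompt`, and the sample `DOCS` are hypothetical names and data chosen for the example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine retrieved passages and the original question for the LLM."""
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(context, start=1))
    return (
        "Answer using only the sources below.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


# Hypothetical indexed documentation.
DOCS = [
    "To reset your password, open Settings and choose Reset password.",
    "Invoices are emailed on the first day of each month.",
    "API keys can be rotated from the Developer dashboard.",
]

question = "How do I reset my password?"
top = retrieve(question, DOCS, k=1)
prompt = build_prompt(question, top)  # this string would be sent to the LLM
```

A production pipeline would replace `embed` with a learned embedding model and the linear scan in `retrieve` with a vector index, but the flow is the same: embed the query, rank passages by similarity, and pass the best ones to the model with the question.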
RAG vs fine-tuning
Fine-tuning modifies a model's weights using your data, so the model must be retrained whenever content changes. RAG leaves the model unchanged and retrieves current content at query time, so your support answers reflect documentation updates as soon as the changes are indexed, with no retraining required.
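To make the contrast concrete, the sketch below shows why a documentation change takes effect without retraining: only the index entry is replaced, and the next query retrieves the new text. `DocIndex` is a hypothetical in-memory index that scores by simple word overlap, used here only for illustration.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word set used for a simple overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class DocIndex:
    """Hypothetical in-memory index keyed by document id."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # Replacing an entry is all that is needed to "update" the system.
        self._docs[doc_id] = text

    def search(self, query: str) -> str:
        # Return the document sharing the most words with the query.
        q = tokens(query)
        return max(self._docs.values(), key=lambda text: len(q & tokens(text)))


index = DocIndex()
index.upsert("refunds", "Refunds are available within 14 days of purchase.")
index.upsert("shipping", "Orders ship within 2 business days.")

query = "How long after purchase are refunds available?"
before = index.search(query)

# The policy changes: only the index entry is replaced, the model is untouched.
index.upsert("refunds", "Refunds are available within 30 days of purchase.")
after = index.search(query)
```

The same query returns the 14-day text before the update and the 30-day text after it, with no change to the model itself.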
How EchoSDK uses RAG
EchoSDK's RAG pipeline automatically indexes your documentation via URL or text input, creates vector embeddings and searches them with Firestore Vector Search, and serves accurate answers in under 2 seconds. When the AI can't find a confident answer, it automatically escalates to a human agent via the ticket system.
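The escalation behaviour can be pictured as a threshold on retrieval confidence. The sketch below is an assumption about how such routing might look, not EchoSDK's actual logic: the `Retrieved` type, the `route` function, the ticket shape, and the `CONFIDENCE_THRESHOLD` value of 0.75 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Retrieved:
    passage: str
    score: float  # similarity in [0, 1]; higher means a closer match


# Assumed cut-off for illustration; a real system would tune this value.
CONFIDENCE_THRESHOLD = 0.75


def route(question: str, hits: list[Retrieved]) -> dict:
    """Answer from the best passage, or escalate when confidence is low."""
    best = max(hits, key=lambda h: h.score, default=None)
    if best is None or best.score < CONFIDENCE_THRESHOLD:
        # Nothing relevant enough was retrieved: hand off to a human agent.
        return {"action": "escalate", "ticket": {"question": question}}
    return {"action": "answer", "context": best.passage}


confident = route("How do I rotate an API key?",
                  [Retrieved("Rotate keys from the Developer dashboard.", 0.91)])
uncertain = route("Do you support SSO with Okta?",
                  [Retrieved("Invoices are emailed monthly.", 0.32)])
```

Here `confident` carries the passage forward to answer generation, while `uncertain` produces a ticket payload for a human agent.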
Related terms
Vector Embeddings
Numerical representations of text that capture semantic meaning, enabling AI systems to find relevant content through similarity search.
Large Language Model (LLM)
An AI model trained on vast amounts of text data that can understand and generate human language, powering chatbots, summarization, and question-answering systems.
AI Hallucination
When an AI model generates a response that sounds plausible but is factually incorrect or fabricated, not grounded in actual data.
Knowledge Base
A structured collection of documentation, FAQs, and guides that serves as the source of truth for customer support, used by both human agents and AI systems.