[Figure: A simple example of fact-checking and hallucination by NVIDIA's NeMo-Guardrails]

A vast majority of questions require context. For example, if we ask ChatGPT: "What's the best Vietnamese restaurant?", the context needed would be "where", because the best Vietnamese restaurant in Vietnam would be different from the best Vietnamese restaurant in the US.

According to this cool paper SituatedQA (Zhang & Choi, 2021), a significant proportion of information-seeking questions have context-dependent answers: roughly 16.5% of the Natural Questions NQ-Open dataset. Personally, I suspect that this percentage would be even higher for enterprise use cases. For example, say a company builds a chatbot for customer support; for this chatbot to answer any customer question about any product, the context needed might be that customer's history or that product's information.

Because the model "learns" from the context provided to it, this process is also called context learning.

Context length is especially important for RAG (Retrieval Augmented Generation; Lewis et al., 2020), which has emerged as the predominant pattern for LLM industry use cases. For those not yet swept away in the RAG rage, RAG works in two phases:

Phase 1: chunking (also known as indexing)

Gather all the documents you want your LLM to use. Divide these documents into chunks that can be fed into your LLM to generate embeddings, and store these embeddings in a vector database.
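To make Phase 1 concrete, here is a minimal sketch of the gather, chunk, embed, and store pipeline. It assumes the sentence-transformers and faiss libraries; the embedding model name, chunk size, and sample documents are illustrative placeholders, not anything prescribed by the post.

```python
# A minimal sketch of Phase 1 (chunking/indexing), assuming the
# sentence-transformers and faiss libraries. The model name, chunk size,
# and documents below are illustrative placeholders.
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

def chunk(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split one document into fixed-size character chunks with overlap."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# 1. Gather all the documents you want your LLM to use.
documents = ["...full text of document 1...", "...full text of document 2..."]

# 2. Divide the documents into chunks.
chunks = [c for doc in documents for c in chunk(doc)]

# 3. Generate one embedding per chunk.
model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model
embeddings = np.asarray(model.encode(chunks), dtype="float32")

# 4. Store the embeddings in a vector index (an in-memory FAISS index
#    standing in here for a vector database).
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)
```

Fixed-size character chunks are only one of many chunking strategies; splitting by sentence, paragraph, or token count is equally common, and the right granularity depends on the documents and the retrieval quality you need.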