3 Common Query Expansion Methods for Improving RAG in Langchain

There are several ways to improve the capability of retrieval-augmented generation (RAG); one of them is called query expansion. This article covers the three query expansion methods most commonly used with Langchain.

Query expansion refines the user's original query to produce a more comprehensive and informative search. The expanded query then retrieves more relevant documents from the vector database.

1. Step Back Prompting

Take A Step Back: Evoking Reasoning Via Abstraction In Large Language Models

https://arxiv.org/pdf/2310.06117.pdf

Developed by Google DeepMind, this method uses an LLM to create an abstraction of the user's query. It takes a step back from the original query to get a better overview of the underlying problem: the LLM generates a more generic question from the user's query.

Here is an example of an original query and its step-back counterpart.

    {
        "Original_Query": "Could the members of The Police perform lawful arrests?",
        "Step_Back_Query": "what can the members of The Police do?",
    },
    {
        "Original_Query": "Jan Sindel’s was born in what country?",
        "Step_Back_Query": "what is Jan Sindel’s personal history?",
    }

The code below demonstrates how to implement Step Back Prompting with Langchain.

    #---------------------Prepare VectorDB-----------------------------------
    # Build a sample vectorDB
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain_community.document_loaders import WebBaseLoader
    from langchain_community.vectorstores import Chroma
    from langchain.embeddings import OpenAIEmbeddings
    import os

    os.environ["OPENAI_API_KEY"] = "Your OpenAI KEY"

    # Load blog post
    loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
    data = loader.load()

    # Split
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=0)
    splits = text_splitter.split_documents(data)

    # VectorDB
    embedding = OpenAIEmbeddings()
    vectordb = Chroma.from_documents(documents=splits, embedding=embedding)

    #-------------------Prepare Step Back Prompt Pipeline------------------------
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
    from langchain_core.runnables import RunnableLambda
    from langchain.chat_models import ChatOpenAI

    retriever = vectordb.as_retriever()
    llm = ChatOpenAI()

    # Few shot examples
    examples = [
        {
            "input": "Could the members of The Police perform lawful arrests?",
            "output": "what can the members of The Police do?",
        },
        {
            "input": "Jan Sindel’s was born in what country?",
            "output": "what is Jan Sindel’s personal history?",
        },
    ]

    # We now transform these to example messages
    example_prompt = ChatPromptTemplate.from_messages(
        [
            ("human", "{input}"),
            ("ai", "{output}"),
        ]
    )
    few_shot_prompt = FewShotChatMessagePromptTemplate(
        example_prompt=example_prompt,
        examples=examples,
    )
    prompt = ChatPromptTemplate.from_messages(
        [
            (
                "system",
                """You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:""",
            ),
            # Few shot examples
            few_shot_prompt,
            # New question
            ("user", "{question}"),
        ]
    )
    question_gen = prompt | llm | StrOutputParser()

    #--------------------------QnA using Step Back Prompt Technique--------------
    from langchain import hub

    def format_docs(docs):
        doc_strings = [doc.page_content for doc in docs]
        return "\n\n".join(doc_strings)

    response_prompt = hub.pull("langchain-ai/stepback-answer")
    chain = (
        {
            # Retrieve context using the normal question
            "normal_context": RunnableLambda(lambda x: x["question"]) | retriever | format_docs,
            # Retrieve context using the step-back question
            "step_back_context": question_gen | retriever | format_docs,
            # Pass on the question
            "question": lambda x: x["question"],
        }
        | response_prompt
        | llm
        | StrOutputParser()
    )
    result = chain.invoke({"question": "What Task Decomposition that work in 2022?"})

In this script, the original question is

    Original Query: What Task Decomposition that work in 2022?

and the generated step-back query is

    Step Back Query: What are some examples of task decomposition in the current year?

Both queries are used to retrieve relevant documents; these documents are combined into a single context that is passed to the LLM to generate the final answer:

    {
        # Retrieve context using the normal question
        "normal_context": RunnableLambda(lambda x: x["question"]) | retriever | format_docs,
        # Retrieve context using the step-back question
        "step_back_context": question_gen | retriever | format_docs,
        # Pass on the question
        "question": lambda x: x["question"],
    }

2. Multi Query

Langchain Multi Query Retriever

https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever

Multi query is a technique that uses an LLM to generate additional queries from the original one. It addresses cases where the user's prompt is not very specific; the generated queries are then used to look up documents in the vector database.

The goal of multi query is to refine the query so that it is more relevant to the topic, thereby retrieving more relevant documents from the database.

Since Langchain's documentation covers this retriever in detail, we will not walk through the full code here; a minimal sketch is shown below.
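
As a rough sketch (reusing the vectordb and llm objects from the first example, so that setup is an assumption), Langchain's built-in MultiQueryRetriever can be used like this:

    from langchain.retrievers.multi_query import MultiQueryRetriever

    # Generate several variants of the question with the LLM, retrieve
    # documents for each variant, and return the unique union of results
    multi_query_retriever = MultiQueryRetriever.from_llm(
        retriever=vectordb.as_retriever(),  # assumes the vectordb built earlier
        llm=llm,
    )
    docs = multi_query_retriever.get_relevant_documents(
        "What Task Decomposition that work in 2022?"
    )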

3. Cross Encoding Re-Ranking

This method combines multi query with cross encoder re-ranking: the LLM generates additional questions from the user's query, and each generated query retrieves a set of documents from the vector database.

The retrieved documents are passed through a cross encoder, which scores each one for similarity against the initial query. The documents are then ranked by score, and the top 5 are selected as context for the LLM.

Why pick only the top 5 documents? Because we want to avoid irrelevant documents that slip in from the vector database. This selection keeps only the documents the cross encoder judges most similar and most meaningful, which leads to a more accurate and concise answer.

    #------------------------Prepare Vector Database--------------------------
    # Build a sample vectorDB
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain_community.document_loaders import WebBaseLoader
    from langchain_community.vectorstores import Chroma
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.chat_models import ChatOpenAI
    import os

    os.environ["OPENAI_API_KEY"] = "Your API KEY"

    # Load blog post
    loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
    data = loader.load()
    llm = ChatOpenAI()

    # Split
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=0)
    splits = text_splitter.split_documents(data)

    # VectorDB
    embedding = OpenAIEmbeddings()
    vectordb = Chroma.from_documents(documents=splits, embedding=embedding)

    #--------------------Generate More Questions---------------------------------
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnableLambda
    from langchain_core.output_parsers import StrOutputParser

    # This function uses the LLM to generate alternative queries
    def create_original_query(original_query):
        query = original_query["question"]
        qa_system_prompt = """
        You are an AI language model assistant. Your task is to generate five
        different versions of the given user question to retrieve relevant documents from a vector
        database. By generating multiple perspectives on the user question, your goal is to help
        the user overcome some of the limitations of the distance-based similarity search.
        Provide these alternative questions separated by newlines."""
        qa_prompt = ChatPromptTemplate.from_messages(
            [
                ("system", qa_system_prompt),
                ("human", "{question}"),
            ]
        )
        rag_chain = (
            qa_prompt
            | llm
            | StrOutputParser()
        )
        question_string = rag_chain.invoke(
            {"question": query}
        )
        lines_list = question_string.splitlines()
        # Return the original query plus the five generated questions
        queries = [query] + lines_list
        return queries

    #-------------------Retrieve Documents and Cross Encoding--------------------
    from sentence_transformers import CrossEncoder

    cross_encoder = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

    # Cross encoding happens in here
    def create_documents(queries):
        retrieved_documents = []
        for i in queries:
            results = vectordb.as_retriever().get_relevant_documents(i)
            # Collect the text of every retrieved document
            retrieved_documents.extend(doc.page_content for doc in results)
        # If the same document was retrieved for several queries, keep it only once
        unique_documents = []
        for item in retrieved_documents:
            if item not in unique_documents:
                unique_documents.append(item)
        # Pair each unique document with the original query (queries[0])
        pairs = []
        for doc in unique_documents:
            pairs.append([queries[0], doc])
        # Cross encoder scoring
        scores = cross_encoder.predict(pairs)
        final_queries = []
        for x in range(len(scores)):
            final_queries.append({"score": scores[x], "document": unique_documents[x]})
        # Rerank the documents and return the top 5
        sorted_list = sorted(final_queries, key=lambda x: x["score"], reverse=True)
        first_five_elements = sorted_list[:5]
        return first_five_elements

    #-----------------QnA Document-----------------------------------------------
    qa_system_prompt = """
    Assistant is a large language model trained by OpenAI. \
    Use the following pieces of retrieved context to answer the question. \
    If you don't know the answer, just say that you don't know. \
    {context}"""
    qa_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", qa_system_prompt),
            ("human", "{question}"),
        ]
    )

    # Join the top-ranked documents into a single context string
    def format_context(docs):
        doc_strings = [doc["document"] for doc in docs]
        return "\n\n".join(doc_strings)

    chain = (
        # Prepare the context using the pipeline below:
        # Generate Queries -> Retrieve -> Cross Encoding -> Rerank -> return context
        {
            "context": RunnableLambda(create_original_query) | RunnableLambda(create_documents) | RunnableLambda(format_context),
            # Pass on the question
            "question": lambda x: x["question"],
        }
        | qa_prompt
        | llm
        | StrOutputParser()
    )
    result = chain.invoke({"question": "What Task Decomposition that work in 2022?"})

The code above mainly defines two custom functions: one for query generation and one for cross encoding.

create_original_query handles query generation; it returns the five generated questions plus the original query.
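
A call might return something like this (the rewritten questions are illustrative; actual LLM output will vary):

    queries = create_original_query({"question": "What Task Decomposition that work in 2022?"})
    # queries[0] is the original question; queries[1:] are the five rewrites, e.g.
    # "Which task decomposition techniques were used in 2022?"
    # "How was task decomposition applied to agents in 2022?"
    # ...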

create_documents then retrieves the relevant documents for all 6 questions (the 5 generated ones plus the original query); with the retriever's default of 4 documents per query this yields 24 documents. These 24 documents may overlap, so they are deduplicated.

After that, we use

    scores = cross_encoder.predict(pairs)

to score each document against the original query. The documents are then re-ranked by that score, and the top 5 are kept.
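
For intuition, here is a minimal standalone sketch of what this scoring step does; the two document texts are made up for illustration:

    from sentence_transformers import CrossEncoder

    cross_encoder = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')
    pairs = [
        ["What is task decomposition?",
         "Task decomposition breaks a complicated task into smaller, manageable steps."],
        ["What is task decomposition?",
         "The author also lists a few favorite travel destinations."],
    ]
    # predict returns one raw relevance score per (query, document) pair;
    # the on-topic pair should score noticeably higher than the off-topic one
    scores = cross_encoder.predict(pairs)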

Summary

These are the three most commonly used query expansion methods for improving RAG. When your RAG system does not return correct or sufficiently detailed answers, the query expansion methods above can help resolve the problem. Hopefully these techniques will be useful in your next project.

Author: Wayan Wardana
