How does a history-aware retriever work?

PHPz
Published: 2024-09-03 15:33:34

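The history-aware retriever discussed in this post is the one returned by the create_history_aware_retriever function from the LangChain package. This function takes the following inputs:

  • An LLM (a language model that receives a query and returns an answer);
  • A vector store retriever (a model that receives a query and returns a list of relevant documents);
  • A prompt with a placeholder for the chat history (a list of message interactions, typically between a human and an AI). The chat history itself is supplied later, when the retriever is invoked.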

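When invoked, the history-aware retriever takes the user query and the chat history as input and outputs a list of relevant documents. The documents are selected based on the query combined with the context provided by the chat history.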

Its workflow is summarized at the end.

Setting it up

from langchain.chains import create_history_aware_retriever
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_chroma import Chroma
from dotenv import load_dotenv
import bs4

load_dotenv() # To get OPENAI_API_KEY
def create_vectorsore_retriever():
    """
    Returns a vector store retriever based on the text of a specific web page.
    """
    URL = r'https://lilianweng.github.io/posts/2023-06-23-agent/'
    loader = WebBaseLoader(
        web_paths=(URL,),
        bs_kwargs=dict(
            parse_only=bs4.SoupStrainer(class_=("post-content", "post-title", "post-header"))
        ))
    docs = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0, add_start_index=True)
    splits = text_splitter.split_documents(docs)
    vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
    return vectorstore.as_retriever()
def create_prompt():
    """
    Returns a prompt instructed to produce a rephrased question based on the user's
    last question, but referencing previous messages (chat history).
    """
    system_instruction = """Given a chat history and the latest user question \
        which might reference context in the chat history, formulate a standalone question \
        which can be understood without the chat history. Do NOT answer the question, \
        just reformulate it if needed and otherwise return it as is."""

    prompt = ChatPromptTemplate.from_messages([
        ("system", system_instruction),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")])
    return prompt
llm = ChatOpenAI(model='gpt-4o-mini')
vectorstore_retriever = create_vectorsore_retriever()
prompt = create_prompt()
history_aware_retriever = create_history_aware_retriever(
    llm,
    vectorstore_retriever,
    prompt
)
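
To make the role of MessagesPlaceholder in create_prompt more concrete, the plain-Python sketch below mimics how such a template could expand into a message list at call time. It deliberately does not use LangChain: build_messages and the shortened SYSTEM_INSTRUCTION are hypothetical stand-ins written only for illustration, and the (role, content) tuples follow the format used for chat_history in this article.

```python
# Plain-Python illustration of prompt assembly; this is not LangChain's implementation.
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content), the tuple format used in this article

# Shortened stand-in for the system instruction defined in create_prompt().
SYSTEM_INSTRUCTION = "Reformulate the latest user question as a standalone question."


def build_messages(user_input: str, chat_history: List[Message]) -> List[Message]:
    """Hypothetical helper: system message first, then the history, then the new question."""
    return [
        ("system", SYSTEM_INSTRUCTION),
        *chat_history,          # plays the role of MessagesPlaceholder("chat_history")
        ("human", user_input),  # plays the role of ("human", "{input}")
    ]


history = [
    ("human", "What is ReAct?"),
    ("ai", "ReAct integrates reasoning and acting within LLM."),
]
messages = build_messages("It is a way of doing what?", history)
for role, content in messages:
    print(f"{role}: {content}")
```

The point of the sketch is only the ordering: the instruction comes first, the whole history is spliced in the middle, and the latest question comes last, so the LLM sees everything it needs to rewrite that question.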

Using it

Here the question is asked without any chat history, so the retriever responds only with the documents relevant to the latest question.

chat_history = []

docs = history_aware_retriever.invoke({'input': 'what is planning?', 'chat_history': chat_history})
for i, doc in enumerate(docs):
    print(f'Chunk {i+1}:')
    print(doc.page_content)
    print()
Chunk 1:
Planning is essentially in order to optimize believability at the moment vs in time.
Prompt template: {Intro of an agent X}. Here is X's plan today in broad strokes: 1)
Relationships between agents and observations of one agent by another are all taken into consideration for planning and reacting.
Environment information is present in a tree structure.

Chunk 2:
language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.

Chunk 3:
Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural

Chunk 4:
Planning

Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.
Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.


Memory

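Now, based on the chat history, the retriever knows that the human wants to learn about task decomposition as well as planning, so it responds with text chunks that reference both topics.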

chat_history = [
    ('human', 'when I ask about planning I want to know about Task Decomposition too.')]

docs = history_aware_retriever.invoke({'input': 'what is planning?', 'chat_history': chat_history})
for i, doc in enumerate(docs):
    print(f'Chunk {i+1}:')
    print(doc.page_content)
    print()
Chunk 1:
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Chunk 2:
Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#

Chunk 3:
Planning

Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.
Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.


Memory

Chunk 4:
Challenges in long-term planning and task decomposition: Planning over a lengthy history and effectively exploring the solution space remain challenging. LLMs struggle to adjust plans when faced with unexpected errors, making them less robust compared to humans who learn from trial and error.

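Now the question depends entirely on the chat history. We can see that the retriever responds with text chunks that reference the correct concept.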

chat_history = [
    ('human', 'What is ReAct?'),
    ('ai', 'ReAct integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space')]

docs = history_aware_retriever.invoke({'input': 'It is a way of doing what?', 'chat_history': chat_history})
for i, doc in enumerate(docs):
    print(f'Chunk {i+1}:')
    print(doc.page_content)
    print()
Chunk 1:
ReAct (Yao et al. 2023) integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space. The former enables LLM to interact with the environment (e.g. use Wikipedia search API), while the latter prompting LLM to generate reasoning traces in natural language.
The ReAct prompt template incorporates explicit steps for LLM to think, roughly formatted as:
Thought: ...
Action: ...
Observation: ...

Chunk 2:
Fig. 2. Examples of reasoning trajectories for knowledge-intensive tasks (e.g. HotpotQA, FEVER) and decision-making tasks (e.g. AlfWorld Env, WebShop). (Image source: Yao et al. 2023).
In both experiments on knowledge-intensive tasks and decision-making tasks, ReAct works better than the Act-only baseline where Thought: … step is removed.

Chunk 3:
The LLM is provided with a list of tool names, descriptions of their utility, and details about the expected input/output.
It is then instructed to answer a user-given prompt using the tools provided when necessary. The instruction suggests the model to follow the ReAct format - Thought, Action, Action Input, Observation.

Chunk 4:
Case Studies#
Scientific Discovery Agent#
ChemCrow (Bran et al. 2023) is a domain-specific example in which LLM is augmented with 13 expert-designed tools to accomplish tasks across organic synthesis, drug discovery, and materials design. The workflow, implemented in LangChain, reflects what was previously described in the ReAct and MRKLs and combines CoT reasoning with tools relevant to the tasks:

Conclusion

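In summary, when .invoke({'input': '...', 'chat_history': '...'}) is called, the workflow of the history-aware retriever is as follows:

  • It replaces the input and chat_history placeholders in the prompt with the specified values, creating a new ready-to-use prompt that essentially says: "Take this chat history and this last input, and rephrase the last input so that anyone can understand it without seeing the chat history".
  • It sends the new prompt to the LLM and receives a rephrased input.
  • It then sends the rephrased input to the vector store retriever and receives a list of documents relevant to this rephrased input.
  • Finally, it returns this list of relevant documents.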
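
The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not LangChain's source: history_aware_retrieve, fake_rephrase, fake_retrieve and CORPUS are hypothetical names, the LLM is replaced by a stub that hard-codes one rewrite, and the retriever uses keyword overlap instead of vector similarity so that the example runs without an API key. The early return for an empty history mirrors what LangChain's documentation describes for this function: with no chat history, the input is passed directly to the retriever.

```python
# Plain-Python sketch of the workflow; the stubs stand in for the LLM and the retriever.
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


def history_aware_retrieve(
    user_input: str,
    chat_history: List[Message],
    rephrase: Callable[[List[Message], str], str],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Rephrase the input using the history, then retrieve with the rephrased query."""
    if not chat_history:
        # Nothing to resolve against, so the raw input goes straight to the retriever.
        return retrieve(user_input)
    standalone_question = rephrase(chat_history, user_input)  # fill the prompt, call the LLM
    return retrieve(standalone_question)                      # query the retriever, return docs


def fake_rephrase(chat_history: List[Message], user_input: str) -> str:
    # Hypothetical stub: a real LLM would rewrite the question from the history.
    return "What is ReAct a way of doing?"


CORPUS = [
    "ReAct integrates reasoning and acting within LLM.",
    "Task decomposition breaks large tasks into subgoals.",
]


def fake_retrieve(query: str) -> List[str]:
    # Hypothetical stub: keyword overlap instead of embedding similarity.
    words = set(query.lower().replace("?", "").split())
    return [doc for doc in CORPUS if words & set(doc.lower().replace(".", "").split())]


history = [
    ("human", "What is ReAct?"),
    ("ai", "ReAct integrates reasoning and acting within LLM."),
]
with_history = history_aware_retrieve("It is a way of doing what?", history, fake_rephrase, fake_retrieve)
without_history = history_aware_retrieve("It is a way of doing what?", [], fake_rephrase, fake_retrieve)
print(with_history)
print(without_history)
```

In this toy setup the raw question shares no keyword with the corpus, so retrieval only succeeds once the question has been rewritten to name ReAct explicitly, which is the reason for the rephrasing step.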

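Note: the embedding used to transform text into vectors is the one specified when Chroma.from_documents is called. In the code above that is OpenAIEmbeddings(); if no embedding were specified, the default Chroma embedding would be used.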
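
The note about embeddings can be illustrated with a toy vector store. Everything in this sketch is hypothetical and unrelated to Chroma's real API: ToyVectorStore, bag_of_letters and cosine exist only to show that the embedding chosen when the store is built is the same one applied to every later query, which is why documents and queries end up in the same vector space.

```python
# Toy vector store; ToyVectorStore and bag_of_letters are hypothetical, not Chroma APIs.
import math
from typing import Callable, List

Vector = List[float]


def bag_of_letters(text: str) -> Vector:
    """Hypothetical embedding: how often each letter a-z occurs in the text."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    def __init__(self, documents: List[str], embedding: Callable[[str], Vector]):
        self.embedding = embedding                        # fixed when the store is built
        self.documents = documents
        self.vectors = [embedding(d) for d in documents]  # documents are embedded once

    def search(self, query: str, k: int = 1) -> List[str]:
        query_vector = self.embedding(query)              # the same embedding is reused here
        ranked = sorted(
            zip(self.documents, self.vectors),
            key=lambda pair: cosine(query_vector, pair[1]),
            reverse=True,
        )
        return [doc for doc, _ in ranked[:k]]


store = ToyVectorStore(["planning and subgoals", "memory and recall"], bag_of_letters)
print(store.search("planning"))
```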

Source: dev.to