
How does a history-aware retriever work?

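The history-aware retriever discussed in this article is the one returned by the create_history_aware_retriever function from the LangChain package. The function receives the following inputs:

An LLM (a language model that receives a query and returns an answer);
A vector store retriever (a component that receives a query and returns a list of relevant documents);
A prompt with a placeholder for the chat history (a list of message exchanges, typically between a human and an AI), which is supplied when the retriever is called.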

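When called, the history-aware retriever takes a user query as input and outputs a list of relevant documents. The documents are selected based on the query combined with the context provided by the chat history.

Its workflow is summarized at the end of the article.

Setting it up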

from langchain.chains import create_history_aware_retriever
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_chroma import Chroma
from dotenv import load_dotenv
import bs4

load_dotenv() # To get OPENAI_API_KEY


def create_vectorstore_retriever():
    """
    Returns a vector store retriever based on the text of a specific web page.
    """
    URL = r'https://lilianweng.github.io/posts/2023-06-23-agent/'
    # Load only the post title, header, and body from the page.
    loader = WebBaseLoader(
        web_paths=(URL,),
        bs_kwargs=dict(
            parse_only=bs4.SoupStrainer(class_=("post-content", "post-title", "post-header"))
        ))
    docs = loader.load()
    # Split the page into chunks of up to 500 characters with no overlap.
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0, add_start_index=True)
    splits = text_splitter.split_documents(docs)
    # Embed each chunk with OpenAI embeddings and index it in Chroma.
    vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
    return vectorstore.as_retriever()

def create_prompt():
    """
    Returns a prompt instructed to produce a rephrased question based on the user's
    last question, but referencing previous messages (chat history).
    """
    system_instruction = """Given a chat history and the latest user question \
        which might reference context in the chat history, formulate a standalone question \
        which can be understood without the chat history. Do NOT answer the question, \
        just reformulate it if needed and otherwise return it as is."""

    # The chat history is injected between the system instruction and the new question.
    prompt = ChatPromptTemplate.from_messages([
        ("system", system_instruction),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")])
    return prompt

llm = ChatOpenAI(model='gpt-4o-mini')
vectorstore_retriever = create_vectorstore_retriever()
prompt = create_prompt()


history_aware_retriever = create_history_aware_retriever(
    llm,
    vectorstore_retriever,
    prompt
)

Using it

Here the question is asked without any chat history, so the retriever responds only with documents relevant to the latest question.

chat_history = []

docs = history_aware_retriever.invoke({'input': 'what is planning?', 'chat_history': chat_history})
for i, doc in enumerate(docs):
    print(f'Chunk {i+1}:')
    print(doc.page_content)
    print()


Chunk 1:
Planning is essentially in order to optimize believability at the moment vs in time.
Prompt template: {Intro of an agent X}. Here is X's plan today in broad strokes: 1)
Relationships between agents and observations of one agent by another are all taken into consideration for planning and reacting.
Environment information is present in a tree structure.

Chunk 2:
language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.

Chunk 3:
Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural

Chunk 4:
Planning

Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.
Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.


Memory
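
The empty-history case above can be pictured as a simple branch. The sketch below is a plain-Python stand-in, not LangChain's implementation: `history_aware_retrieve`, `fake_rephrase`, and `fake_retrieve` are hypothetical names, with the two stubs standing in for the LLM and the vector store retriever. It assumes that an empty chat history causes the input to go to the retriever unchanged, without an LLM call, which may differ between LangChain versions.

```python
from typing import Callable, List, Tuple

ChatHistory = List[Tuple[str, str]]  # (role, text) pairs, as used in this article


def history_aware_retrieve(
    query: str,
    chat_history: ChatHistory,
    rephrase: Callable[[str, ChatHistory], str],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Rephrase the query only when there is history to draw on, then retrieve."""
    if not chat_history:
        # Assumed behaviour: with no history the query is used as-is.
        return retrieve(query)
    standalone = rephrase(query, chat_history)
    return retrieve(standalone)


# Hypothetical stubs, for illustration only.
def fake_rephrase(query: str, chat_history: ChatHistory) -> str:
    last_text = chat_history[-1][1]
    return f"{query} (context: {last_text})"


def fake_retrieve(query: str) -> List[str]:
    return [f"document matching: {query}"]


no_history = history_aware_retrieve(
    "what is planning?", [], fake_rephrase, fake_retrieve
)
with_history = history_aware_retrieve(
    "what is planning?",
    [("human", "I also care about Task Decomposition.")],
    fake_rephrase,
    fake_retrieve,
)
print(no_history)
print(with_history)
```

In the first call the stub retriever sees the original question; in the second it sees the rephrased one, which is why the retrieved chunks change once a chat history is supplied.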

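Now, given the chat history, the retriever knows that the human wants to learn about task decomposition as well as planning, so it responds with text chunks that reference both topics.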

chat_history = [
    ('human', 'when I ask about planning I want to know about Task Decomposition too.')]

docs = history_aware_retriever.invoke({'input': 'what is planning?', 'chat_history': chat_history})
for i, doc in enumerate(docs):
    print(f'Chunk {i+1}:')
    print(doc.page_content)
    print()


Chunk 1:
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Chunk 2:
Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#

Chunk 3:
Planning

Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.
Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.


Memory

Chunk 4:
Challenges in long-term planning and task decomposition: Planning over a lengthy history and effectively exploring the solution space remain challenging. LLMs struggle to adjust plans when faced with unexpected errors, making them less robust compared to humans who learn from trial and error.

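Now the question depends entirely on the chat history. We can see that the retriever responds with text chunks that reference the correct concept.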

chat_history = [
    ('human', 'What is ReAct?'),
    ('ai', 'ReAct integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space')]

docs = history_aware_retriever.invoke({'input': 'It is a way of doing what?', 'chat_history': chat_history})
for i, doc in enumerate(docs):
    print(f'Chunk {i+1}:')
    print(doc.page_content)
    print()


Chunk 1:
ReAct (Yao et al. 2023) integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space. The former enables LLM to interact with the environment (e.g. use Wikipedia search API), while the latter prompting LLM to generate reasoning traces in natural language.
The ReAct prompt template incorporates explicit steps for LLM to think, roughly formatted as:
Thought: ...
Action: ...
Observation: ...

Chunk 2:
Fig. 2.  Examples of reasoning trajectories for knowledge-intensive tasks (e.g. HotpotQA, FEVER) and decision-making tasks (e.g. AlfWorld Env, WebShop). (Image source: Yao et al. 2023).
In both experiments on knowledge-intensive tasks and decision-making tasks, ReAct works better than the Act-only baseline where Thought: … step is removed.

Chunk 3:
The LLM is provided with a list of tool names, descriptions of their utility, and details about the expected input/output.
It is then instructed to answer a user-given prompt using the tools provided when necessary. The instruction suggests the model to follow the ReAct format - Thought, Action, Action Input, Observation.

Chunk 4:
Case Studies#
Scientific Discovery Agent#
ChemCrow (Bran et al. 2023) is a domain-specific example in which LLM is augmented with 13 expert-designed tools to accomplish tasks across organic synthesis, drug discovery, and materials design. The workflow, implemented in LangChain, reflects what was previously described in the ReAct and MRKLs and combines CoT reasoning with tools relevant to the tasks:


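Conclusion

In summary, when .invoke({'input': '...', 'chat_history': '...'}) is called, the workflow of the history-aware retriever is as follows:

It replaces the input and chat_history placeholders in the prompt with the given values, creating a new ready-to-use prompt that essentially says "take this chat history and this last input, and rephrase the last input so that anyone can understand it without seeing the chat history".
It sends the new prompt to the LLM and receives a rephrased input.
It sends the rephrased input to the vector store retriever and receives a list of documents relevant to this rephrased input.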
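
The first step, filling the prompt, can be sketched in plain Python. This is only an illustration of the message list that results once the placeholders are replaced; `fill_prompt`, `Message`, and the shortened `SYSTEM_INSTRUCTION` are hypothetical names and do not come from LangChain.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, text)

# Shortened, illustrative version of the system instruction.
SYSTEM_INSTRUCTION = (
    "Given a chat history and the latest user question, "
    "formulate a standalone question. Do NOT answer the question."
)


def fill_prompt(chat_history: List[Message], user_input: str) -> List[Message]:
    """Return the system instruction, the prior messages, then the new input."""
    return [("system", SYSTEM_INSTRUCTION), *chat_history, ("human", user_input)]


messages = fill_prompt(
    chat_history=[
        ("human", "What is ReAct?"),
        ("ai", "ReAct integrates reasoning and acting within LLM."),
    ],
    user_input="It is a way of doing what?",
)
for role, text in messages:
    print(f"{role}: {text}")
```

The resulting list is what the LLM receives in the second step: the instruction first, the earlier exchange in order, and the latest question last, so the model has everything it needs to resolve "It" into a standalone question.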

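Note: the embedding used to convert text into vectors is the one passed to Chroma.from_documents. The code in this article passes OpenAIEmbeddings(), so OpenAI embeddings are used; if no embedding is passed, Chroma falls back to its default embedding.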
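
To illustrate why the embedding choice matters, here is a toy sketch of a vector store that keeps the embedding function it was built with and reuses it for every query. Nothing here is Chroma's API: `ToyVectorStore`, `toy_embed`, and the four-word `VOCAB` are hypothetical, and the word-count "embedding" is only a stand-in for a real model.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]
VOCAB = ["planning", "memory", "tool", "react"]  # hypothetical toy vocabulary


def toy_embed(text: str) -> Vector:
    """Count how often each vocabulary word occurs in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Keeps the embedding function it was built with and reuses it for queries."""

    def __init__(self, texts: List[str], embed: Callable[[str], Vector]) -> None:
        self.embed = embed
        self.entries: List[Tuple[str, Vector]] = [(t, embed(t)) for t in texts]

    def search(self, query: str, k: int = 1) -> List[str]:
        query_vector = self.embed(query)  # same function as at indexing time
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vector, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore(
    ["planning breaks a task into steps", "memory stores past observations"],
    toy_embed,
)
print(store.search("how does planning work"))
```

Because documents and queries pass through the same function, their vectors are comparable; embedding the query with a different function than the one used for the documents would make the similarity scores meaningless.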
