How does a history-aware retriever work?

The history-aware retriever discussed in this article is the one returned by the create_history_aware_retriever function from the LangChain package. This function is designed to receive the following inputs:

An LLM (a language model that receives a query and returns an answer);
A vector store retriever (a component that receives a query and returns a list of relevant documents);
A prompt with a placeholder for the chat history (a list of messages, usually exchanged between a human and an AI). The chat history itself is supplied when the retriever is invoked.

When invoked, the history-aware retriever takes a user query as input and returns a list of relevant documents. The relevant documents are based on the query combined with the context provided by the chat history.

At the end, I summarize its workflow.

Setting it up

from langchain.chains import create_history_aware_retriever
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_chroma import Chroma
from dotenv import load_dotenv
import bs4

load_dotenv() # To get OPENAI_API_KEY


def create_vectorsore_retriever():
    """
    Returns a vector store retriever based on the text of a specific web page.
    """
    URL = r'https://lilianweng.github.io/posts/2023-06-23-agent/'
    loader = WebBaseLoader(
        web_paths=(URL,),
        bs_kwargs=dict(
            parse_only=bs4.SoupStrainer(class_=("post-content", "post-title", "post-header"))
        ))
    docs = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0, add_start_index=True)
    splits = text_splitter.split_documents(docs)
    vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
    return vectorstore.as_retriever()


def create_prompt():
    """
    Returns a prompt instructed to produce a rephrased question based on the user's
    last question, but referencing previous messages (chat history).
    """
    system_instruction = """Given a chat history and the latest user question \
        which might reference context in the chat history, formulate a standalone question \
        which can be understood without the chat history. Do NOT answer the question, \
        just reformulate it if needed and otherwise return it as is."""

    prompt = ChatPromptTemplate.from_messages([
        ("system", system_instruction),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")])
    return prompt

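To make the role of MessagesPlaceholder("chat_history") more concrete, here is a minimal plain-Python sketch of how a template like the one above could expand into the list of messages sent to the LLM. It does not use LangChain: expand_prompt is a hypothetical helper written for illustration, the system instruction is shortened, and the real ChatPromptTemplate produces message objects rather than tuples.

```python
# Plain-Python illustration only; expand_prompt is a hypothetical helper,
# not part of LangChain.

def expand_prompt(system_instruction, chat_history, user_input):
    """Return the (role, content) messages an LLM would receive."""
    messages = [("system", system_instruction)]
    # The placeholder is replaced by the whole chat history, in order.
    messages.extend(chat_history)
    # The latest user question always comes last.
    messages.append(("human", user_input))
    return messages


history = [
    ("human", "What is ReAct?"),
    ("ai", "ReAct integrates reasoning and acting within LLM."),
]
messages = expand_prompt(
    "Rephrase the latest question as a standalone question.",
    history,
    "It is a way of doing what?",
)
for role, content in messages:
    print(f"{role}: {content}")
```

With an empty chat history, the expanded prompt would contain only the system instruction and the latest question.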

llm = ChatOpenAI(model='gpt-4o-mini')
vectorstore_retriever = create_vectorsore_retriever()
prompt = create_prompt()


history_aware_retriever = create_history_aware_retriever(
    llm,
    vectorstore_retriever,
    prompt
)


Using it

Here, a question is asked with no chat history, so the retriever responds only with the documents relevant to the latest question.

chat_history = []

docs = history_aware_retriever.invoke({'input': 'what is planning?', 'chat_history': chat_history})
for i, doc in enumerate(docs):
    print(f'Chunk {i+1}:')
    print(doc.page_content)
    print()


Chunk 1:
Planning is essentially in order to optimize believability at the moment vs in time.
Prompt template: {Intro of an agent X}. Here is X's plan today in broad strokes: 1)
Relationships between agents and observations of one agent by another are all taken into consideration for planning and reacting.
Environment information is present in a tree structure.

Chunk 2:
language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.

Chunk 3:
Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural

Chunk 4:
Planning

Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.
Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.


Memory


Now, based on the chat history, the retriever knows that the human wants to learn about task decomposition as well as planning. It therefore responds with text chunks that refer to both topics.

chat_history = [
    ('human', 'when I ask about planning I want to know about Task Decomposition too.')]

docs = history_aware_retriever.invoke({'input': 'what is planning?', 'chat_history': chat_history})
for i, doc in enumerate(docs):
    print(f'Chunk {i+1}:')
    print(doc.page_content)
    print()


Chunk 1:
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Chunk 2:
Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#

Chunk 3:
Planning

Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.
Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.


Memory

Chunk 4:
Challenges in long-term planning and task decomposition: Planning over a lengthy history and effectively exploring the solution space remain challenging. LLMs struggle to adjust plans when faced with unexpected errors, making them less robust compared to humans who learn from trial and error.


Now the question depends entirely on the chat history, and we can see that the retriever responds with text chunks that refer to the correct concept.

chat_history = [
    ('human', 'What is ReAct?'),
    ('ai', 'ReAct integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space')]

docs = history_aware_retriever.invoke({'input': 'It is a way of doing what?', 'chat_history': chat_history})
for i, doc in enumerate(docs):
    print(f'Chunk {i+1}:')
    print(doc.page_content)
    print()


Chunk 1:
ReAct (Yao et al. 2023) integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space. The former enables LLM to interact with the environment (e.g. use Wikipedia search API), while the latter prompting LLM to generate reasoning traces in natural language.
The ReAct prompt template incorporates explicit steps for LLM to think, roughly formatted as:
Thought: ...
Action: ...
Observation: ...

Chunk 2:
Fig. 2. Examples of reasoning trajectories for knowledge-intensive tasks (e.g. HotpotQA, FEVER) and decision-making tasks (e.g. AlfWorld Env, WebShop). (Image source: Yao et al. 2023).
In both experiments on knowledge-intensive tasks and decision-making tasks, ReAct works better than the Act-only baseline where Thought: … step is removed.

Chunk 3:
The LLM is provided with a list of tool names, descriptions of their utility, and details about the expected input/output.
It is then instructed to answer a user-given prompt using the tools provided when necessary. The instruction suggests the model to follow the ReAct format - Thought, Action, Action Input, Observation.

Chunk 4:
Case Studies#
Scientific Discovery Agent#
ChemCrow (Bran et al. 2023) is a domain-specific example in which LLM is augmented with 13 expert-designed tools to accomplish tasks across organic synthesis, drug discovery, and materials design. The workflow, implemented in LangChain, reflects what was previously described in the ReAct and MRKLs and combines CoT reasoning with tools relevant to the tasks:


Conclusion

In conclusion, the workflow of a history-aware retriever is as follows when .invoke({'input': '...', 'chat_history': [...]}) is called:

It replaces the input and chat_history placeholders in the prompt with the given values, creating a new ready-to-use prompt that essentially says "Take this chat history and this latest input, and rephrase the latest input so that anyone can understand it without seeing the chat history".
It sends the new prompt to the LLM and receives a rephrased input.
It then sends the rephrased input to the vector store retriever and receives a list of documents relevant to that rephrased input.
Finally, it returns this list of relevant documents.
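The steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not LangChain's implementation: fake_llm, fake_retriever, history_aware_retrieve, CORPUS and SYSTEM_INSTRUCTION are all hypothetical names, the "LLM" simply folds the history text into the question, and the "retriever" ranks documents by shared words instead of vector similarity. The sketch also includes a shortcut that, as I understand LangChain's documentation for create_history_aware_retriever, applies when there is no chat history: the input goes straight to the retriever without being rephrased.

```python
import re

# Hypothetical stand-ins: none of these names come from LangChain.
CORPUS = [
    "ReAct integrates reasoning and acting within LLM.",
    "Task decomposition breaks a large task into subgoals.",
    "Memory lets an agent retain information over time.",
]
SYSTEM_INSTRUCTION = "Rephrase the latest question as a standalone question."


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


def fake_llm(messages):
    """Stand-in for the LLM: folds the history text into the question."""
    history_text = " ".join(content for _, content in messages[1:-1])
    question = messages[-1][1]
    return f"{question} (context: {history_text})"


def fake_retriever(query, k=1):
    """Stand-in for the vector store retriever: ranks by shared words."""
    query_words = tokenize(query)
    ranked = sorted(
        CORPUS,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def history_aware_retrieve(inputs):
    chat_history = inputs.get("chat_history")
    if not chat_history:
        # No history: the question goes to the retriever unchanged.
        return fake_retriever(inputs["input"])
    # Step 1: fill the prompt with the history and the latest input.
    messages = [
        ("system", SYSTEM_INSTRUCTION),
        *chat_history,
        ("human", inputs["input"]),
    ]
    # Step 2: ask the LLM for a standalone question.
    standalone_question = fake_llm(messages)
    # Steps 3 and 4: retrieve documents for the rephrased question and return them.
    return fake_retriever(standalone_question)


question = "It is a way of doing what?"
history = [
    ("human", "What is ReAct?"),
    ("ai", "ReAct integrates reasoning and acting within LLM."),
]
with_history = history_aware_retrieve({"input": question, "chat_history": history})
without_history = history_aware_retrieve({"input": question, "chat_history": []})
print(with_history)
print(without_history)
```

In this toy setup, the same vague question retrieves the ReAct document only when the history is supplied, which mirrors the behaviour seen in the third example above.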

Note: the embedding used to turn text into vectors is the one specified when calling Chroma.from_documents (here, OpenAIEmbeddings). When none is specified, Chroma's default embedding is used.
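The point of this note is that queries are embedded with the same function that was used to index the documents. The toy sketch below illustrates that idea only; it is not how Chroma is implemented. TinyVectorStore, toy_embedding and cosine are hypothetical names, and the word-count "embedding" is a deliberately crude substitute for a real embedding model.

```python
import math
from collections import Counter


def toy_embedding(text):
    """Hypothetical embedding: word counts (a real model returns dense floats)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Toy store that remembers the embedding function it was built with."""

    def __init__(self, documents, embedding):
        self.embedding = embedding
        # Documents are embedded once, at indexing time.
        self.index = [(doc, embedding(doc)) for doc in documents]

    def search(self, query, k=1):
        # The query is embedded with the same function used for the documents.
        query_vector = self.embedding(query)
        ranked = sorted(
            self.index,
            key=lambda item: cosine(query_vector, item[1]),
            reverse=True,
        )
        return [doc for doc, _ in ranked[:k]]


store = TinyVectorStore(
    ["planning breaks tasks into subgoals", "memory stores past observations"],
    embedding=toy_embedding,
)
top = store.search("how does planning work")
print(top)
```

Because the store keeps a reference to the embedding function, document vectors and query vectors always live in the same space, which is what makes the similarity scores meaningful.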
