In this post, I’ll explain how I built a chatbot using the Llama2 model to query Excel data intelligently.
Python (≥ 3.8)
Libraries: langchain, pandas, unstructured, Chroma
%pip install -q unstructured langchain %pip install -q "unstructured[all-docs]"
import pandas as pd excel_path = "Book2.xlsx" if excel_path: df = pd.read_excel(excel_path) data = df.to_string(index=False) else: print("Upload an Excel file")
Large text data is split into smaller, overlapping chunks for effective embedding and querying. These chunks are stored in a Chroma vector database.
from langchain_text_splitters import RecursiveCharacterTextSplitter from langchain_community.embeddings import OllamaEmbeddings from langchain_community.vectorstores import Chroma text_splitter = RecursiveCharacterTextSplitter(chunk_size=7500, chunk_overlap=100) chunks = text_splitter.split_text(data) embedding_model = OllamaEmbeddings(model="nomic-embed-text", show_progress=False) vector_db = Chroma.from_texts( texts=chunks, embedding=embedding_model, collection_name="local-rag" )
We use ChatOllama to load the Llama2 model locally.
from langchain_community.chat_models import ChatOllama local_model = "llama2" llm = ChatOllama(model=local_model)
The chatbot will respond based on specific column names from the Excel file. We create a prompt template to guide the model
from langchain.prompts import PromptTemplate QUERY_PROMPT = PromptTemplate( input_variables=["question"], template="""You are an AI assistant. Answer the user's questions based on the column names: Id, order_id, name, sales, refund, and status. Original question: {question}""" )
We configure a retriever to fetch relevant chunks from the vector database, which will be used by the Llama2 model to answer questions.
from langchain.retrievers.multi_query import MultiQueryRetriever retriever = MultiQueryRetriever.from_llm( vector_db.as_retriever(), llm, prompt=QUERY_PROMPT )
The response chain integrates:
from langchain.prompts import ChatPromptTemplate from langchain_core.runnables import RunnablePassthrough from langchain_core.output_parsers import StrOutputParser template = """Answer the question based ONLY on the following context: {context} Question: {question} """ prompt = ChatPromptTemplate.from_template(template) chain = ( {"context": retriever, "question": RunnablePassthrough()} | prompt | llm | StrOutputParser() )
Now we’re ready to ask a question! Here’s how we invoke the chain to get a response:
raw_result = chain.invoke("How many rows are there?") final_result = f"{raw_result}\n\nIf you have more questions, feel free to ask!" print(final_result)
When I ran the above code on a sample Excel file, here’s what I got:
Based on the provided context, there are 10 rows in the table. If you have more questions, feel free to ask!
This approach leverages the power of embeddings and the Llama2 model to create a smart, interactive chatbot for Excel data. With some tweaks, you can extend this to work with other types of documents or integrate it into a full-fledged app!
Introducing BChat Excel: A Conversational AI-Powered Tool for Excel File Interactions
The above is the detailed content of Building a Simple Chatbot with LlamaChat with Excel]. For more information, please follow other related articles on the PHP Chinese website!