## Architecture

Build a custom chatbot that answers questions about your specific domain. This is a retrieval-augmented generation (RAG) pipeline: an open-source LLM served locally by Ollama, a Chroma vector store for retrieving relevant context from your documents, and a Gradio web interface for interaction.
## Set Up Ollama

```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3:8b
```
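Once installed, Ollama also serves a local REST API on port 11434, which is handy for a quick sanity check before wiring up LangChain. A minimal standard-library sketch (the helper names `build_generate_payload` and `ask_ollama` are ours; the `/api/generate` endpoint and its fields come from Ollama's API):

```python
import json
import urllib.request

def build_generate_payload(model: str, prompt: str, stream: bool = False) -> dict:
    """Build the JSON body for Ollama's POST /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": stream}

def ask_ollama(prompt: str, model: str = "llama3:8b",
               host: str = "http://localhost:11434") -> str:
    """Send a one-shot prompt to a locally running Ollama server."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_generate_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

If `ask_ollama("Say hello")` returns text, the model is installed and serving.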
## Build the Chatbot

```python
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# Set up the model and embeddings. Reusing llama3:8b for embeddings works,
# but a dedicated embedding model (e.g. `ollama pull nomic-embed-text`)
# is faster and usually retrieves better.
llm = Ollama(model="llama3:8b")
embeddings = OllamaEmbeddings(model="llama3:8b")

# Load your documents (a list of LangChain Document objects) into the vector store
vectorstore = Chroma.from_documents(documents, embeddings, persist_directory="/opt/chroma")

# Create a conversational chain that remembers prior turns
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    memory=memory,
)

# Chat
result = chain.invoke({"question": "What are your pricing plans?"})
print(result["answer"])
```
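`Chroma.from_documents` above assumes `documents` has already been loaded and split into chunks; in practice you would use LangChain's document loaders and a text splitter for that. To illustrate the idea without any dependencies, a sliding-window chunker might look like this (function name and defaults are our own choices):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with overlap, so content that
    straddles a chunk boundary still appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

The overlap matters for retrieval quality: without it, a sentence cut in half at a boundary may never match a query as a whole.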
## Web Interface with Gradio

```python
import gradio as gr

def chat(message, history):
    result = chain.invoke({"question": message})
    return result["answer"]

gr.ChatInterface(chat, title="My AI Assistant").launch(server_name="0.0.0.0")
```
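`ChatInterface` also accepts a generator callback: yield progressively longer partial answers and Gradio renders a live "typing" effect. A sketch, assuming the `llm` object from above (LangChain LLMs expose a `.stream()` method that yields tokens):

```python
def stream_reply(tokens):
    """Yield progressively longer partial replies -- the shape Gradio's
    ChatInterface expects from a generator callback."""
    partial = ""
    for token in tokens:
        partial += token
        yield partial

def chat_stream(message, history):
    # llm.stream(message) yields tokens as the model generates them
    yield from stream_reply(llm.stream(message))

# gr.ChatInterface(chat_stream, title="My AI Assistant").launch(server_name="0.0.0.0")
```

Note this streams the raw LLM, bypassing the retrieval chain; streaming through `ConversationalRetrievalChain` requires wiring up its own streaming support.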
## Features
- Runs entirely on your server
- Custom knowledge base from your documents
- Conversation memory for context
- No API costs or data sharing