## Architecture

Build a custom chatbot that answers questions about your specific domain. This is a retrieval-augmented generation (RAG) pipeline: an open-source LLM served locally by Ollama, a Chroma vector store for retrieving relevant context from your documents, and a Gradio web interface for interaction.
## Set Up Ollama

```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3:8b
```
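Once installed, Ollama also serves a local REST API on port 11434, which is handy for a quick sanity check before wiring up LangChain. A minimal standard-library sketch (the helper names `build_generate_payload` and `ask_ollama` are ours; the `/api/generate` endpoint and its fields come from Ollama's API):

```python
import json
import urllib.request

def build_generate_payload(model: str, prompt: str, stream: bool = False) -> dict:
    """Build the JSON body for Ollama's POST /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": stream}

def ask_ollama(prompt: str, model: str = "llama3:8b",
               host: str = "http://localhost:11434") -> str:
    """Send a one-shot prompt to a locally running Ollama server."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_generate_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

If `ask_ollama("Say hello")` returns text, the model is installed and serving.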
## Build the Chatbot

```python
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# Set up the model and embeddings. Reusing llama3:8b for embeddings works,
# but a dedicated embedding model (e.g. `ollama pull nomic-embed-text`)
# is faster and usually retrieves better.
llm = Ollama(model="llama3:8b")
embeddings = OllamaEmbeddings(model="llama3:8b")

# Load your documents (a list of LangChain Document objects) into the vector store
vectorstore = Chroma.from_documents(documents, embeddings, persist_directory="/opt/chroma")

# Create a conversational chain that remembers prior turns
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    memory=memory,
)

# Chat
result = chain.invoke({"question": "What are your pricing plans?"})
print(result["answer"])
```
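`Chroma.from_documents` above assumes `documents` has already been loaded and split into chunks; in practice you would use LangChain's document loaders and a text splitter for that. To illustrate the idea without any dependencies, a sliding-window chunker might look like this (function name and defaults are our own choices):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with overlap, so content that
    straddles a chunk boundary still appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

The overlap matters for retrieval quality: without it, a sentence cut in half at a boundary may never match a query as a whole.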
## Web Interface with Gradio

```python
import gradio as gr

def chat(message, history):
    result = chain.invoke({"question": message})
    return result["answer"]

gr.ChatInterface(chat, title="My AI Assistant").launch(server_name="0.0.0.0")
```
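`ChatInterface` also accepts a generator callback: yield progressively longer partial answers and Gradio renders a live "typing" effect. A sketch, assuming the `llm` object from above (LangChain LLMs expose a `.stream()` method that yields tokens):

```python
def stream_reply(tokens):
    """Yield progressively longer partial replies -- the shape Gradio's
    ChatInterface expects from a generator callback."""
    partial = ""
    for token in tokens:
        partial += token
        yield partial

def chat_stream(message, history):
    # llm.stream(message) yields tokens as the model generates them
    yield from stream_reply(llm.stream(message))

# gr.ChatInterface(chat_stream, title="My AI Assistant").launch(server_name="0.0.0.0")
```

Note this streams the raw LLM, bypassing the retrieval chain; streaming through `ConversationalRetrievalChain` requires wiring up its own streaming support.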
## Features
- Runs entirely on your server
- Custom knowledge base from your documents
- Conversation memory for context
- No API costs or data sharing