🔥 Agentic RAG with EmbeddingGemma
🎓 FREE Step-by-Step Tutorial
👉 Click here to follow our complete step-by-step tutorial and learn how to build this from scratch with detailed code walkthroughs, explanations, and best practices.
This Streamlit app demonstrates an agentic Retrieval-Augmented Generation (RAG) Agent using Google's EmbeddingGemma for embeddings and Llama 3.2 as the language model, all running locally via Ollama.
Features
- Local AI Models: Uses EmbeddingGemma for vector embeddings and Llama 3.2 for text generation
- PDF Knowledge Base: Dynamically add PDF URLs to build a knowledge base
- Vector Search: Efficient similarity search using LanceDB
- Interactive UI: Beautiful Streamlit interface for adding sources and querying
- Streaming Responses: Real-time response generation with tool call visibility
How to Get Started?
- Clone the GitHub repository
git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
cd awesome-llm-apps/rag_tutorials/agentic_rag_embedding_gemma
- Install the required dependencies:
pip install -r requirements.txt
- Ensure Ollama is installed and the server is running, then pull the required models:
ollama pull embeddinggemma:latest
ollama pull llama3.2:latest
- Run the Streamlit app:
streamlit run agentic_rag_embeddinggemma.py
(Note: The app file is in the root directory)
- Open your web browser to the URL provided (usually http://localhost:8501) to interact with the RAG agent.
How It Works?
- Knowledge Base Setup: Add PDF URLs in the sidebar to load and index documents.
- Embedding Generation: EmbeddingGemma creates vector embeddings for semantic search.
- Query Processing: User queries are embedded and searched against the knowledge base.
- Response Generation: Llama 3.2 generates answers based on retrieved context.
- Tool Integration: The agent uses search tools to fetch relevant information.
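The indexing and retrieval steps above can be sketched in plain Python. This is an illustrative toy, not the app's actual code: a hash-based `toy_embed` stands in for EmbeddingGemma, and a linear cosine-similarity scan stands in for LanceDB's vector search; in the real app the top-ranked chunks are handed to Llama 3.2 as context.

```python
import hashlib
import math

def toy_embed(text, dim=16):
    """Stand-in for EmbeddingGemma: hashes each word into a fixed-size
    vector. A real embedder captures semantics; this only captures word
    overlap, which is enough to show the mechanics."""
    vec = [0.0] * dim
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are unit-normalized, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# 1. Knowledge base setup: embed and index document chunks
#    (LanceDB does this with approximate nearest-neighbor search).
chunks = [
    "EmbeddingGemma is Google's open embedding model.",
    "Llama 3.2 is Meta's language model for text generation.",
    "LanceDB stores vectors for similarity search.",
]
index = [(chunk, toy_embed(chunk)) for chunk in chunks]

# 2-3. Query processing: embed the query, rank chunks by similarity.
query = "Which model generates the text?"
q_vec = toy_embed(query)
ranked = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)

# 4. Response generation: the top-ranked chunk(s) become the LLM's context.
top_context = ranked[0][0]
print(top_context)
```

The agent's search tool performs essentially this loop on demand, which is what makes the setup "agentic": the model decides when to call the retrieval tool rather than always receiving the same fixed context.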
Requirements
- Python 3.8+
- Ollama installed and running
- Required models: embeddinggemma:latest, llama3.2:latest
Technologies Used
- Agno: Framework for building AI agents
- Streamlit: Web app framework
- LanceDB: Vector database
- Ollama: Local LLM server
- EmbeddingGemma: Google's embedding model
- Llama 3.2: Meta's language model