Mirror of https://github.com/Shubhamsaboo/awesome-llm-apps.git (synced 2026-03-11 17:48:31 -05:00)
refactor: Update qwen_local_rag_agent to use Agno v2.0 and enhance README
- Changed import path for OllamaEmbedder to reflect new Agno structure.
- Switched from show_tool_calls to debug_mode for improved debugging experience.
@@ -1,6 +1,6 @@
# 🐋 Qwen 3 Local RAG Reasoning Agent

This RAG Application demonstrates how to build a powerful Retrieval-Augmented Generation (RAG) system using locally running Qwen 3 and Gemma 3 models via Ollama. It combines document processing, vector search, and web search capabilities to provide accurate, context-aware responses to user queries.
This RAG Application demonstrates how to build a powerful Retrieval-Augmented Generation (RAG) system using locally running Qwen 3 and Gemma 3 models via Ollama. It combines document processing, vector search, and web search capabilities to provide accurate, context-aware responses to user queries. Built with Agno v2.0.

## Features

@@ -29,6 +29,11 @@ This RAG Application demonstrates how to build a powerful Retrieval-Augmented Ge

- Qdrant vector database for efficient similarity search
- Persistent storage of document embeddings
- **🔧 Agno v2.0 Framework**:

- Uses Agno v2.0 Knowledge embedder system
- Debug mode for enhanced development experience
- Modern agent architecture with improved tool integration

## How to Get Started

@@ -36,8 +41,9 @@ This RAG Application demonstrates how to build a powerful Retrieval-Augmented Ge

- [Ollama](https://ollama.ai/) installed locally
- Python 3.8+
- Qdrant account (free tier available) for vector storage
- Qdrant running locally (via Docker) for vector storage
- Exa API key (optional, for web search capability)
- Agno v2.0 installed

### Installation

@@ -58,9 +64,11 @@ pip install -r requirements.txt

```bash
ollama pull qwen3:1.7b # Or any other model you want to use
ollama pull snowflake-arctic-embed # Or any other model you want to use
ollama pull snowflake-arctic-embed # For embeddings
```
4. Run Qdrant locally through docker

4. Run Qdrant locally through Docker:

```bash
docker pull qdrant/qdrant

@@ -69,12 +77,11 @@ docker run -p 6333:6333 -p 6334:6334 \
qdrant/qdrant
```

5. Get your API keys (optional):

4. Get your API keys:

- Exa API key (optional, for web search)
- Exa API key (for web search fallback capability)

5. Run the application:
6. Run the application:

```bash
streamlit run qwen_local_rag_agent.py
@@ -87,28 +94,36 @@ streamlit run qwen_local_rag_agent.py
- PDF files are processed using PyPDFLoader
- Web content is extracted using WebBaseLoader
- Documents are split into chunks with RecursiveCharacterTextSplitter
- Metadata is added to track source types and timestamps

2. **Vector Database**:

- Document chunks are embedded using Ollama's embedding models
- Document chunks are embedded using Ollama's embedding models via Agno's OllamaEmbedder
- Embeddings are stored in Qdrant vector database
- Similarity search retrieves relevant documents based on query
- Similarity search retrieves relevant documents based on query with configurable threshold

3. **Query Processing**:

- User queries are analyzed to determine the best information source
- System checks document relevance using similarity threshold
- Falls back to web search if no relevant documents are found
- Falls back to web search if no relevant documents are found (when enabled)
- Supports forced web search mode via toggle

4. **Response Generation**:

- Local LLM (Qwen/Gemma) generates responses based on retrieved context
- Local LLM (Qwen/Gemma/DeepSeek) generates responses based on retrieved context
- Agno agents use debug mode for enhanced visibility into tool calls
- Sources are cited and displayed to the user
- Web search results are clearly indicated when used
- Reasoning process is displayed for reasoning models

## Configuration Options

- **Model Selection**: Choose between different Qwen, Gemma, and DeepSeek models
- **RAG Mode**: Toggle between RAG-enabled and direct LLM interaction
- **Search Tuning**: Adjust similarity threshold for document retrieval
- **Search Tuning**: Adjust similarity threshold (0.0-1.0) for document retrieval
- **Web Search**: Enable/disable web search fallback and configure domain filtering
- **Debug Mode**: Agents use debug mode by default for better visibility into tool calls and execution flow

## Use Cases

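Review note: the query-processing behaviour described in the README hunk above (a similarity threshold in the 0.0-1.0 range, an optional web search fallback, and a forced web search toggle) could be sketched roughly as follows. This is a minimal illustration and not the logic in `qwen_local_rag_agent.py`; `ScoredChunk`, `choose_source`, the default threshold of 0.7 and the `"llm_only"` label are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScoredChunk:
    """A retrieved document chunk with its similarity score (hypothetical type)."""
    text: str
    score: float  # assumed to be normalised to the 0.0-1.0 range


def choose_source(
    chunks: List[ScoredChunk],
    threshold: float = 0.7,      # assumed default, not taken from the app
    web_fallback: bool = True,   # mirrors the "when enabled" wording in the README
    force_web: bool = False,     # mirrors the "forced web search mode" toggle
) -> Tuple[str, List[ScoredChunk]]:
    """Decide where the answer context should come from.

    Returns a (source, chunks) pair where source is "documents", "web" or
    "llm_only" and chunks holds the relevant document chunks, best first.
    """
    if force_web:
        return "web", []

    # Keep only chunks at or above the threshold, highest score first.
    relevant = sorted(
        (c for c in chunks if c.score >= threshold),
        key=lambda c: c.score,
        reverse=True,
    )
    if relevant:
        return "documents", relevant

    # Nothing relevant in the vector store: fall back only if allowed.
    if web_fallback:
        return "web", []
    return "llm_only", []
```

Keeping the routing decision in one pure function like this makes the threshold and the two toggles easy to unit-test separately from Qdrant, Exa and the Streamlit UI.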
@@ -13,7 +13,7 @@ from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams
from langchain_core.embeddings import Embeddings
from agno.tools.exa import ExaTools
from agno.embedder.ollama import OllamaEmbedder
from agno.knowledge.embedder.ollama import OllamaEmbedder


class OllamaEmbedderr(Embeddings):
@@ -254,7 +254,7 @@ def get_web_search_agent() -> Agent:
2. Compile and summarize the most relevant information
3. Include sources in your response
""",
show_tool_calls=True,
debug_mode=True,
markdown=True,
)

@@ -279,7 +279,7 @@ def get_rag_agent() -> Agent:

Always maintain high accuracy and clarity in your responses.
""",
show_tool_calls=True,
debug_mode=True,
markdown=True,
)

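Review note: this file moves the `OllamaEmbedder` import from `agno.embedder.ollama` to `agno.knowledge.embedder.ollama`. If the script ever had to run against both layouts, one option would be a small import shim like the sketch below. It is only an illustration: `import_first` is a hypothetical helper, and because this commit also pins `agno>=2.2.10` in the requirements, the fallback may never be needed in practice.

```python
import importlib
from typing import Any, Iterable, Optional


def import_first(module_paths: Iterable[str], attr: str) -> Optional[Any]:
    """Return `attr` from the first importable module in `module_paths`.

    Returns None when no candidate module can be imported or none of them
    defines `attr`, so the caller can decide how to report the problem.
    """
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            # Covers ModuleNotFoundError too; try the next candidate path.
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    return None


# New path (added by this commit) first, then the path it replaces.
OllamaEmbedder = import_first(
    ("agno.knowledge.embedder.ollama", "agno.embedder.ollama"),
    "OllamaEmbedder",
)
```

A caller would still need to check for `None` and raise a clear error before using the class.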
@@ -1,4 +1,4 @@
agno
agno>=2.2.10
pypdf
exa
qdrant-client