9.1 KiB
🧠 DevPulseAI — Multi-Agent Signal Intelligence
A reference implementation demonstrating how to build a multi-agent pipeline that aggregates technical signals from multiple sources, scores them for relevance, assesses risks, and synthesizes an actionable intelligence digest.
Design Philosophy: Agents are used only where reasoning is required. Deterministic operations (collection, normalization, deduplication) are implemented as plain utilities — not agents.
Architecture
┌─────────────────────────────────────────────────────────┐
│ DATA SOURCES │
│ GitHub · ArXiv · HackerNews · Medium · HuggingFace │
└──────────────────────┬──────────────────────────────────┘
│ raw signals
▼
┌──────────────────────────────────────────────────────────┐
│ SignalCollector (UTILITY — no LLM) │
│ • Normalizes to unified schema │
│ • Deduplicates via source:id composite key │
│ • Filters incomplete signals │
└──────────────────────┬───────────────────────────────────┘
│ normalized signals
▼
┌──────────────────────────────────────────────────────────┐
│ RelevanceAgent (AGENT — gpt-4.1-mini) │
│ • Scores each signal 0–100 for developer relevance │
│ • Considers: novelty, impact, actionability, timeliness │
│ • Falls back to heuristics if no API key │
└──────────────────────┬───────────────────────────────────┘
│ scored signals
▼
┌──────────────────────────────────────────────────────────┐
│ RiskAgent (AGENT — gpt-4.1-mini) │
│ • Assesses security vulnerabilities │
│ • Flags breaking changes and deprecations │
│ • Rates risk: LOW / MEDIUM / HIGH / CRITICAL │
└──────────────────────┬───────────────────────────────────┘
│ risk-assessed signals
▼
┌──────────────────────────────────────────────────────────┐
│ SynthesisAgent (AGENT — gpt-4.1) │
│ • Cross-references relevance + risk data │
│ • Produces executive summary │
│ • Generates actionable recommendations │
└──────────────────────┬───────────────────────────────────┘
│
▼
📄 Intelligence Digest
Why Signal Collection Is Not an Agent
This is an intentional, opinionated design choice — not a shortcut.
Signal collection involves:
- Fetching data from HTTP APIs (deterministic)
- Normalizing fields to a unified schema (mechanical transformation)
- Deduplicating by composite key (hash comparison)
None of these tasks require reasoning, judgment, or language understanding.
Wrapping collection in an Agent class would be decorative — it would have an LLM import that never gets called. This misleads readers into thinking an LLM is necessary, when the actual logic is a for loop with a set().
Rule of thumb: If you can write the logic as a pure function with no ambiguity, it's a utility. If the output depends on understanding context, making judgment calls, or generating natural language, it's an agent.
Agent Roles & Model Selection
| Component | Type | Model | Why This Model |
|---|---|---|---|
SignalCollector |
Utility | none | Deterministic — no reasoning required |
RelevanceAgent |
Agent | gpt-4.1-mini |
Classification task — fast, cheap, high-volume |
RiskAgent |
Agent | gpt-4.1-mini |
Structured analysis — careful but not expensive |
SynthesisAgent |
Agent | gpt-4.1 |
Cross-referencing & summarization — needs strongest reasoning |
Single provider by default (OpenAI) to reduce onboarding friction. Override per-agent via environment variables:
export MODEL_RELEVANCE=gpt-4.1-nano # cheaper, faster
export MODEL_RISK=o4-mini # deeper reasoning for risk
export MODEL_SYNTHESIS=gpt-4.1 # default, strongest
How to Run
Quick Verification (No API Key Required)
cd advanced_ai_agents/multi_agent_apps/devpulse_ai
python verify.py
This runs the full pipeline with mock data in <1 second. No network calls, no API keys.
Expected output:
[OK] DevPulseAI reference pipeline executed successfully
Full Pipeline (With API Key)
pip install -r requirements.txt
export OPENAI_API_KEY=sk-...
python main.py
Without an API key, agents automatically fall back to heuristic scoring.
Streamlit Dashboard
streamlit run streamlit_app.py
Project Structure
devpulse_ai/
├── agents/
│ ├── __init__.py # Package exports + design docs
│ ├── signal_collector.py # UTILITY — normalize & dedup
│ ├── relevance_agent.py # AGENT — score relevance (gpt-4.1-mini)
│ ├── risk_agent.py # AGENT — assess risks (gpt-4.1-mini)
│ └── synthesis_agent.py # AGENT — produce digest (gpt-4.1)
├── adapters/
│ ├── github.py # GitHub trending repos
│ ├── arxiv.py # ArXiv recent papers
│ ├── hackernews.py # HackerNews top stories
│ ├── medium.py # Medium AI/ML blogs
│ └── huggingface.py # HuggingFace trending models
├── workflows/
│ └── signal-intelligence-pipeline.json
├── main.py # Full pipeline runner
├── verify.py # Mock-data verification (<1s)
├── streamlit_app.py # Interactive dashboard
└── requirements.txt # Minimal deps (single provider)
Optional Extensions (Advanced Users)
These are not required for the reference implementation, but show how the architecture extends:
-
Multi-provider models — Swap
RelevanceAgentto use Anthropic Claude or Google Gemini by updating the model config. Theagnoframework supports multiple providers. -
Vector search — Add a Pinecone or Qdrant adapter to store and retrieve signals semantically for long-term pattern detection.
-
Streaming digests — Use WebSocket streaming from
SynthesisAgentfor real-time intelligence feeds. -
Custom adapters — Add new signal sources by implementing a
fetch_*function that returnsList[Dict]with the standard schema (id,source,title,description,url,metadata). -
Feedback loop — Store user feedback (👍/👎) in Supabase and use it to fine-tune relevance scoring over time.
Dependencies
agno # Agent framework
openai # LLM provider (single default)
httpx # HTTP client for adapters
feedparser # RSS/Atom parsing for Medium
streamlit>=1.30 # Interactive dashboard
No google-generativeai required. Gemini is an optional extension if users want multi-provider support — install google-genai (not the deprecated google-generativeai) separately.
Design Tradeoffs
| Decision | Tradeoff | Why |
|---|---|---|
| Single provider default | Less flexibility | Reduces onboarding from 2+ keys to 1 |
| Signal collection as utility | Less "agentic" demo | Honest architecture — agents where reasoning exists |
| Heuristic fallbacks | Lower quality without API key | Pipeline always works, even for evaluation |
| 5 signals per source default | Less data | Keeps demo fast (<10s with API, <1s mock) |
| No async in agents | Less throughput | Simpler code, clearer educational value |
Built as a reference implementation for awesome-llm-apps.