Add AI Research Planner & Executor with Gemini Interactions API

Shubhamsaboo
2026-01-01 13:43:59 -08:00
parent 69378d0105
commit a2518bf4b5
4 changed files with 213 additions and 0 deletions

@@ -110,6 +110,7 @@ A curated collection of **Awesome LLM apps built with RAG, AI Agents, Multi-agen
### 🚀 Advanced AI Agents
* [🏚️ 🍌 AI Home Renovation Agent with Nano Banana](advanced_ai_agents/multi_agent_apps/ai_home_renovation_agent)
* [🔍 AI Deep Research Agent](advanced_ai_agents/single_agent_apps/ai_deep_research_agent/)
* [🔬 AI Research Planner & Executor (Gemini Interactions API)](advanced_ai_agents/single_agent_apps/ai_research_planner_executor_interactions/)
* [🤝 AI Consultant Agent](advanced_ai_agents/single_agent_apps/ai_consultant_agent)
* [🏗️ AI System Architect Agent](advanced_ai_agents/single_agent_apps/ai_system_architect_r1/)
* [💰 AI Financial Coach Agent](advanced_ai_agents/multi_agent_apps/ai_financial_coach_agent/)

@@ -0,0 +1,106 @@
# 🔬 AI Research Planner & Executor with Gemini Interactions API
A streamlined multi-phase research assistant built with **Google's Gemini Interactions API** that demonstrates stateful conversations, model mixing, background execution, and automatic infographic generation.
## 🌟 Features
- **📋 Phase 1 - Research Planning**: Uses **Gemini 3 Flash** to create structured, actionable research plans
- **🔍 Phase 2 - Task Selection & Deep Research**: Select specific tasks, then run the **Deep Research Agent** with built-in web search
- **📊 Phase 3 - Synthesis + TL;DR**: Uses **Gemini 3 Pro** for executive reports + **Gemini 3 Pro Image** for automatic infographic generation
- **🎨 Auto-Generated Infographics**: Creates a whiteboard-style TL;DR summary at the top of every report
- **🔄 Stateful Conversations**: Demonstrates `previous_interaction_id` for maintaining context across phases
- **⚡ Background Execution**: Runs Deep Research with `background=True` and polls for completion with a progress bar
- **📥 Export Reports**: Download comprehensive research reports as markdown files
## 🎯 How It Works
```
User Goal
[Phase 1] Gemini 3 Flash → Research Plan
[Phase 2] Select Tasks → Deep Research Agent → Research Results
[Phase 3] Gemini 3 Pro → Executive Report
+ Gemini 3 Pro Image → TL;DR Infographic
```
### Phase 1: Planning
1. Enter your research goal
2. **Gemini 3 Flash** creates a numbered research plan with 5-8 specific tasks
3. The plan is stored as an `Interaction` (`store=True`) so later phases can continue from it
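
A minimal sketch of how the Phase 1 request might be assembled, mirroring the arguments this app passes to `client.interactions.create`. The helper name `build_plan_request` and its `min_tasks`/`max_tasks` parameters are hypothetical, and the network call is left commented out because it needs an API key.

```python
def build_plan_request(goal: str, min_tasks: int = 5, max_tasks: int = 8) -> dict:
    """Return keyword arguments for client.interactions.create (hypothetical helper)."""
    goal = goal.strip()
    if not goal:
        raise ValueError("research goal must not be empty")
    prompt = (
        f"Create a numbered research plan for: {goal}\n\n"
        "Format: 1. [Task] - [Details]\n\n"
        f"Include {min_tasks}-{max_tasks} specific tasks."
    )
    return {
        "model": "gemini-3-flash-preview",
        "input": prompt,
        "tools": [{"type": "google_search"}],
        "store": True,  # keep the interaction so later phases can chain to it
    }


request = build_plan_request("B2B HR SaaS market in Germany")

# Assumed usage with a configured client (client = genai.Client(api_key=...)):
# plan = client.interactions.create(**request)
# plan.id is the value Phase 2 passes as previous_interaction_id
```

Keeping the request as a plain dict makes the prompt and model choice easy to inspect before any API call is made.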
### Phase 2: Select & Research
1. Review the research plan with checkboxes for each task
2. Select/deselect tasks to focus your research
3. The **Deep Research Agent** runs web research on the selected tasks, chained to the plan via `previous_interaction_id`
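
The selection step can be sketched as follows: split the numbered plan into tasks with a regular expression, then join the chosen tasks into the research prompt. The regex and prompt wording follow the app's script. The function `build_research_prompt` is a hypothetical name, and whitespace is collapsed with `split()` here rather than the script's `strip().replace()`.

```python
import re

# Matches "1. text", "1) text" or "1- text" up to the next numbered item, a blank line, or the end
TASK_PATTERN = re.compile(
    r"^(\d+)[.)\-]\s*(.+?)(?=\n\d+[.)\-]|\n\n|\Z)",
    re.MULTILINE | re.DOTALL,
)


def parse_tasks(plan_text: str) -> list[dict]:
    """Split a numbered plan into {"num", "text"} dicts, folding wrapped lines into one."""
    return [
        {"num": m.group(1), "text": " ".join(m.group(2).split())}
        for m in TASK_PATTERN.finditer(plan_text)
    ]


def build_research_prompt(tasks: list[dict], selected_nums: set[str]) -> str:
    """Join the selected tasks into the Deep Research prompt (hypothetical helper)."""
    chosen = [f"{t['num']}. {t['text']}" for t in tasks if t["num"] in selected_nums]
    if not chosen:
        raise ValueError("select at least one task")
    return "Research these tasks thoroughly with sources:\n\n" + "\n\n".join(chosen)


plan = (
    "1. Market size - Estimate TAM\n"
    "2. Key players - List top vendors\n"
    "3. Regulations - Summarize GDPR\n"
    "and works council rules"
)
tasks = parse_tasks(plan)
prompt = build_research_prompt(tasks, {"1", "3"})
print(prompt)
```

In the app, the resulting prompt is sent with `agent="deep-research-pro-preview-12-2025"` and the plan's interaction ID as `previous_interaction_id`.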
### Phase 3: Synthesis + Infographic
1. **Gemini 3 Pro** synthesizes findings into an executive report
2. **Gemini 3 Pro Image** automatically generates a whiteboard TL;DR infographic
3. The report is displayed with the infographic at the top, followed by the full text
4. Download the report text as a Markdown file
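
A small sketch of the image-extraction step in Phase 3: scan the response parts and take the first one that carries inline image bytes. The `SimpleNamespace` objects below are stand-ins for the SDK's response parts, not real API output, and `first_inline_image` is a hypothetical helper name.

```python
from types import SimpleNamespace
from typing import Optional


def first_inline_image(parts) -> Optional[bytes]:
    """Return the bytes of the first part with inline image data, or None if there is none."""
    for part in parts or []:
        inline = getattr(part, "inline_data", None)
        if inline is not None and getattr(inline, "data", None):
            return inline.data
    return None


# Stand-in parts: one text-only part, then one part carrying (fake) PNG bytes
parts = [
    SimpleNamespace(text="Here is your infographic", inline_data=None),
    SimpleNamespace(text=None, inline_data=SimpleNamespace(mime_type="image/png", data=b"\x89PNG...")),
]
image = first_inline_image(parts)
```

Returning `None` when no image is present lets the UI skip the TL;DR block and still show the text report.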
## 🛠️ Tech Stack
| Component | Technology |
|-----------|-----------|
| **Planning Model** | `gemini-3-flash-preview` |
| **Research Agent** | `deep-research-pro-preview-12-2025` |
| **Synthesis Model** | `gemini-3-pro-preview` |
| **Infographic Model** | `gemini-3-pro-image-preview` |
| **UI Framework** | Streamlit |
| **Python SDK** | `google-genai` |
## 🚀 How to Get Started
1. Clone the GitHub repository
```bash
git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
cd awesome-llm-apps/advanced_ai_agents/single_agent_apps/ai_research_planner_executor_interactions
```
2. Install the required dependencies:
```bash
pip install -r requirements.txt
```
3. Get your Google API Key
- Sign up for a [Google AI Studio account](https://ai.google.dev/) and obtain your API key.
4. Run the Streamlit App
```bash
streamlit run research_planner_executor_agent.py
```
5. Open your browser at `http://localhost:8501`
6. Enter your Google API key in the sidebar and start researching!
## 📝 Example Research Goals
- "Research the B2B HR SaaS market in Germany - key players, regulations, pricing models"
- "Analyze market opportunities for AI-powered customer support tools"
- "Investigate the competitive landscape for sustainable packaging in e-commerce"
- "Research regulatory requirements for fintech products targeting Gen Z"
## ⚠️ Notes
- **Beta API**: The Interactions API is in Beta - features may change
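- **Deep Research**: May take 2-5 minutes; the app polls every 3 seconds and stops waiting after about 300 seconds (the `timeout` default)
- **Agent vs Model**: Deep Research is invoked with the `agent` parameter, not `model`
- **Image Generation**: Infographic generation uses the standard `generate_content` API (`client.models.generate_content`), not the Interactions API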
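
The background-execution pattern noted above can be sketched as a polling loop with a wall-clock deadline. This is an illustrative variant, not the app's exact `wait_for_completion`. `poll_until_done` and `FakeInteractions` are hypothetical names, and the `"completed"` status string is an assumed stand-in (the script only checks for `"in_progress"`).

```python
import time
from types import SimpleNamespace


def poll_until_done(get_interaction, interaction_id, timeout=300.0, interval=3.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the status leaves "in_progress" or the deadline passes; return the last result."""
    deadline = clock() + timeout
    while True:
        interaction = get_interaction(interaction_id)
        if interaction.status != "in_progress" or clock() >= deadline:
            return interaction
        sleep(interval)


class FakeInteractions:
    """Stand-in for client.interactions that replays a fixed list of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.calls = 0

    def get(self, interaction_id):
        self.calls += 1
        return SimpleNamespace(id=interaction_id, status=next(self._statuses))


fake = FakeInteractions(["in_progress", "in_progress", "completed"])
result = poll_until_done(fake.get, "abc123", sleep=lambda _seconds: None)
print(result.status, fake.calls)
```

With a real client, `client.interactions.get` would be passed as `get_interaction`. Injecting `sleep` and `clock` keeps the loop testable without waiting.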
## 🔗 Resources
- [Gemini Interactions API Docs](https://ai.google.dev/gemini-api/docs/interactions)
- [Gemini Models](https://ai.google.dev/gemini-api/docs/models)
- [Google AI Studio](https://ai.google.dev/)
## 📄 License
Part of the [Awesome LLM Apps](https://github.com/Shubhamsaboo/awesome-llm-apps) collection.

@@ -0,0 +1,3 @@
google-genai>=1.55.0
streamlit>=1.40.0

@@ -0,0 +1,103 @@
"""Research Planner using Gemini Interactions API - demonstrates stateful conversations, model mixing, and background execution."""
import streamlit as st, time, re
from google import genai

def get_text(outputs): return "\n".join(o.text for o in (outputs or []) if hasattr(o, 'text') and o.text) or ""

def parse_tasks(text):
    return [{"num": m.group(1), "text": m.group(2).strip().replace('\n', ' ')}
            for m in re.finditer(r'^(\d+)[\.\)\-]\s*(.+?)(?=\n\d+[\.\)\-]|\n\n|\Z)', text, re.MULTILINE | re.DOTALL)]

def wait_for_completion(client, iid, timeout=300):
    progress, status, elapsed = st.progress(0), st.empty(), 0
    while elapsed < timeout:
        interaction = client.interactions.get(iid)
        if interaction.status != "in_progress": progress.progress(100); return interaction
        elapsed += 3; progress.progress(min(90, int(elapsed/timeout*100))); status.text(f"{elapsed}s..."); time.sleep(3)
    return client.interactions.get(iid)

# Setup
st.set_page_config(page_title="Research Planner", page_icon="🔬", layout="wide")
st.title("🔬 AI Research Planner & Executor Agent (Gemini Interactions API) ✨")
STATE_KEYS = ["plan_id", "plan_text", "tasks", "research_id", "research_text", "synthesis_text", "infographic"]
for k in STATE_KEYS:
    if k not in st.session_state: st.session_state[k] = [] if k == "tasks" else None

with st.sidebar:
    api_key = st.text_input("🔑 Google API Key", type="password")
    if st.button("Reset"):
        # Clear every phase's state with a plain loop rather than a side-effect list comprehension
        for k in STATE_KEYS: st.session_state[k] = [] if k == "tasks" else None
        st.rerun()
    st.markdown("""
### How It Works
1. **Plan** → Gemini 3 Flash creates research tasks
2. **Select** → Choose which tasks to research
3. **Research** → Deep Research Agent investigates
4. **Synthesize** → Gemini 3 Pro writes report + TL;DR infographic

Each phase chains via `previous_interaction_id` for context.
""")

client = genai.Client(api_key=api_key) if api_key else None
if not client: st.info("👆 Enter API key to start"); st.stop()

# Phase 1: Plan
research_goal = st.text_area("📝 Research Goal", placeholder="e.g., Research B2B HR SaaS market in Germany")
if st.button("📋 Generate Plan", disabled=not research_goal, type="primary"):
    with st.spinner("Planning..."):
        try:
            i = client.interactions.create(model="gemini-3-flash-preview", input=f"Create a numbered research plan for: {research_goal}\n\nFormat: 1. [Task] - [Details]\n\nInclude 5-8 specific tasks.", tools=[{"type": "google_search"}], store=True)
            st.session_state.plan_id, st.session_state.plan_text, st.session_state.tasks = i.id, get_text(i.outputs), parse_tasks(get_text(i.outputs))
        except Exception as e: st.error(f"Error: {e}")

# Phase 2: Select & Research
if st.session_state.plan_text:
    st.divider(); st.subheader("🔍 Select Tasks & Research")
    selected = [f"{t['num']}. {t['text']}" for t in st.session_state.tasks if st.checkbox(f"**{t['num']}.** {t['text']}", True, key=f"t{t['num']}")]
    st.caption(f"{len(selected)}/{len(st.session_state.tasks)} selected")
    if st.button("🚀 Start Deep Research", type="primary", disabled=not selected):
        with st.spinner("Researching (2-5 min)..."):
            try:
                # Deep Research is addressed via `agent`, runs in the background, and chains to the plan
                i = client.interactions.create(agent="deep-research-pro-preview-12-2025", input="Research these tasks thoroughly with sources:\n\n" + "\n\n".join(selected), previous_interaction_id=st.session_state.plan_id, background=True, store=True)
                i = wait_for_completion(client, i.id)
                st.session_state.research_id, st.session_state.research_text = i.id, get_text(i.outputs) or f"Status: {i.status}"
                st.rerun()
            except Exception as e: st.error(f"Error: {e}")

if st.session_state.research_text:
    st.divider(); st.subheader("📄 Research Results"); st.markdown(st.session_state.research_text)

# Phase 3: Synthesis + Infographic
if st.session_state.research_id:
    if st.button("📊 Generate Executive Report", type="primary"):
        with st.spinner("Synthesizing report..."):
            try:
                i = client.interactions.create(model="gemini-3-pro-preview", input=f"Create executive report with Summary, Findings, Recommendations, Risks:\n\n{st.session_state.research_text}", previous_interaction_id=st.session_state.research_id, store=True)
                st.session_state.synthesis_text = get_text(i.outputs)
            except Exception as e: st.error(f"Error: {e}"); st.stop()
        with st.spinner("Creating TL;DR infographic..."):
            try:
                response = client.models.generate_content(
                    model="gemini-3-pro-image-preview",
                    contents=f"Create a whiteboard summary infographic for the following: {st.session_state.synthesis_text}"
                )
                for part in response.candidates[0].content.parts:
                    if hasattr(part, 'inline_data') and part.inline_data:
                        st.session_state.infographic = part.inline_data.data
                        break
            except Exception as e: st.warning(f"Infographic error: {e}")
        st.rerun()

if st.session_state.synthesis_text:
    st.divider(); st.markdown("## 📊 Executive Report")
    # TL;DR Infographic at the top
    if st.session_state.infographic:
        st.markdown("### 🎨 TL;DR")
        st.image(st.session_state.infographic, use_container_width=True)
        st.divider()
    st.markdown(st.session_state.synthesis_text)
    st.download_button("📥 Download Report", st.session_state.synthesis_text, "research_report.md", "text/markdown")

st.divider(); st.caption("[Gemini Interactions API](https://ai.google.dev/gemini-api/docs/interactions)")