feat: Add Multimodal UI/UX Feedback Agent Team

- Introduced a new multi-agent system for analyzing landing page designs and providing expert UI/UX feedback.
- Implemented a Coordinator/Dispatcher pattern with specialized agents for visual analysis, design strategy, and visual implementation.
- Added tools for editing and generating improved landing pages, along with comprehensive reporting features.
This commit is contained in:
Shubhamsaboo
2025-10-11 16:23:02 -07:00
parent d2025175f8
commit ec0990b29c
6 changed files with 975 additions and 0 deletions


@@ -129,6 +129,7 @@ A curated collection of **Awesome LLM apps built with RAG, AI Agents, Multi-agen
* [👨‍🏫 AI Teaching Agent Team](advanced_ai_agents/multi_agent_apps/agent_teams/ai_teaching_agent_team/)
* [💻 Multimodal Coding Agent Team](advanced_ai_agents/multi_agent_apps/agent_teams/multimodal_coding_agent_team/)
* [✨ Multimodal Design Agent Team](advanced_ai_agents/multi_agent_apps/agent_teams/multimodal_design_agent_team/)
* [🎨 🍌 Multimodal UI/UX Feedback Agent Team with Nano Banana](advanced_ai_agents/multi_agent_apps/agent_teams/multimodal_uiux_feedback_agent_team/)
* [🌏 AI Travel Planner Agent Team](/advanced_ai_agents/multi_agent_apps/agent_teams/ai_travel_planner_agent_team/)
### 🗣️ Voice AI Agents


@@ -0,0 +1,141 @@
# 🎨 🍌 Multimodal UI/UX Feedback Agent Team with Nano Banana
A sophisticated multi-agent system built with Google ADK that analyzes landing page designs, provides expert UI/UX feedback, and automatically generates improved versions using Gemini 2.5 Flash's multimodal capabilities.
## Features
- **👁️ Visual AI Analysis**: Upload landing page screenshots - agents automatically analyze layout, typography, colors, and UX patterns
- **🎯 Expert Feedback**: Comprehensive critique covering visual hierarchy, accessibility, conversion optimization, and design best practices
- **✨ Automatic Improvements**: Generates improved landing page designs incorporating all recommendations
- **📊 Detailed Reports**: Creates comprehensive reports summarizing issues and improvements made
- **🤖 Multi-Agent Orchestration**: Demonstrates Coordinator/Dispatcher + Sequential Pipeline patterns
- **♻️ Iterative Refinement**: Edit and refine generated designs based on additional feedback
- **♿ Accessibility Focus**: WCAG compliance checks and recommendations
## How It Works
The system uses a **Coordinator/Dispatcher pattern**: a root coordinator routes each request to an Info Agent, a Design Editor, or an analysis pipeline of three specialized agents that run in sequence:
### The Team
1. **UI Critic Agent** 🎨
   - Analyzes landing page design using Gemini 2.5 Flash's vision capabilities
   - Can see and analyze uploaded images directly (no manual tool calls needed)
   - Evaluates layout, visual hierarchy, typography, color scheme, and CTA effectiveness
   - Identifies critical issues and improvement opportunities
   - Provides detailed scores across multiple dimensions
   - References specific elements and provides actionable feedback
2. **Design Strategist Agent** 📐
   - Creates comprehensive improvement plan based on analysis
   - Specifies exact colors (with hex codes), typography, and spacing
   - Prioritizes changes for maximum impact
   - Ensures accessibility compliance (WCAG AA)
   - Considers mobile responsiveness
3. **Visual Implementer Agent** 🚀
   - Generates improved landing page design using Gemini 2.5 Flash
   - Implements all recommendations from the analysis
   - Creates high-quality, professional designs
   - Generates comprehensive improvement report
   - Maintains version history
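The hand-off between the three agents can be pictured with a small plain-Python sketch. It does not use Google ADK; the stage functions and the `State` dictionary are hypothetical stand-ins that only illustrate how each stage reads what the previous one produced.

```python
from typing import Callable, Dict, List

State = Dict[str, str]
Stage = Callable[[State], State]


def ui_critic(state: State) -> State:
    # Stage 1: record findings about the uploaded screenshot
    return {**state, "analysis": f"critique of {state['screenshot']}"}


def design_strategist(state: State) -> State:
    # Stage 2: turn the critique into a concrete improvement plan
    return {**state, "plan": f"plan based on [{state['analysis']}]"}


def visual_implementer(state: State) -> State:
    # Stage 3: produce a design description from the plan
    return {**state, "design": f"design following [{state['plan']}]"}


def run_pipeline(stages: List[Stage], state: State) -> State:
    # Run the stages in order, passing the accumulated state along
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline(
    [ui_critic, design_strategist, visual_implementer],
    {"screenshot": "landing_page.png"},
)
```

Because each stage depends on a key written by the one before it, running the stages out of order fails immediately, which is the property a sequential pipeline relies on.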
## Quick Start
### 1. Clone the repository
```bash
git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
cd awesome-llm-apps/advanced_ai_agents/multi_agent_apps/agent_teams/multimodal_uiux_feedback_agent_team
```
### 2. Install dependencies
```bash
pip install -r requirements.txt
```
### 3. Set up your API key
```bash
export GOOGLE_API_KEY="your_gemini_api_key"
```
Or create a `.env` file:
```
GOOGLE_API_KEY=your_gemini_api_key
```
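If you want to verify the key before launching, a minimal check could look like the sketch below. `resolve_api_key` is a hypothetical helper, not part of this project; the two variable names are the ones the project's tools look for, and checking `GOOGLE_API_KEY` first is an assumption about precedence.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty API key found, or None if neither is set."""
    # Assumed precedence: GOOGLE_API_KEY first, then GEMINI_API_KEY
    for name in ("GOOGLE_API_KEY", "GEMINI_API_KEY"):
        value = env.get(name)
        if value:
            return value
    return None


if resolve_api_key() is None:
    print("Set GOOGLE_API_KEY (or GEMINI_API_KEY) before running adk web")
```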
### 4. Launch ADK Web
```bash
# From the project folder entered in step 1, move up to agent_teams
cd ..
adk web
```
### 5. Open browser
Navigate to the ADK Web interface and select **multimodal_uiux_feedback_agent_team**
## Tools & Capabilities
### Core Tools
- **Direct Vision Analysis**: Agents can see and analyze uploaded images automatically (no tool needed)
- **edit_landing_page_image**: Refine existing designs based on feedback
- **generate_improved_landing_page**: Create new improved designs from scratch
- **google_search**: Research UI/UX trends and best practices
### Features
- **Native Vision Capabilities**: Agents automatically see uploaded images in conversations
- **Versioned artifacts**: Automatic version tracking for all designs
- **State management**: Maintains context across the conversation
- **Detailed prompts**: Generates ultra-specific prompts for high-quality results
- **Sequential Processing**: Each agent builds on previous agent's analysis
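The versioned-artifact naming can be sketched in plain Python as follows. This is an illustration only: it keeps the counter in an ordinary dictionary instead of ADK session state, and `record_new_version` is a hypothetical helper name.

```python
from typing import Dict


def versioned_filename(asset_name: str, version: int, ext: str = "png") -> str:
    # Files are named <asset_name>_v<version>.<ext>
    return f"{asset_name}_v{version}.{ext}"


def record_new_version(versions: Dict[str, int], asset_name: str) -> str:
    # Bump the per-asset counter and return the filename for the new version
    version = versions.get(asset_name, 0) + 1
    versions[asset_name] = version
    return versioned_filename(asset_name, version)


versions: Dict[str, int] = {}
first = record_new_version(versions, "landing_page_improved")
second = record_new_version(versions, "landing_page_improved")
```

Each asset name has its own counter, so edits to one design do not affect the numbering of another.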
## Multi-Agent Architecture
```
Coordinator (Root Agent)
├── Info Agent (general Q&A)
├── Design Editor (iterative refinements)
└── Analysis Pipeline (Sequential)
    ├── UI Critic (visual analysis & feedback)
    ├── Design Strategist (improvement planning)
    └── Visual Implementer (generate improved design + report)
```
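The coordinator's routing can be approximated by the rule-based sketch below. In the actual system the decision is made by an LLM following its instructions, so this is only an illustration; the `Request` fields are hypothetical, while the returned names match the agents in the tree above.

```python
from dataclasses import dataclass


@dataclass
class Request:
    text: str
    has_image: bool = False       # a screenshot is attached
    design_exists: bool = False   # an improved design was generated earlier
    wants_changes: bool = False   # the user asks to modify that design


def route(request: Request) -> str:
    # An uploaded image always takes priority and triggers the full pipeline
    if request.has_image:
        return "AnalysisPipeline"
    # Refinements only make sense once a design exists
    if request.design_exists and request.wants_changes:
        return "DesignEditor"
    # Everything else is treated as a general question
    return "InfoAgent"
```

Checking the image case first mirrors the priority the coordinator gives to new analysis requests.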
## Best Practices for Users
### Getting Better Results
1. **Use High-Quality Screenshots**
   - Full-page captures preferred
   - Minimum 1920x1080 resolution
   - Clear, uncompressed images
2. **Provide Context**
   - Mention target audience (B2B, B2C, enterprise, consumer)
   - Share goals (conversions, awareness, engagement)
   - Specify any constraints or requirements
3. **Be Specific with Refinements**
   - Specific: "Make the CTA button 20% larger with vibrant orange color"
   - Too vague: "Make the button better"
4. **Iterate Gradually**
   - Make one category of changes at a time
   - Review each version before requesting more changes
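To check a screenshot against the suggested minimum resolution before uploading, a standard-library sketch like the one below reads the dimensions from a PNG header. It assumes the file is a PNG; `png_dimensions` and `meets_minimum` are hypothetical helper names, not part of this project.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> Tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a PNG file."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR" tag,
    # then width and height as big-endian unsigned 32-bit integers
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def meets_minimum(data: bytes, min_width: int = 1920, min_height: int = 1080) -> bool:
    width, height = png_dimensions(data)
    return width >= min_width and height >= min_height
```

Typical usage is to read the first bytes of the file with `open(path, "rb")` and pass them to `meets_minimum`.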
### Common Use Cases
- **Landing Page Audits**: Comprehensive analysis of existing pages
- **Pre-Launch Review**: Get feedback before going live
- **A/B Testing Ideas**: Generate alternative designs to test
- **Competitive Analysis**: Compare your design to competitors
- **Accessibility Audit**: Check WCAG compliance
- **Mobile Optimization**: Review mobile responsiveness
- **Conversion Optimization**: Improve CTA and user flow
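For the accessibility use case, the WCAG AA contrast threshold that the agents reference (4.5:1 for normal text) can be checked by hand with the sketch below. It follows the WCAG relative-luminance definition; the function names are hypothetical, and the project's agents do not run this code themselves.

```python
def _linearize(channel: int) -> float:
    # Convert an 8-bit sRGB channel to linear light
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    # Expects a 6-digit hex color such as "#FF6B35"
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    # Ratio of the lighter to the darker luminance, each offset by 0.05
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_wcag_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    # AA requires 4.5:1 for normal text and 3:1 for large text
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)
```

Black on white gives the maximum ratio of 21:1, so it is a convenient sanity check before testing real brand colors.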
## Limitations
- Image generation has inherent variability (run multiple times for options)
- Complex interactions and animations cannot be fully captured
- Best suited for static landing page screenshots
- Real code implementation requires manual development
- Analysis focuses on visual design, not content quality or copy


@@ -0,0 +1,9 @@
"""AI UI/UX Feedback Team
A multi-agent system for analyzing landing pages and providing actionable feedback.
"""
from .agent import root_agent
__all__ = ["root_agent"]


@@ -0,0 +1,463 @@
from google.adk.agents import LlmAgent, SequentialAgent
from google.adk.tools import google_search
from google.adk.tools.agent_tool import AgentTool
from .tools import (
edit_landing_page_image,
generate_improved_landing_page,
)
# ============================================================================
# Helper Tool Agent (wraps google_search)
# ============================================================================
search_agent = LlmAgent(
name="SearchAgent",
model="gemini-2.5-flash",
description="Searches for UI/UX best practices, design trends, and accessibility guidelines",
instruction="Use google_search to find current UI/UX trends, design principles, WCAG guidelines, and industry best practices. Be concise and cite authoritative sources.",
tools=[google_search],
)
# ============================================================================
# Specialist Agent 1: Info Agent (for general inquiries)
# ============================================================================
info_agent = LlmAgent(
name="InfoAgent",
model="gemini-2.5-flash",
description="Handles general questions and provides system information about the UI/UX feedback team",
instruction="""
You are the Info Agent for the AI UI/UX Feedback Team.
WHEN TO USE: The coordinator routes general questions and casual greetings to you.
YOUR RESPONSE:
- Keep it brief and helpful (2-4 sentences)
- Explain the system analyzes landing pages using AI vision
- Mention capabilities: image analysis, constructive feedback, automatic improvements, comprehensive reports
- Ask them to upload a landing page screenshot for analysis
EXAMPLE:
"Hi! I'm part of the AI UI/UX Feedback Team. We analyze landing page designs using advanced AI vision, provide detailed constructive feedback on layout, typography, colors, and CTAs, then automatically generate improved versions with our recommendations applied. Upload a screenshot of your landing page and I'll get our expert team to review it!"
Be enthusiastic about design and helpful!
""",
)
# ============================================================================
# Specialist Agent 2: Design Editor (for iterative refinements)
# ============================================================================
design_editor = LlmAgent(
name="DesignEditor",
model="gemini-2.5-flash",
description="Edits existing landing page designs based on specific feedback or refinement requests",
instruction="""
You refine existing landing page designs based on user feedback.
**TASK**: User wants to modify an existing design (e.g., "make the CTA button larger", "use a different color scheme", "improve the hero section").
**CRITICAL**: Find the most recent design filename from conversation history!
Look for: "Saved as artifact: [filename]" or "landing_page_v1.png" type references.
Use **edit_landing_page_image** tool:
Parameters:
1. artifact_filename: The exact filename of the most recent design
2. prompt: Very specific edit instruction with UI/UX context
3. asset_name: Base name without _vX (e.g., "landing_page_improved")
**Example:**
User: "Make the CTA button more prominent"
Last design: "landing_page_improved_v1.png"
Call: edit_landing_page_image(
artifact_filename="landing_page_improved_v1.png",
prompt="Increase the CTA button size by 20%, use a high-contrast color (vibrant orange #FF6B35) to make it stand out more against the background. Add subtle shadow for depth. Ensure the button text is bold and clearly readable. Keep all other design elements unchanged.",
asset_name="landing_page_improved"
)
Be SPECIFIC in prompts and apply UI/UX best practices:
- Visual hierarchy (size, color, contrast)
- Whitespace and breathing room
- Typography hierarchy
- Color psychology
- Accessibility (WCAG)
After editing, briefly explain the UI/UX rationale for the changes.
""",
tools=[edit_landing_page_image],
)
# ============================================================================
# Specialist Agents 3-5: Full Analysis Pipeline (SequentialAgent)
# ============================================================================
ui_critic = LlmAgent(
name="UICritic",
model="gemini-2.5-flash",
description="Analyzes landing page design and provides comprehensive UI/UX feedback using visual AI",
instruction="""
You are a Senior UI/UX Designer with expertise in conversion optimization and accessibility.
**YOUR ROLE**: Analyze uploaded landing page images and provide expert, actionable feedback.
**IMPORTANT**: You can SEE and ANALYZE uploaded images directly using your vision capabilities.
The images are automatically visible to you in the conversation - no tools needed.
Focus on providing detailed analysis and specific recommendations.
## Analysis Framework
When you see a landing page image, examine it across these dimensions:
### 1. First Impression (1-10 rating)
- Visual appeal and professionalism
- Brand perception and trust signals
- Emotional impact
### 2. Layout & Visual Hierarchy ⭐ HIGH PRIORITY
- Hero section effectiveness (headline, subheadline, imagery)
- F-pattern or Z-pattern adherence
- Element sizing and positioning
- Above-the-fold content quality
- Alignment and grid usage
- Section spacing and flow
### 3. Typography
- Font choices (modern, professional, readable?)
- Heading hierarchy (H1, H2, H3 distinction)
- Body text readability (size 16px+, line height 1.5+, line length)
- Font pairing harmony
- Text contrast with background
### 4. Color Scheme & Contrast
- Brand color consistency
- Color psychology alignment with purpose
- Sufficient contrast for readability (WCAG AA: 4.5:1 for text)
- Color harmony (complementary, analogous, triadic?)
- Emotional response appropriateness
### 5. Call-to-Action (CTA) ⭐ CRITICAL
- CTA visibility and prominence (size, color, placement)
- Action-oriented copy ("Get Started" vs "Submit")
- Button design (contrast, hover states implied)
- Multiple CTAs coordination (primary vs secondary)
- Above-the-fold CTA presence
### 6. Whitespace & Balance
- Adequate breathing room around elements
- Cluttered vs clean sections
- Visual weight distribution
- Margins and padding consistency
### 7. Content Structure
- Information architecture clarity
- Content scanability
- Social proof placement (testimonials, logos, stats)
- Trust elements (security badges, guarantees)
### 8. Mobile Responsiveness Considerations
- Elements that may not translate well to mobile
- Touch target sizes
- Mobile-first design principles
## Output Structure
Provide feedback in this format:
**🎯 OVERALL IMPRESSION**
[Rating and 2-3 sentence summary]
**✅ WHAT WORKS WELL**
[List 3-5 strengths]
**⚠️ CRITICAL ISSUES** (High Priority)
1. [Issue with severity and specific location]
2. [Issue with severity and specific location]
3. [Issue with severity and specific location]
**📋 ADDITIONAL IMPROVEMENTS** (Medium/Low Priority)
[4-6 additional suggestions]
**🚀 TOP 3 IMPACT PRIORITIES**
1. [Most impactful change]
2. [Second most impactful change]
3. [Third most impactful change]
**📊 DETAILED SCORES**
- Layout & Hierarchy: X/10
- Typography: X/10
- Color & Contrast: X/10
- CTA Effectiveness: X/10
- Whitespace & Balance: X/10
**IMPORTANT**: At the end of your analysis, output a structured summary:
```
ANALYSIS COMPLETE
Images Analyzed: [Yes/No - describe what you see]
Key Issues Identified: [number]
Critical Priority: [main issue]
Target Audience: [detected or general]
```
Be DETAILED and SPECIFIC in your analysis - this drives the quality of the improvement plan and generated design.
**IF NO IMAGE IS VISIBLE**: Ask the user to upload a landing page screenshot so you can provide analysis.
""",
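tools=[AgentTool(search_agent)],
# Store the critique in session state under "latest_analysis", the key that
# the Design Strategist instruction and the generation tool read from
output_key="latest_analysis",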
)
design_strategist = LlmAgent(
name="DesignStrategist",
model="gemini-2.5-flash",
description="Creates detailed improvement plan based on UI/UX analysis",
instruction="""
Read from state: latest_analysis, key issues, priorities
You are a Design Strategist who creates actionable improvement plans.
**YOUR TASK**: Based on the UI Critic's analysis, create a SPECIFIC, DETAILED plan for improvements.
## Improvement Plan Structure
### 🎯 Design Strategy Overview
- Primary goal: [conversion optimization/brand awareness/user engagement]
- Target user: [persona]
- Key improvement theme: [modernization/simplification/boldness/etc.]
### 📐 Layout & Structure Improvements
**Changes to make:**
- Hero section: [specific modifications to headline, subheadline, imagery, CTA]
- Visual hierarchy: [size adjustments, reordering, emphasis changes]
- Grid system: [alignment fixes, column structure]
- Whitespace: [specific areas to add/reduce space]
### 🎨 Visual Design Improvements
**Color Palette:**
- Primary: [specific color with hex code and usage]
- Secondary: [specific color with hex code and usage]
- Accent (CTA): [high-contrast color with hex code]
- Background: [specific shade]
- Text colors: [with contrast ratios]
**Typography:**
- Heading font: [font name, size, weight]
- Body font: [font name, size, line height]
- CTA text: [font treatment]
- Hierarchy: [H1: Xpx, H2: Xpx, Body: 16-18px]
### 🎯 CTA Optimization
- Primary CTA: [exact text, color, size, placement]
- Secondary CTA: [if applicable]
- Button design: [shape, padding, shadow, hover effect]
### ♿ Accessibility Enhancements
- Contrast improvements needed: [specific areas]
- Font size increases: [where]
- Alt text considerations
- Focus states for interactive elements
### 📱 Mobile Considerations
- Elements to stack vertically
- Font size adjustments for mobile
- Touch target sizes (minimum 44x44px)
### 🔤 Content Recommendations
- Headline improvements: [more compelling/clearer]
- Subheadline clarity
- CTA copy: [action-oriented language]
- Trust signals to add/improve
**IMPORTANT: At the end, provide:**
```
DESIGN PLAN COMPLETE
Improvement Categories: [Layout, Color, Typography, CTA, Accessibility]
Estimated Impact: [High/Medium/Low]
Implementation Complexity: [Simple/Moderate/Complex]
Ready for visual implementation.
```
Be ULTRA-SPECIFIC with colors (hex codes), sizes (px), and placements. This drives the image generation quality.
""",
tools=[AgentTool(search_agent)],
)
visual_implementer = LlmAgent(
name="VisualImplementer",
model="gemini-2.5-flash",
description="Generates improved landing page design and creates comprehensive report",
instruction="""
Read conversation history to extract:
- UI Critic's detailed analysis
- Design Strategist's improvement plan
- Original landing page image (if visible in conversation)
**IMPORTANT**: You have VISION CAPABILITIES and can see images in the conversation.
If there's an original landing page image visible, use it as inspiration for the improved version.
**YOUR TASK**: Generate an improved landing page implementing ALL recommendations
Use **generate_improved_landing_page** tool with an EXTREMELY DETAILED prompt.
**Build the prompt by incorporating:**
From UI Critic:
- Critical issues to fix
- Top 3 priorities
- What currently works well (preserve these)
From Design Strategist:
- Exact color palette (with hex codes)
- Typography specifications (fonts, sizes, weights)
- Layout structure and hierarchy
- CTA design details
- Whitespace improvements
**Prompt Structure:**
"Professional landing page design with modern UI/UX best practices applied.
**Layout & Hierarchy:**
[Detailed description of hero section, content structure, visual flow]
**Color Palette:**
- Primary: [color name + hex code]
- Secondary: [color name + hex code]
- CTA/Accent: [high-contrast color + hex code]
- Background: [color + hex code]
- Text: [color with contrast ratio]
**Typography:**
- Headlines: [font, size, weight, color] - Clear hierarchy with [X]px for H1
- Body text: [font, 16-18px, line-height 1.6, color] - Highly readable
- CTA text: [font, size, weight] - Action-oriented
**Call-to-Action:**
[Detailed CTA button design: size, color, text, placement, shadow/effects]
**Visual Elements:**
- Hero image/graphic: [description]
- Section images: [description]
- Icons: [style and placement]
- Social proof: [testimonials, logos, stats placement]
**Whitespace & Balance:**
[Specific spacing between sections, margins, padding]
**Accessibility:**
- WCAG AA compliant contrast ratios
- Readable font sizes (16px minimum)
- Clear focus states
**Style:**
- Modern, clean, professional
- [Additional style keywords from analysis]
- High-quality UI design, Dribbble/Behance quality
Camera/Quality: Desktop web design screenshot, 16:9 aspect ratio, professional UI/UX portfolio quality"
Parameters:
- prompt: [your ultra-detailed prompt above]
- aspect_ratio: "16:9"
- asset_name: "landing_page_improved"
- reference_image: [filename of original if available]
**After generating the improved design, provide a brief summary:**
Describe the key improvements in 3-4 sentences:
- What critical issues were addressed
- Main visual/design changes applied
- Expected impact on user experience and conversion
**Example:**
"✅ **Improved Landing Page Generated!**
**Key Improvements Applied:**
- ✨ Enhanced visual hierarchy with larger hero headline (48px) and prominent CTA
- 🎨 Implemented high-contrast color scheme (#FF6B35 accent) with WCAG AA compliance
- 📝 Improved typography with clear heading hierarchy and 18px readable body text
- 🎯 Redesigned CTA button with vibrant accent color and better placement above-the-fold
- 💨 Optimized whitespace for better content flow and readability
The new design addresses all critical issues identified in the analysis and follows modern UI/UX best practices."
""",
tools=[generate_improved_landing_page],
)
# Create the analysis pipeline (runs only when coordinator routes analysis requests here)
analysis_pipeline = SequentialAgent(
name="AnalysisPipeline",
description="Full UI/UX analysis pipeline: Image Analysis → Design Strategy → Visual Implementation",
sub_agents=[
ui_critic,
design_strategist,
visual_implementer,
],
)
# ============================================================================
# Coordinator/Dispatcher (Root Agent)
# ============================================================================
root_agent = LlmAgent(
name="UIUXFeedbackTeam",
model="gemini-2.5-flash",
description="Intelligent coordinator that routes UI/UX feedback requests to the appropriate specialist or analysis pipeline. Supports landing page image analysis!",
instruction="""
You are the Coordinator for the AI UI/UX Feedback Team.
YOUR ROLE: Analyze the user's request and route it to the right specialist using transfer_to_agent.
**IMPORTANT**: You have VISION CAPABILITIES. If you see an image in the conversation, route to AnalysisPipeline immediately.
ROUTING LOGIC:
1. **For general questions/greetings** (NO images present):
→ transfer_to_agent to "InfoAgent"
→ Examples: "hi", "what can you do?", "how does this work?", "what is UI/UX?"
2. **For editing EXISTING designs** (only if a design was already generated):
→ transfer_to_agent to "DesignEditor"
→ Examples: "make the CTA bigger", "change the color scheme", "improve the hero section", "make it more modern"
→ User wants to MODIFY an existing improved design
→ Check: Was an improved design generated earlier in this conversation?
3. **For NEW landing page analysis** (PRIORITY ROUTE):
→ transfer_to_agent to "AnalysisPipeline"
→ Examples: "analyze this landing page", "review my design", "give me feedback"
→ **CRITICAL**: If you SEE an image in the conversation → ALWAYS route here!
→ First-time analysis or new project
→ This runs the full pipeline: UI Critic → Design Strategist → Visual Implementer
CRITICAL: You MUST use transfer_to_agent - don't answer directly!
Decision flow:
- **Image visible in conversation** → IMMEDIATELY transfer to AnalysisPipeline
- Design exists + wants changes → DesignEditor
- No image + asking questions → InfoAgent
Be a smart router - prioritize image analysis!
""",
sub_agents=[
info_agent,
design_editor,
analysis_pipeline,
],
)
__all__ = ["root_agent"]


@@ -0,0 +1,5 @@
google-adk
python-dotenv
pydantic


@@ -0,0 +1,356 @@
import logging
import os
from typing import Optional

from dotenv import load_dotenv
from google import genai
from google.adk.tools import ToolContext
from google.genai import types
from pydantic import BaseModel, Field

load_dotenv()

# Configure logging
logger = logging.getLogger(__name__)

# ============================================================================
# Helper Functions for Asset Version Management
# ============================================================================

def get_next_version_number(tool_context: ToolContext, asset_name: str) -> int:
    """Get the next version number for a given asset name."""
    asset_versions = tool_context.state.get("asset_versions", {})
    return asset_versions.get(asset_name, 0) + 1


def update_asset_version(tool_context: ToolContext, asset_name: str, version: int, filename: str) -> None:
    """Update the version tracking for an asset."""
    # Reassign updated copies instead of mutating the nested dicts in place,
    # so the change goes through the state object's own assignment path
    asset_versions = dict(tool_context.state.get("asset_versions", {}))
    asset_filenames = dict(tool_context.state.get("asset_filenames", {}))
    asset_versions[asset_name] = version
    asset_filenames[asset_name] = filename
    tool_context.state["asset_versions"] = asset_versions
    tool_context.state["asset_filenames"] = asset_filenames


def create_versioned_filename(asset_name: str, version: int, file_extension: str = "png") -> str:
    """Create a versioned filename for an asset."""
    return f"{asset_name}_v{version}.{file_extension}"


async def load_landing_page_image(tool_context: ToolContext, filename: str):
    """Load a landing page image artifact by filename. Returns None if missing."""
    try:
        loaded_part = await tool_context.load_artifact(filename)
        if loaded_part:
            logger.info(f"Successfully loaded landing page image: {filename}")
            return loaded_part
        logger.warning(f"Landing page image not found: {filename}")
        return None
    except Exception as e:
        logger.error(f"Error loading landing page image {filename}: {e}")
        return None

# ============================================================================
# Pydantic Input Models
# ============================================================================

class EditLandingPageInput(BaseModel):
    artifact_filename: str = Field(..., description="The filename of the landing page artifact to edit.")
    prompt: str = Field(..., description="Detailed description of UI/UX improvements to apply.")
    asset_name: Optional[str] = Field(default=None, description="Optional: specify asset name for the new version.")


class GenerateImprovedLandingPageInput(BaseModel):
    prompt: str = Field(..., description="A detailed description of the improved landing page based on feedback.")
    aspect_ratio: str = Field(default="16:9", description="The desired aspect ratio. Default is 16:9.")
    asset_name: str = Field(default="landing_page_improved", description="Base name for the improved design.")
    reference_image: Optional[str] = Field(default=None, description="Optional: filename of the original landing page to use as reference.")
# ============================================================================
# NOTE: Image Analysis is handled directly by agents with vision capabilities
# Agents with gemini-2.5-flash can see and analyze uploaded images automatically
# No separate image analysis tool is needed
# ============================================================================
# ============================================================================
# Image Editing Tool
# ============================================================================
async def edit_landing_page_image(tool_context: ToolContext, inputs: EditLandingPageInput) -> str:
    """
    Edits a landing page image by applying UI/UX improvements.

    This tool uses Gemini 2.5 Flash's image generation capabilities to create
    an improved version of the landing page based on feedback.
    """
    if "GEMINI_API_KEY" not in os.environ and "GOOGLE_API_KEY" not in os.environ:
        raise ValueError("GEMINI_API_KEY or GOOGLE_API_KEY environment variable not set.")

    logger.info("Starting landing page image editing")
    try:
        client = genai.Client()
        # Accept either a raw dict or an already-validated model instance
        inputs = EditLandingPageInput.model_validate(inputs)

        # Load the existing landing page image
        logger.info(f"Loading artifact: {inputs.artifact_filename}")
        try:
            loaded_image_part = await tool_context.load_artifact(inputs.artifact_filename)
            if not loaded_image_part:
                return f"❌ Could not find landing page artifact: {inputs.artifact_filename}"
        except Exception as e:
            logger.error(f"Error loading artifact: {e}")
            return f"Error loading landing page artifact: {e}"

        model = "gemini-2.5-flash-image"

        # Build edit prompt with UI/UX best practices
        enhanced_prompt = f"""
{inputs.prompt}
**Apply these UI/UX best practices while editing:**
- Maintain visual hierarchy (size, color, spacing)
- Ensure sufficient whitespace for breathing room
- Use consistent alignment and grid system
- Make CTAs prominent with contrasting colors
- Improve readability (font size, line height, contrast)
- Follow modern web design principles
- Keep the overall brand aesthetic
Make the improvements look natural and professional.
"""

        # Build content parts: the source image followed by the edit instructions
        contents = [
            types.Content(
                role="user",
                parts=[loaded_image_part, types.Part.from_text(text=enhanced_prompt)],
            ),
        ]
        generate_content_config = types.GenerateContentConfig(
            response_modalities=["IMAGE", "TEXT"],
        )

        # Determine asset name: explicit input, then session state, then the filename stem
        if inputs.asset_name:
            asset_name = inputs.asset_name
        else:
            asset_name = tool_context.state.get("current_asset_name")
            if not asset_name:
                asset_name = inputs.artifact_filename.split('_v')[0] if '_v' in inputs.artifact_filename else "landing_page"

        version = get_next_version_number(tool_context, asset_name)
        edited_artifact_filename = create_versioned_filename(asset_name, version)
        logger.info(f"Editing landing page with artifact filename: {edited_artifact_filename} (version {version})")

        # Edit the image
        for chunk in client.models.generate_content_stream(
            model=model,
            contents=contents,
            config=generate_content_config,
        ):
            if (
                not chunk.candidates
                or chunk.candidates[0].content is None
                or chunk.candidates[0].content.parts is None
            ):
                continue
            # The image is not guaranteed to be the first part, so check every part
            for part in chunk.candidates[0].content.parts:
                if part.inline_data and part.inline_data.data:
                    edited_image_part = types.Part(inline_data=part.inline_data)
                    try:
                        # Keep the artifact service's revision separate from our own
                        # counter; overwriting `version` here would reset the counter
                        artifact_version = await tool_context.save_artifact(
                            filename=edited_artifact_filename,
                            artifact=edited_image_part,
                        )
                        update_asset_version(tool_context, asset_name, version, edited_artifact_filename)
                        # Store in session state
                        tool_context.state["last_edited_landing_page"] = edited_artifact_filename
                        tool_context.state["current_asset_name"] = asset_name
                        logger.info(f"Saved edited landing page as artifact '{edited_artifact_filename}' (version {version}, artifact revision {artifact_version})")
                        return f"✅ **Landing page edited successfully!**\n\nSaved as: **{edited_artifact_filename}** (version {version} of {asset_name})\n\nThe landing page has been improved with the UI/UX enhancements."
                    except Exception as e:
                        logger.error(f"Error saving edited artifact: {e}")
                        return f"Error saving edited landing page as artifact: {e}"
                elif part.text:
                    logger.info(f"Model response: {part.text}")

        return "No edited landing page was generated. Please try again."
    except Exception as e:
        logger.error(f"Error in edit_landing_page_image: {e}")
        return f"An error occurred while editing the landing page: {e}"
# ============================================================================
# Generate Improved Landing Page Tool
# ============================================================================
async def generate_improved_landing_page(tool_context: ToolContext, inputs: GenerateImprovedLandingPageInput) -> str:
    """
    Generates an improved landing page based on the analysis and feedback.

    This tool creates a new landing page design incorporating all the recommended
    UI/UX improvements. Can work with or without a reference image.
    """
    if "GEMINI_API_KEY" not in os.environ and "GOOGLE_API_KEY" not in os.environ:
        raise ValueError("GEMINI_API_KEY or GOOGLE_API_KEY environment variable not set.")

    logger.info("Starting improved landing page generation")
    try:
        client = genai.Client()
        # Accept either a raw dict or an already-validated model instance
        inputs = GenerateImprovedLandingPageInput.model_validate(inputs)

        # Note: Reference images from the conversation are automatically available to agents
        # This parameter is kept for backwards compatibility with saved artifacts
        reference_part = None
        if inputs.reference_image:
            try:
                reference_part = await load_landing_page_image(tool_context, inputs.reference_image)
                if reference_part:
                    logger.info(f"Using reference image artifact: {inputs.reference_image}")
            except Exception as e:
                logger.warning(f"Could not load reference image, proceeding without it: {e}")

        # Get the analysis from state to incorporate feedback
        latest_analysis = tool_context.state.get("latest_analysis", "")

        # Build enhanced prompt
        enhancement_prompt = f"""
Create a professional landing page design that incorporates these improvements:
{inputs.prompt}
**Previous Analysis Insights:**
{latest_analysis[:500] if latest_analysis else "No previous analysis available"}
**Design Requirements:**
- Modern, clean aesthetic
- Clear visual hierarchy
- Prominent, well-designed CTAs
- Proper whitespace and breathing room
- Professional typography with clear hierarchy
- Accessible color contrast (WCAG AA)
- Mobile-first responsive considerations
- Follow the latest UI/UX best practices
- High-quality, photorealistic rendering
Aspect ratio: {inputs.aspect_ratio}
Create a professional UI/UX design that would be magazine-quality.
"""

        # Let the text model expand the prompt first; fall back to the
        # original prompt if it returns no text
        rewritten_prompt_response = client.models.generate_content(
            model="gemini-2.5-flash",
            contents=enhancement_prompt,
        )
        rewritten_prompt = rewritten_prompt_response.text or enhancement_prompt
        logger.info(f"Enhanced prompt: {rewritten_prompt}")

        model = "gemini-2.5-flash-image"
        contents = [
            types.Content(
                role="user",
                parts=[types.Part.from_text(text=rewritten_prompt)] + ([reference_part] if reference_part else []),
            ),
        ]
        generate_content_config = types.GenerateContentConfig(
            response_modalities=["IMAGE", "TEXT"],
        )

        # Generate versioned filename
        version = get_next_version_number(tool_context, inputs.asset_name)
        artifact_filename = create_versioned_filename(inputs.asset_name, version)
        logger.info(f"Generating improved landing page with filename: {artifact_filename} (version {version})")

        # Generate the image
        for chunk in client.models.generate_content_stream(
            model=model,
            contents=contents,
            config=generate_content_config,
        ):
            if (
                not chunk.candidates
                or chunk.candidates[0].content is None
                or chunk.candidates[0].content.parts is None
            ):
                continue
            # The image is not guaranteed to be the first part, so check every part
            for part in chunk.candidates[0].content.parts:
                if part.inline_data and part.inline_data.data:
                    image_part = types.Part(inline_data=part.inline_data)
                    try:
                        # Keep the artifact service's revision separate from our own
                        # counter; overwriting `version` here would reset the counter
                        artifact_version = await tool_context.save_artifact(
                            filename=artifact_filename,
                            artifact=image_part,
                        )
                        update_asset_version(tool_context, inputs.asset_name, version, artifact_filename)
                        tool_context.state["last_generated_landing_page"] = artifact_filename
                        tool_context.state["current_asset_name"] = inputs.asset_name
                        logger.info(f"Saved improved landing page as artifact '{artifact_filename}' (version {version}, artifact revision {artifact_version})")
                        return f"✅ **Improved landing page generated successfully!**\n\nSaved as: **{artifact_filename}** (version {version} of {inputs.asset_name})\n\nThis design incorporates all the recommended UI/UX improvements."
                    except Exception as e:
                        logger.error(f"Error saving artifact: {e}")
                        return f"Error saving improved landing page as artifact: {e}"
                elif part.text:
                    logger.info(f"Model response: {part.text}")

        return "No improved landing page was generated. Please try again with a more detailed prompt."
    except Exception as e:
        logger.error(f"Error in generate_improved_landing_page: {e}")
        return f"An error occurred while generating the improved landing page: {e}"
# ============================================================================
# Note: No utility tools needed - agents handle everything directly
# ============================================================================