awesome-llm-apps

A Streamlit application that combines video analysis and web search using Google's Gemini 2.5 models. The agent analyzes uploaded videos and answers questions by combining visual understanding with web search results.

Features

Video analysis using Gemini 2.5 Flash/Pro
Web research integration via DuckDuckGo
Support for multiple video formats (MP4, MOV, AVI)
Real-time video processing
Combined visual and textual analysis
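
The format support listed above could be enforced with a small extension check before an uploaded file is passed on for analysis. This is a minimal sketch, assuming only the three formats named in the feature list; the helper name `is_supported_video` and the `SUPPORTED_EXTENSIONS` set are hypothetical and not taken from the app's source.

```python
from pathlib import Path

# Extensions named in the feature list. Storing them as a set is an
# assumption for illustration, not the app's actual implementation.
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".avi"}


def is_supported_video(filename: str) -> bool:
    """Return True if the file name ends in one of the listed video extensions."""
    # Compare case-insensitively so "clip.MP4" and "clip.mp4" are treated alike.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

A check like this only inspects the file name; it does not confirm that the file contents are a valid video.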

How to Get Started

Clone the GitHub repository

git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
cd awesome-llm-apps/ai_agent_tutorials/multimodal_ai_agent

Install the required dependencies:

pip install -r requirements.txt

Get your Google Gemini API Key

Set your Gemini API key as an environment variable

export GOOGLE_API_KEY=your_api_key_here
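
To confirm that the key is visible to Python before launching the app, a quick check along these lines may help. This is a sketch only: the `get_api_key` helper is hypothetical, and the app itself may read the variable differently.

```python
import os


def get_api_key(env=None) -> str:
    """Return the value of GOOGLE_API_KEY, or raise a clear error if it is missing."""
    # Accept an optional mapping so the check can be exercised without
    # touching the real environment; defaults to os.environ.
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; set it in your shell before running the app."
        )
    return key
```

Calling `get_api_key()` in a Python shell started from the same terminal shows whether the variable was set in that session.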

Run the Streamlit App

streamlit run multimodal_agent.py