evMLP: An Efficient Event-Driven MLP Architecture for Vision
Updated Nov 25, 2025 - Python
AI-powered browser automation agent using a dual-LLM architecture. The orchestrator (qwen3-vl-32k) creates execution plans from screenshots, while the executor (llama3.1:8b) translates steps into browser actions using an accessibility tree for reliable element selection. Local, private, powered by Ollama.
🔍 A CLIP-powered image similarity finder built with Streamlit — upload a query image and find the most visually similar matches from a gallery using deep visual embeddings.
AI Nutrition Vision analyzes food images using OpenAI Vision to detect food items and produce detailed nutrition insights (calories, protein, fat, serving size, etc.), presented in a clean Streamlit UI.
Next-gen AI Optical Music Recognition (OMR) platform. Convert sheet music images into playable ABC notation instantly using Google Gemini 3 Pro Vision. Built with React 19, TypeScript, and Tailwind.