Fuyu-8B is a small (8-billion-parameter) multimodal model developed by Adept that understands both images and text. Its architecture is simpler than that of many other multimodal models, which makes it easier to understand and to scale up. Fuyu-8B is designed specifically for digital agents: it can handle images at any resolution, interpret charts and diagrams, and answer questions about user interfaces. Although it is optimized for agent use cases, it still performs well on standard image-understanding tasks such as visual question answering.
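
One way a model can accept images at any resolution is to cut each image into fixed-size patches and let the number of patches vary with the image size. The sketch below illustrates that general idea only. It is not Adept's code, and the function names (`patchify`, `patch_count`), the zero padding at the edges, and the patch size of 30 pixels are all assumptions chosen for illustration.

```python
from typing import List

# Illustrative type: a grayscale image as a list of rows of pixel values.
Image = List[List[int]]


def patch_count(height: int, width: int, patch_size: int) -> int:
    """Number of patches needed to cover an image of the given size."""
    rows = -(-height // patch_size)  # ceiling division
    cols = -(-width // patch_size)
    return rows * cols


def patchify(image: Image, patch_size: int) -> List[List[int]]:
    """Split an image into flattened square patches in raster order.

    Pixels that fall outside the image (right and bottom edges) are
    filled with 0. This padding rule is an assumption of the sketch.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    rows = -(-height // patch_size)
    cols = -(-width // patch_size)

    patches: List[List[int]] = []
    for patch_row in range(rows):
        for patch_col in range(cols):
            patch: List[int] = []
            for y in range(patch_row * patch_size, (patch_row + 1) * patch_size):
                for x in range(patch_col * patch_size, (patch_col + 1) * patch_size):
                    inside = y < height and x < width
                    patch.append(image[y][x] if inside else 0)
            patches.append(patch)
    return patches


if __name__ == "__main__":
    # The sequence length grows with resolution; nothing is resized.
    print(patch_count(300, 300, 30))    # 100
    print(patch_count(1080, 1920, 30))  # 2304
```

Because the patch size is fixed and only the patch count changes, a screenshot and a small chart can both be fed to the same model without rescaling either one to a common resolution.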