mbleigh.dev • 6 days ago

What can you do with multimodal LLMs? How about identifying objects by name, description, color, and even drawing a bounding box around them?

🖼️ ➡️ 📄

Gemini makes it possible, Genkit makes it simple.
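
To make the bounding-box part concrete, here is a minimal sketch of what you might do with such output once you have it. The post does not show the response format, so everything here is an assumption: the `DetectedObject` shape, the `toPixelBox` helper, and the `[yMin, xMin, yMax, xMax]` ordering on a 0 to 1000 scale are illustrative, not Genkit's or Gemini's documented schema. Check the format your model actually returns before relying on it.

```typescript
// Hypothetical shape for one detected object (illustrative field names only).
interface DetectedObject {
  name: string;
  description: string;
  color: string;
  // Assumed convention: [yMin, xMin, yMax, xMax], each normalized to 0..scale.
  box: [number, number, number, number];
}

interface PixelBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Convert a normalized box into pixel coordinates for a given image size.
// `scale` is the assumed upper bound of the normalized range (1000 here).
function toPixelBox(
  box: [number, number, number, number],
  imageWidth: number,
  imageHeight: number,
  scale: number = 1000
): PixelBox {
  const [yMin, xMin, yMax, xMax] = box;
  const left = Math.round((xMin / scale) * imageWidth);
  const top = Math.round((yMin / scale) * imageHeight);
  const right = Math.round((xMax / scale) * imageWidth);
  const bottom = Math.round((yMax / scale) * imageHeight);
  return { left, top, width: right - left, height: bottom - top };
}

// Made-up example data, not real model output.
const example: DetectedObject = {
  name: "mug",
  description: "ceramic coffee mug on a desk",
  color: "blue",
  box: [100, 200, 500, 600],
};

console.log(example.name, toPixelBox(example.box, 1000, 500));
```

The resulting `left`, `top`, `width`, and `height` values can be passed straight to a canvas `strokeRect` call or used as CSS offsets to draw the box over the image.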