We recently posted our paper on arXiv and are sharing it here too: https://arxiv.org/abs/2412.14164v1 - work led by our amazing intern Peter Tong. Key findings:
- LLMs can be trained to generate visual embeddings!! (toy sketch after this list)
- VQA data appears to help a lot in generation!
- Better understanding = better generation!
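To make the first finding concrete, here is a minimal sketch (not the paper's implementation) of what "training an LLM to generate visual embeddings" can look like: a small regression head maps LLM hidden states into a vision encoder's embedding space and is trained with a cosine-similarity loss. The head architecture, dimensions, and loss here are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class VisualEmbeddingHead(nn.Module):
    """Projects LLM hidden states into a (frozen) vision encoder's embedding space."""
    def __init__(self, llm_dim: int = 4096, vision_dim: int = 1152):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(llm_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, vision_dim),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.proj(hidden_states)

def visual_embedding_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Cosine-similarity regression loss between predicted and target embeddings."""
    return (1.0 - nn.functional.cosine_similarity(pred, target, dim=-1)).mean()

# Toy usage: hidden states at image-token positions -> predicted visual embeddings.
head = VisualEmbeddingHead()
hidden = torch.randn(2, 256, 4096)   # (batch, image tokens, LLM hidden dim) - made up
target = torch.randn(2, 256, 1152)   # stand-in for frozen vision-encoder features
loss = visual_embedding_loss(head(hidden), target)
loss.backward()
```

The predicted embeddings could then be decoded into pixels by a separate generator conditioned on them; that part is omitted here.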
Excited to see how the community "MetaMorphs" existing LLMs!