OmniVision-968M: a new local VLM for edge devices, fast & small but performant
it's based on SigLIP-so-400M and Qwen-2.5-0.5B
9x fewer image tokens, super efficient
aligned with SFT and DPO to reduce hallucinations
Apache 2.0 license
Demo https://hf.co/spaces/NexaAIDev/omnivlm-dpo-demo
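
A rough sketch of what the 9x token reduction means in practice. The 384 px input size and 14 px patch size are assumptions (they match the common SigLIP-so400m patch14-384 checkpoint, which the post does not name), and the helper names are hypothetical:

```python
def image_token_count(image_size: int, patch_size: int) -> int:
    """Patch tokens a ViT-style encoder emits for a square image."""
    per_side = image_size // patch_size  # whole patches along one side
    return per_side * per_side


def reduced_token_count(tokens: int, factor: int) -> int:
    """Tokens left after shrinking the sequence by an integer factor."""
    return tokens // factor


# Assumed encoder settings, not stated in the post.
base = image_token_count(image_size=384, patch_size=14)
reduced = reduced_token_count(base, factor=9)
print(base, reduced)  # 729 81
```

Fewer image tokens means a shorter sequence for the language model to process per image, which is where the efficiency gain on edge devices would come from.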