adinayakup.bsky.social
AI Research @Hugging Face 🤗 Contributing to the Chinese ML community.
64 posts 604 followers 63 following

LLaDA 🔥 an 8B diffusion model by GSAI Lab, Renmin University ✨Fully trained from scratch, LLaDA delivers performance on par with LLaMA3 8B. Paper: huggingface.co/papers/2502.... Model: huggingface.co/GSAI-ML/LLaD... Demo: huggingface.co/spaces/multi...

The AI race in the automotive industry is heating up🚗 Li Auto’s research team has released their latest paper on LLM research👇 huggingface.co/papers/2502.... ✨This paper introduces LDGen, which integrates LLMs with diffusion models to enhance text-to-image (T2I) generation capabilities.

Wan2.1 🔥📹 new OPEN video model by Alibaba Demo: huggingface.co/spaces/Wan-A... Model: huggingface.co/Wan-AI/Wan2.... ✨Apache 2.0 ✨8.19GB VRAM, runs on most GPUs ✨Multi-Tasking: T2V, I2V, Video Editing, T2I, V2A ✨Text Generation: Supports Chinese & English ✨Powerful Video VAE

🚀 StepFun is making BIG open moves! Last year, their GOT-OCR 2.0 took the community by storm 🔥but many didn’t know they were also building some amazing models. Now, they’ve just dropped something huge on @hf.co 🧵

Skywork昆仑万维 shipped some interesting work🚢 SkyReels-A1🎭 SOTA expression and motion controllable algorithm huggingface.co/Skywork/SkyR... SkyReels-V1📹 an open video model tailored for mini show. huggingface.co/collections/... ✨Based on Hunyuan ✨33 expressions + 400 natural movement combos

The community has been releasing some amazing work🔥 Here is a powerful Chinese open-source distillation dataset (R1) with 110K samples: covering math data + a wide range of general-purpose data🚀 huggingface.co/datasets/Con...

Some interesting news today👀 ✨ Alibaba and Apple are collaborating on AI for iPhones in China. ✨ Baidu, which has been cautious about open source, announced it will join the open-source community in June 2025. ✨ Leaders from Unitree🤖, OpenBMB, XPENG Motors🚗, SiliconFlow met with Huawei executives.

Ovis2 🔥 a multimodal LLM released by Alibaba AIDC team. huggingface.co/collections/... ✨1B/2B/4B/8B/16B/34B ✨Strong CoT for deeper problem solving ✨Multilingual OCR – Expanded beyond English & Chinese, with better data extraction

InspireMusic 🎵🔥 an open music generation framework by Alibaba FunAudio Lab Model: huggingface.co/FunAudioLLM/... Demo: huggingface.co/spaces/FunAu...

New release by DeepSeek 🔥 Official demo of DeepSeek VL2 small is now live on @hf.co 🚀 huggingface.co/spaces/deeps...

Snake Huggy is all set, and so are we! Happy new year to everyone celebrating!! 🧧🐍🤗

New MoE model 🔥 Qwen2.5-Max 🧧 New Year's gift from Alibaba Qwen huggingface.co/spaces/Qwen/...

YuE 乐🎵 7B open music foundation models released by the community M-A-P huggingface.co/m-a-p/YuE-s1... ✨ Transforms lyrics into full songs ✨ Support English/Chinese/Japanese/Korean

On the last day before the Spring Festival holiday in China, DeepSeek released a NEW work on @hf.co 🤯 Janus-Pro🔥 autoregressive framework that unifies multimodal understanding and generation huggingface.co/deepseek-ai/... ✨ 1B / 7B ✨ MIT License

🔥So many exciting releases coming from the Chinese community this month! huggingface.co/collections/...

Qwen2.5-1M 🔥the long-context version of Qwen2.5, supports up to 1M tokens by Alibaba huggingface.co/collections/...

Baichuan is making big moves today 🔥 ✨ Released Baichuan-M1-14B Medical LLM on the hub Model: huggingface.co/baichuan-inc... huggingface.co/baichuan-inc... ✨ Launched All-Scenario Reasoning Model with medical expertise as one of its key highlights. ying.baichuan-ai.com/chat

VideoLLaMA 3🔥multimodal foundation models for Image and Video Understanding by DAMO Alibaba Paper: huggingface.co/papers/2501.... Model: huggingface.co/collections/... ✨ 2B/7B ✨ Apache2.0

Came across some amazing work by OmAI_lab🔥 ✨ Open Agent Leaderboard: Track the best AI agents in one place! huggingface.co/spaces/omlab... ✨ OmAgent: A multimodal agent framework for video understanding, handles everything from CCTV to full-length films 🎥 huggingface.co/papers/2406....

UI-TARS 🔥 series of native GUI agent models (2B/7B/72B) released by ByteDance, combining perception, reasoning, grounding, and memory into one system. Model: huggingface.co/bytedance-re... Paper: huggingface.co/papers/2501....

What happened yesterday in the Chinese AI community? 🚀 huggingface.co/posts/AdinaY...

Hunyuan 3D 2.0🔥 a synthesis system for high-res textured 3D assets released by Tencent Hunyuan. 2 key components: Hunyuan3D-DiT (geometry) and Hunyuan3D-Paint (textures) work together, achieving highly realistic 3D results. Model: huggingface.co/tencent/Huny... Demo coming soon!

BIG release by DeepSeek AI🔥🔥🔥 DeepSeek-R1 & DeepSeek-R1-Zero: two 660B reasoning models are here, alongside 6 distilled dense models (based on Llama & Qwen) for the community! huggingface.co/deepseek-ai huggingface.co/deepseek-ai/...

New work from Alibaba_Qwen🔥 Qwen2.5-Math-PRM 7B & 72B 🔢 Process Reward Models for enhanced process supervision in the mathematical reasoning of LLMs. Paper: huggingface.co/papers/2501.... Model: huggingface.co/Qwen/Qwen2.5... huggingface.co/Qwen/Qwen2.5...

InternLM3-8B🔥 Trained on just 4T tokens, it outperforms Llama3.1-8B and Qwen2.5-7B in reasoning tasks, at 75% lower cost! huggingface.co/collections/...

MiniMax, the company behind Hailuo_AI, has joined the open source community by releasing both models and demos of MiniMax-Text-01 & MiniMax-VL-01🔥 huggingface.co/papers/2501.... huggingface.co/MiniMaxAI/Mi... huggingface.co/MiniMaxAI/Mi...

MiniCPM-o2.6 🔥 an end-side multimodal LLM released by OpenBMB huggingface.co/openbmb/Mini... ✨ Real-time English/Chinese conversation, emotion control and ASR/STT ✨ Real-time video/audio understanding ✨ Processes up to 1.8M pixels, leads OCRBench & supports 30+ languages

Super excited to see this amazing collaborative work from researchers in the Chinese community 🔥 Next Token Prediction (NTP) expands to multimodal learning 🚀 Paper: huggingface.co/papers/2412.... Github: github.com/LMM101/Aweso...

LLaVA-Mini🔥 An efficient multimodal model for image and video understanding released by Chinese Academy of Sciences Paper: huggingface.co/papers/2501.... Model: huggingface.co/ICTNLP/llava... ✨ Matches LLaVA-v1.5 using just 1 vision token ✨ Delivers <40ms response time

Excited to see Alibaba DAMO Academy release a multimodal dataset for vision language pretraining on @hf.co 🔥 Dataset: huggingface.co/datasets/DAM... Paper: huggingface.co/papers/2501.... ✨ Apache 2.0 ✨ 6.5M images + 0.8B text from 22k hours of instructional videos

Back to work! Starting the day with Mate tea 🧉 and an interesting robotics paper by AgiBot-world from OpenDriveLab 🔥 huggingface.co/papers/2501.... huggingface.co/agibot-world

Megrez-3B-Omni 🔥 3B on-device multimodal LLM by InfinigenceAI huggingface.co/Infinigence/... huggingface.co/spaces/Infin... ✨Leads in bilingual speech ( English & Chinese ) input, multi-turn conversations, and voice-based queries ✨Outperforms in scene understanding and OCR across major benchmarks

2025 will be the year of AI for science. Leveraging all the things we've recently learned training AI models for 1000x impact in science, and this will need data! More details: huggingface.co/blog/lemater... Thread: ...

The Open LLM Leaderboard got a new front page for Christmas. Check it out at huggingface.co/spaces/open-...

DeepSeek-V2.5-1210 🔥 the updated version of DeepSeek-V2.5 just released! huggingface.co/deepseek-ai/... Upgrades include: ✨ MATH-500: 74.8% → 82.8% ✨ LiveCodeBench: 29.2% → 34.38% ✨ Writing & reasoning improved on internal tests. ✨ Enhanced file upload & webpage summarization UX

Exciting updates from the Chinese community last week! 🔥 huggingface.co/zh-ai-commun...

ClearVoice 🔊 an integrated voice processing framework by Alibaba Speech Lab 🔥 huggingface.co/alibabasglab huggingface.co/spaces/aliba... ✨Remove background noise for crystal-clear sound ✨Isolate target speech in complex audio ✨Extract voices precisely using audio-video models

The most upvoted papers from the Chinese community on the Daily Papers - November🔥 huggingface.co/collections/...

Sailor 2 🚢 open multilingual model for Southeast Asia by Sea AI Lab🔥 huggingface.co/sailor2 huggingface.co/spaces/sail/... ✨ Fully open code & ALL datasets 🙌 ✨ 1B/ 8B/20B base & chat expanded on Qwen2.5 ✨ Apache 2.0 ✨ Supports 15 languages 🇬🇧🇨🇳🇱🇦🇲🇾🇲🇲🇻🇳🇹🇭