MagicPIG: LSH Sampling for Efficient LLM Generation
This repo explores the feasibility of a GPU-CPU system powered by locality-sensitive hashing (LSH). Three models are currently supported: llama3-8b-chat-128k, llama3-70b-chat-128k, and mistral-7b-chat-512k.
https://github.com/Infini-AI-Lab/MagicPIG
Comments
https://chatgpt.com/share/67479d27-27e8-8006-bb7d-2c01e2b6f0f7