MagicPIG: LSH Sampling for Efficient LLM Generation

This repo explores the possibility of a GPU-CPU LLM serving system powered by LSH sampling. Three models are currently supported: llama3-8b-chat-128k, llama3-70b-chat-128k, and mistral-7b-chat-512k. A minimal sketch of the general LSH-sampling idea is given below.
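
For intuition, here is a minimal, self-contained sketch of SimHash-style LSH sampling over attention keys: hash keys with random hyperplanes, count bucket collisions with the query, and attend only over a sampled subset. The function names, parameters, and collision-counting heuristic are illustrative assumptions, not MagicPIG's actual implementation or API.

```python
import numpy as np

def build_hash_tables(keys, num_tables=8, bits_per_table=4, seed=0):
    """Hash each key vector into `num_tables` SimHash buckets.

    keys: (n, d) array of key vectors.
    Returns the random hyperplanes and the per-table bucket code of every key.
    """
    rng = np.random.default_rng(seed)
    n, d = keys.shape
    # One set of random hyperplanes per table: (num_tables, bits_per_table, d).
    planes = rng.standard_normal((num_tables, bits_per_table, d))
    # Sign pattern of the projections -> integer bucket code per (table, key).
    bits = (np.einsum("tbd,nd->tnb", planes, keys) > 0).astype(np.int64)
    codes = (bits << np.arange(bits_per_table)).sum(axis=-1)  # (num_tables, n)
    return planes, codes

def lsh_sample_attention(query, keys, values, planes, key_codes, budget=32, seed=0):
    """Approximate softmax attention using only keys that collide with the query.

    A key is a candidate if it shares the query's bucket in at least one table;
    at most `budget` keys are kept, sampled in proportion to their collision
    count (a rough proxy for query-key similarity).
    """
    rng = np.random.default_rng(seed)
    num_tables, bits_per_table, d = planes.shape
    q_bits = (np.einsum("tbd,d->tb", planes, query) > 0).astype(np.int64)
    q_codes = (q_bits << np.arange(bits_per_table)).sum(axis=-1)   # (num_tables,)
    # For every key, count in how many tables it lands in the query's bucket.
    collisions = (key_codes == q_codes[:, None]).sum(axis=0)       # (n,)
    candidates = np.flatnonzero(collisions > 0)
    if candidates.size == 0:                        # fall back to dense attention
        candidates = np.arange(keys.shape[0])
    if candidates.size > budget:                    # sample by collision frequency
        probs = collisions[candidates] / collisions[candidates].sum()
        candidates = rng.choice(candidates, size=budget, replace=False, p=probs)
    scores = keys[candidates] @ query / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values[candidates]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keys = rng.standard_normal((4096, 64))
    values = rng.standard_normal((4096, 64))
    query = rng.standard_normal(64)
    planes, codes = build_hash_tables(keys)
    out = lsh_sample_attention(query, keys, values, planes, codes, budget=64)
    print(out.shape)  # (64,)
```

In a GPU-CPU split of the kind the repo describes, the cheap hashing and sampling step can run on the CPU over the full KV cache, while the GPU only computes attention over the small sampled subset.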

https://github.com/Infini-AI-Lab/MagicPIG
