This is super exciting! I've been hanging out for a modern uplift to BERT-style models with larger context windows. 512 tokens is pretty limiting for a number of use cases I've had. Have yet to dig in, but it looks like awesome work! #MLSky #DataBS #NLP
Reposted from
Jeremy Howard
I'll get straight to the point.
We trained 2 new models. Like BERT, but modern. ModernBERT.
Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.
It's much faster, more accurate, longer context, and more useful. 🧵