This is super exciting! I've been hanging out for a modern uplift to BERT-style models with larger context windows. 512 tokens is pretty limiting for a number of use cases I've had. I have yet to dig in, but it looks like awesome work! #MLSky #DataBS #NLP
Reposted from Jeremy Howard
I'll get straight to the point.

We trained 2 new models. Like BERT, but modern. ModernBERT.

Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.

It's much faster, more accurate, longer context, and more useful. 🧵