Efficiency is not only about speed. ModernBERT is also memory-friendly and can handle larger batch sizes than previous encoders, which can come in handy for contrastive learning or running on smaller GPUs (which is an important use case for encoders)

Comments