ufal-cuni.bsky.social
Computational linguistics • Natural language processing • Formal linguistics • Machine translation | at Faculty of Mathematics and Physics, Charles University
74 posts 543 followers 53 following

🎉 Congratulations to @zdenekkasner.bsky.social for winning 2nd Prize in the Joseph Fourier Prize for Computer Science! The Joseph Fourier Prize recognizes outstanding scientific work in computer science and is awarded annually by the French Embassy in the Czech Republic. 🇨🇿 🇫🇷

The CUNI submission to the IWSLT 2025 Simultaneous Shared Task, built on Whisper and EuroLLM, achieves great quality at 2-4 seconds of latency 🤯 and outperforms the baseline by a large margin. It uses context, prompting, and in-context learning. It's among the top-performing systems 🎖️ in IWSLT 2025.

Accepted long paper at IWSLT 2025! 🥳 📆 31 July-1 August 2025 📍 Vienna, Austria Prompting LLMs: Length Control for Isometric Machine Translation, by Dávid Javorský, Ondřej Bojar, @yvofr.bsky.social, in collaboration with @ufal-cuni.bsky.social .

Research on trustworthy text generation by @tuetschek.bsky.social was featured in the Czech edition of @wired.com 😊 www.wired.cz/clanky/vyvij...

The pre-print of Dávid's ACL paper is here: arxiv.org/abs/2506.04848 🗣️ MockConf, a 7-hour student interpreting dataset in 5 languages + InterAlign, a tool for analyzing simultaneous interpretation to advance automatic real-time translation.

Participate in the Terminology shared task at WMT 2025! (And there are some other cool tasks too.)

Read the interview with @janastrakova.bsky.social on matfyz.cz! www.matfyz.cz/clanky/holky...

We got 3 papers accepted to the #ACL2025 main conf: 👉 An Expanded Massive Multilingual Dataset for High-Performance Language Technologies arxiv.org/abs/2503.10267 by @hajicjan.bsky.social, @jindrahelcl.bsky.social and many others: Data of 8T tokens in 193 langs + 380M parallel sentences in 51 langs

#NAACL2025 ended more than a week ago & @ufal-cuni.bsky.social folks were there: Main conf: @kathaem.bsky.social presented joint work w/ @tomlim.bsky.social, @jlibovicky.bsky.social and Alex Fraser: Beyond Literal Token Overlap: Token Alignability for Multilinguality aclanthology.org/2025.naacl-s...

On Saturday, @patuchen.bsky.social talked at the GHOST Day: Applied Machine Learning Conference in Poznań 🇵🇱 👉 Evaluating LLM-generated text at a scale 👈 lessons learned from evaluating hotel highlights and a script of a theater play, and practical recommendations for LLMs in a referenceless scenario

Collegium Carolinum, the Bavarian Academy of Sciences & @ufal-cuni.bsky.social cordially invite you to their workshop 👉 Automated Context? Practices, Opportunities and Risks of AI-Driven Translation in Bohemistics and the Humanities 🔗 www.collegium-carolinum.de/veranstaltun... 📍 Prague, May 15-16

🎶 Do you speak fluent Czech, are you over 18, and do you know the song "Není nutno"? 🗣️ Then join our research on speech and singing! 📅 When and where? 17-18 June 2025 in Prague! 📲 Scan the QR code in the image and join the study today! ❓ Send questions to: [email protected] / [email protected]

Got a tokenization paper that just didn't make the cut for ICML? Submit it to the Tokenization Workshop TokShop at #ICML2025 -- we'd love to see it there! tokenization-workshop.github.io

The 👉Machine Learning Prague 2025👈 is happening right now! Today, @patuchen.bsky.social and @navitas.bsky.social presented their posters on text generation with LLMs. Also, don't miss @tuetschek.bsky.social's invited talk tomorrow at 11 a.m.

NameTag 3.1 🏷️ is an open-source named-entity recognizer developed by @janastrakova.bsky.social and @straka-milan.bsky.social at @ufal-cuni.bsky.social. Try it at lindat.mff.cuni.cz/services/nam... in 🤯 17 languages. 🌏🌍🌎

Submit your papers and discuss tokenization at the Tokenization Workshop: TokShop @tokshop.bsky.social! Co-organized by @ufal-cuni.bsky.social's experts and alumni @jlibovicky.bsky.social, @jindrahelcl.bsky.social, @tomlim.bsky.social. #NLP #Tokenization

Folks from our institute released a pre-print on using LLMs for span annotation. Check it out at llm-span-annotators.github.io

Join our PhD student @patuchen.bsky.social tomorrow, April 15th, as she presents her research on multilingual AI systems at @Toloka's expert panel. The discussion will cover dataset scarcity, evaluation challenges, and cultural nuances in VLMs. Register now: toloka.ai/events/multi... #AI #NLP

💡 Digital humanities? What is it? Why is it useful not only for those who stay in academia? Jiří Kocián and Barbora Vidová Hladká @fsv.unikarlova.cuni.cz @ufal-cuni.bsky.social @mff.unikarlova.cuni.cz @unikarlova.cuni.cz discuss all of this in an article in @ukforum.cuni.cz www.ukforum.cz/rubriky/acad...

Participate in the 👉 CRAC 2025 Shared Task on Multilingual Coreference Resolution❗ ufal.mff.cuni.cz/corefud/crac25 If you have not already done so, register first. 👆 Then start discovering how words refer to each other in 1️⃣7️⃣ languages. This year includes a new ✨LLM✨ track 😮.

Shoutout to Emil Svoboda 🙌 who won the Bernard Bolzano prize for his PhD research on derivational morphology, connecting linguistics and neural networks. 💡 Congrats! 🥳

A new series on matfyz.cz! Meet Patrícia Schmidtová from @ufal-cuni.bsky.social, who studies language models and who, during her studies, contributed among other things to the creation of the first theater play written by artificial intelligence. 👇 www.matfyz.cz/clanky/holky... #holkyzinformatiky #novageneracevedkyn

Yesterday, Sourabrata Mukherjee defended his PhD thesis! 👨‍🎓🥳🍻 Souro's thesis, on Text Style Transfer with Neural Language Models, tackles key challenges in the topic, such as data scarcity and limited multilingual coverage. More of Souro's work: scholar.google.com/citations?us...

Good news 👍 The @unikarlova.cuni.cz grant agency will fund 💰 6 new PhD student grants: 👉 Nalin Kumar: General-purpose LLMs for low-res languages 👉 @patuchen.bsky.social: Reliable and Explainable LLMs for Text Generation 👉 Tomáš Polák: Comprehensibility and semantic consistency of 🇨🇿 legislation 1/2

Years ago, he and his colleagues taught a robot to write a theater play; now Rudolf Rosa @ufal-cuni.bsky.social #MFFUK @unikarlova.cuni.cz is trying to teach AI 🤖 to write poetry. 📕 What is the point of AI being able to compose poems❓ Find the answer below ⬇️ 📌 link.cuni.cz/RudolfRosa

A recent participant of @straka-milan.bsky.social's Deep Learning course wrote a review on his blog 😊 blog.idnes.cz/vojtechkment... 👉 The course is now also open to external participants as part of @unikarlova.cuni.cz's micro-credential program ufal.mff.cuni.cz/courses/npfl...

🔬 When three generations of scientists meet in one place, a unique perspective on the development of the field emerges. Eva Hajičová, Jan Hajič, and Jan Hajič Jr. shared their views and their stories. 📌 Read the full interview: www.ukforum.cz/rubriky/veda...

Two weeks ago, we hosted the kick-off meeting of the @openeurollm.bsky.social project in Prague. This project will deliver a series of foundation models for transparent AI in Europe, covering all of the EU's 🇪🇺 official languages and many other European languages.

Come to Helsinki for the 18th MT Marathon! Sponsored by EAMT @ufal-cuni.bsky.social

Next CIRCSE seminar “BACK TO THE ROOTS (AND OTHER MORPHEMES)” by Zdeněk Žabokrtský (@ufal-cuni.bsky.social) 🗓️ 19/03 4:30pm 🌍 In Milan and online: tinyurl.com/3cc96ncr

👨‍💻👩‍💻 Under the leadership of @ufal-cuni.bsky.social #MFFUK @unikarlova.cuni.cz, work is beginning on a family of large language models for all European languages. The international project @openeurollm.bsky.social kicked off today at the Karolinum. 👏 www.ukforum.cz/rubriky/aktu...

Paper 👉Beyond Literal Token Overlap: Token Alignability for Multilinguality👈 by @kathaem.bsky.social, @tomlim.bsky.social, @jlibovicky.bsky.social and Alex Fraser will appear at #NAACL2025! arxiv.org/abs/2502.06468 Congratulations to all authors! 🥳

Our paper 'Beyond Literal Token Overlap: Token Alignability for Multilinguality' will be at #NAACL2025! We show that token alignability is a stronger predictor of cross-lingual transfer than literal token overlap. Read it here: arxiv.org/abs/2502.06468

💻🧬 The Hajič family truly has mathematical linguistics in its genes. Read a unique three-way interview with Jan Hajič, head of the giant @openeurollm.bsky.social project, his mother Eva, and his son Jan @ufal-cuni.bsky.social #MFFUK @unikarlova.cuni.cz. www.ukforum.cz/rubriky/veda...

Following the MT Marathon, we're hosting a hackathon in Prague. Researchers and students from five institutions (+1 online) are working together to assess how robust #LLMs are to grammar errors in machine translation and related tasks. Thanks to EAMT for their support.