perayson.bsky.social
Prof of #NLP in School of Computing and Communications @LancasterUni, word boffin, ex-CLARIN Ambassador, Director of @ucrelnlp.bsky.social #NLProc #CorpusLinguistics #DigitalHumanities, Doctor Who & stuff. Ravenclaw. https://www.linkedin.com/in/perayson/
28 posts 2,352 followers 1,364 following
comment in response to post
Correct, not yet. This is not supported yet in PyMUSAS so I could only set it up first for the English pipeline.
Exciting news: there'll be a Wmatrix & PyMUSAS workshop at @cl2025.co.uk in Birmingham on 29th June, run by me, @johnvidler.co.uk and Daisy Lal, including discussions about open source software and #corpuslinguistics #CL2025
There's an explanation of the key differences (including much larger corpora & multilingual analysis in #wmatrix version 7) here: ucrel-wmatrix7.lancaster.ac.uk. If you are a #wmatrix V5 user, you should save your data and reload it into the new version. V5 is expected to be retired next month.
Our Diversity in Data Science working group now has an active webpage with resources www.lancaster.ac.uk/dsi/about-us... Please do suggest additional resources we could link to if you know of ones that directly address EDI in data science, so we can build it further, thanks! @dsilu.bsky.social
LENS paper photos... #COLING2025
Daisy Lal and I have just finished presenting our software demo paper on "LENS: Learning Entities from Narratives of Skin Cancer" at #COLING2025, part of the 4D Picture (4dpicture.eu) project aclanthology.org/2025.coling-... Code & demo are available online; links are in the paper
I'm starting this morning at the low-resource languages 1 session at #COLING2025. An interesting observation from @iamdddaryna.bsky.social in her paper aclanthology.org/2025.coling-... is that Opus-MT preserves the toxic lexicon better than others, where this is fine-tuned out
An interesting paper from this morning's sessions at #COLING2025: "Why Does ChatGPT “Delve” So Much? Exploring the Sources of Lexical Overrepresentation in Large Language Models" aclanthology.org/2025.coling-...
I finished yesterday afternoon in the @loreslm.bsky.social workshop: low-resource languages are a super hot topic there and more widely across the #COLING2025 conference
Now, I'm dipping into the panel discussion on "Digital Archives and Cultural Heritage in the LLMs Era" at the sina.birzeit.edu/nakba-nlp/ workshop, chaired by Mo El-Haj, with Muhammad Abdul-Mageed, Antonio Moreno Sandoval, @dawnknightcl.bsky.social, and Mustafa Jarrar #COLING2025
Some of the many interesting posters today from @loreslm.bsky.social, FNP and other workshops #COLING2025
Here's Daisy Lal from @ucrelnlp.bsky.social presenting her paper on "Hindi reading comprehension: do LLMs exhibit semantic understanding?" at the IndoNLP workshop #COLING2025
I'm here at #COLING2025 to start this morning with the IndoNLP workshop, the First Workshop on Natural Language Processing for Indo-Aryan and Dravidian Languages indonlp-workshop.github.io/IndoNLP-Work...
My walk to work this morning ... #COLING2025
This afternoon, I'm attending the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal) #COLING2025 sites.google.com/nlg.csie.ntu...
This morning I am attending The 1st Workshop on NLP for Languages Using Arabic Script (AbjadNLP 2025) wp.lancs.ac.uk/abjad/ #COLING2025
Welcome, Sylviane! 👋
... and many thanks to @johnvidler.co.uk for all the audio and video wrangling for the seminar recording!