webis.de - Profile | ThreadSky | a Reddit-style client for Bluesky

Our paper titled “The Two Paradigms of LLM Detection: Authorship Attribution vs. Authorship Verification” has been accepted to #ACL2025 (Findings). downloads.webis.de/publications... We discuss why LLM detection is a one-class problem and how that affects the prospective… 1/3 #ACL #NLP #ARR #LLM

submitted 18 days ago • 1 comment

PAN 2025 Call for Participation: Shared Tasks on Authorship Analysis, Computational Ethics, and Originality. We'd like to invite you to participate in the following shared tasks at PAN 2025, held in conjunction with the CLEF conference in Madrid, Spain. Find out more at pan.webis.de/clef25/pan25...

submitted 106 days ago • 1 comment

Can LLM-generated ads be blocked? With OpenAI adding shopping options to ChatGPT, this question gains further importance. If you are interested in contributing to the research on LLM-based advertising, please check out our shared task: touche.webis.de/clef25/touch... More details below.

submitted 51 days ago • 1 comment

📢 Our paper "The Viability of Crowdsourcing for RAG Evaluation" has been accepted to #SIGIR2025! We compared how good humans and LLMs are at writing and judging RAG responses, assembling 1800+ responses across 3 styles and 47K+ pairwise judgments in 7 quality dimensions. 🧵➡️

submitted 73 days ago • 1 comment

Interested in joining our research group or do you know someone who might be interested? We have a new vacancy: Research position at the Webis group on Watermarking for Large Language Models. More information: webis.de/for-students...

submitted 123 days ago • 0 comments

2nd International Workshop on Open Web Search: CfP. We invite you to the #ECIR2025 Workshop on Open Web Search #wows2025. Please consider submitting to the scientific track or the WOWS-Eval shared task to enrich the Open Web Index with relevance judgments. Details: opensearchfoundation.org/wows2025

submitted 162 days ago • 0 comments

Time for a starter pack on information retrieval: go.bsky.app/MXPJoTn

submitted 217 days ago • 17 comments

Today we will present our poster on Query Variation Robustness of Transformer Models at #EMNLP2024. You can find us at the Information Retrieval and Text Mining 3 poster session.

submitted 218 days ago • 1 comment

Below you can see our past tweets, just imported from “the darkened X”. Above, we see nothing but Bluesky.

submitted 223 days ago • 0 comments

Goodbye Washington! We had a fantastic week with interesting talks, discussions, and new ideas at #SIGIR24 #SIGIR2024. We hope to see you all again next year in Italy :) https://x.com/webis_de/status/1815115279510208625/photo/1

submitted 333 days ago • 0 comments

The paper can be found on our homepage (https://webis.de/publications.html#schmidt_2024) and the dataset is on Zenodo: https://zenodo.org/records/10802427

submitted 402 days ago • 0 comments

In our experiments, LLMs struggle with the task in a zero-shot setting, especially due to low precision values. Sentence transformers, however, can be fine-tuned to successfully detect the inserted ads and achieve precision and recall values above 0.9 for unseen meta topics. https://t.co/VuuaW...

submitted 402 days ago • 1 comment
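For readers less familiar with the metrics in the post above, here is a minimal sketch of how precision and recall are computed for a binary "contains an ad" label. The function name and the toy labels are assumptions made up for illustration; they are not taken from the paper or the dataset. The example shows one way low precision can arise: a detector that flags every response catches all ads (perfect recall) but is wrong on every ad-free response.

```python
from __future__ import annotations

from typing import Sequence


def precision_recall(gold: Sequence[bool], predicted: Sequence[bool]) -> tuple[float, float]:
    """Return (precision, recall) for binary labels where True means 'contains an ad'."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g and p)
    false_pos = sum(1 for g, p in pairs if not g and p)
    false_neg = sum(1 for g, p in pairs if g and not p)
    # Convention used here: report 0.0 when a denominator would be zero.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Toy labels, invented for illustration only.
gold = [True, True, False, False, True]
over_flagging = [True, True, True, True, True]  # flags every response as an ad

precision, recall = precision_recall(gold, over_flagging)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.60 recall=1.00
```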

The Webis Generated Native Ads 2024 is the first public dataset to evaluate models on the task of detecting ads in responses of conversational search engines. It was created by simulating an advertising service for queries from popular meta topics (product/service categories). https://t.co/pjHr...

submitted 402 days ago • 0 comments

What if conversational search were financed by inserting ads directly into generated responses? We present our work on detecting these generated native ads at #TheWebConf24. Come visit us at the short paper poster session on Thursday in the Central Ballroom. https://t.co/NRKbal57WO

submitted 402 days ago • 0 comments

Right now, we will start the second half of the SCAI'24 workshop at #CHIIR2024 in hybrid mode. We will move from the big ideas and human-centered metrics to the challenges of human-in-the-loop evaluations. https://x.com/webis_de/status/1768277930768015536/photo/1

submitted 462 days ago • 0 comments

Here's a study we did together with social scientists Arno Simons and Marion Schmidt on who Wikipedia editors consider notable enough to be mentioned in the history section of the CRISPR article. An awesome collaboration! https://twitter.com/WikiResearch/status/1766894251588325491

submitted 466 days ago • 0 comments

How will conversational search AI pay for itself? It may be native ads or product placement in generated answers. At #CHIIR2024 next week, we'll present a user study showing that many people don't recognize ads inserted by LLMs in generated search results: https://t.co/hrZE9moeKy https://t.co/qg...

submitted 470 days ago • 0 comments

Working in Argumentation? Time to participate in Touché 2024! Three shared tasks:
- Human Value Detection
- Ideology and Power Identification in Parliamentary Debates
- Image Retrieval/Generation for Arguments
Submission deadline is May 6th! More info: https://t.co/rtgSDxpDTx https://t.co/S6Kl...

submitted 549 days ago • 0 comments

We invite you to participate in the #ECIR2024 Workshop on Open Web Search. Let's discuss, develop, and promote an open web search ecosystem together! The workshop encourages submissions of scientific papers and implementations of retrieval components. https://t.co/WPgDBe3WiR https://t.co/cC2yIZwkpw

submitted 563 days ago • 0 comments

Today, we were happy to welcome @anja_reu and @juliusgonsior to our seminar to learn about current challenges in math retrieval/active learning: "Transformer Encoders for Mathematical Answer Retrieval" and "The Missing Piece of Active Learning Research: a Reference Benchmark". https://t.co/eh4IH...

submitted 602 days ago • 0 comments

Today we had the pleasure of listening to a talk by @WojciechKusa about evaluating automated citation screening in systematic reviews. Very interesting to hear about his work on new metrics and datasets for this domain! https://x.com/webis_de/status/1710256640941822084/photo/1

submitted 623 days ago • 0 comments

We are glad to share the recording of the invited talk by Nicola Ferro @frrncl titled "Comparing IR System Performance Through Explanatory Linear Models". Thank you, Nicola, for many exciting insights and a fruitful discussion! Video: https://www.youtube.com/watch?v=sWlEnhQIr8g

submitted 643 days ago • 0 comments

Our @H1iReimer and @maik_froebe are thrilled to present two new resources at the @SIGIRConf poster session: • The TIREx platform to run reproducible, blinded IR experiments & shared tasks 🧪 • The Archive Query Log, 350M queries crawled from the Internet Archive 🔍 #SIGIR2023 https://t.co/Np...

submitted 696 days ago • 0 comments

TIRA, TIREx, and ir_datasets are open source, and everyone can host their own instances. There is no vendor lock-in, as Docker has open-source alternatives. We would be very happy to host your IR experiments on TIREx. Now is the time to promote software submissions in IR 😊

submitted 696 days ago • 0 comments

TIREx archives the Docker images for future replication and reproduction. Software submissions on TIREx can run on new additions to ir_datasets as retrieval approaches were implemented against the ir_datasets interface, promoting IR experiment #standardization. https://t.co/XRNGHmgBOa

submitted 696 days ago • 0 comments

TIREx covers shared tasks in IR. Organizers add their data to ir_datasets. Participants implement their approach against ir_datasets, making software submissions via Docker executed in a TIRA sandbox, enabling blinded experimentation and improving internal and external validity. https://t.co/GLg...

submitted 696 days ago • 0 comments
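The idea in the post above, writing a retrieval approach against a dataset interface so that it can later run on data it has never seen, can be sketched as follows. This is a self-contained illustration under stated assumptions: `Dataset`, `InMemoryDataset`, and `term_overlap_run` are hypothetical names, and the iterator-style methods are only loosely modeled on ir_datasets; none of this is the actual ir_datasets, TIRA, or TIREx API.

```python
from __future__ import annotations

from typing import Iterator, NamedTuple, Protocol


class Doc(NamedTuple):
    doc_id: str
    text: str


class Query(NamedTuple):
    query_id: str
    text: str


class Dataset(Protocol):
    """Hypothetical stand-in for a shared dataset interface."""

    def docs_iter(self) -> Iterator[Doc]: ...
    def queries_iter(self) -> Iterator[Query]: ...


class InMemoryDataset:
    """Toy dataset that satisfies the interface; real data would live elsewhere."""

    def __init__(self, docs: list[Doc], queries: list[Query]) -> None:
        self._docs = docs
        self._queries = queries

    def docs_iter(self) -> Iterator[Doc]:
        return iter(self._docs)

    def queries_iter(self) -> Iterator[Query]:
        return iter(self._queries)


def term_overlap_run(dataset: Dataset, k: int = 10) -> dict[str, list[str]]:
    """Rank documents per query by the number of shared lowercase terms.

    The approach only touches the interface, so it runs unchanged on any
    dataset that provides docs_iter() and queries_iter().
    """
    docs = [(d.doc_id, set(d.text.lower().split())) for d in dataset.docs_iter()]
    run: dict[str, list[str]] = {}
    for query in dataset.queries_iter():
        terms = set(query.text.lower().split())
        # Sort by descending overlap, breaking ties by doc_id for determinism.
        ranked = sorted(docs, key=lambda d: (-len(terms & d[1]), d[0]))
        run[query.query_id] = [doc_id for doc_id, _ in ranked[:k]]
    return run


# The same code runs on an "old" and on a later-added "new" toy dataset.
old = InMemoryDataset(
    [Doc("d1", "open web search"), Doc("d2", "native ads in chat")],
    [Query("q1", "web search")],
)
new = InMemoryDataset(
    [Doc("a", "math retrieval"), Doc("b", "active learning benchmark")],
    [Query("q9", "active learning")],
)
print(term_overlap_run(old))  # {'q1': ['d1', 'd2']}
print(term_overlap_run(new))  # {'q9': ['b', 'a']}
```

In the setup the post describes, such an approach would additionally be packaged as a Docker image and executed in a sandbox; that packaging step is outside the scope of this sketch.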

IR experiments are internally valid if the hypothesis is supported by the data and externally valid if repeating an experiment on similar data yields similar observations. With transparent leaderboards and one-click executions of models on new data, TIREx helps to improve both.

submitted 696 days ago • 0 comments

Information retrieval experiments face potential problems concerning (1) internal validity, (2) external validity, and, more recently, (3) leakage by large pre-trained models. TIREx aims to support IR experiments to mitigate those issues.

submitted 696 days ago • 0 comments

The Information Retrieval Experiment Platform (TIREx) integrates ir_datasets, ir_measures, PyTerrier, and TIRA for • standardized, • reproducible, • scalable, and ultimately • blinded experiments in IR. Preprint: https://t.co/WWe26DCch2 #sigir2023 🧵 https://t.co/sPGvaNeYF9

submitted 696 days ago • 0 comments

TIRA, TIREx, and ir_datasets are open source, and everyone can host their own instances. There is no vendor lock-in, as Docker has open-source alternatives. We would be very happy to host your IR experiments on TIREx. Now is the time to promote software submissions in IR 😊

submitted 696 days ago • 0 comments

TIREx archives the Docker images for future replication and reproduction. Software submissions on TIREx can run on new additions to ir_datasets as retrieval approaches were implemented against the ir_datasets interface, promoting IR experiment #standardization. https://t.co/LllDr0g61P

submitted 696 days ago • 0 comments