85% of these contributions use or study closed, proprietary models, amplifying challenges to reproducibility, transparency regarding behavior and performance, and further evaluation.
Researchers should think about how the LLM’s role in the work relates to potential limitations of the system AND the resulting claims made in a study of that system, particularly when using or studying closed models. Here are a few questions to consider:
What role did the LLM play in your project? How did you disclose the models and prompts? What are the potential limitations of using LLMs for your selected role? (For example: How will the performance of the LLM-powered research tool affect the validity of your research?)
We find LLM-related contributions across all domains, including health, productivity, writing, and design. As LLMs continue to permeate software systems, we need to update the way we discuss and evaluate them.
Read our 📎 preprint for more, including practical guiding questions: https://arxiv.org/abs/2501.12557
Congrats to @rockpang.bsky.social on great leadership, and grateful to have learned so much from co-authors Kynnedy Smith, @s010n.bsky.social, @ziangxiao.bsky.social, @emtseng.bsky.social, and Danielle Bragg