hopeschroeder.bsky.social • 21 days ago

85% of these contributions use or study closed, proprietary models, amplifying challenges to reproducibility, transparency regarding behavior and performance, and further evaluation.

Comments

hopeschroeder.bsky.social•21 days ago

Researchers should think about how the LLM’s role in the work relates to potential limitations of the system AND resulting claims made in a study of that system, particularly when using or studying closed models. Here are a few q's to consider:

hopeschroeder.bsky.social•21 days ago

What role did the LLM play in your project? How did you disclose the models and prompts? What are the potential limitations of using LLMs for your selected role? (For example: How will the performance of the LLM-powered research tool affect the validity of your research?)

hopeschroeder.bsky.social•21 days ago

We find LLM-related contributions in all domains, such as health, productivity, writing, and design. As LLMs continue to permeate software systems, we need to update the way we discuss and evaluate them.

hopeschroeder.bsky.social•21 days ago

Read our 📎Preprint for more, incl practical guiding questions: https://arxiv.org/abs/2501.12557
Congrats to @rockpang.bsky.social on great leadership, and grateful to have learned so much from co-authors Kynnedy Smith, @s010n.bsky.social, @ziangxiao.bsky.social, @emtseng.bsky.social, and Danielle Bragg
