ThreadSky
maxkw.bsky.social • 93 days ago
GitHub Copilot output. A sad but fascinating alignment failure: it reveals hidden LLM biases by pushing the model outside its RLHF distribution.
Comments
caracter.bsky.social • 93 days ago
They seem to have 'fixed' it in English, but the logic remains the same. Here is what I got when I queried it in French.
tsawallis.bsky.social • 93 days ago
Verified just now in Python:
gitremote.bsky.social • 89 days ago
Great find. What other examples require distinguishing between normative and descriptive values?