Today we released preliminary research demonstrating fine-tuning attacks that can bypass the safety mechanisms of DeepSeek-R1. These simple attacks show that DeepSeek's safety can be easily removed so that the model provides harmful content, potentially to a worse extent than in non-reasoning LLMs.
