As my first post on this platform, allow me to advertise the RL theory lecture notes I have been developing with Sasha Rakhlin: https://arxiv.org/abs/2312.16730
(shameless repost of my pinned tweet)
Comments
I am trying to develop options for probabilistic firewalls.
Q: What are the best security measures you are aware of for stopping or mitigating probabilistic injection?
The simplest form of probabilistic injection is a ‘prompt injection’.