- A jailbreaking method called Skeleton Key can prompt AI models to reveal harmful information.
- The technique bypasses safety guardrails in models such as Meta's Llama 3 and OpenAI's GPT-3.5.
- Microsoft advises adding extra guardrails and monitoring AI systems to counteract Skeleton Key.
It doesn't take much to get a large language model to hand over the recipe for all kinds of dangerous things.