Definitions
Sorry, no definitions found. Check out and contribute to the discussion of this word!
Etymologies
Sorry, no etymologies found.
Support

Help support Wordnik (and make this page ad-free) by adopting the word assistant-prefill attack.
Examples
-
Across several evaluations, we consistently observed that assistant–prefill attacks, wherein the model is prompted as if it has already started to say something harmful, are sometimes effective at eliciting harmful behavior.
Highlights from the Claude 4 system prompt Simon Willison 2025
Comments
Log in or sign up to get involved in the conversation. It's quick and easy.