Simon Willison’s findings, while encouraging, highlight a broader issue in AI security: the arms race between attackers and defenders. The reported failure of 6,000 attempts to breach his assistant suggests that frontier models like Opus 4.6 are becoming harder to exploit. However, as Willison notes, that result doesn’t guarantee immunity against more sophisticated attacks.

In our view, the real takeaway is that although AI labs are making progress in hardening their models, it remains risky to deploy these systems in production environments where a successful prompt injection could cause irreversible damage. Defenses will need to keep improving, because attackers are likely to adapt their methods in response.