DeepMind Blog · 17d ago · 6 · research benchmark safety

Research release of an empirically validated toolkit for measuring AI manipulation capabilities, tested with more than 10,000 participants in finance and health domains. Provides an open-source methodology and materials for evaluating how AI systems can be misused to deceptively influence human beliefs and behavior in high-stakes scenarios.