Contramont Research
The AI Safety Research Lab
Current projects: LM backdoors, real-world evals, scalable oversight
Recent work
Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits