exploitbench/exploitbench
The field of AI security evaluation has grown crowded with benchmarks that test whether language models can identify vulnerabilities, explain code, or suggest fixes. Fewer tools probe whether those mo...