sunblaze-ucb/exploitgym
ExploitGym is a large-scale, realistic benchmark built from real-world vulnerabilities across userspace programs, Google's V8 engine, and the Linux kernel, designed to evaluate AI agents' ability to d...