Model Policy, Frontier Cyber Risk

OpenAI · San Francisco

$207k–295k/yr · Hybrid

Python · Eval harnesses · LLM-as-judge · Observability

About the Team

Our Safety Systems team (https://openai.com/safety/safety-systems) is at the forefront of OpenAI's mission to build and deploy safe AGI, driving our commitment to AI safety and fostering a culture of trust and transparency.

Within Safety Systems, the Model Policy team aligns model behavior with desired human values and norms. We co-design policy with models and for models by driving rapid policy taxonomy iteration based on data and defining evaluation criteria for foundational models’ ability to reason about safety.

About the Role

Frontier AI systems are rapidly expanding what is possible in cybersecurity and software engineering. These capabilities create major defensive opportunities, but they also raise serious dual-use and misuse risks across areas such as malware development, exploit discovery, vulnerability chaining, credential abuse, cyber intrusion, and autonomous offensive operations.

In this role, you will help define how OpenAI’s models should behave in high-risk cybersecurity contexts. You will develop policy frameworks, threat models, taxonomies, evaluations, and behavioral specifications that guide model behavior across training, deployment, and monitoring systems.

This role sits at the intersection of cybersecurity, AI safety, threat modeling, evaluation science, and policy implementation. You will work closely with research, engineering, safety training, preparedness, and product teams to build policies that are technically grounded, measurable, enforceable, and responsive to real-world cyber risk.

Your Responsibilities:

- Design and maintain model policies for cybersecurity and frontier-risk domains, especially dual-use and high-risk cyber capabilities.
- Translate cybersecurity threat models into clear behavioral specifications, evaluation criteria, grading guidance, and system-level mitigations.
- Define practical boundaries between legitimate security research, defensive workflows, and assistance that could mat
