Research Engineer/Research Scientist – Model Transparency

Remote · USA · Full-time

The AI Security Institute is the world's largest and best-funded team dedicated to understanding advanced AI risks. They are seeking Research Engineers and Research Scientists for the Model Transparency team to drive research and build systems for evaluating AI models, ensuring that oversight keeps pace with their capabilities.

Responsibilities

  • Drive the technical substance of our work – staying abreast of the literature, proposing and designing experiments, conducting rigorous analyses, and owning the evidence stack from experiment through to written output
  • Write, critique, and strengthen the team's reports and publications
  • Build the systems and tooling that make our research possible and fast – scaling experimental workflows, automating processes, solving infrastructure challenges, and creating systems that accelerate the entire team's output
  • Develop a chain-of-thought monitorability benchmark and compare monitorability properties across frontier AI systems, leveraging AISI’s unique access to reasoning traces from multiple labs
  • Design and run experiments on open-weight models to study alignment and oversight-relevant phenomena – such as reproducing emergent misalignment from reward hacking, or red-teaming techniques like inoculation prompting and character training
  • Use white-box and interpretability methods – such as activation oracles, sparse autoencoders, or probes – to detect misalignment that isn’t visible through behavioural evaluation alone
  • Build tooling and infrastructure for our research – including agent orchestration, large-scale RL pipelines, mechanistic interpretability methodologies, and auditing agents
  • Review frontier lab risk assessments and safety cases, providing independent analysis of alignment claims before deployment decisions
  • Conduct literature reviews and expert interviews to map the state of model transparency risks and inform AISI’s strategic priorities
  • Translate technical findings into actionable insights for AISI evaluation teams, UK government officials, and international partners

Skills

  • A get-things-done mindset – you take ownership, move fast, and care about shipping work that matters
  • A combination of self-sufficiency and enthusiasm for teamwork – you're equally happy defining your own agenda and contributing to shared goals. You're excited about growing, giving and receiving feedback, and building something together
  • An ability to build, supervise and orchestrate AI agents to complete tasks effectively, while verifying and maintaining quality of work
  • A demonstrated track record of relevant, high-quality work – whether technical publications, blog posts, or other publicly visible contributions
  • Hands-on research experience with large language models (LLMs) – such as evaluating or fine-tuning models, developing and testing monitors, or auditing models with white-box or black-box techniques
  • Ability and experience in writing research code for machine learning experiments, including experience with ML frameworks like PyTorch or evaluation frameworks like Inspect
  • An ability to write high-quality, concise research proposals that are well-motivated, tractable, and coherent
  • Good research taste – an ability to identify what's important, choose productive directions, and avoid getting lost in dead ends
  • An ability to read research critically, identify flawed arguments, and poke holes in safety claims
  • Strong software engineering skills and experience building systems that support ML research – infrastructure, pipelines, tooling, or experimental platforms
  • Ability and experience writing production-quality code in Python and familiarity with ML frameworks like PyTorch
  • Experience working with LLMs at scale in some capacity – fine-tuning, deploying, evaluating, or building scaffolds around them
  • An understanding of the needs of research scientists, and experience working within or supporting a research team or building tools to support research
  • Experience designing and running alignment evaluations or working on model transparency research
  • Experience with interpretability or white-box methods – such as mechanistic interpretability, sparse autoencoders, probing, or activation analysis
  • Familiarity with alignment literature, current methods for post-training and aligning LLMs, and the current state of the field
  • Prior mentorship or training within technical AI safety – such as through the MATS program or similar
  • A track record of scaling AI automation – getting agents to do useful work, building orchestration systems, or accelerating research workflows with AI tooling
  • Experience working with very large models (~100B+) at scale, including post-training (RL, RLHF, DPO), fine-tuning pipelines, or distributed interpretability work on models that don't fit into memory
  • Experience with mechanistic interpretability tooling or white-box analysis infrastructure at scale
  • Strong open-source contributions, particularly related to LLMs or AI safety
  • Proficient usage of LLM coding tools and agents

Benefits

  • Incredibly talented, mission-driven and supportive colleagues.
  • Direct influence on how frontier AI is governed and deployed globally.
  • Work with the Prime Minister’s AI Advisor and leading AI companies.
  • Opportunity to shape the first and best-resourced public-interest research team focused on AI security.
  • Pre-release access to multiple frontier models and ample compute.
  • Extensive operational support so you can focus on research and ship quickly.
  • Work with experts across national security, policy, AI research and adjacent sciences.
  • If you’re talented and driven, you’ll own important problems early.
  • 5 days off and annual stipends for learning and development, and funding for conferences and external collaborations.
  • Freedom to pursue research bets without product pressure.
  • Opportunities to publish and collaborate externally.
  • Modern central London office (cafes, food court, gym), or where applicable, option to work in similar government offices in Birmingham, Cardiff, Darlington, Edinburgh, Salford or Bristol.
  • Hybrid working, flexibility for occasional remote work abroad and stipends for work-from-home equipment.
  • At least 25 days’ annual leave, 8 public holidays, extra team-wide breaks and 3 days off for volunteering.
  • Generous paid parental leave (36 weeks of UK statutory leave shared between parents + 3 extra paid weeks + option for additional unpaid time).
  • On top of your salary, we contribute 28.97% of your base salary to your pension.
  • Discounts and benefits for cycling to work, donations and retail/gyms.

Company Overview

  • The AI Security Institute is a UK government body that works to make advanced artificial intelligence safe and secure. It was founded in 2023 and is headquartered in London, England, with a workforce of 51-200 employees. Its website is https://www.aisi.gov.uk/.