METR - Berkeley, CA

posted 14 days ago

Full-time - Mid Level
Berkeley, CA

About the position

METR is seeking ML research engineers/scientists to conduct evaluations of AI R&D capabilities, focusing on identifying potential risks associated with advanced AI models. The role involves developing benchmarks, running experiments, and collaborating with various teams to enhance evaluation protocols that predict catastrophic risks posed by new AI models.

Responsibilities

  • Produce tasks/benchmarks to determine if a model is dangerous
  • Run experiments to assess how elicitation techniques affect evaluation results
  • Conduct internal evaluations and support external partners in performing evaluations for autonomous capabilities
  • Improve tooling for researchers designing and running evaluations
  • Collaborate with evaluation science researchers to develop robust evaluation procedures
  • Understand the abilities that need to be evaluated and the properties required for effective evaluations
  • Identify promising research directions and design experiments and research roadmaps
  • Collaborate with the threat modeling team to evaluate models for AI R&D skills
  • Rapidly execute experiments and obtain reliable results
  • Design efficient pipelines and workflows for evaluations
  • Quickly interpret results and identify potential issues in ML experiments
  • Assess the workload and promise of different approaches, seeking information rapidly when uncertainties arise

Requirements

  • Substantial ML research engineering experience
  • Machine learning research publications in which you played a significant role
  • Professional experience optimizing compute for inference or training of large models
  • Significant role in the training or optimization of a large model

Nice-to-haves

  • Experience with frontier LLMs
  • Track record of successful execution-heavy research projects

Benefits

  • Diversity and equal opportunity in hiring
  • Potential sponsorship of a cap-exempt H-1B visa for non-US candidates