Anthropic PBC · posted 2 months ago
$320,000 - $405,000/Yr
Full-time • Manager
San Francisco, CA

About the position

We are seeking an experienced Engineering Manager to join our Inference Scalability and Capability team. This team is responsible for building and maintaining the critical systems that serve our LLMs to a diverse set of consumers. As the cornerstone of our service delivery, the team focuses on scaling inference systems, ensuring reliability, optimizing compute resource efficiency, and developing new inference capabilities. The team tackles complex distributed systems challenges across our entire inference stack, from optimal request routing to efficient prompt caching.

Responsibilities

  • Build and lead a high-performing team of engineers through technical mentorship, strategic hiring, and creating an environment that fosters innovation
  • Drive operational excellence of inference systems (deployments, auto-scaling, request routing, monitoring) across cloud providers
  • Facilitate development of advanced inference features (e.g., prompt caching, constrained sampling, fine-tuning)
  • Partner deeply with research teams to productionize new models, infrastructure teams to optimize hardware utilization, and product teams to deliver customer-facing features
  • Create clear technical roadmaps and execution strategies in a fast-moving environment while managing competing priorities

Requirements

  • 5+ years of experience leading large-scale distributed systems teams
  • Excellence in building high-trust environments and in helping teams navigate technical uncertainty while maintaining velocity
  • A demonstrated ability to recruit, scale, and retain engineering talent
  • Outstanding communication and leadership skills
  • A deep commitment to advancing AI capabilities responsibly
  • A strong technical background that enables you to make architectural decisions and guide technical direction

Nice-to-haves

  • Experience implementing and deploying machine learning systems at scale
  • LLM inference optimization including batching and caching strategies
  • Cloud-native architectures, containerization, and deployment across multiple cloud providers
  • High-performance computing environments and hardware acceleration (GPU, TPU, Trn)

Benefits

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Lovely office space in which to collaborate with colleagues
