We are seeking an experienced Engineering Manager to join our Inference Scalability and Capability team. This team is responsible for building and maintaining the critical systems that serve our LLMs to a diverse set of consumers. As the cornerstone of our service delivery, the team focuses on scaling inference systems, ensuring reliability, optimizing compute resource efficiency, and developing new inference capabilities. The team tackles complex distributed systems challenges across our entire inference stack, from optimal request routing to efficient prompt caching.
Job Type: Full-time
Career Level: Manager
Education Level: Bachelor's degree