Principal Machine Learning Performance Engineer

AMD - San Jose, CA

posted 2 months ago

Full-time - Director

Computer and Electronic Product Manufacturing

About the position

At AMD, we are committed to transforming lives through our technology, and we are looking for a Principal Machine Learning Performance Engineer to join our team. The role focuses on ML performance modeling, projection, and optimization for a range of machine learning workloads. You will engage in hardware and software co-design, with an emphasis on how ML workloads interact with the hardware architecture. Your work will involve modeling workloads, including generative AI models, across multiple hardware configurations and summarizing recommendations based on your findings. You will collaborate closely with customers and the business unit to project performance, analyze results, and develop solutions that meet customer needs.

As a Machine Learning Performance Engineer, you will study recent ML models and analyze their compute and memory requirements to project their performance on various compute hardware for both inference and training. You will also identify new ways to improve performance.

The ideal candidate has strong experience in ML hardware architecture, software optimization, and performance modeling, and will be a key contributor to our mission to build great products that accelerate next-generation computing experiences. If you are passionate about performance optimization and shaping the future of AI acceleration, this role is for you.

Responsibilities

Performance modeling and analysis of ML training and inference workloads across single and multiple accelerators.
Explore tradeoffs and design decisions.
Participate in hardware-software co-design for future hardware optimization on various ML workloads.
Communicate and present the results of the performance analysis and modeling to stakeholders and provide concrete recommendations.
Develop and improve our framework, tools and infrastructure for performance estimation, modeling and reporting.
Collaborate across teams.

Requirements

Strong technical expertise and experience in performance analysis, projection, and hardware architecture.
Excellent written, verbal, and presentation skills.
Experienced in C++ coding.
PhD or master's degree in computer science, electrical engineering, or a related field, plus equivalent experience.

Benefits

Base pay dependent on skills, qualifications, experience, and location.
Eligibility for annual bonuses or sales incentives.
Opportunity to own shares of AMD stock through the Employee Stock Purchase Plan with discounts.
Competitive benefits package.
