AMD - San Jose, CA
posted 4 months ago
At AMD, we are committed to transforming lives through our technology, and as a Sr. Machine Learning Performance Engineer, you will play a crucial role in this mission. This position focuses on ML performance modeling, projection, and optimization for various machine learning workloads, while also participating in hardware and software co-design. You will analyze the interaction between ML workloads and hardware architecture, particularly modeling workloads such as generative AI models across multiple hardware configurations. Your insights and recommendations will be vital in shaping the future of AI acceleration and ensuring that our products meet the evolving needs of our customers.

In this role, you will be responsible for analyzing and exploring recent machine learning models, understanding their compute and memory requirements, and providing projections on various compute hardware for both inference and training. You will also be tasked with identifying innovative ways to enhance performance. Your work will involve performance modeling and analysis of ML training and inference workloads across single and multiple accelerators, exploring various trade-offs and design decisions. You will actively participate in hardware-software co-design efforts aimed at optimizing future hardware for various ML workloads.

Effective communication is key in this role, as you will need to present the results of your performance analysis and modeling to stakeholders, providing concrete recommendations based on your findings. Additionally, you will contribute to the development and improvement of our framework, tools, and infrastructure for performance estimation, modeling, and reporting. Collaboration across teams will be essential to ensure that we are aligned in our goals and strategies.