Artificial Intelligence Performance Engineer

AMD - Santa Clara, CA

posted about 2 months ago

Full-time - Mid Level

Computer and Electronic Product Manufacturing

About the position

The Artificial Intelligence Performance Engineer at AMD is responsible for optimizing GPU-accelerated systems to ensure peak performance for AI workloads. This role involves defining performance metrics, benchmarking AI workloads, and identifying performance bottlenecks. The engineer will collaborate with software teams to enhance performance and stay updated on emerging AI technologies. The position requires a blend of technical expertise in GPU performance, data science skills, and effective communication to convey findings to various teams.

Responsibilities

Define a performance suite and best practices for measuring GPU-accelerated workloads to assess the scalability and efficiency of AI models and algorithms
Benchmark and analyze AI workloads in single-node and large multi-node configurations, comparing results against previous generations and competitors
Perform comprehensive performance analysis and report findings for the entire platform, including GPU, CPU, interconnects, network, software stack, etc.
Identify performance bottlenecks that impact data center GPU-accelerated workloads, tune those workloads, and collaborate with other software teams to improve performance
Stay up to date with emerging technologies and trends in the AI field and explore ways to improve the performance of GPU-accelerated workloads at scale

Requirements

Solid knowledge of Artificial Intelligence (AI) and Machine Learning (ML) concepts and techniques, including deep learning, reinforcement learning, natural language processing, generative AI, and computer vision
Experience with benchmarking methodologies, performance analysis, workload profiling, and performance monitoring and debugging tools
Advanced Linux OS, container (e.g., Docker), and GitHub skills
Programming skills in a variety of relevant languages such as Python or C/C++
Expertise with deep learning frameworks like PyTorch and TensorFlow
Knowledge of and interest in computer and GPU architecture
In-depth knowledge of GPU acceleration with either AMD or NVIDIA GPU compute products
Excellent problem-solving skills and an automation mindset

Nice-to-haves

Experience applying AI and ML concepts to real-world problems, gained through research or professional work
Familiarity with performance monitoring and tuning tools

Benefits

Base pay dependent on skills and experience
Eligibility for annual bonus or sales incentive
Opportunity to own shares of AMD stock
Discount on AMD stock through Employee Stock Purchase Plan
Competitive benefits package
