AMD - Santa Clara, CA

posted about 2 months ago

Full-time - Mid Level
Santa Clara, CA
Computer and Electronic Product Manufacturing

About the position

The Artificial Intelligence Performance Engineer at AMD is responsible for optimizing GPU-accelerated systems to ensure peak performance for AI workloads. This role involves defining performance metrics, benchmarking AI workloads, and identifying performance bottlenecks. The engineer will collaborate with software teams to enhance performance and stay updated on emerging AI technologies. The position requires a blend of technical expertise in GPU performance, data science skills, and effective communication to convey findings to various teams.

Responsibilities

  • Define a performance suite and best practices for measuring GPU-accelerated workloads, to assess the scalability and efficiency of AI models and algorithms
  • Benchmark and analyze AI workloads in single-node and large multi-node configurations, comparing results against previous generations and competitors
  • Perform comprehensive performance analysis and report findings for the entire platform, including GPU, CPU, interconnects, network, and software stack
  • Identify performance bottlenecks that impact data center GPU-accelerated workloads, tune the workloads, and collaborate with other software teams to improve performance
  • Stay up to date with emerging technologies and trends in the AI field and explore ways to improve the performance of GPU-accelerated workloads at scale

Requirements

  • Solid knowledge of Artificial Intelligence (AI) and Machine Learning (ML) concepts and techniques, including deep learning, reinforcement learning, natural language processing, generative AI, and computer vision
  • Experience with benchmarking methodologies, performance analysis, workload profiling, and performance monitoring and debugging tools
  • Advanced Linux OS, container (e.g. Docker) and GitHub skills
  • Programming skills in a variety of relevant languages such as Python or C/C++
  • Expertise with deep learning frameworks like PyTorch and TensorFlow
  • Knowledge of and interest in computer and GPU architecture
  • In-depth knowledge of GPU acceleration with either AMD or Nvidia GPU compute products
  • Excellent problem-solving skills and an automation mindset

Nice-to-haves

  • Experience applying AI and ML concepts to solve real-world problems through research or work experience
  • Familiarity with performance monitoring and tuning tools

Benefits

  • Base pay dependent on skills and experience
  • Eligibility for annual bonus or sales incentive
  • Opportunity to own shares of AMD stock
  • Discount on AMD stock through Employee Stock Purchase Plan
  • Competitive benefits package