AMD - Boxborough, MA

posted about 2 months ago

Full-time - Senior
Boxborough, MA
Computer and Electronic Product Manufacturing

About the position

The Application Software Optimization Engineer, ML/AI role at AMD involves optimizing machine learning applications for AMD's CPU and GPU platforms. This senior-level position is part of a team focused on enhancing AI-based products and requires collaboration with various groups within AMD to resolve application and customer issues. The role also includes developing training materials and presenting them at various venues.

Responsibilities

  • Port and optimize a variety of machine-learning-based models and applications for AMD CPU and GPU systems
  • Provide domain-specific knowledge to other groups at AMD
  • Engage with AMD product groups to drive resolution of application and customer issues
  • Develop and present training materials to internal audiences, at customer venues, and at industry conferences

Requirements

  • Broad experience building, running and tuning machine learning models
  • In-depth knowledge of current machine learning frameworks and commonly used models for training and inference
  • Strong performance analysis skills for both CPU and GPU
  • Extensive experience with C++ and Python
  • Familiarity with distributed model training via NCCL/RCCL, MPI, or similar communication libraries
  • Experience in implementing and optimizing parallel methods on GPU accelerators in distributed memory systems with MPI, CUDA, HIP, OpenMP, etc.
  • Experience in scientific computing disciplines such as computational chemistry, fluid dynamics, weather modeling, and oil and gas applications
  • In-depth understanding of IO, parallel file systems, and network limitations and capabilities as used in AI models
  • Familiarity with installation and setup of various AI applications and machine learning frameworks
  • Experience provisioning clusters and validating their performance for use in machine learning applications
  • Experience with build system tools including Make, CMake, autoconf, and autotools
  • In-depth knowledge of software development practices including debug, test, revision control, documentation, and bug tracking
  • Strong team development skills including demonstrated expertise with git and Jira
  • Ability to work well in geographically dispersed teams

Benefits

  • Base pay dependent on skills and qualifications
  • Eligibility for annual bonus or sales incentive
  • Opportunity to own shares of AMD stock
  • Discount on AMD stock through Employee Stock Purchase Plan
  • Competitive benefits package