Principal Machine Learning Compiler Engineer

AMD - San Jose, CA

posted 6 months ago

Full-time - Mid Level

Computer and Electronic Product Manufacturing

About the position

At AMD, we are committed to transforming lives through our technology, and we are looking for a skilled and experienced engineer to join our core team. This role involves developing a cutting-edge machine learning model compiler specifically targeting AMD Inference Accelerator (AIE) hardware devices.

The compiler will take models written in popular frameworks such as PyTorch, TensorFlow, ONNX, or JAX and produce optimized control and executable code for the AIE VLIW processor array. It must also efficiently partition tasks between x86 and hardware accelerators, generate optimal code for AMD x86 CPUs using AMD ZenDNN, and create code that interfaces with the AMD AIE-specific runtime and driver.

The ideal candidate will be passionate about implementing and improving effective algorithms and techniques to enhance the compiler and runtime for machine learning. This position requires strong leadership skills to drive complex issues to resolution and the ability to communicate effectively while collaborating with various teams across AMD. The role also involves mentoring and providing guidance to others, learning the latest industry trends, and bringing innovative ideas to the team. You will be responsible for designing and developing groundbreaking AMD technologies, debugging and fixing existing issues, and researching alternative, more efficient methods to accomplish tasks. Building technical relationships with peers and partners is also a key aspect of this role.

Responsibilities

Implement and improve passes in the compiler
Integrate compiler and compiled model with ML Frameworks (such as PyTorch and TensorFlow)
Implement model partitioning in ML Frameworks and/or MLIR
Implement a runtime to distribute work to and collect results from x86 cores and the array of AIE cores
Mentor and provide guidance to others
Learn the latest industry trends and bring new ideas to the team
Design and develop new groundbreaking AMD technologies
Debug and fix existing issues and research alternative, more efficient ways to accomplish the same work
Develop technical relationships with peers and partners

Requirements

Strong object-oriented programming background in C/C++ and Python
Experience with ML compiler and runtime technologies such as oneDNN, MLIR, XLA, OpenXLA, IREE, and the OpenAI Triton compiler
Compiler building skills
Code generation for an ML hardware accelerator
GPU code generation
Machine Learning concepts and model development experience
Understanding of PyTorch, TensorFlow, ONNX, JAX, etc.
Ability to write high quality code with keen attention to detail
Experience with modern concurrent programming and threading APIs
Experience with software development processes and tools such as debuggers, source code control systems (GitHub), and profilers is a plus
Effective communication and problem-solving skills
Motivating leader with good interpersonal skills
Bachelor's, Master's or PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent

Benefits

Base pay depending on skills, qualifications, experience, and location
Eligibility for incentives such as annual bonuses or sales incentives
Opportunity to own shares of AMD stock
Discount when purchasing AMD stock through Employee Stock Purchase Plan
Competitive benefits package
