AMD - San Jose, CA

posted about 1 month ago

Full-time - Senior
San Jose, CA
Computer and Electronic Product Manufacturing

About the position

The Staff Applied Machine Learning Software Engineer - Generative AI at AMD is responsible for developing and optimizing software solutions for Generative AI applications. This role involves working on state-of-the-art research, model optimization, and compression algorithms to enhance AMD's AI inference capabilities across various products. The engineer will collaborate with multiple teams to influence the direction of AI/ML platforms and ensure high-quality software development.

Responsibilities

  • Accelerate inference of Generative AI on AMD's products.
  • Develop tools and techniques for model analysis, profiling, performance projections, and analyzing architecture bottlenecks.
  • Architect and prototype custom kernels on GPUs and CPUs (HIP, CUDA, OpenCL, Triton, etc.).
  • Optimize the deep learning inference pipeline, including graph compilation using AMD AI compilers.
  • Reproduce and improve upon state-of-the-art quantization, pruning, and optimization algorithms in PyTorch and Python.
  • Develop high-quality software to enable next-gen solutions.
  • Collaborate with AI/ML frameworks and infrastructure teams to enable new algorithms in the platforms.
  • Collaborate with AMD Research and Architecture teams to improve future products.
  • Influence the direction of AI/ML platforms for inference and training.

Requirements

  • Deep understanding of foundational AI model architectures.
  • Proficiency in software development and AI/ML frameworks such as PyTorch.
  • Experience with quantization and sparsity algorithms.
  • Demonstrated ability to efficiently map AI models onto GPUs and/or other hardware accelerators.
  • Strong development and debugging skills in Python.
  • Experience in C++ programming for GPUs and/or custom accelerators.

Nice-to-haves

  • Solid understanding of CNN and Transformer model architectures, LLMs, and Stable Diffusion models.
  • Familiarity with torch.fx and/or other AI compilers and execution runtimes.
  • Knowledge of LLM/LMM fine-tuning methods such as RLHF.
  • Knowledge of parameter-efficient techniques such as LoRA and QLoRA.
  • Contributions to open ML research or developer community.

Benefits

  • Employee stock purchase plan
  • Competitive benefits package