Senior Machine Learning Engineer, Quantized Training

$180,000 - $339,250/Yr

NVIDIA - Seattle, WA

posted 2 months ago

Full-time - Senior

Computer and Electronic Product Manufacturing

About the position

NVIDIA is seeking a Senior Machine Learning Engineer for Quantized Training to support next-generation recipes for mixed-precision training. In this role, you will distill large language model (LLM) research literature into its core components, translate that literature into scalable experiments, produce insights that support or refute the efficacy of various techniques, and generate reproducible training recipes. The position requires a deep understanding of the latest advancements in quantized training and the ability to apply that knowledge in practical settings.

Your responsibilities will include reviewing state-of-the-art literature in quantized training, building robust, reproducible, and portable training recipes, and providing engineering support to customers using both hardware and software approaches. You will collaborate closely with hardware, software, and research teams to assess and adopt deep learning algorithmic advancements in quantization. You will also work with production software teams to implement these recipes in production workflows, ensuring that the recipes remain effective and efficient.

This role is critical in shaping the future of AI at NVIDIA: you will be at the forefront of integrating and optimizing deep learning frameworks on the most advanced GPUs. You will have the opportunity to influence long-term opportunities that expand NVIDIA's impact on the datacenter and beyond, while working in a creative and autonomous environment that encourages innovation.

Responsibilities

Review state-of-the-art literature in quantized training
Build robust, reproducible, and portable training recipes
Provide engineering support to customers using hardware and software approaches
Collaborate closely with hardware, software, and research teams to assess and adopt deep learning algorithmic advancements in quantization
Work with production software teams to realize recipes in production workflows

Requirements

Experience with PyTorch or similar frameworks such as JAX/XLA
Proficiency in the mathematics of machine learning
Familiarity with FP8 for training
Published research or significant contributions to the field of AI, particularly in algorithm development for hardware-software co-design
PhD or M.S. degree, or equivalent experience, in Computer Science or a related field
5+ years of experience working in ML/AI
Strong written and oral communication skills
Strong programming skills and ability to debug ML systems

Nice-to-haves

Experience in LLM training, fine-tuning, and optimization (quantization, sparsity)
Familiarity with MX formats for training
Experience with Transformer Engine, Megatron-LM, or NeMo

Benefits

Equity
Comprehensive health benefits
Flexible work environment
Opportunities for professional development
