This job is closed
Meta - New York, NY
posted about 2 months ago
The Software Engineer, SystemML - Scaling / Performance role at Meta sits within the Network.AI Software team and works on the software stack around the NVIDIA Collective Communications Library (NCCL). The position focuses on enabling reliable and scalable distributed machine learning (ML) training, particularly for Generative AI (GenAI) and Large Language Models (LLMs). The team is responsible for improving the performance and reliability of distributed ML workloads across Meta's extensive GPU infrastructure, so that new ML work can make effective use of that infrastructure.