ByteDance - Seattle, WA
posted 3 months ago
As a Research Scientist in Machine Learning Systems at ByteDance, you will develop and implement cutting-edge machine learning technologies. The Machine Learning System team within AML (Applied Machine Learning) builds high-performance, reliable, and scalable systems that enhance the capabilities of machine learning applications. Your role involves research and development across our machine learning systems, covering heterogeneous computing architecture, management, monitoring, and deployment. You will work on distributed task scheduling, machine learning training, and inference, and you will optimize AI algorithms across the layers of the system. The position offers the opportunity to work with advanced machine learning hardware, including GPUs, FPGAs, and ASICs, and to keep our systems running stably and reliably. You will be responsible for integrating large-scale heterogeneous systems that combine GPU, RDMA, and storage technologies. You will deepen your expertise in coding, performance improvement, and problem analysis, and take part in the decisions that shape our machine learning systems. ByteDance values creativity and innovation, and our team culture encourages learning and growth, tackling challenges with courage and a commitment to excellence.