Bytedance - San Jose, CA

posted about 1 month ago

Full-time - Mid Level
San Jose, CA
Professional, Scientific, and Technical Services

About the position

ByteDance is seeking a Software Engineer in Machine Learning Systems to join our innovative team in San Jose. Founded in 2012, ByteDance's mission is to inspire creativity and enrich life through a suite of products, including TikTok and various platforms tailored for the China market. Our commitment to creativity drives us to build products that help imaginations thrive, and we are looking for individuals who share this vision.
The successful candidate will be part of the ByteDance Doubao Team, which is dedicated to developing cutting-edge AI large model technology. This team aims to become a world-class research group, contributing significantly to technological and social advancements in the field of AI. The Machine Learning (ML) System sub-team focuses on integrating system engineering with machine learning to create and maintain distributed ML training and inference systems globally.
This role offers the opportunity to work on high-performance, reliable, and scalable systems for LLM/AIGC/AGI. You will be involved in building large-scale heterogeneous systems that integrate GPU/NPU/RDMA/Storage, ensuring their stability and reliability. This position also allows you to enhance your skills in coding, performance analysis, and distributed systems while participating in the decision-making process. You will collaborate with a global team from the United States, China, and Singapore, working towards a unified project direction.
As a Software Engineer, your responsibilities will include the development of machine learning systems, deployment of machine learning services, online serving of machine learning models, and iterating on the system based on customer-driven scenarios. This role requires a strong understanding of distributed and parallel computing principles, as well as proficiency in programming languages such as C/C++, Go, or Python in a Linux environment.
At ByteDance, we are committed to creating an inclusive environment where employees are valued for their unique skills and perspectives. We celebrate diversity and strive to reflect the communities we serve. We also provide reasonable accommodations during the recruitment process for candidates with disabilities or for other protected reasons. Join us in our mission to inspire creativity and enrich life!

Responsibilities

  • Development of machine learning systems, including key computing development, task scheduling, and machine learning system management and operation.
  • Deployment of machine learning services.
  • Online serving of machine learning models.
  • Iteration on and further development of the system based on customer-driven scenarios.

Requirements

  • Strong command of distributed and parallel computing principles, and familiarity with recent advances in computing, storage, networking, and hardware technologies.
  • Proficiency in at least one programming language in a Linux environment, such as C/C++, Go, or Python.

Nice-to-haves

  • Experience contributing to GPU architecture or GPU clusters.
  • Familiarity with orchestration tools such as Kubernetes, Kubeflow, or Volcano.
  • Familiarity with at least one deep learning framework (PyTorch, Megatron, DeepSpeed, vLLM).
  • Experience in GPU-based high-performance computing.

Benefits

  • 100% premium coverage for employee medical insurance, approximately 75% premium coverage for dependents.
  • Health Savings Account (HSA) with a company match.
  • Dental, Vision, Short/Long term Disability, Basic Life, Voluntary Life and AD&D insurance plans.
  • Flexible Spending Account (FSA) options, including Health Care, Limited Purpose, and Dependent Care.
  • 10 paid holidays per year plus 17 days of Paid Personal Time Off (PPTO) and 10 paid sick days per year.
  • 12 weeks of paid Parental leave and 8 weeks of paid Supplemental Disability.
  • Mental and emotional health benefits through EAP and Lyra.
  • 401(k) company match, plus gym and cellphone service reimbursements.