ByteDance - Seattle, WA

posted about 1 month ago

Full-time - Intern
Seattle, WA
Professional, Scientific, and Technical Services

About the position

The Research Scientist Intern position at ByteDance's Doubao (Seed) Team focuses on advancing machine learning systems and technologies. This internship offers an opportunity to engage in cutting-edge research in AI, particularly in areas such as deep learning and reinforcement learning. Interns will work on developing efficient machine learning systems, contributing to the creation of state-of-the-art training frameworks, and improving the efficiency and stability of large-scale distributed training jobs. The internship is designed to provide hands-on experience and industry exposure, fostering personal and professional growth in a collaborative environment.

Responsibilities

  • Research and develop efficient machine learning systems, including optimizers and gradient-efficient training techniques.
  • Develop a state-of-the-art asynchronous training framework ensuring convergence.
  • Implement general-purpose training framework features and model-specific optimizations.
  • Improve efficiency and stability for large-scale distributed training jobs.

Requirements

  • Currently enrolled in a PhD program focusing on distributed and parallel computing principles.
  • Familiarity with machine learning algorithms and frameworks such as PyTorch and JAX.
  • Basic understanding of GPU and/or ASIC operations.
  • Proficiency in at least one of the following programming languages in a Linux environment: C/C++, CUDA, Python.
  • Must obtain and maintain work authorization in the country of employment.

Nice-to-haves

  • Experience with GPU-based high-performance computing and RDMA high-performance networks (MPI, NCCL, ibverbs).
  • Familiarity with distributed training frameworks and optimizations such as DeepSpeed, FSDP, Megatron, and GSPMD.
  • Knowledge of AI compiler stacks such as torch.fx, XLA, and MLIR.
  • Experience in large-scale data processing and parallel computing.
  • Experience in designing and operating large-scale systems in cloud computing or machine learning.
  • In-depth CUDA programming and performance tuning experience.

Benefits

  • Hourly rate of $57.
  • 100% premium coverage of medical insurance for full-time interns after 90 days.
  • Paid holidays and paid sick leave.
  • Mental and emotional health benefits through the Employee Assistance Program.
  • Reimbursement for mobile phone expenses.