ByteDance - San Jose, CA

Full-time - Intern
San Jose, CA
Professional, Scientific, and Technical Services

About the position

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Helo, and Resso, as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

The Applied Machine Learning Enterprise team combines system engineering and machine learning to develop and operate massively distributed machine learning training and inference systems and services that serve both big-model vendors and users around the world. As a Software Engineer Intern in Applied Machine Learning, you will have the opportunity to build and enrich your expertise in coding, performance analysis, and large system management, and to get heavily involved in hardware and capacity decision-making. In the engineering team, you'll have the opportunity to build a large-scale heterogeneous system that integrates GPU, RDMA, and storage, and to keep it running steadily and reliably.

This internship aims to provide students with hands-on experience in developing fundamental skills and exploring potential career paths. A vibrant blend of social events and enriching development workshops will be available for you to explore. Here, you will apply your knowledge in real-world scenarios while laying a strong foundation for personal and professional growth.

The internship runs for 12-24 weeks and begins in May/June 2025 or August/September 2025. Successful candidates must be able to commit to one of the specified start dates. Applications will be reviewed on a rolling basis, and candidates can apply to a maximum of two positions.

Responsibilities

  • Building a next-generation big-model-as-a-service platform to serve hundreds of LLM-based applications.
  • Developing and maintaining the big-model-as-a-service platform, including offline training/finetuning, online inference, model management, and resource orchestration.
  • Managing a large pool of GPU resources and providing computing power efficiently.

Requirements

  • Currently pursuing a BS/MS degree in Computer Science or related technical field.
  • Proficient in deep learning frameworks such as PyTorch or TensorFlow.
  • Experience with software development in at least one of the following programming languages: C++, Python, Go.
  • Good teamwork and communication skills; practical experience in relevant business scenarios is preferred.
  • Strong programming skills, with good code design and coding style.
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment.

Nice-to-haves

  • Experience in NLP and LLM technologies.
  • Experience or research in alignment for large language models.
  • Experience in ModelOps, for example large model management or finetuning workflow management.
  • Contributions to papers at top-tier conferences such as NeurIPS, ICML, ICLR, CVPR, ICCV, ACL, and KDD are a big plus.

Benefits

  • Hourly rate for this position is $60.
  • 100% premium coverage for full-time intern medical insurance, starting 90 days after the date of hire.
  • Paid holidays and paid sick leave.
  • Mental and emotional health benefits through the Employee Assistance Program.
  • Reimbursements for mobile phone expenses.