TikTok - San Jose, CA

posted 3 months ago

Full-time - Mid Level
San Jose, CA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

TikTok is the leading destination for short-form mobile video, and our mission is to inspire creativity and bring joy. We are currently seeking a Machine Learning Engineer to join our Model Training Infrastructure team. This role is pivotal in supporting and advancing our mission by pushing the next-generation AI infrastructure and recommendation platform for ads ranking, search ranking, and live & e-commerce ranking within the company. As part of our AML team, you will have the opportunity to drive substantial impact on the core businesses of TikTok.

In this position, you will be responsible for the design and implementation of a global-scale machine learning system that supports feeds, ads, and search ranking models. You will work on improving the usability and flexibility of our machine learning infrastructure, enhancing the workflow of model training and serving, and optimizing data pipelines, storage systems, and resource management for multi-tenancy machine learning systems. Additionally, you will design and develop key components of the ML infrastructure and mentor interns, fostering a collaborative and innovative environment.

At TikTok, we believe that every challenge is an opportunity to learn, innovate, and grow as a team. We are committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and we aim to reflect the diverse communities we serve. If you are passionate about machine learning and want to be part of a dynamic team that inspires creativity and brings joy, we encourage you to apply.

Responsibilities

  • Design and implement a global-scale machine learning system for feeds, ads, and search ranking models.
  • Improve the usability and flexibility of the machine learning infrastructure.
  • Enhance the workflow of model training and serving, data pipelines, storage systems, and resource management for multi-tenancy machine learning systems.
  • Design and develop key components of ML infrastructure and mentor interns.

Requirements

  • Proficient in C/C++/CUDA/Python with solid programming skills.
  • Familiar with deep learning frameworks such as TensorFlow and PyTorch.
  • Experience in developing and deploying large-scale systems.
  • Ability to work independently and complete projects from beginning to end in a timely manner.
  • Good communication and teamwork skills to clearly communicate technical concepts with other teammates.
  • Experience in improving core machine learning infrastructure (TensorFlow, PyTorch, and JAX).

Nice-to-haves

  • Experience contributing to an open-source machine learning framework (TensorFlow/PyTorch).
  • Experience using or designing open-source machine learning lifecycle management systems (e.g., TFX).

Benefits

  • 100% premium coverage for employee medical insurance, approximately 75% premium coverage for dependents.
  • Health Savings Account (HSA) with a company match.
  • Dental, Vision, Short/Long term Disability, Basic Life, Voluntary Life, and AD&D insurance plans.
  • Flexible Spending Account (FSA) options for healthcare and dependent care.
  • 10 paid holidays per year plus 17 days of Paid Personal Time Off (PPTO) and 10 paid sick days per year.
  • 12 weeks of paid Parental leave and 8 weeks of paid Supplemental Disability.
  • Mental and emotional health benefits through EAP and Lyra.
  • 401(k) company match, plus gym and cellphone service reimbursements.