TikTok - San Jose, CA

posted 27 days ago

Full-time - Senior
San Jose, CA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

TikTok is the leading destination for short-form mobile video, with a mission to inspire creativity and bring joy. The Generative AI team under Monetization Technology is focused on developing cutting-edge Generative AI technologies across various modalities, including text, image, video, and landing pages. This team is dedicated to creating industry-leading technical solutions that enhance creative efficiency for advertisers, agencies, and creators. By leveraging Generative AI technologies, the team aims to automate creative workflows and increase overall revenue for clients and creators alike.

As a Senior Machine Learning Engineer specializing in Data Curation, you will play a pivotal role in driving and leading generative AI initiatives within the ads tech and creative industry. Your responsibilities will include collaborating with foundational model researchers to develop and maintain efficient, low-latency data pipelines, designing and implementing robust systems for data management, and enhancing user engagement through data insights and model evaluation pipelines. You will also be tasked with developing caching mechanisms to improve data retrieval speeds and staying updated with the latest advancements in academic research and open-source technologies to continuously enhance data operations and machine learning model performance.

This position requires a strong technical background, a passion for innovation, and the ability to work collaboratively across functions with global teams. If you are someone who thrives on challenges and is eager to make a significant impact, we invite you to join our team and contribute to our mission of inspiring creativity and bringing joy.

Responsibilities

  • Collaborate with foundational model researchers, including specialists in Ads LLM, Text-to-Image, and Text-to-Video, to develop and maintain efficient, low-latency data pipelines.
  • Design and implement robust, scalable systems for data curation and management, supporting the foundational training of models across various formats in distributed environments.
  • Implement data insights and model evaluation pipelines to enhance user engagement and drive revenue growth.
  • Develop caching mechanisms to improve data retrieval speeds and enhance model responsiveness.
  • Stay abreast of the latest academic research and open-source advancements, integrating cutting-edge technologies to continuously improve our data operations and machine learning model performance.

Requirements

  • B.S./M.S./Ph.D. in Computer Science, Computer Engineering, or a related field.
  • Expertise in Python, including asynchronous programming with asyncio, and a strong foundation in deep learning frameworks such as PyTorch and large model training libraries like FSDP/DeepSpeed.
  • A minimum of 3 years' experience with Linux, Docker, and Kubernetes.
  • Demonstrated capability in data curation, management, and optimization within Generative AI ecosystems, encompassing both streaming and batch data processing.
  • Thorough understanding of machine learning frameworks and parallel data processing techniques, and proficiency with large language models (e.g., Llama series), text-to-image technologies (e.g., Diffusion-Based Models, Diffusion Transformers), and text-to-video technologies (e.g., EMU series, MagViT).

Nice-to-haves

  • Experience in CUDA Optimization and a deep understanding of the application of Generative AI models across multiple domains.
  • Significant experience in managing large-scale data systems, with a strong preference for those who have worked with Vector Database solutions.
  • Proficiency in cloud services (AWS/GCP) and familiarity with machine learning training, deployment, and distributed computing frameworks like Spark.
  • Exceptional communication, teamwork, and project management skills.

Benefits

  • 100% premium coverage for employee medical insurance, approximately 75% premium coverage for dependents.
  • Health Savings Account (HSA) with a company match.
  • Dental, Vision, Short/Long-Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans.
  • Flexible Spending Account (FSA) options, such as Health Care, Limited Purpose, and Dependent Care.
  • 10 paid holidays per year plus 17 days of Paid Personal Time Off (PPTO) and 10 paid sick days per year.
  • 12 weeks of paid Parental leave and 8 weeks of paid Supplemental Disability.
  • Mental and emotional health benefits through EAP and Lyra.
  • 401(k) company match, plus gym and cellphone service reimbursements.