TikTok - Mountain View, CA

posted 8 days ago

Full-time - Mid Level
Mountain View, CA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

As a Site Reliability Engineer (SRE) within the Recommendation Infrastructure team at TikTok U.S. Data Security (USDS), you will play a crucial role in enhancing the reliability and scalability of our recommendation systems. TikTok is a leading platform for short-form mobile video, and our mission is to inspire creativity and bring joy to our users. The USDS division is dedicated to ensuring the safety and security of U.S. user data, and your contributions will be vital in achieving this goal.

You will engage in the entire lifecycle of recommendation systems, from system design consulting to deployment, operation, and refinement. This position requires a proactive approach to improving the tools and software that enhance the reliability of our services, automate operations, and boost research and development efficiency. In this role, you will be responsible for building the availability of large-scale services deployed across global data centers. You will plan, manage, and optimize cloud resource utilization while ensuring service level agreements (SLAs) for large-scale clusters are met. Monitoring and measuring availability, latency, and overall service health will be key components of your responsibilities. Additionally, you will practice sustainable incident response and conduct postmortems to learn from incidents and improve future performance. Your work will directly impact the user experience on TikTok, ensuring that millions of users can safely and reliably access the platform.

TikTok values creativity and collaboration, and as part of our team, you will have the opportunity to work in a hybrid environment, requiring in-office presence three days a week. This flexible work model allows for collaboration while also accommodating personal work preferences. We are committed to fostering an inclusive workplace where diverse voices are celebrated, and we encourage candidates from all backgrounds to apply. Join us in our mission to inspire creativity and bring joy to users around the world.

Responsibilities

  • Engage in and improve the whole lifecycle of recommendation systems, from system design consulting through launch reviews, deployment, operation, and refinement
  • Deliver tools/software to improve the reliability and scalability of services, automate operations and improve R&D efficiency
  • Build availability of large-scale services deployed across global data centers
  • Plan, manage, and optimize cloud resource utilization, ensuring SLAs for large-scale clusters are met
  • Measure and monitor availability, latency and overall service health
  • Practice sustainable incident response and postmortems

Requirements

  • Bachelor's degree or above majoring in Computer Science or related fields, with at least 2 years of related work experience
  • SRE experience deploying large-scale systems with high reliability and scalability
  • Familiarity with Linux system operations and networking
  • Experience programming in at least one of the following languages: Python, Perl, Go, or C/C++
  • Experience in designing, analyzing and troubleshooting large-scale distributed systems
  • Familiarity with popular CI/CD procedures and environments
  • Effective communication skills and a sense of ownership and drive

Benefits

  • 100% premium coverage for employee medical insurance
  • Approximately 75% premium coverage for dependents
  • Health Savings Account (HSA) with a company match
  • Dental insurance
  • Vision insurance
  • Short/Long term Disability insurance
  • Basic Life insurance
  • Voluntary Life and AD&D insurance plans
  • Flexible Spending Account (FSA) Options
  • 10 paid holidays per year
  • 17 days of Paid Personal Time Off (PPTO)
  • 10 paid sick days per year
  • 12 weeks of paid Parental leave
  • 8 weeks of paid Supplemental Disability
  • Mental and emotional health benefits through EAP and Lyra
  • 401K company match
  • Gym reimbursement
  • Cellphone service reimbursement