TikTok - San Jose, CA

posted 3 months ago

Full-time - Mid Level
San Jose, CA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

TikTok is the leading destination for short-form mobile video, with a mission to inspire creativity and bring joy. The company has a global presence with offices in major cities including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul, and Tokyo. The core of TikTok's purpose is creation, and the platform is designed to help imaginations thrive. This ethos extends to the teams that make TikTok possible, where every challenge is viewed as an opportunity to learn, innovate, and grow together.

TikTok is one of the fastest-growing apps in the world, and the company is currently seeking Site Reliability Engineers (SREs) to join its monetization technology team. This team is responsible for building and maintaining large-scale, globally distributed, fault-tolerant ad systems. SREs play a crucial role in ensuring these systems operate with the highest level of availability, providing users with the best possible experience.

In this role, you will engage in and improve the entire lifecycle of Ads systems, from system design consulting to launch reviews, deployment, operation, and refinement. You will be tasked with building the availability of services deployed across multiple data centers globally and delivering tools and software to enhance the reliability, scalability, and operability of services. Monitoring and measuring availability, latency, and overall service health will be key responsibilities, along with practicing sustainable incident response and conducting postmortems. Participation in on-call rotations across continents is also expected, ensuring that the systems remain operational and efficient at all times.

Responsibilities

  • Engage in and improve the whole lifecycle of Ads systems from system design consulting through to launch reviews, deployment, operation, and refinement.
  • Build availability of services deployed across multiple data centers globally.
  • Deliver tools/software to improve the reliability, scalability, and operability of services.
  • Measure and monitor availability, latency, and overall service health.
  • Practice sustainable incident response and postmortems.
  • Participate in on-call rotations across continents.

Requirements

  • Bachelor's degree in Computer Science, similar technical field of study, or equivalent practical experience.
  • 3+ years of experience in programming in at least one of the following programming languages: C, C++, Java, Python, Perl, or Go.
  • Expertise in Unix/Linux operating systems and IP networking.
  • Experience in problem solving, troubleshooting application issues, or production operations.
  • Experience in automating routine tasks.
  • Effective communication skills and a sense of ownership and drive.

Nice-to-haves

  • SRE experience with Ads/recommendation systems.
  • Experience designing, analyzing, and troubleshooting large-scale distributed systems.

Benefits

  • 100% premium coverage for employee medical insurance, approximately 75% premium coverage for dependents, and a Health Savings Account (HSA) with a company match.
  • Dental, Vision, Short/Long-Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans.
  • Flexible Spending Account (FSA) options including Health Care, Limited Purpose, and Dependent Care.
  • 10 paid holidays per year plus 17 days of Paid Personal Time Off (PPTO) (prorated upon hire and increased by tenure) and 10 paid sick days per year.
  • 12 weeks of paid Parental leave and 8 weeks of paid Supplemental Disability.
  • Mental and emotional health benefits through EAP and Lyra.
  • 401(k) company match, plus gym and cellphone service reimbursements.