TikTok - Seattle, WA

posted 2 days ago

Full-time - Mid Level
Seattle, WA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

As a Site Reliability Engineer within TikTok's U.S. Data Security (USDS) division, you will play a crucial role in ensuring the reliability and performance of the services that power the TikTok experience. This position is designed for individuals who are passionate about maintaining high service levels and enhancing system reliability through innovative solutions. You will be part of a global team dedicated to supporting site-up issues, ensuring that services are reliable, fault-tolerant, efficiently scalable, and cost-effective. Your responsibilities will include monitoring system health, measuring performance, and maintaining services to meet established service-level agreements (SLAs) and service-level objectives (SLOs).

In this role, you will gain a solid understanding of the various components and services that contribute to the TikTok platform. You will be expected to scale systems sustainably through automation and push for changes that enhance reliability, efficiency, and velocity. Additionally, you will provide user support, respond to incidents, and conduct postmortems to learn from any issues that arise. This position requires a strong technical background, particularly in the deployment and administration of large-scale distributed systems, as well as a commitment to protecting sensitive data and information.

TikTok is committed to creating an inclusive environment where employees are valued for their unique perspectives and skills. The company emphasizes collaboration and cross-functional partnerships, and as such, this role follows a hybrid work schedule, requiring employees to work in the office three days a week. TikTok's mission is to inspire creativity and bring joy, and this role is integral to achieving that goal by ensuring the platform remains secure and reliable for millions of users.

Responsibilities

  • Gain a solid understanding of the various components and services that power the TikTok experience
  • Maintain services to meet service-level agreements (SLAs) and service-level objectives (SLOs) by measuring and monitoring availability, performance, and overall system health
  • Work as part of a global team supporting site-up issues so that services are reliable, fault-tolerant, efficiently scalable, and cost-effective
  • Scale systems sustainably through mechanisms such as automation; improve system reliability, efficiency, and velocity by pushing for changes
  • Provide user support, incident response, and postmortems

Requirements

  • Bachelor's degree or higher in Computer Science or a related technical discipline
  • 3-5+ years of experience in the deployment and administration of large-scale distributed systems
  • Strong understanding of Unix/Linux operating system internals and administration
  • Knowledge of networking (e.g. TCP/IP, routing, network topologies and hardware)
  • Experience with storage systems and database systems
  • Experience in one or more programming languages, such as C, C++, Java, Python, Go, Ruby, Rust, JavaScript
  • Experience in debugging and optimizing code and automating routine tasks
  • Experience in development, testing, deployment and administration of systems like Nginx, Kubernetes, Docker, OpenStack, Hadoop, Spark, Flink, Kafka
  • Strong skills in problem solving and communication

Nice-to-haves

  • Experience in designing and analyzing large-scale distributed systems

Benefits

  • 100% premium coverage for employee medical insurance
  • Approximately 75% premium coverage for dependents
  • Health Savings Account (HSA) with a company match
  • Dental, Vision, Short/Long-Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans
  • Flexible Spending Account (FSA) options, including Health Care, Limited Purpose, and Dependent Care
  • 10 paid holidays per year
  • 17 days of Paid Personal Time Off (PPTO)
  • 10 paid sick days per year
  • 12 weeks of paid Parental leave
  • 8 weeks of paid Supplemental Disability
  • Mental and emotional health benefits through EAP and Lyra
  • 401(k) company match
  • Gym and cellphone service reimbursements